Skip to content

hugopinto/TextClassificationPlugin

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

13 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Processing Resource for GATE which uses DigitalPebble's TextClassification API (https://code.google.com/p/textclassification/)

Compilation
Run 'ant distrib'. The GATE compile dependencies are managed by IVY, you need an internet connection so that it can fetch the GATE jars.

Installation
Unzip the distribution archive into GATE/plugins or to the directory of your choice then load it using GATE's plugin management panel

Usage
The plugins contains 3 Processing resources :
- TrainingCorpusCreator : generates a lexicon + raw file + training vector in the specified directory. See https://code.google.com/p/textclassification/ for instructions on 
how to generate a model file using the base text classification API or http://www.csie.ntu.edu.tw/~cjlin/libsvm/ on how to use directly the libsvm tools over the vector file.
- ClassifierPR : takes a model and lexicon to classify the annotations specified in textAnnotationType
- NGram maker : generates ngrams that can be used as input for the corpus generation or classification

About

GATE Processing Resource wrapping DigitalPebble's TextClassification API

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Java 100.0%