EntityPro annotates named entities, i.e. proper names of persons, locations and organizations, in a text. The module is based on a statistical classifier and makes use of local features, gazetteers, long-distance features and distributional features extracted from very large non-annotated corpora.
To allow for easy domain adaptation, EntityPro implements white and black lists, through which you can force a specific behavior of the classifier on certain entities.
The module is available with pre-trained models in the news domain for three languages, and has been integrated into the TextPro Active Learning platform.
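
The white and black lists described above can be pictured as a post-processing step over the classifier's output. The following is only a minimal sketch under stated assumptions: the function name `apply_lists`, the label set (`PER`, `LOC`, `ORG`, `O`) and the list formats are hypothetical and chosen for illustration; they are not EntityPro's actual interface or file format.

```python
from typing import Dict, List, Set, Tuple

# (surface form, predicted label); the labels used here are illustrative.
Entity = Tuple[str, str]


def apply_lists(
    predictions: List[Entity],
    white_list: Dict[str, str],
    black_list: Set[str],
) -> List[Entity]:
    """Override classifier predictions with user-supplied lists.

    Hypothetical behavior assumed for this sketch:
    - a black-listed form is never tagged as an entity (label "O");
    - a white-listed form always receives the label given in the list;
    - everything else keeps the classifier's prediction.
    Lookups are case-insensitive; list keys are expected in lowercase.
    """
    result: List[Entity] = []
    for surface, label in predictions:
        key = surface.lower()
        if key in black_list:
            result.append((surface, "O"))
        elif key in white_list:
            result.append((surface, white_list[key]))
        else:
            result.append((surface, label))
    return result


if __name__ == "__main__":
    predicted = [("Apple", "O"), ("Paris", "PER"), ("Rome", "LOC")]
    forced = apply_lists(predicted, {"apple": "ORG"}, {"paris"})
    print(forced)
```

In this sketch the black list takes precedence over the white list when a form appears in both; that ordering is a design choice made here, not something the description above specifies.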


Algorithm: EntityPro uses YamCha for feature extraction and a support vector machine (SVM) as the classification algorithm.
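
To give a rough idea of what local and gazetteer features for a token classifier can look like, here is a minimal sketch. It is not YamCha's feature format and not EntityPro's actual feature set: the function `window_features`, the feature names, the `<PAD>` marker and the window size are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Union

FeatureValue = Union[str, bool]


def window_features(
    tokens: List[str],
    index: int,
    gazetteer: Set[str],
    size: int = 2,
) -> Dict[str, FeatureValue]:
    """Build a feature dictionary for the token at `index`.

    Features (all hypothetical):
    - lowercased words in a symmetric window of `size` tokens,
      with "<PAD>" for positions outside the sentence;
    - whether the current token starts with an uppercase letter;
    - whether the current token appears in a lowercase gazetteer.
    """
    feats: Dict[str, FeatureValue] = {}
    for offset in range(-size, size + 1):
        pos = index + offset
        if 0 <= pos < len(tokens):
            feats[f"w[{offset}]"] = tokens[pos].lower()
        else:
            feats[f"w[{offset}]"] = "<PAD>"
    current = tokens[index]
    feats["is_capitalized"] = current[:1].isupper()
    feats["in_gazetteer"] = current.lower() in gazetteer
    return feats


if __name__ == "__main__":
    sentence = ["Mario", "lives", "in", "Trento"]
    print(window_features(sentence, 3, {"trento"}, size=1))
```

A dictionary like this would still need to be encoded numerically (for example, one-hot encoded) before being passed to an SVM; the long-distance and distributional features mentioned above are not covered by this sketch.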

Resources: I-CAB (Italian), CoNLL 2003 (English) and the EUCLIP dataset (German).

Evaluation benchmark: NER at Evalita 2007 (Italian).

