TextPro is a suite of modular Natural Language Processing (NLP) tools for the analysis of Italian and English texts. The suite is designed to integrate and reuse state-of-the-art NLP components developed by researchers at FBK. The current version provides functions ranging from tokenization to chunking and Named Entity Recognition (NER).
TextPro performed best in the Italian NER and Italian PoS tagging tasks at EVALITA 2007. Tested on a number of standard English benchmarks, it also performs at a state-of-the-art level.
The system’s architecture is organized as a pipeline of processors: each stage accepts data from the initial input or from the output of a previous stage, executes a specific task, and sends the resulting data to the next stage or to the output of the pipeline.
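The pipeline idea can be illustrated with a minimal Python sketch. This is not TextPro's actual code or API: the names `tokenize`, `flag_capitalized`, `guess_entities`, and `run_pipeline` are hypothetical, and the annotation rules are toy placeholders chosen only to show how each stage consumes the previous stage's output and adds its own field.

```python
from typing import Callable, Dict, List

# Each token is a record (dict); each stage maps a list of records to a new list.
Record = Dict[str, object]
Stage = Callable[[List[Record]], List[Record]]


def tokenize(text: str) -> List[Record]:
    """Initial input step: whitespace split (a toy stand-in for a real tokenizer)."""
    return [{"token": tok} for tok in text.split()]


def flag_capitalized(records: List[Record]) -> List[Record]:
    """Toy annotation stage: add a 'cap' field saying whether the token is capitalized."""
    return [{**r, "cap": str(r["token"])[:1].isupper()} for r in records]


def guess_entities(records: List[Record]) -> List[Record]:
    """Toy stage that relies on 'cap' from the previous stage.

    Capitalized tokens that are not sentence-initial are marked 'ENT'; all others 'O'.
    """
    return [
        {**r, "entity": "ENT" if r["cap"] and i > 0 else "O"}
        for i, r in enumerate(records)
    ]


def run_pipeline(text: str, stages: List[Stage]) -> List[Record]:
    """Feed the initial input through every stage in order and return the final output."""
    records = tokenize(text)
    for stage in stages:
        records = stage(records)
    return records


result = run_pipeline("Anna visited Trento", [flag_capitalized, guess_entities])
for record in result:
    print(record["token"], record["entity"])
```

Because every stage shares the same input and output type in this sketch, stages can be added, removed, or reordered without changing `run_pipeline`, provided each stage finds the fields it depends on.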
Distributions for Linux and Mac are available, for both research and commercial purposes. A web-service version of the system is under development.