TextPro - Text Processing Tools

TextPro is a suite of modular Natural Language Processing (NLP) tools for the analysis of written texts. The suite has been designed to integrate and reuse state-of-the-art NLP components developed by researchers at FBK. The current version provides functions ranging from tokenization to parsing and named entity recognition.

The modules included in TextPro have been evaluated in several evaluation campaigns and international shared tasks, such as EVALITA (PoS tagging, named entity recognition, and parsing for Italian) and SemEval-2010 (keyphrase extraction from scientific articles in English).

The architecture of TextPro is organized as a pipeline of processors. Each stage accepts data either from the initial input or from the output of the previous stage, executes a specific task, and either returns the resulting data or passes it to the next stage.
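
The pipeline idea described above can be illustrated with a minimal Python sketch. This is not TextPro's actual code or API: the stage functions (tokenize, count_tokens), the run_pipeline helper, and the dictionary-based document format are all hypothetical names chosen for illustration. The sketch only shows how each stage might consume the output of the previous one.

```python
from typing import Callable, Dict, List

# Assumption: a document is a dict of annotation layers; each stage
# reads the layers it needs and adds its own.
Document = Dict[str, object]
Stage = Callable[[Document], Document]


def tokenize(doc: Document) -> Document:
    # Hypothetical stage: naive whitespace tokenization of the raw text.
    return {**doc, "tokens": str(doc["text"]).split()}


def count_tokens(doc: Document) -> Document:
    # Hypothetical stage: consumes the "tokens" layer produced by tokenize().
    return {**doc, "token_count": len(doc["tokens"])}


def run_pipeline(stages: List[Stage], doc: Document) -> Document:
    # The first stage receives the initial input; every later stage
    # receives the output of the stage before it.
    for stage in stages:
        doc = stage(doc)
    return doc


result = run_pipeline(
    [tokenize, count_tokens],
    {"text": "TextPro processes written texts"},
)
print(result["tokens"])
print(result["token_count"])
```

Because every stage takes and returns the same document structure, stages can be added, removed, or reordered without changing the others, provided each one finds the layers it depends on.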

We currently distribute a version for Linux and Mac, for both research and commercial purposes. The code for a basic web service version is also available for distribution.

Research Projects

We have been using TextPro in several research projects, including: