KeyPro

KeyPro (formerly called KX) is a tool for key-phrase extraction. It combines basic linguistic annotations with simple statistical measures to select a weighted list of key-phrases from a text. In the example below, the sequence "Barack Obama" has been selected as the most relevant keyword.

KeyPro supports both parameter setting (e.g. frequency thresholds for collocations and indicators of key-phrase relevance) and domain adaptation, which exploits a corpus of documents in an unsupervised way.

KeyPro can also be easily adapted to new languages, since it requires only a PoS-Tagger to derive lexical patterns.
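
To make the general idea concrete, here is a minimal Python sketch of pattern-based key-phrase selection over PoS-tagged text. It is not KeyPro's implementation: the pattern list, the tag names, the function name extract_keyphrases, the min_freq and max_len parameters, and the weighting formula (frequency times phrase length) are all assumptions chosen for illustration only.

```python
from collections import Counter

# Hypothetical list of PoS-tag sequences accepted as key-phrase candidates.
PATTERNS = {
    ("NOUN",),
    ("PROPN",),
    ("ADJ", "NOUN"),
    ("NOUN", "NOUN"),
    ("PROPN", "PROPN"),
}


def extract_keyphrases(tagged_tokens, min_freq=1, max_len=3):
    """Return (phrase, weight) pairs sorted by decreasing weight.

    tagged_tokens is a list of (word, pos_tag) pairs, as produced by
    any PoS tagger. The weight is an illustrative stand-in for a
    relevance indicator: frequency multiplied by the number of words.
    """
    counts = Counter()
    for start in range(len(tagged_tokens)):
        for length in range(1, max_len + 1):
            window = tagged_tokens[start:start + length]
            if len(window) < length:
                break  # ran past the end of the text
            tags = tuple(tag for _, tag in window)
            if tags in PATTERNS:
                counts[" ".join(word for word, _ in window)] += 1

    # Apply the frequency threshold, then weight the surviving candidates.
    scored = [
        (phrase, freq * len(phrase.split()))
        for phrase, freq in counts.items()
        if freq >= min_freq
    ]
    return sorted(scored, key=lambda item: (-item[1], item[0]))


tagged = [
    ("Barack", "PROPN"), ("Obama", "PROPN"), ("visited", "VERB"),
    ("Berlin", "PROPN"), (".", "PUNCT"),
    ("Barack", "PROPN"), ("Obama", "PROPN"), ("spoke", "VERB"),
    (".", "PUNCT"),
]

for phrase, weight in extract_keyphrases(tagged):
    print(phrase, weight)
```

In this toy input the multiword candidate "Barack Obama" ranks first, because it is both frequent and longer than the single-word candidates. Raising min_freq to 2 would drop "Berlin", which occurs only once.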

Example

Algorithm: Statistical measures.

Resources: List of keyword patterns, a reference corpus.

Evaluation benchmark: “Automatic Key-phrase Extraction from Scientific Articles” at SemEval 2010.

Reference:

Emanuele Pianta and Sara Tonelli. KX: A Flexible System for Keyphrase eXtraction. In Proceedings of SemEval 2010, Task 5: Keyword extraction from Scientific Articles, Uppsala (Sweden), 2010.