NLProcessor - Text Analysis Toolkit

NLProcessor by Infogistics is the successor to a set of Natural Language Processing technologies developed in the 1990s at the University of Edinburgh. It is an engine that handles the so-called "low-level" text processing routines: tokenisation, capitalised word normalisation, sentence segmentation, part-of-speech tagging and syntactic chunking. These routines are necessary steps in building many kinds of text handling applications.

NLProcessor outputs linguistic information by marking the text directly with XML tags: tokens are represented as "W" elements, with their part-of-speech (word-class) information given in the "C" attribute; noun and verb groups are marked as "NounGroup" and "VerbGroup" elements; and sentences are marked as "S" elements. For example:
<NounGroup> <W C=NNP>John</W> </NounGroup>
<W C=VBZ>has</W><W C=VBN>been</W> <W C=VBD>given</W>
<W C=CD>25</W> <W C=NNS>bricks</W>
<W C=".">.</W>
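Because the annotations are ordinary markup, a downstream application can read them with a generic XML parser. The following is a minimal sketch in Python, not part of NLProcessor itself. It assumes the output has been made well-formed XML: the sample string is a hand-written copy of the example above with attribute values quoted and a single enclosing "S" element added. The function names tokens_with_tags and groups are hypothetical helpers introduced only for illustration.

```python
import xml.etree.ElementTree as ET

# Hand-written sample modelled on the example above (an assumption, not
# captured tool output): attribute values are quoted and the sentence is
# wrapped in an "S" element so the string is well-formed XML.
SAMPLE = (
    '<S>'
    '<NounGroup> <W C="NNP">John</W> </NounGroup> '
    '<W C="VBZ">has</W> <W C="VBN">been</W> <W C="VBD">given</W> '
    '<W C="CD">25</W> <W C="NNS">bricks</W> '
    '<W C=".">.</W>'
    '</S>'
)


def tokens_with_tags(xml_text):
    """Return (token, part-of-speech tag) pairs in document order."""
    root = ET.fromstring(xml_text)
    return [(w.text, w.get("C")) for w in root.iter("W")]


def groups(xml_text, group_name):
    """Return the tokens of each group element (e.g. "NounGroup") as a list."""
    root = ET.fromstring(xml_text)
    return [[w.text for w in g.iter("W")] for g in root.iter(group_name)]


if __name__ == "__main__":
    for token, tag in tokens_with_tags(SAMPLE):
        print(token, tag)
    print(groups(SAMPLE, "NounGroup"))
```

The same traversal works for "VerbGroup" elements by passing that name to groups. If the real output leaves attribute values unquoted, as in the example above, it would need to be normalised or read with a more lenient parser before this sketch applies.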

