


NLProcessor - Text Analysis Toolkit

NLProcessor by Infogistics is the successor to a set of Natural Language Processing technologies developed in the 1990s at the University of Edinburgh. It is an engine that handles so-called "low-level" text processing routines: tokenisation, capitalised word normalisation, sentence segmentation, part-of-speech tagging and syntactic chunking. These are necessary steps in building many kinds of text handling applications.

NLProcessor outputs linguistic information by marking up the text directly with XML tags: tokens are represented as "W" elements, with their word-class (part-of-speech) information in the "C" attribute; noun and verb groups are marked as NounGroup and VerbGroup elements; and sentences are marked with "S" elements. For example:
<S>
<NounGroup> <W C="NNP">John</W> </NounGroup>
<VerbGroup>
<W C="VBZ">has</W> <W C="VBN">been</W> <W C="VBD">given</W>
</VerbGroup>
<NounGroup>
<W C="CD">25</W> <W C="NNS">bricks</W>
</NounGroup>
<W C=".">.</W>
</S>
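
Markup of this kind can be read back with any standard XML parser. The following is a minimal sketch in Python, not part of NLProcessor itself: it assumes the output is well-formed XML with quoted attribute values, and the helper names tagged_tokens and groups are hypothetical, introduced here only for illustration.

```python
import xml.etree.ElementTree as ET

# Sample sentence in the markup described above.
# Assumption: attribute values are quoted, so a standard XML parser accepts it.
SAMPLE = """<S>
<NounGroup> <W C="NNP">John</W> </NounGroup>
<VerbGroup>
<W C="VBZ">has</W> <W C="VBN">been</W> <W C="VBD">given</W>
</VerbGroup>
<NounGroup>
<W C="CD">25</W> <W C="NNS">bricks</W>
</NounGroup>
<W C=".">.</W>
</S>"""


def tagged_tokens(sentence):
    """Return (word, part-of-speech) pairs for every W element, in document order."""
    return [(w.text, w.get("C")) for w in sentence.iter("W")]


def groups(sentence, kind):
    """Return each chunk of the given kind (e.g. "NounGroup") as a space-joined string."""
    return [" ".join(w.text for w in g.iter("W")) for g in sentence.iter(kind)]


sentence = ET.fromstring(SAMPLE)
print(tagged_tokens(sentence))
print(groups(sentence, "NounGroup"))
print(groups(sentence, "VerbGroup"))
```

Because the linguistic information is carried entirely by element names and the "C" attribute, an application can pick out only the layer it needs (tokens, tags, or chunks) without any NLProcessor-specific code.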


Copyright 2000, 2001 Infogistics Ltd. All rights reserved.