WIT Press


“NIBA – TAG” - A Tool For Analyzing And Preparing German Texts

Price

Free (open access)

Paper DOI

10.2495/DATA020331

Volume

28

Pages

Published

2002

Size

339 kb

Author(s)

G Fliedl & G Weber

Abstract

G. Fliedl, G. Weber, University of Klagenfurt, Department for Business Informatics and Application Systems (IWAS), Austria

NIBA-TAG is a multilevel natural language tagger with rich functionality. It functions as a word stemmer, a morphological parser and a conventional POS tagger, which uses syntactic and semantic features for contextually influenced word tagging. Each rule relies on a ranking mechanism that currently uses the levels “fact”, “proposal” and “guess”. One of the postprocessing units analyzes the ranking structure and can change a “proposal” to a “fact” if enough rules made an identical proposal for a word. The default output is XML, where the level of precision can be specified. One could thus generate an XML file that includes only the guesses, or a file with all attributes relevant for the status of a proposal.

1 Introduction

The automated analysis of language is important for many tasks in computer science. A mass of information exists in unstructured texts. We developed our tagging tool to analyze all kinds of unstructured text. It is a very efficient instrument for filtering out the STRUCTURE OF CONTENT. In textual analysis, no satisfactory results are known so far, because approaches have relied mainly on statistical methods (Kupiec [1]) or on reduced linguistic functionality. We therefore concentrated on developing a general tool for multilevel tagging. The system was implemented in Perl and Prolog. On a test corpus of 15423 sentences (164380 words), tagging took 1450 seconds, roughly 113 words per second. Linguistic analysis is done according to the NTMS model, which stands for
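
The ranking mechanism and the level-filtered XML output described in the abstract can be illustrated with a minimal sketch. It is written in Python for brevity and is not the authors' Perl/Prolog implementation. The function names, the promotion threshold `MIN_AGREEING_RULES`, the tie-breaking order, and the XML element and attribute names (`text`, `w`, `pos`, `level`) are all assumptions made for illustration; the abstract only states that "enough rules" must agree and that the output precision can be specified.

```python
from collections import Counter
import xml.etree.ElementTree as ET

# Hypothetical threshold: the abstract only says "enough rules" must agree.
MIN_AGREEING_RULES = 2


def promote_proposals(rule_outputs, min_agreeing=MIN_AGREEING_RULES):
    """Resolve one word's tag from a list of (tag, level) rule outputs.

    Assumed policy: an existing "fact" wins; otherwise the most frequent
    "proposal" is promoted to "fact" once at least `min_agreeing` rules
    proposed it; otherwise the first "guess" is kept as a guess.
    """
    for tag, level in rule_outputs:
        if level == "fact":
            return tag, "fact"

    proposals = Counter(tag for tag, level in rule_outputs if level == "proposal")
    if proposals:
        tag, votes = proposals.most_common(1)[0]
        level = "fact" if votes >= min_agreeing else "proposal"
        return tag, level

    guesses = [tag for tag, level in rule_outputs if level == "guess"]
    if guesses:
        return guesses[0], "guess"
    return "UNKNOWN", "guess"


def to_xml(tagged_words, levels=("fact", "proposal", "guess")):
    """Serialize (word, tag, level) triples, keeping only the chosen levels.

    The element and attribute names are illustrative, not NIBA-TAG's schema.
    """
    root = ET.Element("text")
    for word, tag, level in tagged_words:
        if level in levels:
            element = ET.SubElement(root, "w", {"pos": tag, "level": level})
            element.text = word
    return ET.tostring(root, encoding="unicode")
```

For example, `promote_proposals([("NN", "proposal"), ("NN", "proposal"), ("VV", "proposal")])` resolves to the tag `NN` at level `fact` under the assumed threshold of two, and `to_xml(words, levels=("guess",))` would emit a file containing only the guesses.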

Keywords