WIT Press


High Performance Environment For Knowledge Discovering In Portuguese Language Texts In The Web

Price

Free (open access)

Paper DOI

10.2495/DATA060211

Volume

37

Pages

8

Published

2006

Size

339 kb

Author(s)

V. M. Bastos & N. F. F. Ebecken

Abstract

This paper describes the development and implementation of a practical and efficient methodology for constructing a knowledge extraction environment that supports searching for information on Portuguese-language Web sites. The application includes several text mining facilities, such as identification of similarities and differences between pages and sites, content classification, and document clustering. The conception of the application originated in an environment for evaluating competitive intelligence tasks over the Web. The increasing availability of information on the Web has motivated the proposal of an environment that presents these solutions in an integrated form, supplying an analysis of the results according to the user's specifications.

1 Introduction

With the increasing availability of information on the Web, identifying and discovering useful information has become an onerous, time-consuming task. Searching for value-added information in large masses of data has become a reality. An automatic tool is increasingly needed to evaluate this content and bring to the consultants' attention the relevant information, the classification of individual documents, or even the strategic knowledge that is fundamental to decision making. Text mining tools permit the extraction of information, which must be understandable, as accurate as possible, and surprising. The information can be understood through the presentation of knowledge in the form of

Keywords

Web mining, business applications, knowledge discovering.