WIT Press

Opinion Classification Through Information Extraction











L Dini & G Mazzini


In this paper we present an application of information extraction technologies to data mining. The goal of the system is to produce reliable reports on the opinions that customers express about companies and/or products. Information extraction is regarded as an enabling technology by which structured databases can be populated from unstructured texts available on the web. The approach can be considered an alternative to standard text mining techniques: rather than applying data mining algorithms to textual input, we propose to apply syntactic and semantic processing in order to uncover structured information that abstracts completely from the linear order of words and from language-dependent constructions.

1 Introduction

1.1 Information sources for customer opinion discovery

As CRM becomes an increasingly important success factor for many companies, there is a growing need to keep track of indirect customer opinions, i.e. opinions that are addressed not directly to the company but to some “third party” institution. For instance, there are sites that collect customers' opinions about certain products and services, possibly providing legal advice. Complaints and appraisals are usually archived and made available to other customers (see for instance the site of the Italian Association for the Rights of Customers (www.aduc.it) or the commercial site www.complaints.com). Even more prominently, users tend to “freely associate” into opinion-sharing communities such as newsgroups, forums, chat communities, etc. The set of documents stored in these information containers is a valuable source of informa-