WIT Press


Mining Linguistic Information Into An E-retail System

Price

Free (open access)

Volume

28

Published

2002

Size

758 kb

Paper DOI

10.2495/DATA020671

Copyright

WIT Press

Author(s)

M T Pazienza & M Vindigni

Abstract

Building more adaptive SW applications is crucial to scaling Information Technology up to the Web, where information is organized according to different underlying knowledge and/or presentation models. To manage heterogeneous information sources efficiently, agents must be able to cooperate, share their knowledge, and agree upon the appropriate terminology to use during interaction. We describe here an e-retail product comparison agent system that aims to supply users with summary information on the products that best fit their inquiries. We focus on the Named Entity Recognition and Classification (NERC) component, which is very helpful in identifying relevant characteristics in a (multilingual) product description.

1 Introduction

The continuous growth of Web access and e-commerce transactions is producing a new generation of sites: e-retail portals, which aim to help end users choose among different products according to their needs by presenting the products in a uniform way that makes comparison easier. A number of commercial agent-based systems already exist that help Internet shoppers decide what to buy and where to buy it. Most of these agents extract relevant data from on-line product descriptions, then summarize the results and present them to the final user in a synthetic form. They do not use natural language technologies, and hence process only strictly structured texts, where product names, prices, and other features always appear in a fixed (or at least regular) order, making it possible to use the page structure and/or mark-up tags as content delimiters. Moreover, they often assume that pages are expressed in a uniform, monolingual manner (usually in English), which is unsuitable for a multilingual society such as the European one. This motivates spreading approaches and techniques to cover

Keywords