WIT Press


A Neural-based Text Summarization System

Price

Free (open access)

Volume

37

Pages

8

Published

2006

Size

347 kb

Paper DOI

10.2495/DATA060191

Copyright

WIT Press

Author(s)

S. P. Yong, A. I. Z. Abidin & Y. Y. Chen

Abstract

The number of electronic documents serving as a medium for business and academic information has increased tremendously since the introduction of the World Wide Web. Since then, users have inevitably been overloaded with electronic textual information. Users may be interested only in shorter versions of text documents, yet they are confronted with lengthy texts. The objective of the study is to develop a text summarization system that incorporates learning ability by combining a statistical approach, keyword extraction, and a neural network with unsupervised learning. Once trained on sufficient text samples, the system is able to learn to classify sentences. Users with a strong background in writing English summaries subjectively evaluated the outputs of the text summarization system based on content. With an average content score of 83.03%, the system is regarded as producing effective summaries that extract most of the important content of the original text without compromising the summary’s readability.

1 Introduction

The proliferation of electronic documents as a medium for business and academic information on the World Wide Web has resulted in users being overloaded by electronic texts. Although users can sort through the documents using various search engines, the engines usually provide only a poor approximation. The engines show only the initial lines of each document. Users who do not use keywords effectively when searching for an intended document might come across a vast quantity of hyperlinks. Text summarization is an emerging field at the intersection of several research areas, including natural language processing, machine learning and information retrieval. It is essential to be able to extract the gist of electronic documents with a text summarization system in order to fully utilize these documents

Keywords

keyword extraction, neural network, unsupervised learning.