WIT Press


Data Scale Reduction Via Instances Summarization Using The Rough Set Theory

Price

Free (open access)

Paper DOI

10.2495/DATA000271

Volume

25

Pages

10

Published

2000

Size

1,043 kB

Author(s)

G. Gaumer & M. Quafafou

Abstract

Currently, the major obstacle encountered when applying Data Mining algorithms to real-life data is the inability of these algorithms to handle very large data sets, such as those stored in industrial databases. Developing new algorithms that require less memory and processing time would certainly help to solve this problem. Here, however, we follow another route to a solution: reducing the size of the input data. We present in this article our new system, CFSumm, which is dedicated to data summarization, considered as a pre-processing step before the use of a Data Mining tool. The basic idea of this method is to summarize several sufficiently similar instances by a weighted pseudo-instance that can replace them in further processing. We explain in this article how the α-Rough Set Theory framework allows a

Keywords