On High Dimensional Data Spaces

S Dey; S A Roberts

doi:10.2495/DATA020251

WIT Press

Price

Free (open access)

Transaction

WIT Transactions on Information and Communication Technologies

Volume

Pages

Published

2002

Size

663 kb

Paper DOI

10.2495/DATA020251

Author(s)

S Dey & S A Roberts

Abstract

Data mining applications usually encounter high dimensional data spaces. Most of these dimensions contain ‘uninteresting’ data, which is not only of little value for the discovery of rules or patterns but has also been shown to mislead some classification algorithms. Since the computational effort increases very significantly (usually exponentially) in the presence of a large number of attributes, it is highly desirable that all irrelevant attributes be weeded out at an early stage. Often, patterns of interest are embedded in lower dimensional subspaces of the data. If the data space S has k attributes {a1, a2, ..., ak}, then an n-dimensional subspace s of S can be formed by selecting a combination of n attributes from the set {a1, a2, ..., ak}, where n < k. It is usual to tackle this problem by having the user (or domain experts) identify some attributes and subspaces. For even a moderately large number of attributes, however, the number of possible subspaces is so large that it is quite unlikely that the ‘experts’ would be able to identify all the ‘interesting’ subspaces.

1 Introduction

The general problem, known as ‘the curse of high dimensionality’, has been studied extensively, and several automatic methods for reduction of dimensionality have been reported in the literature. Data mining applications require that: the results be comprehensible by the end-user; data distributions potentially non-conformant to any of the canonical forms be handled; and potentially ‘interesting’ subspaces (as opposed to a subset of the original attributes) be identified.
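The growth in the number of candidate subspaces described in the abstract can be illustrated with a short sketch. It is not taken from the paper: the function names `count_subspaces` and `enumerate_subspaces` are hypothetical, and the sketch assumes that a subspace is any choice of n attributes with 1 <= n < k, so that the number of n-dimensional subspaces is the binomial coefficient C(k, n).

```python
from itertools import combinations
from math import comb


def count_subspaces(k: int) -> int:
    """Total number of n-dimensional subspaces for 1 <= n < k.

    Assumption (not stated in the paper): the empty selection and the
    full data space itself are excluded, giving 2**k - 2 in total.
    """
    return sum(comb(k, n) for n in range(1, k))


def enumerate_subspaces(attributes, n):
    """List every n-attribute subspace as a tuple of attribute names."""
    return list(combinations(attributes, n))


if __name__ == "__main__":
    # Hypothetical attribute names, following the abstract's a1..ak notation.
    attrs = ["a1", "a2", "a3", "a4"]
    print(enumerate_subspaces(attrs, 2))
    for k in (4, 10, 20):
        print(k, count_subspaces(k))
```

Under these assumptions, k = 20 attributes already yield 1,048,574 candidate subspaces, which suggests why manual identification by domain experts becomes impractical.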
