WIT Press


CloNI: Clustering of Square Root of N-Interval Discretization

Price

Free (open access)

Paper DOI

10.2495/DATA030221

Volume

29

Pages

10

Published

2003

Size

424 kb

Author(s)

C. Ratanamahatana

Abstract

C. Ratanamahatana, Department of Computer Science, University of California, Riverside, USA

It is known that the naive Bayesian classifier typically works well on discrete data. All continuous attributes therefore need to be discretized beforehand for such applications. An inappropriate range of discretization intervals may degrade performance. In this paper, we review previous work on continuous feature discretization and conduct an empirical evaluation of an improved method called Clustering of √N-Interval Discretization (CloNI). CloNI tries to reduce the number of √N intervals in the datasets by iteratively combining two consecutive intervals, according to their median distance, until a stopping criterion is met. We also show that even though C4.5 decision trees can handle continuous features, we can significantly improve their performance in some domains if those features are discretized in advance. In our empirical
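The merging step described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's implementation: the equal-frequency initial split, the function name `cloni_sketch`, and the `min_intervals` parameter used as a stopping criterion are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
from statistics import median


def cloni_sketch(values, min_intervals=2):
    """Illustrative merge loop: start from about sqrt(N) intervals, then
    repeatedly merge the two consecutive intervals whose medians are closest.

    Assumptions (not taken from the paper): the initial split is
    equal-frequency, and merging stops at a fixed interval count.
    """
    data = sorted(values)
    n = len(data)
    if n == 0:
        return []

    # Roughly sqrt(N) initial intervals, each holding about sqrt(N) values.
    k = max(1, math.isqrt(n))
    bounds = [i * n // k for i in range(k + 1)]
    intervals = [data[bounds[i]:bounds[i + 1]] for i in range(k)]

    # Hypothetical stopping criterion: a target number of intervals.
    while len(intervals) > min_intervals:
        gaps = [
            median(intervals[i + 1]) - median(intervals[i])
            for i in range(len(intervals) - 1)
        ]
        i = gaps.index(min(gaps))  # closest pair of consecutive intervals
        intervals[i:i + 2] = [intervals[i] + intervals[i + 1]]

    return intervals
```

Calling `cloni_sketch(column, min_intervals=3)` on one continuous attribute returns the groups of values that would share a discrete label; the cut points can be read off the first and last value of each group.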

Keywords