WIT Press


Novel Pruning Based Hierarchical Agglomerative Clustering For Mining Outliers In Financial Time Series

Price

Free (open access)

Paper DOI

10.2495/CF080041

Volume

41

Pages

10

Page Range

33 - 42

Published

2008

Size

468 kb

Author(s)

D. Wang, P. J. Fortier & H. E. Michel

Abstract

Investors must make informed decisions using partial and imperfect information. As the accuracy and completeness of the information held by an investor rise, so does the probability of making better decisions. Similarity search based outlier detection in financial time series is key to making better decisions in many investment strategies and portfolio management techniques. This motivates the use of numerous data mining techniques to discover similarities in massive financial time series data pools. This research introduces a novel pruning based Hierarchical Agglomerative Clustering (HAC) algorithm to search for similarity among financial time series in high dimensional space, using securities in the S&P500 index as experimental data. The algorithm is based on vertical and horizontal dimension reduction algorithms [11] and a unique similarity measurement definition [12] with the time value concept. This paper presents a series of experimental results that illustrate the effectiveness of the algorithm.

1 Introduction

We propose a novel similarity search in high dimensional financial time series by using a pruning based HAC algorithm. The similarity search is performed after dimensionality reduction, which consists of an Attributes Selection (AS) algorithm [11] and a Piecewise Linear Representation (PLR) based Segmentation

Keywords

outlier, data mining, computational finance, financial time series, similarity search, high dimension, clustering.
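
Illustrative sketch

The abstract describes clustering financial time series to find outliers. The Python sketch below is only a generic illustration of that idea, not the authors' algorithm: it runs plain average-linkage hierarchical agglomerative clustering with a distance cut-off and flags series left in singleton clusters. The paper's pruning rule, its dimension reduction steps [11] and its similarity measure [12] are not reproduced here. The Euclidean distance, the `max_distance` threshold, the function names and the ticker data are all assumptions made for the example.

```python
from itertools import combinations
from math import sqrt


def euclidean(a, b):
    """Euclidean distance between two equal-length series (an assumed metric)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(c1, c2, series):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(euclidean(series[i], series[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def hac_outliers(series, max_distance):
    """Merge the closest clusters until no pair is within max_distance.

    Returns (clusters, outliers); outliers are names left in singleton clusters.
    """
    clusters = [[name] for name in series]
    while len(clusters) > 1:
        # Find the closest pair of clusters under average linkage.
        d, i, j = min(
            (average_linkage(clusters[i], clusters[j], series), i, j)
            for i, j in combinations(range(len(clusters)), 2)
        )
        if d > max_distance:
            break  # no remaining pair is similar enough to merge
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    outliers = [c[0] for c in clusters if len(c) == 1]
    return clusters, outliers


# Hypothetical toy data: short daily-return series for four made-up tickers.
returns = {
    "AAA": [0.01, 0.02, -0.01, 0.00],
    "BBB": [0.01, 0.02, -0.01, 0.01],
    "CCC": [0.00, 0.02, -0.02, 0.00],
    "ZZZ": [0.20, -0.15, 0.30, -0.25],
}
clusters, outliers = hac_outliers(returns, max_distance=0.05)
```

With this toy data the three similar series merge into one cluster and the fourth stays alone, so it is the only one flagged. The naive search over all cluster pairs is fine for a handful of series but would be far too slow for an index-sized universe, which is presumably where pruning and dimension reduction become important.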