Novel Pruning Based Hierarchical Agglomerative Clustering For Mining Outliers In Financial Time Series
Free (open access)
33 - 42
D. Wang, P. J. Fortier & H. E. Michel
Investors must make informed decisions using partial and imperfect information. As the accuracy and completeness of the information held by an investor rise, so does the probability of making better decisions. Similarity-search-based outlier detection in financial time series is key to making better decisions in many investment strategies and portfolio management techniques. This motivates the use of numerous data mining techniques to discover similarities in massive financial time series data pools. This research introduces a novel pruning-based Hierarchical Agglomerative Clustering (HAC) algorithm that searches for similarity among financial time series in high-dimensional space, using securities in the S&P 500 index as experimental data. The algorithm is based on vertical and horizontal dimension reduction algorithms and a unique similarity measurement definition with the time value concept. This paper presents a series of experimental results that illustrate the effectiveness of the algorithm.
Keywords: outlier, data mining, computational finance, financial time series, similarity search, high dimension, clustering.
1 Introduction
We propose a novel similarity search in high-dimensional financial time series using a pruning-based HAC algorithm. The similarity search is performed after dimensionality reduction, which consists of an Attributes Selection (AS) algorithm and a Piecewise Linear Representation (PLR) based Segmentation