Clustering Of Time Series Using A Similarity Between Segments And Bands Determined By Patterns Of Technical Analysis
R. Basagoiti & E. Juaristi
Segment-based representations, continuous or discontinuous, have been used for dimensionality reduction of temporal data. This reduction is essential for the subsequent data mining process. In this work, patterns of technical analysis, like those defined in Lo et al. (2002), are used to extract extremes, from which two different representations are built. The first, STA, is based on segments drawn between the extremes of the patterns. The second, FTA, is based on the best harmonic of the Fourier decomposition between the same extremes. Because technical analysis is subjective in nature, some parameters are considered in the process of extreme extraction and pattern selection. Once a representation is adopted, similarities between segments, as defined in Keogh et al. (2002), and between bands, as suggested by us, are used for clustering, and the results are compared with those obtained with the Euclidean distance and the lower bound of the dynamic time warping distance defined in Keogh.

Keywords: data approximation, dimensionality reduction, time series clustering, pattern extraction.

1 Introduction

With the rapid increase of stored data, interest in the discovery of hidden information has exploded in the last decade. The focus has mainly been on classification, clustering, query by content and relationship finding. Treating data with temporal dependencies is an important problem. A time series is a sequence of real values, each of which represents a value measured at a point in time. Examples of time series data can be found in diverse sources and applications, such as stock prices and currency exchange data. There has been