Outlier Detection Based On Projection-based Ordering
Free (open access)
A. Shojaie & P.-N. Tan
Simple one-dimensional order statistics have played an important role in one-dimensional data analysis. However, extensions of these methods to high-dimensional data have not been properly addressed. Even though ordering is of natural interest in many problems, the lack of natural and unambiguous methods to order or rank multi-dimensional data has hindered the adoption of these methods for more complex applications. Projection-based depth functions are mainly based on the idea of finding a “centre-outward” ordering from the “deepest” point of the data. These functions provide promising tools for ordering multi-dimensional data and may also be used for outlier detection and cluster analysis. This paper develops an outlier-detection algorithm based on a new method for projection-based ordering of multi-dimensional data.

Keywords: outlier detection, projection-based ordering, data mining.

1 Introduction

Outliers are data points whose characteristics are very different from the rest of the data with respect to some measure. There are two potential benefits of applying outlier detection techniques: (1) data pre-processing and (2) anomaly detection. Many data mining methods and statistical measures are susceptible to the presence of outliers. It is therefore crucial to detect and eliminate outliers in the pre-processing step. The mean is an example of a statistical measure that may not effectively represent the location of data points when outliers are present. Many
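The sensitivity of the mean to outliers, noted in the introduction, can be illustrated with a small sketch. The sample values below are hypothetical and chosen only for illustration; they do not come from the paper. The median, a simple one-dimensional order statistic, is shown alongside the mean for comparison.

```python
import statistics

# Hypothetical sample: five ordinary readings clustered around 10.
clean = [9.8, 10.1, 10.0, 9.9, 10.2]

# The same sample with a single gross outlier appended.
contaminated = clean + [100.0]


def summarize(values):
    """Return the (mean, median) pair for a list of numbers."""
    return statistics.mean(values), statistics.median(values)


clean_mean, clean_median = summarize(clean)
dirty_mean, dirty_median = summarize(contaminated)

# Print both location estimates before and after contamination.
print(f"clean:        mean={clean_mean:.2f}, median={clean_median:.2f}")
print(f"contaminated: mean={dirty_mean:.2f}, median={dirty_median:.2f}")
```

With these values, one outlier moves the mean from about 10 to about 25, while the median stays close to 10. This is the sense in which the mean may fail to represent the location of the data when outliers are present.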