Outlier Detection And Data Association For Data Mining Criminal Incidents
Free (open access)
S Lin & D E Brown
Outlier detection has been extensively studied in the field of statistics, and a number of discordancy tests have been developed. Most of these studies treat outliers as “noise” and try to eliminate their effects, either by removing them or by developing outlier-resistant methods. In data mining, however, we consider outliers “meaningful input signals” rather than “noise”. In some cases, outliers represent unique characteristics of the objects, which are important to an organization. Law enforcement is one area where outlier detection is critically important. In law enforcement, we want to associate criminal incidents committed by the same person or group and detect outliers from this behavior. The purpose of this paper is twofold: to describe an outlier detection technique and to propose a data association method based upon this technique. We focus our analysis on categorical data, since these data are typically found in crime analysis. First, we develop an outlier score function, which measures the “extremeness” level of observations. Then we discuss some of the properties of this outlier score function. Finally, we describe a data association method based on the outlier score function. We apply our data association method to a real robbery data set from Richmond, Virginia, USA. Results show that the approach is promising.
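The abstract does not define the outlier score function, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a simple frequency-based score: each categorical value is scored by its surprisal, -log(relative frequency), and a record's score is the sum over its attributes. The function names, the threshold, and the sample incidents are all hypothetical and invented for illustration.

```python
from collections import Counter, defaultdict
import math


def value_surprisal(records):
    """For each attribute, map each value to -log(relative frequency).

    Assumption: rarity of a categorical value stands in for "extremeness".
    This is a swapped-in stand-in, not the score function from the paper.
    """
    n = len(records)
    n_attrs = len(records[0])
    tables = []
    for j in range(n_attrs):
        counts = Counter(r[j] for r in records)
        tables.append({v: -math.log(c / n) for v, c in counts.items()})
    return tables


def outlier_scores(records):
    """Score each record as the sum of the surprisals of its values."""
    tables = value_surprisal(records)
    return [sum(tables[j][v] for j, v in enumerate(r)) for r in records]


def candidate_associations(records, threshold):
    """Group record indices that share an unusual attribute value.

    A value counts as unusual when its surprisal is at least `threshold`
    (a hypothetical tuning parameter). Only groups with two or more
    records are returned, since a single record cannot be associated.
    """
    tables = value_surprisal(records)
    groups = defaultdict(list)
    for i, r in enumerate(records):
        for j, v in enumerate(r):
            if tables[j][v] >= threshold:
                groups[(j, v)].append(i)
    return {key: idx for key, idx in groups.items() if len(idx) > 1}


# Made-up incidents: (weapon, time of day, disguise). Not real data.
incidents = [
    ("handgun", "night", "mask"),
    ("handgun", "night", "none"),
    ("handgun", "day", "none"),
    ("knife", "night", "none"),
    ("handgun", "night", "none"),
    ("knife", "day", "mask"),
]

scores = outlier_scores(incidents)
associations = candidate_associations(incidents, threshold=1.0)
```

In this sketch, the incident whose values are all uncommon receives the highest score, and incidents that share an uncommon value (for example, the same rare weapon) are grouped as candidates for association. A real application would need a principled choice of score and threshold.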