A Method For Association Rule Quality Evaluation Based On Information Theory
D. Sitnikov, E. Titova & O. Ryabov
The concept of patterns that represent functional, logical and other dependencies in data underlies Data Mining technology. Association rules are one of the most widespread forms for representing discovered knowledge patterns. A method for evaluating an association rule from the viewpoint of information theory is proposed. It allows a generalized characteristic of associations, based on mutual information, to be calculated from the well-known association rule parameters: Support, Confidence and Improvement. This characteristic complements the traditional association parameters and induces a linear order on the set of associations, which is useful for evaluating and filtering the dependencies obtained. In addition, we analyse how the self-descriptiveness of an association rule depends on the standard parameters.

1 Introduction

A general definition of association rules, suggested in earlier work, is as follows. Let $L = \{I_1, I_2, \ldots, I_m\}$ be a set of object features, and let $T$ be a set of records. Each record $t$ is represented by a binary vector: $t[k] = 1$ if $t$ contains the feature $I_k$, and $t[k] = 0$ if $t$ does not contain the feature $I_k$ ($k = 1, \ldots, m$). Let $X$ be a subset of features from $L$, i.e. $X \subseteq L$. We say that the record $t$ satisfies $X$ if $t[k] = 1$ for all $I_k \in X$. An association rule is an expression of the form $X \rightarrow Y$, where $X \subseteq L$, $Y \subseteq L$ and $X \cap Y = \emptyset$.
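The definitions above can be illustrated with a minimal sketch. The feature names and records are invented for illustration, and the formulas for Support, Confidence and Improvement are the commonly used textbook ones (Improvement is assumed here to be what is often called lift). This section of the text does not spell these formulas out, so treat them as assumptions rather than as the authors' exact definitions.

```python
from fractions import Fraction

# Hypothetical feature set L = {I_1, ..., I_m} with m = 4 (names are illustrative).
FEATURES = ["bread", "milk", "butter", "jam"]

# Each record t is a binary vector: t[k] == 1 iff the record contains feature I_k.
# Indices are 0-based here, whereas the text uses k = 1, ..., m.
RECORDS = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 1, 1, 1],
]


def satisfies(t, x):
    """True if record t satisfies the feature subset x (a set of indices)."""
    return all(t[k] == 1 for k in x)


def support(records, x):
    """Fraction of records that satisfy x (assumed standard definition)."""
    return Fraction(sum(satisfies(t, x) for t in records), len(records))


def confidence(records, x, y):
    """Support of X and Y together, divided by the support of X."""
    if x & y:
        raise ValueError("a rule X -> Y requires X and Y to be disjoint")
    return support(records, x | y) / support(records, x)


def improvement(records, x, y):
    """Confidence of X -> Y divided by the support of Y (assumed to equal lift)."""
    return confidence(records, x, y) / support(records, y)


# Rule {bread} -> {butter}
X, Y = {0}, {2}
rule_support = support(RECORDS, X | Y)
rule_confidence = confidence(RECORDS, X, Y)
rule_improvement = improvement(RECORDS, X, Y)
```

Exact fractions are used instead of floats so that the three parameters can be compared without rounding noise. The sketch does not guard against a rule whose antecedent or consequent has zero support, which would need separate handling.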