The Association Rule Mining Context

(File Last Modified Wed, May 29, 2002.)


Brief Notes on Association Rules

Description

  • Association rules describe ``correlations of events'': events that occur together with greater frequency than they would if they were independent of one another.
  • Events can be a combination of attribute values or items.
  • Typical application example: analysis of basket data (a database of customer transactions, where each transaction contains the items purchased by a customer in one visit).

Properties of the rule

   antecedent (LHS): a set of one or more attribute values or items.
   consequent (RHS): a single attribute value or item.

Each association rule indicates that the frequency with which the RHS occurs in the data is higher among examples that have the LHS attribute values than among those that do not.

Formal model and measurements

    Let I = {i1, i2, ..., im} be a set of literals, called items. 
    Let T be a database of transactions, where each transaction is represented as a vector of items.
    Let X (LHS) be a set of items in I.
    Let Y (RHS) be a set of items in I, where X and Y do not overlap.
    An association rule is an implication of the form X --> Y (LHS --> RHS).

Measurements:

  • Confidence--the rule holds with confidence c if and only if c% of transactions in T that contain X also contain Y:
             c = |X and Y| / |X|
  • Support--the rule has support s if and only if s% of transactions in T contain both X and Y:
             s = |X and Y| / |T|
  • Coverage--the proportion of examples covered by the LHS:
             coverage = |LHS| / |T|
  • Lift--the confidence divided by the proportion of all examples that are covered by the RHS:
             lift = confidence x |T| / |RHS|
  • Leverage--the proportion of additional examples covered by both the LHS and RHS above those expected if the LHS and RHS were independent of each other:
             leverage = ( |LHS and RHS| / |T| ) - ( |LHS| / |T| ) x ( |RHS| / |T| )
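The five measures above can be checked with a tiny worked example. The transaction database and the rule {bread} --> {milk} below are made up purely for illustration, and counting is done naively (one scan per itemset):

```python
# Toy transaction database: each transaction is a set of items.
T = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def count(itemset):
    """Number of transactions containing every item in `itemset`."""
    return sum(1 for t in T if itemset <= t)

def metrics(lhs, rhs):
    n = len(T)
    both = count(lhs | rhs)                  # transactions with LHS and RHS
    confidence = both / count(lhs)           # |LHS and RHS| / |LHS|
    support = both / n                       # |LHS and RHS| / |T|
    coverage = count(lhs) / n                # |LHS| / |T|
    lift = confidence * n / count(rhs)       # confidence / (|RHS| / |T|)
    leverage = both / n - (count(lhs) / n) * (count(rhs) / n)
    return confidence, support, coverage, lift, leverage

c, s, cov, lift, lev = metrics({"bread"}, {"milk"})
```

Here bread appears in 3 of 4 transactions and bread together with milk in 2, so the confidence is 2/3 and the support 0.5.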

Algorithm

The task of finding association rules can be decomposed into two subproblems:

  • Step 1. Find large itemsets (support above minimum support) -- APRIORI.
  • Step 2. Generate all association rules from the large itemsets (confidence above minimum confidence).

Differences between association rules and ML

  • Goal: association rule mining aims to find all rules that describe associations between sets of items, while ML aims to discover classification rules for a single, pre-specified class attribute.
  • Algorithm: association rule mining emphasizes processing large training sets. Its algorithms tend to access data via the database, minimizing the number of passes through it.

General Association Rule Mining

In order to reflect the different importance of, or different emphasis on, each item, it is desirable to assign weights to individual items. This results in a generalized version of association rule mining. The problem has been addressed by C.H. Cai et al. as mining association rules with weighted items.
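One way to fold item weights into the support measure is sketched below. The scheme (weighted support = average item weight times plain support) and the weights themselves are illustrative assumptions, not necessarily Cai et al.'s exact definition:

```python
# Toy transactions and hypothetical per-item weights.
T = [{"tv", "cable"}, {"tv"}, {"radio"}, {"tv", "cable"}]
weights = {"tv": 0.2, "cable": 0.9, "radio": 0.1}

def weighted_support(itemset):
    """Plain support scaled by the average weight of the items (an
    illustrative scheme, not Cai et al.'s exact formula)."""
    plain = sum(1 for t in T if itemset <= t) / len(T)
    avg_w = sum(weights[i] for i in itemset) / len(itemset)
    return avg_w * plain

# {"tv", "cable"} appears in 2 of 4 transactions, avg weight (0.2 + 0.9) / 2
ws = weighted_support({"tv", "cable"})
```

Under such a scheme a rare but heavily weighted item can outrank a frequent but unimportant one.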

Classical Association Rule Mining: Equal Weights

When all item weights are equal, general association rule mining reduces to typical association rule mining. The Apriori algorithm (R. Agrawal) and its derivatives have been recognized as the benchmark for this problem. Apriori uses a two-step mining strategy:

  • Step 1: Find all large itemsets whose support > minimum support. This relies on the downward-closure property: if an itemset {i1, i2, ..., ik} is large (i.e., support > minimum support), then so is every one of its subsets.
  • Step 2: For each large itemset, generate all rules whose confidence > minimum confidence.
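The two steps above can be sketched as a minimal level-wise search. The data and thresholds are made up, and the candidate-generation step omits Apriori's subset-pruning refinement for brevity:

```python
from itertools import combinations

# Toy transaction database and thresholds (illustrative values).
T = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
MIN_SUPPORT, MIN_CONF = 0.4, 0.7

def support(itemset):
    return sum(1 for t in T if itemset <= t) / len(T)

# Step 1: level-wise search for large itemsets. By downward closure,
# size-(k+1) candidates are built only from large size-k itemsets.
items = sorted({i for t in T for i in t})
level = [frozenset([i]) for i in items if support(frozenset([i])) >= MIN_SUPPORT]
large = list(level)
while level:
    candidates = {a | b for a in level for b in level if len(a | b) == len(a) + 1}
    level = [c for c in candidates if support(c) >= MIN_SUPPORT]
    large.extend(level)

# Step 2: from each large itemset, emit every rule LHS --> RHS whose
# confidence clears the threshold.
rules = []
for itemset in large:
    for r in range(1, len(itemset)):
        for lhs in map(frozenset, combinations(itemset, r)):
            if support(itemset) / support(lhs) >= MIN_CONF:
                rules.append((set(lhs), set(itemset - lhs)))
```

On this data {a, b, c} is large at support 0.4, and {a} --> {b} survives at confidence 0.75 while {a} --> {b, c} is pruned at 0.5.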

Other researchers have adapted the algorithm to reduce computational overhead under certain conditions. For example, Geoffrey Webb proposed an efficient search strategy (the OPUS algorithm) for the case when all data can be held in memory.

Class Association Rules: RHS=class

If we use classified data as input and restrict the RHS to the class attribute, mining returns a special subset of association rules called class association rules. This makes it possible to integrate classification and association rule mining. As an example, Bing Liu et al. present a framework for building an accurate classifier from class association rules.
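The restriction itself is simple to state in code: keep only rules whose RHS is a class label. The rule list and attribute names below are a made-up illustration:

```python
# Hypothetical class labels and a few already-mined rules (LHS, RHS).
CLASSES = {"class=yes", "class=no"}
rules = [
    ({"age=young"}, {"class=yes"}),
    ({"age=young"}, {"income=high"}),       # RHS is not a class: dropped
    ({"income=high", "age=old"}, {"class=no"}),
]

# Class association rules: RHS restricted to the class attribute.
car = [(lhs, rhs) for lhs, rhs in rules if rhs <= CLASSES]
```

A classifier such as Liu et al.'s can then rank these surviving rules and use them to label new examples.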

Contrast Set: special class association rules that reflect group differences

If we view classes as contrast groups, learning about group differences is another problem that arises in many domains. It can be solved by mining a special kind of class association rule: rules that discriminate among the classes. Such discriminating rules can be defined as a subset of class association rules.

For example:

The rules X=>class1 and X=>class2 have the same LHS X but different RHSs, where X is a conjunction of attribute values and class1 and class2 are two classes. The LHS X is a discriminator if the rule X=>classN has a different level of support in different classes. Stephen Bay and Michael Pazzani addressed this kind of problem and provide an algorithm for mining contrast sets, which in the above example is the LHS X.
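The discriminator test above can be sketched as comparing supports across groups. The data is made up, and the fixed 0.2 difference threshold is an arbitrary illustration (Bay and Pazzani use a statistical significance test instead):

```python
# Two contrast groups of examples, each example a set of attribute values.
class1 = [{"urban", "young"}, {"urban", "old"}, {"urban", "young"}]
class2 = [{"rural", "young"}, {"urban", "old"}, {"rural", "young"}]

def support_in(itemset, group):
    """Fraction of examples in `group` containing every item in `itemset`."""
    return sum(1 for ex in group if itemset <= ex) / len(group)

def is_discriminator(itemset, g1, g2, min_diff=0.2):
    """X is a candidate contrast set if its support differs between groups."""
    return abs(support_in(itemset, g1) - support_in(itemset, g2)) >= min_diff

flag = is_discriminator({"urban"}, class1, class2)
```

Here {urban} covers all of class1 but only a third of class2, so it discriminates, while {young} has the same support in both groups and does not.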

Treatment: discriminating rules that separate the best class from the non-best ones

In TAR2, we assume there exists some partial ordering among the classes: i.e., weights are assigned to classes rather than to items. We want to find rules that, when applied, increase the frequency of the highly weighted class while at the same time decreasing the frequency of the low-weighted classes. Consider the following example:

Suppose a classified training set has 4 classes of ascending weight: class1, class2, class3, class4. We call class4, which has the highest weight, the best class, and the others non-best classes. TAR2 tries to find a conjunction of attribute values X such that X=>class4 has high confidence while X=>class1, X=>class2, and X=>class3 have low confidences. The LHS X is called a treatment.
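The treatment idea can be sketched as scoring candidate conjunctions by how strongly the examples they cover concentrate in the best class. The data, candidates, and scoring are illustrative assumptions, not TAR2's actual algorithm:

```python
# Toy classified examples: (attribute-value set, class label).
examples = [
    ({"a1", "b1"}, "class4"),
    ({"a1", "b2"}, "class4"),
    ({"a2", "b1"}, "class1"),
    ({"a2", "b2"}, "class2"),
]
BEST = "class4"

def best_class_confidence(x):
    """Fraction of examples covered by conjunction x that fall in the
    best class, i.e., the confidence of x => class4."""
    covered = [cls for attrs, cls in examples if x <= attrs]
    return sum(1 for c in covered if c == BEST) / len(covered)

# Pick the single-attribute candidate whose covered examples are most
# concentrated in the best class.
candidates = [{"a1"}, {"a2"}, {"b1"}, {"b2"}]
treatment = max(candidates, key=best_class_confidence)
```

On this data the conjunction {a1} covers only class4 examples, so it is chosen as the treatment.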

Build 11. Apr 12, 2003

