Integrating Classification and Association Rule Mining

(File Last Modified Wed, May 29, 2002.)

Review: Integrating Classification and Association Rule Mining

Problem Addressed

integrate classification rule mining and association rule mining to build a classifier that is both efficient and more accurate.

Approach Proposed

discretizing continuous attributes.
generating all class association rules.
building a classifier based on the above rules.
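
As an illustration of the first step, here is a minimal sketch of choosing one entropy-based cut point for a continuous attribute. It is an assumption-laden toy: the function names `entropy` and `best_cut_point` are hypothetical, only a single binary cut is found, and the exact discretization method used in the paper (including any recursion or stopping criterion) is not reproduced here.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def best_cut_point(values, labels):
    """Return the cut point that minimizes the weighted class entropy
    of the two resulting intervals, or None if no cut is possible."""
    pairs = sorted(zip(values, labels))
    best_cut, best_score = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot cut between two equal values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        score = (len(left) * entropy(left)
                 + len(right) * entropy(right)) / len(pairs)
        if score < best_score:
            # Place the cut halfway between the two neighbouring values.
            best_cut = (pairs[i - 1][0] + pairs[i][0]) / 2
            best_score = score
    return best_cut

# Hypothetical data: low values belong to class "a", high values to class "b".
cut = best_cut_point([1, 2, 3, 10, 11, 12], ["a", "a", "a", "b", "b", "b"])
```

Each value would then be replaced by an interval label (for example "below cut" or "above cut"), so the attribute can be treated as an item like any other.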

Algorithm to generate a complete set of CARs (Class Association Rules)

CAR: X --> Y, where X is a subset of items (treatments) and Y is a class
confidence c: |cases that satisfy X and are labeled Y| / |cases that satisfy X| = c%
support s: |cases that satisfy X and are labeled Y| / |total cases| = s%
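
The two definitions above can be checked on a toy dataset. This is only a sketch: the function name `support_and_confidence`, the case representation (a set of items plus a class label), and the data are all made up for illustration.

```python
def support_and_confidence(cases, itemset, label):
    """cases: list of (set_of_items, class_label) pairs.
    Returns (support, confidence) of the rule itemset --> label as fractions."""
    # Class labels of all cases that satisfy the itemset X.
    covered = [cls for items, cls in cases if itemset <= items]
    # Cases that satisfy X and are labeled Y.
    correct = sum(1 for cls in covered if cls == label)
    support = correct / len(cases)
    confidence = correct / len(covered) if covered else 0.0
    return support, confidence

# Hypothetical training cases.
cases = [
    ({"a", "b"}, "yes"),
    ({"a"}, "yes"),
    ({"a", "c"}, "no"),
    ({"b"}, "no"),
]
s, c = support_and_confidence(cases, {"a"}, "yes")
```

Here three cases satisfy {a} and two of them are labeled "yes", so the rule {a} --> yes has support 2/4 and confidence 2/3.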

find ruleitems (<itemset, class label>) that are both frequent and accurate by making multiple passes over the data.

        frequent: ruleitems that have support above minsup.
        accurate: ruleitems that have confidence above minconf.
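
The multi-pass search for frequent and accurate ruleitems can be sketched roughly as a level-wise (Apriori-style) loop. This is a simplified illustration, not the paper's algorithm: `generate_cars` and its data layout are hypothetical, thresholds are treated as "at least" rather than strictly "above", candidate pruning by subsets is omitted, and every qualifying class for an itemset is kept.

```python
from collections import Counter
from itertools import combinations

def generate_cars(cases, minsup, minconf):
    """cases: list of (frozenset_of_items, class_label) pairs.
    Returns {(itemset, label): (support, confidence)} for every ruleitem
    with support >= minsup and confidence >= minconf."""
    n = len(cases)
    rules = {}
    # Level 1 candidates: every single item seen in the data.
    candidates = {frozenset([item]) for items, _ in cases for item in items}
    while candidates:
        # One pass over the data per level: count itemset and ruleitem hits.
        cond_count, rule_count = Counter(), Counter()
        for items, label in cases:
            for cand in candidates:
                if cand <= items:
                    cond_count[cand] += 1
                    rule_count[cand, label] += 1
        frequent = set()
        for (cand, label), hits in rule_count.items():
            if hits / n >= minsup:           # frequent ruleitem
                frequent.add(cand)
                conf = hits / cond_count[cand]
                if conf >= minconf:          # accurate ruleitem
                    rules[cand, label] = (hits / n, conf)
        # Join frequent itemsets that differ in one item to form the next level.
        size = len(next(iter(candidates))) + 1
        candidates = {a | b for a, b in combinations(frequent, 2)
                      if len(a | b) == size}
    return rules
```

Because a ruleitem can only be frequent if all of its sub-ruleitems are frequent, growing candidates from the frequent itemsets of the previous level does not miss any frequent ruleitem.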

generated rules are pruned using the pessimistic error rate based pruning method of C4.5.

Algorithm to build a classifier

choose a set of high-precedence rules (greater confidence, then greater support) from the CARs to cover the training data.

the algorithm satisfies two conditions:

       each training case is covered by the rule with the highest precedence among the rules that can cover it.
       every rule chosen correctly classifies at least one remaining training case.

discard those chosen rules that do not improve the accuracy.
an improved algorithm completes the task by making only slightly more than one pass over the training data, instead of one pass over the remaining data for each rule.
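
The selection procedure can be sketched as follows. This is the naive variant that scans the remaining cases once per rule, not the improved algorithm; the names `build_classifier` and `classify` and the 4-tuple rule layout are assumptions made for this example, and the default class is taken to be the majority class of the cases still uncovered.

```python
from collections import Counter

def build_classifier(rules, cases):
    """rules: list of (itemset, label, support, confidence) tuples.
    cases: list of (set_of_items, class_label) pairs.
    Returns (selected_rules, default_class)."""
    # Precedence: higher confidence, then higher support, then input order
    # (sorted() is stable, so earlier rules win ties).
    ordered = sorted(rules, key=lambda r: (-r[3], -r[2]))
    remaining = list(cases)
    selected, defaults, errors = [], [], []
    rule_errors = 0
    for itemset, label, _sup, _conf in ordered:
        covered = [cls for items, cls in remaining if itemset <= items]
        if label not in covered:
            continue  # rule classifies no remaining case correctly
        selected.append((itemset, label))
        rule_errors += sum(1 for cls in covered if cls != label)
        remaining = [(items, cls) for items, cls in remaining
                     if not itemset <= items]
        # Default class: majority class of the uncovered cases
        # (falls back to all cases once everything is covered).
        pool = remaining or cases
        default, hits = Counter(cls for _, cls in pool).most_common(1)[0]
        default_errors = len(remaining) - hits if remaining else 0
        defaults.append(default)
        errors.append(rule_errors + default_errors)
    if not selected:
        return [], Counter(cls for _, cls in cases).most_common(1)[0][0]
    # Discard rules after the first point of lowest total error.
    cut = errors.index(min(errors))
    return selected[:cut + 1], defaults[cut]

def classify(selected, default, items):
    """Apply the first matching rule in precedence order, else the default."""
    for itemset, label in selected:
        if itemset <= items:
            return label
    return default
```

A new case is labeled by the first selected rule whose itemset it satisfies; if no rule matches, the default class is used.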

Experiment results

this classifier was run on 26 datasets from UCI. It outperforms C4.5 on 16 of them, and its average error rate over all 26 datasets is lower than that of C4.5.
runtime is a matter of seconds when all data is kept in memory.

Conclusion

It gives a new way to construct accurate classifiers.
It makes association rule mining applicable to classification tasks.
By integrating classification and association rule mining, it helps to solve a number of problems such as

Insights

this paper mentions its use of an entropy method to discretize continuous attributes, which I plan to have a look at.
in tar2, skew is similar to support, which can be used to optimize the combination process.
question: what is the goal of tar2? (what kinds of results do we want to mine from the data: classifications? discriminations? associations?)
