Mining association rules with weighted items

(File Last Modified Wed, May 29, 2002.)


Review: Mining association rules with weighted items

Problem Addressed

Weighted association rule mining: given weight to each item, and mine the association rule between them. Weights represent the relative improtance of each item.

Approach Proposed

  • Still uses the standard 2-step mining algorithm, first find out all large itemsets, then mine the association rules inside a large set.
  • In the first step, there is a significant difference as compared to Apriori: Apriori algorithm is not applicable becase the downward closure property(subsets of a large itemset are also large) is not ture in weighted association rule mining.
  • Definition used:
             weighted support = sum(Wj)*support
             large itemset: weighted suport >= weighted minimum support.
             weighted support = sum(Wj)*support
             interesting rule: both large and confidence >= minimum confidence.
             k-support bound: it is possible to compute the minimum count(support) 
                                     for a large k-itemset X containing a q-itemset Y (k>q).
  • Algorithm1: use k-support bound instead of downward closure to search for large itemsets.
  • Algorithm2: the weight of an itemset is normalized by the size of the itemset, this is called normalized weighted association rules
             normalized weighted support = sum(Wj)*support / k      (k=size of the itemset)
             any high-order subset of a largeset is also large, 
                       a large (k+1)-itemset must be a low-order superset of a large k-itemset.

Validation

A performance study is carried out for both algorithms, using syntheic data and weights. Comparisons are made regarding to the following criteria

  • overall execution time
  • execution time for each pass
  • passes needed
  • size of the candidate itemsets

Scale-up experiments: how the performance varies with the number of items and transactions

  • number of items scale-up
  • number of transactions scale-up: algorithm complexity is exponetial to the number of transactions.

Special case experiment: item weights are binary(0/1)

Contribution

  • Weighted association rule mining is the generalization of the association rule mining problem
  • Difference between the mining weighted and unweighted association rules is the downward closure property.
  • Two algorithms are proposed and described, performance evaluation has been done on both agorithms.

Comments

  • It is an important point that the problem addressed in this paper is a generalization of association rule mining, and it changes the way in designing the core mining algorithm. Thus, they hit the Apriori algorithm which is regarded as a standard in this field. But the authors didn't baseline their approach to Apriori, which should easily be done by initializing the weight of each item to zero.
  • In the experiment evaluation, the authors do not mention the actual experiment results (e.g. association rules generated). Concentration is put on comparing their two algorithms rather than the mining itself. I think the paper could be more valuable if it place it's work in a broader context.

Insight

  • By reading this paper, I didn't see a close analogy between the weighted items and TAR2's weighted classes. However, a deeper understanding in association rule mining did remind me that there exists close relation between the algorithms ( using support to find large itemset and using confidence to mine rules in a large set). Instead of satisfy both minimum support and confidence, TAR2 only uses the confidence in its rule check stage, which is the exact reason that results in the appearance of overly small sets.
  • It is possible and desirable to rewrite TAR2's algorithm description using terms as ``support'' and ``confidence''.

Build 11. Apr 12, 2003


  *  Home

  *  About this site

Literature Review
  *  Data Mining

  *  Machine Learning

  *  Software Engineering

  *  Research Notes



B

bay99.pod
Detcting change in categorical data: mining contrast sets


C

cai98mining.pod
Mining association rules with weighted items

cohen.pod
Finding Interesting Associations without Support Pruning

confRule.pod
Mining Confident Rules Without Support Requirement


L

liu98.pod
Integrating Classification and Association Rule Mining


M

mbre01ri.pod
Modular Model Checking of SA/RT Models Using Assoiation Rules


W

webb00.pod
Efficient search for association rules


A

agrawal93.pod
Mining Association Rules between Sets of Items in Large Databases

agrawal94.pod
Fast algorithm for mining association rules


G

goebel99.pod
A Survey of Data Mining and Knowledge Discovery Software Tools

mendonca99.pod
Mining Software Engineering Data: A Survey