Mining association rules with weighted items
Weighted association rule mining: given weight to each item, and mine the association rule between them. Weights represent the relative improtance of each item.
-
Still uses the standard 2-step mining algorithm, first find out all large itemsets, then mine the association rules inside a large set.
-
In the first step, there is a significant difference as compared to Apriori: Apriori algorithm is not applicable becase the downward closure property(subsets of a large itemset are also large) is not ture in weighted association rule mining.
-
Definition used:
weighted support = sum(Wj)*support
large itemset: weighted suport >= weighted minimum support.
weighted support = sum(Wj)*support
interesting rule: both large and confidence >= minimum confidence.
k-support bound: it is possible to compute the minimum count(support)
for a large k-itemset X containing a q-itemset Y (k>q).
-
Algorithm1: use k-support bound instead of downward closure to search for large itemsets.
-
Algorithm2: the weight of an itemset is normalized by the size of the itemset, this is called normalized weighted association rules
normalized weighted support = sum(Wj)*support / k (k=size of the itemset)
any high-order subset of a largeset is also large,
a large (k+1)-itemset must be a low-order superset of a large k-itemset.
A performance study is carried out for both algorithms, using syntheic data and weights. Comparisons are made regarding to the following criteria
-
overall execution time
-
execution time for each pass
-
passes needed
-
size of the candidate itemsets
Scale-up experiments: how the performance varies with the number of items and transactions
-
number of items scale-up
-
number of transactions scale-up: algorithm complexity is exponetial to the number of transactions.
Special case experiment: item weights are binary(0/1)
-
Weighted association rule mining is the generalization of the association rule mining problem
-
Difference between the mining weighted and unweighted association rules is the downward closure property.
-
Two algorithms are proposed and described, performance evaluation has been done on both agorithms.
-
It is an important point that the problem addressed in this paper is a generalization of association rule mining, and it changes the way in designing the core mining algorithm. Thus, they hit the Apriori algorithm which is regarded as a standard in this field. But the authors didn't baseline their approach to Apriori, which should easily be done by initializing the weight of each item to zero.
-
In the experiment evaluation, the authors do not mention the actual experiment results (e.g. association rules generated). Concentration is put on comparing their two algorithms rather than the mining itself. I think the paper could be more valuable if it place it's work in a broader context.
-
By reading this paper, I didn't see a close analogy between the weighted items and TAR2's weighted classes. However, a deeper understanding in association rule mining did remind me that there exists close relation between the algorithms ( using support to find large itemset and using confidence to mine rules in a large set). Instead of satisfy both minimum support and confidence, TAR2 only uses the confidence in its rule check stage, which is the exact reason that results in the appearance of overly small sets.
-
It is possible and desirable to rewrite TAR2's algorithm description using terms as ``support'' and ``confidence''.
| Build 11. Apr 12, 2003
Home
About this site
Literature Review
Data Mining
Machine Learning
Software Engineering
Research Notes
Bbay99.pod
Detcting change in categorical data: mining contrast sets
Ccai98mining.pod
Mining association rules with weighted items cohen.pod
Finding Interesting Associations without Support Pruning confRule.pod
Mining Confident Rules Without Support Requirement
Lliu98.pod
Integrating Classification and Association Rule Mining
Mmbre01ri.pod
Modular Model Checking of SA/RT Models Using Assoiation Rules
Wwebb00.pod
Efficient search for association rules
Aagrawal93.pod
Mining Association Rules between Sets of Items in Large Databases agrawal94.pod
Fast algorithm for mining association rules
Ggoebel99.pod
A Survey of Data Mining and Knowledge Discovery Software Tools mendonca99.pod
Mining Software Engineering Data: A Survey |