The Association Rule Mining Context (File last modified Wed, May 29, 2002.)
Brief Notes on Association Rules
Description
Properties of the rule
Antecedent (LHS): a set of one or more attribute values or items. Consequent (RHS): a single attribute value or item. Each association rule indicates that the RHS occurs more frequently among examples that have the LHS attribute values than among those that do not.
Formal model and measurements
Let I = {i1, i2, ..., im} be a set of literals, called items. Let T be a database of transactions, where each transaction is a set of items (equivalently, a binary vector over I). Let X (the LHS) be a set of items in I, and let Y (the RHS) be a set of items in I, where there is no overlap between X and Y. An association rule is an implication of the form X --> Y (LHS --> RHS). Measurements: the rule X --> Y holds with support s if s% of the transactions in T contain X union Y, and with confidence c if c% of the transactions that contain X also contain Y.
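The two measurements can be computed directly from a transaction database. A minimal sketch over a toy database (the item names are invented for illustration):

```python
# Toy transaction database: each transaction is a set of items.
T = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
    {"bread", "butter", "milk"},
]

def support(itemset, transactions):
    """Fraction of transactions containing every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(lhs, rhs, transactions):
    """support(LHS union RHS) / support(LHS)."""
    return support(set(lhs) | set(rhs), transactions) / support(lhs, transactions)

print(support({"bread", "milk"}, T))       # 3 of 5 transactions
print(confidence({"bread"}, {"milk"}, T))  # 3 of the 4 bread transactions
```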
Algorithm
The task of finding association rules can be decomposed into two subproblems: (1) find all itemsets whose support exceeds a minimum support threshold (the frequent, or "large", itemsets); (2) from each frequent itemset, generate all rules whose confidence exceeds a minimum confidence threshold.
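The second subproblem can be sketched as enumerating every split of a frequent itemset into LHS and RHS and keeping the high-confidence splits (the data and thresholds below are illustrative):

```python
from itertools import combinations

T = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]

def sup(itemset):
    """Fraction of transactions in T containing `itemset`."""
    return sum(set(itemset) <= t for t in T) / len(T)

def rules_from_itemset(itemset, minconf):
    """Subproblem 2: emit every rule X --> Y where X and Y partition
    `itemset`, keeping those whose confidence reaches `minconf`."""
    itemset = frozenset(itemset)
    out = []
    for r in range(1, len(itemset)):
        for lhs in map(frozenset, combinations(itemset, r)):
            conf = sup(itemset) / sup(lhs)
            if conf >= minconf:
                out.append((set(lhs), set(itemset - lhs), conf))
    return out

# Both {a} --> {b} and {b} --> {a} reach confidence 0.75 on this data.
print(rules_from_itemset({"a", "b"}, 0.6))
```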
Differences between association rules and ML
General Association Rule Mining
In order to reflect the different importance of, or different emphasis on, each item, it is desirable to assign weights to individual items. This results in a generalized version of association rule mining. The problem has been addressed by C. H. Cai et al. as mining association rules with weighted items.
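A minimal sketch of one possible weighting scheme. The definition below (average item weight times ordinary support) is an illustrative assumption, not necessarily the exact measure used by Cai et al.:

```python
def weighted_support(itemset, weights, transactions):
    """Ordinary support scaled by the average weight of the items in
    the itemset (an assumed, illustrative weighting scheme)."""
    itemset = set(itemset)
    sup = sum(itemset <= t for t in transactions) / len(transactions)
    avg_w = sum(weights[i] for i in itemset) / len(itemset)
    return avg_w * sup

T = [{"a", "b"}, {"a"}, {"b"}]
w = {"a": 1.0, "b": 0.5}  # "a" matters twice as much as "b"
print(weighted_support({"a"}, w, T))  # 1.0 * (2/3)
print(weighted_support({"b"}, w, T))  # 0.5 * (2/3)
```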
Classical Association Rule Mining: equal weights
When all weights are equal, general association rule mining reduces to classical association rule mining. The Apriori algorithm (R. Agrawal) and its derivatives have been recognized as the benchmark for this problem. Apriori uses a two-step, level-wise mining strategy: a join step that builds candidate k-itemsets from frequent (k-1)-itemsets, and a prune step that discards any candidate with an infrequent subset before counting support.
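The join/prune strategy can be sketched as a level-wise frequent-itemset search; this is a simplified rendering of Apriori's idea, not Agrawal's exact pseudocode:

```python
from itertools import combinations

def apriori_frequent(transactions, minsup):
    """Level-wise search: join frequent (k-1)-itemsets into candidate
    k-itemsets, prune candidates with an infrequent subset, then count."""
    n = len(transactions)
    freq = {}
    level = [frozenset([i]) for i in {i for t in transactions for i in t}]
    while level:
        counts = {c: sum(c <= t for t in transactions) for c in level}
        survivors = [c for c in level if counts[c] / n >= minsup]
        for c in survivors:
            freq[c] = counts[c] / n
        nxt = set()
        for a, b in combinations(survivors, 2):
            u = a | b
            if len(u) == len(a) + 1:  # join step: share all but one item
                # prune step: every subset one item smaller must be frequent
                if all(frozenset(s) in freq for s in combinations(u, len(a))):
                    nxt.add(u)
        level = list(nxt)
    return freq

T = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
print(len(apriori_frequent(T, 0.5)))  # 3 singletons + 3 pairs are frequent
```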
Other researchers have further adapted the algorithm to decrease computational overhead under certain conditions. For example, Geoffrey Webb proposed an efficient search strategy (the OPUS algorithm) for the case where all data can be maintained in memory.
Class Association Rules: RHS = class
If we use classified data as input and restrict the RHS to the class attribute, mining returns a special subset of association rules, called class association rules. Thus, it is possible to integrate classification and association rule mining. As an example, Bing Liu et al. present a framework for building an accurate classifier based on class association rules.
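With classified data, evaluating a class association rule reduces to one confidence computation per candidate LHS. A sketch, where the row format (attribute-value set, class label) is invented for illustration:

```python
def class_rule_confidence(lhs, cls, rows):
    """Confidence of the class association rule LHS --> cls, where each
    row is a pair (set of attribute values, class label)."""
    matched = [c for attrs, c in rows if set(lhs) <= attrs]
    return matched.count(cls) / len(matched) if matched else 0.0

rows = [({"x"}, "yes"), ({"x"}, "no"), ({"x", "y"}, "yes"), ({"y"}, "no")]
print(class_rule_confidence({"x"}, "yes", rows))  # 2 of 3 matching rows
```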
Contrast Sets: special class association rules that reflect group differences
If we view classes as contrast groups, learning about group differences is another problem that arises in many domains. It can be solved by mining a special kind of class association rule: rules that discriminate among the classes. Such discriminating rules can be defined as a subset of class association rules. For example, the rules X => class1 and X => class2 have the same LHS X but different RHSs, where X is a conjunction of attribute values and class1 and class2 are two classes. The LHS X is a discriminator if the rule X => classN has different levels of support in the different classes. Stephen Bay and Michael Pazzani addressed this kind of problem and provided an algorithm for mining contrast sets; in the above example, the contrast set is the LHS X.
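Detecting a discriminator then amounts to comparing the support of the same LHS within each group. A minimal sketch (the data is invented):

```python
def group_supports(lhs, rows):
    """Support of LHS inside each class; a large spread across groups
    marks LHS as a candidate contrast set."""
    lhs = set(lhs)
    hits, totals = {}, {}
    for attrs, cls in rows:
        hits[cls] = hits.get(cls, 0) + (lhs <= attrs)
        totals[cls] = totals.get(cls, 0) + 1
    return {cls: hits[cls] / totals[cls] for cls in totals}

rows = [({"x"}, "g1"), ({"x"}, "g1"), ({"y"}, "g1"),
        ({"x"}, "g2"), ({"y"}, "g2"), ({"y"}, "g2")]
print(group_supports({"x"}, rows))  # {"g1": 2/3, "g2": 1/3}
```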
Treatments: discriminating rules that separate the best class from non-best ones
In TAR2, we assume there exists some partial ordering among the classes: i.e., weights are assigned to classes rather than items. We want to find rules that, when applied, will increase the frequency of the high-weighted class while at the same time decreasing the frequency of the low-weighted classes. Consider the following example: suppose a classified training set has four classes with ascending weights: class1, class2, class3, class4. We call class4, which has the highest weight, the best class, and the others non-best classes. TAR2 tries to find a conjunction of attribute values X such that X => class4 has high confidence while X => class1, X => class2, and X => class3 have low confidences. The LHS X is called the treatment.

Build 11. Apr 12, 2003
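One simple way to score a candidate treatment under class weights is to compare the mean class weight among rows matching X against the baseline over all rows. This scoring is an illustrative stand-in, not TAR2's actual measure:

```python
def mean_class_weight(lhs, rows, weights):
    """Mean class weight among rows matching LHS; a useful treatment
    should beat the baseline mean_class_weight(set(), rows, weights)."""
    lhs = set(lhs)
    matched = [weights[cls] for attrs, cls in rows if lhs <= attrs]
    return sum(matched) / len(matched) if matched else 0.0

rows = [({"x"}, "class4"), ({"x"}, "class4"),
        ({"x"}, "class1"), ({"y"}, "class1")]
w = {"class4": 1.0, "class1": 0.0}  # class4 is the best class
print(mean_class_weight({"x"}, rows, w))   # 2/3, vs. a baseline of 0.5
print(mean_class_weight(set(), rows, w))   # baseline over all rows
```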