Very Simple Classification Rules Perform Well On Most Commonly Used Datasets
- 1-Rule: a rule that classifies an object on the basis of a single attribute.
- 1R: the learning system, whose input is a set of training examples and whose output is a 1-rule.
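The 1R procedure can be sketched in a few lines: for each attribute, build a rule that maps each attribute value to its most frequent class, then keep the attribute whose rule makes the fewest errors on the training examples. This is a minimal illustrative sketch, not the paper's original code; the toy data and names are made up.

```python
from collections import Counter, defaultdict

def one_r(examples, n_attrs):
    """Induce a 1-rule: for each attribute, map each of its values to the
    most frequent class among examples having that value; return the
    attribute whose rule makes the fewest training errors."""
    best = None  # (errors, attribute index, value -> class map)
    for a in range(n_attrs):
        counts = defaultdict(Counter)  # attribute value -> class frequencies
        for attrs, cls in examples:
            counts[attrs[a]][cls] += 1
        rule = {v: c.most_common(1)[0][0] for v, c in counts.items()}
        errors = sum(cls != rule[attrs[a]] for attrs, cls in examples)
        if best is None or errors < best[0]:
            best = (errors, a, rule)
    return best

# Toy dataset: each example is ((attribute values...), class)
data = [(('sunny', 'hot'), 'no'), (('sunny', 'mild'), 'no'),
        (('rainy', 'hot'), 'yes'), (('rainy', 'mild'), 'yes')]
errors, attr, rule = one_r(data, n_attrs=2)
# here attribute 0 separates the classes perfectly, so errors == 0
```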
1R
- is run on 16 commonly used datasets from UCI, and the results are compared to C4.
- results: on most of the datasets, 1R's rules are only a few percentage points less accurate than C4's pruned decision trees.
1R*:
- is defined as an upper bound on the predictive accuracy that the 1R system could achieve after possible improvements to 1R's criterion for selecting rules.
- results: on almost all the datasets studied, the accuracy of 1R* turns out to be very similar to that of C4.
1Rw: the highest accuracy of the 1-rule produced when the whole dataset is used by 1R for both training and testing.
- Simple-rule learning systems are often a viable alternative to systems that learn more complex rules. If a complex rule is induced, its additional complexity must be justified by its being correspondingly more accurate than a simple rule.
- 1R can be used to predict the accuracy of the rules produced by more sophisticated machine learning systems. This prediction can serve as a benchmark accuracy.
Why does a simple machine learner perform almost as well as a complex learner like C4?
- C4 doesn't miss opportunities to exploit additional complexity in order to improve its accuracy: C4's pruned trees had the same accuracy as its unpruned ones.
- It may simply be a fact that on those particular datasets 1-rules are almost as accurate as more complex rules.
- In some datasets, the classes and the values of some attribute are almost in 1-1 correspondence.
Most of these datasets are typical of the data available in a commonly occurring class of real classification problems.
- The datasets are drawn from real-life domains, as opposed to having been constructed artificially.
- The particular examples and attributes in the datasets have not been specially engineered by the ML community to make them easy.
The ``simplicity first'' methodology is a promising alternative to the existing methodology, whose main premise is that a learning system should search in very large hypothesis spaces containing very complex hypotheses.
This paper is an experimental report. It compares 1R and C4 through a detailed description of three sets of experiments. For each set of experiments, it gives results followed by an analysis of those results and their implications. The analysis is very concrete and includes explanations of the exceptions.
The fact that ``simple learners work almost as well as complex ones'' seems to support your theory that ``there always exist key variables (simple rules) that control the world''. But the author has a different view, which I think has a point:
- There is no theoretical proof of this phenomenon; it may simply be a fact that it is true on these particular datasets.
- Those particular datasets turn out to be ``representative'' of the datasets that actually arise in practice.
- Although it is true that some real-world problems do not have simple solutions, this doesn't imply that all real problems are hard.
- So the justification of simple learners is: they are a desirable solution for a kind of real-world problem. Furthermore, they generate insight into the tradeoff between accuracy and complexity.
The author claims: systems designed using the ``simplicity first'' methodology are guaranteed to produce rules that are near-optimal with respect to simplicity. If the accuracy of the rule is unsatisfactory, then there does not exist a satisfactory simple rule. I am wondering: if tar2 fails on some domains, does it indicate there are no simple controllers in those domains?