Induction of Decision Trees

(File Last Modified Wed, May 29, 2002.)


Review: Induction of Decision Trees

Concepts

  1. Learning: the acquisition of structured knowledge in the form of concepts, discrimination nets, or production rules. It enables a system to do the same task more efficiently the next time.
  2. Machine learning
    • Understand and improve the efficiency of human learning -- computer-aided instruction
    • Discover new things or structure unknown to humans -- data mining
    • Fill in skeletal or incomplete specifications about a domain -- knowledge-based expert systems
  3. Inductive learning: uses specific examples to reach general conclusions. The goal is to develop a classification rule that can determine the class of any object from the values of its attributes.

Approach - Decision Trees

  • Decision trees: a simple inductive learning structure. The learner searches for patterns in the given examples and must be able to examine all of them during learning. Given an instance of an object, specified by the values of a set of attributes, the tree returns a P/N (positive/negative) classification of that instance (see the sketch after this list)
  • Examples - ID3, C4 (the TDIDT family: Top-Down Induction of Decision Trees)
  • Bias: a preference for one hypothesis over another.
  • Ockham's razor: the simplest hypothesis that is consistent with all observations is the best. A simpler tree is more likely to capture the structure underlying the problem, and thus to classify more future objects correctly. Excessive complexity turns a decision tree into an explanation of the training set rather than a generalization from it.
  • Criterion for evaluating performance -- predictive accuracy on unseen objects
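
To make the classification step concrete, here is a minimal Python sketch. The nested-dict representation and the weather attributes (which echo the small "Saturday morning" example in Quinlan's paper) are choices made here for illustration, not a prescribed implementation:

    # A decision tree as nested dicts: an internal node maps one attribute
    # name to {attribute value: subtree}; a leaf is the class label 'P' or 'N'.
    tree = {"outlook": {"sunny": {"humidity": {"high": "N", "normal": "P"}},
                        "overcast": "P",
                        "rain": {"windy": {"true": "N", "false": "P"}}}}

    def classify(node, instance):
        """Follow the instance's attribute values from the root to a leaf."""
        while isinstance(node, dict):
            attribute = next(iter(node))   # the attribute tested at this node
            node = node[attribute][instance[attribute]]
        return node

    print(classify(tree, {"outlook": "sunny", "humidity": "normal"}))  # P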

Construct the decision tree

Algorithm

  • Recursively/iteratively select the "best attribute" to test at the current node in the tree
  • Choose the attribute A with the maximum information gain as the root of the current (sub)tree
  • Partition the set of objects according to their values of A
  • Repeat the procedure on each subset until a subset contains objects of one class only; that node becomes a leaf (a runnable sketch of this loop follows)
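
A minimal Python sketch of this procedure, assuming each example is an (attribute-dict, class-label) pair; the helper names info, gain, and id3 are mine, and the code illustrates the idea rather than reproducing Quinlan's implementation. It grows trees in the same nested-dict shape used by classify() above:

    import math
    from collections import Counter

    def info(labels):
        """Information (entropy) in bits of a collection of class labels."""
        total = len(labels)
        return -sum((c / total) * math.log2(c / total)
                    for c in Counter(labels).values())

    def gain(examples, attribute):
        """Info of the whole set minus the weighted average info of the
        subsets that testing `attribute` induces."""
        labels = [cls for _, cls in examples]
        remainder = 0.0
        for value in {x[attribute] for x, _ in examples}:
            subset = [cls for x, cls in examples if x[attribute] == value]
            remainder += len(subset) / len(examples) * info(subset)
        return info(labels) - remainder

    def id3(examples, attributes):
        """Grow a tree top-down; examples is a list of (dict, label) pairs."""
        labels = [cls for _, cls in examples]
        if len(set(labels)) == 1:          # pure subset: this branch is a leaf
            return labels[0]
        if not attributes:                 # no tests left: take the majority class
            return Counter(labels).most_common(1)[0][0]
        best = max(attributes, key=lambda a: gain(examples, a))
        tree = {best: {}}
        for value in {x[best] for x, _ in examples}:
            subset = [(x, cls) for x, cls in examples if x[best] == value]
            tree[best][value] = id3(subset, [a for a in attributes if a != best])
        return tree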

Entropy-based criteria

  1. Gain criterion: the information conveyed by a message depends on its probability and can be measured in bits as minus the logarithm to base 2 of that probability. Select the test that gains the most information, i.e. that most reduces the information still required to classify the objects.
  2. Gain ratio criterion: the gain criterion has a strong bias in favor of tests with many outcomes. This can be rectified by a kind of normalization:
               gain ratio(X) = gain(X) / split info(X)
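
Written out in the notation of Quinlan (1986), for a training set with p positive and n negative objects and a test X whose v outcomes partition it into subsets with (p_i, n_i) objects, the quantities above are (a LaTeX transcription):

    I(p, n) = -\frac{p}{p+n}\log_2\frac{p}{p+n} - \frac{n}{p+n}\log_2\frac{n}{p+n}

    E(X) = \sum_{i=1}^{v} \frac{p_i + n_i}{p + n}\, I(p_i, n_i)

    \mathrm{gain}(X) = I(p, n) - E(X)

    \mathrm{split\,info}(X) = -\sum_{i=1}^{v} \frac{p_i + n_i}{p + n}\log_2\frac{p_i + n_i}{p + n}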

(under construction...)

