Treatment Learning:
Implementation and Application

Ying Hu
Electrical and Computer Engineering
University of British Columbia

May 2003

Abstract

Data mining and machine learning focus on inducing previously unknown, potentially useful, and ultimately understandable information from data. In this master's thesis, we propose a new learning approach called treatment learning. Treatment learning aims at mining a small number of control variables in a large option space that can lead to better system behavior. It addresses two central issues in data mining:

We design and implement a novel mining algorithm and deliver two treatment learners that are freely downloadable from an online distribution. We describe the implementation details of both learners and compare them through algorithmic performance analysis.

We conduct extensive data experiments and case studies to demonstrate the effectiveness of using a treatment learner to seek a small number of control variables that constrain the option space to a tight, near-optimal convergence.

We compare treatment learning with other learning schemes in the framework of feature subset selection for supervised classification. Our treatment learner selects smaller feature subsets than most other methods with minimal or no loss in classification accuracy. The treatment learner has been successfully applied to various research domains in collaboration with other researchers. By presenting four examples, we show the general paradigms of using it for decision making.

Thesis Organization

This thesis discusses four main topics:
  1. An introduction to treatment learning in the context of machine learning and data mining.
  2. A detailed description of treatment learning, including the algorithm implementation and a performance comparison of two treatment learners.
  3. An evaluation of treatment learning against other state-of-the-art techniques in the framework of Feature Subset Selection (FSS) for supervised classification.
  4. Applications of treatment learning in various research domains.
The above topics are organized as follows:

In chapter 2, we present a literature review that serves as background for this thesis. Two groups of concepts and techniques are outlined: supervised classification in machine learning, and association rule mining in data mining. We also review recent developments in the integration of classification and association rule mining. All of these are closely related to the topics discussed in this thesis and represent the state of the art in their respective areas.

In chapter 3, we first introduce the concept of the narrow funnel effect: an observation repeated in many studies, where most domain variables are controlled by a very small subset. We then introduce treatment learning as an ideal way to identify funnel variables: a lightweight learning approach that focuses on producing minimal models that describe significant differences among groups of data. We examine the problem in depth by presenting the implementation details of a treatment learner, TAR2. This is followed by two case studies illustrating the effectiveness of using a treatment learner in practice for actionable decision making. Finally, we relate treatment learning to extensions of standard learning techniques and to general change-detection algorithms to show how they differ and to highlight the novelty of our approach.

In chapter 4, we examine the algorithmic performance of the learner described in the previous chapter. We point out its efficiency limitations by reporting runtime curves with respect to parameters such as data size and treatment size. After analyzing the search procedure that leads to the problem, we solve it by employing a series of strategies, including a random sampling algorithm. The improved learner, TAR3, is evaluated through comparison experiments with TAR2 and a revised case study. The results show that TAR3 achieves a major improvement in efficiency: it can reach stable conclusions in linear time.

In chapter 5, we further explore treatment learning in the framework of Feature Subset Selection for supervised classification. Feature subset selection is the process of identifying and removing as much irrelevant and redundant information from the data as possible prior to learning. We use the treatment learner as a feature subset selector on ten commonly used datasets and compare the results with six standard techniques. Experiments show that our approach is the best overall feature subset selection method among those compared: it finds the smallest feature subsets with minimal or no loss in classification accuracy.

In chapter 6, we present real world applications of treatment learning to demonstrate how it can be integrated into different research frameworks to assist decision making. We present studies in four domains:

Some of these studies are model-based, while others are data-present. In either case, we give a brief background and state the approach and goal of the study. Although each case is discussed from a domain-specific point of view, we emphasize the general applicability of treatment learning and the approach of modelling the problem so that decisions can be made by identifying a minimal set of key factors in the domain.

In chapter 7, we conclude this thesis by reviewing the main contributions of our research and pointing out future research issues.
