Treatment learning

Treatment Learner

For my thesis research (thesis outline), I have designed and delivered two software packages: TAR2 and TAR3. Both are data mining tools of a specific kind called treatment learners. Treatment learners, like other machine learners, are rule discovery paradigms. However, classical machine learners such as C4.5 aim to discover classification rules: given a classified training set, they output rules that are predictive of the class attribute. Treatment learners differ from those learners in that:

A treatment learner assumes the classes can be assessed by their scores (some domain-specific measure).
Highly scored classes are preferable to lower scored classes.
Further, one class, called the best class, is more desirable than all others.
A treatment learner finds rules that predict both increased frequency of the best class and decreased frequency of the worst class.

That is, a treatment learner finds discriminating rules that drive the system from the worst class toward the best class.
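The scoring assumptions above can be sketched in a few lines of Python. This is a minimal illustration, not code from TAR2 or TAR3: the class names, the scores, and the helper mean_class_score are all hypothetical. It shows how a domain-specific score per class lets two class distributions be compared, so that a rule shifting examples away from the worst class and toward the best class yields a higher average score.

```python
from collections import Counter

# Hypothetical domain-specific scores: higher is better, so "high" is the
# best class and "low" is the worst class.
CLASS_SCORES = {"low": 1, "medium": 2, "high": 4}


def mean_class_score(classes, scores=CLASS_SCORES):
    """Return the average score of a non-empty list of class labels."""
    counts = Counter(classes)
    total = sum(counts.values())
    return sum(scores[label] * n for label, n in counts.items()) / total


# Class labels of all examples, and of the examples selected by some rule.
baseline = ["low"] * 5 + ["medium"] * 3 + ["high"] * 2
after_rule = ["low"] * 1 + ["medium"] * 1 + ["high"] * 2

print(mean_class_score(baseline))    # 1.9
print(mean_class_score(after_rule))  # 2.75
```

In this toy example the rule is attractive because the examples it selects have a higher average score than the full data set.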

A treatment learner takes in classified data sets and outputs treatments. A treatment is a single attribute-value pair or a conjunction of attribute-value pairs. It is a constraint on future controllable inputs of the system. In summary, treatment learners give us controllers rather than classifiers. To understand the distinction, consider the case of someone reading a map. Classifiers say "you are here" on the map, while controllers say "go this way".
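As a rough sketch of what a treatment looks like in practice, the Python below represents a treatment as a dictionary of attribute-value pairs and uses it to select the matching rows of a tiny classified data set. The attribute names, the data, and the helper functions are invented for illustration and do not reflect the input format or internals of TAR2 or TAR3.

```python
from collections import Counter

# Toy classified data set (hypothetical attributes and class labels).
ROWS = [
    {"outlook": "sunny", "wind": "weak",   "class": "best"},
    {"outlook": "sunny", "wind": "strong", "class": "worst"},
    {"outlook": "rainy", "wind": "weak",   "class": "worst"},
    {"outlook": "rainy", "wind": "strong", "class": "worst"},
    {"outlook": "sunny", "wind": "weak",   "class": "best"},
    {"outlook": "rainy", "wind": "weak",   "class": "best"},
]


def matches(row, treatment):
    """True if the row satisfies every attribute-value pair (a conjunction)."""
    return all(row.get(attr) == value for attr, value in treatment.items())


def apply_treatment(rows, treatment):
    """Keep only the rows that are consistent with the treatment."""
    return [row for row in rows if matches(row, treatment)]


def class_frequencies(rows):
    """Relative frequency of each class label among the rows."""
    counts = Counter(row["class"] for row in rows)
    return {label: n / len(rows) for label, n in counts.items()}


treatment = {"outlook": "sunny", "wind": "weak"}  # a conjunction of two pairs
treated = apply_treatment(ROWS, treatment)

print(class_frequencies(ROWS))     # "best" and "worst" are each 0.5
print(class_frequencies(treated))  # only "best" remains, at 1.0
```

Constraining the inputs to the treatment changes the class distribution, which is the sense in which a treatment acts as a controller rather than a classifier.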

You can find a detailed illustration of how TAR2 works in intro.pdf.

Why that name?

TAR2 and TAR3 are based on a Prolog prototype, "TARZAN", a post-processor to a decision tree learner. A description of TARZAN can be found in Practical Large Scale What-if Queries: Case Studies with Software Risk Assessment.

TAR2 and TAR3 are written in C. They are data miners that no longer need the decision-tree pre-processor. Both learners combine search with a self-defined heuristic evaluation of attribute utility. While TAR2's breadth-first search can grow exponentially, TAR3 fixes the problem by employing a series of strategies, including random sampling. On datasets where TAR2 is exponential, TAR3 runs in linear time.
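The contrast between exhaustive search and random sampling can be illustrated with a generic Python sketch. It is not the algorithm used by TAR2 or TAR3: the attribute names, the functions all_treatments and sample_treatments, and the uniform sampling scheme are assumptions made for this example. It only shows why enumerating every conjunction of attribute-value pairs grows quickly, while drawing a fixed number of random candidates keeps the work bounded.

```python
import itertools
import random

# Hypothetical search space: 4 attributes with 3 possible values each.
VALUES_BY_ATTR = {
    "a1": ["x", "y", "z"],
    "a2": ["x", "y", "z"],
    "a3": ["x", "y", "z"],
    "a4": ["x", "y", "z"],
}


def all_treatments(values_by_attr, max_size):
    """Enumerate every conjunction of up to max_size attribute-value pairs."""
    attrs = sorted(values_by_attr)
    for size in range(1, max_size + 1):
        for chosen in itertools.combinations(attrs, size):
            value_lists = [values_by_attr[a] for a in chosen]
            for values in itertools.product(*value_lists):
                yield tuple(zip(chosen, values))


def sample_treatments(values_by_attr, max_size, n_samples, seed=0):
    """Draw n_samples random conjunctions instead of enumerating all of them."""
    rng = random.Random(seed)
    attrs = sorted(values_by_attr)
    samples = []
    for _ in range(n_samples):
        size = rng.randint(1, min(max_size, len(attrs)))
        chosen = sorted(rng.sample(attrs, size))
        samples.append(tuple((a, rng.choice(values_by_attr[a])) for a in chosen))
    return samples


print(len(list(all_treatments(VALUES_BY_ATTR, 1))))  # 12
print(len(list(all_treatments(VALUES_BY_ATTR, 4))))  # 255
print(len(sample_treatments(VALUES_BY_ATTR, 4, 20)))  # 20
```

With n attributes of v values each, the full space holds (v + 1)^n - 1 conjunctions, so it grows exponentially with the number of attributes, whereas the sampled version evaluates only as many candidates as requested.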

Installation

Download the file tar3.zip shown at the bottom of this page. Depending on the experiment, TAR2 can also be downloaded for baseline comparison.

Simply unzip tar3.zip and you get the following:

Source code of TAR3 and an N-way cross validation facility.
DOS executables to run TAR2 and N-way cross validation experiments (UNIX executables are not provided but can easily be generated by compiling the source files directly).
Sample datasets and corresponding configuration/output files.
Documents including user manual and several associated research papers.

The directory structure of the un-zipped TAR3 system is as follows:

README:
COPYRITE: includes the GPL-2 Copy policy
.\doc user instructions and PDFs
.\src source files for TAR3 and N-way cross validation
.\bin all executables
.\samples sample data sets and output files

Research Papers

Practical Large Scale What-if Queries: Case Studies with Software Risk Assessment: on TARZAN, the prototype that preceded TAR2;
Condensing Uncertainty via Incremental Treatment Learning: a discussion of three detailed applications of TAR2;
Data Mining for Very Busy People: best general overview;
Just Enough Learning (of Association Rules): The TAR2 "Treatment" Learner: most details on the internals of TAR2.

See also \doc in the download zip files.

Memory

Under Windows 98, TAR2 easily handles 350,000 examples (13 attributes) in 64 MB of memory, but needs more (196 MB suggested) to handle more than, say, 550,000 examples (in 80 seconds).

Download

TAR2.2: start with .\dispatchTAR2\doc\TAR2intro.pdf.
TAR3: start with .\tar3\doc\TAR3manual.pdf, or have a look at the manual page.