Prior work in inductive learning focused on generic
algorithms that sought to reduce complexity. Thus,
simplifying assumptions were made. 1. All data
resides on a single processor, and resides
entirely in main memory;
Clearly in modern organizations today, most data
resides in a distributed
architecture with only small portions being resident
in main memory at each moment.
2. Each datum is considered equally important and uniform
costs are assumed. In real world contexts, different
exemplars frequently have
varying costs.
3. All features are freely acquired with no
computational or monetary
costs. This is unrealistic for many applications,
such as medical diagnosis. Usually, the test for each
feature consumes different costs and cannot be
ignored, i.e., accurate models that only take
advantage of the most expensive features are not
acceptable.
4. Model is computed on the basis of complete
knowledge. A learned hypothesis will
be applied to scenarios that are completely
represented in the training set. This
assumption is more often violated than satisfied.
There are usually new and unknown
patterns that traditional hypotheses will either
ignore or misclassify.
algorithms that sought to reduce complexity. Thus,
simplifying assumptions were made. 1. All data
resides on a single processor, and resides
entirely in main memory;
Clearly in modern organizations today, most data
resides in a distributed
architecture with only small portions being resident
in main memory at each moment.
2. Each datum is considered equally important and uniform
costs are assumed. In real world contexts, different
exemplars frequently have
varying costs.
3. All features are freely acquired with no
computational or monetary
costs. This is unrealistic for many applications,
such as medical diagnosis. Usually, the test for each
feature consumes different costs and cannot be
ignored, i.e., accurate models that only take
advantage of the most expensive features are not
acceptable.
4. Model is computed on the basis of complete
knowledge. A learned hypothesis will
be applied to scenarios that are completely
represented in the training set. This
assumption is more often violated than satisfied.
There are usually new and unknown
patterns that traditional hypotheses will either
ignore or misclassify.