Controlling Error Rates with Cost-Sensitive Support Vector Machines
Overview
In both traditional classification and anomaly detection there are two types of errors
we can make (associated with the two classes). Traditionally, algorithms have focused
on minimizing the probability of making an error. However, in many practical settings
the two types of errors have very different costs. For example, in tumor classification
the probability of error is not the most appropriate criterion. The Neyman-Pearson (NP) approach seeks
instead to minimize false negatives while constraining false positives to be
below a certain significance level. This approach is useful in both the classification
setting as well as that of anomaly detection. The theory of learning classifiers with
respect to the NP criterion has recently begun to emerge and there is now a
need for the development of practical algorithms.
We show that support vector machines (SVMs) - one of the most powerful family of algorithms for
classification - can be adapted to this setting in a natural manner.
Cost-sensitive SVMs for NP Learning
There are two immediate options for adapting SVMs to the
NP criterion. Since SVM-based classifiers can be interpreted as
hyperplanes in an appropriate feature space, one option is to simply
shift the hyperplane to achieve the desired trade-off between false
positives and false negatives. Another approach is to use a
"cost-sensitive" SVM which introduces class-specific weights, penalizing
training errors from one class more than the other. The key challenge in this approach lies in appropriately setting the free parameters
in the SVM (in particular those which determine the relative costs for the two
error types). With this in mind we have developed the theory of the 2nu-SVM, providing a characterization
of the feasible parameter set for this method. Further contributions include a novel heuristic for improved
error estimation and a strategy for efficiently searching the
parameter space of this method. In the work below we have
shown that the 2nu-SVM is much more effective for NP
classification than simply shifting the decision boundary.
Anomaly Detection with Minimum Volume Sets
In anomaly detection the goal is to identify measurements as being
either normal/typical or abnormal/anomalous. Thus, we
would like to find a set such that points inside the set
correspond to typical data while points outside are anomalies. In
practice, it is often the case that such a set must be ``learned"
from a collection of training samples gathered
under normal conditions. In other words, we must be able to detect
anomalies without knowing what they look like. For example, in
machine fault detection, we would like to predict when a machine is
about to fail, but cannot gather data from a failed machine because
it would entail breaking the machine.
In this situation one possibility is to estimate the minimum
volume set (MV-set) - the set with smallest volume enclosing a
specified probability mass. Finding this set is actually equivalent to
finding an NP classifier if the anomalous class was distributed according
to a uniform distribution. Thus we can use the SVM algorithms described
above for anomaly detection by simply generating an artificial data set
according to a uniform distribution. While effective in some cases,
this simple approach to generating the uniform
data suffers from the curse of dimensionality. Therefore we have also
devised improved methods for generating artificial uniform data for
the two-class approach. We find that, in general, our method
performs more reliably than existing methods.
Software
We have modified the LIBSVM
package to implement the 2nu-SVM (described in the work above). Our modifications
can be downloaded here.
The version posted is a Matlab interface for LIBSVM 2.8 (based on one written by Jun-Cheng Chen,
Kuan-Jen Peng, Chih-Yuan Yang, and Chih-Huai Cheng from National Taiwan University).
If you would prefer to use your own version, we have instructions for how to modify
LIBSVM to implement the 2nu-SVM here.
E-mail md-at-rice-dot-edu if you find any bugs or have any questions.
Selected Publications
- M. A. Davenport,
"Error Control for Support Vector Machines,"
M.S. Thesis, Rice University, April 2007.
- M. A. Davenport, R. G. Baraniuk, and C. D. Scott,
"Learning minimum volume sets with support vector machines,"
in IEEE Internaitonal Workshop on Machine Learning for Signal Processing (MLSP),
Maynooth, Ireland, 2006.
- M. A. Davenport, R. G. Baraniuk, and C. D. Scott,
"Controlling false alarms with support vector machines,"
in IEEE Internaitonal Conference on Acoustics, Speech, and Signal Processing (ICASSP),
Toulouse, France, 2006.
- M. A. Davenport,
"The 2nu-SVM: A cost-sensitive extension of the nu-SVM,"
Rice University ECE Technical Report TREE 0504, October 2005.
Links
2nu-SVM Software,
Clay Scott,
Richard Baraniuk
|