Department of Mathematics,
University of California San Diego
****************************
Special Statistics Seminar
Alejandro Murua
University of Washington
A recipe for combining classifiers, and a particular application to isolated (spoken) digit recognition
Abstract:
Recently there has been a fair amount of interest in combining several classification trees so as to obtain better decision rules. Techniques such as bagging, boosting, and randomized trees are particularly popular in statistics and computer science. The best PAC-learning theoretical bounds on the classification error rate achieved by these techniques do not offer any insight into how one should combine these classifiers in order to reduce the error rate. In this talk I will present the notion of weakly dependent classifiers, and show that when the dependence between the classifiers is low and the expected margins (a measure of confidence in the classifiers) are large, exponential upper bounds on the classification error rate can be achieved. In particular, experiments with several data sets indicate that there appears to be a trade-off between weak dependence and expected margins, in the sense that low expected margins can be compensated for by low mutual dependence between the classifiers. The results will be motivated and made more intuitive through an application of randomized relational decision trees to speech recognition.
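For concreteness, the sketch below (in Python) combines randomized decision trees by plurality vote and reports the two quantities the abstract puts in tension: the empirical expected margin, in the sense of Schapire et al. (1998), and a rough pairwise-dependence proxy (mean agreement between trees). This is a generic illustration, not the speaker's construction; the dataset (scikit-learn's handwritten digits, standing in for spoken digits), the number of trees, and the tree parameters are all assumptions made for the example.

    # Illustrative sketch only -- not the talk's method. Plurality vote
    # over randomized trees, with the empirical expected margin and a
    # crude dependence proxy (mean pairwise agreement between trees).
    import numpy as np
    from sklearn.datasets import load_digits
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(0)
    X, y = load_digits(return_X_y=True)      # stand-in for spoken-digit data
    Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

    n_trees, n_classes = 50, 10
    preds = np.empty((n_trees, len(yte)), dtype=int)
    for t in range(n_trees):
        idx = rng.integers(0, len(ytr), len(ytr))        # bootstrap sample
        tree = DecisionTreeClassifier(splitter="random",  # randomized splits
                                      max_features="sqrt",
                                      random_state=t).fit(Xtr[idx], ytr[idx])
        preds[t] = tree.predict(Xte)

    # Vote fractions f_k(x); plurality decision; margin per test point
    # m(x, y) = f_y(x) - max_{k != y} f_k(x)  (Schapire et al., 1998).
    votes = np.stack([(preds == k).mean(axis=0) for k in range(n_classes)])
    combined = votes.argmax(axis=0)
    correct = votes[yte, np.arange(len(yte))]
    votes[yte, np.arange(len(yte))] = -1.0               # mask true class
    margins = correct - votes.max(axis=0)

    agree = np.mean([(preds[i] == preds[j]).mean()
                     for i in range(n_trees) for j in range(i + 1, n_trees)])
    print(f"error rate      : {(combined != yte).mean():.3f}")
    print(f"expected margin : {margins.mean():.3f}")
    print(f"mean agreement  : {agree:.3f}  (rough dependence proxy)")

In the abstract's terms, the trade-off would show up here as a lower error rate than the mean margin alone suggests whenever the mean agreement between trees is low.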
Host: Ian Abramson
February 13, 2003
1:00 PM
AP&M 6438
****************************