Fourth, our experiments have shown that using a pseudo-training sample to
enlarge the sample size is a good strategy. However, as no work has been done to
compare the mixed-sample classifier design with the pseudo-training sample
classifier design, it is difficult to say which one is better.
Finally, from all of the above discussions, we conclude that the classifier design
can be improved by using a pseudo-training sample. The problem yet to be solved is
how to avoid contamination by bad pseudo-patterns. There are two possible
solutions: one is to exclude the outliers from the original training sample set
before generating the pseudo-training samples; the other is to replace the outliers
in the original training sample with their local means. Although we believe that
both approaches can enhance the classifier design, they warrant further research.
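The two proposals above can be illustrated with a minimal sketch. It is not a tested part of this work. The outlier rule (a pattern is flagged when its mean distance to its k nearest neighbours exceeds a multiple of the median of that quantity), the parameters `k` and `factor`, and all function names are assumptions introduced purely for illustration. In a classification setting the functions would be applied to the training patterns of each class separately.

```python
import numpy as np


def knn_summary(X, k):
    """Return (local_means, mean_knn_dist) for every pattern in X.

    local_means[i] is the mean of the k nearest neighbours of X[i]
    (X[i] itself excluded); mean_knn_dist[i] is their mean distance.
    """
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)          # a pattern is not its own neighbour
    idx = np.argsort(d, axis=1)[:, :k]   # indices of the k nearest neighbours
    local_means = X[idx].mean(axis=1)
    mean_knn_dist = np.take_along_axis(d, idx, axis=1).mean(axis=1)
    return local_means, mean_knn_dist


def flag_outliers(X, k=3, factor=3.0):
    """Hypothetical rule: outlier if mean k-NN distance > factor * median."""
    X = np.asarray(X, dtype=float)
    _, score = knn_summary(X, k)
    return score > factor * np.median(score)


def exclude_outliers(X, k=3, factor=3.0):
    """First proposal: drop the flagged patterns before resampling."""
    X = np.asarray(X, dtype=float)
    return X[~flag_outliers(X, k, factor)]


def replace_outliers_with_local_means(X, k=3, factor=3.0):
    """Second proposal: substitute each flagged pattern by its local mean."""
    X = np.asarray(X, dtype=float)
    local_means, _ = knn_summary(X, k)
    mask = flag_outliers(X, k, factor)
    cleaned = X.copy()
    cleaned[mask] = local_means[mask]
    return cleaned


# Toy single-class sample: five clustered patterns and one distant pattern.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5], [10, 10]])
kept = exclude_outliers(X, k=2)
repaired = replace_outliers_with_local_means(X, k=2)
```

Either cleaned set would then be passed to the pseudo-training sample generator in place of the original training sample. Note that in this sketch the local mean of an outlier is computed from the uncleaned sample, so it may itself be influenced by other outliers; this is one of the questions the further research mentioned above would have to settle.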