class probability estimated and the original quantities used for SemiBoost. This was motivated by the observation that the confidence levels assigned to the unlabeled samples can be adjusted by subtracting the probability estimates as a penalty cost. Under the modified criterion, the confidence values of the labeled and unlabeled data can be balanced.
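As a minimal sketch of this adjustment (the function name, the simple linear penalty weighting, and the top-k selection of strong samples are illustrative assumptions, not the paper's exact formulation), the criterion might be expressed as follows:

import numpy as np

def select_strong_samples(confidences, class_probs, k, penalty_weight=1.0):
    # Modified criterion: subtract the estimated class probability from
    # the SemiBoost confidence as a penalty cost. The linear weighting
    # is an illustrative assumption, not the paper's exact form.
    confidences = np.asarray(confidences, dtype=float)
    class_probs = np.asarray(class_probs, dtype=float)
    adjusted = confidences - penalty_weight * class_probs
    # Keep the k unlabeled samples that remain most confident, i.e. the
    # "strong" samples handed on to the S3VM.
    strong_idx = np.argsort(adjusted)[::-1][:k]
    return strong_idx, adjusted

# Example: retain the 3 strongest of 6 unlabeled samples.
idx, scores = select_strong_samples(
    confidences=[0.9, 0.8, 0.7, 0.6, 0.5, 0.4],
    class_probs=[0.5, 0.1, 0.6, 0.2, 0.1, 0.3],
    k=3)
print(idx, scores[idx])

Here the strong samples are simply the k highest-scoring unlabeled points; choosing k well is precisely the open cardinality question discussed below.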
The experimental results demonstrate that the modified sampling criterion works well with the S3VM, particularly when the positive and negative classes have similar impacts at the decision boundary. Furthermore, the results show that the classification accuracy of the proposed algorithm is superior to that of the traditional algorithms when a small amount of unlabeled data is selected appropriately. Although the modified criterion has been shown to improve the S3VM, many tasks remain. A significant one is the selection of an optimal, or near-optimal, cardinality for the set of strong samples in order to further improve the classification accuracy. Furthermore, it is not yet clear which types of datasets are best suited to the selection strategy for the S3VM. Finally, the proposed method lacks some of the details that would support its technical reliability, and the experiments performed were limited. Future studies will address these concerns.
REFERENCES
Adankon, M. M. and Cheriet, M. (2011). Help-training for
semi-supervised support vector machines. In Pattern
Recognition, volume 44, pages 2946–2957.
Ben-David, S., Lu, T., and Pal, D. (2008). Does unlabeled data provably help? Worst-case analysis of the sample complexity of semi-supervised learning. In Proc. the 21st Ann. Conf. Computational Learning Theory (COLT08), pages 33–44, Helsinki, Finland.
Bennett, K. P. and Demiriz, A. (1998). Semi-supervised
support vector machines. In Proc. Neural Information
Processing Systems, pages 368–374.
Blum, A. and Mitchell, T. (1998). Combining labeled and
unlabeled data with co-training. In Proc. the 11th Ann.
Conf. Computational Learning Theory (COLT98),
pages 92–100, Madison, WI.
Chakraborty, S. (2011). Bayesian semi-supervised learning
with support vector machine. In Statistical Methodol-
ogy, volume 8, pages 68–82.
Chang, C. -C. and Lin, C. -J. (2011). LIBSVM: a library for
support vector machines. In ACM Trans. on Intelligent
Systems and Technology, volume 2, pages 1–27.
Chapelle, O., Schölkopf, B., and Zien, A. (2006). Semi-
Supervised Learning. The MIT Press, Cambridge,
MA.
Dagan, I. and Engelson, S. P. (1995). Committee-based sampling for training probabilistic classifiers. In A. Prieditis and S. J. Russell, editors, Proc. Int'l Conf. on Machine Learning, pages 150–157, Tahoe City, CA.
Du, J., Ling, C. X., and Zhou, Z. -H. (2011). When does co-
training work in real data? In IEEE Trans. on Knowl-
edge and Data Eng., volume 23, pages 788–799.
Duin, R. P. W., Juszczak, P., de Ridder, D., Paclik, P.,
Pekalska, E., and Tax, D. M. J. (2004). PRTools 4:
a Matlab Toolbox for Pattern Recognition. Delft Uni-
versity of Technology, The Netherlands.
Everingham, M., Van Gool, L., Williams, C. K. I., Winn,
J., and Zisserman, A. (2007). The PASCAL Visual
Object Classes Challenge 2007 (VOC2007) Results.
Goldberg, A. B. (2010). New Directions in Semi-Supervised
Learning. University of Wisconsin-Madison, Madison, WI.
Goldberg, A. B., Zhu, X., Singh, A., Zhu, Z., and Nowak, R. (2009). Multi-manifold semi-supervised learning. In D. van Dyk and M. Welling, editors, Proc. the 12th Int'l Conf. Artificial Intelligence and Statistics (AISTATS), pages 99–106, Clearwater, FL.
Huber, P. J. (1981). Robust Statistics. John Wiley & Sons,
New York, NY.
Jiang, Z., Zhang, S., and Zeng, J. (2013). A hybrid generative/discriminative method for semi-supervised classification. In Knowledge-Based Systems, volume 37, pages 137–145.
Joachims, T. (1999a). Making large-scale SVM learning practical. In B. Schölkopf, C. Burges, and A. Smola, editors, Advances in Kernel Methods - Support Vector Learning, pages 41–56, Cambridge, MA. The MIT Press.
Joachims, T. (1999b). Transductive inference for text clas-
sification using support vector machines. In Proc. the
16th Int’l Conf. on Machine Learning, pages 200–209,
San Francisco, CA. Morgan Kaufmann.
Kuo, H. -K. J. and Goel, V. (2005). Active learning with
minimum expected error for spoken language under-
standing. In Proc. the 9th Euro. Conf. on Speech Com-
munication and Technology, pages 437–440, Lisbon.
Interspeech.
Le, T. -B. and Kim, S. -W. (2012). On improving semi-supervised MarginBoost incrementally using strong unlabeled data. In P. L. Carmona, J. S. Sánchez, and A. Fred, editors, Proc. the 1st Int'l Conf. Pattern Recognition Applications and Methods (ICPRAM 2012), pages 265–268, Vilamoura-Algarve, Portugal.
Leng, Y., Xu, X., and Qi, G. (2013). Combining active
learning and semi-supervised learning to construct
SVM classifier. In Knowledge-Based Systems, vol-
ume 44, pages 121–131.
Li, Y. -F. and Zhou, Z. -H. (2011). Improving semi-
supervised support vector machines through unlabeled
instances selection. In Proc. the 25th AAAI Conf. on
Artificial Intelligence (AAAI’11), pages 386–391, San
Francisco, CA.
Lu, T. (2009). Fundamental Limitations of Semi-Supervised
Learning. University of Waterloo, Waterloo, Canada.
Mallapragada, P. K., Jin, R., Jain, A. K., and Liu, Y. (2009). SemiBoost: Boosting for semi-supervised learning. In IEEE Trans. on Pattern Analysis and Machine Intelligence, volume 31, pages 2000–2014.