Figure 2: Average Recall values versus the number of randomly
selected windows for each method (OVA-SVM, OVA-AdaBoost-SVM,
Sparse-ECOC-SVM). Horizontal axis: number of randomly selected
windows (0 to 600); vertical axis: Average Recall (%) (80 to 100).
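As an illustration of the metric plotted in Figure 2, the following is a minimal sketch that assumes Average Recall denotes the macro-averaged recall, i.e., the unweighted mean of the per-class recall values expressed as a percentage. The function name average_recall and the toy labels are hypothetical and are not taken from the experiments.

```python
from collections import defaultdict


def average_recall(y_true, y_pred):
    """Macro-averaged recall in percent (assumed definition of Average Recall).

    Per-class recall is the fraction of samples of that class that were
    predicted correctly; the result is the unweighted mean over classes.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # number of correctly predicted samples per class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1

    recalls = [correct[c] / total[c] for c in total]
    return 100.0 * sum(recalls) / len(recalls)


# Toy example with three hypothetical form types.
y_true = ["a", "a", "b", "b", "b", "c"]
y_pred = ["a", "b", "b", "b", "b", "c"]
score = average_recall(y_true, y_pred)
```

Because every class contributes equally to the mean, a rare form type that is often misclassified lowers this score more than it would lower plain accuracy.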
4 CONCLUSION
In this paper, we have proposed a method for the classification and
retrieval of business-form document images. Our method incorporates
the BoVW model using a set of features based on structural variations
in local image patches, and presents an approach to learning the
visual words' histogram at the layout level. Using real-world bank
data, we analyzed different multiclass classification strategies and
a boosting ensemble method with SVM as the base learner. Although the
initial results of this study seem promising, we believe that the
proposed document image classification approach should also be
investigated on benchmark datasets. Furthermore, the effectiveness of
the proposed local feature descriptors should be compared with that
of existing descriptors in the literature, e.g., SIFT and SURF. Both
issues remain as future work to validate the robustness of the
proposed approach.
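To make the difference between the multiclass strategies concrete, the following is a minimal sketch of how the outputs of binary learners can be combined into a class decision under a one-versus-all (OVA) code matrix and under a sparse ECOC code matrix. It assumes Hamming-style decoding in which a zero code entry contributes a distance of 0.5. The code matrix SPARSE_CODE and the output values are hypothetical, and the sketch does not reproduce the decoding rule or the trained SVMs used in the experiments.

```python
def sign(x):
    """Return -1, 0, or +1 according to the sign of x."""
    return (x > 0) - (x < 0)


def ova_code_matrix(n_classes):
    """One-versus-all code matrix: +1 on the diagonal, -1 elsewhere."""
    return [[1 if r == c else -1 for c in range(n_classes)]
            for r in range(n_classes)]


def hamming_decode(code_matrix, outputs):
    """Pick the class whose codeword is closest to the binary outputs.

    code_matrix[r][c] is in {-1, 0, +1}: the role of class r in binary
    problem c (0 means the class is ignored by that problem).
    outputs[c] is the real-valued output of binary learner c.
    A disagreement costs 1, an agreement 0, and an ignored entry 0.5.
    """
    best_class, best_dist = None, float("inf")
    for r, codeword in enumerate(code_matrix):
        dist = sum((1 - sign(m * f)) / 2 for m, f in zip(codeword, outputs))
        if dist < best_dist:
            best_class, best_dist = r, dist
    return best_class


# OVA with three classes: only the second learner fires positively.
ova_prediction = hamming_decode(ova_code_matrix(3), [-1.2, 0.3, -0.5])

# Hypothetical sparse code for four classes and four binary problems.
SPARSE_CODE = [
    [1, 1, 0, 1],
    [-1, 0, 1, 1],
    [0, -1, -1, 1],
    [1, -1, 1, -1],
]
ecoc_prediction = hamming_decode(SPARSE_CODE, [-0.7, 0.2, 0.9, 0.4])
```

In this toy setting both calls return class index 1. The sparse code lets each binary problem ignore some classes, which is why zero entries need a neutral cost during decoding.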
ACKNOWLEDGEMENTS
This work was partially supported by the Scientific
and Technological Research Council of Turkey under
Grant 3120918 and by Yapi Kredi Bank under Grant
62609.