Authors:
Maroua Mehri (1); Petra Gomez-Krämer (2); Pierre Héroux (3); Alain Boucher (2) and Rémy Mullot (2)
Affiliations:
(1) University of La Rochelle and University of Rouen, France; (2) L3i, France; (3) LITIS, France
Keyword(s):
Ancient Digitized Books, Pixel Labeling, Texture, Multiresolution, Consensus Clustering, Clustering and Classification Accuracy Metrics.
Related Ontology Subjects/Areas/Topics:
Applications; Computer Vision, Visualization and Computer Graphics; Document Analysis and Understanding; Feature Selection and Extraction; Image Understanding; Pattern Recognition; Software Engineering; Theory and Methods
Abstract:
This article presents and evaluates a complete framework for the comparative analysis of texture features, applied to the segmentation and characterization of ancient book pages. First, the content of an entire book is characterized by extracting texture attributes from each page; the extraction is based on a multiresolution analysis. Second, a clustering approach automatically groups the homogeneous regions of the book pages. Specifically, two approaches are compared, each based on a different statistical category of texture features (autocorrelation and co-occurrence), in order to segment the content of ancient book pages and find homogeneous regions with little a priori knowledge. Several clustering and classification accuracy measures are computed, and the results of the comparison show the effectiveness of the proposed framework. Tests on different book contents (text vs. graphics, manuscript vs. printed) show that these texture features are better suited to distinguishing textual regions from graphical ones than to distinguishing text fonts.
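As a rough illustration of the two feature families named in the abstract, the sketch below computes a few co-occurrence and autocorrelation descriptors around a pixel at several window sizes. It is not the authors' implementation: the function names, the number of gray levels, the chosen statistics (contrast, energy, homogeneity), the lags, and the use of multiple window sizes as a stand-in for the multiresolution analysis are all assumptions made for this example.

```python
import numpy as np

def glcm(patch, levels=8, offset=(0, 1)):
    """Normalized gray-level co-occurrence matrix of a 2-D uint8 patch.

    `levels` and `offset` (dy, dx) are illustrative choices, not values
    taken from the paper.
    """
    q = (patch.astype(np.int64) * levels) // 256  # quantize to `levels` bins
    dy, dx = offset
    h, w = q.shape
    # Pair every pixel with its neighbor at the given offset.
    a = q[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    b = q[max(0, dy):h - max(0, -dy), max(0, dx):w - max(0, -dx)]
    m = np.zeros((levels, levels), dtype=np.float64)
    np.add.at(m, (a.ravel(), b.ravel()), 1.0)  # count co-occurring pairs
    return m / m.sum()

def cooccurrence_features(patch, levels=8):
    """Contrast, energy and homogeneity of the co-occurrence matrix."""
    p = glcm(patch, levels)
    i, j = np.indices(p.shape)
    contrast = np.sum(p * (i - j) ** 2)
    energy = np.sum(p ** 2)
    homogeneity = np.sum(p / (1.0 + np.abs(i - j)))
    return np.array([contrast, energy, homogeneity])

def autocorrelation_features(patch, max_lag=2):
    """Normalized autocorrelation at horizontal and vertical lags 1..max_lag."""
    x = patch.astype(np.float64)
    x -= x.mean()
    denom = np.sum(x * x)
    if denom == 0.0:  # flat patch: no texture, return zeros
        return np.zeros(2 * max_lag)
    feats = []
    for lag in range(1, max_lag + 1):
        feats.append(np.sum(x[:, :-lag] * x[:, lag:]) / denom)  # horizontal
        feats.append(np.sum(x[:-lag, :] * x[lag:, :]) / denom)  # vertical
    return np.array(feats)

def pixel_descriptor(image, y, x, sizes=(8, 16)):
    """Concatenate both feature families over several window sizes."""
    feats = []
    for s in sizes:
        half = s // 2
        win = image[max(0, y - half):y + half, max(0, x - half):x + half]
        feats.append(cooccurrence_features(win))
        feats.append(autocorrelation_features(win))
    return np.concatenate(feats)
```

Descriptors of this kind, computed for each pixel or for a sampled grid of pixels, could then be passed to any clustering algorithm to group pixels into homogeneous regions; the choice of algorithm and of the number of clusters is left open here.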