Authors: N. Nikolaou (1); E. Badekas (1); N. Papamarkos (1) and C. Strouthopoulos (2)
Affiliations: (1) Democritus University of Thrace, Greece; (2) Technological Educational Institution of Serres, Greece
Keyword(s): Document processing, Text localization, Page layout analysis, Color quantization.
Related Ontology Subjects/Areas/Topics: Artificial Intelligence; Biomedical Engineering; Biomedical Signal Processing; Computer Vision, Visualization and Computer Graphics; Data Manipulation; Feature Extraction; Features Extraction; Health Engineering and Technology Applications; Human-Computer Interaction; Image and Video Analysis; Informatics in Control, Automation and Robotics; Methodologies and Methods; Neurocomputing; Neurotechnology, Electronics and Informatics; Pattern Recognition; Physiological Computing Systems; Segmentation and Grouping; Sensor Networks; Signal Processing, Sensors, Systems Modeling and Control; Soft Computing
Abstract:
A new method for text localization in color cover pages and general color document images is presented. The colors of the document image are first reduced to a small number using a color reduction technique based on a Kohonen Self Organized Map (KSOM) neural network. Each remaining color defines a color plane, from which the connected components (CCs) are extracted. In each color plane, a CC filtering procedure is applied, followed by a local grouping procedure that assembles the surviving CCs into groups. These groups are then refined by computing the Direction Of Connection (DOC) property of each CC, and the DOC property is used to classify each group as a text or non-text region. Finally, the text regions identified in the different color planes are superimposed to obtain the text localization for the entire document. The proposed technique was extensively tested with a large number of color documents.
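The color-plane step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the image has already been color-reduced to a small grid of color indices (the KSOM quantization itself is not shown), and it assumes 8-connectivity, which the abstract does not specify. The function names `color_planes` and `connected_components` are hypothetical.

```python
from collections import deque


def color_planes(indexed):
    """Split an index image (list of rows of color indices) into one
    binary plane per color. Assumes color reduction was already done."""
    colors = sorted({c for row in indexed for c in row})
    return {c: [[px == c for px in row] for row in indexed] for c in colors}


def connected_components(plane):
    """Return the bounding box (top, left, bottom, right) of every
    8-connected component in a binary plane, in raster-scan order."""
    h = len(plane)
    w = len(plane[0]) if h else 0
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not plane[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill starting from an unvisited pixel.
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and plane[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


if __name__ == "__main__":
    # Toy 3x4 index image with three colors (0, 1, 2).
    indexed = [
        [0, 1, 1, 0],
        [0, 1, 0, 0],
        [2, 0, 0, 1],
    ]
    for color, plane in color_planes(indexed).items():
        print(color, connected_components(plane))
```

In a full pipeline, the bounding boxes of each plane would feed the CC filtering, grouping, and DOC-based classification stages, which are not sketched here because the abstract does not give their details.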