A System for Historical Documents Transcription based on Hierarchical Classification and Dictionary Matching

Camelia Lemnaru; Andreea Sin-Neamțiu; Mihai-Andrei Vereș; Rodica Potolea

Topics: Clustering and Classification Methods; Mining High-Dimensional Data; Pre-Processing and Post-Processing for Data Mining; Structured Data Analysis and Statistical Methods

In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval (IC3K 2012) - KDIR, pages 353-357, 2012, Barcelona, Spain

Authors: Camelia Lemnaru; Andreea Sin-Neamțiu; Mihai-Andrei Vereș and Rodica Potolea

Affiliation: Technical University of Cluj-Napoca, Romania

Keyword(s): Handwriting Recognition, Historical Document, Hierarchical Classifier, Dictionary Analysis, Kurrent Schrift.

Related Ontology Subjects/Areas/Topics: Artificial Intelligence; Clustering and Classification Methods; Knowledge Discovery and Information Retrieval; Knowledge-Based Systems; Mining High-Dimensional Data; Pre-Processing and Post-Processing for Data Mining; Structured Data Analysis and Statistical Methods; Symbolic Systems

Abstract: Information contained in historical sources is highly important for historians' research; yet extracting it manually from documents written in difficult scripts is often expensive and time-consuming. This paper proposes a modular system for transcribing documents written in a challenging script (German Kurrent Schrift). The solution comprises three main stages, Document Processing, Word Processing and Word Selector, chained together in a linear pipeline. The system is currently under development, with several modules in each stage already implemented and evaluated. The main focus so far has been on the character recognition module, for which a hierarchical classifier is proposed. Preliminary evaluations of the character recognition module have yielded an overall character recognition rate of approximately 82% and identified several groups of confusable characters, for which an additional identification model is currently being investigated. Word composition based on a dictionary matching approach using the Levenshtein distance is also presented.
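
The dictionary matching idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `levenshtein` and `closest_words`, the parameter `k`, and the sample lexicon are assumptions introduced here purely for illustration. The sketch assumes the character recognizer emits a candidate string, which is then ranked against dictionary entries by edit distance.

```python
from typing import List, Tuple


def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character insertions,
    deletions, and substitutions needed to turn `a` into `b`."""
    # Keep the shorter string in the inner loop to use less memory.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] holds the distance between the processed prefix of `a`
    # and the first j characters of `b`.
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def closest_words(candidate: str, dictionary: List[str],
                  k: int = 3) -> List[Tuple[int, str]]:
    """Hypothetical helper: return the k dictionary words nearest to the
    recognized candidate, as (distance, word) pairs sorted by distance."""
    scored = sorted((levenshtein(candidate, word), word) for word in dictionary)
    return scored[:k]
```

Calling `closest_words("Kirchc", ["Kirche", "Kirsche", "Küche", "Kuchen"])` ranks "Kirche" first, since a single substitution repairs the misrecognized final character. A real system would likely weight substitutions by how confusable two characters are, which plain Levenshtein distance does not do.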

CC BY-NC-ND 4.0

Paper citation in several formats:

Lemnaru, C., Sin-Neamțiu, A., Vereș, M.-A. and Potolea, R. (2012). A System for Historical Documents Transcription based on Hierarchical Classification and Dictionary Matching. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval (IC3K 2012) - KDIR; ISBN 978-989-8565-29-7; ISSN 2184-3228, SciTePress, pages 353-357. DOI: 10.5220/0004143003530357

@conference{kdir12,
author={Camelia Lemnaru and Andreea Sin{-}Neamțiu and Mihai{-}Andrei Vereș and Rodica Potolea},
title={A System for Historical Documents Transcription based on Hierarchical Classification and Dictionary Matching},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval (IC3K 2012) - KDIR},
year={2012},
pages={353-357},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004143003530357},
isbn={978-989-8565-29-7},
issn={2184-3228},
}

TY - CONF
JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval (IC3K 2012) - KDIR
TI - A System for Historical Documents Transcription based on Hierarchical Classification and Dictionary Matching
SN - 978-989-8565-29-7
IS - 2184-3228
AU - Lemnaru, C.
AU - Sin-Neamțiu, A.
AU - Vereș, M.
AU - Potolea, R.
PY - 2012
SP - 353
EP - 357
DO - 10.5220/0004143003530357
PB - SciTePress