5 Conclusions
This paper proposed an image-based indexing system based on fuzzy pattern
recognition built specifically for 17th century documents. The processing sequence
was presented, from the early candidate filtering to the actual computation of
similarity values, and test results and procedure were summarized.
The indexer system achieved quality results. Despite some problems detected with
compact italic text and small word images with few specific features to extract, most
indexing runs returned a large and accurate list of matches, provided word
segmentation worked suitably. False matches near the top of the list were limited. The
filters developed for indexing performed very well, drastically cutting processing time
while retaining high quality output.
Further work can include the development of an automatic parameter adjustment
system based on measurable properties of the documents being processed.
This work was partly supported by: the “Programa de Financiamento Plurianual de
Unidades de I&D (POCTI), do Quadro Comunitário de Apoio III”; the FCT project
POSI/SRI/41201/2001; “Programa do FSE-UE, PRODEP III, no âmbito do III
Quadro Comunitário de Apoio”; and the FEDER program. We also wish to thank the
Portuguese Biblioteca Nacional, whose continuous support has made this work
possible.
