ROBUST SPOKEN DOCUMENT RETRIEVAL BASED ON MULTILINGUAL SUBPHONETIC SEGMENT RECOGNITION

Shi-wook Lee; Kazuyo Tanaka; Yoshiaki Itoh

doi:10.5220/0002636201340139

2004

Abstract

This paper describes the development and application of a subphonetic segment recognition system for spoken document retrieval. It builds on an earlier open-vocabulary spoken document retrieval system, in which retrieval is carried out in the symbolic domain by measuring the distance between subphonetic segment sequences produced by pattern recognition in the acoustic domain. The system proposed here performs matching on subphonetic segments, a more basic unit than the semantic unit. As such, the system is not constrained by vocabulary or grammar and can be readily extended to multilingual tasks. The paper presents the proposed spoken document retrieval system, including its subphonetic segment recognition scheme, and evaluates the system's performance and feasibility through experiments on multilingual retrieval tasks.
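The idea of retrieving in the symbolic domain by measuring the distance between segment sequences can be illustrated with a minimal sketch. It is not the authors' matching method: it swaps in a plain approximate-substring edit distance with unit costs, and the function names, the segment symbols, and the toy documents are all hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def best_match_distance(query: Sequence[str], document: Sequence[str]) -> Tuple[int, int]:
    """Return (distance, end_index) of the document span closest to the query.

    Illustrative stand-in only: edit distance with unit costs, where the
    match is allowed to start at any position in the document.
    """
    # prev[j]: cheapest alignment of the query prefix seen so far with some
    # document span ending at position j. An empty query matches anywhere for free.
    prev = [0] * (len(document) + 1)
    for i, q in enumerate(query, start=1):
        curr = [i] + [0] * len(document)
        for j, d in enumerate(document, start=1):
            substitution = prev[j - 1] + (0 if q == d else 1)
            skip_query_symbol = prev[j] + 1
            skip_document_symbol = curr[j - 1] + 1
            curr[j] = min(substitution, skip_query_symbol, skip_document_symbol)
        prev = curr
    best_end = min(range(len(prev)), key=prev.__getitem__)
    return prev[best_end], best_end


def rank_documents(query: Sequence[str], documents: Dict[str, List[str]]) -> List[Tuple[int, str]]:
    """Sort documents by the distance of their best-matching span (smaller is better)."""
    return sorted((best_match_distance(query, segs)[0], doc_id)
                  for doc_id, segs in documents.items())


# Hypothetical recognizer outputs: each document is a sequence of segment symbols.
docs = {
    "doc_a": ["s", "k", "a", "t", "o"],
    "doc_b": ["k", "o", "t"],
    "doc_c": ["m", "m", "m"],
}
ranking = rank_documents(["k", "a", "t"], docs)
```

Because the comparison operates only on symbol sequences, nothing in the sketch depends on a word list or grammar, which is the property the abstract attributes to matching below the semantic unit.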

Paper Citation

in Harvard Style

Lee S., Tanaka K. and Itoh Y. (2004). ROBUST SPOKEN DOCUMENT RETRIEVAL BASED ON MULTILINGUAL SUBPHONETIC SEGMENT RECOGNITION. In Proceedings of the Sixth International Conference on Enterprise Information Systems - Volume 5: ICEIS, ISBN 972-8865-00-7, pages 134-139. DOI: 10.5220/0002636201340139

in Bibtex Style

@conference{iceis04,
author={Shi-wook Lee and Kazuyo Tanaka and Yoshiaki Itoh},
title={ROBUST SPOKEN DOCUMENT RETRIEVAL BASED ON MULTILINGUAL SUBPHONETIC SEGMENT RECOGNITION},
booktitle={Proceedings of the Sixth International Conference on Enterprise Information Systems - Volume 5: ICEIS},
year={2004},
pages={134-139},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002636201340139},
isbn={972-8865-00-7},
}

in EndNote Style

TY - CONF
JO - Proceedings of the Sixth International Conference on Enterprise Information Systems - Volume 5: ICEIS
TI - ROBUST SPOKEN DOCUMENT RETRIEVAL BASED ON MULTILINGUAL SUBPHONETIC SEGMENT RECOGNITION
SN - 972-8865-00-7
AU - Lee S.
AU - Tanaka K.
AU - Itoh Y.
PY - 2004
SP - 134
EP - 139
DO - 10.5220/0002636201340139