SCRIPT-DESCRIPTION PAIR EXTRACTION FROM TEXT DOCUMENTS OF ENGLISH AS SECOND LANGUAGE PODCAST

Hyungjong Noh; Minwoo Jeong; Sungjin Lee; Jonghoon Lee; Gary Geunbae Lee

doi:10.5220/0002773000050010

2010

Abstract

One of the most effective ways to learn a language is to converse with a native speaker, but this is often very expensive. A good alternative is a Dialog-Based Computer Assisted Language Learning (DB-CALL) system, in which the quality of the feedback is very important. To supply a variety of expressions as feedback, we propose a method that extracts pairs of scripts and their description sentences from an English as a Second Language (ESL) podcast web site. A linear CRF classifier is used to find the description sentences that correspond to each script, with features selected according to the characteristics of the ESL text documents. The experimental results show that the performance of our system is acceptable.

Paper Citation

in Harvard Style

Noh H., Jeong M., Lee S., Lee J. and Lee G. G. (2010). SCRIPT-DESCRIPTION PAIR EXTRACTION FROM TEXT DOCUMENTS OF ENGLISH AS SECOND LANGUAGE PODCAST. In Proceedings of the 2nd International Conference on Computer Supported Education - Volume 1: CSEDU, ISBN 978-989-674-023-8, pages 5-10. DOI: 10.5220/0002773000050010

in Bibtex Style

@conference{csedu10,
author={Hyungjong Noh and Minwoo Jeong and Sungjin Lee and Jonghoon Lee and Gary Geunbae Lee},
title={SCRIPT-DESCRIPTION PAIR EXTRACTION FROM TEXT DOCUMENTS OF ENGLISH AS SECOND LANGUAGE PODCAST},
booktitle={Proceedings of the 2nd International Conference on Computer Supported Education - Volume 1: CSEDU},
year={2010},
pages={5-10},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002773000050010},
isbn={978-989-674-023-8},
}

in EndNote Style

TY - CONF
JO - Proceedings of the 2nd International Conference on Computer Supported Education - Volume 1: CSEDU
TI - SCRIPT-DESCRIPTION PAIR EXTRACTION FROM TEXT DOCUMENTS OF ENGLISH AS SECOND LANGUAGE PODCAST
SN - 978-989-674-023-8
AU - Noh H.
AU - Jeong M.
AU - Lee S.
AU - Lee J.
AU - Lee G. G.
PY - 2010
SP - 5
EP - 10
DO - 10.5220/0002773000050010