Table 9: VoiceSeg, variation of the cluster distance;
fixed parameters: d=15 seconds, TN=2.

cluster distance    precision    recall
60 sec              42.39%       50.65%
120 sec             65.75%       62.38%
180 sec             45.95%       44.38%
240 sec             46.27%       40.28%
300 sec             46.23%       33.77%
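To compare the rows of Table 9 with a single number, one option is the balanced F-measure (harmonic mean of precision and recall). The paper itself reports only precision and recall, so the F-measure, the function names and the dictionary layout below are assumptions made for illustration; only the percentages are taken from the table.

```python
# Precision and recall (in percent) per cluster distance (in seconds),
# copied from Table 9. Everything else in this sketch is illustrative.
TABLE_9 = {
    60: (42.39, 50.65),
    120: (65.75, 62.38),
    180: (45.95, 44.38),
    240: (46.27, 40.28),
    300: (46.23, 33.77),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same unit as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def best_setting(table):
    """Return the cluster distance whose row has the highest F-measure."""
    return max(table, key=lambda distance: f1_score(*table[distance]))


if __name__ == "__main__":
    for distance, (p, r) in sorted(TABLE_9.items()):
        print(f"{distance:>3} sec  F1 = {f1_score(p, r):.2f}%")
    print("best cluster distance:", best_setting(TABLE_9), "sec")
```

Under this criterion the 120-second row ranks first, which agrees with the row that has both the highest precision and the highest recall in the table.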
5 CONCLUSION AND OUTLOOK
In this paper, we present a system for segmenting
lecture videos on the basis of imperfect transcripts. Our
evaluation shows that it is possible to detect boundaries in an
imperfect transcript. The results are surprisingly good,
bearing in mind that the raw material is highly erroneous
(only 70%-80% is correctly recognized). The
parameter values d=15 seconds, a cluster distance of
120 seconds and a TN of
2 are considered optimal. The VoiceSeg algorithm
(precision 65.75%, recall 62.38%) is more
successful at detecting the boundaries than
the adapted TextTiling algorithm (precision 38.30%,
recall 23.38%) and the baseline algorithm (precision
32.50%, recall 33.66%).
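As a rough single-figure summary of this comparison, the precision and recall values above can be combined into the balanced F-measure. This measure is not reported in the paper; it is introduced here only as an illustrative assumption, and the results are rounded to one decimal place.

```latex
F_1 = \frac{2PR}{P + R}
\qquad
\begin{aligned}
F_1^{\text{VoiceSeg}}   &= \frac{2 \cdot 65.75 \cdot 62.38}{65.75 + 62.38} \approx 64.0\% \\
F_1^{\text{TextTiling}} &= \frac{2 \cdot 38.30 \cdot 23.38}{38.30 + 23.38} \approx 29.0\% \\
F_1^{\text{baseline}}   &= \frac{2 \cdot 32.50 \cdot 33.66}{32.50 + 33.66} \approx 33.1\%
\end{aligned}
```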
Both our algorithm and the TextTiling algorithm have
problems detecting boundaries inside repetition
segments or overview segments (which occur at
the beginning and end of a lecture). In these segments,
many infrequent words occur very close together.
Further study may be carried out with our new
algorithm VoiceSeg. We will study the influence of
different weighting equations and of
other cluster values (length, correlation, etc.) on the
results. We will adapt cue-phrase or other pattern-detection
techniques to the areas around potential
boundaries (for example, pauses). Another important
point is to compare our results with those of other
state-of-the-art topic segmentation algorithms (Galley
et al., 2003; Choi, 2000) using erroneous transcripts.
We are also working on a "lecture-browser" for
simple navigation through the corpus of lectures. This
lecture-browser is intended to help students in their learning and
to make the process of learning more efficient. The
combination of pedagogical and content description
leads to novel forms of visualization and exploration
of course lectures.
REFERENCES
Schillings, V.; Meinel, C., 2002. tele-TASK - Teleteaching
Anywhere Solution Kit. In Proceedings of ACM
SIGUCCS 2002, 130-133. Providence, USA.
Hürst, W., 2003. A qualitative study towards using large
vocabulary automatic speech recognition to index
recorded presentations for search and access over the
web. IADIS International Journal on WWW/Internet,
Volume I, Number 1: 43-58.
Chau, M.; Jay, F.; Nunamaker, Jr.; Ming, L.; Chen, H.,
2004. Segmentation of Lecture Videos Based on Text:
A Method Combining Multiple Linguistic Features. In
Proceedings of the 37th Hawaii International
Conference on System Sciences. Hawaii, USA.
Baeza-Yates, R.; Ribeiro-Neto, B., 1999. Modern
Information Retrieval. New York, USA: Addison-
Wesley.
Glass, J.; Hazen, T.J.; Hetherington, L.; Wang, C., 2004.
Analysis and Processing of Lecture Audio Data:
Preliminary Investigations. In Proceedings of the
HLT-NAACL 2004 Workshop on Interdisciplinary
Approaches to Speech Indexing and Retrieval, 9-12.
Boston, MA, USA.
Linckels, S.; Meinel, C.; Engel, T., 2005. Teaching in
the Cyber Age: Technologies, Experiments, and
Realizations. In Proceedings of 3. Deutschen e-
Learning Fachtagung der Gesellschaft für Informatik
(DeLFI), 225-236. Rostock, Germany.
Repp, S.; Meinel, C., 2006. Semantic Indexing for
Recorded Educational Lecture Videos. In Proceedings
of the Fourth Annual IEEE International Conference
on Pervasive Computing and Communications
Workshops (PERCOMW'06), 240-245. Pisa, Italy.
Nicola, S., 2004. Applications of Lexical Cohesion
Analysis in the Topic Detection and Tracking Domain.
Ph.D. diss., Dept. of Computer Science, University
College Dublin.
Hearst, Marti A., 1997. TextTiling: Segmenting Text into
Multi-paragraph Subtopic Passages. Computational
Linguistics 23, 33-64. Cambridge, MA: MIT Press.
Reynar, J. C., 1998. Topic Segmentation: Algorithms and
application. Ph.D. diss., University of Pennsylvania.
Morris, J.; Hirst, G., 1991. Lexical cohesion computed by
thesaural relations as an indicator of the structure of
text. Computational Linguistics 17, 21-48. Cambridge,
MA: MIT Press.
Tür, G.; Hakkani-Tür, D.; Shriberg, E., 2001. Integrating
Prosodic and Lexical Cues for Automatic Topic
Segmentation. In CoRR.
Beeferman, D.; Adam, L.; Berger, A.; Lafferty, J., 1999.
Statistical Models for Text Segmentation. In Machine
Learning 34.
Choi, F., 2000. Advances in domain independent linear text
segmentation. In Proceedings of NAACL.
Galley, M.; McKeown, K.; Fosler-Lussier, E.; Jing, H.,
2003. Discourse Segmentation of Multi-Party
Conversation. In Proceedings of the 41st Annual
Meeting of the Association for Computational
Linguistics.
SIGMAP 2006 - INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND MULTIMEDIA
APPLICATIONS