INTERVENANT CLASSIFICATION IN AN AUDIOVISUAL DOCUMENT
Jeremy Philippeau, Julien Pinquier, Philippe Joly
2006
Abstract
This document deals with the definition of a new descriptor for audiovisual document indexing: the intervenant. We focus on its audiovisual localization, that is to say its place in an audiovisual sequence, and on its classification into three categories: IN, OUT or OFF. Based on a comparison of different analysis tools for both the audio and video modes, we define a set of descriptors that can be filled automatically and are potentially relevant for classifying the intervenant localization. The decision is taken on the basis of transition modeling between classes.
Paper Citation
in Harvard Style
Philippeau J., Pinquier J. and Joly P. (2006). INTERVENANT CLASSIFICATION IN AN AUDIOVISUAL DOCUMENT. In Proceedings of the International Conference on Signal Processing and Multimedia Applications - Volume 1: SIGMAP, (ICETE 2006) ISBN 978-972-8865-64-1, pages 185-188. DOI: 10.5220/0001570801850188
in Bibtex Style
@conference{sigmap06,
author={Jeremy Philippeau and Julien Pinquier and Philippe Joly},
title={INTERVENANT CLASSIFICATION IN AN AUDIOVISUAL DOCUMENT},
booktitle={Proceedings of the International Conference on Signal Processing and Multimedia Applications - Volume 1: SIGMAP, (ICETE 2006)},
year={2006},
pages={185-188},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001570801850188},
isbn={978-972-8865-64-1},
}
in EndNote Style
TY - CONF
JO - Proceedings of the International Conference on Signal Processing and Multimedia Applications - Volume 1: SIGMAP, (ICETE 2006)
TI - INTERVENANT CLASSIFICATION IN AN AUDIOVISUAL DOCUMENT
SN - 978-972-8865-64-1
AU - Philippeau J.
AU - Pinquier J.
AU - Joly P.
PY - 2006
SP - 185
EP - 188
DO - 10.5220/0001570801850188