Multimedia Analysis of Video Sources

Juan Arraiza Irujo, Montse Cuadros, Naiara Aginako, Matteo Raffaelli, Olga Kaehm, Naser Damer, Joao P. Neto


Law Enforcement Agencies (LEAs) spend increasing effort and resources monitoring open sources in search of suspicious behaviour and clues to crimes. Monitoring open sources efficiently and effectively depends heavily on the capability to retrieve and analyze multimedia data automatically. This paper presents a multimodal analytics system created in cooperation with European LEAs. In particular, it describes how the video analytics subsystem produces a workflow of multimedia data analysis processes. After a first analysis of the video files, images are extracted for image comparison, classification and face recognition. In addition, audio content is extracted for speaker recognition and for multilingual analysis of text transcripts. Integrating the results of these analyses allows LEAs to extract pertinent knowledge from the gathered information.
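The workflow outlined in the abstract can be pictured with a minimal sketch. It is an illustration only, not the system's implementation: the `VideoItem` container, the step names and the `placeholder` analyzers are all hypothetical, and each analyzer merely records its input instead of performing any real comparison or recognition. The sketch shows only how extracted images and extracted audio feed separate groups of analyses whose results are then gathered into one record.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class VideoItem:
    """Hypothetical container for one video and the parts extracted from it."""
    name: str
    frames: List[str] = field(default_factory=list)  # extracted images
    audio: str = ""                                   # extracted audio content


# Analyses named in the abstract, grouped by the modality they consume.
IMAGE_STEPS = ["image_comparison", "image_classification", "face_recognition"]
AUDIO_STEPS = ["speaker_recognition", "multilingual_text_analysis"]


def placeholder(step: str) -> Callable[[str], str]:
    """Return a stand-in analyzer that only records what it was given."""
    return lambda data: f"{step}({data})"


def analyze(video: VideoItem) -> Dict[str, List[str]]:
    """Run each step on its modality and merge all results into one record."""
    results: Dict[str, List[str]] = {}
    for step in IMAGE_STEPS:
        run = placeholder(step)
        results[step] = [run(frame) for frame in video.frames]
    for step in AUDIO_STEPS:
        run = placeholder(step)
        results[step] = [run(video.audio)]
    return results
```

Calling `analyze(VideoItem("clip", ["frame1", "frame2"], "track1"))` returns one dictionary keyed by analysis name, which is the kind of integrated result the abstract refers to; a real system would replace each placeholder with an actual analysis component.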



Paper Citation

in Harvard Style

Arraiza Irujo J., Cuadros M., Aginako N., Raffaelli M., Kaehm O., Damer N. and Neto J. P. (2014). Multimedia Analysis of Video Sources. In Proceedings of the 11th International Conference on Signal Processing and Multimedia Applications - Volume 1: MUSESUAN, (ICETE 2014), ISBN 978-989-758-046-8, pages 346-352. DOI: 10.5220/0005126903460352

in Bibtex Style

author={Juan Arraiza Irujo and Montse Cuadros and Naiara Aginako and Matteo Raffaelli and Olga Kaehm and Naser Damer and Joao P. Neto},
title={Multimedia Analysis of Video Sources},
booktitle={Proceedings of the 11th International Conference on Signal Processing and Multimedia Applications - Volume 1: MUSESUAN, (ICETE 2014)},

in EndNote Style

JO - Proceedings of the 11th International Conference on Signal Processing and Multimedia Applications - Volume 1: MUSESUAN, (ICETE 2014)
TI - Multimedia Analysis of Video Sources
SN - 978-989-758-046-8
AU - Arraiza Irujo J.
AU - Cuadros M.
AU - Aginako N.
AU - Raffaelli M.
AU - Kaehm O.
AU - Damer N.
AU - Neto J. P.
PY - 2014
SP - 346
EP - 352
DO - 10.5220/0005126903460352