Table 6: Summary of the selected features for the LSQD.
No.  Measure   Statistic    Occ. (%)
1    Pitch     Mean         100.00
2    Pitch     Std          100.00
3    RUF       -            100.00
4    MFCC 4    Mean          99.10
5    MFCC 5    Std           97.30
6    HNR       Mean          84.68
7    EE        Std           84.68
8    EE        Max/Median    83.78
9    SP        Mean          83.78
10   MFCC 3    Mean          82.88
11   MFCC 1    Std           78.38
12   ZCR       Max/Mean      78.38
13   STE       Std           66.67
14   ∆MFCC 5   Std           55.86
15   EE        Mean          52.25
16   ∆MFCC 1   Std           51.35
17   STE       Mean          48.65
18   ∆MFCC 3   Std           46.85
19   MFCC 1    Mean          43.24
20   MFCC 5    Mean          37.84
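The "Occ. (%)" column can be read as a selection frequency. The following is a minimal sketch of how such a column could be computed, assuming (this is not stated in this section) that occurrence is the percentage of independent feature-selection runs in which a feature was chosen. The function names and the toy runs are hypothetical. As a side observation, every tabulated value is consistent with a count out of 111 (for example, 110/111 ≈ 99.10 %), but the actual number of runs is not given here.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def occurrence_percent(runs: List[Iterable[str]]) -> Dict[str, float]:
    """Percentage of runs in which each feature was selected.

    Assumption: each element of `runs` lists the features chosen
    by one feature-selection run.
    """
    counts: Counter = Counter()
    for selected in runs:
        # set() ensures a feature counts at most once per run
        counts.update(set(selected))
    total = len(runs)
    return {name: 100.0 * n / total for name, n in counts.items()}


def rank_features(occ: Dict[str, float]) -> List[Tuple[str, float]]:
    """Sort by decreasing occurrence, breaking ties by name."""
    return sorted(occ.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    # Hypothetical toy runs, not data from the paper.
    toy_runs = [
        ["Pitch Mean", "RUF"],
        ["Pitch Mean"],
        ["Pitch Mean", "RUF"],
        ["Pitch Mean", "HNR Mean"],
    ]
    for name, pct in rank_features(occurrence_percent(toy_runs)):
        print(f"{name:12s} {pct:6.2f}")
```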
and quadratic detectors better than those based on
neural networks. This can be explained by the fact
that the loss of generalization is directly related to
the tendency to overfit. Consequently, neural networks
can work better for a specific environment (real or
fictional), or when a single database is used for both
training and testing. However, they do not achieve good
results when the databases are crossed.
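The crossed-database protocol discussed above can be sketched as follows: a detector is trained on each database and tested on every database, so that the off-diagonal entries measure generalization. This is only an illustration under stated assumptions. The nearest-centroid classifier is a simple stand-in, not one of the detectors evaluated in the paper, and the database names, feature values, and function names are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# (feature vector, label) with label 1 = violent, 0 = non-violent
Sample = Tuple[Sequence[float], int]
Predictor = Callable[[Sequence[float]], int]


def fit_nearest_centroid(data: List[Sample]) -> Predictor:
    """Stand-in classifier (NOT the paper's detectors): class means."""
    sums: Dict[int, List[float]] = {}
    counts: Dict[int, int] = {}
    for x, y in data:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

    def predict(x: Sequence[float]) -> int:
        # Assign the label of the closest class centroid
        return min(
            centroids,
            key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
        )

    return predict


def error_rate(predict: Predictor, data: List[Sample]) -> float:
    """Fraction of misclassified samples."""
    return sum(1 for x, y in data if predict(x) != y) / len(data)


def cross_database_errors(
    fit: Callable[[List[Sample]], Predictor],
    databases: Dict[str, List[Sample]],
) -> Dict[Tuple[str, str], float]:
    """Error for every (training database, test database) pair."""
    results = {}
    for train_name, train_data in databases.items():
        predict = fit(train_data)
        for test_name, test_data in databases.items():
            results[(train_name, test_name)] = error_rate(predict, test_data)
    return results


# Hypothetical one-dimensional toy data, not taken from the paper.
toy_databases = {
    "real": [([0.0], 0), ([1.0], 0), ([4.0], 1), ([5.0], 1)],
    "fictional": [([2.0], 0), ([3.0], 0), ([8.0], 1), ([9.0], 1)],
}

if __name__ == "__main__":
    for pair, err in cross_database_errors(fit_nearest_centroid, toy_databases).items():
        print(pair, err)
```

In this toy setting the matched pairs give zero error, while the crossed pairs give 0.25 (trained on "real", tested on "fictional") and 0.5 (the reverse), mirroring the kind of degradation described in the text. Note that when training and test databases coincide, the figure is a training error. A realistic evaluation would hold out part of each database.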
Future work will focus on using other types
of classifiers and on testing the system with different
databases (e.g., video games). The use of additional
features and statistics will also be explored.
ACKNOWLEDGEMENTS
This work has been funded by the Spanish Ministry
of Economy and Competitiveness (under project
TEC2015-67387-C4-4-R, funds Spain/FEDER)
and by the University of Alcalá (under project
CCG2015/EXP-056).
ICPRAM 2017 - 6th International Conference on Pattern Recognition Applications and Methods