Authors:
Lucas B. Sena; Francisco D. B. S. Praciano; Iago C. Chaves; Felipe T. Brito; Eduardo Rodrigues Duarte Neto; Jose Maria Monteiro and Javam C. Machado
Affiliation:
Computer Science Department, Universidade Federal do Ceará, Fortaleza, Brazil
Keyword(s):
Audio Classification, Multi-context, Convolutional Neural Networks, Mel Spectrograms.
Abstract:
Audio classification is an important research topic in pattern recognition and has been widely used in several domains, such as sentiment analysis, speech emotion recognition, environmental sound classification and sound event detection. It consists of assigning an audio signal to one of a set of pre-defined semantic classes. In recent years, researchers have applied convolutional neural networks to tackle audio pattern recognition problems. However, these approaches are commonly designed for specific purposes. As a result, machine learning practitioners without specialist knowledge in audio classification may find it hard to select a proper approach for different audio contexts. In this paper we propose AUDIO-MC, a general framework for multi-context audio classification. The main goal of this work is to ease the adoption of audio classifiers by general machine learning practitioners who do not have audio analysis experience. Experimental results show that our framework achieves better or similar performance when compared to single-context audio classification techniques. The AUDIO-MC framework achieves an accuracy of over 80% in all analyzed contexts. In particular, the highest accuracies achieved are 90.60%, 93.21% and 98.10% on the RAVDESS, ESC-50 and URBAN datasets, respectively.