idea that humans do not detect emotions from the raw
audio signal, but from a signal that has previously
been processed by the cochlea, in this work we
proposed a new feature set for music emotion
recognition.
The audio signal was filtered with a bank of
Gammatone filters, resulting in 20 channels, each
representing the output of the basilar membrane at a
particular location corresponding to the filter's centre
frequency. In an automated process, the proposed
features were extracted as the average power of each
filter's output and compared with other state-of-the-
art features. Support vector machine and TreeBagger
classifiers were used for performance evaluation.
Experimental results on the 1000 Songs Database
showed that the proposed feature vector outperforms
the basic set as well as MFCC features. The proposed
features also performed better in terms of accuracy
gain when combined with the base set. The
comparison with MFCC is particularly relevant
because it gives real insight into how well the
proposed features perform relative to another
perceptually motivated feature set.
Although the results are good, they still need to
be improved for real-life applications in which
emotional changes must be tracked continuously
throughout an audio clip. In our future work, we will
focus our research on developing improved features
based on auditory perception. Extracting features
from a more complex model of auditory processing,
thus simulating the cochlea in more detail, could
bring further improvements in music emotion
recognition.
ACKNOWLEDGEMENTS
This work has been fully supported by the Croatian
Science Foundation under the project number UIP-
2014-09-3875.
REFERENCES
Bai, J., Peng, J., Shi, J., Tang, D., Wu, Y., Li, J., Luo, K.,
2016. Dimensional music emotion recognition by
valence-arousal regression, 2016 IEEE 15th
International Conference on Cognitive Informatics &
Cognitive Computing (ICCI*CC), Palo Alto, CA, pp.
42-49.
Glasberg, B.R., Moore, B. C. J., 1990. Derivation of
auditory filter shapes from notched-noise data,
Hearing Res., vol. 47, pp. 103–108.
Hampiholi, V., 2012. A method for Music Classification
based on Perceived Mood Detection for Indian
Bollywood Music, World Academy of Science,
Engineering and Technology, vol. 6, December.
Hevner, K., 1936. Experimental studies of the elements of
expression in music. American Journal of
Psychology, vol. 48, pp. 246–268.
Kartikay, A., Ganesan, H., Ladwani, V. M., 2016.
Classification of music into moods using musical
features, International Conference on Inventive
Computation Technologies (ICICT), Coimbatore, pp.
1-5.
Kim, Y. E., Schmidt, E. M., Migneco, R., Morton B. G.,
Richardson P., Scott, J., Speck, J. A., Turnbull, D.,
2010. Emotion Recognition: a State of the Art
Review, 11th International Society for Music
Information Retrieval Conference.
Kumar, N., Guha, T., Huang, C. W., Vaz, C., Narayanan,
S. S., 2016. Novel affective features for multiscale
prediction of emotion in music, IEEE 18th
International Workshop on Multimedia Signal
Processing (MMSP), Montreal, QC, pp. 1-5.
Lartillot, O., Toiviainen, P., 2007. A Matlab Toolbox for
Musical Feature Extraction From Audio, International
Conference on Digital Audio Effects, Bordeaux.
Lu, L., Liu, D., Zhang, H.-Y. 2006. Automatic Mood
Detection and Tracking of Music Audio Signals, IEEE
Transactions on Audio, Speech, and Language
Processing, Vol. 14, No. 1, January.
Patterson, R. D., Robinson, K., Holdsworth, J., McKeown,
D., Zhang, C., Allerhand, M. H., 1992. Complex
sounds and auditory images, in Auditory Physiology
and Perception, Oxford, pp. 429–446.
Russell, J. A., 1980. A circumplex model of affect. J.
Personality Social Psychology, vol. 39, no. 6, pp.
1161–1178.
Soleymani, M., Caro, M. N., Schmidt, E. M., Sha, C.-Y.,
Yang, Y.-H., 2013. 1000 Songs for Emotional
Analysis of Music, Proc. of the 2nd ACM
International Workshop on Crowdsourcing for
Multimedia, Barcelona, Spain.
Wood, P. A., Semwal, S. K., 2016. An algorithmic
approach to music retrieval by emotion based on
feature data, 2016 Future Technologies Conference
(FTC), San Francisco, CA, pp. 140-144.
Yang, Y.-H., Liu, C.-C., Chen, H.-H., 2006. Music
emotion classification: A fuzzy approach, ACMMM,
pp. 81-84.
SIGMAP 2017 - 14th International Conference on Signal Processing and Multimedia Applications