
and “wants to go home”. These three classes exhibited higher feature variation than the other, closely related classes. Collecting data from more unique entities (cows) for the “asking for water” and “wants to go home” classes could increase the classification accuracy for these classes. Table 16 summarizes the performance of prior methods, providing a representative reference for the animal speech classification accuracy of existing deep learning approaches; our results are added in the last row.
4.3.2 Unseen Dataset
The performance of the models trained on the overall data is evaluated on subject-independent unseen data. Classification of the five classes is achieved with test accuracies of 87% and 88%, and kappa coefficients of 0.83 and 0.85, for MFCC features and OpenSMILE features, respectively. The detailed performance metrics and confusion matrix for the model with MFCC features on unseen data are given in Table 14 and Fig. 3, respectively; those for the model with OpenSMILE features are given in Table 15 and Fig. 4. As shown in Table 3, accuracy was highest at 97% with both feature extraction methods on the overall data (i.e., when all augmentation methods are used together). The model with OpenSMILE features performed slightly better on subject-independent unseen data (88% accuracy) than the model with MFCC features (87% accuracy). These results indicate that the proposed methodology can be effective in monitoring cattle behaviour.
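The kappa coefficients reported above (0.83 and 0.85) are Cohen's kappa, which corrects raw accuracy for agreement expected by chance. As a minimal illustration of how such a value is obtained (the labels below are hypothetical, not the paper's data), it can be computed in plain Python:

```python
from collections import Counter

def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(y_true)
    # observed agreement (this is the plain accuracy)
    po = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # chance agreement from the marginal label frequencies
    ct, cp = Counter(y_true), Counter(y_pred)
    pe = sum(ct[label] * cp[label] for label in set(ct) | set(cp)) / (n * n)
    return (po - pe) / (1 - pe)

# Hypothetical predictions over five classes (one confusion in class 4)
y_true = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]
y_pred = [0, 0, 1, 1, 2, 2, 3, 3, 4, 3]
print(round(cohen_kappa(y_true, y_pred), 3))  # 0.875
```

In practice the same value can be obtained from `sklearn.metrics.cohen_kappa_score`; kappa values above 0.8, such as those reported here, are commonly interpreted as almost perfect agreement on the Landis and Koch scale.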
5 CONCLUSIONS
The results of this study could be utilised to create sophisticated non-invasive systems for tracking the behaviour of cattle and generating a richer understanding of their individual and collective responses in various scenarios. This would help farmers and ranchers manage their herds. Future studies could broaden the scope of this work to include aspects such as handling the effect of background noise, intrusion detection by recognizing non-cattle sounds, and new cattle scenarios.
REFERENCES
Briggs, F., Huang, Y., Raich, R., Eftaxias, K., Lei, Z.,
Cukierski, W., Hadley, S. F., Hadley, A., Betts, M.,
Fern, X. Z., et al. (2013). The 9th annual mlsp com-
petition: New methods for acoustic classification of
multiple simultaneous bird species in a noisy environ-
ment. In IEEE international workshop on machine
learning for signal processing (MLSP), pages 1–8.
Deller Jr, J. R. (1993). Discrete-time processing of speech
signals. In Discrete-time processing of speech signals,
pages 908–908.
Dousti Mousavi, N., Aldirawi, H., and Yang, J. (2023).
Categorical data analysis for high-dimensional sparse
gene expression data. BioTech, 12(3):52.
Esposito, M., Valente, G., Plasencia-Calaña, Y., Dumontier, M., Giordano, B. L., and Formisano, E. (2023). Semantically-informed deep neural networks for sound recognition. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 1–5.
Eyben, F., Wöllmer, M., and Schuller, B. (2010). Opensmile: the munich versatile and fast open-source audio feature extractor. In Proceedings of the 18th ACM international conference on Multimedia, pages 1459–1462.
Ganchev, T., Fakotakis, N., and Kokkinakis, G. (2005).
Comparative evaluation of various mfcc implementa-
tions on the speaker verification task. In Proceedings
of the SPECOM, volume 1, pages 191–194. Citeseer.
Gong, Y., Yu, J., and Glass, J. (2022). Vocalsound: A
dataset for improving human vocal sounds recogni-
tion. In IEEE International Conference on Acoustics,
Speech and Signal Processing (ICASSP), pages 151–
155.
Green, A., Clark, C., Favaro, L., Lomax, S., and Reby, D.
(2019). Vocal individuality of holstein-friesian cattle
is maintained across putatively positive and negative
farming contexts. Scientific Reports, 9(1):18468.
Hara, K., Saito, D., and Shouno, H. (2015). Analysis of
function of rectified linear unit used in deep learning.
In 2015 international joint conference on neural net-
works (IJCNN), pages 1–8. IEEE.
Herlin, A., Brunberg, E., Hultgren, J., Högberg, N., Rydberg, A., and Skarin, A. (2021). Animal welfare implications of digital tools for monitoring and management of cattle and sheep on pasture. Animals, 11(3).
Huang, J., Zhang, T., Cuan, K., and Fang, C. (2021). An
intelligent method for detecting poultry eating be-
haviour based on vocalization signals. Computers and
Electronics in Agriculture, 180:105884.
Ikeda, Y. and Ishii, Y. (2008). Recognition of two psy-
chological conditions of a single cow by her voice.
Computers and Electronics in Agriculture, 62(1):67–
72. Precision Livestock Farming (PLF).
Jindal, S., Nathwani, K., and Abrol, V. (2021). Classifi-
cation of infant behavioural traits using acoustic cry:
An empirical study. In 2021 12th International Sym-
posium on Image and Signal Processing and Analysis
(ISPA), pages 97–102.