Hochreiter, S. and Schmidhuber, J. (1997). LSTM can solve
hard long time lag problems. Advances in neural
information processing systems, pages 473–479.
Howard, A. G., Zhu, M., Chen, B., Kalenichenko, D.,
Wang, W., Weyand, T., Andreetto, M., and Adam,
H. (2017). Mobilenets: Efficient convolutional neu-
ral networks for mobile vision applications. arXiv
preprint arXiv:1704.04861.
Huang, Q., Wang, W., and Zhang, Q. (2017). Your glasses
know your diet: Dietary monitoring using electromyo-
graphy sensors. IEEE Internet of Things Journal,
4(3):705–712.
Ittichaichareon, C., Suksri, S., and Yingthawornsuk, T.
(2012). Speech recognition using MFCC. In International
conference on computer graphics, simulation
and modeling, pages 135–138.
Kamachi, H., Kondo, T., Hossain, T., Yokokubo, A., and
Lopez, G. (2021). Automatic segmentation method of
bone conduction sound for eating activity detailed de-
tection. In Adjunct Proceedings of the 2021 ACM In-
ternational Joint Conference on Pervasive and Ubiq-
uitous Computing and Proceedings of the 2021 ACM
International Symposium on Wearable Computers,
pages 310–315.
Kamachi, H., Kondo, T., Yokokubo, A., and Lopez, G.
(2020). Classification method of eating behavior
by dietary sound collected in natural meal environ-
ment. In Activity and Behavior Computing, ABC
2020. Smart Innovation, Systems and Technologies,
volume 204, pages 135–152. Springer, Singapore.
Kingma, D. P. and Ba, J. (2014). Adam: A
method for stochastic optimization. arXiv preprint
arXiv:1412.6980.
Lee, J., Park, J., Kim, K. L., and Nam, J. (2017). Sample-
level deep convolutional neural networks for music
auto-tagging using raw waveforms. arXiv preprint
arXiv:1703.01789.
Navarro, J. M., Martínez-España, R., Bueno-Crespo, A.,
Martínez, R., and Cecilia, J. M. (2020). Sound levels
forecasting in an acoustic sensor network using a
deep neural network. Sensors, 20(3):903.
NCD Risk Factor Collaboration (2016). Trends in adult
body-mass index in 200 countries from 1975 to 2014:
a pooled analysis of 1698 population-based measurement
studies with 19·2 million participants. The
Lancet, 387(10026):1377–1396.
Nkurikiyeyezu, K., Kamachi, H., Kondo, T., Jain, A.,
Yokokubo, A., and Lopez, G. (2021). Classification
of eating behaviors in unconstrained environments.
Biomedical Engineering Systems and Technologies,
BIOSTEC 2020. Communications in Computer and
Information Science, 1400:592–609.
Ntalampiras, S., Kosmin, D., and Sanchez, J. (2021).
Acoustic classification of individual cat vocalizations
in evolving environments. In 2021 44th International
Conference on Telecommunications and Signal Pro-
cessing (TSP), pages 254–258. IEEE.
Päßler, S. and Fischer, W.-J. (2011). Acoustical method
for objective food intake monitoring using a wearable
sensor system. In 2011 5th International Conference
on Pervasive Computing Technologies for Healthcare
(PervasiveHealth) and Workshops, pages 266–269.
IEEE.
Prakash, J., Yang, Z., Wei, Y.-L., Hassanieh, H., and
Choudhury, R. R. (2020). EarSense: earphones as a teeth
activity sensor. In Proceedings of the 26th Annual
International Conference on Mobile Computing and
Networking, pages 1–13.
Purwins, H., Li, B., Virtanen, T., Schlüter, J., Chang, S.-Y.,
and Sainath, T. (2019). Deep learning for audio signal
processing. IEEE Journal of Selected Topics in Signal
Processing, 13(2):206–219.
Selamat, N. A. and Ali, S. H. M. (2020). Automatic food in-
take monitoring based on chewing activity: A survey.
IEEE Access, 8:48846–48869.
Semeniuta, S., Severyn, A., and Barth, E. (2016). Recur-
rent dropout without memory loss. arXiv preprint
arXiv:1603.05118.
Shen, Y., Salley, J., Muth, E., and Hoover, A. (2016).
Assessing the accuracy of a wrist motion tracking
method for counting bites across demographic and
food variables. IEEE journal of biomedical and health
informatics, 21(3):599–606.
Shor, J., Jansen, A., Maor, R., Lang, O., Tuval, O., Quitry,
F. d. C., Tagliasacchi, M., Shavitt, I., Emanuel, D.,
and Haviv, Y. (2020). Towards learning a universal
non-semantic representation of speech. arXiv preprint
arXiv:2002.12764.
Shuzo, M., Komori, S., Takashima, T., Lopez, G., Tatsuta,
S., Yanagimoto, S., Warisawa, S., Delaunay, J.-J., and
Yamada, I. (2010). Wearable eating habit sensing
system using internal body sound. Journal of Advanced
Mechanical Design, Systems, and Manufacturing,
4(1):158–166.
TensorFlow (2020). Sound classification with YAMNet.
https://github.com/tensorflow/models/tree/master/research/audioset/yamnet/
(Accessed: 20 December 2021).
Totakura, V., Janmanchi, M. K., Rajesh, D., and Hussan,
M. T. (2020). Prediction of animal vocal emotions us-
ing convolutional neural network. International Jour-
nal of Scientific & Technology Research, 9(2):6007–
6011.
Vasudevan, H., Michalas, A., Shekokar, N., and Narvekar,
M. (2020). Advanced Computing Technologies and
Applications: Proceedings of 2nd International Con-
ference on Advanced Computing Technologies and
Applications—ICACTA 2020. Springer Nature.
Xu, Y., Kong, Q., Wang, W., and Plumbley, M. D. (2018).
Large-scale weakly supervised audio classification us-
ing gated convolutional neural network. In 2018 IEEE
international conference on acoustics, speech and sig-
nal processing (ICASSP), pages 121–125. IEEE.
Yamaji, T., Mikami, S., Kobatake, H., Kobayashi, K.,
Tanaka, H., and Tanaka, K. (2018). Does eating fast
cause obesity and metabolic syndrome? Journal of
the American College of Cardiology, 71(11S):A1846–
A1846.
Yang, X., Doulah, A., Farooq, M., Parton, J., McCrory,
M. A., Higgins, J. A., and Sazonov, E. (2019). Sta-
Bone Conduction Eating Activity Detection based on YAMNet Transfer Learning and LSTM Networks