[4, 11, 12]. These paths cover only 3% of the possible paths available to the agent: the agent does not converge to a single optimal path, or even to a small set of paths, but instead uses a wide variety of paths depending on the breath phenomena detected during the examination.
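For context, if a path is an ordered sequence of three of the 12 auscultation points (as the example above suggests), there are 12 · 11 · 10 = 1320 possible paths, so 3% corresponds to roughly 40 distinct paths. The following minimal Python sketch (not from the paper; `episode_paths` and its contents are illustrative placeholders) shows how such a coverage figure can be computed from logged evaluation episodes.

```python
from itertools import permutations

# Hypothetical log of the auscultation paths taken by the agent, one
# ordered tuple of point indices (1-12) per evaluation episode.
episode_paths = [
    (4, 11, 12),
    (4, 7, 12),
    (2, 11, 12),
]

# All ordered length-3 paths over the 12 auscultation points.
n_possible = len(list(permutations(range(1, 13), 3)))  # 12 * 11 * 10 = 1320

distinct = set(episode_paths)
coverage = len(distinct) / n_possible
print(f"{len(distinct)} distinct paths = {coverage:.1%} of {n_possible} possible")
```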
6 CONCLUSIONS
We have presented a unique application of reinforcement learning to lung sound auscultation, with the objective of designing an agent able to perform the procedure interactively in the shortest possible time.
Our interactive agent performs an intelligent selection of auscultation points: it completes the auscultation using only 3 of the 12 available points, reducing the examination time fourfold. Moreover, this comes with no significant loss of diagnostic accuracy: the interactive agent scores only 2.5 percentage points lower than its static counterpart, which performs an exhaustive auscultation over all available points.
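For concreteness, the sketch below contrasts the two regimes described above: a static baseline that listens at all 12 points, and an interactive loop in which an agent chooses each next point and may stop early. This is a hypothetical illustration, not the authors' implementation; `auscultate`, `classify`, and `GreedyAgent` are stand-in stubs.

```python
import random

T_POINT = 1.0   # relative time cost of listening at one auscultation point
N_POINTS = 12   # auscultation points on the chest grid

# --- hypothetical stand-ins, not the authors' models --------------------
def auscultate(patient, point):
    """Stub: return a feature vector for the sound recorded at `point`."""
    return [random.random() for _ in range(8)]

def classify(sounds):
    """Stub: map the collected sounds to a diagnosis label."""
    return "pathological" if any(max(s) > 0.95 for s in sounds) else "healthy"

class GreedyAgent:
    """Stub policy: visit three fixed points, then stop (return None)."""
    def select_point(self, sounds, visited):
        plan = [4, 11, 12]
        return plan[len(visited)] if len(visited) < len(plan) else None
# -------------------------------------------------------------------------

def static_examination(patient):
    """Exhaustive baseline: listen at all 12 points, then classify."""
    sounds = [auscultate(patient, p) for p in range(1, N_POINTS + 1)]
    return classify(sounds), N_POINTS * T_POINT

def interactive_examination(patient, agent):
    """Interactive regime: the agent picks each next point or stops early."""
    sounds, visited = [], []
    while len(visited) < N_POINTS:
        point = agent.select_point(sounds, visited)
        if point is None:        # the agent decides it has heard enough
            break
        visited.append(point)
        sounds.append(auscultate(patient, point))
    return classify(sounds), len(visited) * T_POINT

label, cost = interactive_examination(patient=None, agent=GreedyAgent())
print(label, cost)   # cost is 3 * T_POINT, vs 12 * T_POINT for the baseline
```

With three points per episode, the interactive cost is 3 · T_POINT against 12 · T_POINT for the exhaustive baseline, which is the fourfold reduction quoted above.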
Based on the research we have conducted, we believe the proposed solution can be improved further. In the near future, we would like to extend this work to show that the interactive solution can outperform any static approach to the problem. We believe this can be achieved by enlarging the dataset or by more advanced algorithmic solutions, whose investigation and implementation were beyond the scope of this publication.