the dataset. Prediction confusion might be reduced by
increasing the number of samples from the dyed lifted
polyps and dyed resection margins classes, as well as
from the z-line and esophagitis classes.
6 FUTURE WORK
Several deep convolutional neural networks have been
published since Inception v3, such as (Huang et al.,
2017b), (He et al., 2015), (Zhu et al., 2017), (Wong
et al., 2016), and (Xu et al., 2016). These more recent
architectures could be evaluated in conjunction with
data augmentation techniques.
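As a reminder of what label-preserving augmentation looks like in practice, the following is a minimal sketch in plain Python. The chosen transformations (flips and a 90-degree rotation) and the function names are illustrative assumptions, not the augmentation pipeline used in our experiments; an image is represented as a nested list of pixel values.

```python
# Illustrative sketch only: the transformations and names below are
# assumptions chosen for clarity, not the pipeline used in the paper.

def hflip(img):
    """Mirror the image left-to-right."""
    return [row[::-1] for row in img]

def vflip(img):
    """Mirror the image top-to-bottom."""
    return [row[:] for row in img[::-1]]

def rot90(img):
    """Rotate the image by 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]

def augment(img):
    """Return the original image together with three transformed copies."""
    return [img, hflip(img), vflip(img), rot90(img)]

if __name__ == "__main__":
    toy = [[1, 2],
           [3, 4]]
    for variant in augment(toy):
        print(variant)
```

Each variant keeps the class label of the original image, so a single labeled sample yields several training samples regardless of the architecture being trained.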
Stacking additional dense layers is another
direction worth investigating, as is a more
exhaustive experimentation with different
activation functions such as ELU (Clevert et al., 2015),
LeakyReLU (Zhu et al., 2017), Swish (Ramachandran
et al., 2017), etc.
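For reference, the three activation functions mentioned above can be written as scalar functions. This is a minimal sketch; the default parameter values (alpha = 1.0 for ELU, a negative slope of 0.01 for LeakyReLU, beta = 1.0 for Swish) are common choices assumed here for illustration, not values tuned in this work.

```python
import math

# Illustrative sketch: default parameters are assumptions, not tuned values.

def elu(x, alpha=1.0):
    """ELU: identity for x > 0, alpha * (exp(x) - 1) otherwise."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

def leaky_relu(x, negative_slope=0.01):
    """LeakyReLU: identity for x > 0, a small linear slope otherwise."""
    return x if x > 0 else negative_slope * x

def _sigmoid(z):
    """Numerically stable logistic sigmoid."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def swish(x, beta=1.0):
    """Swish: x * sigmoid(beta * x)."""
    return x * _sigmoid(beta * x)

if __name__ == "__main__":
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(x, elu(x), leaky_relu(x), swish(x))
```

All three coincide with or approach the identity for large positive inputs and differ mainly in how they treat negative inputs, which is the behavior a more exhaustive comparison would probe.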
A different investigation might consist in visualizing
the high-level features learned by the last
convolutional layers, in order to improve our understanding
of the discriminative characteristics learned by the network.
All our experiments were conducted on the
first version of the Kvasir dataset; repeating training
and validation on the recently released extended
version would provide an important additional validation
of our methodology.
Finally, it would be particularly useful to further
extend the Kvasir dataset with new classes, in order to
meet diagnostic needs for several other well-known
and widespread diseases such as Crohn's disease.
We are currently exploring the possibility of
cooperating with the gastroenterology department of the
Sant'Orsola Hospital in Bologna to extend the dataset
along these lines.
7 CONCLUSIONS
In this work we addressed the problem of gastrointestinal
disease detection and identification. With a simple
combination of Convolutional Neural Networks,
transfer learning, and data augmentation, we outperformed
previous techniques in terms of precision, recall,
and F-measure, while essentially preserving the
same accuracy. Our experimentation confirms once
more that data augmentation is a viable technique for
boosting deep learning in the presence of small datasets.
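To make the distinction between accuracy and the other metrics concrete, the sketch below computes per-class precision, recall, and F-measure from a confusion matrix. The toy matrix and the function names are hypothetical and are not results from our experiments; it assumes `confusion[i][j]` counts samples of true class `i` predicted as class `j`.

```python
# Illustrative sketch: the toy matrix is made up and does not report results.

def per_class_metrics(confusion):
    """Return a list of (precision, recall, f_measure), one tuple per class."""
    n = len(confusion)
    results = []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                         # missed samples of class k
        fp = sum(confusion[i][k] for i in range(n)) - tp    # wrongly assigned to class k
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f_measure = 2 * precision * recall / denom if denom else 0.0
        results.append((precision, recall, f_measure))
    return results

def accuracy(confusion):
    """Fraction of all samples that lie on the diagonal."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(len(confusion)))
    return correct / total if total else 0.0

if __name__ == "__main__":
    toy = [[8, 2],
           [1, 9]]
    print(accuracy(toy))
    for precision, recall, f_measure in per_class_metrics(toy):
        print(precision, recall, f_measure)
```

Because precision and recall are computed per class, two classifiers with the same overall accuracy can still differ in these metrics when their errors are distributed differently across classes.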
REFERENCES
(2017). Digestive diseases statistics for the united states.
Accessed: 2017-11-03.
Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z.,
Citro, C., Corrado, G. S., Davis, A., Dean, J., Devin,
M., Ghemawat, S., Goodfellow, I., Harp, A., Irving,
G., Isard, M., Jia, Y., Jozefowicz, R., Kaiser, L., Kudlur,
M., Levenberg, J., Mané, D., Monga, R., Moore,
S., Murray, D., Olah, C., Schuster, M., Shlens, J.,
Steiner, B., Sutskever, I., Talwar, K., Tucker, P., Vanhoucke,
V., Vasudevan, V., Viégas, F., Vinyals, O.,
Warden, P., Wattenberg, M., Wicke, M., Yu, Y., and
Zheng, X. (2015). TensorFlow: Large-scale machine
learning on heterogeneous systems. Software avail-
able from tensorflow.org.
Bengio, Y. (2012). Deep learning of representations for
unsupervised and transfer learning. In Guyon, I.,
Dror, G., Lemaire, V., Taylor, G., and Silver, D., edi-
tors, Proceedings of ICML Workshop on Unsupervised
and Transfer Learning, volume 27 of Proceedings of
Machine Learning Research, pages 17–36, Bellevue,
Washington, USA. PMLR.
Chollet, F. et al. (2015). Keras.
Clevert, D., Unterthiner, T., and Hochreiter, S. (2015). Fast
and accurate deep network learning by exponential
linear units (elus). CoRR, abs/1511.07289.
Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., and Fei-
Fei, L. (2009). ImageNet: A Large-Scale Hierarchical
Image Database. In CVPR09.
Farfade, S. S., Saberian, M. J., and Li, L. (2015). Multi-
view face detection using deep convolutional neural
networks. CoRR, abs/1502.02766.
Goodfellow, I. J., Pouget-Abadie, J., Mirza, M., Xu, B.,
Warde-Farley, D., Ozair, S., Courville, A., and Ben-
gio, Y. (2014). Generative Adversarial Networks.
ArXiv e-prints.
He, K., Zhang, X., Ren, S., and Sun, J. (2015). Deep resid-
ual learning for image recognition. arXiv preprint
arXiv:1512.03385.
Huang, G., Li, Y., Pleiss, G., Liu, Z., Hopcroft, J. E.,
and Weinberger, K. Q. (2017a). Snapshot ensembles:
Train 1, get M for free. CoRR, abs/1704.00109.
Huang, G., Liu, Z., van der Maaten, L., and Weinberger,
K. Q. (2017b). Densely connected convolutional net-
works. In Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition.
Kingma, D. P. and Ba, J. (2014). Adam: A method for
stochastic optimization. CoRR, abs/1412.6980.
Krizhevsky, A., Nair, V., and Hinton, G. Cifar-10 (canadian
institute for advanced research).
Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2012).
Imagenet classification with deep convolutional neu-
ral networks. In Pereira, F., Burges, C. J. C., Bottou,
L., and Weinberger, K. Q., editors, Advances in Neu-
ral Information Processing Systems 25, pages 1097–
1105. Curran Associates, Inc.
LeCun, Y., Boser, B., Denker, J. S., Henderson, D., Howard,
R. E., Hubbard, W., and Jackel, L. D. (1989). Back-
KALSIMIS 2018 - Special Session on Knowledge Acquisition and Learning in Semantic Interpretation of Medical Image Structures