output and target binary masks, we could directly cal-
culate precision, recall and F-measure.
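This computation can be sketched directly on a pair of binary masks; the function and variable names below are illustrative, not taken from the paper's code:

```python
import numpy as np

def precision_recall_f(output_mask, target_mask):
    """Compute precision, recall and F-measure between two binary masks."""
    output_mask = output_mask.astype(bool)
    target_mask = target_mask.astype(bool)
    tp = np.logical_and(output_mask, target_mask).sum()   # true positives
    fp = np.logical_and(output_mask, ~target_mask).sum()  # false positives
    fn = np.logical_and(~output_mask, target_mask).sum()  # false negatives
    precision = tp / (tp + fp) if tp + fp > 0 else 0.0
    recall = tp / (tp + fn) if tp + fn > 0 else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall > 0 else 0.0)
    return precision, recall, f
```

The guards against empty masks avoid division by zero when a tile contains no activated pixels at all.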
The best of the proposed models achieved an F-measure of 55% on the test set. This value is far from a perfect score. However, even recovering the general location of a plaque with a slightly imprecise shape already pushes the value below 100%. The gold standard consisted of approximate polygons, so reproducing it exactly is virtually impossible. More significant problems were related to false positives in large bright areas, such as overly activated points near the temporal bones and optic nerves. Another common source of errors was mistakenly activated small regions (noise unrelated to the demyelinating plaques). On the other hand, the presence of activated points in the general area of the demyelinating plaques is a notable advantage of the suggested model.
This result leaves much room for improvement. A larger data set, including a greater variety of cases, is expected to improve the results. Using 50×50 tiles could be considered disadvantageous compared to larger tiles, on the assumption that a larger visual field would make it easier to recognize the temporal bones and optic nerves. However, initial tests on larger tiles resulted in all-zero network outputs, because the great majority of target pixels were black. This problem would have to be addressed by a specific approach, such as modifying the cost function. Another solution could involve a separate tool to remove the irrelevant parts of the image, i.e. everything besides the brain itself, where the myelin sheaths of neurons are visible.
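One possible form of such a cost-function modification is to up-weight the rare positive (plaque) pixels in a pixel-wise binary cross-entropy loss. The sketch below is an assumption for illustration; `weighted_bce` and `pos_weight` are not names from this work:

```python
import numpy as np

def weighted_bce(pred, target, pos_weight):
    """Binary cross-entropy in which positive (plaque) pixels are
    multiplied by pos_weight, so that mostly-black targets no longer
    push the network towards an all-zero output."""
    eps = 1e-7
    pred = np.clip(pred, eps, 1 - eps)  # avoid log(0)
    loss = -(pos_weight * target * np.log(pred)
             + (1 - target) * np.log(1 - pred))
    return loss.mean()
```

With `pos_weight > 1`, missing a plaque pixel costs more than a false alarm on background, which counteracts the class imbalance of larger tiles.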
Using convolutional neural networks for medical image processing is usually difficult because of the limited sizes of the available data sets. This common problem affected our work as well. However, our analysis is a step towards more efficient solutions. Our approach to dynamic threshold selection and the chosen measure of localization correctness (F-measure of the binary mask) will be useful for testing future models.
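A minimal sketch of one such dynamic threshold selection is a sweep over candidate thresholds, keeping the one that maximizes the F-measure; the exact procedure used in this work may differ:

```python
import numpy as np

def best_threshold(scores, target, candidates):
    """Binarize the continuous network output at each candidate
    threshold and return the threshold with the highest F-measure."""
    target = target.astype(bool)
    best_t, best_f = candidates[0], -1.0
    for t in candidates:
        pred = scores >= t
        tp = np.logical_and(pred, target).sum()
        fp = np.logical_and(pred, ~target).sum()
        fn = np.logical_and(~pred, target).sum()
        # F = 2TP / (2TP + FP + FN), an equivalent form of the
        # harmonic mean of precision and recall
        denom = 2 * tp + fp + fn
        f = 2 * tp / denom if denom > 0 else 0.0
        if f > best_f:
            best_t, best_f = t, f
    return best_t, best_f
```

In practice the sweep would be run on a held-out validation set, and the selected threshold then applied to the test set.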
The solutions mentioned above are mostly incremental improvements to the investigated method. Another possible direction for future work involves using pre-trained CNNs as part of the model. This is likely to involve large, general-purpose architectures such as AlexNet (Krizhevsky et al., 2012) or VGG (Simonyan and Zisserman, 2014). Although the original objective of those networks is classification, crucial parts of the same models could be used for localization as well. Classification and localization with CNNs are closely related tasks, and one training process can yield an integrated solution to both of them (Sermanet et al., 2013). The presence of pooling layers results in a lower output mask resolution. This problem, however, could be addressed with deconvolutional neural networks (Zeiler and Fergus, 2013).
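A toy illustration of how a transposed ("deconvolutional") operation can restore resolution lost to pooling is given below; this is a didactic numpy version under simplifying assumptions (single channel, no padding), not the layers of Zeiler and Fergus:

```python
import numpy as np

def transposed_conv2d(x, kernel, stride=2):
    """Minimal 2-D transposed convolution: each input value stamps a
    scaled copy of the kernel onto a stride-spaced output grid, so a
    coarse feature map is expanded back to a finer resolution."""
    h, w = x.shape
    kh, kw = kernel.shape
    out = np.zeros(((h - 1) * stride + kh, (w - 1) * stride + kw))
    for i in range(h):
        for j in range(w):
            out[i * stride:i * stride + kh,
                j * stride:j * stride + kw] += x[i, j] * kernel
    return out
```

With stride 2 and a 2×2 kernel, a 2×2 map becomes a 4×4 map, i.e. one 2×2 pooling step is undone in spatial size (the learned kernel would decide how the detail is filled in).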
ACKNOWLEDGEMENTS
This project has been partly funded with support from the National Science Centre, Republic of Poland, decision number DEC-2012/05/D/ST6/03091.
The authors would like to express their gratitude to the Department of Radiology of the Barlicki University Hospital in Lodz for making the head MRI sequences available.
REFERENCES
Cheng, G., Zhou, P., and Han, J. (2016). Learning rotation-invariant convolutional neural networks for object detection in VHR optical remote sensing images. IEEE Transactions on Geoscience and Remote Sensing, 54(12):7405–7415.
Cireşan, D. C., Meier, U., Masci, J., Gambardella, L. M., and Schmidhuber, J. (2011). Flexible, high performance convolutional neural networks for image classification. In Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence - Volume Two, IJCAI'11, pages 1237–1242.
Dai, J., He, K., and Sun, J. (2014). Convolutional fea-
ture masking for joint object and stuff segmentation.
CoRR, abs/1412.1283.
de Brebisson, A. and Montana, G. (2015). Deep Neural
Networks for Anatomical Brain Segmentation. ArXiv
e-prints, 1502.02445.
Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., and Fei-
Fei, L. (2009). ImageNet: A Large-Scale Hierarchical
Image Database. In CVPR09.
He, K., Zhang, X., Ren, S., and Sun, J. (2015). Delving deep
into rectifiers: Surpassing human-level performance
on imagenet classification. CoRR, abs/1502.01852.
Hubel, D. H. and Wiesel, T. N. (1965). Receptive fields and
functional architecture in two nonstriate visual areas
(18 and 19) of the cat. Journal of Neurophysiology,
28:229–289.
Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In Pereira, F., Burges, C. J. C., Bottou, L., and Weinberger, K. Q., editors, Advances in Neural Information Processing Systems 25, pages 1097–1105. Curran Associates, Inc.
LeCun, Y. and Bengio, Y. (1995). Convolutional networks
for images, speech, and time-series. In Arbib, M. A.,
editor, The Handbook of Brain Theory and Neural
Networks. MIT Press.
LeCun, Y., Bottou, L., Bengio, Y., and Haffner, P. (1998).
Gradient-based learning applied to document recogni-
tion. In Proceedings of the IEEE, pages 2278–2324.
Localization of Demyelinating Plaques in MRI using Convolutional Neural Networks