Figure 7: Comparison of point clouds extracted (a) from the ground truth and (b) by the proposed method.
We also calculated the loss on the test instances.
On average, the RMSE per image was 13.18. As can be
seen from Figure 6, the losses fluctuated considerably
(roughly 11-15). However, the ground truth was
incomplete for some of the images, which might have
affected the ability to properly evaluate each
individual instance.
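To illustrate how incomplete ground truth can enter a per-image RMSE, the following is a minimal sketch and not the evaluation code used for the reported results. It assumes that pixels without ground truth are stored as NaN and are simply excluded from the error; the text does not state how such pixels were actually handled. The function names `rmse_per_image` and `mean_rmse` are hypothetical.

```python
import math


def rmse_per_image(pred, truth):
    """RMSE over the pixels of one image that have ground truth.

    pred, truth: flat sequences of per-pixel values of equal length.
    Assumption: a missing ground-truth pixel is stored as NaN.
    """
    sq_errors = [(p - t) ** 2 for p, t in zip(pred, truth) if not math.isnan(t)]
    if not sq_errors:
        return math.nan  # no valid pixel: the image cannot be scored
    return math.sqrt(sum(sq_errors) / len(sq_errors))


def mean_rmse(preds, truths):
    """Average the per-image RMSE, skipping images that cannot be scored."""
    scores = [rmse_per_image(p, t) for p, t in zip(preds, truths)]
    scores = [s for s in scores if not math.isnan(s)]
    return sum(scores) / len(scores) if scores else math.nan
```

With this convention, an image with only a few valid pixels is scored on those pixels alone, so its RMSE can be much noisier than that of a fully annotated image. This is one way in which incomplete ground truth could contribute to fluctuation in the per-image losses.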
In some extreme cases (see Figure 7), the
reconstruction by the proposed approach was only
partially successful. To correct issues like this, an
approach that also incorporates contextual
information could be used in the future.
5 CONCLUSIONS
Minimally invasive surgical techniques are very
important in clinical settings; however, they require
computational support to allow surgeons to use them
effectively in practice. In this paper, an approach
based on deep neural networks has been introduced
which, unlike state-of-the-art approaches, relies only
on the input pixels of the stereo image pair. The
approach has been evaluated on a publicly available
dataset and compared well with the results obtained
by a state-of-the-art technique.
ACKNOWLEDGMENTS
This work was supported in part by the project VKSZ
14-1-2015-0072, SCOPIA: Development of diagnos-
tic tools based on endoscope technology supported by
the European Union, co-financed by the European So-
cial Fund.
SPCS 2016 - International Conference on Signal Processing and Communication Systems