scenarios. The whole pipeline consists of an almost autonomous label-generation procedure, which allows a dataset of the environment to be built quickly. This dataset is used to train the proposed DNN, which generalizes the training data in a meaningful way: there is a direct correlation between geometrical movement and the DNN output probability. Such a correlation is only possible when the DNN relies on geometrical properties. It is also worth noting that we experimented with removing the VMs from the environment and the DNN performance remained the same, although no extensive experiments were performed with this focus.
To conclude, we show that it is possible to use convolutional layers to build deep architectures while preserving the geometrical properties of the data. This can be applied in any field where the positions, or geometrical properties, of the features are as important as the features themselves.
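As a rough illustration of this design principle (not the exact architecture used in the paper), the sketch below builds a classifier entirely from convolutional layers, so that every intermediate representation keeps a spatial layout; all hyperparameters are placeholders.

```python
import torch
import torch.nn as nn

class FullyConvClassifier(nn.Module):
    """Illustrative convolution-only classifier: no flattening or
    fully connected layers, so every feature map keeps a spatial
    layout tied to the input geometry. Channel sizes and depths
    are placeholders, not those of the paper."""

    def __init__(self, num_classes, channels=(32, 64, 128)):
        super().__init__()
        layers, in_ch = [], 3
        for out_ch in channels:
            layers += [
                nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
                nn.MaxPool2d(2),
            ]
            in_ch = out_ch
        self.features = nn.Sequential(*layers)
        # A 1x1 convolution as the classification head keeps the
        # class evidence spatially localized until the final pooling.
        self.head = nn.Conv2d(in_ch, num_classes, kernel_size=1)

    def forward(self, x):
        x = self.head(self.features(x))
        return x.mean(dim=(2, 3))  # global average pooling -> logits

model = FullyConvClassifier(num_classes=10)
logits = model(torch.randn(1, 3, 64, 64))
```

Because the classification head is a 1x1 convolution, position information survives through the whole network and is discarded only at the final pooling step.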
Future Work. The new approach proposed in this paper showed that it is possible to use DNNs for geometrically based data. Moreover, it can be applied as the core localization subsystem of a vision-based navigation system. Such a system would rely on the geometrical properties of the real-world environment, possibly leading to a human-like approach to navigation. If we consider that humans can navigate almost any environment using visual information alone, then the proposed approach would fit the needs of a human-like navigation system. To accomplish this, more research is needed, focused on the system itself: the use of 3D cameras, stereoscopic cameras, and more.
On a more general level, this approach shows potential in applications where geometrical structure matters. One example is fMRI scans, where the visual data are directly correlated with brain areas. Another example is visual games, from chess to Atari games. Given the ability of the proposed DNN to rely on the positions of the features and to generalize the geometrical properties of the data, the possible benefits are clear.