system offers many possibilities to simplify the everyday life of people who are visually impaired or blind, and it can be used without any prior training on the device.
The following section discusses problems and opportunities for a continuation of this work. The augmented scene exhibits inaccuracies, especially for small objects; due to missing calibration, some selected objects are highlighted inaccurately.
Another problem is that the compact hardware of the HoloLens becomes uncomfortable after long periods of wear. Limited battery capacity and the permanently required network connection also limit the mobility of the prototype.
The next version of the HoloLens is reportedly capable of deep learning inference (Microsoft Research Blog, 2017). Provided that the computing power of the new version is sufficient, object detection could be performed directly on the HoloLens.
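If detection ran on-device, the per-frame processing could look roughly like the following sketch. All names here are hypothetical illustrations (the prototype's actual detector runs on a server); `stub_detector` merely stands in for a real model:

```python
def filter_detections(detections, min_confidence=0.5):
    """Keep only detections whose confidence meets the threshold."""
    return [d for d in detections if d["confidence"] >= min_confidence]

def process_frame(frame, detector, min_confidence=0.5):
    """Run a (local) detector on one camera frame and filter its results,
    avoiding the network round trip of a server-based setup."""
    raw = detector(frame)
    return filter_detections(raw, min_confidence)

# Stub standing in for a real on-device model:
def stub_detector(frame):
    return [
        {"label": "chair", "confidence": 0.91},
        {"label": "cup", "confidence": 0.32},
    ]

results = process_frame(None, stub_detector)
# only the high-confidence "chair" detection remains
```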
Although the HoloLens was used during the prototyping process, the use of less costly and more durable hardware with well-attuned specifications should be considered.
In order to provide a quicker overview for people with less severe visual impairment, text labels with the names of object classes could be displayed in the field of view.
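Placing such labels requires mapping a recognized object's 3D position into 2D view coordinates. A minimal pinhole-projection sketch, assuming camera-space coordinates and hypothetical camera intrinsics (`focal_length`, principal point `cx`, `cy`), could look like this:

```python
def project_to_screen(point, focal_length, cx, cy):
    """Project a 3D camera-space point (x, y, z) to pixel coordinates
    where a class-name label could be drawn."""
    x, y, z = point
    if z <= 0:
        return None  # point is behind the camera, no label is shown
    u = focal_length * x / z + cx
    v = focal_length * y / z + cy
    return (u, v)

# A point straight ahead lands on the principal point:
label_pos = project_to_screen((0.0, 0.0, 2.0), 500.0, 320.0, 240.0)
# label_pos == (320.0, 240.0)
```

In a real HoloLens application the platform's own camera-to-screen transforms would be used instead; this only illustrates the geometry.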
In addition to object recognition, further information can be obtained from the captured images. Furthermore, the spatial awareness of the HoloLens would make it possible to warn users when they are standing within a certain distance of a wall or other obstacle. As shown by Garon et al. (2016), the resolution of the depth information can be increased using an external depth sensor.
Other applications already recognize signs or bank notes using various Optical Character Recognition (OCR) frameworks (Sudol et al., 2010). OCR software could be integrated into the prototype to extend its functionality. The program could also be extended to recognize humans, signs, or text in real time.
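Such an integration could route only text-bearing detections to the OCR step. A minimal sketch, where `run_ocr` and the class set are hypothetical stand-ins for a real OCR framework and its configuration:

```python
# Classes whose detections are worth passing to OCR (illustrative choice):
TEXT_BEARING_CLASSES = {"sign", "banknote", "book"}

def annotate_with_text(detections, run_ocr, frame=None):
    """For detections of text-bearing classes, attach OCR output
    under the 'text' key; other detections pass through unchanged."""
    annotated = []
    for det in detections:
        det = dict(det)  # copy so the input list stays untouched
        if det["label"] in TEXT_BEARING_CLASSES:
            det["text"] = run_ocr(frame, det.get("box"))
        annotated.append(det)
    return annotated

# Example with a stub OCR function standing in for a real framework:
detections = [{"label": "sign"}, {"label": "chair"}]
result = annotate_with_text(detections, lambda frame, box: "EXIT")
# result[0] gains a "text" field; the chair is left untouched
```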
AUTHOR CONTRIBUTIONS
MB implemented the software and co-authored the paper; ME authored the manuscript and tested the implementation; CMF coined the idea, supervised the work, and co-authored the manuscript. All authors approved the final version.
REFERENCES
Abboud, S., Hanassy, S., Levy-Tzedek, S., Maidenbaum,
S., and Amedi, A. (2014). EyeMusic: Introducing a
“visual” colorful experience for the blind using audi-
tory sensory substitution. Restorative Neurology and
Neuroscience, 32(2):247–257.
Bach-y-Rita, P. (2004). Tactile sensory substitution studies.
Annals-New York Academy Of Sciences, 1013:83–91.
Costa, B., Pires, P. F., Delicato, F. C., and Merson, P. (2014). Evaluating a representational state transfer (REST) architecture: What is the impact of REST in my architecture? In Proceedings of the IEEE/IFIP Conference on Software Architecture (ICSA), pages 105–114.
Dalal, N. and Triggs, B. (2005). Histograms of oriented gradients for human detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 886–893.
El-Aziz, A. A. and Kannan, A. (2014). JSON encryption. In
Proceedings of the International Conference on Com-
puter Communication and Informatics (ICCCI), pages
1–6.
Garon, M., Boulet, P.-O., Doironz, J.-P., Beaulieu, L., and Lalonde, J.-F. (2016). Real-Time High Resolution 3D Data on the HoloLens. In Proceedings of the IEEE International Symposium on Mixed and Augmented Reality (ISMAR-Adjunct), pages 189–191.
Girshick, R. (2015). Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), pages 1440–1448.
Girshick, R., Iandola, F., Darrell, T., and Malik, J. (2015).
Deformable part models are convolutional neural net-
works. In Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition (CVPR),
pages 437–446.
Lowe, D. G. (1999). Object recognition from local scale-invariant features. In Proceedings of the 7th IEEE International Conference on Computer Vision (ICCV), volume 2, pages 1150–1157.
Maidenbaum, S., Arbel, R., Abboud, S., Chebat, D., Levy-Tzedek, S., and Amedi, A. (2012). Virtual 3D shape and orientation discrimination using point distance information. In Proceedings of the 9th International Conference on Disability, Virtual Reality & Associated Technologies (ICDVRAT), pages 471–474.
Meijer, P. B. (1992). An experimental system for auditory
image representations. IEEE Transactions on Biomed-
ical Engineering, 39(2):112–121.
Meijer, P. B. (2017). Augmented reality and
soundscape-based synthetic vision for the blind.
https://www.seeingwithsound.com [Accessed: Octo-
ber 6, 2017].
Microsoft (2017a). HoloToolkit.
http://www.webcitation.org/6u0tUB8dz [Accessed:
October 5, 2017].
Microsoft (2017b). Microsoft Visual Studio.
https://www.visualstudio.com/ [Accessed: Octo-
ber 5, 2017].
HEALTHINF 2018 - 11th International Conference on Health Informatics