7.2 Material Recognition
The success ratio of the material recognition was 89%,
which is a favorable result. As shown in Table 4,
“wood” and “metal” were recognized well. “Wood” materials
have grain, and “metal” materials have specular highlights.
We suppose that such appearances serve as strong features
that distinguish these materials from the others. “Fabric”
also showed a moderate result; we estimate that its rough
surface is a good feature of the “fabric” materials.
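The per-material results in Table 4 and the overall 89% figure are simply ratios of correctly classified image clips. A minimal Python sketch of this bookkeeping, using invented toy label pairs rather than our actual evaluation data:

```python
from collections import Counter

# Hypothetical (true, predicted) material pairs; placeholders, not our data.
results = [("wood", "wood"), ("metal", "metal"), ("plastic", "metal"),
           ("fabric", "fabric"), ("wood", "wood"), ("plastic", "plastic")]

totals = Counter(true for true, _ in results)                   # clips per material
hits = Counter(true for true, pred in results if true == pred)  # correctly classified

for material in totals:
    print(f"{material}: {hits[material] / totals[material]:.0%}")  # per-material ratio

# Overall success ratio (the real experiment yielded 89%).
print(f"overall: {sum(hits.values()) / len(results):.0%}")
```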
On the other hand, “plastic” was very hard to recognize.
“Plastic” materials have specular highlights like
“metal”. We collected the learning samples of “plastic”
with a focus on such specular highlights; however, the
object image clips extracted by the object detection did
not contain enough specular highlights. This may be
improved by a more appropriate selection of learning
samples.
From the aspect of object location, overlapping objects
lowered the success ratio. For example, when objects
made of “metal”, “fabric”, and so on were placed on a
“table” whose surface is “plastic”, the material of the
“table” was not classified successfully. This is a natural
result, and we have to use pixel-level image segmentation
to solve this problem, as sketched below.
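A minimal sketch of that remedy, assuming a pixel-level instance mask were available from a segmentation stage that our current system does not yet have; `masked_clip` and `classify_material` are hypothetical names:

```python
import numpy as np

def masked_clip(image: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Zero out every pixel that does not belong to the target object.

    image: H x W x 3 color image; mask: H x W boolean instance mask
    (assumed to come from a pixel-level segmentation stage).
    """
    clip = image.copy()
    clip[~mask] = 0  # suppress occluding objects and background
    return clip

# Usage sketch: judge the "table" from its own surface pixels only.
# table_clip = masked_clip(color_image, table_mask)
# material = classify_material(table_clip)  # hypothetical CNN call
```

With such masking, the “metal” and “fabric” objects lying on the “table” would no longer contaminate the clip from which the table’s material is judged.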
8 CONCLUSIONS
In this paper, we proposed a method to represent interactions
between virtual objects and real objects in an MR
scene more realistically than conventional MR technologies,
by recognizing the materials of objects in the
real space. First, an RGB-D camera grabs a color image
and a depth image. From the color image, our
system detects objects to obtain their positions in the
real space. Then, material recognition using deep
learning is performed on the objects’ image clips, and
3D meshes of the detected objects are constructed.
After that, the result of the material recognition is
reflected in each corresponding object’s 3D mesh.
Physical characteristics, such as the friction and
restitution coefficients and the contact sound, are added
to the 3D meshes during this process. By overlaying the
3D meshes on the real-world image, we obtain a more
realistic MR scene where not only can the virtual objects
interact with the real objects, but the motion of the
virtual objects also changes with the materials of the
real objects. Our method will be applicable to realizing
a more realistic MR world that can be used in many
fields, such as sports with virtual balls and simulation
with virtual objects.
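The overall flow can be summarized in the following Python sketch. Every function name is a placeholder for the corresponding stage of our system, and the physical parameter values are illustrative, not the calibrated ones we actually use:

```python
# Illustrative material-to-physics table (placeholder values).
MATERIAL_PHYSICS = {
    "wood":    {"friction": 0.4, "restitution": 0.5, "sound": "knock.wav"},
    "metal":   {"friction": 0.2, "restitution": 0.7, "sound": "clang.wav"},
    "fabric":  {"friction": 0.8, "restitution": 0.1, "sound": "thud.wav"},
    "plastic": {"friction": 0.3, "restitution": 0.6, "sound": "click.wav"},
}

def build_mr_scene(rgbd_camera):
    color, depth = rgbd_camera.grab()              # 1. grab color and depth images
    for box in detect_objects(color):              # 2. object detection (YOLO)
        clip = crop(color, box)                    #    extract the object's image clip
        material = recognize_material(clip)        # 3. material recognition (CNN)
        mesh = build_mesh(depth, box)              # 4. 3D mesh from the depth image
        mesh.physics = MATERIAL_PHYSICS[material]  # 5. attach physical characteristics
        add_to_physics_world(mesh)
    return overlay_virtual_objects(color)          # 6. composite the final MR scene
```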
Currently, our method recognizes only the kind of
material; it does not consider how the material is
processed. For example, metal has been assumed to
have a smooth surface, but some metal objects are
rough-machined. Similarly, varnished wood products
may have a smooth surface, while rough wooden objects
also exist. A more natural expression would become
possible by considering not only the material
but also how its surface is finished, as sketched below.
Additionally, we have to consider pixel-level object
recognition. We used YOLO (Redmon et al., 2015) for
object detection, and it calculates each object’s bounding
box. The bounding box often includes parts of other
objects, which affects material recognition. We will
focus on these points and continue to improve our method.
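As a sketch of the surface-finish extension, the physics lookup could be keyed on a (material, finish) pair instead of the material alone; all names and values below are hypothetical:

```python
# Hypothetical (material, finish) -> physics table; values are placeholders.
FINISH_PHYSICS = {
    ("metal", "polished"):  {"friction": 0.15, "restitution": 0.7},
    ("metal", "machined"):  {"friction": 0.45, "restitution": 0.6},
    ("wood",  "varnished"): {"friction": 0.25, "restitution": 0.5},
    ("wood",  "rough"):     {"friction": 0.60, "restitution": 0.4},
}

def physics_for(material: str, finish: str) -> dict:
    """Fall back to a generic per-material default when the finish is unknown."""
    return FINISH_PHYSICS.get((material, finish),
                              {"friction": 0.4, "restitution": 0.5})
```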
ACKNOWLEDGEMENTS
This work was supported by JSPS KAKENHI Grant
Number 17K01160.
REFERENCES
Inaba, M., Banno, A., Oishi, T., and Ikeuchi, K. (2012).
Achieving robust alignment for outdoor mixed reality
using 3d range data. In Proceedings of the 18th ACM
Symposium on Virtual Reality Software and Technol-
ogy, VRST ’12, pages 61–68, New York, NY, USA.
ACM.
Indyk, P., Motwani, R., Raghavan, P., and Vempala, S.
(1997). Locality-preserving hashing in multidimen-
sional spaces. In Proceedings of the Twenty-ninth
Annual ACM Symposium on Theory of Computing,
STOC ’97, pages 618–625, New York, NY, USA.
ACM.
Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J.,
Girshick, R., Guadarrama, S., and Darrell, T. (2014).
Caffe: Convolutional architecture for fast feature em-
bedding. arXiv preprint arXiv:1408.5093.
Kakuta, T., Oishi, T., and Ikeuchi, K. (2008). Fast shad-
ing and shadowing of virtual objects using shadow-
ing planes in mixed reality. The Journal of the Insti-
tute of Image Information and Television Engineers,
62(5):788–795.
Redmon, J., Divvala, S. K., Girshick, R. B., and Farhadi, A.
(2015). You only look once: Unified, real-time object
detection. CoRR, abs/1506.02640.
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S.,
Anguelov, D., Erhan, D., Vanhoucke, V., and Rabi-
novich, A. (2014). Going deeper with convolutions.
In Large Scale Visual Recognition Challenge 2014,
ILSVRC ’14.