Table 3: Average survey results.
      M1    M2    M3    M4    M5    M6    M7    M8    M9    M10
Q1    3.0   2.4   4.0   2.4   3.6   2.7   4.0   3.0   3.4   2.6
Q2    2.0   3.4   1.6   2.9   1.7   2.6   1.7   2.4   1.6   2.9
Q3    3.4   2.6   4.0   2.6   4.0   2.7   3.9   3.3   4.1   2.4
Q4    3.7   3.1   4.0   3.0   4.0   2.7   4.0   3.7   3.7   3.4
Q5    3.7   3.7   4.0   3.1   3.6   3.4   4.6   4.0   4.1   3.6
This is especially visible in the quality question (Q1),
where the real video was rated higher because of the
low resolution of the regenerated cloud space. All
respondents stated that the generated video caused some
discomfort and did not feel natural, which is also
connected to the low resolution. Better results were
achieved in the scale and distance tests, where
respondents rated their impression of both as better
than average. Moreover, the survey results were similar
for generated and real stereoscopic video.
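As a reading aid, the per-question averages in Table 3 can be summarized across all ten columns. The snippet below is a minimal sketch, not part of the original study: the values are transcribed from Table 3, the names `TABLE_3` and `summarize` are hypothetical, and no assumption is made about which of the columns M1 to M10 correspond to generated or real video.

```python
from statistics import mean

# Values transcribed from Table 3 (rows Q1-Q5, columns M1-M10 in order).
TABLE_3 = {
    "Q1": [3.0, 2.4, 4.0, 2.4, 3.6, 2.7, 4.0, 3.0, 3.4, 2.6],
    "Q2": [2.0, 3.4, 1.6, 2.9, 1.7, 2.6, 1.7, 2.4, 1.6, 2.9],
    "Q3": [3.4, 2.6, 4.0, 2.6, 4.0, 2.7, 3.9, 3.3, 4.1, 2.4],
    "Q4": [3.7, 3.1, 4.0, 3.0, 4.0, 2.7, 4.0, 3.7, 3.7, 3.4],
    "Q5": [3.7, 3.7, 4.0, 3.1, 3.6, 3.4, 4.6, 4.0, 4.1, 3.6],
}


def summarize(table):
    """Return {question: (mean, min, max)} over all columns of the table."""
    return {
        question: (round(mean(values), 2), min(values), max(values))
        for question, values in table.items()
    }


if __name__ == "__main__":
    for question, (avg, low, high) in summarize(TABLE_3).items():
        print(f"{question}: mean={avg:.2f} min={low:.1f} max={high:.1f}")
```

The summary pools all columns, so it only indicates the overall level and spread of each question and does not replace the per-column comparison discussed above.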
5 CONCLUSIONS
Considering the lack of inpainting (which is a
non-trivial task) and of intelligent layer division (also
non-trivial) in the above test, the achieved results are
far from sufficient for commercial implementation.
In the case considered in this study, inpainting would
have to be carried out over a fairly complex region of
each layer of the generated image, several dozen
times per second (at least 25, preferably around
60). This criterion effectively excludes the use of
this type of solution given the current state of the
art. However, we recommend revisiting this task in
several years.
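To make the frame-rate criterion concrete, the time available per frame and per layer can be estimated with simple arithmetic. The following is a rough sketch under stated assumptions: the layer count of 4 is purely illustrative (the actual number of layers is not given here), the layers are assumed to be inpainted one after another on a single device, and the function names are hypothetical.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available to produce one frame, in milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def per_layer_budget_ms(fps: float, layers: int) -> float:
    """Time available per layer, assuming layers are processed sequentially."""
    if layers <= 0:
        raise ValueError("layers must be positive")
    return frame_budget_ms(fps) / layers


if __name__ == "__main__":
    ASSUMED_LAYERS = 4  # illustrative value only, not taken from the study
    for fps in (25, 60):
        total = frame_budget_ms(fps)
        layer = per_layer_budget_ms(fps, ASSUMED_LAYERS)
        print(f"{fps} fps: {total:.1f} ms per frame, "
              f"{layer:.1f} ms per layer ({ASSUMED_LAYERS} layers)")
```

At 25 frames per second the whole pipeline has 40 ms per frame, and at 60 frames per second roughly 16.7 ms. Inpainting would have to share that budget with all other processing steps, so the time left for each layer is smaller still.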
ACKNOWLEDGEMENTS
This paper was created as a part of the EU project
“Development of a novel training ecosystem using
mixed reality (MR) technology”. The project is co-
financed by the European Regional Development
Fund under Priority Axis 1 Support for R&D by
enterprises, Measure 1.1. R&D projects of
enterprises, Sub-measure 1.1.1 Industrial research
and development works carried out by enterprises,
Intelligent Development Operational Program 2014-
2020.