Table 1: Results for the recorded camera path in four path-traced scenes.

Metric   Scene       Anisotropic   Proposed
PSNR     Classroom   26.1          26.6
         Fireplace   31.8          32.1
         Sponza      22.1          22.4
         SanMiguel   22.0          22.1
SSIM     Classroom   0.826         0.835
         Fireplace   0.899         0.902
         Sponza      0.695         0.704
         SanMiguel   0.613         0.625
VMAF     Classroom   26.7          37.7
         Fireplace   37.6          47.3
         Sponza      11.4          22.0
         SanMiguel   10.9          16.5
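As a rough illustration of how a PSNR value like those in Table 1 is computed, the following minimal NumPy sketch compares a reference frame against a degraded reconstruction; the image size and noise level are illustrative only and do not reproduce the paper's test setup:

```python
import numpy as np

def psnr(reference, reconstructed, max_val=1.0):
    """Peak signal-to-noise ratio in dB between two images in [0, max_val]."""
    mse = np.mean((reference.astype(np.float64) - reconstructed.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images: no noise
    return 10.0 * np.log10(max_val ** 2 / mse)

# Toy example: a reference frame and a noisy "reconstruction" of it.
rng = np.random.default_rng(0)
ref = rng.random((64, 64, 3))
noisy = np.clip(ref + rng.normal(0.0, 0.05, ref.shape), 0.0, 1.0)
print(f"PSNR: {psnr(ref, noisy):.1f} dB")
```

Higher is better; SSIM and VMAF are perceptually motivated metrics computed by other means, but follow the same reference-versus-reconstruction pattern.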
frame on a single GeForce RTX 2080 Ti GPU. This timing is likely fast enough already, since it was measured on a single contemporary GPU; the quality, the speed, or both can therefore be improved significantly on coming hardware generations as inference acceleration support evolves. Furthermore, in this experiment the network used 32-bit floating-point precision. Using reduced precision would likely make the network faster without a significant reduction in quality (Venkatesh et al., 2017).
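The effect of reduced precision can be sketched with NumPy's float16 as a stand-in for GPU half-precision inference; the layer shape and values below are illustrative only, not the paper's actual network:

```python
import numpy as np

# A toy stand-in for one layer of a CNN: half-precision weights and
# activations versus the 32-bit baseline used in the experiment above.
rng = np.random.default_rng(42)
w32 = rng.normal(0.0, 0.1, (16, 16)).astype(np.float32)
x32 = rng.normal(0.0, 1.0, (16,)).astype(np.float32)

y32 = w32 @ x32  # full-precision matrix-vector product
y16 = (w32.astype(np.float16) @ x32.astype(np.float16)).astype(np.float32)

# Relative error introduced by float16; small for well-scaled values.
rel_err = float(np.linalg.norm(y32 - y16) / np.linalg.norm(y32))
print(f"relative error from float16: {rel_err:.2e}")
```

In practice, half-precision inference roughly halves memory traffic and can use dedicated tensor hardware, which is where the expected speedup comes from.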
6 CONCLUSIONS
In this position paper we argued that machine learning algorithms are going to play a critical role in reconstruction for foveated rendering. Specifically, an interesting option is to perform machine learning-based reconstruction at the reduced foveated resolution in order to lower the computational complexity. To provide initial evidence, we built a preliminary spatio-temporal super-resolving CNN that reconstructs the peripheral part of a foveated rendering pipeline and operates at the reduced foveated resolution. The method was tested on four different scenes, in which it improved the PSNR, SSIM and VMAF scores compared to anisotropic filtering. In our opinion, after a more comprehensive architecture and parameter search, a similar idea could greatly reduce the computational complexity and improve the reconstruction quality in foveated photorealistic rendering, making it usable in realistic mixed reality setups of the future.
We believe that the improved quality of machine learning based reconstruction can be translated into a further reduced foveated resolution and, therefore, a lower overall rendering workload, but this remains to be proven with more extensive user studies.
ACKNOWLEDGEMENTS
This work was supported by ECSEL JU project
FitOptiVis (project number 783162), the Tampere
University of Technology Graduate School, the Nokia
Foundation and the Emil Aaltonen Foundation.
REFERENCES
Albert, R., Godinez, A., and Luebke, D. (2019). Reading
speed decreases for fast readers under gaze-contingent
rendering. In Proceedings of the Symposium on Ap-
plied Perception.
Bako, S., Vogels, T., McWilliams, B., Meyer, M., Novák, J., Harvill, A., Sen, P., Derose, T., and Rousselle, F. (2017). Kernel-predicting convolutional networks for denoising Monte Carlo renderings. ACM Transactions on Graphics (TOG), 36(4).
Barré-Brisebois, C. (2018). Game ray tracing: State-of-the-art and open problems. High Performance Graphics Keynote.
Caballero, J., Ledig, C., Aitken, A., Acosta, A., Totz, J.,
Wang, Z., and Shi, W. (2017). Real-time video super-
resolution with spatio-temporal networks and motion
compensation. In Proceedings of the IEEE Confer-
ence on Computer Vision and Pattern Recognition.
Chaitanya, C. R. A., Kaplanyan, A. S., Schied, C., Salvi,
M., Lefohn, A., Nowrouzezahrai, D., and Aila, T.
(2017). Interactive reconstruction of Monte Carlo image sequences using a recurrent denoising autoencoder. ACM Transactions on Graphics (TOG), 36(4).
Dong, C., Loy, C. C., He, K., and Tang, X. (2015). Image
super-resolution using deep convolutional networks.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(2):295–307.
Friston, S., Ritschel, T., and Steed, A. (2019). Perceptual
rasterization for head-mounted display image synthe-
sis. ACM Transactions on Graphics (TOG), 38(4).
Guenter, B., Finch, M., Drucker, S., Tan, D., and Snyder, J.
(2012). Foveated 3D graphics. ACM Transactions on
Graphics (TOG), 31(6).
Hays, J. and Efros, A. A. (2007). Scene completion us-
ing millions of photographs. ACM Transactions on
Graphics (TOG), 26(3).
Kajiya, J. (1986). The rendering equation. SIGGRAPH
Computer Graphics, 20(4).
Kaplanyan, A., Sochenov, A., Leimkuehler, T., Okunev, M., Goodall, T., and Rufo, G. (2019). DeepFovea: Neural reconstruction for foveated rendering and video compression using learned statistics of natural videos. ACM Transactions on Graphics (TOG), 38(4).
Karpathy, A., Toderici, G., Shetty, S., Leung, T., Suk-
thankar, R., and Fei-Fei, L. (2014). Large-scale video
classification with convolutional neural networks. In
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
GRAPP 2020 - 15th International Conference on Computer Graphics Theory and Applications