
Radford, A., Metz, L., and Chintala, S. (2015). Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434.
Ramachandran, P., Zoph, B., and Le, Q. V. (2017). Searching for activation functions. arXiv preprint arXiv:1710.05941.
Ran, H., Liu, J., and Wang, C. (2022). Surface representation for point clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 18942–18952.
Ridnik, T., Lawen, H., Noy, A., Ben Baruch, E., Sharir, G., and Friedman, I. (2021). TResNet: High performance GPU-dedicated architecture. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pages 1400–1409.
Riegler, G., Osman Ulusoy, A., and Geiger, A. (2017). OctNet: Learning deep 3D representations at high resolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3577–3586.
Ronneberger, O., Fischer, P., and Brox, T. (2015). U-Net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015: 18th International Conference, Munich, Germany, October 5–9, 2015, Proceedings, Part III, pages 234–241. Springer.
Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., and Chen, L.-C. (2018). MobileNetV2: Inverted residuals and linear bottlenecks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4510–4520.
Schwarz, K., Sauer, A., Niemeyer, M., Liao, Y., and Geiger, A. (2022). VoxGRAF: Fast 3D-aware image synthesis with sparse voxel grids. arXiv preprint arXiv:2206.07695.
Shi, W., Caballero, J., Huszár, F., Totz, J., Aitken, A. P., Bishop, R., Rueckert, D., and Wang, Z. (2016). Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1874–1883.
Simonyan, K. and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556.
Su, H., Maji, S., Kalogerakis, E., and Learned-Miller, E. (2015). Multi-view convolutional neural networks for 3D shape recognition. In Proceedings of the IEEE International Conference on Computer Vision, pages 945–953.
Szegedy, C., Ioffe, S., Vanhoucke, V., and Alemi, A. (2017). Inception-v4, Inception-ResNet and the impact of residual connections on learning. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 31.
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., and Rabinovich, A. (2015). Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1–9.
Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., and Wojna, Z. (2016). Rethinking the Inception architecture for computer vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2818–2826.
Tan, M. and Le, Q. (2019). EfficientNet: Rethinking model scaling for convolutional neural networks. In International Conference on Machine Learning, pages 6105–6114. PMLR.
Tatarchenko, M., Dosovitskiy, A., and Brox, T. (2017). Octree generating networks: Efficient convolutional architectures for high-resolution 3D outputs. In Proceedings of the IEEE International Conference on Computer Vision, pages 2088–2096.
Valsesia, D., Fracastoro, G., and Magli, E. (2018). Learning localized generative models for 3D point clouds via graph convolution. In International Conference on Learning Representations.
Wang, H., Jiang, Z., Yi, L., Mo, K., Su, H., and Guibas, L. J. (2020). Rethinking sampling in 3D point cloud generative adversarial networks. arXiv preprint arXiv:2006.07029.
Wang, P.-S., Liu, Y., Guo, Y.-X., Sun, C.-Y., and Tong, X. (2017). O-CNN: Octree-based convolutional neural networks for 3D shape analysis. ACM Transactions on Graphics (TOG), 36(4):1–11.
Wang, Y., Sun, Y., Liu, Z., Sarma, S. E., Bronstein, M. M., and Solomon, J. M. (2019). Dynamic graph CNN for learning on point clouds. ACM Transactions on Graphics (TOG), 38(5):1–12.
Wu, J., Wang, Y., Xue, T., Sun, X., Freeman, B., and Tenenbaum, J. (2017). MarrNet: 3D shape reconstruction via 2.5D sketches. Advances in Neural Information Processing Systems, 30.
Wu, J., Zhang, C., Xue, T., Freeman, B., and Tenenbaum, J. (2016). Learning a probabilistic latent space of object shapes via 3D generative-adversarial modeling. Advances in Neural Information Processing Systems, 29.
Wu, J., Zhang, C., Zhang, X., Zhang, Z., Freeman, W. T., and Tenenbaum, J. B. (2018). Learning shape priors for single-view 3D completion and reconstruction. In Proceedings of the European Conference on Computer Vision (ECCV), pages 646–662.
Wu, Z., Song, S., Khosla, A., Yu, F., Zhang, L., Tang, X., and Xiao, J. (2015). 3D ShapeNets: A deep representation for volumetric shapes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1912–1920.
Xiang, P., Wen, X., Liu, Y.-S., Cao, Y.-P., Wan, P., Zheng, W., and Han, Z. (2021). SnowflakeNet: Point cloud completion by snowflake point deconvolution with skip-transformer. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 5499–5509.
Xie, S., Girshick, R., Dollár, P., Tu, Z., and He, K. (2017). Aggregated residual transformations for deep neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1492–1500.
Yan, Y., Mao, Y., and Li, B. (2018). SECOND: Sparsely embedded convolutional detection. Sensors, 18(10):3337.
HD-VoxelFlex: Flexible High-Definition Voxel Grid Representation