
VISAPP 2024 - 19th International Conference on Computer Vision Theory and Applications