ACKNOWLEDGEMENTS
This material is based upon work supported by the
Air Force Office of Scientific Research under award
number FA9550-22-1-0261, and partially supported
by the ESPOL project CIDIS-12-2022; the Spanish
Government under Project PID2021-128945NB-I00;
and the “CERCA Programme / Generalitat de
Catalunya”. The authors gratefully acknowledge the
NVIDIA Corporation for the donation of a Titan V
GPU used for this research.
VISAPP 2023 - 18th International Conference on Computer Vision Theory and Applications