Figure 7: Input images and estimated results: (a) input RGB image, (b) input thermal image, (c) translated thermal image, (d) translated RGB image, (e) estimated disparity, (f) ground truth. The RMSE between the estimated disparity and the ground truth was 3.25 pixels.
each other, such as RGB and thermal, our method can estimate both the modality-translated images and the disparity. Although the qualitative results of the proposed method are good, this evaluation is not sufficient because we could not use enough images with ground truth. In future work, we will therefore construct a database for evaluation and evaluate the proposed method extensively.
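For reference, the RMSE figure reported in Figure 7 is the root-mean-square difference, in pixels, between the estimated and ground-truth disparity maps. The following is a minimal sketch of that metric, not the evaluation code used in this work. The function name `disparity_rmse`, the nested-list representation of the maps, and the use of `None` to mark pixels without ground truth are all assumptions made for illustration, and the numbers are toy values rather than data from the experiments.

```python
import math


def disparity_rmse(estimated, ground_truth):
    """Return the RMSE in pixels over pixels that have valid ground truth."""
    squared_errors = []
    for est_row, gt_row in zip(estimated, ground_truth):
        for est, gt in zip(est_row, gt_row):
            if gt is None:  # assumption: None marks missing ground truth
                continue
            squared_errors.append((est - gt) ** 2)
    if not squared_errors:
        raise ValueError("no valid ground-truth pixels")
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Toy 2x2 disparity maps with made-up values (not results from the paper).
estimated = [[10.0, 12.0], [8.0, 9.0]]
ground_truth = [[13.0, 12.0], [4.0, None]]
rmse = disparity_rmse(estimated, ground_truth)
```

Skipping pixels without ground truth keeps sparse annotations from biasing the score, which matters when only part of the scene has measured disparity.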