Figure 13: Depth estimation result.
dation of the estimation, we compute the difference between the reprojected image and the input image at each hypothesized depth. Figure 13 shows this image distance at each depth: the vertical axis indicates the distance between the images, and the horizontal axis indicates the hypothesized depth. The red line marks the correct depth. The image distance reaches a local minimum at the true depth, which indicates that the distance between the reprojected image and the input image is a valid measure of the estimated depth. Therefore, our method can estimate depth from a single image.
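The depth-selection step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: `reproject_at_depth` is a hypothetical stand-in for rendering the reprojection image through the water-drop virtual cameras, and the image distance is taken as the sum of squared pixel differences.

```python
import numpy as np

def image_distance(a, b):
    """Sum of squared pixel differences between two images."""
    return float(np.sum((a.astype(np.float64) - b.astype(np.float64)) ** 2))

def estimate_depth(input_image, reproject_at_depth, candidate_depths):
    """Pick the depth whose reprojected image best matches the input.

    reproject_at_depth(d) is assumed to render the reprojection image
    for hypothesized depth d (a placeholder for the paper's model).
    """
    distances = [image_distance(reproject_at_depth(d), input_image)
                 for d in candidate_depths]
    best = int(np.argmin(distances))
    return candidate_depths[best], distances
```

The candidate depth whose reprojection minimizes the image distance is returned, mirroring the local minimum at the true depth in Figure 13.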
These experimental results indicate that our proposed model can describe the behavior of the light rays efficiently and effectively, and that the calibration and reconstruction methods based on the model work efficiently.
6 SIMULTANEOUS ESTIMATION
OF CAMERA PARAMETERS
AND 3D SCENE
We finally discuss the simultaneous estimation of the 3D shape and the camera parameters from an input image. In an ordinary stereo method, simultaneous estimation of these parameters, known as bundle adjustment, is achieved by minimizing the reprojection error of point correspondences. Bundle adjustment in our framework can be achieved in a similar manner: instead of the point reprojection error, the image reprojection error is minimized for 3D reconstruction and calibration. Therefore, simultaneous estimation can also be achieved by minimizing this same error.
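This joint minimization can be sketched as follows. The sketch is ours, not the paper's optimizer: `bundle_adjust` and its finite-difference gradient descent are simple stand-ins for a proper bundle adjuster (e.g. Levenberg-Marquardt), and `cost` is assumed to evaluate the image reprojection error for a concatenated vector of shape and camera parameters.

```python
import numpy as np

def bundle_adjust(cost, x0, lr=1e-2, eps=1e-5, iters=500):
    """Minimize an image-reprojection cost over the joint parameter
    vector x = (shape parameters, camera coefficients) by naive
    finite-difference gradient descent.

    cost(x) is assumed to render the reprojection image under the
    parameters in x and return its distance to the input image.
    """
    x = np.asarray(x0, dtype=np.float64).copy()
    for _ in range(iters):
        f0 = cost(x)
        g = np.zeros_like(x)
        # Forward-difference estimate of the gradient.
        for i in range(x.size):
            xp = x.copy()
            xp[i] += eps
            g[i] = (cost(xp) - f0) / eps
        x -= lr * g
    return x
```

Shape and camera parameters are updated together, which is exactly what distinguishes this step from the separate calibration and reconstruction stages.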
In this simultaneous estimation, the camera parameters are estimated in addition to the parameters of the 3D shape. When the 3D scene is represented by an N-th order Bézier surface, (N+1)^2 parameters are required. In addition, L water drops require L×M parameters (where M is the number of coefficients per drop) for the camera model. In total, (N+1)^2 + LM parameters must be estimated, which means that at least (N+1)^2 + LM constraints are necessary. In our proposed estimation, all pixels are used, and thus a sufficient number of constraints is obtained when the number of pixels is larger than (N+1)^2 + LM.
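The parameter counting above is a direct calculation; a minimal sketch (the function names are ours):

```python
def parameter_count(N, L, M):
    """Parameters to estimate: (N+1)^2 control values for an N-th
    order Bezier surface, plus M camera coefficients for each of
    the L water drops."""
    return (N + 1) ** 2 + L * M

def enough_constraints(num_pixels, N, L, M):
    """Each pixel contributes one constraint, so the estimation is
    determined only when the pixel count reaches the parameter count."""
    return num_pixels >= parameter_count(N, L, M)
```

For example, a third-order Bézier surface (16 parameters) with two drops of five coefficients each needs at least 26 pixels.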
7 CONCLUSION
In this paper, we propose 3D scene reconstruction from a single image using water drops. In our proposed method, water drops in the image are regarded as virtual cameras, and the 3D shape is reconstructed using these virtual cameras. For an efficient description of the virtual cameras, we utilize an optical aberration model based on the Zernike basis, with which complicated light-ray refractions can be described by a small number of coefficients. Furthermore, a parametric 3D scene description is employed to estimate the 3D scene effectively. Experimental results in a simulation environment demonstrate the potential of our proposed method. In future work, we will extend our proposed method to the simultaneous estimation of camera parameters and 3D scenes.
VISAPP 2019 - 14th International Conference on Computer Vision Theory and Applications