are unified to obtain a partial or sparse depth
map. Subsequently, using this map as an initial value,
the image generation consistency is imposed on all the
information in the observed multiple images so that a
complete depth map and other 3-D quantities are obtained
accurately.
In this study, as a first step of our research, we
take up a simple problem, "shape from two-view
images," and confirm the effectiveness of the image
generation consistency. We suppose that initial values
of the 3-D quantities, including a depth map, are obtained
from various feature-based cues; in the numerical
evaluation below, good initial values are given heuristically.
In the future, we will develop a system in which
various feature-based cues for obtaining rough and
sparse 3-D quantities are actually used and the unification
scheme proposed in this study is performed effectively.
The intensity of the images used in the numerical
evaluation consists of a diffuse reflectance component
and a specular reflectance component. The strengths of
the diffuse and specular reflectance are unknown relative
to the strength of a parallel light source, but they are
constant over the object. Therefore, we recover both
strengths using the strength of the light source as a unit.
The direction of the light source is also unknown and is
recovered. We recover the depth and the other 3-D
quantities through the image generation consistency of
two images. The number of unknown variables is larger
than the number of observations in one image, i.e., the
number of pixels, so it is feared that a unique solution
cannot be determined by the usual shading analysis
using only one image. For the case where only the
diffuse reflectance exists, it has been shown that a two-way
ambiguity appears (Brooks and Horn, 1985). Additionally,
since there is no clear texture, accurate binocular
disparity detection is difficult. Our strategy is therefore
expected to be necessary to solve this problem accurately,
in spite of the simplicity of the problem.
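The image model described above can be sketched as follows. This is only an illustrative Python fragment under our own assumptions, not the authors' implementation: the Blinn-Phong-style half-vector specular term, the function name, and all parameter names are assumptions; the paper states only that the intensity is a sum of diffuse and specular components with unknown strengths measured relative to the light source.

```python
import numpy as np

def intensity(n, s, v, k_d, k_s, m):
    """Illustrative image model: diffuse plus specular reflectance.

    n : surface normal, s : light source direction, v : view
    direction (3-vectors, normalized inside); k_d, k_s : diffuse and
    specular strengths relative to the light source (the unknowns to
    be recovered); m : shininess exponent (assumed model detail).
    """
    n, s, v = (u / np.linalg.norm(u) for u in (n, s, v))
    diffuse = k_d * max(np.dot(n, s), 0.0)
    h = (s + v) / np.linalg.norm(s + v)   # half vector (Blinn-Phong)
    specular = k_s * max(np.dot(n, h), 0.0) ** m
    return diffuse + specular

# Example: frontal surface patch, oblique parallel light.
I = intensity(np.array([0.0, 0.0, 1.0]),   # normal
              np.array([0.0, 0.6, 0.8]),   # light direction
              np.array([0.0, 0.0, 1.0]),   # view direction
              k_d=0.7, k_s=0.2, m=10)
```

Because k_d and k_s enter the model linearly and nonlinearly in a known way, observing intensities from two views constrains them together with the light direction and the surface shape.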
The simple algorithm evaluated in this study as a first
step can also be regarded as a new method for unifying
the binocular disparity and shading constraints. Most
unification methods proposed recently adopt almost the
same strategy: a stereo constraint is first used for specific
image regions or points where disparity detection can be
done easily, in order to recover a sparse depth map, and
then a shading constraint is used for the other regions
where it can be applied suitably (Samaras et al.,
2000). On the other hand, our algorithm does not
use the binocular disparity constraint directly; only the
image generation consistency of the two images is
considered, although a disparity detection result can be
used as an initial value. With a similar awareness of the
issue, (Maki et al., 2002) proposed a method based on
the principle of photometric stereo using known object
motion, but that method focuses only on shading and
motion and does not essentially consider texture. In
contrast, our strategy can in principle deal with a
distribution of albedo, although in this study the albedo
is assumed to be constant.
2 SHADING CONSISTENCY FOR
TWO-VIEWS
2.1 Formulation of Depth from Shading
Various shape-from-shading methods have been examined
(Zhang et al., 1999), (Szeliski, 1991), and almost
all are based on the image irradiance equation:
I(x, y) = R(~n(x, y)), (1)
which represents that the image intensity I at an image
point (x, y) is given by a function R of the surface
normal ~n at the point (X, Y, Z) on the surface that
projects to (x, y) in the image. In general, R contains
other variables such as the view direction, the light
source direction, and the albedo. These variables have
to be determined in advance or simultaneously with the
shape, in general.
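As a concrete instance of Eq. (1), the Lambertian case gives R(~n) = albedo · max(~n · ~s, 0) for a light direction ~s. The following sketch evaluates this reflectance map; the function and variable names are our assumptions for illustration, not notation from the paper:

```python
import numpy as np

def R(n, s, albedo=1.0):
    """Lambertian instance of the reflectance map in Eq. (1):
    R(n) = albedo * max(n . s, 0), with n the surface normal
    and s the light source direction (both normalized inside)."""
    n = n / np.linalg.norm(n)
    s = s / np.linalg.norm(s)
    return albedo * max(np.dot(n, s), 0.0)

# Image irradiance equation: the predicted intensity at a pixel
# equals R evaluated at that pixel's surface normal.
I_pred = R(np.array([0.0, 0.0, 1.0]), np.array([0.0, 0.0, 1.0]))
```

Here the view direction drops out because Lambertian reflectance is view-independent; for the diffuse-plus-specular model used in the numerical evaluation, R would also depend on the view direction.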
From the image irradiance equation, image intensity is
uniquely determined by surface orientation, not by
surface depth. Most formulations of the shape-from-shading
problem have focused on determining surface
orientation using the parameters (p, q) = (Z_X, Z_Y),
the first derivatives of Z with respect to X and Y.
Hence, we can express the shape-from-shading problem
as solving for p(x, y) and q(x, y), with which the
irradiance equation holds, by minimizing the following
objective function:
J ≡ ∫ {I(x, y) − R(p(x, y), q(x, y))}² dxdy, (2)
where I(x, y) is an observed image intensity. However,
this problem is highly under-constrained, and additional
constraints, for example a smoothness constraint, are
required to determine a particular solution. Additionally,
the solutions p(x, y) and q(x, y) will not, in general,
correspond to the orientations of a continuous and
differentiable surface Z(x, y). Therefore, post-processing
is required that generates a surface approximately
satisfying the integrability constraint p_Y = q_X;
alternatively, (Horn, 1990) proposed an objective
function that includes such a constraint implicitly.
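A discretized version of objective (2), together with the integrability residual p_Y − q_X, can be sketched as follows. This is a minimal illustration under our own naming; the reflectance map is passed in as a function, and finite differences stand in for the continuous derivatives:

```python
import numpy as np

def objective(I, p, q, R):
    """Discretized Eq. (2): sum of squared residuals between the
    observed intensity I(x, y) and the reflectance map R(p, q)."""
    return np.sum((I - R(p, q)) ** 2)

def integrability_residual(p, q):
    """p_Y - q_X by finite differences; this is zero everywhere
    exactly when (p, q) is the gradient field of some surface Z."""
    p_y = np.gradient(p, axis=0)   # rows index Y
    q_x = np.gradient(q, axis=1)   # columns index X
    return p_y - q_x

# A gradient field taken from an actual surface Z is integrable:
y, x = np.mgrid[0:16, 0:16].astype(float)
Z = 0.05 * (x ** 2 + y ** 2)       # toy paraboloid surface
g_y, g_x = np.gradient(Z)          # np.gradient returns (Z_Y, Z_X)
p_, q_ = g_x, g_y
res = integrability_residual(p_, q_)   # ~0 everywhere
```

A (p, q) field produced by minimizing (2) with a smoothness term alone has no reason to make this residual vanish, which is exactly why the post-processing or an implicit integrability term is needed.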
To avoid these difficulties, we can represent p(x, y) and
q(x, y) explicitly as the first derivatives of Z(x, y) and
consider R(p, q) as a function of Z(x, y).
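This reparameterization can be sketched as follows (an illustration under assumed names, not the paper's implementation): p and q are computed from the depth map itself by finite differences, so integrability holds by construction and Z becomes the only shape unknown.

```python
import numpy as np

def gradients_from_depth(Z):
    """p = Z_X and q = Z_Y computed from the depth map itself.
    np.gradient returns derivatives along (rows, columns), i.e.
    (Z_Y, Z_X); since (p, q) comes from one surface Z, the
    integrability constraint p_Y = q_X holds automatically."""
    q, p = np.gradient(Z)
    return p, q

def objective_in_Z(I, Z, R):
    """Objective (2) rewritten with Z(x, y) as the only shape
    unknown: R(p, q) is evaluated through the derivatives of Z."""
    p, q = gradients_from_depth(Z)
    return np.sum((I - R(p, q)) ** 2)
```

For example, a slanted plane Z = 0.5 x yields a constant gradient field p = 0.5, q = 0, and any Z that reproduces the observed intensities drives the objective to zero without any separate integrability step.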
Shape from Multi-view Images based on Image Generation Consistency
335