where f(x,y,t) is an image intensity, and f
x
, f
y
, f
t
are the partial derivatives of f, where (x,y) is a coor-
dinate system in an image plane and t indicates time.
The gradient equation is applied to a successive image
pair, in principle. By substituting Eqs. 3 and 4 into
Eq. 5, the gradient equation for a rigid object with the
camera rotations in Sec. 2.1 is derived.
f
t
= −( f
x
v
r
x
+ f
y
v
r
y
)−(−f
x
r
y
+ f
y
r
x
)Z
0
d ≡−f
r
− f
u
d.
(6)
Using Eq. 6, the inverse depth can be directly recov-
ered without optical flow detection. This scheme is
effective for shape from multiple images, since the in-
verse depth can be considered as a common variable
for all image pairs.
We assume that f
(i, j)
t
, where i and j are a pixel
position and a frame number respectively, includes an
observation error according to the Gaussian distribu-
tion with an average of 0 and a variance of σ
2
o
, but
f
(i, j)
x
and f
(i, j)
y
have no errors. We consider that the
equation error of the gradient equation is dominant in
the error of f
(i, j)
t
.
In addition, {d
(i)
} should be assumed to have lo-
cal correlation spatially, since an object usually has a
smooth structure. As a simple modeling, we use the
following prior of {d
(i)
}.
p(d|σ
2
d
) =
1
(
√
2πσ
d
)
N
exp
(
−
d
⊤
Ld
2σ
2
d
)
, (7)
where d is an N-dimensional vector consisting of
{d
(i)
}, where N indicates the number of pixels, and
L indicates the matrix corresponding to the 2D Lapla-
cian operator, and we control σ
2
d
heuristically in con-
sideration of the smoothness of a recovered depth
map.
Based on the probabilistic models of {r
( j)
},
{f
(i, j)
t
} and { d
(i)
} defined above, we can statisti-
cally estimate the inverse depth map. By applying
the MAP-EM algorithm (Dempster et al., 1977), σ
2
o
and {d
(i)
} are determined as a MAP estimator based
on p(d,σ
2
o
|{f
(i, j)
t
}) and {r
( j)
} is also determined as
a MAP estimator from p({ r
( j)
}|{f
(i, j)
t
},
ˆ
σ
o
2
,{
ˆ
d}), in
which ˆ· means a MAP estimator. It is noted that the
uniform distribution should be used as the prior of σ
2
o
,
because of no information of σ
2
o
in advance. The de-
tails of the estimation algorithm are shown in the lit-
erature (Tagawa, 2010), in which σ
2
r
is also estimated,
but in this study, it is assumed to be known as a setting
value.
3 SELECTION OF IMAGE PAIRS
FOR ACCURATE RECOVERY
Based on the condition that many image pairs are
available, we propose a scheme that at every pixel
we discard the gradient equations, i.e. discard the
image pairs having a large approximation error. In
each pixel, we decide which image pair should be dis-
carded.
At the first step, we focus on an alias problem.
An alias in signal processing is a state caused by a
low sampling rate as compared with the maximum
frequency of signals. In this study, when an image
motion between two images is large against the spa-
tial wavelength of a dominant image intensity pattern,
the direction of the detected optical flow is opposite to
the true direction and causes a large recovery error of
a depth map. In the image region where the alias oc-
curs, the angle between the spatial gradient vectors
of successive image pairs f
(i, j)
s
and f
(i, j+1)
s
, where
f
(i, j)
s
= [ f
(i, j)
x
, f
(i, j)
y
]
⊤
, tends to be large. Therefore,
the angle defined by both vectors can be used to find
the alias region. The image pairs in which the angle is
large should be detected in each pixel and discarded
thresholding.
In the next step, we further select the appropri-
ate image pairs independently in each pixel from the
image pairs remained through the first step described
above in consideration of the amount of nonlinear
terms included in the observation of f
t
. The exact f
t
is represented as follows:
f
t
= −f
x
v
x
− f
y
v
y
−
1
2
f
xx
v
2
x
+ f
yy
v
2
y
+ 2f
xy
v
x
v
y
+···
(8)
After discarding the exceedingly inappropriate image
pairs by the first step, the nonlinear term can be con-
sidered small, and the second order term in Eq. 8 can
be approximated at every pixel i as follows:
−
1
2
n
( f
(i, j)
x
− f
(i, j+1)
x
)v
(i, j)
x
+ ( f
(i, j)
y
− f
(i, j+1)
y
)v
(i, j)
y
o
.
(9)
This representation can be introduced from the fol-
lowing gradient equations with respect to f
x
and f
y
,
f
xx
v
x
+ f
xy
v
y
+ f
xt
= 0, (10)
f
yx
v
x
+ f
yy
v
y
+ f
yt
= 0, (11)
with the approximations of f
xt
≈ f
(i, j+1)
x
− f
(i, j)
x
and
f
yt
≈ f
(i, j+1)
y
− f
(i, j)
y
. Equation 9 can be computed
without detecting the second derivatives of f, which
tend to be noisy.
In this step, we take into account the SNR of f
t
.
When optical flow is small relative to a dominant in-
tensity pattern, the first term in Eq. 8 is likely to be