Figure 1: Illustration of fixational eye movement, including
microsaccade, drift, and tremor.
Based on the above discussion, in this study we examine
a depth perception model based on fixational eye
movements. Fixational eye movements are classified into
three types, as shown in Fig. 1: microsaccade, drift,
and tremor. Here, we focus on tremor, the smallest of
the three, and construct a computational algorithm by
analogy with tremor to confirm the effectiveness of a
perception model that uses it. Since fixational eye
movement is involuntary, it is unrealistic to know all
of the eye movements before depth recovery, and thus we
treat them as stochastic variables. This problem can be
formulated in the framework of Bayesian inference, and
a stable algorithm is expected to be obtained using the
EM algorithm (Dempster et al., 1977).
2 PERCEPTION MODEL WITH
FIXATIONAL EYE MOVEMENT
As background to this study, we are examining a
two-step perception model in which monocular depth
perception based on fixational eye movement is used
for binocular stereopsis. Binocular stereopsis plays
an essential role in depth perception in the human
vision system (Lazaros et al., 2008), but occlusions
often occur in it. This two-step processing is expected
to solve the occlusion problem. In this study, we mainly
propose a model for the first step, monocular
perception, which is itself composed of the following
two steps:
1. perception in the period of drift and tremor;
2. perception in the period of microsaccade.
In the former, depth perception is assumed to arise
over the whole period of one drift, rather than over
each tremor period, from the many fine tremor movements
that occur during that drift. Therefore, the recognized
depth value has a temporal resolution no finer than the
period of one drift and a spatial resolution no finer
than the distance moved during one drift. However,
because only small movements are treated, the gradient
method explained in the next section can be used; it
needs no search process and hence is computationally
inexpensive. It should be noted that, by adopting drift
as the unit of perception, the variety of brightness
patterns in a neighboring region can be used
effectively, and as a result accurate depth perception
can be realized.
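The search-free character of a gradient method can be illustrated with a minimal sketch. It assumes only the generic brightness-constancy constraint Ix*vx + Iy*vy + It = 0 pooled over one local region, and it is not the depth-recovery formulation developed in the next section. The function names, the synthetic brightness pattern, and the sub-pixel shift are hypothetical choices made for illustration.

```python
import numpy as np

def gradient_displacement(frame0, frame1):
    """Least-squares (vx, vy) for one local region from the
    brightness-constancy constraint Ix*vx + Iy*vy + It = 0."""
    f0 = np.asarray(frame0, dtype=float)
    f1 = np.asarray(frame1, dtype=float)
    # Spatial gradients of the mean frame; np.gradient returns the
    # derivative along rows (y) first, then along columns (x).
    Iy, Ix = np.gradient((f0 + f1) / 2.0)
    It = f1 - f0  # temporal brightness difference
    # One linear constraint per pixel; no search over candidates.
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
    b = -It.ravel()
    v, *_ = np.linalg.lstsq(A, b, rcond=None)
    return v  # array([vx, vy])

def pattern(dx, dy, size=32):
    """Hypothetical smooth brightness pattern shifted by (dx, dy) pixels."""
    ys, xs = np.mgrid[0:size, 0:size].astype(float)
    return np.sin(0.3 * (xs - dx)) + np.cos(0.25 * (ys - dy))

# Small (sub-pixel) shift, the regime in which the linearization holds.
v_est = gradient_displacement(pattern(0.0, 0.0), pattern(0.2, -0.1))
print(v_est)  # close to the true shift (0.2, -0.1)
```

Because every pixel in the region contributes one linear constraint, a richer variety of brightness gradients in the region makes the least-squares system better conditioned, which is consistent with the remark above about using a neighboring region.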
In the latter, using the low-resolution depth value
obtained in the former step and the eye movement
corresponding to drift, the image displacement before
and after a microsaccade is detected by a search
process, and the depth value is recognized. Since the
results of the former step are available, the size of
the local region whose brightness pattern is used for
the search and the extent of the search area can be
determined appropriately. Additionally, because a
microsaccade is a fast movement, the latter step can
achieve depth perception with high spatio-temporal
resolution at a small computational cost.
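How a coarse prior estimate can narrow a search is sketched below. This is an illustrative block-matching routine using the sum of squared differences, not the algorithm of this study; the function name, the patch size, the predicted displacement, and the search radius are all hypothetical. The predicted displacement stands in for what the former step would supply.

```python
import numpy as np

def search_displacement(before, after, center, half_patch,
                        predicted, search_radius):
    """Integer (dy, dx) displacement of a patch, found by SSD search
    restricted to a window around a predicted displacement."""
    cy, cx = center
    size = 2 * half_patch + 1
    patch = before[cy - half_patch:cy + half_patch + 1,
                   cx - half_patch:cx + half_patch + 1]
    py, px = predicted
    best, best_cost = None, np.inf
    for dy in range(py - search_radius, py + search_radius + 1):
        for dx in range(px - search_radius, px + search_radius + 1):
            y0 = cy + dy - half_patch
            x0 = cx + dx - half_patch
            # Skip candidates that fall outside the image.
            if (y0 < 0 or x0 < 0 or y0 + size > after.shape[0]
                    or x0 + size > after.shape[1]):
                continue
            cand = after[y0:y0 + size, x0:x0 + size]
            cost = np.sum((cand - patch) ** 2)
            if cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best

rng = np.random.default_rng(0)
before = rng.random((64, 64))
# Synthetic "after" image: content moved 3 rows down, 5 columns left.
after = np.roll(before, shift=(3, -5), axis=(0, 1))

# A coarse prediction of (2, -4) lets a radius-2 window suffice.
found = search_displacement(before, after, center=(32, 32), half_patch=4,
                            predicted=(2, -4), search_radius=2)
print(found)
```

With a radius of 2 only 25 candidates are evaluated, whereas an uninformed search covering displacements up to 6 pixels in each direction would need 169.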
As a first report on our monocular perception model,
we construct an algorithm for the first step and
confirm its efficiency. To model the first step
completely, we would have to integrate the drift
component into the algorithm, but in this study we
focus only on tremor. Hence, we ignore the temporal
correlation of tremor, which is needed to form the
drift component, and we assume that the small
movements are independent of one another.
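The independence assumption can be made concrete with a small simulation. The sketch below is illustrative only: the zero-mean Gaussian distribution, the standard deviation, and the random-walk construction used as a temporally correlated contrast are assumptions introduced here, not quantities specified in this study.

```python
import numpy as np

rng = np.random.default_rng(0)
n_steps = 200
sigma = 1e-3  # hypothetical standard deviation of one small rotation (rad)

# Independent tremor: each small rotation (r_x, r_y) is drawn i.i.d.
tremor = rng.normal(0.0, sigma, size=(n_steps, 2))

# Contrast case (assumption): accumulating the same samples gives a
# temporally correlated, drift-like path (a random walk).
drift_like = np.cumsum(tremor, axis=0)

def lag1_correlation(x):
    """Sample correlation between consecutive values of a 1-D series."""
    return np.corrcoef(x[:-1], x[1:])[0, 1]

c_tremor = lag1_correlation(tremor[:, 0])
c_drift = lag1_correlation(drift_like[:, 0])
print(c_tremor, c_drift)
```

Under the independence assumption the lag-one correlation of the samples stays near zero, while the accumulated path is strongly correlated from one step to the next. It is this kind of temporal structure that the present algorithm ignores.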
3 GRADIENT METHOD USING
FIXATIONAL EYE MOVEMENT
3.1 Motion Model and Optical Flow
As shown in Fig. 2, we use perspective projection as
our camera-imaging model. An (X,Y,Z) coordinate system
is fixed to the camera, where the viewpoint, i.e., the
lens center, is at the origin O and the optical axis is
along the Z-axis. The projection plane, i.e., the image
plane, can be taken as Z = 1 without loss of generality,
which means that the focal length equals 1.
A space point (X,Y,Z) on the object is projected to
the image point (x,y). The camera moves with
translational and rotational vectors
$u = [u_x, u_y, u_z]^\top$ and $r = [r_x, r_y, r_z]^\top$.
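The projection model above can be sketched numerically. With the image plane at Z = 1, a space point (X,Y,Z) maps to (x,y) = (X/Z, Y/Z). The example also assumes, as a sign convention chosen here for illustration, that a pure camera translation u moves a static point to (X,Y,Z) - u in camera coordinates. The point coordinates and the value of u are hypothetical.

```python
import numpy as np

def project(points):
    """Perspective projection onto the image plane Z = 1 (focal length 1)."""
    p = np.asarray(points, dtype=float)
    return p[..., :2] / p[..., 2:3]

# Two hypothetical points on the same ray direction offsets, at depths 2 and 4.
near = np.array([0.5, 0.2, 2.0])
far = np.array([0.5, 0.2, 4.0])

# Small hypothetical camera translation along the X-axis.
u = np.array([0.01, 0.0, 0.0])

# Image displacement caused by the translation (assumed convention:
# the point becomes X - u in the moved camera frame).
shift_near = project(near - u) - project(near)
shift_far = project(far - u) - project(far)
print(project(near), shift_near, shift_far)
```

The nearer point shifts twice as far in the image as the point at twice the depth, which is the dependence of image motion on depth that depth recovery from small camera movements relies on.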
We introduce a motion model representing fixational
eye movement. We can set the camera's rotation center
behind the lens center, at a distance $Z_0$ along the
optical axis. In this study, we pick out tremor from three
COMPUTATIONAL MODEL OF DEPTH PERCEPTION BASED ON FIXATIONAL EYE MOVEMENTS