level th that varies with the average image brightness.
This threshold is defined as
\[
th = \alpha \cdot b_{avg}
\]
The variable α is set empirically and can be adjusted
by the user until the pupil detection is satisfactory.
Since other regions in the image, such as eyelashes,
may also contain dark pixel groups, for each detected
group of dark pixels, it is necessary to calculate the
brightness of its neighboring pixels. Only those pix-
els that are surrounded by dark pixels within a user-
defined neighborhood area will be considered as can-
didates for the pupil. The pixel group that corre-
sponds to the pupil is found by shape matching as the
pixels corresponding to the pupil should form a cir-
cular object. Finally, for the identified pupil region,
the area and radius are calculated. We track the posi-
tion of the pupil center, which moves linearly with the
direction of gaze.
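As an illustration of this procedure, the following Matlab-style sketch shows a minimal version of the threshold-based pupil search. It assumes a grayscale input image and the Image Processing Toolbox; the function name detectPupil and the circularity cut-off of 0.7 are our own illustrative choices, and the neighborhood-brightness check described above is omitted for brevity.

% Minimal sketch of the brightness-threshold pupil search (illustrative,
% not the exact implementation used in the paper).
function [center, radius] = detectPupil(img, alpha)
    img  = double(img);
    bAvg = mean(img(:));                  % average image brightness
    th   = alpha * bAvg;                  % adaptive dark-pixel threshold
    cc    = bwconncomp(img < th);         % group dark pixels into regions
    stats = regionprops(cc, 'Area', 'Perimeter', 'Centroid');
    bestIdx = 0; bestArea = 0;
    for k = 1:numel(stats)
        % circularity is 1 for a perfect circle, smaller otherwise
        circ = 4 * pi * stats(k).Area / max(stats(k).Perimeter^2, eps);
        if circ > 0.7 && stats(k).Area > bestArea   % favour large, round regions
            bestIdx = k; bestArea = stats(k).Area;
        end
    end
    assert(bestIdx > 0, 'no pupil candidate found');
    center = stats(bestIdx).Centroid;          % pupil center [x, y] in pixels
    radius = sqrt(stats(bestIdx).Area / pi);   % radius estimated from the area
end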
Gaze Mapping and Fixation Detection. To be able
to calculate the gaze position in the scene (i.e., on
the perimeter surface), we need to provide a mapping be-
tween the position of the eye within the camera image
and the coordinates on the perimeter surface. The co-
ordinates of the eye position are defined by the coor-
dinates of the pupil center. The mapping is calculated
during a calibration routine, as usual in eye-tracking
applications. We use a 3 × 3 calibration grid as pre-
sented by (Li et al., 2005).
The scene points $\vec{s}_i = (x_{s_i}, y_{s_i})$ are given in polar
coordinates and can be configured at the beginning
of an examination. We define the following default val-
ues for the coordinates: $x_{s_i}, y_{s_i} \in \{-20, 0, 20\}$. The re-
sulting nine points from the combination of these co-
ordinates are presented during the calibration routine
sequentially, where each point is presented for 5 sec-
onds (corresponding to 100 frames at the sampling
frequency of the camera). The subject is asked to fix-
ate each presented calibration point. During stimulus
presentation the eye position $\vec{e}_i = (x_{e_i}, y_{e_i})$ in the im-
age is calculated using the algorithms for the pupil
detection described above. When the eye position is
stable for a time period $f_T$, a fixation is assumed. In
order to achieve the best mapping precision, during the
calibration procedure we expect long fixations with $f_T >
1000\,$ms (corresponding to 20 video frames). Thus,
the standard deviation $f_D$ of the eye position in the im-
age data is computed for the last 20 frames. When the
standard deviation stays below an empirically determined
threshold $th_D = 4\,$px that accounts for the inaccuracy of
the eye tracker, i.e. $f_D < th_D$, a fixation is assumed. If a
fixation cannot be recognized (e.g. due to an impaired
cooperation) the missed stimulus is presented again.
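A minimal sketch of this stability test is given below, assuming the pupil centers of the most recent frames are available as an N-by-2 matrix; the function name and interface are hypothetical.

% Fixation test used during calibration: the eye position is considered
% stable when the per-axis standard deviation over the last 20 frames
% (about 1000 ms at 20 Hz) stays below th_D = 4 px.
function isFix = isCalibrationFixation(eyePos, nFrames, thD)
    if nargin < 2, nFrames = 20; end
    if nargin < 3, thD = 4; end
    if size(eyePos, 1) < nFrames
        isFix = false;                       % not enough samples yet
        return;
    end
    fD = std(eyePos(end-nFrames+1:end, :));  % [std_x, std_y] in pixels
    isFix = all(fD < thD);
end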
To map the eye position in the image to the gaze
position in the scene (the perimeter surface), we
use a first-order linear mapping (Li et al., 2005). For
each correspondence between $\vec{s}_i$ and $\vec{e}_i$, two equations
are generated that constrain the following mapping:
\[
x_{s_i} = a_{x_0} + a_{x_1} x_{e_i} + a_{x_2} y_{e_i}
\]
\[
y_{s_i} = a_{y_0} + a_{y_1} x_{e_i} + a_{y_2} y_{e_i}
\]
where $a_{x_i}$ and $a_{y_i}$ are undetermined coefficients of
the linear mapping. This linear formulation results
in six coefficients that need to be determined. Given
the nine point correspondences from the calibration
and the resulting 18 constraint equations, the coeffi-
cients can be solved using Singular Value Decomposi-
tion (Hartley and Zisserman, 2000).
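The following sketch shows one way to estimate the six coefficients from the nine calibration correspondences. It uses Matlab's backslash (least-squares) operator, which for this overdetermined, full-rank linear system gives the same result as an explicit SVD-based solution; the function name is our own.

% Estimate the first-order mapping coefficients from calibration data.
% e: 9-by-2 eye positions  [x_e, y_e] (pupil centers in the image)
% s: 9-by-2 scene points   [x_s, y_s] (stimulus positions on the perimeter)
function [ax, ay] = calibrateMapping(e, s)
    A  = [ones(size(e,1),1), e(:,1), e(:,2)];  % design matrix [1, x_e, y_e]
    ax = A \ s(:,1);   % [a_x0; a_x1; a_x2]
    ay = A \ s(:,2);   % [a_y0; a_y1; a_y2]
end

% Mapping a new eye position (xe, ye) to the perimeter surface:
%   gaze = [1, xe, ye] * [ax, ay];    % -> [x_s, y_s]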
In a further step, for each presented stimulus
during the EFOV test, we have to find out whether
the stimulus was fixated by the subject. Generally,
when a presented stimulus is fixated, the subject’s
gaze oscillates around the stimulus location forming
a fixation cluster. A fixation is assumed if the gaze
is kept around the stimulus location for at least 300
ms (Liversedge et al., 2011). At a sampling rate of
20 Hz, as is the case for the built-in cameras of the
Octopus perimeter, 300ms correspond to 6 frames (or
gaze points). After the presentation of a stimulus, our
algorithm searches for clusters of points in at least
6 sequential video frames. This parameter is config-
urable and can easily be adapted to other sampling
rates. If a fixation cluster is detected, we calculate
the cluster centroid that represents the location of
the fixation. As described above, for each stimulus
location (Figure 4(b) black dots) we calculate the
corresponding fixation location (Figure 4(b) red dots).
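A simplified sketch of this cluster search follows: it scans windows of consecutive gaze points and accepts a window as a fixation when all of its points stay close to the window centroid. The cluster radius used here is a hypothetical parameter, not specified in the text.

% Search for a fixation cluster after a stimulus presentation.
% gaze: N-by-2 gaze points on the perimeter surface, in temporal order.
function [found, centroid] = findFixationCluster(gaze, minFrames, maxRadius)
    if nargin < 2, minFrames = 6; end      % 300 ms at the 20 Hz sampling rate
    if nargin < 3, maxRadius = 2; end      % hypothetical cluster radius (deg)
    found = false; centroid = [NaN, NaN];
    for i = 1:size(gaze,1) - minFrames + 1
        win = gaze(i:i+minFrames-1, :);
        c   = mean(win, 1);
        d   = sqrt(sum((win - c).^2, 2));  % distances to the window centroid
        if all(d <= maxRadius)
            found = true;
            centroid = c;                  % location of the fixation
            return;
        end
    end
end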
2.2 Modeling Fixation Data with the
Generalized Pareto Distribution
We observed that an exact match between the location
of the presented stimulus and the corresponding fixa-
tion occurs only very rarely. Instead, for a given stimulus
location, the distribution of the distances between the
stimulus location and the fixations of different sub-
jects corresponds to a Pareto distribution. The ques-
tion is: up to which distance between fixation and
stimulus, $d_{seen}$, can a stimulus be considered as per-
ceived (seen)?
We used the Generalized Pareto Distribution
(GPD) to model the distribution of distances between
fixation and stimulus and implemented the model us-
ing Matlab (MATLAB, 2012). The probability den-
sity function of the GPD is given by Equation 1
(Kotz and Nadarajah, 2000), (Embrechts et al.,