defined as pixels with a significant intensity variation.
Examples of interest points are corners, junctions,
isolated points or specific texture points. In (Harris
and Stephens, 1988), the authors propose to find such
points using a second-moment matrix.
In (Laptev and Lindeberg, 2003), Laptev and Lindeberg
proposed a spatio-temporal extension to detect
what they call "Space-Time Interest Points" (STIPs).
STIPs are points which are relevant both in space
and time. These points are especially interesting
because they concentrate information initially contained
in thousands of pixels on a few specific points,
which can be related to spatio-temporal events in
the sequence. Typically, STIPs appear in articulated
motions (a person walking, running or jumping).
However, it can be noted that the constant motion of a
corner does not produce any STIPs.
STIP detection is performed using the
Hessian-Laplace matrix (Laptev, 2005), defined, for a
pixel (x, y) at time t with intensity I(x, y, t), by:
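\[
H(x, y, t) =
\begin{pmatrix}
\dfrac{\partial^2 I}{\partial x^2} & \dfrac{\partial^2 I}{\partial x \partial y} & \dfrac{\partial^2 I}{\partial x \partial t} \\[1ex]
\dfrac{\partial^2 I}{\partial x \partial y} & \dfrac{\partial^2 I}{\partial y^2} & \dfrac{\partial^2 I}{\partial y \partial t} \\[1ex]
\dfrac{\partial^2 I}{\partial x \partial t} & \dfrac{\partial^2 I}{\partial y \partial t} & \dfrac{\partial^2 I}{\partial t^2}
\end{pmatrix}
\qquad (1)
\]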
In order to highlight STIPs, different criteria have
been proposed. As in (Laptev, 2005), we have chosen
the extension of the Harris corner function, called
the "salience function", defined by:
\[
R(x, y, t) = \det(H(x, y, t)) - k \, \mathrm{trace}(H(x, y, t))^3 \qquad (2)
\]
where k is an empirically adjusted parameter. STIPs
correspond to high values of the salience function.
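The computation described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the implementation used for the experiments: it assumes the video is stored as an array indexed (t, y, x), that σ_s and σ_t are the spatial and temporal standard deviations of a separable Gaussian pre-filter, and that derivatives are approximated by finite differences. The function names (`gaussian_kernel`, `smooth`, `salience`, `detect_stips`) are hypothetical, and no non-maximum suppression is performed.

```python
import numpy as np


def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel, truncated at about 3 sigma."""
    radius = max(1, int(np.ceil(3 * sigma)))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def smooth_1d(signal, kernel):
    """Convolve one line of samples, replicating the border values."""
    radius = len(kernel) // 2
    padded = np.pad(signal, radius, mode="edge")
    return np.convolve(padded, kernel, mode="valid")


def smooth(volume, sigma_s, sigma_t):
    """Separable Gaussian smoothing of a video stored as (t, y, x)."""
    out = np.asarray(volume, dtype=float)
    for axis, sigma in ((0, sigma_t), (1, sigma_s), (2, sigma_s)):
        out = np.apply_along_axis(smooth_1d, axis, out, gaussian_kernel(sigma))
    return out


def salience(volume, k=0.04, sigma_s=1.5, sigma_t=1.5):
    """Salience R = det(H) - k * trace(H)^3 at every pixel (assumed sketch)."""
    L = smooth(volume, sigma_s, sigma_t)
    # First derivatives along t, y, x (finite differences).
    It, Iy, Ix = np.gradient(L)
    # Second derivatives; mixed terms are taken once and reused symmetrically.
    Itt, Ity, Itx = np.gradient(It)
    _, Iyy, Iyx = np.gradient(Iy)
    _, _, Ixx = np.gradient(Ix)
    # Assemble the symmetric 3x3 matrix of Eq. (1) for every pixel.
    H = np.empty(L.shape + (3, 3))
    H[..., 0, 0] = Ixx
    H[..., 1, 1] = Iyy
    H[..., 2, 2] = Itt
    H[..., 0, 1] = H[..., 1, 0] = Iyx
    H[..., 0, 2] = H[..., 2, 0] = Itx
    H[..., 1, 2] = H[..., 2, 1] = Ity
    det = np.linalg.det(H)
    trace = np.trace(H, axis1=-2, axis2=-1)
    return det - k * trace ** 3


def detect_stips(volume, threshold, k=0.04, sigma_s=1.5, sigma_t=1.5):
    """Return the (t, y, x) coordinates whose salience exceeds the threshold."""
    R = salience(volume, k, sigma_s, sigma_t)
    return np.argwhere(R > threshold)
```

A call such as `detect_stips(video, threshold=150)` returns one row per candidate point. A practical detector would typically also keep only local maxima of R, which this sketch omits.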
We tested different values of the standard
deviations σ_s and σ_t. These tests highlight the
impact of the Gaussian filters: when the values of σ_s
and σ_t are low, the number of STIPs increases, but
the good detection rate decreases. Conversely,
when the values of σ_s and σ_t are high, the number of
STIPs decreases and the good detection rate increases, up
to 100%. However, the settings corresponding to a
100% rate yield too few STIPs. A good
compromise is σ_s = 1.5 and σ_t = 1.5.
Although methods exist to adjust these values automatically,
we preferred to set them manually in order
to optimize computation time.
3.2 Properties
STIP properties are well known, particularly their relative
stability with respect to geometric transformations.
In our application, we are interested in some
specific properties, such as the robustness of STIPs
against impulsive noise and contrast modification.
3.2.1 Low/High Contrast and Noise
The effects of image quality on
STIP detection have also been analyzed. Two situations
were examined: contrast modification and noise addition.
The noise used is impulsive noise,
because it is the most difficult type of noise for
interest point detection. Table 1 shows the number
of STIPs obtained for different contrast and noise
conditions.
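The two degradations can be simulated along the following lines. This is a minimal sketch under stated assumptions: the text does not specify how the contrast values or the noise parameters are defined, so the parameterizations below (a percentage scaling of deviations from the mean gray level, and a fraction of pixels shifted by a fixed amplitude) are hypothetical and are not claimed to match the columns of Table 1. Frames are assumed to be 8-bit gray-level images.

```python
import numpy as np


def adjust_contrast(frame, percent):
    """Scale deviations from the mean gray level; 100 leaves the frame unchanged."""
    frame = np.asarray(frame, dtype=float)
    mean = frame.mean()
    out = mean + (percent / 100.0) * (frame - mean)
    return np.clip(out, 0, 255)


def add_impulse_noise(frame, fraction, amplitude, rng):
    """Shift a random `fraction` of pixels by +/- `amplitude` gray levels."""
    out = np.asarray(frame, dtype=float).copy()
    mask = rng.random(out.shape) < fraction          # pixels to corrupt
    signs = rng.choice([-1.0, 1.0], size=out.shape)  # random direction
    out[mask] += amplitude * signs[mask]
    return np.clip(out, 0, 255)
```

For example, `add_impulse_noise(frame, 0.05, 50, np.random.default_rng(0))` corrupts roughly 5% of the pixels; passing a seeded generator keeps the degraded sequences reproducible.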
Table 1: Influence of contrast and impulse noise.

Contrast     50    75   100   125   150   175
STIP          1     2    29    64    68   127
a) Contrast influence

Pow           0    20    20    50    50    50
Intensity     0    20   50+    20    50   70+
STIP         29    29    33    49    78   126
b) Noise influence

80 sequences of synthetic videos and athletics jumps;
k = 0.04, σ_s = σ_t = 1.5, salience threshold = 150.
The evaluation is performed by observing the variations
of the number of STIPs compared with the initial
situation (no contrast modification and no noise:
29 STIPs per frame). It can be noticed that STIP
detection is very sensitive to contrast modification.
Conversely, the number of STIPs is relatively stable
with respect to impulse noise.
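The variations relative to the initial situation can be made explicit with a short calculation on the values reported in Table 1. The snippet below is only an illustration of that arithmetic; the variable and function names are hypothetical.

```python
baseline = 29  # STIPs per frame with no contrast modification and no noise

# Values copied from Table 1 a) and b), columns read left to right.
contrast_counts = {50: 1, 75: 2, 100: 29, 125: 64, 150: 68, 175: 127}
noise_counts = [29, 29, 33, 49, 78, 126]


def relative_change(count, reference=baseline):
    """Signed variation of the STIP count with respect to the reference."""
    return (count - reference) / reference


contrast_change = {c: relative_change(n) for c, n in contrast_counts.items()}
noise_change = [relative_change(n) for n in noise_counts]

for c, v in contrast_change.items():
    print(f"contrast {c}: {v:+.0%}")
for i, v in enumerate(noise_change, start=1):
    print(f"noise setting {i}: {v:+.0%}")
```

With these values, the contrast setting 50 leaves a single STIP out of the 29 of the initial situation, whereas the two weakest non-zero noise settings give 29 and 33 STIPs.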
3.2.2 Video Compression
The last factor that influences STIP generation is
the compression factor of the video. Indeed, as a result
of compression, straight lines exhibit aliasing,
which may, under certain circumstances, be perceived
as angles (Clarke, 1995). This change causes STIPs
to be generated.
Table 2: Influence of the MPEG2 compression factor: average
number of STIPs per frame.

Compression factor (%)    10    20    30    40    50
STIPs per frame (nb)      29    29    30    38    44

Compression factor (%)    60    70    80    90   100
STIPs per frame (nb)      51    62    77    90   118

80 sequences of synthetic videos and athletics jumps;
k = 0.04, σ_s = σ_t = 1.5, salience threshold = 120.
Table 2 shows the influence of the MPEG2 compression
factor on the number of generated STIPs. It is
important to note that the sequence with the square did
not generate false positives. Indeed, no aliasing
occurred. These results show that the compression
factor has an important influence past the threshold of
VISAPP 2012 - International Conference on Computer Vision Theory and Applications