dimensionality from 128 to 36, which makes PCA-SIFT
faster for matching but apparently less distinctive
than the original SIFT, as demonstrated in the
comparative study of (Mikolajczyk & Schmid, 2005).
(Bay et al., 2008) developed the Speeded Up
Robust Features (SURF) method, a modification of
SIFT that aims at better run-time performance of
feature detection and matching. This is achieved by
two major modifications. In the first, the Difference
of Gaussian (DoG) filter is replaced by the
Difference of Means (DoM) filter, which speeds up
feature detection because the DoM can be
implemented efficiently with integral images. The
second modification reduces the feature vector
length to half the size of the SIFT descriptor, which
enables quicker feature matching. Together these
modifications increase the computation speed by a
factor of 3 compared to the original SIFT method.
However, this is still insufficient for real-time
requirements.
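The integral-image trick behind the DoM filter can be sketched as follows (a minimal NumPy illustration; the function names are ours, not SURF's actual implementation). Once a summed-area table is built, the mean of any axis-aligned box follows from four table lookups, so a box-filter response costs O(1) per pixel regardless of the box size:

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a zero first row/column for easy indexing."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.float64)
    ii[1:, 1:] = np.cumsum(np.cumsum(img, axis=0), axis=1)
    return ii

def box_mean(ii, r0, c0, r1, c1):
    """Mean over rows r0..r1-1, cols c0..c1-1 in O(1) via four lookups."""
    s = ii[r1, c1] - ii[r0, c1] - ii[r1, c0] + ii[r0, c0]
    return s / ((r1 - r0) * (c1 - c0))

def dom_response(img, small, large):
    """Difference-of-Means response at the image centre: mean of a small
    box minus mean of a large box, each in constant time per pixel."""
    ii = integral_image(img)
    cr, cc = img.shape[0] // 2, img.shape[1] // 2
    hs, hl = small // 2, large // 2
    m_small = box_mean(ii, cr - hs, cc - hs, cr + hs + 1, cc + hs + 1)
    m_large = box_mean(ii, cr - hl, cc - hl, cr + hl + 1, cc + hl + 1)
    return m_small - m_large
```

On a constant image the two box means coincide, so the response is zero; the cost of `box_mean` is independent of the box size, which is the source of SURF's speedup.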
In recent years, several papers (Heymann et al.,
2007) were published addressing the use of the
parallelism of modern graphics hardware (GPUs) to
accelerate parts of the SIFT algorithm, focusing on
the feature detection and description steps. In
(Charriot & Keriven, 2008) GPU power was
exploited to accelerate feature matching. These
GPU-SIFT approaches provide 10 to 20 times faster
processing, allowing real-time applications.
The matching step can be sped up by searching
for the Approximate Nearest Neighbor (ANN)
instead of the exact nearest neighbor. The most
widely used algorithm for ANN is the kd-tree
(Friedman et al., 1977), which works well in
low-dimensional search spaces but performs poorly
as the feature dimensionality increases. (Lowe, 2004)
used the Best-Bin-First (BBF) method, which
extends the kd-tree by modifying the search ordering
so that bins in feature space are examined in order of
their closest distance from the query feature, and by
stopping the search after checking the first 200
nearest-neighbor candidates. BBF provides a
speedup factor of 2 over exhaustive search while
losing about 5% of correct matches. In (Muja &
Lowe, 2009), many different algorithms for
approximate nearest neighbor search were compared
on datasets with a wide range of dimensionality; the
two best-performing algorithms, depending on the
dataset and the desired precision, used either the
hierarchical k-means tree or multiple randomized
kd-trees.
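As a rough illustration of the BBF idea (a simplified sketch, not Lowe's implementation), the following builds a plain kd-tree and then searches branches in order of their bin distance to the query, stopping after a fixed number of node visits, which turns the exact search into an approximate one:

```python
import heapq
import numpy as np

def build_kdtree(pts, idx=None, depth=0):
    """Recursively build a kd-tree; each node splits on the axis cycling with depth."""
    if idx is None:
        idx = np.arange(len(pts))
    if len(idx) == 0:
        return None
    axis = depth % pts.shape[1]
    order = idx[np.argsort(pts[idx, axis])]
    mid = len(order) // 2
    return {"i": order[mid], "axis": axis,
            "left": build_kdtree(pts, order[:mid], depth + 1),
            "right": build_kdtree(pts, order[mid + 1:], depth + 1)}

def bbf_nearest(pts, root, q, max_checks=200):
    """Best-Bin-First search: visit nodes in order of their bin distance to the
    query and stop after max_checks visits (approximate nearest neighbour)."""
    best_i, best_d = -1, np.inf
    heap = [(0.0, 0, root)]  # (bin distance, tie-breaker, node)
    tick = checks = 0
    while heap and checks < max_checks:
        _, _, node = heapq.heappop(heap)
        if node is None:
            continue
        checks += 1
        d = np.sum((pts[node["i"]] - q) ** 2)
        if d < best_d:
            best_i, best_d = node["i"], d
        axis, split = node["axis"], pts[node["i"], node["axis"]]
        near, far = ((node["left"], node["right"]) if q[axis] < split
                     else (node["right"], node["left"]))
        tick += 1
        heapq.heappush(heap, (0.0, tick, near))  # near side first
        tick += 1
        heapq.heappush(heap, ((q[axis] - split) ** 2, tick, far))  # far side by bin distance
    return best_i
```

With `max_checks` at least the number of points the search degenerates to an exact one; shrinking it trades correct matches for speed, which is the trade-off the 200-candidate cutoff exploits.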
In this paper, a novel strategy, distinctly
different from all three of the above-mentioned
strategies, is introduced to accelerate the SIFT
feature matching step. The paper's contribution is
summarized in two points.
Firstly, in the key-point detection stage, the SIFT
features are split into two types, Maxima and
Minima, without extra computational cost, and at the
matching stage only features of the same type are
compared, since a correct match cannot be expected
between two features of different types.
Secondly, in the orientation assignment stage, the
SIFT feature is extended by a new attribute, again
without extra computational cost. The new attribute
is the angle between the original SIFT feature
orientation and a second, different orientation.
Hence SIFT features are divided into a few clusters
based on the introduced angle. At the matching
stage, only features with almost the same angle are
compared. The idea behind this is that correct
matches can be expected only between two features
whose angles differ by less than a pre-defined
threshold.
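A minimal sketch of this two-level bucketing (hypothetical names throughout; the number of angle clusters and the adjacent-cluster lookup are our assumptions, not values from the paper) shows how both attributes shrink the candidate set before any descriptor distance is computed:

```python
# Hypothetical feature record: (extremum_type, angle_deg, descriptor).
# Per the text: compare only features of the same extremum type whose
# orientation-difference angles fall into (almost) the same cluster.

def cluster_id(angle_deg, n_clusters=8):
    """Quantise the angle between the two orientations into a cluster."""
    return int(angle_deg % 360 // (360 / n_clusters))

def build_buckets(features, n_clusters=8):
    """Index feature positions by (extremum type, angle cluster)."""
    buckets = {}
    for i, (ftype, angle, _) in enumerate(features):
        buckets.setdefault((ftype, cluster_id(angle, n_clusters)), []).append(i)
    return buckets

def candidates(buckets, ftype, angle, n_clusters=8):
    """Candidate matches for a query feature: same type, same or adjacent
    angle cluster (adjacent clusters absorb angles near a boundary)."""
    c = cluster_id(angle, n_clusters)
    out = []
    for dc in (-1, 0, 1):
        out += buckets.get((ftype, (c + dc) % n_clusters), [])
    return out
```

Only features surviving this filter would then be compared by descriptor distance, so the expensive 128-dimensional comparisons are restricted to a fraction of the database.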
The proposed method can be generalized to all
local feature-based matching algorithms that detect
two or more types of key-points (e.g. DoG, LoG,
DoM) and whose descriptors are rotation invariant,
so that two different orientations can be assigned
(e.g. SIFT, SURF, GLOH).
2 ORIGINAL SIFT METHOD
The Scale Invariant Feature Transform (SIFT)
method, proposed by Lowe (Lowe, 2004), takes an
image and transforms it into a set of local features.
The SIFT features are extracted through the
following three stages:
1. Feature Detection and Localization: In this
stage, the locations of potential interest points in the
image are determined by detecting the extrema of the
Difference of Gaussian (DoG) scale space. To search
for scale-space extrema, each pixel in the DoG
images is compared with its 26 neighbors in a
3×3×3 region of scale space. If the pixel is
smaller/larger than all of its neighbors, it is labelled
as a candidate key-point. Each of these key-points is
exactly localized by fitting a 3D quadratic function
computed using a second-order Taylor expansion
around the key-point location. Then key-points are
filtered by discarding points of low contrast and
points that correspond to edges.
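The 26-neighbor extremum test above can be sketched as follows (a naive reference implementation, not an optimized one), where `dog` is a stack of DoG images indexed by (scale, row, column); note that it naturally yields the Maxima/Minima split exploited in this paper at no extra cost:

```python
import numpy as np

def detect_extrema(dog):
    """Label pixels strictly smaller or larger than all 26 neighbours in a
    3x3x3 scale-space neighbourhood of a DoG stack (scales, rows, cols).
    Returns (scale, row, col) lists for minima and maxima separately."""
    minima, maxima = [], []
    S, H, W = dog.shape
    for s in range(1, S - 1):
        for r in range(1, H - 1):
            for c in range(1, W - 1):
                cube = dog[s - 1:s + 2, r - 1:r + 2, c - 1:c + 2]
                v = dog[s, r, c]
                # require a strict extremum: v equals the cube max/min
                # and no neighbour ties with it
                if v == cube.max() and (cube == v).sum() == 1:
                    maxima.append((s, r, c))
                elif v == cube.min() and (cube == v).sum() == 1:
                    minima.append((s, r, c))
    return minima, maxima
```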
2. Feature Orientation Assignment: An orientation
is assigned to each key-point based on local image
gradient data. For each pixel in a certain region
R
VISAPP 2010 - International Conference on Computer Vision Theory and Applications