2.1 Feature Detection
The Scale-Invariant Feature Transform (SIFT) (Lowe,
1999) has become something of an industry standard.
It includes both a detector and a descriptor. The
detector is based on computing a Difference of
Gaussians (DoG) at several scales.
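As a rough illustration of the DoG idea, the sketch below blurs a one-dimensional signal at two scales and subtracts the results. It is not Lowe's implementation: the function names, the scale ratio k = 1.6, the 3σ kernel truncation, and the border clamping are all assumptions made for this example.

```python
import math

def gaussian_kernel(sigma):
    """Return a normalized 1D Gaussian kernel truncated at about 3*sigma."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(x * x) / (2 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur(signal, sigma):
    """Convolve a 1D signal with a Gaussian, clamping indices at the borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * signal[j]
        out.append(acc)
    return out

def difference_of_gaussians(signal, sigma, k=1.6):
    """Blur at scale k*sigma minus blur at scale sigma (k is an assumed ratio)."""
    fine = blur(signal, sigma)
    coarse = blur(signal, k * sigma)
    return [c - f for c, f in zip(coarse, fine)]

# A step edge: the response vanishes in flat regions and is non-zero
# only in the neighbourhood of the edge.
signal = [0.0] * 20 + [1.0] * 20
dog = difference_of_gaussians(signal, sigma=1.0)
```

A detector built on this principle would repeat the computation over a range of σ values and look for extrema of the response across both position and scale.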
Partially inspired by SIFT, the Speeded-Up Robust
Features (SURF) detector (Bay et al., 2006) uses
integral images and Hessian determinants. SURF and
SIFT are often used as baselines in evaluations of
other detectors.
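The appeal of integral images is that the sum over any axis-aligned rectangle costs four lookups regardless of its size. A minimal sketch follows; the function names and the half-open rectangle convention are choices made for this example, not taken from the SURF paper.

```python
def integral_image(img):
    """Summed-area table with an extra leading row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def box_sum(ii, top, left, bottom, right):
    """Sum of pixels in rows top..bottom-1 and columns left..right-1."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]

img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
ii = integral_image(img)
total = box_sum(ii, 0, 0, 3, 3)         # whole image
lower_right = box_sum(ii, 1, 1, 3, 3)   # the 2x2 block 5, 6, 8, 9
```

Box filters of any size can then be evaluated in constant time per pixel.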
The detector chosen for our experiments was proposed
by Xie et al. (2010) and is inspired by Mikolajczyk
and Schmid (2004), particularly their use of a
multi-scale Harris operator. However, instead of
increasing the scale incrementally, Xie et al. examined
a large set of pictures to determine which scales should
be evaluated so that as many features as possible are
discovered in only one scale each. Then, weak corners
are culled using the Hessian determinant. As the
fundamental operators are the Harris operator and the
Hessian determinant, it is called the "Harris-Hessian
detector".
2.2 Feature Description
SIFT, SURF, and many other descriptors use strategies
that are variations of histograms of oriented gradients
(HOG). The area around each keypoint in an image is
divided into a grid of sub-cells. For each sub-cell,
the image gradients are computed, and a histogram of
their orientations is built. These histograms then make
up the descriptor. SURF, while based on the same
principle, uses Haar wavelets instead of gradients.
The resulting descriptor vectors are of high dimension
(typically 64 to 128 elements) and can be compared
using, e.g., Euclidean distance.
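The per-cell histogram and the distance computation can be sketched as follows. This is a simplified illustration rather than the SIFT or SURF descriptor: the central-difference gradients, the eight bins, the magnitude weighting, and the function names are assumptions of the example.

```python
import math

def orientation_histogram(cell, bins=8):
    """Histogram of gradient orientations in one cell, weighted by magnitude."""
    hist = [0.0] * bins
    h, w = len(cell), len(cell[0])
    # Border pixels are skipped so that central differences stay in range.
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            dx = cell[y][x + 1] - cell[y][x - 1]
            dy = cell[y + 1][x] - cell[y - 1][x]
            magnitude = math.hypot(dx, dy)
            angle = math.atan2(dy, dx) % (2 * math.pi)
            hist[int(angle / (2 * math.pi) * bins) % bins] += magnitude
    return hist

def euclidean_distance(a, b):
    """Euclidean distance between two descriptor vectors of equal length."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

# Two 4x4 cells: intensity rising left-to-right and top-to-bottom.
horizontal = [[x for x in range(4)] for _ in range(4)]
vertical = [[y for _ in range(4)] for y in range(4)]
h_hist = orientation_histogram(horizontal)
v_hist = orientation_histogram(vertical)
distance = euclidean_distance(h_hist, v_hist)
```

A full descriptor would concatenate the histograms of all sub-cells around the keypoint into one vector before comparison.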
Calonder et al. proposed a new type of descriptor
called Binary Robust Independent Elementary Fea-
tures (BRIEF) (Calonder et al., 2010). Instead of us-
ing HOGs, BRIEF samples one pair of points at a time
around the keypoint, then compares their respective
intensities. Each comparison yields a one or a zero,
and the results are concatenated into a string, forming
a "binary descriptor". The authors do not propose a
single sampling pattern; rather, they consider five
different ones. In each case, the resulting descriptor
is a binary string. The main benefit of binary
descriptors is that they are computationally cheap and
can be compared using the Hamming distance, which can
be implemented efficiently using the XOR operation
followed by a bit count.
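A minimal sketch of a BRIEF-style descriptor and its Hamming comparison is given below. The uniformly random sampling pattern, the patch radius, the bit count, and the function names are assumptions for illustration; the smoothing that BRIEF applies to the patch before sampling is omitted.

```python
import random

def make_pairs(n_bits, patch_radius, seed=0):
    """Draw n_bits pairs of sample offsets uniformly inside the patch."""
    rng = random.Random(seed)

    def point():
        return (rng.randint(-patch_radius, patch_radius),
                rng.randint(-patch_radius, patch_radius))

    return [(point(), point()) for _ in range(n_bits)]

def describe(img, cx, cy, pairs):
    """Pack one intensity comparison per pair into the bits of an integer."""
    bits = 0
    for i, ((ax, ay), (bx, by)) in enumerate(pairs):
        if img[cy + ay][cx + ax] < img[cy + by][cx + bx]:
            bits |= 1 << i
    return bits

def hamming(a, b):
    """Number of differing bits: XOR followed by a population count."""
    return bin(a ^ b).count("1")

# A 9x9 test image in which every pixel has a distinct intensity.
img = [[x + 10 * y for x in range(9)] for y in range(9)]
inverted = [[-v for v in row] for row in img]
pairs = make_pairs(32, patch_radius=3)
d1 = describe(img, 4, 4, pairs)
d2 = describe(inverted, 4, 4, pairs)
```

Because the same sampling pattern is used for every keypoint, two descriptors are comparable bit by bit, and the distance reduces to counting the bits set in `d1 ^ d2`.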
Further work has been done on improving the sampling
pattern of binary descriptors, most notably Oriented
FAST and Rotated BRIEF (ORB) (Rublee et al., 2011),
Binary Robust Invariant Scalable Keypoints (BRISK)
(Leutenegger et al., 2011), and Fast Retina Keypoint
(FREAK) (Alahi et al., 2012).
The descriptor we use in this paper is FREAK
(Alahi et al., 2012), where machine learning is used
to find a sampling pattern that aims to minimize the
number of comparisons needed. FREAK generates a
hierarchical descriptor that allows early-out
comparisons. As FREAK significantly reduces the number
of necessary comparison operations, it is suitable for
mobile platforms with low compute power.
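The early-out idea can be sketched as a two-stage comparison: the leading part of the descriptor is compared first, and the remainder is examined only if that partial distance is small enough. The 64-byte descriptor length, the 16-byte prefix, the threshold, and the function names below are assumptions for illustration, not values taken from this paper.

```python
def popcount(x):
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")

def cascade_distance(a, b, coarse_bytes=16, coarse_threshold=20):
    """Hamming distance with early rejection.

    a and b are equal-length bytes objects. Returns None when the
    distance over the first coarse_bytes already exceeds the threshold.
    """
    coarse = sum(popcount(x ^ y)
                 for x, y in zip(a[:coarse_bytes], b[:coarse_bytes]))
    if coarse > coarse_threshold:
        return None  # early out: the remaining bytes are never touched
    fine = sum(popcount(x ^ y)
               for x, y in zip(a[coarse_bytes:], b[coarse_bytes:]))
    return coarse + fine

zero = bytes(64)
differs_early = bytes([0xFF] * 16 + [0x00] * 48)
differs_late = bytes([0x00] * 16 + [0x0F] * 48)
rejected = cascade_distance(zero, differs_early)
accepted = cascade_distance(zero, differs_late)
```

Most candidate pairs in a matching task are poor matches, so rejecting them after the first stage saves the bulk of the comparison work.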
2.3 OpenCL
OpenCL¹ is an open framework for executing programs
on heterogeneous computers; its model is well suited
for executing programs on GPUs. It is very similar to
the Nvidia-specific CUDA framework. It was chosen for
this project because it is supported on both embedded
and desktop devices, such as the Adreno 330 and the
Nvidia GTX 660, respectively, allowing us to run the
same implementation in multiple environments.
3 OUR APPROACH: HARRIS-HESSIAN + FREAK
3.1 Harris-Hessian Detector
The detector consists of two steps: discovering Harris
corners (Harris and Stephens, 1988) using a
Harris-affine-like detector (Mikolajczyk and Schmid,
2004) on nine pre-selected scales as well as two
additional scales surrounding the most populated one,
then culling weak points using a measure derived from
the Hessian determinant.
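The choice of the two additional scales can be sketched as follows. The σ values, the corner counts, and the factor √2 used to place the surrounding scales are hypothetical placeholders chosen for this example; they are not the values used by Xie et al.

```python
import math

def scales_to_examine(corner_counts, spread=math.sqrt(2)):
    """Pick the most populated scale and two scales surrounding it.

    corner_counts maps a scale sigma to the number of Harris corners
    found at that scale. spread is an assumed placement factor.
    """
    characteristic = max(corner_counts, key=corner_counts.get)
    extra = [characteristic / spread, characteristic * spread]
    return characteristic, extra

# Hypothetical corner counts for four pre-selected scales.
counts = {1.0: 120, 2.0: 340, 4.0: 510, 8.0: 260}
characteristic, extra = scales_to_examine(counts)
```

The Harris detector would then be run once more at the two returned scales, and the resulting corners added to those already found.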
The Harris-Hessian detector was proposed by Xie et al.
in 2009 and elaborated by them in 2010 (Xie et al.,
2010). It is essentially a variation of the
Harris-affine detector combined with the Hessian
determinant, which is used to cull "bad" keypoints. As
the name suggests, the detector consists of two steps:
the Harris step and the Hessian step.
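The two underlying measures can be sketched for a single pixel as follows. This is a simplified illustration and not the implementation of Xie et al.: it uses finite differences, a box window in place of Gaussian weighting, and the commonly used constant k = 0.04, and the exact culling measure derived from the Hessian determinant is not reproduced. All function names are chosen for this example.

```python
def harris_response(img, x, y, radius=1, k=0.04):
    """Harris corner measure at (x, y) from a box-windowed structure tensor."""
    sxx = syy = sxy = 0.0
    for v in range(y - radius, y + radius + 1):
        for u in range(x - radius, x + radius + 1):
            ix = (img[v][u + 1] - img[v][u - 1]) / 2.0
            iy = (img[v + 1][u] - img[v - 1][u]) / 2.0
            sxx += ix * ix
            syy += iy * iy
            sxy += ix * iy
    det = sxx * syy - sxy * sxy
    trace = sxx + syy
    return det - k * trace * trace

def hessian_determinant(img, x, y):
    """Determinant of the Hessian at (x, y) using second finite differences."""
    dxx = img[y][x + 1] - 2 * img[y][x] + img[y][x - 1]
    dyy = img[y + 1][x] - 2 * img[y][x] + img[y - 1][x]
    dxy = (img[y + 1][x + 1] - img[y + 1][x - 1]
           - img[y - 1][x + 1] + img[y - 1][x - 1]) / 4.0
    return dxx * dyy - dxy * dxy

# A 12x12 image with a bright square in the lower-right quadrant.
img = [[1.0 if x >= 6 and y >= 6 else 0.0 for x in range(12)]
       for y in range(12)]
corner = harris_response(img, 6, 6)   # corner of the square
edge = harris_response(img, 9, 6)     # point on its top edge
flat = harris_response(img, 3, 3)     # uniform region
```

The response is positive at the corner, negative on the edge, and zero in the flat region, which is the behaviour the Harris step relies on; the Hessian determinant is likewise larger at the corner than on the edge.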
The Harris step finds Harris corners (see Figure 1)
at gradually larger scales (denoted σ), then reexamines
the scales around the σ where the largest number of
corners was found. This σ is said to be the
characteristic scale of the image. To reduce the
likelihood of discovering the same corners in multiple
scales, Xie
¹ Official webpage of the OpenCL standard:
https://www.khronos.org.
ICPRAM 2016 - International Conference on Pattern Recognition Applications and Methods