lation (Mikolajczyk and Schmid, 2005). Their per-
formance is consistent across different experimental
settings (Moreels and Perona, 2007). At the same
time, while there have been extensions to multispec-
tral images (see e.g. (Brown and Susstrunk, 2011)),
the studies performed on hyperspectral images have
been comparatively limited. Hyperspectral images
substantially differ from RGB or monochrome images
because of, among other things, a much larger data size,
varying sensor performance at different frequencies,
complex statistical relationships between recorded
spectra, variation in data resulting from the push-broom
recording scheme, and varying noise across the frequency
spectrum (Mukherjee et al., 2009; Vakalopoulou and
Karantzalos, 2014). While they can be reduced to
monochrome images (which could then serve as simple input
to classical interest point algorithms), this conversion
is not trivial and may lose structural information
(Dorado-Munoz et al., 2012). Most approaches in the
hyperspectral domain are based on the SIFT algorithm:
as classification support (Xu et al., 2008), as an extension
of the algorithm (Mukherjee et al., 2009; Dorado-Munoz
et al., 2012), for aligning image strips for change detection
(Ringaby et al., 2010), or for optimizing parameters
for hyperspectral image matching (Sima and Buckley,
2013). A different approach is taken in (Vakalopoulou
and Karantzalos, 2014), where SIFT and SURF are
combined to work with groups of spectral bands.
We identify two important practical shortcomings
of current studies. One is the lack of
significant geometric deformations in the
test data sets: the images currently used differ only by time
of acquisition and selected affine parameters
(translation (Mukherjee et al., 2009; Ringaby et al.,
2010; Dorado-Munoz et al., 2012; Vakalopoulou
and Karantzalos, 2014) and scale (Dorado-Munoz
et al., 2012)), obtained by a down-looking satellite- or
plane-mounted camera. The only exception is (Sima
and Buckley, 2013), where tripod-acquired geological
data show some geometric deformations. The second
problem is the lack of side-by-side comparison of the
performance of different methods. The focus is commonly
on only one method, even at the verification stage
(Xu et al., 2008; Mukherjee et al., 2009; Ringaby
et al., 2010; Dorado-Munoz et al., 2012; Sima and
Buckley, 2013). The one exception is (Vakalopoulou
and Karantzalos, 2014), where SIFT and SURF are
compared.
Our focus in this paper is the investigation of the
performance of interest point descriptors on hyperspectral
images of a 3D scene. We make two novel
contributions: first, we compare four separate de-
scriptor algorithms: SIFT (Lowe, 2004), SURF (Bay
et al., 2008), ORB (Rublee et al., 2011) and BRISK
(Leutenegger et al., 2011); second, we use a specially
prepared dataset of a scene with mixed natural and man-made
objects, imaged from different viewpoints.
Our experimental setting is as follows: we use the interest
point algorithms to detect and match points in
two images, then evaluate them based on the quality of the
resulting estimate of relative 3D camera positions.
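As a rough illustration of how the quality of a relative pose estimate could be scored, the sketch below computes the angle of the residual rotation between an estimated and a ground-truth relative rotation. This particular error measure, the helper names `rotation_y` and `rotation_error_deg`, and the example angles are assumptions made for illustration only; they are not taken from the evaluation protocol of this paper.

```python
import math

def rotation_y(deg):
    """3x3 rotation matrix for a rotation of `deg` degrees about the y axis."""
    t = math.radians(deg)
    c, s = math.cos(t), math.sin(t)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def rotation_error_deg(r_est, r_true):
    """Angle (degrees) of the residual rotation R_est^T R_true (assumed metric)."""
    # trace(R_est^T R_true) equals the sum of element-wise products.
    trace = sum(r_est[i][j] * r_true[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against rounding before acos.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))

# Hypothetical example: true relative rotation of 11 degrees, estimate of 12.5.
error = rotation_error_deg(rotation_y(12.5), rotation_y(11.0))
print(round(error, 3))
```

A smaller angle indicates that the matched points led to a relative camera orientation closer to the ground truth; a translation-direction error could be defined analogously.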
This paper is organized as follows: the next section
presents the experimental setting, the third section
presents the results, and the last section
presents the discussion and concluding remarks.
2 METHODS
Data Set. To compare the descriptors, we use a specially
prepared data set that allows image processing
methods to be tested on images with significant geometric
deformations, resulting from hyperspectral imaging of
a 3D scene with a total viewpoint change of about 45°.²
To the best of our knowledge, this is the first data set of
this kind.
We use a scene (cf. Figure 1) containing both nat-
ural and artificial fruits of several categories. This
produces images rich in structure in both the visual and
NIR spectral ranges (in the former, color-based edges
are the strongest; in the latter, neighborhoods of
materials of different types). The scene also contains
checkerboard-type markers for calibration and
ground-truth estimation, and a Munsell grey panel for
light calibration. The scene is lit with a multi-point
halogen light, supported by a UV lamp (Omnilux CFL
UV 25W with color temperature 6000 UV K). Images
are recorded with a Surface Optics SOC-710VP 375-1045 nm
camera from five viewpoints. The angle steps
are at ≈ 11° intervals; this choice is based on the analysis
of (Moreels and Perona, 2007), where it was
observed that a viewpoint change of more than 30°
drastically reduces feature matching effectiveness.
Descriptors. For comparison, we select four de-
scriptors: SIFT (Lowe, 2004) and SURF (Bay et al.,
2008) because of their popularity and reported good
performance; and ORB (Rublee et al., 2011) and
BRISK (Leutenegger et al., 2011), proposed as alternative
descriptors with good time efficiency. We use
the implementations available in the OpenCV library
(Bradski, 2000). For matching, we use ratio filtering
(Lowe, 2004): we only consider points for which the
ratio of the distance to the first nearest neighbour to the
distance to the second nearest neighbour is lower than
r₀ = 0.8, thus excluding points that
could be matched comparably well to several locations.
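The ratio filtering described above can be sketched in a few lines. The following is a minimal brute-force illustration in plain Python, not the OpenCV-based implementation used in the experiments; the function name `ratio_filter`, the toy descriptors, and the use of Euclidean distance are assumptions made for the example.

```python
import math

def ratio_filter(desc_a, desc_b, r0=0.8):
    """Return (i, j) index pairs that pass the nearest-neighbour ratio test.

    desc_a, desc_b: lists of equal-length numeric descriptor vectors.
    Descriptor i of desc_a is kept only if d1 / d2 < r0, where d1 and d2 are
    its distances to the nearest and second-nearest descriptors in desc_b.
    """
    matches = []
    for i, a in enumerate(desc_a):
        # Distances from a to every candidate in desc_b, sorted ascending.
        dists = sorted((math.dist(a, b), j) for j, b in enumerate(desc_b))
        if len(dists) < 2:
            continue  # the ratio is undefined with fewer than two candidates
        (d1, j1), (d2, _) = dists[0], dists[1]
        # Written as d1 < r0 * d2 to avoid dividing by zero when d2 == 0.
        if d1 < r0 * d2:
            matches.append((i, j1))
    return matches

# Toy 2D "descriptors" (hypothetical values, for illustration only).
desc_a = [[0.0, 0.0], [5.0, 5.0]]
desc_b = [[0.1, 0.0], [4.0, 4.0], [6.0, 6.0]]
print(ratio_filter(desc_a, desc_b))  # [(0, 0)]
```

The second descriptor is rejected because it is equally close to two candidates, which is exactly the ambiguous case the filter is meant to exclude. Euclidean distance suits real-valued descriptors such as SIFT and SURF; binary descriptors such as ORB and BRISK are normally compared with the Hamming distance instead.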
² The dataset will be made available online; the link has been
removed for anonymization purposes.