A Comparative Evaluation of 3D Keypoint Detectors in a RGB-D Object
Dataset
Sílvio Filipe and Luís A. Alexandre
IT - Instituto de Telecomunicações, University of Beira Interior, Department of Informatics, 6200-001 Covilhã, Portugal
Keywords:
3D Keypoints, 3D Interest Points, 3D Object Recognition, Performance Evaluation.
Abstract:
When processing 3D point cloud data, features must be extracted from a small set of points, usually called
keypoints. This is done to avoid the computational complexity required to extract features from all points in a
point cloud. There are many keypoint detectors, and this suggests the need for a comparative evaluation. When
the keypoint detectors are applied to 3D objects, the aim is to detect a few salient structures which can be used,
instead of the whole object, for applications like object registration, retrieval and data simplification. In this
paper, we describe and evaluate the existing keypoint detectors in a publicly available point cloud library with real objects, and perform a comparative evaluation on 3D point clouds. We evaluate the
invariance of the 3D keypoint detectors according to rotations, scale changes and translations. The evaluation
criteria used are the absolute and the relative repeatability rate. Using these criteria, we evaluate the robustness
of the detectors with respect to changes of the point-of-view. In our experiments, the method that achieved the best repeatability rate was ISS3D.
1 INTRODUCTION
The computational cost of descriptors is generally high, so it does not make sense to extract descriptors from all points in the cloud. Thus, keypoint detectors are used to select interest points in the cloud, and descriptors are then computed at these locations. The purpose of keypoint detectors is to determine the points that are distinctive, in order to allow an efficient object description and correspondence with respect to point-of-view variations (Mian et al., 2010).
This work is motivated by the need to quan-
titatively compare different keypoint detector ap-
proaches, in a common and well-established experimental framework, given the large number of available keypoint detectors. Inspired by the work on 2D features (Schmid et al., 2000; Mikolajczyk et al., 2005) and on 3D features (Salti et al., 2011), and by a similar work on descriptor evaluation (Alexandre, 2012), a comparison of several 3D keypoint detectors is made in this work. In relation to the work of Schmid et al. (2000) and Salti et al. (2011), our novelty lies in the use of a real database instead of an artificial one, in the large number of 3D point clouds, and in the different keypoint detectors evaluated.
Figure 1: Keypoint detector evaluation pipeline used in this paper (input cloud → transformation → keypoint detection → inverse transformation → correspondence with the keypoints detected directly on the input cloud).

Regarding the paper by Filipe and Alexandre (2013), the current paper adds four new keypoint detectors to the evaluation and assesses the computational complexity of each method through the time it spends in the experiments. The benefit of using real 3D point clouds is that they reflect what happens in real life, for instance in robot vision: a robot never “sees” a perfect or complete object, like the ones provided by artificial objects.
The keypoint detectors evaluation pipeline used in
this paper is presented in figure 1. To evaluate the
invariance of keypoint detection methods, we extract
the keypoints directly from the original cloud. We also apply a transformation to the original 3D point cloud before extracting another set of keypoints.
Once we get these keypoints from the transformed
cloud, we apply an inverse transformation, so that we
can compare these with the keypoints extracted from
the original cloud. If a particular method is invari-
ant to the applied transformation, the keypoints ex-
tracted directly from the original cloud should corre-
spond to the keypoints extracted from the cloud where
the transformation was applied.
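As a minimal sketch of this pipeline (our own illustration in Python/NumPy, not the PCL code used in the experiments), the following applies a rotation and a translation to a cloud, detects keypoints on both the original and the transformed cloud with a hypothetical placeholder detector, maps the latter back through the inverse transformation, and counts the correspondences that fall within a distance threshold ε.

import numpy as np

def rotation_z(angle_deg):
    """Rotation matrix around the Z axis (one of the evaluated transformations)."""
    a = np.radians(angle_deg)
    return np.array([[np.cos(a), -np.sin(a), 0.0],
                     [np.sin(a),  np.cos(a), 0.0],
                     [0.0,        0.0,       1.0]])

def detect_keypoints(cloud):
    """Hypothetical placeholder detector; any of the PCL detectors would be used here."""
    return cloud[::50]

def repeatability(cloud, R, t, eps=0.0025):
    """Detect keypoints on the original and on the transformed cloud, undo the
    transformation, and count the keypoints that match within `eps` (in meters)."""
    kp_orig = detect_keypoints(cloud)
    kp_trans = detect_keypoints(cloud @ R.T + t)      # keypoints of the transformed cloud
    kp_back = (kp_trans - t) @ R                      # inverse transform (R^-1 = R^T)
    dists = np.linalg.norm(kp_orig[:, None, :] - kp_back[None, :, :], axis=2)
    repeatable = int(np.sum(dists.min(axis=1) < eps))
    return repeatable, repeatable / max(len(kp_orig), 1)

cloud = np.random.rand(5000, 3) * 0.2                 # stand-in for a segmented object cloud
print(repeatability(cloud, rotation_z(5.0), np.array([0.01, 0.02, 0.03])))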
The low price of 3D cameras has exponentially increased the interest in using depth information for solving vision tasks. A useful resource for users of
this type of sensors is the Point Cloud Library (PCL)
(Rusu and Cousins, 2011) which contains many al-
gorithms that deal with point cloud data, from seg-
mentation to recognition, from search to input/output.
This library is used to deal with real 3D data and we
used it to evaluate the robustness of the detectors with
variations of the point-of-view in real 3D data.
The organization of this paper is as follows: the
next section presents a detailed description of the
methods that we evaluate; the results and the discus-
sion appear in section 3; and finally, we end the paper
in section 4 with the conclusions.
2 EVALUATED 3D KEYPOINT
DETECTORS
Our goal was to evaluate the keypoint detectors available in the current PCL version (1.7 pre-release, as of June 2013).
There are some keypoint detectors in PCL which
we will not consider in this paper, since they are not
applicable to point cloud data directly or only sup-
port 2D point clouds. These are: Normal Aligned Ra-
dial Feature (NARF) (Steder et al., 2010), which assumes the data to be represented by a range image (a 2D image showing the distance to points in a scene from a specific point); and AGAST (Mair et al., 2010) and Binary Robust Invariant Scalable Keypoints (BRISK) (Leutenegger et al., 2011), which only support 2D point clouds.
2.1 Harris3D
The Harris method (Harris and Stephens, 1988) is a corner and edge based method; such methods are characterized by their response to high intensity changes in the horizontal and vertical directions. These features can be used in shape and motion analysis and can be detected directly from grayscale images. For the 3D case, the adjustment made in PCL for the Harris3D detector replaces the image gradients by surface normals, with which the covariance matrix Cov is calculated around each point in a 3×3 neighborhood. The keypoint response measured at each point with coordinates (x,y,z) is then defined by
$r(x,y,z) = \det(Cov(x,y,z)) - k\,(\mathrm{trace}(Cov(x,y,z)))^2$,   (1)
where k is a positive real valued parameter. This pa-
rameter serves roughly as a lower bound for the ratio
between the magnitude of the weaker edge and that of
the stronger edge.
To prevent too many keypoints from lumping to-
gether closely, a non-maximal suppression process on
the keypoints response image is usually carried out
to suppress weak keypoints around the stronger ones,
followed by a thresholding process.
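A rough sketch of this response computation, assuming the surface normals are already available; the neighbourhood radius, the value of k and the brute-force radius search are illustrative choices, not the PCL defaults.

import numpy as np

def harris3d_responses(points, normals, radius=0.01, k=0.04):
    """Harris-style response per point from the covariance of the surface
    normals in a spherical neighbourhood (cf. equation (1)). Illustrative only."""
    responses = np.zeros(len(points))
    for i, p in enumerate(points):
        nbrs = normals[np.linalg.norm(points - p, axis=1) < radius]
        if len(nbrs) < 3:
            continue
        cov = np.cov(nbrs.T)                       # 3x3 covariance of the normals
        responses[i] = np.linalg.det(cov) - k * np.trace(cov) ** 2
    return responses

def non_maximum_suppression(points, responses, radius=0.01, threshold=1e-6):
    """Keep only the points whose response is the local maximum and above a threshold."""
    keep = []
    for i, p in enumerate(points):
        mask = np.linalg.norm(points - p, axis=1) < radius
        if responses[i] > threshold and responses[i] >= responses[mask].max():
            keep.append(i)
    return np.asarray(keep, dtype=int)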
2.2 Harris3D Variants
In the PCL, we can find two variants of the Harris3D
keypoint detector: these are called Lowe and Noble.
The difference between them is the function that defines the keypoint response (equation 1). For the Lowe method the keypoint response is given by:

$r(x,y,z) = \dfrac{\det(Cov(x,y,z))}{(\mathrm{trace}(Cov(x,y,z)))^2}$.   (2)

The keypoint response of the Noble method is given by:

$r(x,y,z) = \dfrac{\det(Cov(x,y,z))}{\mathrm{trace}(Cov(x,y,z))}$.   (3)
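The corresponding response functions of equations (2) and (3), in the same sketch style as above (the zero-trace guard is our addition):

import numpy as np

def lowe_response(cov):
    """Equation (2): det(Cov) / trace(Cov)^2."""
    tr = np.trace(cov)
    return np.linalg.det(cov) / tr ** 2 if tr != 0.0 else 0.0

def noble_response(cov):
    """Equation (3): det(Cov) / trace(Cov)."""
    tr = np.trace(cov)
    return np.linalg.det(cov) / tr if tr != 0.0 else 0.0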
2.3 Kanade-Lucas-Tomasi
The Kanade-Lucas-Tomasi (KLT) detector (Tomasi
and Kanade, 1991) was proposed a few years after
the Harris detector. In the 3D version presented in
the PCL, this keypoint detector has the same basis as
the Harris3D detector. The main differences are that the covariance matrix is calculated directly on the input data instead of on the surface normals, and that the keypoint response is the first eigenvalue of the covariance matrix around each point in a 3×3 neighborhood. The suppression process is similar to the one used in the Harris3D method, with the weakest keypoints removed by thresholding the smallest eigenvalues.
2.4 Curvature
Surface curvature has been used extensively in the lit-
erature for cloud simplification and smoothing (Des-
brun et al., 1999), object recognition (Yamany and
Farag, 2002) and segmentation (Jagannathan and
Miller, 2007). However, there is a lack of a sys-
tematic approach in extracting salient local features
or keypoints from an input surface using its
AComparativeEvaluationof3DKeypointDetectorsinaRGB-DObjectDataset
477
local curvature information at multiple scales. The
curvature method in PCL calculates the principal sur-
face curvatures on each point using the surface nor-
mals. The keypoint response is used to suppress weak keypoints around the stronger ones; this process is the same as the one performed in the Harris3D method.
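A sketch of one common way to obtain such a curvature-based response from the normals: project the neighbouring normals onto the tangent plane at the query point and take the eigenvalues of their covariance as principal-curvature estimates. This is roughly the spirit of PCL's principal-curvature estimation, though the exact implementation may differ; the radius is illustrative.

import numpy as np

def curvature_responses(points, normals, radius=0.01):
    """Principal-curvature-style response per point: largest eigenvalue of the
    covariance of the neighbouring normals projected onto the tangent plane."""
    responses = np.zeros(len(points))
    for i, (p, n) in enumerate(zip(points, normals)):
        nbrs = normals[np.linalg.norm(points - p, axis=1) < radius]
        if len(nbrs) < 3:
            continue
        proj = np.eye(3) - np.outer(n, n)                 # projector onto the tangent plane
        projected = nbrs @ proj.T
        eigvals = np.linalg.eigvalsh(np.cov(projected.T)) # ascending order
        responses[i] = eigvals[-1]                        # largest principal variation
    return responses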
2.5 SIFT3D
The Scale Invariant Feature Transform (SIFT) key-
point detector was proposed by Lowe (2001). The
SIFT features are vectors that represent local cloud measurements. The main steps used
by the SIFT detector when locating keypoints are pre-
sented below.
The original algorithm for 3D data was presented
by Flint et al. (2007), which uses a 3D version of the Hessian to select interest points. A density func-
tion f (x,y, z) is approximated by sampling the data
regularly in space. A scale space is built over the den-
sity function, and a search is made for local maxima
of the Hessian determinant.
The input cloud I(x,y,z) is convolved with a number of Gaussian filters whose standard deviations $\{\sigma_1, \sigma_2, \ldots\}$ differ by a fixed scale factor. That is, $\sigma_{j+1} = k\sigma_j$, where k is a constant scalar that should be set to $\sqrt{2}$. The convolutions yield smoothed images, denoted by

$G(x, y, z, \sigma_j), \quad j = 1, \ldots, n$.   (4)
The adjacent smoothed images are then subtracted
to yield a small number (3 or 4) of Difference-of-
Gaussian (DoG) clouds, by
$D(x, y, z, \sigma_j) = G(x, y, z, \sigma_{j+1}) - G(x, y, z, \sigma_j)$.   (5)
These two steps are repeated, yielding a number
of DoG clouds over the scale space.
Once DoG clouds have been obtained, keypoints
are identified as local minima/maxima of the DoG
clouds across scales. This is done by comparing
each point in the DoG clouds to its eight neighbors at
the same scale and nine corresponding neighborhood
points in each of the neighborhood scales. If the point
value is the maximum or minimum among all com-
pared points, it is selected as a candidate keypoint.
The keypoints identified from the above steps are
then examined for possible elimination if the two lo-
cal principal curvatures of the intensity profile around
the keypoint exceed a specified threshold value. This
elimination step involves estimating the ratio between
the eigenvalues of the Hessian matrix (i.e., the
second partial derivatives) of the local cloud intensity
around each keypoint.
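A rough sketch of the scale-space construction and extrema detection described above (equations (4) and (5)), over a density function sampled on a regular 3D grid; this follows the Flint et al. description rather than the exact PCL code, and SciPy's ndimage filters are used for the smoothing and the neighbourhood comparisons.

import numpy as np
from scipy.ndimage import gaussian_filter, maximum_filter, minimum_filter

def dog_keypoints(density, sigma0=1.0, k=np.sqrt(2), n_scales=5):
    """Difference-of-Gaussian extrema over a 3D density grid (equations (4)-(5)).
    `density` is f(x, y, z) sampled on a regular voxel grid; parameters are illustrative."""
    sigmas = [sigma0 * k ** j for j in range(n_scales)]
    smoothed = [gaussian_filter(density, s) for s in sigmas]        # G(x, y, z, sigma_j)
    dogs = [smoothed[j + 1] - smoothed[j] for j in range(n_scales - 1)]
    keypoints = []
    for j in range(1, len(dogs) - 1):            # compare against the neighbouring scales
        d = dogs[j]
        is_max = (d == maximum_filter(d, size=3)) & \
                 (d > maximum_filter(dogs[j - 1], size=3)) & \
                 (d > maximum_filter(dogs[j + 1], size=3))
        is_min = (d == minimum_filter(d, size=3)) & \
                 (d < minimum_filter(dogs[j - 1], size=3)) & \
                 (d < minimum_filter(dogs[j + 1], size=3))
        for x, y, z in zip(*np.where(is_max | is_min)):
            keypoints.append((x, y, z, sigmas[j]))
    return keypoints

# Hypothetical usage on a random density volume:
# kps = dog_keypoints(np.random.rand(32, 32, 32))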
2.6 SUSAN
The Smallest Univalue Segment Assimilating Nu-
cleus (SUSAN) corner detector has been introduced
by Smith and Brady (1997). Many corner detectors
using various criteria for determining “cornerness” of
image points are described in the literature (Smith and
Brady, 1997). SUSAN is a generic low-level image
processing technique, which apart from corner detec-
tion has also been used for edge detection and noise
suppression.
The significance of the thresholding step with the fixed value $g = \frac{n_{max}}{2}$ (geometric threshold) is simply a precise restatement of the SUSAN principle: if the nucleus lies on a corner then the Univalue Segment Assimilating Nucleus (USAN) area will be less than half of its possible value, $n_{max}$. The USAN is a measure of how similar a center pixel's intensity is to those in its neighborhood. The gray value similarity function $s(g_1, g_2)$ measures the similarity between the gray values $g_1$ and $g_2$. $s$ is meant to be similar in shape to a step function

$X_t : [0,255]^2 \to [0,1]$,
$(g_1, g_2) \mapsto \begin{cases} 1 & \text{if } |g_1 - g_2| \le t \\ 0 & \text{otherwise} \end{cases}$   (6)
where $t \in [1,256]$ is the brightness difference thresh-
old value. Summing over this kind of function for a
set of pixels is equivalent to counting the number of
similar pixels, i.e., pixels whose gray value difference
is at most t. It can be used to adjust the detector’s
sensitivity to the image’s global contrast level.
SUSAN uses the smooth gray value similarity function

$s_t : [0,255]^2 \to [0,1]$,
$(g_1, g_2) \mapsto e^{-\left(\frac{g_1 - g_2}{t}\right)^6}$   (7)

which is mentioned to perform better than the step function $X_t$. The smoothness of $s_t$ plays an important role in noise suppression (Smith and Brady, 1997), since $s_t$ only depends on the difference between $g_1$ and $g_2$.
To make the method more robust, points closer
in value to the nucleus receive a higher weighting.
Moreover, a set of rules presented in Smith (1992) is used to suppress qualitatively “bad” keypoints. Lo-
cal minima of the SUSANs are then selected from the
remaining candidates.
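A sketch of the USAN computation and the geometric threshold on a 2D grayscale array, using the smooth similarity function of equation (7); the circular mask radius and the value of t are illustrative, and the PCL detector operates on 3D data rather than on an image.

import numpy as np

def usan_area(image, y, x, radius=3, t=25.0):
    """USAN: sum of similarities between the nucleus (y, x) and its circular
    neighbourhood, using the smooth similarity of equation (7)."""
    h, w = image.shape
    nucleus = float(image[y, x])
    area = 0.0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if (dy == 0 and dx == 0) or dy * dy + dx * dx > radius * radius:
                continue
            yy, xx = y + dy, x + dx
            if 0 <= yy < h and 0 <= xx < w:
                area += np.exp(-((float(image[yy, xx]) - nucleus) / t) ** 6)
    return area

def susan_response(image, radius=3, t=25.0):
    """Corner response: g - USAN where the USAN falls below the geometric threshold g."""
    n_max = np.pi * radius * radius          # approximate size of the circular mask
    g = n_max / 2.0                          # geometric threshold
    resp = np.zeros_like(image, dtype=float)
    for y in range(image.shape[0]):
        for x in range(image.shape[1]):
            a = usan_area(image, y, x, radius, t)
            resp[y, x] = g - a if a < g else 0.0
    return resp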
2.7 ISS3D
Intrinsic Shape Signatures (ISS) (Zhong, 2009) is a
method relying on region-wise quality measurements.
VISAPP2014-InternationalConferenceonComputerVisionTheoryandApplications
478
This method uses the magnitude of the smallest eigen-
value (to include only points with large variations
along each principal direction) and the ratio between
two successive eigenvalues (to exclude points having
similar spread along principal directions).
The ISS $S_i = \{F_i, f_i\}$ at a point $p_i$ consists of two components:

1 – The intrinsic reference frame $F_i = \{p_i, \{e^x_i, e^y_i, e^z_i\}\}$, where $p_i$ is the origin and $\{e^x_i, e^y_i, e^z_i\}$ is the set of basis vectors. The intrinsic frame is a characteristic of the local object shape and independent of viewpoint. Therefore, view independent shape features can be computed using the frame as a reference. However, its basis $\{e^x_i, e^y_i, e^z_i\}$, which specifies the vectors of its axes in the sensor coordinate system, is view dependent and directly encodes the pose transform between the sensor coordinate system and the local object-oriented intrinsic frame, thus enabling fast pose calculation and view registration.

2 – The 3D shape feature vector $f_i = (f_{i0}, f_{i1}, \ldots, f_{i(K-1)})$, which is a view independent representation of the local/semi-local 3D shape. These features can be compared directly to facilitate the matching of surface patches or local shapes from different objects.
Only points whose ratio between two succes-
sive eigenvalues is below a threshold are considered.
Among these points, the keypoints are given by the
magnitude of the smallest eigenvalue, so as to con-
sider as keypoints only those points exhibiting a large
variation along every principal direction.
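A sketch of this saliency test: the eigenvalues of the neighbourhood scatter matrix are compared through their successive ratios, and the smallest eigenvalue is kept as the saliency. The original ISS also weights the scatter matrix by local point density, which is omitted here; the thresholds and radius are illustrative.

import numpy as np

def iss_saliency(points, radius=0.01, gamma21=0.975, gamma32=0.975):
    """ISS-style saliency: a point is kept only if lambda2/lambda1 and
    lambda3/lambda2 are below the thresholds; its saliency is lambda3."""
    saliency = np.zeros(len(points))
    for i, p in enumerate(points):
        nbrs = points[np.linalg.norm(points - p, axis=1) < radius]
        if len(nbrs) < 5:
            continue
        cov = np.cov((nbrs - nbrs.mean(axis=0)).T)
        l1, l2, l3 = np.sort(np.linalg.eigvalsh(cov))[::-1]   # descending eigenvalues
        if l1 > 0 and l2 > 0 and l2 / l1 < gamma21 and l3 / l2 < gamma32:
            saliency[i] = l3
    return saliency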
3 EXPERIMENTAL EVALUATION
AND DISCUSSION
3.1 Dataset
To perform the evaluation of keypoint detectors, we
use the large RGB-D Object Dataset¹ (Lai et al.,
2011). This dataset is a hierarchical multi-view ob-
ject dataset collected using an RGB-D camera. The
dataset contains clouds of 300 physically distinct ob-
jects taken from multiple views, organized into 51
categories, containing a total of 207621 segmented
clouds. Examples of some objects are shown in fig-
ure 2. The chosen objects are commonly found in
home and office environments, where personal robots
are expected to operate.
¹ The dataset is publicly available at http://www.cs.washington.edu/rgbd-dataset.

Figure 2: Examples of some objects of the RGB-D Object Dataset.

3.2 Keypoints Correspondence

The correspondence between the keypoints extracted directly from the original cloud and the ones extracted from the transformed cloud is done using the 3D point-
line distance (Weisstein, 2005). A line in three dimen-
sions can be specified by two points $p_1 = (x_1, y_1, z_1)$ and $p_2 = (x_2, y_2, z_2)$ lying on it, from which a vector along the line is produced. The squared distance between a point on the line with parameter $t$ and a point $p_0 = (x_0, y_0, z_0)$ is therefore
$d^2 = [(x_1 - x_0) + (x_2 - x_1)t]^2 + [(y_1 - y_0) + (y_2 - y_1)t]^2 + [(z_1 - z_0) + (z_2 - z_1)t]^2$.   (8)
To minimize the distance, set $d(d^2)/dt = 0$ and solve for $t$ to obtain

$t = -\dfrac{(p_1 - p_0) \cdot (p_2 - p_1)}{|p_2 - p_1|^2}$,   (9)
where · denotes the dot product. The minimum dis-
tance can then be found by plugging t back into equa-
tion 8. Using the vector quadruple product $((A \times B)^2 = A^2 B^2 - (A \cdot B)^2)$ and taking the square root of the result, we obtain

$d = \dfrac{|(p_0 - p_1) \times (p_0 - p_2)|}{|p_2 - p_1|}$,   (10)
where × denotes the cross product. Here, the numer-
ator is simply twice the area of the triangle formed
by points $p_0$, $p_1$, and $p_2$, and the denominator is the
length of one of the bases of the triangle.
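Equation (10) in code form, for reference (a direct transcription, with a small numeric check):

import numpy as np

def point_line_distance(p0, p1, p2):
    """Distance from point p0 to the line through p1 and p2 (equation (10))."""
    p0, p1, p2 = map(np.asarray, (p0, p1, p2))
    return np.linalg.norm(np.cross(p0 - p1, p0 - p2)) / np.linalg.norm(p2 - p1)

# Example: distance from the origin to the line x = 1 (in the z = 0 plane) is 1.
print(point_line_distance([0, 0, 0], [1, 0, 0], [1, 1, 0]))  # -> 1.0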
AComparativeEvaluationof3DKeypointDetectorsinaRGB-DObjectDataset
479
3.3 Measures
The most important feature of a keypoint detector is
its repeatability. This feature takes into account the
capacity of the detector to find the same set of key-
points in different instances of a particular model. The
differences may be due to noise, view-point change, occlusion, or a combination of the above.
The repeatability measure used in this paper is ba-
sed on the measure used in (Schmid et al., 2000) for
2D keypoints and in (Salti et al., 2011) for 3D key-
points. A keypoint $k^i_h$ extracted from the model $M_h$, transformed according to the rotation, translation or scale change $(R_{hl}, t_{hl})$, is said to be repeatable if the distance $d$ (given by equation 10) from its nearest neighbor $k^j_l$, in the set of keypoints extracted from the scene $S_l$, is less than a threshold $\varepsilon$, i.e. $d < \varepsilon$.
We evaluate the overall repeatability of a detector both in relative and absolute terms. Given the set $RK_{hl}$ of repeatable keypoints for an experiment involving the model-scene pair $(M_h, S_l)$, the absolute repeatability is defined as

$r_{abs} = |RK_{hl}|$   (11)

and the relative repeatability is given by

$r = \dfrac{|RK_{hl}|}{|K_{hl}|}$.   (12)
The set $K_{hl}$ is the set of all the keypoints extracted on the model $M_h$ that are not occluded in the scene $S_l$. This set is estimated by aligning the keypoints extracted on $M_h$ according to the rotation, translation and scale, and then checking for the presence of keypoints in $S_l$ in a small neighborhood of the transformed keypoints. If at least one keypoint is present in the scene in such a neighborhood, the keypoint is added to $K_{hl}$.
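A sketch of how the two measures could be computed once the scene keypoints are brought back into the model reference frame; the occlusion radius used here to approximate $K_{hl}$ is an assumption, since the paper only states that a small neighborhood is used.

import numpy as np

def repeatability_rates(model_kps, scene_kps_aligned, eps=0.01, occl_radius=0.02):
    """Absolute (|RK_hl|, equation (11)) and relative (|RK_hl|/|K_hl|, equation (12))
    repeatability. `scene_kps_aligned` are the scene keypoints mapped back into the
    model frame; `occl_radius` is the assumed neighbourhood for the occlusion test."""
    d = np.linalg.norm(model_kps[:, None, :] - scene_kps_aligned[None, :, :], axis=2)
    nearest = d.min(axis=1)               # nearest scene keypoint per model keypoint
    visible = nearest < occl_radius       # K_hl: model keypoints not occluded in the scene
    repeatable = nearest < eps            # candidates for RK_hl
    rk = int(np.sum(visible & repeatable))
    k = max(int(np.sum(visible)), 1)
    return rk, rk / k

# Hypothetical usage with two small keypoint sets (coordinates in meters):
m = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.2, 0.0, 0.0]])
s = np.array([[0.002, 0.0, 0.0], [0.25, 0.0, 0.0]])
print(repeatability_rates(m, s))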
3.4 Results and Discussion
In this article, we intend to evaluate the invariance of
the methods presented, in relation to rotation, transla-
tion and scale changes. For this, we vary the rotation around each of the three axes (X, Y and Z). The rotations applied range from 5° to 45°, in steps of 10°. The translation is performed simultaneously on the three axes, and the displacement applied on each axis is obtained randomly. Finally, we apply random variations (in ]1×, 5×]) to the scale.
In table 1, we present some results for each keypoint detector applied to the original clouds. The percentage of clouds where the keypoint detectors successfully extracted keypoints (more than one) is presented in column 2. Column 3 shows the mean number of keypoints extracted per cloud. Finally, we present the mean computation time (in seconds) spent by each method to extract the keypoints. These times were obtained on a computer with an Intel® Core™ i7-980X Extreme Edition at 3.33 GHz and 24 GB of RAM.

Table 1: Statistics about each keypoint detector. These values come from processing the original clouds.

Keypoint detector | % clouds extracted | Mean keypoints | Mean time (s)
Harris3D          |  99.99             |  85.63         | 1.05
SIFT3D            |  99.68             |  87.46         | 9.54
ISS3D             |  97.97             |  86.24         | 1.07
SUSAN             |  86.51             | 242.38         | 1.64
Lowe              |  99.99             |  85.12         | 1.02
KLT               | 100.00             |  99.16         | 1.03
Curvature         |  99.96             | 119.36         | 0.70
Noble             |  99.99             |  85.12         | 1.04
To make a fair comparison between the detectors, all steps in the pipeline (see figure 1) are the same. Figures 3 and 4 show the results of the evaluation of the different methods under the various applied transformations. The threshold distances (ε) analyzed vary within [0, 2] cm, in a total of 33 equally spaced distances. As we saw in section 2, the methods have a relatively large set of parameters to be adjusted: the values used were the ones set by default in PCL.
Regarding the relative repeatability (shown in figures 3(a), 3(c), 3(e), 3(g), 3(i), 4(a) and 4(c)), the methods presented have a fairly good performance in general. In relation to the rotation (see figures 3(a), 3(c), 3(e), 3(g) and 3(i)), increasing the rotation angle tends to worsen the results of the methods. Ideally, the results of a method should not change, independently of the transformations applied. Regarding the applied rotation, the ISS3D method is the one that provides the best results. For this transformation (rotation), the biggest difference between the various methods appears at the 5 degree rotation. In this case, the ISS3D method achieves almost total keypoint correspondence with a distance between keypoints of 0.25 cm, whereas, for example, SIFT3D only achieves this performance for keypoints at a distance of 1 cm. For both the scaling and the translation (shown in figures 4(a) and 4(c)), the methods exhibit results very similar to those obtained for small rotations (5° rotation in figure 3(a)), with the exception of the SUSAN method, which has a relatively higher invariance to scale changes.
Figure 3: Rotation results represented by the relative and absolute repeatability measures (best viewed in color). The relative repeatability is presented in figures (a), (c), (e), (g) and (i), and the absolute repeatability in figures (b), (d), (f), (h) and (j). The presented neighborhood radius is in meters.

Figure 4: Relative and absolute repeatability measures for the scale change and translation clouds (best viewed in color). The relative repeatability is presented in figures (a) and (c), and the absolute repeatability in figures (b) and (d). The presented neighborhood radius is in meters.

Figures 3(b), 3(d), 3(f), 3(h), 3(j), 4(b) and 4(d) show the absolute repeatability, that is, the number of keypoints obtained by the methods. With these
results we can see that the method that has higher
absolute repeatability (SUSAN) is not the one that
shows the best performance in terms of relative re-
peatability. In terms of the absolute repeatability,
the ISS3D and SIFT3D have better results than the
SUSAN method regarding the invariance to the transformations evaluated in this work.
4 CONCLUSIONS
In this paper, we focused on the keypoint detectors available in the PCL library, explaining how they work, and made a comparative evaluation on publicly available data with real 3D objects. The experimental comparison proposed in this work has outlined aspects of state-of-the-art methods for 3D keypoint detection. This work allowed us to evaluate which methods perform best under various transformations (rotation, scaling and translation).
The novelty of our work compared with the work of Schmid et al. (2000) and Salti et al. (2011) is that we are using a real database instead of an artificial one, the large number of point clouds, and the different keypoint detectors evaluated. The benefit of using a real database is that our objects have “occlusion”. This type of “occlusion” is caused by failures in the infrared sensor of the camera or by the segmentation method. With artificial objects this does not happen, so the keypoint methods may obtain better results there, but our experiments reflect what can happen in real life, for instance with robot vision.
Overall, SIFT3D and ISS3D yielded the best scores in terms of repeatability, and ISS3D proved to be the most invariant. Future work includes extending some of the keypoint detection methodologies to cope with large rotations and occlusions, and evaluating the best combination of keypoint detectors and descriptors.
ACKNOWLEDGEMENTS
This work is supported by FCT - Fundação para a Ciência e Tecnologia (Portugal) through the research grant ‘SFRH/BD/72575/2010’, and by the funding from ‘FEDER - QREN - Type 4.1 - Formação Avançada’, subsidized by the European Social Fund and by Portuguese funds through ‘MCTES’.
We also acknowledge the support given by the IT - Instituto de Telecomunicações through ‘PEst-OE/EEI/LA0008/2013’.
REFERENCES
Alexandre, L. A. (2012). 3D descriptors for object and cate-
gory recognition: a comparative evaluation. In Work-
VISAPP2014-InternationalConferenceonComputerVisionTheoryandApplications
482
shop on Color-Depth Camera Fusion in Robotics at
the IEEE/RSJ International Conference on Intelligent
Robots and Systems (IROS), Vilamoura, Portugal.
Desbrun, M., Meyer, M., Schröder, P., and Barr, A. H.
(1999). Implicit fairing of irregular meshes using dif-
fusion and curvature flow. In Proceedings of the 26th
annual conference on Computer graphics and inter-
active techniques, pages 317–324, New York, USA.
Filipe, S. and Alexandre, L. A. (2013). A Comparative
Evaluation of 3D Keypoint Detectors. In 9th Con-
ference on Telecommunications, Conftele 2013, pages
145–148, Castelo Branco, Portugal.
Flint, A., Dick, A., and Hengel, A. (2007). Thrift: Local 3D
Structure Recognition. In 9th Biennial Conference of
the Australian Pattern Recognition Society on Digital
Image Computing Techniques and Applications, pages
182–188.
Harris, C. and Stephens, M. (1988). A combined corner
and edge detector. In Alvey Vision Conference, pages
147–152, Manchester.
Jagannathan, A. and Miller, E. L. (2007). Three-
dimensional surface mesh segmentation using
curvedness-based region growing approach. IEEE
Transactions on Pattern Analysis and Machine
Intelligence, 29(12):2195–2204.
Lai, K., Bo, L., Ren, X., and Fox, D. (2011). A large-scale
hierarchical multi-view RGB-D object dataset. In In-
ternational Conference on Robotics and Automation,
pages 1817–1824.
Leutenegger, S., Chli, M., and Siegwart, R. Y. (2011).
BRISK: Binary Robust invariant scalable keypoints.
In International Conference on Computer Vision,
pages 2548–2555.
Lowe, D. (2001). Local feature view clustering for 3D ob-
ject recognition. Computer Vision and Pattern Recog-
nition, 1:I–682–I–688.
Mair, E., Hager, G., Burschka, D., Suppa, M., and
Hirzinger, G. (2010). Adaptive and Generic Corner
Detection Based on the Accelerated Segment Test.
In European Conference on Computer Vision, pages
183–196.
Mian, A., Bennamoun, M., and Owens, R. (2010). On
the Repeatability and Quality of Keypoints for Lo-
cal Feature-based 3D Object Retrieval from Cluttered
Scenes. International Journal of Computer Vision,
89(2-3):348–361.
Mikolajczyk, K., Tuytelaars, T., Schmid, C., Zisserman, A.,
Matas, J., Schaffalitzky, F., Kadir, T., and Gool, L. V.
(2005). A Comparison of Affine Region Detectors. In-
ternational Journal of Computer Vision, 65(1-2):43–
72.
Rusu, R. B. and Cousins, S. (2011). 3D is here: Point
Cloud Library (PCL). In International Conference on
Robotics and Automation, Shanghai, China.
Salti, S., Tombari, F., and Stefano, L. D. (2011). A Perfor-
mance Evaluation of 3D Keypoint Detectors. In Inter-
national Conference on 3D Imaging, Modeling, Pro-
cessing, Visualization and Transmission, pages 236–
243.
Schmid, C., Mohr, R., and Bauckhage, C. (2000). Evalua-
tion of Interest Point Detectors. International Journal
of Computer Vision, 37(2):151–172.
Smith, S. M. (1992). Feature based image sequence under-
standing.
Smith, S. M. and Brady, J. M. (1997). SUSAN - A new
approach to low level image processing. International
Journal of Computer Vision, 23(1):45–78.
Steder, B., Rusu, R. B., Konolige, K., and Burgard, W.
(2010). NARF: 3D range image features for object
recognition. In Intelligent Robots and Systems, Taipei,
Taiwan.
Tomasi, C. and Kanade, T. (1991). Detection and Tracking
of Point Features. Technical report, Carnegie Mellon
University.
Weisstein, E. W. (2005). The CRC Encyclopedia of Mathe-
matics. CRC Press, 3rd edition.
Yamany, S. M. and Farag, A. A. (2002). Surface signa-
tures: an orientation independent free-form surface
representation scheme for the purpose of objects reg-
istration and matching. IEEE Transactions on Pattern
Analysis and Machine Intelligence, 24(8):1105–1120.
Zhong, Y. (2009). Intrinsic shape signatures: A shape de-
scriptor for 3D object recognition. International Con-
ference on Computer Vision Workshops, pages 689–
696.
AComparativeEvaluationof3DKeypointDetectorsinaRGB-DObjectDataset
483