ADDING COLOR TO GEODESIC INVARIANT FEATURES

Pier Paolo Campari

Politecnico di Milano, Milan, Italy

Matteo Matteucci, Davide Migliore

Politecnico di Milano, Milan, Italy

Keywords:

Feature Extraction, Feature Description, Color Pattern Recognition.

Abstract:

Geodesic invariant features were originally proposed to build a new local feature descriptor invariant not only to affine transformations, but also to general deformations. The aim of this paper is to investigate the possible improvements given by the use of color information in this kind of descriptor. We introduce color information both in geodesic feature construction and in description. At the feature construction level, we extend the fast marching algorithm to use color information; at the description level, we test several color spaces on real data and identify the opponent color space as a useful complement to intensity information. The experiments used to validate our approach are based on publicly available data and show an improvement, in precision and recall, with respect to the original intensity-based geodesic features. We also compare these features, on affine and non-affine transformations, with SIFT, steerable filters, moment invariants, spin images and GIH.

1 INTRODUCTION

In this paper we address the problem of feature description, which underlies automatic correspondence matching between images of the same scene taken from different views, or images of the same objects in different poses. In particular, we are interested in exploiting color information to enrich feature descriptors that are invariant to generic transformations and deformations.

Ling and Jacobs (Ling and Jacobs, 2005) introduced this kind of descriptor, together with the geodesic framework, for deformation invariant feature extraction and matching. To cope with generic image deformations, the descriptor is based on a histogram built from the pixel intensities of regions at the same geodesic distance from a given interest point. Geodesic invariant descriptors have been shown experimentally to be covariant with respect to generic deformations, but they are less descriptive than classical feature descriptors such as SIFT. This weakness is mainly due to the smoothing effect induced by the histogram and to the sensitivity to the image gradient in the geodesic distance calculation.

The aim of our work is to investigate how color information can be used to improve geodesic image descriptors by reducing this sensitivity to the image gradient and by extending the histogram description. In the original paper, Ling and Jacobs (Ling and Jacobs, 2005) use the fast marching algorithm (Sethian, 1999) to expand geodesic borders based on intensity values; in this paper we propose to use the RGB components, so that edges are considered in the color channels instead of in intensity. Color information is also used to complement the intensity histogram with an appropriate color description based on the opponent color space (van de Weijer and Schmid, 2006).

In the next section we briefly review related work in feature description and evaluation. Section 3 introduces the geodesic invariant framework and the novelties introduced by our work. Section 4 presents the experimental setup and the results used to assess the improvement given by the use of color information, while a brief discussion of future work is presented in the final section.


Paolo Campari P., Matteucci M. and Migliore D. (2008).

ADDING COLOR TO GEODESIC INVARIANT FEATURES.

In Proceedings of the Third International Conference on Computer Vision Theory and Applications, pages 227-234

DOI: 10.5220/0001076602270234

 SciTePress

2 RELATED WORKS

A considerable amount of work has been done on robust local feature descriptors, studying invariance to orientation, scale, affine transformation and, recently, generic deformation.

Mikolajczyk and Schmid (Mikolajczyk and Schmid, 2005; Mikolajczyk et al., 2005) evaluated the performance of several of these local descriptors with respect to both viewpoint and lighting conditions. Their final result is that the SIFT descriptor, proposed by Lowe (Lowe, 2004), has the best performance on images of flat scenes and affine transformations. These conclusions are also supported by Moreels and Perona (Moreels and Perona, 2005), who generalize the results to 3D scenes using images of 3D objects viewed under different scales, viewpoints, and lighting conditions.

All the local invariant descriptors investigated so far are based on the hypothesis that perspective deformation is properly approximated, locally, by an affine transformation. Recently, Ling and Jacobs (Ling and Jacobs, 2005) demonstrated that it is also possible to construct descriptors invariant to generic deformations of the image subject (e.g., a moving flag). They propose to treat the intensity image as a surface embedded in 3D space, with the third coordinate proportional to the intensity values, and then to build the descriptor using the deformation-invariant geodesic distance in this 3D space.

Stimulated by their work, in this paper we want to demonstrate that it is possible to improve the performance of the Geodesic Intensity Histogram (GIH) descriptor by introducing color information. Although color seems to be a fundamental cue for object recognition in everyday life, only a few color invariant descriptors have been proposed in the literature. The work of van de Weijer and Schmid (van de Weijer and Schmid, 2006) is an important example from this point of view. Their results lead to the encouraging conclusion that a pure color-based approach outperforms a shape-based approach only for colorful objects, while in the general case a combination of shape and color still outperforms a pure shape-based approach.

In our work we verify that the results achieved by van de Weijer and Schmid also hold for non-affine transformations, with the idea that shape-based descriptors can fail when dealing with generic deformations and that combining them with color can improve recognition rates.

3 COLORING GEODESIC INVARIANT FEATURES

In the geodesic framework, an image can be interpreted as a 2D surface in a 3D space, with the third coordinate proportional to the pixel intensity value with an aspect weight α → 1, and the first two coordinates proportional to the image pixel coordinates (x, y) with weight 1 − α.

We define a geodesic level curve as the set of points at the same geodesic distance from a given interest point. By sampling pixels with a constant geodesic step ∆, it is possible to capture the joint distribution of intensity and geodesic distance and to summarize it in the so-called GIH histogram-based descriptor.

An efficient scheme for computing geodesic level curves on discrete pixel grids was provided by Sethian under the name of fast marching algorithm (Sethian, 1999). A marching speed $F(x,y)$ is associated with each pixel $(x,y)$, and the geodesic distance $T(x,y)$ can be estimated by locally solving the equation $|\nabla T| F = 1$, where

$$F(x,y) = \frac{1}{f(x,y)} = \frac{1}{\sqrt{(1-\alpha)^2 + \alpha^2 I_x^2 + \alpha^2 I_y^2}}. \qquad (1)$$

Here $I_x$ and $I_y$ denote the partial derivatives of the image intensity.

Although the shape of the resulting region is irregular, it is covariant with deformation and it has shown interesting results for generic continuous deformations. Worse behavior may sometimes occur in the presence of isotropic and anisotropic scale transformations, which cause a resampling of pattern contours; for uniform intensity regions, however, the expansion is independent of the image gradient and depends mainly on the value of 1 − α.
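The inverse marching speed can be illustrated with a minimal NumPy sketch. It assumes the form f = sqrt((1 − α)² + α²(I_x² + I_y²)) used by Ling and Jacobs, approximates the derivatives with finite differences, and uses a hypothetical function name and an illustrative default for α; it is not the authors' implementation.

```python
import numpy as np

def inverse_marching_speed(image, alpha=0.98):
    """Return f(x, y) = sqrt((1 - alpha)^2 + alpha^2 * (I_x^2 + I_y^2)).

    The fast marching speed is then F = 1 / f. The default alpha is only
    an illustrative value, not one taken from the text.
    """
    image = np.asarray(image, dtype=float)
    # np.gradient returns the derivative along rows (y) first, then columns (x).
    i_y, i_x = np.gradient(image)
    return np.sqrt((1.0 - alpha) ** 2 + alpha ** 2 * (i_x ** 2 + i_y ** 2))

# On a flat patch the gradient vanishes, so f reduces to 1 - alpha.
flat = np.full((5, 5), 0.5)
# A vertical intensity edge raises f, which slows the front down.
step = np.zeros((5, 6))
step[:, 3:] = 1.0

f_flat = inverse_marching_speed(flat)
f_step = inverse_marching_speed(step)
```

On the flat patch every pixel has f = 1 − α, which matches the remark that expansion in uniform regions depends mainly on 1 − α.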

3.1 Fast Marching Algorithm in RGB Space

The first improvement proposed in this paper modifies region expansion in the fast marching algorithm by taking color information into account. We consider each RGB channel separately, computing three different inverse marching speeds, one for each channel:

$$f_R(x,y) = \sqrt{(1-\alpha)^2 + \alpha^2 R_x^2 + \alpha^2 R_y^2}, \qquad (2)$$

$$f_G(x,y) = \sqrt{(1-\alpha)^2 + \alpha^2 G_x^2 + \alpha^2 G_y^2}, \qquad (3)$$

$$f_B(x,y) = \sqrt{(1-\alpha)^2 + \alpha^2 B_x^2 + \alpha^2 B_y^2}. \qquad (4)$$

Footnote: We present in this paper only the RGB implementation of the new fast marching algorithm; we also tested other color spaces, with no meaningful improvement, and thus decided to use the most efficient one.

VISAPP 2008 - International Conference on Computer Vision Theory and Applications


Assuming there is no particularly strong photometric transformation (e.g. Lambertian surface), these functions are deformation covariant, as is the geodesic distance on the intensity surface, and, under the same hypothesis, the ordering of $f_R(x,y)$, $f_G(x,y)$ and $f_B(x,y)$ is preserved as well. The geodesic region, in our proposal, is thus computed starting from a new inverse marching speed based on these three color channels:

$$f_{min}(x,y) = \min\big(f_R(x,y),\, f_G(x,y),\, f_B(x,y)\big), \qquad (5)$$

and the resulting new geodesic distance T is invariant, being the sum of the same invariant stretches. This choice, although somewhat counterintuitive, has two rationales behind it and has been confirmed by experimental results:

• The growth of the geodesic region is bounded by all channels and slows down only for high values of $f_{min}(x,y)$. The moving front speed thus decreases only if contours are found on all channels, corresponding to a strong change in luminance or chrominance, which is less sensitive to noise with respect to geometric transformations.

• Weak contours found on a single channel have minor influence on the geodesic region and let the front go on. In this way, a geodesic level corresponding to a quantization interval on the geodesic distance will show the salient intensity and color varieties separated by weak contours.
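The two points above can be illustrated with a small NumPy sketch of the per-pixel minimum over the three channel speeds. The function names, the finite-difference gradients and the default α are assumptions made for illustration, not the authors' code.

```python
import numpy as np

def channel_inverse_speed(channel, alpha):
    """Inverse marching speed of one color channel (finite-difference gradients)."""
    c_y, c_x = np.gradient(np.asarray(channel, dtype=float))
    return np.sqrt((1.0 - alpha) ** 2 + alpha ** 2 * (c_x ** 2 + c_y ** 2))

def min_inverse_speed(rgb, alpha=0.98):
    """f_min(x, y): per-pixel minimum of the R, G and B inverse speeds."""
    rgb = np.asarray(rgb, dtype=float)
    speeds = [channel_inverse_speed(rgb[..., c], alpha) for c in range(3)]
    return np.minimum.reduce(speeds)

# An edge present only in the red channel (a weak contour) ...
weak = np.zeros((4, 6, 3))
weak[:, 3:, 0] = 1.0
# ... versus an edge present in all three channels (a strong contour).
strong = np.zeros((4, 6, 3))
strong[:, 3:, :] = 1.0

f_weak = min_inverse_speed(weak)      # stays at 1 - alpha everywhere
f_strong = min_inverse_speed(strong)  # rises across the edge
```

The single-channel edge leaves f_min at its flat-region value, so the front crosses it without slowing, while the edge shared by all channels raises f_min and slows the front.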

Table 1 shows an example of the expansion of the geodesic distance T on a one-dimensional grid of pixels $i \in \{1, \dots, n\}$. The pixel updating order is deterministic, since the moving front can proceed in only one direction. The updating formula becomes:

$$T(i+1) = T(i) + f_{min}(i+1) \qquad (6)$$

We can identify weak contours on pixels i = 2 and i = 3 and a strong contour on pixel i = 6. Computing the geodesic distance in this fashion puts more color varieties, represented by weak contours, in the same interval of T. For instance, let the sampling gap be ∆ = 0.1 and the histogram quantization interval of T be δT = 0.8. The geodesic level segment (a ring in two-dimensional space) starts at the first pixel and stops between pixels i = 5 and i = 6. Sampling points coming from this segment show a lot of color variation, the same variation that causes the weak contours on pixels i = 2 and i = 3.

The result obtained by using Equation (5) is twofold: on the one hand, strong contours are less likely to be smoothed if affected by geometric deformations; on the other hand, a more deformation-covariant placement of sampling regions is obtained.

Table 1: Evolution of the geodesic distance T on a one-dimensional grid of pixels $i \in \{1, \dots, n\}$, computed using $f_R$, $f_G$ and $f_B$.

i      1    2    3    4    5    6    7
f_R    0.1  0.5  0.1  0.1  0.2  0.9  0.1
f_G    0.1  0.1  0.1  0.1  0.1  0.8  0.1
f_B    0.1  0.1  0.4  0.1  0.2  0.8  0.2
T      0.1  0.2  0.3  0.4  0.5  1.3  1.4
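The T row of Table 1 can be reproduced with a few lines of Python. The sketch takes the three rows of inverse speeds as the R, G and B channels (their order does not affect the minimum) and assumes T(0) = 0; the function name is hypothetical.

```python
def geodesic_distance_1d(f_r, f_g, f_b):
    """Accumulate T(i) = T(i - 1) + min(f_R(i), f_G(i), f_B(i)), with T(0) = 0."""
    distances, total = [], 0.0
    for r, g, b in zip(f_r, f_g, f_b):
        total += min(r, g, b)
        distances.append(round(total, 10))  # rounding only hides float noise
    return distances

f_r = [0.1, 0.5, 0.1, 0.1, 0.2, 0.9, 0.1]
f_g = [0.1, 0.1, 0.1, 0.1, 0.1, 0.8, 0.1]
f_b = [0.1, 0.1, 0.4, 0.1, 0.2, 0.8, 0.2]

T = geodesic_distance_1d(f_r, f_g, f_b)
print(T)  # [0.1, 0.2, 0.3, 0.4, 0.5, 1.3, 1.4]

# Pixels falling in the first quantization interval of width 0.8.
delta_t = 0.8
first_ring = [i for i, t in enumerate(T, start=1) if t <= delta_t]
print(first_ring)  # [1, 2, 3, 4, 5]
```

The weak single-channel contours at i = 2 and i = 3 add only 0.1 each, so they stay inside the first interval, while the strong contour at i = 6 pushes T past 0.8.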

3.2 Building the Geodesic Color Descriptor

Low distinctiveness is a disadvantage of the original geodesic descriptor. The GIH descriptor summarizes the content of the geodesic region as a histogram $H_p(k, m)$ built on two variables: geodesic distance g and normalized intensity I (g is quantized in M intervals, I in K intervals). For each geodesic interval m, a normalization is performed such that

$$\sum_{k=1}^{K} H_p(k, m) = 1,$$

so that inner and outer geodesic "rings" are compared with the same weight.
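A minimal sketch of this per-ring normalization, assuming the sample points are already available as arrays of geodesic distances and normalized values. The function name, the value range and the synthetic samples are illustrative assumptions.

```python
import numpy as np

def geodesic_histogram(g, values, n_rings, n_bins, g_max, value_range=(0.0, 1.0)):
    """Return H with shape (n_bins, n_rings); each non-empty column sums to 1."""
    hist, _, _ = np.histogram2d(
        values, g, bins=(n_bins, n_rings), range=(value_range, (0.0, g_max))
    )
    ring_totals = hist.sum(axis=0, keepdims=True)
    # Empty rings are left as all-zero columns instead of dividing by zero.
    return np.divide(hist, ring_totals, out=np.zeros_like(hist), where=ring_totals > 0)

# Synthetic samples standing in for pixels sampled around an interest point.
rng = np.random.default_rng(0)
g = rng.uniform(0.0, 4.0, size=500)
intensity = rng.uniform(0.0, 1.0, size=500)
H = geodesic_histogram(g, intensity, n_rings=5, n_bins=13, g_max=4.0)
```

Normalizing each column separately keeps outer rings, which usually contain more samples, from dominating the comparison.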

Given a ring of geodesic level curves, corresponding to a quantization interval on the geodesic distance, GIH stores no spatial information about sample location along the ring; the resulting numeric vector is thus less informative than other descriptors such as SIFT (on affine covariant transformations). We propose to partially overcome this distinctiveness problem, without losing rotation invariance, by adding color information to the descriptor.

To build a richer descriptor, we consider three or more dimensions, together with their respective quantization intervals: the geodesic distance g (quantized in M bins), the normalized intensity I (quantized in K bins), and one or more photometric invariants $Inv_n$ (quantized in $Q_n$ bins).

The implemented descriptor keeps the geodesic information associated with color invariants, extending the original M × K GIH descriptor with N additional $M \times Q_n$ histograms, one per invariant $Inv_n$. The final dimension of the descriptor is thus $M\left(K + \sum_{n=1}^{N} Q_n\right)$.
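As a worked instance of the dimension formula, assume purely illustrative values that are not taken from the experiments: M = 10 geodesic intervals, K = 13 intensity bins, and N = 2 invariants with $Q_1 = Q_2 = 13$ bins each.

```latex
M\Big(K + \sum_{n=1}^{N} Q_n\Big) = 10\,(13 + 13 + 13) = 390
```

Under these assumptions the descriptor has 390 entries, three times the size of the corresponding 10 × 13 intensity-only GIH.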

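The matching distance between two Geodesic Histogram & Histograms descriptors (GHH hereafter) p and q is computed starting from the $\chi^2$ distances between each pair of histograms:

$$d_I = \chi^2(H_p^I, H_q^I) \qquad (7)$$

$$d_{Inv_1} = k_1 \cdot \chi^2(H_p^{Inv_1}, H_q^{Inv_1}) \qquad (8)$$

$$d_{Inv_2} = k_2 \cdot \chi^2(H_p^{Inv_2}, H_q^{Inv_2}) \qquad (9)$$

$$\dots \qquad (10)$$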


where $k_1$ and $k_2$ are weighting parameters; the distance between the descriptors as a whole is then obtained as

$$d(H_p, H_q) = \max(d_I, d_{Inv_1}, d_{Inv_2}, \dots). \qquad (11)$$

This implies that a match between two points is allowed only if all pairs of histograms are fairly similar, up to a given threshold, both in luminance and chrominance.
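The max-combination of weighted histogram distances can be sketched as follows. The text does not spell out the exact $\chi^2$ formula, so the symmetric form with a 0.5 factor used here is an assumption, as are the dictionary layout, the key names and the function names.

```python
import numpy as np

def chi2(h1, h2, eps=1e-12):
    """Symmetric chi-square distance (assumed form): 0.5 * sum((a - b)^2 / (a + b))."""
    h1 = np.asarray(h1, dtype=float)
    h2 = np.asarray(h2, dtype=float)
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

def ghh_distance(p, q, weights):
    """Largest of the intensity distance and the weighted invariant distances.

    p, q    : dicts mapping a histogram name to its array; "I" is the intensity one.
    weights : dict mapping each invariant name to its weighting parameter k_n.
    """
    distances = [chi2(p["I"], q["I"])]
    for name, k in weights.items():
        distances.append(k * chi2(p[name], q[name]))
    return max(distances)

# Two toy descriptors that agree in intensity and O2 but differ in O1.
p = {"I": [0.5, 0.5], "O1": [1.0, 0.0], "O2": [0.5, 0.5]}
q = {"I": [0.5, 0.5], "O1": [0.0, 1.0], "O2": [0.5, 0.5]}
d_pq = ghh_distance(p, q, weights={"O1": 0.3, "O2": 0.3})
d_pp = ghh_distance(p, p, weights={"O1": 0.3, "O2": 0.3})
```

Because the maximum is taken, a single dissimilar histogram (here O1) is enough to reject the match even though the intensity histograms are identical.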

3.3 Color Invariants Selection

The GHH descriptor can be seen as a general construction method for geodesic descriptors that include intensity and color information; the choice of invariants should therefore take into account specific assumptions about image formation and the specific application.

We can distinguish between zero-order invariants and first-order invariants, i.e., derivative-based invariants. For use with geodesic regions, characterized by generic deformations, the main challenge is to preserve geometric robustness; we have therefore focused on zero-order invariants computable from common color spaces. Another practical requirement is stability, since known photometric invariants have inherent instabilities.

The color information measured by a camera sensor can be modeled as:

$$C(\vec{x}) = m^b(\vec{x})\, b^C(\vec{x})\, e^C + m^i(\vec{x})\, e^C, \qquad (12)$$

where C denotes a color channel, the first part describes the light reflected after interaction with the surface albedo b, and the second part is related to the light immediately reflected at the surface, causing specularities. The aim of color space selection is to provide invariance with respect to $m^b$ and $m^i$, which depend on scene and illumination geometry. In this paper we considered three invariants for color representation:

RGB: normalized RGB is invariant to the shading modeled by the $m^b$ term in the absence of specular effects (i.e. $m^i = 0$);

O1, O2: opponent colors, which, for a white illuminant, are invariant with respect to specularities;

H, S: the hue information is invariant to both $m^b$ and $m^i$, but it is known to be unstable at low saturation; weighting each sample by its saturation is only a partial solution.
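The invariance properties listed above can be checked numerically. The sketch below uses the standard definitions of normalized rgb, of the opponent colors O1 = (R − G)/√2 and O2 = (R + G − 2B)/√6, and of hue and saturation derived from them; these definitions are assumed here because the text does not give them explicitly, and the function name is hypothetical.

```python
import numpy as np

def color_invariants(rgb, eps=1e-12):
    """Zero-order color invariants for an array of RGB values (last axis = channels)."""
    rgb = np.asarray(rgb, dtype=float)
    R, G, B = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    total = R + G + B + eps
    o1 = (R - G) / np.sqrt(2.0)
    o2 = (R + G - 2.0 * B) / np.sqrt(6.0)
    return {
        "r": R / total,              # normalized rgb: unchanged by shading (scaling)
        "g": G / total,
        "O1": o1,                    # opponent colors: unchanged by an equal offset
        "O2": o2,
        "hue": np.arctan2(o1, o2),   # angle in the opponent plane
        "sat": np.hypot(o1, o2),     # small saturation means unstable hue
    }

pixels = np.array([[0.2, 0.4, 0.6], [0.7, 0.3, 0.1]])
base = color_invariants(pixels)
shaded = color_invariants(0.5 * pixels)    # shading: all channels scaled
specular = color_invariants(pixels + 0.1)  # white specular term: equal offset
```

Scaling leaves r and g unchanged, an equal offset on all channels leaves O1 and O2 unchanged, and hue is unchanged by both.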

Figure 1: Histogram distance distributions for I, r and g. Plotted curves: P(dI=k|corr), P(dr=k|corr), P(dg=k|corr), P(dI=k|!corr), P(dr=k|!corr), P(dg=k|!corr); horizontal axis from 0 to 1.6.

Figure 2: Histogram distance distributions for I, O1 and O2. Plotted curves: P(dI=k|corr), P(dO1=k|corr), P(dO2=k|corr), P(dI=k|!corr), P(dO1=k|!corr), P(dO2=k|!corr); horizontal axis from 0 to 1.6.

Figure 3: Histogram distance distributions for I and Hue (scaled by 0.2 for better illustration). Plotted curves: P(dI=k|corr), P(dH*0.2=k|corr), P(dI=k|!corr), P(dH*0.2=k|!corr); horizontal axis from 0 to 1.6.

We tested these invariants with the GHH descriptor on the real dataset provided by Mikolajczyk and Schmid (Mikolajczyk and Schmid, 2005), after a normalization obtained by subtracting the mean and dividing by the standard deviation. For the angular invariant hue, illuminant normalization is first performed on the RGB channels, dividing each channel value by its spatial average. Values within the (−3σ, +3σ) interval have been quantized into 13 bins, and Figures 1, 2 and 3 compare the intensity histograms $H_I$ with, respectively, the normalized r and g histograms $H_r$ and $H_g$, the opponent color histograms $H_{O1}$ and $H_{O2}$, and the hue histograms $H_{hue}$.
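The normalization and 13-bin quantization described in this section can be sketched as below. The text does not say how values outside (−3σ, +3σ) are treated; this sketch simply drops them, which is an assumption, and the function name is hypothetical. The input is expected to have non-zero standard deviation.

```python
import numpy as np

def quantize_normalized(values, n_bins=13, clip=3.0):
    """Standardize values, then bin those within (-clip, +clip) standard deviations.

    Returns (bin_indices, inside): indices in 0..n_bins-1 for the kept samples
    and a boolean mask marking which samples were kept.
    """
    values = np.asarray(values, dtype=float)
    z = (values - values.mean()) / values.std()
    inside = np.abs(z) < clip
    edges = np.linspace(-clip, clip, n_bins + 1)
    # digitize is 1-based for in-range values, hence the shift to 0-based bins.
    bin_indices = np.clip(np.digitize(z[inside], edges) - 1, 0, n_bins - 1)
    return bin_indices, inside

samples = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
bins, kept = quantize_normalized(samples)
```

With an odd number of bins the mean value falls in the central bin (index 6 of 0..12), and values symmetric about the mean land in mirrored bins.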


Figure 4: Performance evaluation on real datasets. Each panel plots the detection rate (vertical axis, ticks from 0.2 to 0.8) against a horizontal axis ranging from 0 to 10, for RGB GIH, RGB GHH, SIFT, Ling GIH, SPIN, MOM.INV. and STEER.FILT. Panels, with their vertical axis labels: (a) Graffiti image (graffiti 3), detection rate (74); (b) Flag image, detection rate (91); third panel (title missing in the source), detection rate (88); (d) Candle image, detection rate (34).

The plots in Figures 1, 2 and 3 underline the relationship between the histogram distance $\chi^2$ and matching correctness in terms of conditional probabilities. Let (a, b) be a pair of features; we estimate the probability that the histogram distance d(a, b) falls into a small interval (k − δ, k], given that (a, b) is a correspondence,

$$P(k - \delta < d(a,b) \le k \mid T(a) = b) \qquad (13)$$

and given that (a, b) is not a correspondence,

$$P(k - \delta < d(a,b) \le k \mid T(a) \ne b) \qquad (14)$$

where T is the transformation between the images. In order to compare the distinctiveness of intensity and color histograms, the weighting parameters $k_n$ are set to 1.
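The two conditional distributions can be estimated empirically as relative frequencies over distance bins. The sketch below is an assumption-laden illustration: the function name, the bin count and the synthetic distances are hypothetical, and `np.histogram` uses half-open bins [k − δ, k) rather than the (k − δ, k] intervals of Equations (13) and (14), a difference that is immaterial for continuous distances.

```python
import numpy as np

def distance_distributions(distances, is_correct, n_bins=32, d_max=1.6):
    """Relative frequency of distances per bin, for correct and incorrect pairs."""
    distances = np.asarray(distances, dtype=float)
    is_correct = np.asarray(is_correct, dtype=bool)
    edges = np.linspace(0.0, d_max, n_bins + 1)

    def relative_frequencies(sample):
        counts, _ = np.histogram(sample, bins=edges)
        return counts / max(len(sample), 1)  # guard against an empty group

    p_corr = relative_frequencies(distances[is_correct])
    p_not_corr = relative_frequencies(distances[~is_correct])
    return edges, p_corr, p_not_corr

# Synthetic distances: correct pairs tend to be closer than incorrect ones.
rng = np.random.default_rng(1)
d = np.concatenate([rng.uniform(0.0, 0.5, 200), rng.uniform(0.4, 1.5, 800)])
labels = np.concatenate([np.ones(200, dtype=bool), np.zeros(800, dtype=bool)])
edges, p_corr, p_not_corr = distance_distributions(d, labels)
```

The less the two estimated distributions overlap, the more distinctive the corresponding histogram is for matching.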

From the plots it can be noticed that the distances between corresponding and non-corresponding features show higher discrimination power for intensity than for color information. Nevertheless, opponent colors show a more symmetric and regular behavior (compared to normalized RGB), allowing the assumption $k_1 = k_2$. Mostly due to the saturation weighting, the hue histograms have low distinctiveness (overlapping distributions). For these reasons, we decided to use opponent colors as color invariants in the rest of the paper. Moreover, this analysis suggests a value for the weighting parameter of the color histograms, which should be adjusted according to the real working conditions. For this dataset $k_n = 0.3$ can be used; this value roughly aligns the correct-match distributions of $d_I$, $d_{O1}$ and $d_{O2}$, allowing the histograms to be compared against the same threshold and thus improving matching precision.

4 EXPERIMENTAL RESULTS

In this section we present the experimental evaluation carried out to test the capabilities of the new descriptors proposed in this paper, using as interest point detector the Harris-Affine detector proposed by Mikolajczyk and Schmid (Mikolajczyk and Schmid, 2004). In particular, we compare the performance of our approaches with the following descriptors: GIH, GIH with RGB expansion, SIFT (Lowe, 2004), spin images (Lazebnik et al., 2003), moment invariants (Gool et al., 1996), and steerable filters (Freeman and Adelson, 1991).

Footnote: Code for the Harris-Affine detector is available at http://www.robots.ox.ac.uk/vgg/research/affine/

Figure 5: Examples of matches on real images. Panels: (a) T-shirt images; (b) Candle images.

The characteristics we are interested in evaluating are robustness, i.e., the capability to describe two corresponding regions in the same manner, and precision, i.e., the distinctiveness of the descriptors. We use the criterion proposed by Ling and Jacobs to evaluate GIH performance. For each pair of images we select the 200 interest points with the highest cornerness and, for each of them, we manually establish the ground truth matching. Each interest point in the first image is compared with all interest points in the second image according to the distance described in Section 3.2, and the detection rate among the top N matches is used to study the performance:

$$r = \frac{\#\text{correct matches}}{\#\text{points in first image}}. \qquad (15)$$
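The detection rate of Equation (15) can be sketched as follows, under the assumption that a point counts as correctly matched when its ground-truth correspondence appears among its N nearest candidates in the second image. The function name and the convention of marking points without a ground-truth match with -1 are hypothetical.

```python
import numpy as np

def detection_rate(dist, ground_truth, top_n):
    """Fraction of first-image points whose true match is among the top_n candidates.

    dist[i, j]      : descriptor distance between point i (first image) and
                      point j (second image).
    ground_truth[i] : index of the true match of point i, or -1 if it has none.
    """
    dist = np.asarray(dist, dtype=float)
    ground_truth = np.asarray(ground_truth)
    # Indices of the top_n closest second-image points for each first-image point.
    candidates = np.argsort(dist, axis=1)[:, :top_n]
    hit = (candidates == ground_truth[:, None]).any(axis=1) & (ground_truth >= 0)
    return hit.sum() / dist.shape[0]

dist = np.array([
    [0.1, 0.5, 0.9],
    [0.4, 0.2, 0.3],
    [0.7, 0.6, 0.8],
])
ground_truth = [0, 2, 0]
rate_top1 = detection_rate(dist, ground_truth, top_n=1)  # only point 0 is matched
rate_top2 = detection_rate(dist, ground_truth, top_n=2)  # all three are matched
```

Increasing N can only keep or raise the rate, which is why the curves in a plot of detection rate against N are non-decreasing.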

Experiments have been carried out on the dataset provided by Ling and Jacobs, on the dataset by Mikolajczyk and Schmid, and on other real images with non-affine deformation. Due to space limits we present here only two plots with performance on publicly available data, and we detail the results on the data we collected, which are more significant for non-affine transformations.

The plots in Figure 4 show that the GHH description improves on the GIH intensity-based description in all cases, and that GIH with RGB expansion also achieves better performance than classical GIH, even though the descriptor is still based only on intensity information. GHH descriptors outperform most of the other descriptors when non-affine deformation is present, and are able to describe local features also when these do not lie on a planar surface.

In Figure 5 we report the new images used in this paper (new with respect to the publicly available datasets) and the matches found using GHH. To better explain the rationale behind GHH, Figure 6 plots geodesic levels and descriptor histograms for a few interest points on the T-shirt images.

When dealing with affine transformations (see the plot in Figure 4(a)), SIFT is still better. We investigated this and identified two possible reasons: the SIFT descriptor is richer than a histogram-based one, and the affine property is not exploited in the geodesic distance calculation. We have addressed the latter issue, but this is outside the scope of this paper, while the former is still ongoing work.

In Figure 4(a) it is also possible to notice that sometimes the SPIN descriptors are better than SIFT, GIH and GHH. This is not a surprising result, since GIH and GHH are based on the "spin image" idea: a two-dimensional histogram encoding of the brightness distribution in an affine-normalized patch.

5 DISCUSSION AND FUTURE WORKS

In this paper we introduced a method to extend geodesic invariant features with color information. Experimental results confirm the promising performance of geodesic descriptors and the improvement given by the use of color information.

Figure 6: Examples of GHH descriptors on the T-shirt image.

Although we succeeded in adding color information to the geodesic framework, some open issues remain: the improvement obtained by GHH on deformed images is paid for with reduced performance with respect to SIFT descriptors on real planar patches. To better understand the reasons for this, we made an in-depth analysis of geodesic expansion. It turns out that adding the affine information carried by the Harris-Affine detector to the expansion mechanism improves GHH performance on planar patches, but these results are not reported here since they somewhat reduce the generality of the GHH and GIH descriptors. Another reason for the better performance of SIFT is the smoothing effect induced by the histogram representation; we are currently working on a SIFT-like representation of geodesic levels able to capture local gradient information.

A final note should be made about interest point detectors. In this paper we used the Harris-Affine detector, but intuitively it is not optimal for geodesic distance computation, since it selects corner points located at singular positions of the geodesic representation of the image. We have done preliminary work with MSER detectors (Matas et al., 2002), but this needs further investigation: MSER points are located at the barycenter of uniform regions, and under non-affine deformation their positions might not be repeatable. This is a problem since in the performance evaluation we consider point matching rather than area matching, and a point match can fail even when the correct regions are matched.

ACKNOWLEDGEMENTS

This work has been partially supported by an Italian Institute of Technology (IIT) grant.

REFERENCES

Freeman, W. and Adelson, E. (1991). The design and use of steerable filters. IEEE Trans. on PAMI, 13(9):891–906.

Gool, L. V., Moons, T., and Ungureanu, D. (1996). Affine photometric invariants for planar intensity patterns. ECCV, pages 642–651.

Lazebnik, S., Schmid, C., and Ponce, J. (2003). A sparse texture representation using affine-invariant regions. CVPR, 2:319–324.

Ling, H. and Jacobs, D. W. (2005). Deformation invariant image matching. ICCV, 1:1466–1473.

Lowe, D. (2004). Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110.

Matas, J., Chum, O., Urban, M., and Pajdla, T. (2002). Robust wide baseline stereo from maximally stable extremal regions. BMVC, pages 384–393.

Mikolajczyk, K. and Schmid, C. (2004). Scale and affine invariant interest point detectors. IJCV, 60(1):63–86.

Mikolajczyk, K. and Schmid, C. (2005). A performance evaluation of local descriptors. PAMI, 27:1615–1630.

Mikolajczyk, K., Tuytelaars, T., Schmid, C., Zisserman, A., Matas, J., Schaffalitzky, F., Kadir, T., and Gool, L. V. (2005). A comparison of affine region detectors. IJCV, 65(1/2):43–72.

Moreels, P. and Perona, P. (2005). Evaluation of features detectors and descriptors based on 3d objects. ICCV, 1:800–807.

Sethian, J. (1999). Efficient Schemes: Fast Marching Methods, chapter 8, pages 87–100. Cambridge University Press.

van de Weijer, J. and Schmid, C. (2006). Coloring local feature extraction. ECCV, 2:334–348.
