HARMONIC DEFORMATION MODEL FOR EDGE BASED TEMPLATE MATCHING

Andreas Hofhauser

Technische Universität München, Boltzmannstr. 3, 85748 Garching b. München, Germany

Carsten Steger

MVTec Software GmbH, Neherstr. 1, 81675 München, Germany

Nassir Navab

Technische Universität München, Boltzmannstr. 3, 85748 Garching b. München, Germany

Keywords:

Deformable Template Matching, Pattern Recognition in Image Understanding, Object recognition.

Abstract:

The paper presents an approach to the detection of deformable objects in single images. To this end we

propose a robust match metric that preserves the relative edge point neighborhood, but allows signiﬁcant

shape changes. Similar metrics have been used for the detection of rigid objects (Olson and Huttenlocher,

1997; Steger, 2002). To the best of our knowledge this adaptation to deformable objects is new. In addition,

we present a fast algorithm for model deformation. In contrast to the widely used thin-plate spline (Bookstein,

1989; Donato and Belongie, 2002), it is efﬁcient even for several thousand points. For arbitrary deformations,

a forward-backward interpolation scheme is utilized. It is based on harmonic inpainting, i.e. it regularizes the

displacement in order to obtain smooth deformations. Similar to optical ﬂow, we obtain a dense deformation

ﬁeld, though the template contains only a sparse set of model points. Using a coarse-to-ﬁne representation

for the distortion of the template further increases efﬁciency. We show in a number of experiments that the

presented approach is not only fast, but also very robust in detecting deformable objects.

1 INTRODUCTION

The fast, robust, and accurate localization of a given

2D object template in images has been a research

topic for many decades. The results of these ef-

forts have enabled numerous different applications,

because the detection of the pose of an object is the

natural prerequisite for any useful operation. If the

object is deformable, not only the pose, but also the

deformation of the object must be determined simultaneously. Extracting this information allows us to unwarp the found region in the image and facilitates

OCR or a comparison with a prototype image for, e.g.,

detection of possible manufacturing errors. Various

application domains, which necessitate the detection

of deformable objects, can still not be comprehen-

sively solved. This is due to the fact that on the one

hand conventional pose estimation algorithms, like

generalized Hough transform or template matching,

do not allow the object to alter its shape nonlinearly.

On the other hand, descriptor-based methods notoriously fail if the image contains too little texture or only a small set of repetitive textures, as in figure 1.

Figure 1: Two images of a deformed logo. The detected

deformed model is overlaid in white. The detection works

robustly even though the object contains only repetitive pat-

terns.

1.1 Related Work

We roughly classify algorithms for pose detection into

template matching and descriptor-based methods. In

Hofhauser A., Steger C. and Navab N. (2008).

HARMONIC DEFORMATION MODEL FOR EDGE BASED TEMPLATE MATCHING.

In Proceedings of the Third International Conference on Computer Vision Theory and Applications, pages 75-82

DOI: 10.5220/0001071800750082

 SciTePress

the descriptor-based category, the rough scheme is to

ﬁrst determine discriminative “high level” features,

extract from these feature points surrounding discrim-

inative descriptors, and to establish correspondence

between model and search image by classifying the

descriptors. The big advantage of this scheme is that

the runtime of the algorithm is independent of the de-

gree of the geometric search space. Recent prominent

examples, which fall into this category, are (Belongie

et al., 2002; Lowe, 2004; Berg et al., 2005; Pilet et al.,

2005; Bay et al., 2006). While showing outstanding

performance in several scenarios, they fail if the ob-

ject has only highly repetitive texture or only sparse

edge information. The feature descriptors overlap in

the feature space and are not discriminating anymore.

In the template matching category, we subsume algo-

rithms that perform an explicit search. Here, a simi-

larity measure that is either based on intensities (like

SAD, SSD, NCC and mutual information) or gradi-

ent features is evaluated. Using intensities is popular

in optical ﬂow estimation and medical image registra-

tion, where a rough overlap of source and target image

is assumed (Horn and Schunck, 1981; Modersitzki,

2004). However, the evaluation of intensity-based

metrics is computationally expensive. Additionally,

they are typically not invariant against nonlinear illu-

mination changes, clutter, or occlusion.

For the case of feature-based template matching, only

a sparse set of features between template and search

image is compared. While extremely fast and ro-

bust if the object undergoes only rigid transforma-

tions, these methods become intractable for a large

number of degrees of freedom, e.g. when an ob-

ject is allowed to deform perspectively or arbitrar-

ily. Nevertheless, one approach for feature-based de-

formable template matching is presented in (Gavrila

and Philomin, 1999), where the ﬁnal template is cho-

sen from a learning set while the match metric is eval-

uated. Because obtaining a learning set and applying

a learning step is problematic for many scenarios, we

prefer to not rely on training data except for the origi-

nal template. Another approach is to use a template

like (Felzenszwalb, 2003) or (Zhang et al., 2004).

Here, an adaptive triangulated polygon model represents the outer contour. Unlike this representation, our model is a set of edge points, allowing us to express arbitrarily shaped objects, e.g., curved or composite objects. In (Jain et al., 1996) and (Gonzales-Linares et al., 2003) a deformable template model is

adapted while tracking object hypotheses down the

image pyramid. Here, for each match candidate a

global deformation ﬁeld represented by trigonometric

basis functions is optimized. Unfortunately, this rep-

resentation of the deformations is global, so that small

adaptations in one patch of the model propagate to all

areas, even where the object remains rigid. In contrast

to this, we preserve local neighborhood, and therefore

do not encounter this problem. However, we note that

these works are the closest approaches to ours and in-

spired us in several ways.

1.2 Main Contributions

This paper makes the following contributions: The

ﬁrst contribution is a deformable match metric that al-

lows for local deformations, while preserving robust-

ness to illumination changes, partial occlusion and

clutter. While we found a match metric with normal-

ized directed edge points in (Olson and Huttenlocher,

1997; Steger, 2002) for rigid object detection, and

also for articulated object detection in (Ulrich et al.,

2002), its adaptation to deformable object detection

is new.

The second contribution is an efﬁcient deformation

model, allowing a dense unwarping, even though the

template contains only a sparse set of points. There-

fore, we ﬁrst propagate the deformation into regions

between the points and then back-propagate these de-

formations into the original model. Hence, we ob-

tain a reprojected smooth displacement ﬁeld from

the original deformation. The proposed forward-

backward harmonic inpainting does not have the

problems of folding typically encountered with the

popular thin-plate splines (TPS) (Bookstein, 1989).

Additionally, the manipulation of our model only de-

pends on the size of the enclosing rectangle, but not

on the number of model points. To the best of our

knowledge these appealing properties have not yet

been exploited in the ﬁeld of deformable object de-

tection.

2 DEFORMABLE SHAPE-BASED MATCHING

In the following, we detail the deformable shape-

based model generation and matching algorithm. The

problem that this algorithm solves is particularly dif-

ﬁcult, as in contrast to optical ﬂow, tracking, or med-

ical registration, we assume neither temporal nor lo-

cal coherence. While the location of deformable ob-

jects is determined with the robustness of a template

matching method, we avoid the necessity of expand-

ing the full search space as if it was a descriptor-based

method.

VISAPP 2008 - International Conference on Computer Vision Theory and Applications

Figure 2: In the top image the rectangular white ROI de-

ﬁnes the template. The bottom image depicts the extracted

neighborhood graph of the model.

2.1 Shape Model Generation

As mentioned in section 1.1, we want our model to

represent arbitrary objects. For the generation of our

model, we decided to rely on the result of a simple

contour edge detection. This allows us to represent

objects from template images as long as there is any

intensity change. Note that in contrast to corners or

other point features, we can model objects that con-

tain only curved contours. Furthermore, directly gen-

erating a model from an untextured CAD format is

in principle possible. For all descriptor based ap-

proaches, a manual alignment between template im-

ages that show the texture and the CAD model would

be required. Therefore, our shape model $M^{rig}$ is composed as an unordered set of edge points

$M^{rig} = \{ r_i, c_i, d_i^m, n_{i1}, \ldots, n_{ik} \mid i = 1 \ldots n \}$ (1)

Here, $r_i$ and $c_i$ are the row and column coordinates of the model points. $d_i^m$ denotes the normalized gradient direction vector at the respective row and column coordinate of the template. At model generation, we index for every model point the nearest $k$ model points $n_{i1}, \ldots, n_{ik}$. This allows us to access them efficiently

at runtime. As the model generation is completely

learning-free and the calculation of the neighborhood

graph is realized efﬁciently, this step needs, even for

models with thousands of points, less than a second.

One example of this model generation by setting a re-

gion of interest and the extracted neighborhood graph

is depicted in ﬁgure 2.
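The neighborhood indexing described above can be illustrated with a minimal sketch. It assumes a brute-force nearest-neighbor search, which is our choice for brevity and not necessarily how the neighborhood graph is computed in the actual system; the function name `build_neighborhood_graph` and the sample points are hypothetical.

```python
import math

def build_neighborhood_graph(points, k):
    """For every model point, return the indices of its k nearest other points.

    points: list of (row, col) tuples. Brute force, O(n^2 log n); an
    assumption made for brevity, not the method used in the paper.
    """
    graph = []
    for i, (ri, ci) in enumerate(points):
        # Distance from point i to every other point, paired with its index.
        dists = [(math.hypot(ri - rj, ci - cj), j)
                 for j, (rj, cj) in enumerate(points) if j != i]
        dists.sort()
        graph.append([j for _, j in dists[:k]])
    return graph

# Five hypothetical edge points forming two separate contour pieces.
points = [(0, 0), (0, 1), (0, 2), (5, 5), (5, 6)]
graph = build_neighborhood_graph(points, k=2)
```

Because the graph is built once in the offline phase, the lookup at runtime reduces to reading `graph[i]`.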

2.2 Deformable Metric based on Local Edge Patches

Given the generated $M^{rig}$, the task of the deformable

matching algorithm is to extract instances of the

model in new images. As mentioned in section 1.2,

we therefore adapted the match metric of (Steger,

2002). This score function is designed such that it

is inherently invariant against nonlinear illumination

changes, partial occlusion and clutter. The score func-

tion for rigid objects reads as follows:

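$s(r, c) = \frac{1}{n} \sum_{i=1}^{n} \frac{\langle d_i^m, d^s_{(r+r_i, c+c_i)} \rangle}{\| d_i^m \| \cdot \| d^s_{(r+r_i, c+c_i)} \|}$ (2)

where $d^s$ is the direction vector in the search image, $\langle \cdot \rangle$ is the dot product and $\| \cdot \|$ is the Euclidean norm.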

Three observations are important: First, the point set

of the model is compared to a dense gradient direction

ﬁeld of the search image. Even with signiﬁcant non-

linear illumination changes that propagate to the gra-

dient amplitude the gradient direction stays the same.

Furthermore, a hysteresis threshold or non-maximum suppression is completely avoided, resulting in true invariance against arbitrary illumination changes. Second, partial occlusion, noise, and clutter result in random gradient directions in the search image. These effects lower the maximum of the score function but do not alter its location. Hence, the semantic meaning of the score value is the ratio of matching model points. Third, comparing the cosine between the gradients leads to the same result, but calculating this formula with dot products is several orders of magnitude faster.
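To make the rigid score function (2) concrete, the following is a minimal sketch that evaluates it at a single pose. It assumes the score is averaged over the model points and that the search image gradients are given as two nested lists; the names `rigid_score`, `grad_r` and `grad_c` as well as the toy data are ours.

```python
import math

def rigid_score(model, grad_r, grad_c, r, c):
    """Mean normalized dot product between model and image gradient directions.

    model: list of (r_i, c_i, (dr_i, dc_i)) with offsets relative to the pose.
    grad_r, grad_c: 2D lists with the row/column gradient of the search image.
    """
    rows, cols = len(grad_r), len(grad_r[0])
    total = 0.0
    for r_i, c_i, (mr, mc) in model:
        rr, cc = r + r_i, c + c_i
        if not (0 <= rr < rows and 0 <= cc < cols):
            continue  # points outside the image contribute nothing
        sr, sc = grad_r[rr][cc], grad_c[rr][cc]
        norm = math.hypot(mr, mc) * math.hypot(sr, sc)
        if norm > 0.0:
            total += (mr * sr + mc * sc) / norm
    return total / len(model)

# Toy search image: a vertical edge at column 2 (gradient points along columns).
grad_r = [[0.0] * 5 for _ in range(4)]
grad_c = [[10.0 if col == 2 else 0.0 for col in range(5)] for _ in range(4)]
# Model: three edge points stacked vertically, same gradient direction.
model = [(0, 0, (0.0, 1.0)), (1, 0, (0.0, 1.0)), (2, 0, (0.0, 1.0))]

score_on_edge = rigid_score(model, grad_r, grad_c, 0, 2)
score_off_edge = rigid_score(model, grad_r, grad_c, 0, 0)
# Scaling the gradient amplitude (an illumination change) leaves the score unchanged.
dim_c = [[0.05 * v for v in row] for row in grad_c]
score_dim = rigid_score(model, grad_r, dim_c, 0, 2)
```

Since only directions enter the score, the dimmed image yields the same value as the original one, which illustrates the first observation above.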

To extend this metric for deformable object detection, we instantiate globally only similarity transformations. By allowing successive local deformations, we implicitly evaluate a much larger class of nonlinear transformations. Following this argument, we distinguish between an explicit global score function $s_g$, which is evaluated for, e.g., similarity transformations, and a local implicit score function $s_l$ that allows for local deformations. Similar to the rigid case, the global score function $s_g$ is a sum over the local contributions of all model points. If the model is partially occluded, only this ratio of the model points changes.

$s_g(r, c) = \frac{1}{n} \sum_{i=1}^{n} s_l(r, c, i)$ (3)

One observation that is important for designing the

local score function is depicted in ﬁgure 3. If we

allow the model points to deform independently, the

gradient direction is not discriminative anymore. Fur-

thermore, if we allow a point to deform with a rota-

tion its local score value gives us a match for all po-

sitions. Even if we prevent rotations from occurring,


Figure 3: In the left image (panel "Independent Model Gradients"), each model point is considered independently. This results in displacements that are highly ambiguous. As depicted in the right image (panel "Gradients with Neighbors"), taking the local neighborhood into account resolves this ambiguity.

the ambiguity, particularly along edge contours, is not

resolved. With clutter or noise it is essential that the

model can be discriminated from the background or

from similar objects.

As a remedy, we add rigidity constraints that take the movement and location of neighborhood points into account. We assume that even after deformation the neighborhood of each model point stays the same and is approximated by a local Euclidean transformation. Hence, we instantiate local Euclidean transformations $T_l$ for each point and apply them to the local neighborhood. The local score then is the maximum alignment of gradient direction between the locally transformed model points and the search image. Accordingly, the proposed local score function $s_l$ is:

$s_l(r, c, i) = \max_{T_l} \frac{1}{k} \sum_{j=1}^{k} \frac{\langle T_l(d^m_{n_{ij}}), d^s_{(r+T_l(r_{n_{ij}}), c+T_l(c_{n_{ij}}))} \rangle}{\| T_l(d^m_{n_{ij}}) \| \cdot \| d^s_{(r+T_l(r_{n_{ij}}), c+T_l(c_{n_{ij}}))} \|}$ (4)

For the sake of efficiency, we exploit the neighborhood graph that was generated in the offline phase for accessing the neighboring points (the $n_{ij}$ matrix). Furthermore, we cache $T_l(r_{n_{ij}})$, $T_l(c_{n_{ij}})$ and $T_l(d^m_{n_{ij}})$ since they are independent of $r$ and $c$.
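The local score function (4) can be sketched as follows. This is a deliberately simplified illustration: the local transformation is restricted to small integer translations, whereas the text uses local Euclidean transformations that also include rotations. The names `local_score` and `max_shift` and the toy data are assumptions of this sketch.

```python
import math

def local_score(model, neighbors, grad_r, grad_c, r, c, i, max_shift=1):
    """Best mean gradient agreement of point i's neighborhood over small shifts.

    Simplification: the local transformation is an integer translation
    (tr, tc) with |tr|, |tc| <= max_shift; rotations are not searched.
    Returns (score, (tr, tc)).
    """
    rows, cols = len(grad_r), len(grad_r[0])
    best = (-1.0, (0, 0))
    for tr in range(-max_shift, max_shift + 1):
        for tc in range(-max_shift, max_shift + 1):
            total = 0.0
            for j in neighbors[i]:
                r_j, c_j, (mr, mc) = model[j]
                rr, cc = r + r_j + tr, c + c_j + tc
                if 0 <= rr < rows and 0 <= cc < cols:
                    sr, sc = grad_r[rr][cc], grad_c[rr][cc]
                    norm = math.hypot(mr, mc) * math.hypot(sr, sc)
                    if norm > 0.0:
                        total += (mr * sr + mc * sc) / norm
            score = total / len(neighbors[i])
            if score > best[0]:  # strict: the first best shift is kept
                best = (score, (tr, tc))
    return best

# Toy search image: vertical edge at column 3, one column right of the pose.
grad_r = [[0.0] * 6 for _ in range(5)]
grad_c = [[10.0 if col == 3 else 0.0 for col in range(6)] for _ in range(5)]
model = [(0, 0, (0.0, 1.0)), (1, 0, (0.0, 1.0)), (2, 0, (0.0, 1.0))]
neighbors = [[0, 1], [0, 1, 2], [1, 2]]  # hypothetical neighborhood graph

score, shift = local_score(model, neighbors, grad_r, grad_c, 0, 2, 1)
```

In this toy case the shift (1, 1) reaches the same score as (0, 1); the sketch keeps the first one found. This is the residual ambiguity along straight contours mentioned in the text, which the deformation model has to smooth out.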

2.3 Deformable Shape Matching

After deﬁning an efﬁcient score function that toler-

ates local deformations, we integrated it into a gen-

eral purpose object detection system. We decided to

alter the conventional template matching algorithm

such that it copes with deformed objects. Hence, the

deformable shape matching algorithm ﬁrst extracts

an image pyramid of incrementally zoomed versions

of the original search image. At the highest pyra-

mid level, only the rough location of the model is

determined. To speed up this exhaustive search the

evaluation of the score function can be transparently

restricted in our implementation to relevant search

Figure 4: In (a) a part of a search image deformed by a random TPS-transformation is depicted. The images in (b) and (c) show the displacements at model points with respect to row and column coordinates. A medium gray value means no deformation, brighter gray values denote positive, dark negative displacements. As depicted in (d) and (e), we obtain a smooth deformation after forward-backward harmonic inpainting. The image (f) contains the unwarped image region. The inverted difference image between unwarped and original model area is shown in (g). We observe only a small difference that is due to sampling effects.

regions or to a restricted amount of rotation/scale

ranges. The rough location resides at the local maxima of the score function $s_g$ (3). This initial set of candidates is further refined until either the lowest

pyramid level is reached or no match candidates are

above a certain score value. While tracking the can-

didates down the pyramid, a rough deformation was

already extracted during evaluation of the current can-

didate’s parent on a higher pyramid level. Therefore,

we ﬁrst use the deformation originating from the can-

didate’s parent to warp the model up to the known

deformation. Now, starting from this deformed can-

didate the deformation is iteratively reﬁned by evalu-

ating only the local score function with (4). Here, we

keep the best displacements $T_l$ and reproject the candidate given the deformation model that we discuss later in section 2.4. As a result of these local iterative refinements, we obtain the best instance of the model with respect to the score function and the deformation model. This deformed candidate is defined as:

$M^{def} = \{ r, c, M^{rig}, dr_i, dc_i \}$ (5)

Here, $r, c$ is the pose and $dr_i, dc_i$ denote a displacement vector that brings each model point from the rigid to the deformed position. Hence, we know the exact displacements only at locations where there are model points.
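The coarse-to-fine search over the image pyramid described in this section can be sketched as follows. The sketch assumes a pyramid with a scale factor of 2 per level, built by 2x2 averaging, and shows only how a candidate pose is carried to the next finer level; both choices and all function names are assumptions of this example. The deformation itself is propagated with the deformation model of section 2.4 and is not part of the sketch.

```python
def downsample(image):
    """Halve the resolution by averaging each 2x2 block (odd borders dropped)."""
    rows, cols = len(image) // 2, len(image[0]) // 2
    return [[(image[2 * r][2 * c] + image[2 * r][2 * c + 1]
              + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4.0
             for c in range(cols)] for r in range(rows)]

def build_pyramid(image, levels):
    """Level 0 is the original image; each further level has half the size."""
    pyramid = [image]
    while (len(pyramid) < levels
           and min(len(pyramid[-1]), len(pyramid[-1][0])) >= 2):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid

def propagate_pose(r, c):
    """Map a candidate position from pyramid level l+1 to the finer level l."""
    return 2 * r, 2 * c

# Toy 4x4 image with gray values 0..15.
image = [[float(4 * r + c) for c in range(4)] for r in range(4)]
pyramid = build_pyramid(image, levels=3)
```

A candidate found at the coarsest level is mapped with `propagate_pose` and then refined locally on the finer level, so the exhaustive search is only needed once, on the smallest image.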

However, for two reasons we need to infer deformations for positions that we do not know from

measurements. First, when we propagate deforma-

tions between pyramid levels, contour segments of

our model exist only at certain pyramid levels. Hence,

we bring the model that is deformed to the pyramid

level of the source deformation. Then we apply the

deformation and bring the model back to the original

scale. Second, when we ﬁnally unwarp the detected

image region, we have to interpolate deformation at

image regions where there are no model points.

For the rigid planar case of a perspective deforma-

tion, we estimate the parameters of a homography

by the well-known normalized DLT algorithm. This

parametrized warp is applied in a straightforward

way. As we think that this is not new, we do not

discuss this case further. However, for arbitrary deformations one needs a suitable model.

2.4 Harmonic Deformation Model

Because no a priori information is known about the

exact physical behavior of our objects, we need a gen-

eral deformation model. This model is used for prop-

agating the deformation down the image pyramid and

to unwarp found instances (see section 2.3). Even though we know the exact displacements at model points, we expect them to contain outliers, because no metric is resistant to occasional failure. Preliminary experiments with the widely used thin-plate spline model, where we interpret model points as landmarks, failed.

The main problem is to suppress crossings of the

moving landmarks, leading to foldings. Particularly

problematic are the cases, where different landmark

points end up at exactly the same point or when two

nearby points move into different directions. Even

with the best local match metric, it is hardly possible

to suppress this entirely. Therefore, we take several measures, e.g., to prevent foldings due to outliers.

As a first step we insert $M^{def}$ into a row and column

deformation image. Hence, only pixels, where model

points are located, are set. One example for an in-

serted row/column deformation is shown in ﬁgure 4

(b) and 4 (c). In the next step, we infer the deforma-

tion of areas that are not lying at model points (The

medium gray pixels of the deformation images). We

state this task as an inpainting problem where the non-

model region is regarded as destroyed pixels and must

be interpolated. The reconstruction that we use solves the discrete Laplace equation,

$u_{xx} + u_{yy} = 0$ (6)

for the corresponding pixel value $u$ that originates from the deformation vector $dr_i$ and $dc_i$. This particular inpainting function can be decomposed into independent row and column coordinates, allowing an efficient solution by a gradient descent solver. This is referred to as harmonic interpolation in the image restoration literature (Aubert and Kornprobst, 2006). In the original region, discontinuities and crossings are still

present. Therefore, after we have extrapolated the

gray values, we apply the inpainting on the inverse

(original) model region. Hence, the original point dis-

placements are only approximated. This implicitly re-

solves the problem of crossings of landmark move-

ments that are encountered along contours. While

harmonic inpainting gives reasonable results only for small regions (because, e.g., edges or texture are lost), in our application it generates the desired deformation field (see figure 4 (d) and (e)). It strongly penalizes

abrupt changes in the model. Furthermore, it smooths

out small errors of the detection that are encountered

frequently e.g. along contours.
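The forward-backward scheme of this section can be sketched for one displacement component as follows. The sketch solves the discrete Laplace equation with plain Jacobi iterations on the five-point stencil, which we use here for brevity in place of the gradient descent solver mentioned above; the fixed iteration count, the border handling and all names are assumptions of this example.

```python
def harmonic_fill(values, unknown, iterations=500):
    """Solve the discrete Laplace equation on the pixels flagged in `unknown`.

    values: 2D list of floats; unknown: 2D list of bools (True = to be filled).
    Known pixels stay fixed; at the image border only existing neighbors are
    averaged. Plain Jacobi iteration, chosen for brevity, not for speed.
    """
    rows, cols = len(values), len(values[0])
    u = [row[:] for row in values]
    for _ in range(iterations):
        nxt = [row[:] for row in u]
        for r in range(rows):
            for c in range(cols):
                if unknown[r][c]:
                    nb = [u[rr][cc]
                          for rr, cc in ((r - 1, c), (r + 1, c),
                                         (r, c - 1), (r, c + 1))
                          if 0 <= rr < rows and 0 <= cc < cols]
                    nxt[r][c] = sum(nb) / len(nb)
        u = nxt
    return u

def forward_backward(displacement, is_model_point, iterations=500):
    """Forward: inpaint the non-model pixels from the model points.
    Backward: re-estimate the model pixels from the inpainted surroundings."""
    non_model = [[not m for m in row] for row in is_model_point]
    forward = harmonic_fill(displacement, non_model, iterations)
    return harmonic_fill(forward, is_model_point, iterations)

# Row displacements of three model points in the middle row; 9.0 is an outlier.
is_model = [[False] * 5 for _ in range(5)]
disp = [[0.0] * 5 for _ in range(5)]
for col, value in ((1, 1.0), (2, 9.0), (3, 1.0)):
    is_model[2][col] = True
    disp[2][col] = value
smooth = forward_backward(disp, is_model)

# A constant displacement at all model points is reproduced everywhere.
const = [[2.0 if m else 0.0 for m in row] for row in is_model]
flat = forward_backward(const, is_model)
```

After the backward pass the outlier at the middle point is pulled towards its neighbors, so the measured displacements are only approximated, as stated in the text. The cost per iteration depends on the number of pixels of the deformation image, not on the number of model points.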

3 EXPERIMENTS

For evaluation of the robustness of the proposed ob-

ject detection algorithm we conducted experiments

under synthetic and real world conditions. Under sim-

ulated conditions we independently measure the inﬂu-

ence of the proposed score function in section 3.1 and

the deformation model in section 3.2.


Figure 5: Synthetic experiments: In picture (a) the original template image is depicted. The region of interest is overlaid in white. In (b) a perspectively distorted test image is shown. The detected template is denoted with the white rectangle. At the bottom, the results of the detection experiments are plotted ("Recognition Example Industrial Image": recognition percentage over pixel random displacement from 0 to 10, for the shape-based and the descriptor-based method).

3.1 Comparison with Descriptor-Based Matching

In order to compare the proposed method with state

of the art detection algorithms, we decided in a ﬁrst

step to restrict the deformation to a perspective dis-

tortion. Hence, the simulated model remains rigid and

only the robustness of the detection is measured, not

the underlying deformation model. Here we are particularly interested in comparing the proposed method with a descriptor-based approach. We chose (Lepetit et al., 2005), as it is known for its robustness even in the presence of big perspective changes. Therefore, we generate homographies by random movements of the corner points of the rectangle that defines the model. These displacements define a perspective distortion that we apply to the original image (see figure 5 (a) for original and (b) for distorted image). Both the shape matching and the descriptor-based approach try to extract a homography from this image. For (Lepetit et al., 2005) we chose 25 trees of depth 11, favoring robustness over speed. For

each size of the movement we generated 500 ran-

dom views. We tested different images with differ-

ent textured content. For highly textured objects the

proposed method only slightly outperforms (Lepetit

et al., 2005). However, we observe a signiﬁcant dif-

ference in objects like in ﬁgure 5. The robustness of

the descriptor-based method decreases rapidly even

for small displacements. In contrast to this, the pro-

posed method is robust despite increasing distortions.

This is mainly due to the fact that the repetitive struc-

tures (like the leads at the chip) pose a problem for the

descriptor-based method. Furthermore, we observe

that extracting edges is superior to interest points not

only in terms of robustness but also accuracy.
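The generation of random views used in this comparison can be sketched as follows. The sketch assumes that each corner of the model rectangle is shifted by a uniformly distributed offset and computes the homography directly from the four correspondences by solving an 8x8 linear system; the sampling distribution, the solver and all names are assumptions of this example and not a description of the actual test setup.

```python
import random

def solve_linear(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def homography_from_corners(src, dst):
    """3x3 homography (last entry fixed to 1) mapping four src onto dst points."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]

def apply_homography(h, x, y):
    """Apply the homography to a point, including the perspective division."""
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)

def random_homography(corners, max_shift, rng):
    """Move each corner by a uniform random offset of at most max_shift pixels."""
    moved = [(x + rng.uniform(-max_shift, max_shift),
              y + rng.uniform(-max_shift, max_shift)) for x, y in corners]
    return homography_from_corners(corners, moved), moved

corners = [(0.0, 0.0), (100.0, 0.0), (100.0, 50.0), (0.0, 50.0)]
rng = random.Random(0)  # fixed seed so the view is reproducible
h, moved = random_homography(corners, 10.0, rng)
```

Increasing `max_shift` corresponds to moving along the horizontal axis of the plot in figure 5.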

3.2 Simulated TPS and Harmonic Deformation

Figure 6: Simulated deformations: the left image shows the TPS deformation, the right image the harmonic deformation model. The landmark correspondences are shown with the source/target points as white crosses.

For testing reasons we generated various synthetic de-

formations with the TPS and our proposed harmonic

model. In ﬁgure 6 the behavior for an exemplary re-

sult of the two models under artiﬁcial displacements

is depicted. This artiﬁcial displacement is deﬁned by

six landmark points. The four that are at the corners

of a quadrilateral are static and two that are inside this

quadrilateral move away such that their path crosses.

These crossings could originate from mismatches as

discussed in section 2.4. Hence, the crossing of the landmark points induces a non-diffeomorphic displacement. Under the TPS model the image is distorted in an unnatural way. By penalizing the TPS

deformation parameters except the afﬁne transforma-

tion (see (Bookstein, 1989)), we hoped to solve this

problem. Unfortunately, it is difﬁcult to adjust the

regularizing parameter and control this kind of shape

change. A further observation is that a global defor-

mation is extrapolated outside the area of the land-

marks. In contrast to this, the forward-backward har-

monic deformation model is parameter free and does


not fold. It only bends the image locally according to the displacements. Also, only a translation is extrapolated globally, but not the nonlinear shape change.

We admit that this is a totally artiﬁcial example, but

the robustness of a deformation model with respect to

outliers plays a crucial role when a detection system

is constructed that must handle complex models auto-

matically.

Another important observation is that the proposed

harmonic deformation model is an order of magni-

tude faster than the TPS deformation. The reason for

this is that the computational complexity for our har-

monic deformation model is linear in the size of the

deformation ﬁeld that is to be inpainted. Furthermore,

it is independent of the number of landmark points. In

contrast to this, the complexity of calculating the TPS

is cubic in the number of model points and therefore becomes intractable for large-scale models like the one we use. However, efficient approximations for TPS functions are still a topic of current research (see, e.g., (Donato and Belongie, 2002)). While this difference cannot be noticed for a small number of landmark points (for fewer than 10 landmarks the TPS is even faster), the difference is dramatic for large models. If we take typical example images like figure 4 (a), the calculation of the TPS parameters and unwarping takes several minutes. With the harmonic inpainting this is calculated in milliseconds.

3.3 Real World Experiments

The proposed object detection algorithm was tested

on real sequences. Sample frames are depicted in

ﬁgure 7. The object to be found is deformed, par-

tially occluded, and illuminated in changing ways.

After detection, we overlay the original image with

the model. Despite the different adverse conditions

the object is found globally with high robustness. One

remaining problem is that in case of partial occlusion

we currently do not distinguish between deformation

and occlusion. Furthermore, some model parts tend to

match with nearby edges of the same polarity. Even

though this is not a problem for the global detection,

this issue will be addressed in future work. Here, we

expect even better results by adding further regular-

ization conditions to the model. If we instantiate a full

rotation for the model, detection and unwarping takes

typically around 1 second on a desktop computer.

4 CONCLUSIONS

In this paper we presented a solution for deformable

template matching that can be utilized in a wide range

Figure 7: Detection of a deformed object in the presence

of clutter, noise, illumination changes and occlusion. The

video sequence is provided in the supplementary material.

It shows the strength and limitations of our approach.

of applications. For this, we extended an already ex-

isting edge polarity based match metric for tolerat-

ing local shape changes. The proposed deformation

model, which is based on minimizing the Laplacian

of the deformation ﬁeld, allows a precise unwarping

and enforces smooth displacement ﬁelds in an elegant

way.

Future work will be to further reduce the runtime of

the algorithm by an optimized implementation. Addi-

tionally, this deformable shape matching can be used

as a module for compound object detection. While

currently all model points have the same importance,

leading to a split into a local-global match metric, we


plan to introduce a multi-level hierarchical decompo-

sition of our model, such that different layers and dif-

ferent local sub-parts are considered independently.

REFERENCES

Aubert, G. and Kornprobst, P. (2006). Mathematical Prob-

lems in Image Processing: Partial Differential Equa-

tions and the Calculus of Variations (second edi-

tion), volume 147 of Applied Mathematical Sciences.

Springer-Verlag.

Bay, H., Tuytelaars, T., and Van Gool, L. (2006). SURF: Speeded up robust features. In European Conference on Computer Vision.

Belongie, S., Malik, J., and Puzicha, J. (2002). Shape

matching and object recognition using shape contexts.

IEEE Transactions on Pattern Analysis and Machine

Intelligence, 24(4):509–522.

Berg, A., Berg, T., and Malik, J. (2005). Shape matching

and object recognition using low distortion correspon-

dences. In Conference on Computer Vision and Pat-

tern Recognition, San Diego, CA.

Bookstein, F. L. (1989). Principal warps: Thin plate splines

and the decomposition of deformations. IEEE Trans-

actions on Pattern Analysis and Machine Intelligence,

11:567–585.

Donato, G. and Belongie, S. (2002). Approximate thin plate

spline mappings. European Conference on Computer

Vision, 2:531–542.

Felzenszwalb, P. F. (2003). Representation and detection of

deformable shapes. In Computer Vision and Pattern

Recognition, volume 1, pages 102–108.

Gavrila, D. M. and Philomin, V. (1999). Real-time object

detection for “smart” vehicles. In 7th International

Conference on Computer Vision, volume I, pages 87–

93.

Gonzales-Linares, J., Guil, N., and Zapata, E. L. (2003). An efficient 2D deformable object detection and location algorithm. Pattern Recognition, 36:2543–2556.

Horn, B. K. P. and Schunck, B. G. (1981). Determining optical flow. Artificial Intelligence, 17:185–203.

Jain, A. K., Zhong, Y., and Lakshmanan, S. (1996). Object

matching using deformable templates. IEEE Trans-

actions on Pattern Analysis and Machine Intelligence,

18(3):267–278.

Lepetit, V., Lagger, P., and Fua, P. (2005). Randomized

trees for real-time keypoint recognition. In Confer-

ence on Computer Vision and Pattern Recognition,

San Diego, CA.

Lowe, D. G. (2004). Distinctive image features from scale-

invariant keypoints. International Journal of Com-

puter Vision.

Modersitzki, J. (2004). Numerical Methods for Image Reg-

istration. Oxford University Press Series: Numerical

Mathematics and Scientiﬁc Computation.

Olson, C. F. and Huttenlocher, D. P. (1997). Automatic

target recognition by matching oriented edge pixels.

IEEE Transactions on Image Processing, 6(1):103–

113.

Pilet, J., Lepetit, V., and Fua, P. (2005). Real-time non-rigid

surface detection. In Conference on Computer Vision

and Pattern Recognition, San Diego, CA.

Steger, C. (2002). Occlusion, clutter, and illumination in-

variant object recognition. In Kalliany, R. and Leberl,

F., editors, International Archives of Photogrammetry,

Remote Sensing, and Spatial Information Sciences,

volume XXXIV, part 3A, pages 345–350, Graz.

Ulrich, M., Baumgartner, A., and Steger, C. (2002). Au-

tomatic hierarchical object decomposition for object

recognition. In International Archives of Photogram-

metry and Remote Sensing, volume XXXIV, part 5,

pages 99–104.

Zhang, J., Collins, R., and Liu, Y. (2004). Representation

and matching of articulated shapes. In Computer Vi-

sion and Pattern Recognition, volume 2, pages 342–

349.
