Using Polynomial Eigenvalue Problem Modeling to Improve Visual Odometry for Autonomous Vehicles

Anderson Souza, Leonardo Souto, Fabio Fonseca de Oliveira, Biswa Nath Datta and Luiz M. G. Gonçalves

Department of Computing, University of the State of Rio Grande do Norte, Natal, RN, Brazil

Department of Computing Engineering and Automation, Federal University of the Rio Grande do Norte, Natal, RN, Brazil

Northern Illinois University, U.S.A.

Keywords:

Visual Odometry, Polynomial Eigenvalue Problem, Motion Estimation.

Abstract:

Visual Odometry (VO) is the process of calculating the motion of an agent (such as a robot or a vehicle) using images captured by one or more cameras mounted on it. VO is an important process to supplement autonomous navigation systems, since it can provide accurate trajectory estimates. However, VO algorithms involve several steps of heavy numerical computation, which generate numerical errors and demand considerable processing time. In this paper, we propose the use of a mathematical framework for the monocular VO process based on Polynomial Eigenvalue Problem (PEP) modeling, in order to achieve more accurate motion estimation and to decrease the processing time of the VO process. Some preliminary experiments are shown in order to validate the accuracy of the proposed computation.

1 INTRODUCTION

Nowadays, it is common to see autonomous agents such as robots and vehicles performing different tasks. These devices can be found transporting materials in factories, transporting people in urban zones, monitoring environments, exploring areas and performing surveillance, among other applications (Siegwart et al., 2011).

Autonomous agents must be able to collect information about their environment; based on this information, they must decide how to proceed given what exists in the environment; and they have to act in order to carry out that decision, towards completing their mission (Murphy, 2000). An important condition for agents to achieve these abilities is that they be equipped with sensors, which provide them with useful information about their environment. In this way, a robot or a vehicle can interact coherently with its environment and objects, dealing with unexpected situations such as dynamic obstacles (Souza and Gonçalves, 2015).

Cameras are widely used as visual sensor systems for autonomous robots and vehicles. These systems can be composed of a single camera or multiple cameras and a mechanism for processing camera data, which allows the extraction of useful information from raw data (Ma et al., 2004). With a visual sensor system it is possible to infer plenty of information, such as colors, textures, geometric structures and object identities, among others. Furthermore, it is feasible to estimate relative or absolute motion from images captured from different positions, which is known as Visual Odometry (VO). This is an important operation to supplement autonomous navigation systems, since VO can provide accurate trajectory estimates (Scaramuzza and Fraundorfer, 2011).

However, processing raw data captured by cameras in order to extract useful information goes through several numerical computation steps, which produce numerical errors (overflow and underflow), round-off errors and error bound. Moreover, these computations consume significant processing time, demanded by algebraic calculations with vectors and matrices (Datta, 2010). Both aspects (the effects of errors and the processing time) need to be minimized so that they do not harm the autonomous navigation process, either through misinterpreted information (due to miscalculations) or through decisions not taken in time.

In this context, this paper proposes the use of a mathematical framework for the monocular VO process based on Polynomial Eigenvalue Problem (PEP) modeling, in order to achieve more accurate motion estimation and to decrease the processing time of the VO process. In this way, the VO process can be analyzed as a minimal problem and solved by using systems of polynomial equations (Kukelova et al., 2012). In order to validate the proposed modeling in terms of accuracy, preliminary experiments with real data are presented.

Souza, A., Souto, L., Oliveira, F., Datta, B. and Gonçalves, L.
Using Polynomial Eigenvalue Problem Modeling to Improve Visual Odometry for Autonomous Vehicles.
DOI: 10.5220/0006478005020507
In Proceedings of the 14th International Conference on Informatics in Control, Automation and Robotics (ICINCO 2017) - Volume 2, pages 502-507

This paper is organized as follows: Section 2 introduces the Visual Odometry problem for a monocular vision system. Section 3 provides the formulation of the PEP. Section 4 shows Visual Odometry modeled as a PEP. Section 5 presents some preliminary experiments. Finally, some considerations and future work are presented in Section 6.

2 MONOCULAR VISUAL ODOMETRY

Localization is a fundamental point in autonomous navigation. An autonomous robot or vehicle needs to be able to estimate its position and orientation relative to its environment. Visual Odometry (VO) is a process that can provide accurate trajectory estimates. VO operates by incrementally estimating the pose (position and orientation) of an agent through examination of the changes that motion induces on the images of its onboard cameras (Scaramuzza and Fraundorfer, 2011).

The VO process can be performed with multiple cameras or with a single camera, in which case it is named monocular VO. In this case, motion and 3D structure are computed from 2D image data. Feature-based methods are one of the main approaches for estimating poses in monocular VO. These methods are based on salient and repeatable features that are tracked over consecutive frames (Scaramuzza and Fraundorfer, 2011).

Figure 1: The relative motion $T_k$ is calculated from features (projections of the 3D points $P$) and is then concatenated to obtain the absolute pose $C_k$.

VO computes the camera path incrementally, pose after pose (for simplicity, it is common to assume that the camera coordinate frame is the agent coordinate frame). In order to do this, the main point is to compute the relative transformation $T_k \in \mathbb{R}^{4 \times 4}$ $(k = 1, \ldots, n)$ from consecutive images ($I_{k-1}$ and $I_k$) taken at times $k-1$ and $k$, and then to concatenate the transformations to recover the full trajectory $C_{0:n} = \{C_0, \ldots, C_n\}$. Figure 1 illustrates this process. $T_k$ is often called rigid body motion and can be described as:

$$T_k = \begin{bmatrix} R_k & t_k \\ 0 & 1 \end{bmatrix} \qquad (1)$$

where $R_k \in SO(3)$ and $t_k \in \mathbb{R}^{3 \times 1}$ are the rotation matrix and the translation vector, respectively. These parameters can be computed by estimating the essential matrix $E_k$, which describes the geometric relation between two images $I_k$ and $I_{k-1}$ of a calibrated camera, up to a scale factor, as shown in Equation (2).

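$$E_k \simeq \hat{t}_k R_k \qquad (2)$$

where $t_k = [t_x, t_y, t_z]^{T}$ and

$$\hat{t}_k = \begin{bmatrix} 0 & -t_z & t_y \\ t_z & 0 & -t_x \\ -t_y & t_x & 0 \end{bmatrix} \qquad (3)$$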

It is known that, in the case of fully calibrated cameras, points $q$ and $q'$ (from $I_{k-1}$ and $I_k$, respectively), which are projections of a 3D point $P$ (Figure 1), are geometrically constrained by the epipolar geometry constraint (Kukelova et al., 2012), formulated by Equation (4).

$$q'^{T} E_k\, q = 0 \qquad (4)$$

where the essential matrix $E_k$ is a $3 \times 3$ rank-2 matrix with two equal singular values. These constraints can be presented in another way, as shown in Equations (5) and (6).

$$\det(E_k) = 0 \qquad (5)$$

$$2 E_k E_k^{T} E_k - \operatorname{trace}(E_k E_k^{T})\, E_k = 0 \qquad (6)$$
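As a numerical illustration of Equations (2) to (6), the sketch below builds an essential matrix from a hypothetical relative motion and checks the epipolar, rank and trace constraints on one synthetic point. The motion values, the 3D point and the helper names `skew` and `rotation_z` are assumptions chosen for the example; they do not come from the experiments in this paper.

```python
import numpy as np

def skew(t):
    """Skew-symmetric matrix of Equation (3): skew(t) @ a equals np.cross(t, a)."""
    tx, ty, tz = t
    return np.array([[0.0, -tz, ty],
                     [tz, 0.0, -tx],
                     [-ty, tx, 0.0]])

def rotation_z(angle):
    """Rotation by `angle` radians about the z axis (an arbitrary test rotation)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Hypothetical relative motion between frames k-1 and k (assumed values).
R = rotation_z(0.1)
t = np.array([0.2, -0.1, 1.0])
E = skew(t) @ R                       # Equation (2), up to scale

# One 3D point seen in both frames: P_next = R P + t.
P = np.array([0.5, -0.3, 4.0])
q = P / P[2]                          # normalized image point in I_{k-1}
P_next = R @ P + t
q_prime = P_next / P_next[2]          # normalized image point in I_k

epipolar_residual = q_prime @ E @ q                        # Equation (4)
det_residual = np.linalg.det(E)                            # Equation (5)
trace_residual = 2 * E @ E.T @ E - np.trace(E @ E.T) * E   # Equation (6)
singular_values = np.linalg.svd(E, compute_uv=False)       # two equal, one zero
```

All three residuals vanish up to round-off, and the two nonzero singular values coincide, which is the property that Equations (5) and (6) encode.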

The five-point algorithm proposed by Nister (Nister, 2004) and the eight-point algorithm proposed by Longuet-Higgins (Longuet-Higgins, 1981) are the most popular approaches for computing the essential matrix $E_k$ and recovering $R_k$ and $t_k$. In this paper, we focus on the polynomial eigenvalue solution to the five-point algorithm for estimating $E_k$. Usually, after getting a candidate for $E_k$, a nonlinear optimization step is performed in order to obtain a more accurate estimate of $T_k$.

The VO process is often summarized in a few steps, as shown in Figure 2. Further information about VO can be found in (Scaramuzza and Fraundorfer, 2011) and (Scaramuzza and Fraundorfer, 2012), including the mathematical formulation of the VO problem, feature selection methods, matching, robustness and optimization approaches, and applications.

Figure 2: Steps of the monocular feature-based VO process: Image Sequence, Feature Detection, Feature Matching, Motion Estimation, Local Optimization.
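The incremental concatenation described in this section can be sketched as follows. This is a minimal illustration, assuming the right-multiplication convention $C_k = C_{k-1} T_k$ of (Scaramuzza and Fraundorfer, 2011); the function names `make_transform` and `concatenate_poses` are hypothetical and not part of the implementation used in this paper.

```python
import numpy as np

def make_transform(R, t):
    """Assemble the 4x4 rigid body motion of Equation (1) from R (3x3) and t (3,)."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def concatenate_poses(relative_motions, C0=None):
    """Return [C_0, ..., C_n], with C_k = C_{k-1} @ T_k (assumed convention)."""
    C = np.eye(4) if C0 is None else C0
    trajectory = [C]
    for T in relative_motions:
        C = C @ T                     # accumulate the relative motion
        trajectory.append(C)
    return trajectory

# Example: four identical steps, each a 90 degree turn about z plus a unit
# translation along x, trace a closed square and return to the start.
Rz90 = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
square = concatenate_poses([make_transform(Rz90, np.array([1.0, 0.0, 0.0]))] * 4)
```

Because each pose is the product of all previous relative motions, an error in any single $T_k$ propagates to every later pose, which is why the accuracy of the motion estimation step matters.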

3 POLYNOMIAL EIGENVALUE PROBLEMS

Following the definitions and formulation presented in (Kukelova et al., 2012) and (Betcke et al., 2013), polynomial eigenvalue problems (PEP) are problems of the form presented by Equation (7), in which the main purpose is to find scalars $\lambda$ and nonzero vectors $v$ that satisfy the equation.

$$C(\lambda)\, v = 0 \qquad (7)$$

In this equation, $v$ is a vector of monomials in all variables except for $\lambda$, and $C(\lambda)$ is an $n \times n$ matrix polynomial in the variable $\lambda$ defined as

$$C(\lambda) \equiv \lambda^{l} C_l + \lambda^{l-1} C_{l-1} + \ldots + \lambda C_1 + C_0, \qquad (8)$$

with $n \times n$ coefficient matrices $C_j$.

A polynomial eigenvalue problem (PEP) can also be represented as a generalized eigenvalue problem (GEP) of the form

$$A y = \lambda B y \qquad (9)$$

Consider the following polynomial eigenvalue problem (PEP),

$$(\lambda^{l} C_l + \lambda^{l-1} C_{l-1} + \ldots + \lambda C_1 + C_0)\, v = 0 \qquad (10)$$

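It can be transformed to a GEP with

$$A = \begin{bmatrix} 0 & I & 0 & \ldots & 0 \\ 0 & 0 & I & \ldots & 0 \\ \ldots & \ldots & \ldots & \ldots & \ldots \\ -C_0 & -C_1 & -C_2 & \ldots & -C_{l-1} \end{bmatrix}, \quad B = \begin{bmatrix} I & & & \\ & \ldots & & \\ & & I & \\ & & & C_l \end{bmatrix}, \quad y = \begin{bmatrix} v \\ \lambda v \\ \ldots \\ \lambda^{l-1} v \end{bmatrix} \qquad (11)$$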

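If $C_l$ is nonsingular and well conditioned, it is possible to consider a monic matrix polynomial,

$$\bar{C}(\lambda) = C_l^{-1} C(\lambda) \qquad (12)$$

with coefficient matrices $\bar{C}_i = C_l^{-1} C_i$, $i = 0, \ldots, l-1$. Thus, Equation (10) can be transformed to

$$A y = \lambda y, \qquad (13)$$

where

$$A = \begin{bmatrix} 0 & I & 0 & \ldots & 0 \\ 0 & 0 & I & \ldots & 0 \\ \ldots & \ldots & \ldots & \ldots & \ldots \\ -\bar{C}_0 & -\bar{C}_1 & -\bar{C}_2 & \ldots & -\bar{C}_{l-1} \end{bmatrix} \qquad (14)$$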
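The monic linearization of Equations (12) to (14) can be illustrated with a short NumPy sketch. It assumes that $C_l$ is nonsingular and well conditioned, as stated above; the function name `solve_pep_monic` and the scalar example are hypothetical and serve only to show the block structure of $A$.

```python
import numpy as np

def solve_pep_monic(coeffs):
    """Solve (lambda^l C_l + ... + lambda C_1 + C_0) v = 0 via Equations (12)-(14).

    coeffs = [C_0, C_1, ..., C_l], each n x n, with C_l assumed nonsingular.
    Returns (eigenvalues, V); column j of V is the vector v paired with
    eigenvalues[j].
    """
    *lower, C_l = coeffs
    l, n = len(lower), C_l.shape[0]
    # Monic coefficients C_bar_i = C_l^{-1} C_i (solve instead of inverting).
    C_bar = [np.linalg.solve(C_l, C_i) for C_i in lower]
    A = np.zeros((l * n, l * n))
    A[:-n, n:] = np.eye((l - 1) * n)   # identity blocks above the diagonal
    A[-n:, :] = -np.hstack(C_bar)      # last block row of Equation (14)
    eigenvalues, Y = np.linalg.eig(A)
    return eigenvalues, Y[:n, :]       # y = [v, lambda v, ...]: v is the top block

# Scalar check: lambda^2 - 3 lambda + 2 = 0 has the roots 1 and 2.
roots, _ = solve_pep_monic([np.array([[2.0]]), np.array([[-3.0]]), np.array([[1.0]])])
```

Using `np.linalg.solve` rather than an explicit inverse is a design choice of this sketch; it avoids forming $C_l^{-1}$ and is usually the more accurate option.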

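In some cases, the matrix $C_l$ is singular while the matrix $C_0$ is regular and well conditioned. Then, either the described method, which transforms the PEP (10) to the GEP (11), or the transformation $\beta = 1/\lambda$ can be used. In the latter case, $\bar{C}_i = C_0^{-1} C_i$, $i = 1, \ldots, l$, and the matrix $A$ gets the form

$$A = \begin{bmatrix} 0 & I & 0 & \ldots & 0 \\ 0 & 0 & I & \ldots & 0 \\ \ldots & \ldots & \ldots & \ldots & \ldots \\ -\bar{C}_l & -\bar{C}_{l-1} & -\bar{C}_{l-2} & \ldots & -\bar{C}_1 \end{bmatrix} \qquad (15)$$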

The transformation $\beta = 1/\lambda$ reduces the problem to finding the eigenvalues of the matrix $A$ (Kukelova et al., 2012). There are many efficient numerical algorithms for solving the GEP, such as the QZ algorithm (Datta, 2010).
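The following short derivation is added as a reading aid and is not quoted from the cited works. It sketches why the matrix of Equation (15) has the reciprocals $\beta = 1/\lambda$ as eigenvalues, assuming $\lambda \neq 0$ and $C_0$ nonsingular, with $y = [v, \beta v, \ldots, \beta^{l-1} v]^{T}$.

```latex
\begin{align*}
% Divide Equation (10) by \lambda^{l} and set \beta = 1/\lambda:
(C_l + \beta C_{l-1} + \ldots + \beta^{l-1} C_1 + \beta^{l} C_0)\, v &= 0 \\
% Multiply by C_0^{-1}, with \bar{C}_i = C_0^{-1} C_i:
\beta^{l} v &= -(\bar{C}_l + \beta \bar{C}_{l-1} + \ldots + \beta^{l-1} \bar{C}_1)\, v \\
% Apply A from Equation (15) to y block by block:
A y = \begin{bmatrix} \beta v \\ \beta^{2} v \\ \vdots \\
  -\bar{C}_l v - \beta \bar{C}_{l-1} v - \ldots - \beta^{l-1} \bar{C}_1 v \end{bmatrix}
&= \begin{bmatrix} \beta v \\ \beta^{2} v \\ \vdots \\ \beta^{l} v \end{bmatrix} = \beta\, y
\end{align*}
```

Eigenvalues $\beta = 0$ of $A$ correspond to $\lambda$ at infinity, which is how a singular $C_l$ shows up in this formulation.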

4 PEP SOLUTION FOR VISUAL ODOMETRY

As mentioned before, the five-point algorithm (Nister, 2004) is one of the most common approaches to the Visual Odometry problem. It has become the standard for monocular motion estimation in the presence of outliers (Scaramuzza and Fraundorfer, 2011).

Kukelova and colleagues (Kukelova et al., 2012) propose a solution based on PEP for relative pose problems using part of the five-point algorithm, and their formulation is used in this paper for the monocular visual odometry problem.

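The five-point algorithm requires $m \geq 5$, with $m$ being the number of point correspondences between two consecutive images. Each point correspondence gives rise to a constraint of the form (4). This constraint can also be written as

$$\tilde{q}^{T} \tilde{E} = 0 \qquad (16)$$

where

$$\tilde{q} \equiv [q_1 q'_1,\; q_2 q'_1,\; q_3 q'_1,\; q_1 q'_2,\; q_2 q'_2,\; q_3 q'_2,\; q_1 q'_3,\; q_2 q'_3,\; q_3 q'_3]^{T}$$

$$\tilde{E} = [e_{11},\; e_{12},\; e_{13},\; e_{21},\; e_{22},\; e_{23},\; e_{31},\; e_{32},\; e_{33}]^{T}$$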

By stacking the vectors $\tilde{q}^{T}$ for five points, a $5 \times 9$ matrix is obtained. The null space of this matrix is spanned by four vectors, which can be found by SVD or QR decomposition. From these vectors one can create four corresponding $3 \times 3$ matrices $E_1$, $E_2$, $E_3$ and $E_4$. The essential matrix $E$ can be constructed as a linear combination of these matrices, as shown by Equation (17).

$$E = x E_1 + y E_2 + z E_3 + E_4 \qquad (17)$$

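for some scalars $x$, $y$ and $z$. Now, the rank constraint (5) and the trace constraint (6) are used to build 10 third-order polynomial equations in three unknowns and 20 monomials. These equations can be written as:

$$M X = 0 \qquad (18)$$

where $M$ is a $10 \times 20$ coefficient matrix reduced by Gauss-Jordan elimination and

$$X = [x^3, yx^2, y^2x, y^3, zx^2, zyx, zy^2, z^2x, z^2y, z^3, x^2, yx, y^2, zx, zy, z^2, x, y, z, 1]^{T}$$

is the vector of all monomials in three unknowns up to degree three. At this point, Equation (10) must be recovered: taking $\lambda = z$, Equation (18) can be written as

$$(z^3 C_3 + z^2 C_2 + z C_1 + C_0)\, v = 0 \qquad (19)$$

where $v$ is the vector of monomials $v = [x^3, x^2y, xy^2, y^3, x^2, xy, y^2, x, y, 1]^{T}$ and $C_3$, $C_2$, $C_1$ and $C_0$ are $10 \times 10$ coefficient matrices given by:

$$C_3 = [0 \;\; 0 \;\; 0 \;\; 0 \;\; 0 \;\; 0 \;\; 0 \;\; 0 \;\; 0 \;\; m_{10}]$$

$$C_2 = [0 \;\; 0 \;\; 0 \;\; 0 \;\; 0 \;\; 0 \;\; 0 \;\; m_8 \;\; m_9 \;\; m_{16}]$$

$$C_1 = [0 \;\; 0 \;\; 0 \;\; 0 \;\; m_5 \;\; m_6 \;\; m_7 \;\; m_{14} \;\; m_{15} \;\; m_{19}]$$

$$C_0 = [m_1 \;\; m_2 \;\; m_3 \;\; m_4 \;\; m_{11} \;\; m_{12} \;\; m_{13} \;\; m_{17} \;\; m_{18} \;\; m_{20}]$$

where $m_j$ is the $j$th column of $M$.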
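Going back to the first step of this construction, the null-space basis $E_1, \ldots, E_4$ used in Equation (17) can be sketched with an SVD as follows. The function names `epipolar_row` and `nullspace_basis` are hypothetical, and the sketch assumes normalized homogeneous image points $q$ and $q'$ and a $5 \times 9$ matrix of full row rank.

```python
import numpy as np

def epipolar_row(q, q_prime):
    """Row q~^T of Equation (16) for one correspondence (q in I_{k-1}, q' in I_k)."""
    # np.outer(q', q)[i, j] = q'_i q_j; flattening row by row pairs each entry
    # with E~ = [e11, e12, e13, e21, ..., e33].
    return np.outer(q_prime, q).ravel()

def nullspace_basis(qs, q_primes):
    """Return four 3x3 matrices spanning the null space of the 5x9 matrix."""
    Q = np.array([epipolar_row(q, qp) for q, qp in zip(qs, q_primes)])  # 5 x 9
    _, _, Vt = np.linalg.svd(Q)        # full SVD: Vt is 9 x 9
    # The last four right singular vectors span the null space when rank(Q) = 5.
    return [Vt[i].reshape(3, 3) for i in range(5, 9)]
```

Each returned matrix satisfies the epipolar constraint (4) for all five correspondences, so any linear combination of them does as well; the constraints (5) and (6) are what single out the valid essential matrices among these combinations.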

The rank of the matrix $C_3$ is one and the matrix $C_0$ is regular. Then, the transformation $\beta = 1/z$, presented in Section 3, is possible and it reduces the cubic PEP (19) to the problem of finding the eigenvalues of the $30 \times 30$ matrix $A$ of Equation (20).

$$A = \begin{bmatrix} 0 & I & 0 \\ 0 & 0 & I \\ -C_0^{-1} C_3 & -C_0^{-1} C_2 & -C_0^{-1} C_1 \end{bmatrix} \qquad (20)$$

From (20), 30 eigenvalues can be obtained, which are solutions for $\beta = 1/z$, together with 30 corresponding eigenvectors, whose leading block is $v$ and from which the solutions for $x$ and $y$ are extracted. Hence, the essential matrix can be estimated by Equation (17).
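The construction of Equations (19) and (20) can be sketched in NumPy as below. This is only an illustration under stated assumptions: it takes an already computed $10 \times 20$ matrix $M$ as input, it keeps only real eigenvalues (since $x$, $y$ and $z$ are real scalars), and the tolerance `tol` used to discard $\beta \approx 0$ is an arbitrary choice that may need tuning. The names `COLUMNS`, `coefficient_matrix` and `solve_five_point_pep` are hypothetical.

```python
import numpy as np

# 1-based column indices of M that fill C_3, C_2, C_1 and C_0 (0 marks a zero
# column), following the monomial orderings of X and v given in the text.
COLUMNS = {
    3: [0, 0, 0, 0, 0, 0, 0, 0, 0, 10],
    2: [0, 0, 0, 0, 0, 0, 0, 8, 9, 16],
    1: [0, 0, 0, 0, 5, 6, 7, 14, 15, 19],
    0: [1, 2, 3, 4, 11, 12, 13, 17, 18, 20],
}

def coefficient_matrix(M, degree):
    """Build the 10x10 matrix C_degree from the columns of the 10x20 matrix M."""
    C = np.zeros((10, 10))
    for col, j in enumerate(COLUMNS[degree]):
        if j:
            C[:, col] = M[:, j - 1]
    return C

def solve_five_point_pep(M, tol=1e-6):
    """Return real (x, y, z) candidates from M via the 30x30 matrix of Equation (20)."""
    C3, C2, C1, C0 = (coefficient_matrix(M, d) for d in (3, 2, 1, 0))
    A = np.zeros((30, 30))
    A[:20, 10:] = np.eye(20)                                   # identity blocks
    A[20:, :] = -np.linalg.solve(C0, np.hstack([C3, C2, C1]))  # last block row
    betas, Y = np.linalg.eig(A)
    solutions = []
    for beta, y in zip(betas, Y.T):
        v = y[:10]                      # leading block of the eigenvector
        if abs(beta) < tol or abs(beta.imag) > tol:
            continue                    # beta = 0 means z at infinity; skip complex roots
        if abs(v[9]) < tol * np.linalg.norm(v):
            continue                    # cannot normalize the monomial "1"
        v = v / v[9]                    # v = [x^3, ..., x, y, 1]
        solutions.append((v[7].real, v[8].real, (1.0 / beta).real))
    return solutions
```

Each returned triple can be substituted into Equation (17) to obtain a candidate essential matrix; choosing among the candidates is outside the scope of this sketch.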

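After that, $R$ and $t$ are recovered. Let the singular value decomposition of the essential matrix be $E \sim U \operatorname{diag}(1, 1, 0) V^{T}$, where $U$ and $V$ are chosen such that $\det(U) > 0$ and $\det(V) > 0$. Then, $t \sim t_u \equiv [u_{13}, u_{23}, u_{33}]^{T}$ and $R$ is equal to $R_a \equiv U D V^{T}$ or $R_b \equiv U D^{T} V^{T}$ (Nister, 2004).

Any combination of $R$ and $t$ satisfies the epipolar constraint (16). Therefore, four possible solutions for the transformation $T_k$ arise:

$$\begin{bmatrix} R_a & t_u \\ 0 & 1 \end{bmatrix}, \quad \begin{bmatrix} R_a & -t_u \\ 0 & 1 \end{bmatrix}, \quad \begin{bmatrix} R_b & t_u \\ 0 & 1 \end{bmatrix}, \quad \begin{bmatrix} R_b & -t_u \\ 0 & 1 \end{bmatrix}$$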

The true configuration is found by triangulating one image point for each of the possible solutions and verifying whether its coordinates in space yield a position in front of the camera. Further details of the triangulation step can be found in (Nister, 2004).
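The recovery of $R$ and $t$ and the selection among the four candidates can be sketched as follows. The matrix `D` is taken from (Nister, 2004), since it is not written out above. The linear triangulation and the test that the point lies in front of both cameras are one common way to implement the check described in the text, not necessarily the one used in the experiments; all function names are hypothetical.

```python
import numpy as np

# Matrix D as given in Nister (2004).
D = np.array([[0.0, 1.0, 0.0], [-1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])

def decompose_essential(E):
    """Return (R_a, R_b, t_u) from E ~ U diag(1, 1, 0) V^T."""
    U, _, Vt = np.linalg.svd(E)
    # Flip the vectors paired with the zero singular value so that
    # det(U) > 0 and det(V) > 0; U diag(1, 1, 0) V^T is unchanged by this.
    if np.linalg.det(U) < 0:
        U[:, 2] *= -1
    if np.linalg.det(Vt) < 0:
        Vt[2, :] *= -1
    return U @ D @ Vt, U @ D.T @ Vt, U[:, 2]

def triangulate(P1, P2, q, q_prime):
    """Linear triangulation of one correspondence, in the frame of camera k-1."""
    A = np.array([q[0] * P1[2] - P1[0], q[1] * P1[2] - P1[1],
                  q_prime[0] * P2[2] - P2[0], q_prime[1] * P2[2] - P2[1]])
    X = np.linalg.svd(A)[2][-1]        # null vector of the 4x4 system
    return X[:3] / X[3]

def select_motion(E, q, q_prime):
    """Pick the (R, t) candidate whose triangulated point has positive depth in both views."""
    R_a, R_b, t_u = decompose_essential(E)
    P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
    for R in (R_a, R_b):
        for t in (t_u, -t_u):
            P2 = np.hstack([R, t[:, None]])
            X = triangulate(P1, P2, q, q_prime)
            if X[2] > 0 and (R @ X + t)[2] > 0:
                return R, t
    return None                        # no candidate passed (e.g. noisy data)
```

Note that `t` is recovered only up to scale (here with unit norm), which reflects the scale ambiguity of monocular VO mentioned for Equation (2).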

5 PRELIMINARY EXPERIMENTS

This section shows some preliminary steps imple-

mented to validate the proposed framework of VO

computation in terms of accuracy.

5.1 Image Capturing and Feature Detection

For these first experiments, the KITTI dataset (Geiger et al., 2012) was used to provide the sequence of images and the ground truth of a real vehicle. The FAST (Features from Accelerated Segment Test) algorithm (Rosten et al., 2010) was applied to the images in order to identify corners in consecutive images.

Figures 3 and 4 show features detected in consec-

utive images from the KITTI dataset.

Figure 3: Corners detected from the first image taken at time $k-1$.

Figure 4: Corners detected from the second image taken at time $k$.

5.2 Feature Matching

In this step, the FLANN (Fast Library for Approximate Nearest Neighbors) algorithm (Muja and Lowe, 2009) was used to match the detected corners. Figure 5 illustrates a sample of the matches.

Figure 5: Matching example.
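FLANN performs approximate nearest-neighbour search over feature descriptors. As a rough illustration of the underlying matching idea only, the sketch below uses an exact brute-force search with a mutual-consistency check in NumPy. It is not FLANN, and the descriptor arrays and the function name `match_mutual_nearest` are hypothetical, since the text does not state which descriptor was computed at the FAST corners.

```python
import numpy as np

def match_mutual_nearest(desc_a, desc_b):
    """Return pairs (i, j) where desc_a[i] and desc_b[j] are each other's nearest neighbour.

    desc_a and desc_b are float arrays of shape (num_features, descriptor_length).
    """
    # Pairwise squared Euclidean distances, shape (len(desc_a), len(desc_b)).
    diff = desc_a[:, None, :] - desc_b[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)
    nearest_b = dist.argmin(axis=1)    # best match in b for each descriptor of a
    nearest_a = dist.argmin(axis=0)    # best match in a for each descriptor of b
    return [(i, int(j)) for i, j in enumerate(nearest_b) if nearest_a[j] == i]
```

The exact search costs time proportional to the product of the two feature counts, which is the cost that approximate methods such as FLANN are designed to reduce.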

5.3 VO Experiment

An experiment was performed in order to compare the traditional implementation of VO using Nister's five-point algorithm (Nister, 2004) with the implementation based on PEP, following the modeling proposed by Kukelova and colleagues (Kukelova et al., 2012).

Figure 6 presents a monocular VO process performed with a set of 400 images from the KITTI dataset. The figure shows that the trajectory obtained with the PEP solution is closer to the ground truth line, indicating that, in this experiment, the PEP solution is more accurate than the VO based on Nister's five-point algorithm. This is attributed to the numerical robustness yielded by the PEP solution.

Figure 6: Comparative experiment with VO based on Nister's five-point algorithm and the PEP solution. Axes: X coordinates [m] and Y coordinates [m]; curves: ground truth, Nister's 5pts, PEP VO.
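The comparison in Figure 6 is visual. One possible way to quantify how close an estimated path is to the ground truth is the root-mean-square position error sketched below. This metric is an assumption of this sketch and is not the evaluation reported in the paper; it presumes that both trajectories are sampled at the same frames and expressed in the same coordinate frame and scale. The function name `trajectory_rmse` is hypothetical.

```python
import numpy as np

def trajectory_rmse(estimated_xy, ground_truth_xy):
    """Root-mean-square distance between corresponding trajectory positions."""
    est = np.asarray(estimated_xy, dtype=float)
    gt = np.asarray(ground_truth_xy, dtype=float)
    if est.shape != gt.shape:
        raise ValueError("trajectories must have the same shape")
    errors = np.linalg.norm(est - gt, axis=1)   # per-frame position error
    return float(np.sqrt(np.mean(errors ** 2)))

# Toy example: one of three positions is off by 1 m.
toy_error = trajectory_rmse([[0, 0], [1, 1], [2, 0]], [[0, 0], [1, 0], [2, 0]])
```

A lower value for one method than for the other on the same sequence would support, in numbers, the visual impression given by the plot.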

6 CONCLUSION

This paper presented a monocular Visual Odometry (VO) computing solution based on the Polynomial Eigenvalue Problem for the five-point algorithm. VO algorithms involve several steps of heavy numerical computation, which generate numerical errors such as overflow, underflow, round-off errors and error bound, and demand considerable processing time. The VO process based on the Polynomial Eigenvalue Problem (PEP) achieved more accurate motion estimation, since PEP solutions achieve more numerical robustness, as shown in a preliminary experiment.

As future work, the proposed method will be improved with a more robust optimization step, and new analyses comparing accuracy and processing time will be performed.

The final goal is to apply the VO algorithm to a quadcopter vehicle in order to supply it with an estimate of its trajectory.

ACKNOWLEDGEMENTS

We would like to thank CAPES for the financial support.


REFERENCES

Betcke, T., Higham, N. J., Mehrmann, V., Schröder, C., and Tisseur, F. (2013). NLEVP: A collection of nonlinear eigenvalue problems. ACM Transactions on Mathematical Software (TOMS), 39(2).

Datta, B. (2010). Numerical Linear Algebra and Applications. SIAM Publishing, second edition.

Geiger, A., Lenz, P., and Urtasun, R. (2012). Are we ready for autonomous driving? The KITTI vision benchmark suite. In Conference on Computer Vision and Pattern Recognition (CVPR).

Kukelova, Z., Bujnak, M., and Pajdla, T. (2012). Polynomial eigenvalue solutions to minimal problems in computer vision. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(7):1381–1391.

Longuet-Higgins, H. (1981). A computer algorithm for reconstructing a scene from two projections. Nature, 293(10):133–135.

Ma, Y., Soatto, S., Kosecka, J., and Sastry, S. (2004). An Invitation to 3-D Vision: From Images to Geometric Models. Springer.

Muja, M. and Lowe, D. (2009). Fast approximate nearest neighbors with automatic algorithm configuration. In International Conference on Computer Vision Theory and Application.

Murphy, R. (2000). Introduction to AI Robotics. MIT Press.

Nister, D. (2004). An efficient solution to the five-point relative pose problem. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(6):756–770.

Rosten, E., Porter, R., and Drummond, T. (2010). Faster and better: a machine learning approach to corner detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32.

Scaramuzza, D. and Fraundorfer, F. (2011). Visual odometry - Part I: The first 30 years and fundamentals. IEEE Robotics and Automation Magazine, 18(4):1–18.

Scaramuzza, D. and Fraundorfer, F. (2012). Visual odometry - Part II: Matching, robustness, optimization, and applications. IEEE Robotics and Automation Magazine, 19(2):78–90.

Siegwart, R., Nourbakhsh, I., and Scaramuzza, D. (2011). Introduction to Autonomous Mobile Robots. MIT Press, 2nd edition.

Souza, A. and Gonçalves, L. M. G. (2015). Occupancy-elevation grid: An alternative approach for robotic mapping and navigation. Robotica, Cambridge University Press, pages 1–18.
