ROBUST CAMERA CALIBRATION
A Generic, Optimization-based Approach
Stephan Rupp and Matthias Elter
Fraunhofer Institute for Integrated Circuits (IIS), Am Wolfsmantel 33, 91058 Erlangen, Germany
Keywords:
Robustness, Camera Calibration, Optimization, Genetic Algorithm, Heuristics.
Abstract:
The estimation of camera parameters is a fundamental step for many image guided applications in the industrial
and medical field, especially when the extraction of 3d information from 2d intensity images is in the focus
of a particular application. Usually, the estimation process is called camera calibration and it is performed
by taking images of a special calibration object. From these shots the image coordinates of the projected
calibration marks are extracted and the mapping from the 3d world coordinates to the 2d image coordinates is
calculated. To attain a well-suited mapping, the calibration images must satisfy certain constraints in order to
ensure that the underlying mathematical algorithms are well-posed. Thus, the quality of the estimation severely
depends on the choice of the input images. In this paper we propose a generic calibration framework that is
robust against ill-posed images as it determines the subset of images yielding the optimal model fit error with
respect to a certain quality measure.
1 INTRODUCTION
Camera calibration is an indispensable step for augmented reality
or image guided applications where quantitative information should
be derived from images. Usually, a camera calibration is obtained by
taking images of a special calibration object and extracting the
image coordinates of the projected calibration marks, which enables
the calculation of the projection from the 3d world coordinates to
the 2d image coordinates. To attain this, the calibration images must
satisfy certain constraints in order to ensure that the underlying
mathematical algorithms are well-posed. In the literature, ill-posed
setups are often referred to as singularities or degenerate
configurations (Sturm and Maybank, 1999; Zhang, 2000). Unfortunately,
in everyday calibration work, some of the acquired images yield
significant calibration errors or even originate from such ill-posed
configurations, and identifying them is rarely obvious. Hence, a
mechanism that automatically identifies such images is desirable, or
at least a calibration method that is robust with respect to these
configurations.
In our contribution, we address this problem and propose a generic
calibration framework that is robust against ill-posed configurations
because it automatically chooses images that result in low calibration
errors. The framework is generic in the sense that it is independent
of a particular calibration technique, since it is parameterized by
the applied calibration algorithm.
2 RELATED WORK AND
CONTRIBUTION
Camera calibration has been studied intensively in the past years,
starting in the photogrammetry community (McGlone and Mikhail, 1940)
and more recently in computer vision (Tsai, 1987; Sturm and Maybank,
1999; Zhang, 2000; Heikkilä and Silvén, 2000). According to Heikkilä
and Silvén (Heikkilä and Silvén, 1997), there are four main problems
when designing a calibration procedure: control point localization in
the images, camera model fitting, image correction for radial and
tangential distortion, and estimating the errors originating in these
stages. Most of the research has been devoted to model fitting, and
only few works can be found in the literature about the other stages
of the process such as feature point localization, cf. (Mateos, 2000).
In addition to this, Ouellet et al. (Ouellet and Hébert, 2004) propose
an interactive approach in order to predict improper calibration
images that feature blurred, circular calibration marks. They analyze
the images' marks with an acutance-based quality measure (Rangayyan et
al., 1997) in order to quickly reject images that suffer from static
or motion blur. This, in combination with an interactive assistant
tool for geometric camera calibration, eliminates the need to
carefully examine each of the images and thus facilitates the
calibration process.
Rupp S. and Elter M. (2007).
ROBUST CAMERA CALIBRATION - A Generic, Optimization-based Approach.
In Proceedings of the Second International Conference on Computer Vision Theory and Applications - IFP/IA, pages 61-68
Copyright © SciTePress
Concerning the identification of degenerate configurations, the
literature neglects the problem of an automatic image selection that
determines the images that are likely to result in small model fit
errors. However, this is an important topic since ill-posed
configurations can negatively influence the overall calibration
procedure and thus lead to significant errors, just as poor image
quality does.
Hence, we propose a generic and extensible calibration framework that
is robust against singularities. The framework is based on discrete
optimization techniques and makes use of local search methods in order
to reject image sets that are likely to contain images from degenerate
configurations. Since local search methods require an initial start
solution, we analyze the use of Genetic Algorithms in comparison with
our random sampling method suggested in (Rupp et al., 2006). In
addition, the framework is extensible, so that e.g. the acutance-based
blur detection of Ouellet et al. can be easily integrated in order to
automatically exclude images that feature blurred calibration marks
from the calibration, and thus improve the robustness of camera
calibration against poor image quality and singularities.
3 METHODS
In general, calibration is the problem of estimating
values for the unknown parameters in a sensor model
in order to determine the exact mapping between sen-
sor input and output.
The calibration of an imaging device is usually performed by observing
a special calibration object, which is in most cases a flat plate with
a regular pattern marked on it using colors that cause a high contrast
between the marks and the background. The pattern is chosen such that
the image coordinates of the projected reference points can be
measured with high accuracy. Once the relationship between the 2d
image coordinates and 3d world coordinates is known, the
transformation C of the visual system can be estimated. In practice, a
set of n observations (input images)
$\mathcal{I} = \{\iota_1, \iota_2, \ldots, \iota_n\}$ is considered,
where some of the acquired images may originate from ill-posed
configurations. Typically, singularities are seldom known beforehand,
so that neither considering all n images nor a human-made subset
selection will in general yield the optimal calibration result,
particularly for non-expert users.
We present a robust calibration framework that applies optimization
techniques in order to automatically determine the optimal subset out
of the pool of acquired images, i.e. the subset yielding the best
calibration result with respect to a quality measure.
3.1 Optimization Terminology
The term optimization refers to the study of problems in which one
seeks to minimize or maximize a real function
$\varphi : S \to \mathbb{R}$ by systematically choosing the values of
the variables from within an allowed set $S$. Typically, the function
$\varphi$ is called the objective function, its domain $S$ is the
solution space, and an element of $S$ is referred to as a solution or
state. The optimization pursues minimization or maximization of
$\varphi$, that is, it identifies the globally optimal solution
$x_{opt} \in S$ such that $\varphi(x_{opt}) \le \varphi(x)$
(minimization) or $\varphi(x) \le \varphi(x_{opt})$ (maximization) for
all $x \in S$. Frequently, there are some auxiliary conditions defined
on $S$ that reduce the solution space. These constraints are usually
described by a predicate $P$ defined on $S$ or by a set of
(in)equalities. The solutions satisfying these constraints are called
feasible solutions and define the set of feasible solutions
$\{x \in S : P(x)\}$.
If $S$ is countable and finite, the optimization problem is named a
discrete optimization problem. If additionally $S \subseteq 2^B$
holds, with $B$ being a certain basic set, the optimization problem is
called a combinatorial optimization problem (Lee, 2004).
3.2 Modelling
The framework makes use of optimization and thus requires a
formulation of the image selection task that is suitable for the
application of optimization techniques. Hence, we assume the n
elements of the calibration image set $\mathcal{I}$ to be (partially)
ordered by an arbitrary relation. We identify the element at position
i (the i-th image $\iota_i$) with the i-th unit vector
$$\{\iota_i\} \mapsto e_i = (\underbrace{0 \;\; 0 \;\; \ldots \;\; \overset{i}{1} \;\; \ldots \;\; 0 \;\; 0}_{n})^T, \quad i = 1, \ldots, n,$$
and model a certain subset by the coordinate vector
$x = (x_1 \ldots x_n)^T$, $x_j \in \{0, 1\}$, for example
$$x = (0 \; 1 \; \ldots \; 0 \; \ldots \; 1)^T = 0 e_1 + 1 e_2 + \ldots + 0 e_k + \ldots + 1 e_n.$$
VISAPP 2007 - International Conference on Computer Vision Theory and Applications
Here, $x_j = 1$ denotes that the j-th image is contained in the
corresponding subset. With this modelling, the image selection is
equivalent to the combinatorial optimization problem (cf. Sec. 3.1):
$$x_{opt} = \arg\min_{x \in S} \varphi(x), \qquad (1)$$
where the solution space $S = \{0,1\}^n$ follows from the coordinate
vector representation.
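The coordinate vector representation can be illustrated with a minimal Python sketch. The function names `subset_to_vector` and `vector_to_subset` are hypothetical and not part of the framework; images are assumed to be numbered from 1 to n as in the text.

```python
from typing import List, Sequence


def subset_to_vector(selected: Sequence[int], n: int) -> List[int]:
    """Encode a set of 1-based image indices as a 0/1 vector of length n."""
    x = [0] * n
    for i in selected:
        if not 1 <= i <= n:
            raise ValueError(f"image index {i} outside 1..{n}")
        x[i - 1] = 1  # x_j = 1 means: image j is part of the subset
    return x


def vector_to_subset(x: Sequence[int]) -> List[int]:
    """Decode a 0/1 coordinate vector back into 1-based image indices."""
    return [j + 1 for j, x_j in enumerate(x) if x_j == 1]


# Selecting images 2 and 5 out of n = 5 images.
x = subset_to_vector([2, 5], n=5)
print(x)                    # [0, 1, 0, 0, 1]
print(vector_to_subset(x))  # [2, 5]
```

With this encoding, every element of {0,1}^n corresponds to exactly one image subset, so the search space has 2^n elements.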
Combinatorial optimization problems are sometimes easy to solve, i.e.
they can be solved in polynomial time, but more often, such as in this
case, polynomial-time algorithms are not known (Garey and Johnson,
1979), and one usually resorts to heuristics that are not guaranteed
to find an optimal solution (Pitsoulis and Resende, 2002) but instead
yield a so-called sub-optimal solution $x_{max}$ that is close to the
optimal solution $x_{opt}$.
3.3 Heuristics
Heuristic algorithms are based on searching the local neighbourhood of
a current solution for an improvement. Given a current solution
$x \in S$, the elements of the neighbourhood $N(x)$ of x are those
solutions that can be obtained by applying an elementary operation to
x. Local search methods start from an initial solution $x_0$ and
iteratively generate a series of improving solutions
$x_0, x_1, x_2, \ldots, x_k$.
At the k-th iteration, $N(x_k)$ is searched for an improving solution
$x_{k+1}$ such that $\varphi(x_k) < \varphi(x_{k+1})$ (maximization)
or $\varphi(x_k) > \varphi(x_{k+1})$ (minimization), respectively. If
such a solution is found, it is made the current solution. Otherwise,
the search ends with $x_k$ as a local optimum.
Since heuristic optimization strategies are based on a neighbourhood
relation N, two image selections are defined to be neighbours if they
differ by exactly one image:
$$N(x) := \{ y \in S : d(x, y) = 1 \}$$
Thus, the neighbourhood of a solution x is given by the Hamming
distance d defined on the search space $S = \{0, 1\}^n$.
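The Hamming neighbourhood can be enumerated by flipping one coordinate at a time. The following Python sketch assumes that selections are stored as tuples of 0/1 values; the names `hamming_distance` and `neighbours` are illustrative only.

```python
from typing import Iterator, Sequence, Tuple

Selection = Tuple[int, ...]


def hamming_distance(x: Sequence[int], y: Sequence[int]) -> int:
    """Number of positions in which two selections differ."""
    if len(x) != len(y):
        raise ValueError("selections must have the same length")
    return sum(a != b for a, b in zip(x, y))


def neighbours(x: Selection) -> Iterator[Selection]:
    """Yield all selections that differ from x by exactly one image."""
    for j in range(len(x)):
        # Flip coordinate j: add image j if absent, remove it if present.
        yield x[:j] + (1 - x[j],) + x[j + 1:]


x = (0, 1, 0)
print(list(neighbours(x)))  # [(1, 1, 0), (0, 0, 0), (0, 1, 1)]
```

Each solution therefore has exactly n neighbours, which keeps one local search step linear in the number of images (not counting the cost of the calibrations themselves).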
3.4 Framework
A configuration of the framework is defined by the tuple
$F = (\mathcal{I}, \alpha, \omega, \varphi, \tau)$ comprising the set
of input images $\mathcal{I}$, the calibration algorithm $\alpha$, the
optimization strategy $\omega$, the quality measure $\varphi$ and
finally a criterion $\tau$ that terminates the optimization. In
general, the optimization strategy may cover arbitrary techniques.
However, due to the special structure of Eq. (1), the choice of a
particular strategy is restricted to the class of local search methods
$\omega_{LS}$ described above.
Algorithm 1 Generic robust camera calibration
Require: $\mathcal{I}, \alpha, \varphi, \omega_{LS}, \omega_{init}, \tau$
  $x = \omega_{init}()$
  $x_{max} = x$
  while not $\tau$ do
    $y = \omega(x)$, $y : y \in N(x) \wedge P(y)$
    if neighbour y exists then
      $x = y$
      if $\varphi(\alpha(x))$ is better than $\varphi(\alpha(x_{max}))$ then
        $x_{max} = x$
      end if
    end if
  end while
  return $x_{max}$
A generic algorithm in pseudo code is depicted in Alg. 1. Because the
framework is limited to heuristic algorithms, it features an
additional parameter $\omega_{init}$ (initialization) that is
responsible for finding a suitable start solution. Typically, the
termination criterion $\tau$ is given by the maximum number of
iterations $k_{max}$, a convergence term measuring the relative
improvement, or a combination of both.
Before entering the optimization loop, the initialization strategy
$\omega_{init}$ is applied in order to find a feasible solution that
acts as the start solution and that is made the current (sub)optimal
solution $x_{max}$. Then, the loop is entered and repeated until the
termination criterion is met. For the current solution x, an
improving, feasible solution y from within the current solution's
neighbourhood $N(x)$ is identified by the local search method
$\omega$. If such a solution exists, the visual system's parameters
are estimated with the calibration algorithm $\alpha$ and the images
represented by y. Subsequently, the calibration result is compared
with the current optimal solution using the quality measure $\varphi$.
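The loop of Alg. 1 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the termination criterion is reduced to an iteration limit `max_iter`, the quality measure is minimized, and the search stops as soon as the local search method returns no neighbour (the local optimum of Sec. 3.3). All `toy_*` functions are hypothetical stand-ins for a real calibration algorithm, quality measure and local search method.

```python
from typing import Callable, Optional, Tuple

Selection = Tuple[int, ...]


def robust_calibration(
    init: Callable[[], Selection],                     # omega_init
    step: Callable[[Selection], Optional[Selection]],  # omega
    calibrate: Callable[[Selection], object],          # alpha
    quality: Callable[[object], float],                # phi, lower is better
    max_iter: int = 100,                               # tau as iteration limit
) -> Tuple[Selection, float]:
    x = init()
    x_max, best = x, quality(calibrate(x))
    for _ in range(max_iter):
        y = step(x)
        if y is None:      # no improving feasible neighbour: local optimum
            break
        x = y
        score = quality(calibrate(x))
        if score < best:   # "better" means smaller error in this sketch
            x_max, best = x, score
    return x_max, best


# Toy stand-ins: images 1 and 3 (0-based) play the role of ill-posed views.
BAD = (1, 3)


def toy_calibrate(x: Selection) -> Selection:
    return x  # a real alpha would return the estimated camera parameters


def toy_quality(x: Selection) -> float:
    k = sum(x)
    if k < 2:              # fewer than two views is treated as infeasible
        return float("inf")
    return 10.0 * sum(x[j] for j in BAD) + 1.0 / k


def toy_step(x: Selection) -> Optional[Selection]:
    """Basic downhill move: return the first improving neighbour, if any."""
    for j in range(len(x)):
        y = x[:j] + (1 - x[j],) + x[j + 1:]
        if toy_quality(y) < toy_quality(x):
            return y
    return None


x_max, err = robust_calibration(
    lambda: (1, 1, 1, 1, 1), toy_step, toy_calibrate, toy_quality
)
print(x_max)  # (1, 0, 1, 0, 1)
```

Passing the four components as callables mirrors the parameterization of the framework: exchanging the calibration algorithm or the search strategy does not change the loop itself.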
4 APPLICATION
In the following we demonstrate the use of the framework with common
choices for the camera calibration algorithm $\alpha$, the objective
function $\varphi$ and, as an example, a variant of the standard
downhill search method $\omega_{SD}$. The described configurations
simultaneously serve as setups for the subsequent experiments and
results section.
4.1 Initialization
As mentioned above (Sec. 3), heuristic optimization
starts from a given initial solution $x_0$. Due to the
bration with the images x:
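$$\varphi_{BPE_{2d}}(P) = \frac{1}{nm} \sum_{i=1}^{n} \sum_{j=1}^{m} \underbrace{\left\| \begin{pmatrix} u_{ij} \\ v_{ij} \\ 1 \end{pmatrix} - P \cdot \begin{pmatrix} X_{ij} \\ Y_{ij} \\ Z_{ij} \\ 1 \end{pmatrix} \right\|_2}_{\varepsilon_{ij}}$$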
The projection error of a single calibration feature
$\varepsilon_{ij}$ is given by the Euclidean distance between its
initially extracted image coordinates $(u_{ij}, v_{ij})^T$ and the
corresponding 3d world coordinates $(X_{ij}, Y_{ij}, Z_{ij})^T$ being
projected to the image plane with the projection matrix P acquired by
the calibration procedure.
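The mean back-projection error can be computed along the following lines. This Python sketch assumes a 3x4 projection matrix given as nested lists and flattens all n·m features into one list. It also assumes that the projected homogeneous point is normalized by its third coordinate before the distance is taken, which the formula leaves implicit. The function names are hypothetical.

```python
import math
from typing import Sequence, Tuple


def project(P: Sequence[Sequence[float]],
            X: Sequence[float]) -> Tuple[float, float]:
    """Project a 3d world point with a 3x4 projection matrix P."""
    Xh = (X[0], X[1], X[2], 1.0)  # homogeneous world coordinates
    u, v, w = (sum(P[r][c] * Xh[c] for c in range(4)) for r in range(3))
    return u / w, v / w           # perspective division (assumed)


def mean_backprojection_error(P, world_pts, image_pts) -> float:
    """Mean Euclidean distance between extracted and projected points."""
    if len(world_pts) != len(image_pts) or not world_pts:
        raise ValueError("need equally many, at least one, point pairs")
    total = 0.0
    for X, (u, v) in zip(world_pts, image_pts):
        pu, pv = project(P, X)
        total += math.hypot(u - pu, v - pv)  # epsilon_ij
    return total / len(world_pts)


# Toy projection that maps (X, Y, Z) to (X, Y).
P = [[1, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 0, 1]]
world = [(1.0, 2.0, 0.0), (3.0, 4.0, 0.0)]
print(mean_backprojection_error(P, world, [(1.0, 2.0), (3.0, 4.0)]))  # 0.0
print(mean_backprojection_error(P, world, [(4.0, 6.0), (6.0, 8.0)]))  # 5.0
```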
In order to improve the calibration process, we propose to combine the
previous error measure $\varphi_{BPE_{2d}}$ with a term that assesses
the spatial error when reconstructing 3d points from 2d image point
correspondences by means of the calibrated projection matrix P. For
this, we incorporate the regression error $\varepsilon_{PlaneFitError}$
with respect to a plane that has been fitted into the intersection
points of back projected rays by means of the projection matrix P:
$$\varphi_{BPE} := \varphi_{BPE_{2d}} + \varepsilon_{PlaneFitError} \qquad (2)$$
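How the plane fit error is computed is not detailed in the text. The following Python sketch is therefore only one possible reading: it fits a plane z = a·x + b·y + c to the reconstructed 3d points by ordinary least squares and takes the root mean square residual as the regression error. The choice of the RMS residual, the assumption that the plane is not parallel to the z-axis, and all function names are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point3 = Tuple[float, float, float]


def _det3(m: Sequence[Sequence[float]]) -> float:
    """Determinant of a 3x3 matrix."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def plane_fit_error(points: Sequence[Point3]) -> float:
    """RMS residual of a least-squares plane z = a*x + b*y + c."""
    n = len(points)
    if n < 3:
        raise ValueError("at least three points are required")
    sx = sum(p[0] for p in points)
    sy = sum(p[1] for p in points)
    sz = sum(p[2] for p in points)
    sxx = sum(p[0] * p[0] for p in points)
    syy = sum(p[1] * p[1] for p in points)
    sxy = sum(p[0] * p[1] for p in points)
    sxz = sum(p[0] * p[2] for p in points)
    syz = sum(p[1] * p[2] for p in points)
    A = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]  # normal equations
    rhs = [sxz, syz, sz]
    det = _det3(A)
    if abs(det) < 1e-12:
        raise ValueError("degenerate point configuration")
    coeffs: List[float] = []
    for k in range(3):  # Cramer's rule: replace column k by the right side
        Ak = [row[:] for row in A]
        for r in range(3):
            Ak[r][k] = rhs[r]
        coeffs.append(_det3(Ak) / det)
    a, b, c = coeffs
    residuals = [p[2] - (a * p[0] + b * p[1] + c) for p in points]
    return math.sqrt(sum(r * r for r in residuals) / n)


def combined_error(phi_2d: float, eps_plane: float) -> float:
    """Eq. (2): sum of the 2d back-projection error and the plane fit error."""
    return phi_2d + eps_plane


# Points lying exactly on the plane z = 2x + 3y + 1 give a vanishing error.
on_plane = [(0, 0, 1), (1, 0, 3), (0, 1, 4), (1, 1, 6), (2, 1, 8)]
print(plane_fit_error(on_plane) < 1e-9)  # True
```

Note that Eq. (2) adds a pixel-based term and a spatial term without weighting, so the relative influence of the two terms depends on the units of the world coordinates.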
Due to the fact that this error function $\varphi_{BPE}$ calculates
the projection errors as well as the out-of-plane errors for all the
images of the initial calibration image set $\mathcal{I}$, it can be
used as an indicator for singularities. The smaller the value of
$\varphi_{BPE}$ is, the better the calibrated parameters fit the
model. Thus, the optimization pursues minimization of
$\varphi_{BPE} : \{0,1\}^n \to \mathbb{R}_0^+$ in order to identify
the best image subset, whereas a huge value of $\varphi_{BPE}$
indicates that the image subset contains a singularity.
4.4 Optimization Strategy
As a representative of the vast number of local search algorithms, we
exemplarily consider a variant of the simple and common downhill
heuristic. Both the basic version and the variant make use of a
conceptual skier that constantly moves downhill in the value
landscape. For this, the basic version just seeks a neighbour with an
equal or better solution. Thus, it chooses a deterministically or
stochastically determined neighbour $x_{k+1} \in N(x_k)$ that yields a
smaller back-projection error than the current solution $x_k$:
$$\varphi_{BPE}(x_{k+1}) < \varphi_{BPE}(x_k)$$
Table 1: Comparison of human-made selections with the global optimum
$x_{opt}$, the selection of all images and the proposed methods. The
mean projection error is given in pixels and calculated with respect
to the whole input image set. The bold values determine the best
result within a group.

Method                              Average    # Img.   Std.Dev.
Expert 1                            0.179088   8        ./.
Expert 2                            0.180657   6        ./.
Expert 3                            0.178398   10       ./.
Expert 4                            0.178818   18       ./.
Expert 5                            0.182151   4        ./.
Expert 6                            0.178776   11       ./.
Expert 7                            0.178678   7        ./.
Expert 8                            0.178643   9        ./.
All                                 28.2622    20       ./.
$\omega_{init,MCM}$                 0.178386   ./.      1.10e-5
$\omega_{init,MCM} + \omega_{SD}$   0.178376   ./.      1.64e-5
$\omega_{init,GA}$                  0.178410   ./.      2.72e-5
$\omega_{init,GA} + \omega_{SD}$    0.178379   ./.      2.44e-5
$x_{opt}$                           0.178320   11       ./.

The steepest descent local search method $\omega_{SD}$ acts as a
stronger formulation as it always considers the best solution within
the neighbourhood. Thus, the algorithm replaces the current feasible
solution $x_k$ with a new solution $x_{k+1}$ according to:
$$x_{k+1} = \arg\min_{y \in N(x_k)} \varphi_{BPE}(y) \quad \text{with} \quad \varphi_{BPE}(x_{k+1}) < \varphi_{BPE}(x_k).$$
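A steepest descent step over the Hamming neighbourhood can be sketched in Python as follows. The sketch assumes that the objective is minimized and that a selection needs at least `min_images = 2` views to be feasible, in line with the minimum of two views required by the Zhang method. The objective `toy_phi` is a hypothetical stand-in for an actual calibration followed by the evaluation of the back-projection error.

```python
from typing import Callable, Optional, Tuple

Selection = Tuple[int, ...]


def steepest_descent_step(
    x: Selection,
    phi: Callable[[Selection], float],
    min_images: int = 2,
) -> Optional[Selection]:
    """Return the best strictly improving feasible neighbour, or None."""
    best, best_val = None, phi(x)
    for j in range(len(x)):
        y = x[:j] + (1 - x[j],) + x[j + 1:]  # flip image j
        if sum(y) < min_images:              # infeasible selection
            continue
        val = phi(y)
        if val < best_val:
            best, best_val = y, val
    return best


def steepest_descent(
    x0: Selection,
    phi: Callable[[Selection], float],
    max_iter: int = 1000,
) -> Selection:
    """Iterate steepest descent steps until a local optimum is reached."""
    x = x0
    for _ in range(max_iter):
        y = steepest_descent_step(x, phi)
        if y is None:
            return x
        x = y
    return x


def toy_phi(x: Selection) -> float:
    # Image 0 plays the role of an ill-posed view with a large error.
    return 5.0 * x[0] + 1.0 / sum(x)


print(steepest_descent((1, 1, 1, 1), toy_phi))  # (0, 1, 1, 1)
```

In contrast to the basic downhill version, each step evaluates all n neighbours, so one step costs n calibrations but always takes the largest available improvement.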
5 EXPERIMENTS AND RESULTS
For an evaluation of our approach, we asked different persons with a
background in computer vision to calibrate cameras and compared their
calibration results with those obtained with the proposed methods. The
experts calibrated several cameras of different resolutions and
manufacturers, each from n = 20 images of a 14-by-10 checkerboard
(with m = 117 calibration marks), some of which originated from
ill-posed configurations. In order to assess the experts' overall
performance, the globally optimal solution over all image subsets was
determined too. For this, an exhaustive search of the search space was
performed, and the minimum projection errors for the configurations
that consist of two images, three images and so on up to 20 images
were identified. Configurations start with two images because the
Zhang method requires at least two different views of the calibration
pattern (see Sec. 4.2). In contrast to taking the minimum number of
images, considering all images corresponds to the procedure typically
pursued in everyday calibration work. The exhaustive search
procedure has found the global optimum of 0.178320
the global optimum as well as with human-made decisions show that the
calibration can be significantly improved by automated image
selection. Even though one of the image sets manually selected by one
of the experts performed almost as well as the proposed automatic
methods, this is not the case in general.
Considering the selection algorithms, the Monte Carlo initialization
method $\omega_{init,MCM}$ performs slightly better than the genetic
algorithm $\omega_{init,GA}$. Similarly, the heuristic optimization
results in lower back-projection errors with the stochastic
initialization, but at the expense of longer response times.
Regardless of which initialization method is used and no matter
whether it is followed by a heuristic optimization, the calibration
result is robust with respect to outlier images, i.e. images that were
taken from ill-posed views. Hence, selecting good image sets for
camera calibration no longer requires long-standing experience or
time-consuming trial and error.
REFERENCES
Fischler, M. A. and Bolles, R. C. (1981). Random sample
consensus: A paradigm for model fitting with applications
to image analysis and automated cartography.
Comm. of the ACM, 24:381–395.
Garey, M. R. and Johnson, D. S. (1979). Computers
and Intractability - A Guide to the Theory of NP-
Completeness. W.H. Freeman and Company.
Goldberg, D. E. (1989). Genetic Algorithms in Search and
Optimization. Addison-Wesley.
Heikkilä, J. and Silvén, O. (1997). A four-step camera
calibration procedure with implicit image correction.
IEEE Conference on Computer Vision and Pattern
Recognition, pages 1106–1112.
Heikkilä, J. and Silvén, O. (2000). Geometric camera
calibration using circular control points. IEEE Transactions
on Pattern Analysis and Machine Intelligence,
22(10):1066–1077.
Lee, J. (2004). A First Course In Combinatorial Optimiza-
tion. Cambridge Press.
Mateos, G. G. (2000). A camera calibration technique using
targets of circular features. 5th Ibero-America Sympo-
sium On Pattern Recognition (SIARP).
McGlone, C. and Mikhail, E. (1940). Manual of Pho-
togrammetry. ASPRS, 1 edition.
Ouellet, J.-N. and Hébert, P. (2004). Developing assistant
tools for geometric camera calibration: Assessing the
quality of input images. In IEEE Conference on Computer
Vision and Pattern Recognition, pages 80–83.
Pitsoulis, L. S. and Resende, M. G. C. (2002). Greedy Ran-
domized Adaptive Search Procedures. In Pardalos, P.
and Resende, M., editors, Handbook of Applied Opti-
mization, pages 178–183. Oxford University Press.
Rangayyan, R. M., El-Faramawy, N. M., Desautels, J. E. L.,
and Alim, O. A. (1997). Measures of acutance and
shape for classification of breast tumors. IEEE Trans-
actions on Medical Imaging, 16(6):799–810.
Rupp, S., Elter, M., Breitung, M., Zink, W., and Küblbeck,
C. (2006). Robust Camera Calibration using Discrete
Optimization. Enformatika Transactions on Engineering,
Computing and Science, 13:250–254.
Sturm, P. and Maybank, S. (1999). On plane-based camera
calibration: A general algorithm, singularities, appli-
cations. In IEEE Conference on Computer Vision and
Pattern Recognition, pages 432–437.
Tsai, R. Y. (1987). A versatile camera calibration technique
for high-accuracy 3d machine vision metrology using
off-the-shelf tv cameras and lenses. IEEE Transactions
on Robotics and Automation, 4:323–344.
Zhang, Z. (2000). A flexible new technique for camera cal-
ibration. IEEE Transactions on Pattern Analysis and
Machine Intelligence, 22(11):1330–1334.