HANDLING REPEATED SOLUTIONS TO THE PERSPECTIVE THREE-POINT POSE PROBLEM

Michael Q. Rieck

Drake University, U.S.A.

Keywords:

P3P, Pose, Photogrammetry, Danger cylinder, Discriminant, Jacobian, Repeated solution, Double solution.

Abstract:

In the Perspective 3-Point Pose Problem (P3P), when the three reference points are equidistant from each other, this distance may be assumed to be one unit in length. A repeated solution to the problem then occurs when and only when $1 + R_1R_2 + R_2R_3 + R_3R_1 - R_1^2 - R_2^2 - R_3^2 = 0$, where $R_1$, $R_2$ and $R_3$ are the squared distances from the camera's focal point to the reference points. When the setup only approximately satisfies this equation, two nearly equal solutions can introduce substantial calculation errors. To better handle this circumstance, it may be preferable to behave as though the above equation holds precisely, and then invert a certain two-dimensional transformation to obtain the repeated solution. The inversion involves only a few basic arithmetic operations and square roots. This approach is more efficient, and more reliable, than the standard quartic equation approach to solving P3P, at least in this special case.

1 INTRODUCTION

The Perspective 3-Point Pose Problem (P3P), as in-

troduced and solved by J. A. Grunert (1841), is essen-

tially concerned with inferring the distances to three

known reference points, seen in a photograph, from

the camera that took the photograph. With this in-

formation, one can then determine the position and

orientation of the camera. Amazingly, the problem is

nearly as old as photography itself.

Traditionally its applications were restricted to ar-

eas of photogrammetry such as aerial reconnaissance.

More recently though, it has been successfully ap-

plied in electronic digital imaging to address a vari-

ety of practical problems. These include robotic con-

trol and navigation, as in (Qingxuan, et al, 2006),

as well as six-degree-of-freedom tracking for vir-

tual/augmented reality and video game applications,

as in (Chen, et al, 1998) and (Ohayon and Rivlin,

2006).

Advancements and refinements in the study of P3P were steadily made throughout the nineteenth and twentieth centuries, as for example (Merritt, 1949) and (Müller, 1925). An extensive survey of the state of P3P as of 1994 can be found in (Haralick, et al, 1994). Several recent studies have classified solutions, such as (Faugère, et al, 2008), (Gao, et al, 2003), (Wolfe, et al, 1991), (Zhang and Hu, 2005) and (Zhang and Hu, 2006).

A simplified version of P3P assumes that the distances between the three reference points are equal. Attention will be limited in the following discussion to this situation, where in fact, the measurement units will be set so as to make this distance equal one. The P3P problem assumes that the cosines of the interior angles between pairs of lines-of-sight to the reference points are known, and that one wishes to determine the distances to these points. These cosines are straightforward to calculate from the photograph (or digital image) and intrinsic camera properties. Using the Law of Cosines, the underlying mathematical problem to be solved is therefore the determination of the unknown values of $r_1, r_2, r_3$, based on the known values of $c_1, c_2, c_3$, in the following system of quadratic equations:

\[
\begin{cases}
r_2^2 + r_3^2 - 2 c_1 r_2 r_3 = 1 \\
r_3^2 + r_1^2 - 2 c_2 r_3 r_1 = 1 \\
r_1^2 + r_2^2 - 2 c_3 r_1 r_2 = 1.
\end{cases}
\tag{1}
\]

It will be convenient to set $R_j = r_j^2$ ($j = 1,2,3$), and to sometimes regard (1) as a system of equations in $R_1$, $R_2$ and $R_3$. As demonstrated by Grunert, it is possible to eliminate any two of the three unknowns, resulting in a single quartic (i.e. fourth degree) polynomial equation in the remaining $R_j$ (for $j = 1,2,3$):

\[ A R_j^4 + B R_j^3 + C R_j^2 + D R_j + E = 0. \tag{2} \]

Rieck, M. Q. (2010). Handling Repeated Solutions to the Perspective Three-Point Pose Problem. In Proceedings of the International Conference on Computer Vision Theory and Applications (VISAPP 2010), pages 395-399. DOI: 10.5220/0002790103950399. SciTePress.

The coefficients here depend on $c_1$, $c_2$ and $c_3$. The leading coefficient $A$, as well as the equation's discriminant $\Delta$, turn out (surprisingly) to be independent of $j$. Specifically, $A = 16\,T^2$ and

\[ \Delta = 16777216\,(c_1^2 - c_2^2)^2 (c_2^2 - c_3^2)^2 (c_3^2 - c_1^2)^2\, T^2 S, \tag{3} \]

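where

\[
T = 1 + 2\tau - \sigma, \qquad
S = 4(1-\tau)^2 (1+8\tau)\,T - 3\left[\,3\chi - (1+2\tau)^2\,\right]^2,
\]
\[
\sigma = c_1^2 + c_2^2 + c_3^2, \qquad \tau = c_1 c_2 c_3, \qquad \chi = c_1^2 c_2^2 + c_2^2 c_3^2 + c_3^2 c_1^2. \tag{4}
\]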
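As a numerical sanity check on (4), the following Python sketch evaluates the five quantities for a given triple of cosines. It assumes the reading of (4) in which $\sigma$ is the sum of the squared cosines, $\tau$ is their product, $\chi$ is the sum of the pairwise products of the squared cosines, and $S = 4(1-\tau)^2(1+8\tau)T - 3[3\chi - (1+2\tau)^2]^2$. The function name `invariants` is illustrative only and does not come from the article.

```python
def invariants(c1, c2, c3):
    """Return (sigma, tau, chi, T, S) for the cosines c1, c2, c3."""
    a, b, c = c1 * c1, c2 * c2, c3 * c3
    sigma = a + b + c                      # sum of squared cosines
    tau = c1 * c2 * c3                     # product of the cosines
    chi = a * b + b * c + c * a            # pairwise products of the squares
    T = 1 + 2 * tau - sigma
    S = (4 * (1 - tau) ** 2 * (1 + 8 * tau) * T
         - 3 * (3 * chi - (1 + 2 * tau) ** 2) ** 2)
    return sigma, tau, chi, T, S


if __name__ == "__main__":
    # Squared distances (2, 1, 1) give cosines (1/2, 1/sqrt(2), 1/sqrt(2)).
    print(invariants(0.5, 2 ** -0.5, 2 ** -0.5))
```

The cosines in the usage example come from squared distances $(R_1, R_2, R_3) = (2, 1, 1)$, for which $1 + R_1R_2 + R_2R_3 + R_3R_1 - R_1^2 - R_2^2 - R_3^2 = 0$, so $S$ should vanish up to roundoff.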

The computations involved here are rather tedious, and best checked using mathematical manipulation software, such as Mathematica or Maple. The polynomial $S$ appears to be irreducible. However, the story changes when the $c_j$ are expressed in terms of the $r_j$, using the following rational transformation:

\[
c_1 = \frac{r_2^2 + r_3^2 - 1}{2\, r_2 r_3}, \qquad
c_2 = \frac{r_3^2 + r_1^2 - 1}{2\, r_3 r_1}, \qquad
c_3 = \frac{r_1^2 + r_2^2 - 1}{2\, r_1 r_2}, \tag{5}
\]

obtained by solving (1) for the $c_j$ (when the $r_j$ are all nonzero). This transformation causes $S$ to factor as

\[ S = \Omega^2 H \,/\, (256\, R_1^4 R_2^4 R_3^4), \tag{6} \]

where

\[ \Omega = 1 + R_1R_2 + R_2R_3 + R_3R_1 - R_1^2 - R_2^2 - R_3^2, \tag{7} \]

and where $H$ is a rather complicated eighth degree polynomial in $R_1$, $R_2$ and $R_3$. Moreover, the Jacobian determinant of the transformation (5) is

\[ J = \Omega \,/\, (4\, r_1^3 r_2^3 r_3^3). \tag{8} \]
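The closed form for the Jacobian determinant can be checked numerically. The sketch below assumes that (5) reads $c_1 = (r_2^2 + r_3^2 - 1)/(2 r_2 r_3)$ and cyclically, that (7) reads $\Omega = 1 + R_1R_2 + R_2R_3 + R_3R_1 - R_1^2 - R_2^2 - R_3^2$, and that (8) reads $J = \Omega/(4 r_1^3 r_2^3 r_3^3)$. It compares that expression with a central-difference estimate. All function names and the step size `h` are illustrative choices, not part of the article.

```python
import math


def cosines(r1, r2, r3):
    """Transformation (5): distances -> cosines (all r_j nonzero)."""
    c1 = (r2 * r2 + r3 * r3 - 1) / (2 * r2 * r3)
    c2 = (r3 * r3 + r1 * r1 - 1) / (2 * r3 * r1)
    c3 = (r1 * r1 + r2 * r2 - 1) / (2 * r1 * r2)
    return c1, c2, c3


def omega(r1, r2, r3):
    """Quantity (7), written in terms of R_j = r_j**2."""
    R1, R2, R3 = r1 * r1, r2 * r2, r3 * r3
    return 1 + R1 * R2 + R2 * R3 + R3 * R1 - R1 * R1 - R2 * R2 - R3 * R3


def jacobian_closed_form(r1, r2, r3):
    """Jacobian determinant as read from (8)."""
    return omega(r1, r2, r3) / (4 * (r1 * r2 * r3) ** 3)


def jacobian_numeric(r1, r2, r3, h=1e-6):
    """Central-difference estimate of det d(c1,c2,c3)/d(r1,r2,r3)."""
    r = [r1, r2, r3]
    rows = []
    for k in range(3):
        up = list(r)
        dn = list(r)
        up[k] += h
        dn[k] -= h
        # rows[k][i] = d c_i / d r_k (the transpose has the same determinant)
        rows.append([(p - q) / (2 * h)
                     for p, q in zip(cosines(*up), cosines(*dn))])
    (a, b, c), (d, e, f), (g, i, j) = rows
    return a * (e * j - f * i) - b * (d * j - f * g) + c * (d * i - e * g)


if __name__ == "__main__":
    print(jacobian_closed_form(1.1, 1.3, 0.9), jacobian_numeric(1.1, 1.3, 0.9))
    # A point with Omega = 0: the Jacobian vanishes there.
    print(jacobian_closed_form(math.sqrt(2), 1.0, 1.0))
```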

Section 2 describes the singular situation that results when two solutions coalesce to form a double solution, causing $J$ to vanish. This is “singular” in the sense that transformation (5) from $(r_1, r_2, r_3)$ to $(c_1, c_2, c_3)$ becomes locally non-invertible. The principal result of this article is presented next: an efficient algorithm, called “DSA,” for handling double solutions. Section 2 also discusses the results of experiments conducted using this algorithm. Section 3 studies the transformation (5), from the $r_j$ to the $c_j$, in more detail, and lays the mathematical foundation for DSA.

A Mathematica notebook is available upon request.

2 DOUBLE SOLUTIONS

2.1 Double Solutions as Error Sources

This article is concerned with the situations where $\Omega = 0$ (hence $J = S = \Delta = 0$), and where $|\Omega|$ is sufficiently close to zero to cause trouble. Since $J$ tends to be small when $|\Omega|$ is small, small computational errors can result in large errors when computing the values of the $r_j$ from those of the $c_j$. This situation occurs when two solutions to the quadratic system (1) coalesce or nearly coalesce into a double solution. The case where $\Omega = 0$ was introduced and studied in (Smith, 1965) and (Thompson, 1966), and later considered by others, such as (Zhang and Hu, 2005) and (Zhang and Hu, 2006). It turns out that $\Omega = 0$ corresponds to having a physical setup in which the camera's focal point is on a special circular cylinder, customarily known as the “danger cylinder.”

When S = 0, it can be shown that Ω = 0 for some

solution to (1), which is thus a repeated solution. By

determining that |S| is smaller than some given toler-

ance, and then behaving as though S = 0, the (nearly)

repeated solution can be computed more efﬁciently

and reliably than would otherwise be the case. Rather

than solving Grunert’s complicated quartic polyno-

mial, or following any of several known equivalent

approaches, one only needs to follow a simple algo-

rithm, detailed in the next subsection. As will be seen,

this only requires a few basic computations, involving

nothing more complicated than square roots.

There are a couple of reasons why behaving as though $S = 0$, when $|S|$ is small, might be prudent. Imprecision in measuring the $c_j$, and/or roundoff error in computing $S$, means that it might be impossible to know for certain if $S$ is zero, positive, negative, or even non-real. Since the discriminant of the quartic polynomials involves $S$ as a factor, it is possible that two nearly equal real solutions (or a double solution) are erroneously perceived to be complex solutions instead, and thereby ignored as being physically unrealistic. Even when two nearly equal real solutions are discovered, these are likely to be rather far from the correct solutions, owing to the small value of the Jacobian determinant.

2.2 Double Solutions Algorithm (DSA)

The following algorithm has been found to be a simple way to mitigate the difficulties caused by double solutions:

1. Receive $(c_1, c_2, c_3)$ as input.

2. If necessary, negate any two of $c_1$, $c_2$ and $c_3$, so as to make $c_1 + c_2 + c_3 \ge \frac{1}{2}$. If this is not possible, then quit, indicating that there is no repeated solution.

3. Compute $\sigma$, $\tau$, $\chi$, $T$ and $S$, using formulas (4).

4. If $|S|$ is sufficiently small, then behave as though $S = 0$, and continue this algorithm; otherwise quit, indicating that there is no repeated solution.

5. Solve for $u$ and $v$, using formulas (13). These formulas uniquely determine a $u$ with $u \ge 0$, and a $v$ with $|v| \le 1$ (as can be proved).

6. Compute tentative values for $r_1$, $r_2$ and $r_3$ using formulas (10), and $r_j = \sqrt{R_j}$ ($j = 1,2,3$).

7. Compute corresponding values for $c_1$, $c_2$ and $c_3$ using formulas (5). Call these $c'_1$, $c'_2$ and $c'_3$ though.

8. Test to see whether or not swapping $c'_2$ and $c'_3$ would cause them to be closer to the values of $c_2$ and $c_3$ (from step 2). If so, then swap $r_2$ and $r_3$.

9. If any negation took place in step 2, then compensate for this by now negating a corresponding $r_1$, $r_2$ or $r_3$. Negate $r_1$ if $c_2$ and $c_3$ were negated; negate $r_2$ if $c_1$ and $c_3$ were negated; negate $r_3$ if $c_1$ and $c_2$ were negated.

10. Return the repeated solution $(r_1, r_2, r_3)$.

Note that system (1) (using altered or unaltered $c_j$) has a repeated solution if and only if $S = 0$, and except in some very special cases, a repeated solution is only a double solution. Also, “closeness” in step 8 might be decided by considering $(c'_2 - c_2)^2 + (c'_3 - c_3)^2$ versus $(c'_3 - c_2)^2 + (c'_2 - c_3)^2$, where the primed values are those computed in step 7. Although the correctness of this algorithm is not proven here, the mathematical analysis that led to it is described in Section 3. Additionally, the simulations to be discussed next attest to its correctness as well.
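The ten steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the author's implementation (which was a Mathematica notebook). It assumes that the step 2 condition is $c_1 + c_2 + c_3 \ge 1/2$, that (13) reads $(1+u)^2 = 3(1+8\tau)/(4T)$ and $(1+v)^2 = 3(1+2\tau-3c_2^2)(1+2\tau-3c_3^2)/(4(1+2\tau-3c_1^2)T)$, and that (10) carries a factor of $1/3$. The tolerance `tol` and all function names are hypothetical. Degenerate inputs for which a denominator in (13) or (5) vanishes are not handled.

```python
import math


def cosines(r1, r2, r3):
    """Transformation (5): distances -> cosines."""
    return ((r2 * r2 + r3 * r3 - 1) / (2 * r2 * r3),
            (r3 * r3 + r1 * r1 - 1) / (2 * r3 * r1),
            (r1 * r1 + r2 * r2 - 1) / (2 * r1 * r2))


def dsa(c1, c2, c3, tol=1e-9):
    """Sketch of DSA; returns (r1, r2, r3), or None if no repeated solution."""
    # Step 2: try the four sign patterns reachable by negating two cosines.
    for s in ((1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)):
        if s[0] * c1 + s[1] * c2 + s[2] * c3 >= 0.5:
            break
    else:
        return None
    c1, c2, c3 = s[0] * c1, s[1] * c2, s[2] * c3
    # Step 3: the quantities in (4).
    a, b, c = c1 * c1, c2 * c2, c3 * c3
    sigma, tau, chi = a + b + c, c1 * c2 * c3, a * b + b * c + c * a
    T = 1 + 2 * tau - sigma
    S = (4 * (1 - tau) ** 2 * (1 + 8 * tau) * T
         - 3 * (3 * chi - (1 + 2 * tau) ** 2) ** 2)
    # Step 4: continue only if S is (nearly) zero; T <= 0 is not realizable.
    if abs(S) > tol or T <= 0:
        return None
    # Step 5: inversion formulas (13), clamped against roundoff.
    q = 1 + 2 * tau
    u = max(0.0, math.sqrt(max(0.0, 3 * (1 + 8 * tau) / (4 * T))) - 1)
    ratio = 3 * (q - 3 * b) * (q - 3 * c) / (4 * (q - 3 * a) * T)
    v = min(1.0, math.sqrt(max(0.0, ratio)) - 1)
    w = math.sqrt(1 - v * v)          # tentative sign choice for sin(theta)
    # Step 6: tentative distances from (10), on the principal leg.
    k = math.sqrt(3.0) * w
    R = ((2 + u - 2 * v) / 3, (2 + u + v + k) / 3, (2 + u + v - k) / 3)
    r1, r2, r3 = (math.sqrt(max(0.0, x)) for x in R)
    # Steps 7-8: resolve the sign of w by comparing cosines.
    _, t2, t3 = cosines(r1, r2, r3)
    if (t3 - c2) ** 2 + (t2 - c3) ** 2 < (t2 - c2) ** 2 + (t3 - c3) ** 2:
        r2, r3 = r3, r2
    # Step 9: compensate for any negation made in step 2.
    if s == (1, -1, -1):
        r1 = -r1
    elif s == (-1, 1, -1):
        r2 = -r2
    elif s == (-1, -1, 1):
        r3 = -r3
    # Step 10.
    return r1, r2, r3


if __name__ == "__main__":
    # Build a point on the danger cylinder from (10) with u = 1, cos(theta) = 0.6.
    k = math.sqrt(3.0) * 0.8
    r = tuple(math.sqrt(x / 3) for x in (3 - 1.2, 3.6 + k, 3.6 - k))
    print(r, dsa(*cosines(*r)))
```

In the usage example the distances are generated on the danger cylinder, mapped to cosines by (5), and then recovered by the sketch, so the two printed triples should agree to within roundoff.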

2.3 Simulations

Simulations conﬁrm the advantages of using the Dou-

ble Solution Algorithm when |S| is small. These sim-

ulations were performed using compiled Mathemat-

ica functions, running on an Intel Core Duo processor.

Thus the ﬂoating point computations were performed

using 64-bit IEEE ﬂoating point format. Even more

dramatic results can be expected in a 32-bit ﬂoating

point environment.

A radius-one danger cylinder was used. Five different distance ranges along the cylinder axis were explored: 0-2, 2-4, 4-6, 6-8 and 8-10. A camera focal point on the cylinder (within the given range) was randomly selected, and the cosines $c_1$, $c_2$, $c_3$ computed. DSA was tested against Grunert's quartic polynomial method, and the resulting computed distances for $r_1$, $r_2$, $r_3$ were compared with the actual values of $r_1$, $r_2$, $r_3$.

Next, each of the three cosines was randomly

perturbed by adding or subtracting up to one one-

millionth to/from it, and the two methods were com-

pared again using the resulting data. This was again

repeated, but using a maximum adjustment of one

one-hundredth, rather than one one-millionth, for

each cosine. In this way, ﬁfteen different experi-

ments (ﬁve distance ranges times three maximum co-

sine perturbation amounts) were considered. Each of

these experiments was performed one hundred thou-

sand times, and the results of these trials were aver-

aged.
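The article does not list its simulation code, so the following Python sketch only illustrates how the input for a single trial might be generated. It assumes the three reference points sit at angles 0, 120 and 240 degrees on the unit circle in the plane $z = 0$ (so that the danger cylinder has radius one), that the focal point and the perturbations are drawn uniformly, and that distances are rescaled by $1/\sqrt{3}$ to match the unit side length used in (1). The name `sample_trial` and its parameters are hypothetical.

```python
import math
import random

SQRT3 = math.sqrt(3.0)
# Reference points on the unit circle: side length sqrt(3), cylinder radius 1.
REFS = [(math.cos(a), math.sin(a), 0.0)
        for a in (0.0, 2 * math.pi / 3, 4 * math.pi / 3)]


def sample_trial(rng, h_lo, h_hi, eps):
    """Return (perturbed cosines, true distances in unit-side scale)."""
    phi = rng.uniform(0.0, 2 * math.pi)
    cam = (math.cos(phi), math.sin(phi), rng.uniform(h_lo, h_hi))
    rays = [tuple(p - q for p, q in zip(ref, cam)) for ref in REFS]
    dist = [math.sqrt(sum(x * x for x in ray)) for ray in rays]

    def cos_between(i, j):
        dot = sum(x * y for x, y in zip(rays[i], rays[j]))
        return dot / (dist[i] * dist[j])

    # c1 is the cosine between the rays to points 2 and 3, and cyclically.
    c = [cos_between(1, 2), cos_between(2, 0), cos_between(0, 1)]
    c_noisy = [ci + rng.uniform(-eps, eps) for ci in c]
    r_true = [d / SQRT3 for d in dist]
    return c_noisy, r_true


if __name__ == "__main__":
    rng = random.Random(0)
    print(sample_trial(rng, 0.0, 2.0, 1e-6))
```

Passing `eps = 0.0` reproduces the unperturbed experiments, while `1e-6` and `1e-2` correspond to the two perturbation levels described above.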

When the computed cosines $(c_1, c_2, c_3)$ for a point (essentially) on the danger cylinder were left unperturbed, the ratio of the average errors using Grunert's method versus DSA was between a hundred million and a billion. Admittedly though, the likelihood of having the camera's focal point right on the danger cylinder, within the computational tolerance of 64-bit floating point arithmetic, is very small. Thus further experiments were conducted using slightly altered values of the cosines.

When the cosines were randomly perturbed by an

amount up to one one-millionth, the ratio of the av-

erage computed errors was as much as 52, when the

focal point was close to the reference point (the 0-2

range). But this ratio dropped to 14 when the focal

point was far away (8-10 range).

When the cosines were randomly perturbed by

an amount up to one one-hundredth, the error ratio

ranged between one and two. Thus the improvement

using DSA was modest in this case. Once again

though, computations performed using 32-bit arith-

metic, instead of 64-bit arithmetic, would more dra-

matically demonstrate a difference in accuracy be-

tween the two methods.

The execution times of the two methods were also compared. Here though, it was difficult

to know how much of the timing reported by Math-

ematica was attributable to the overhead involved in

calling compiled functions from within the Mathe-

matica interpreter. In every case, the reported speedup

(ratio) was in excess of four. However, a quick check

of the actual computations involved in the two meth-

ods suggests that the true speedup should be consid-

erably higher.

3 MATHEMATICAL ANALYSIS

This section captures much of the reasoning underlying DSA. The phrases “R-space,” “r-space” and “c-space” will be used to refer to the abstract three-dimensional spaces of $(R_1, R_2, R_3)$ points, $(r_1, r_2, r_3)$ points, and $(c_1, c_2, c_3)$ points, respectively. Because the coordinates of the points in c-space represent the cosines of angles in physical space, we are particularly interested in the points $(c_1, c_2, c_3)$ whose coordinates have absolute value less than or equal to one.

Figure 1: The critical surfaces $\mathcal{Q}$ and $\mathcal{S}$.

Let $\mathcal{T}$ denote the set of points in c-space for which $|c_j| \le 1$ ($j = 1,2,3$) and $T \ge 0$ (another physical requirement). The boundary of this region (where $T = 0$) is shaped like an “inflated tetrahedron,” basically resembling an over-stuffed tetrahedral pillow. $\mathcal{T}$ consists of this surface plus its interior. Let $\mathcal{S}$ denote the set of all points in $\mathcal{T}$ that satisfy the equation $S = 0$. This surface is shown on the right in Figure 1. It resembles a deformed cube with two types of vertices, all of which are also on the boundary of $\mathcal{T}$. $\mathcal{S}$ can be decomposed into four identical sections, each resembling a deformed triangular cone. One of these

consists of points satisfying $c_1 + c_2 + c_3 \ge \frac{1}{2}$, referred to as the “principal cone.”

In r-space, there are also restrictions on the set of points $(r_1, r_2, r_3)$ that are realizable, given the setup in physical space. Since the reference points are a distance one apart, it is immediately clear that no two of $r_1$, $r_2$ and $r_3$ can differ by more than one. There is an additional restriction though, imposed by the fact that the tetrahedron, in physical space, whose vertices are the camera's focal point and the three reference points, must have positive volume. Using the Cayley-Menger determinant, it can be seen that 144 times the square of this volume equals $\Omega + R_1 + R_2 + R_3 - 2$. So it is required that this be non-negative. Also, a quick check establishes that when the substitution (5) is used to express $T$ in terms of the $r_j$, one obtains

\[ T = \frac{\Omega + R_1 + R_2 + R_3 - 2}{4\, R_1 R_2 R_3}. \tag{9} \]
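Both claims in this paragraph can be spot-checked numerically. The Python sketch below assumes that $\Omega$ is the polynomial $1 + R_1R_2 + R_2R_3 + R_3R_1 - R_1^2 - R_2^2 - R_3^2$, that the Cayley-Menger determinant of a tetrahedron equals 288 times its squared volume, and that (9) has denominator $4R_1R_2R_3$. The function names are illustrative, and the determinant routine is a plain Laplace expansion chosen for brevity rather than efficiency.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (fine for 5x5)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * x * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j, x in enumerate(m[0]))


def cayley_menger_144_v2(R1, R2, R3):
    """144 * volume**2 for the camera plus three unit-spaced reference points."""
    m = [[0, 1, 1, 1, 1],
         [1, 0, R1, R2, R3],
         [1, R1, 0, 1, 1],
         [1, R2, 1, 0, 1],
         [1, R3, 1, 1, 0]]
    return det(m) / 2  # the Cayley-Menger determinant is 288 * volume**2


def omega(R1, R2, R3):
    """Omega in terms of the squared distances."""
    return 1 + R1 * R2 + R2 * R3 + R3 * R1 - R1 * R1 - R2 * R2 - R3 * R3


def gram_T(r1, r2, r3):
    """T = 1 + 2*tau - sigma, with the cosines obtained from (5)."""
    c1 = (r2 * r2 + r3 * r3 - 1) / (2 * r2 * r3)
    c2 = (r3 * r3 + r1 * r1 - 1) / (2 * r3 * r1)
    c3 = (r1 * r1 + r2 * r2 - 1) / (2 * r1 * r2)
    return 1 + 2 * c1 * c2 * c3 - (c1 * c1 + c2 * c2 + c3 * c3)


if __name__ == "__main__":
    r = (1.1, 1.3, 0.9)
    R = tuple(x * x for x in r)
    lhs = cayley_menger_144_v2(*R)
    rhs = omega(*R) + sum(R) - 2
    print(lhs, rhs)                                  # should agree
    print(gram_T(*r), rhs / (4 * R[0] * R[1] * R[2]))  # should agree
```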

Let $\mathcal{R}$ denote the region in r-space consisting of points $(r_1, r_2, r_3)$ satisfying $|r_1 - r_2| \le 1$, $|r_2 - r_3| \le 1$, $|r_3 - r_1| \le 1$ and $\Omega + R_1 + R_2 + R_3 \ge 2$. These are the points in r-space that are physically realizable, assuming that negative values of $r_j$ are admissible.

Let $\mathcal{Q}$ denote the subset of $\mathcal{R}$ consisting of points for which $\Omega = 0$. This surface is shown on the left in Figure 1. Because of (6), it is clear that the points of $\mathcal{Q}$ (in r-space) correspond to points of $\mathcal{S}$ (in c-space), under the transformation (5). It can be shown that this mapping from $\mathcal{Q}$ to $\mathcal{S}$ is onto, except for a set of measure zero.

From equation (7), observe that the surface $\mathcal{Q}$ corresponds to a portion of a circular cylinder in R-space. Only a portion of the cylinder is admissible though, due to the restriction that $R_1 + R_2 + R_3 \ge 2$, by (9), since $T \ge 0$ and $\Omega = 0$. The semi-cylinder (surface) can be parameterized as follows:

\[
\begin{cases}
R_1 = \tfrac{1}{3}\,(2 + u - 2\cos\theta), \\
R_2 = \tfrac{1}{3}\,(2 + u + \cos\theta + \sqrt{3}\,\sin\theta), \\
R_3 = \tfrac{1}{3}\,(2 + u + \cos\theta - \sqrt{3}\,\sin\theta).
\end{cases}
\tag{10}
\]

Here $u = R_1 + R_2 + R_3 - 2$, which must be non-negative, and we will assume too that $0 \le \theta < 2\pi$. Notice that replacing $\theta$ with $\theta \pm 2\pi/3$ (mod $2\pi$) induces a cyclic permutation of $\{R_1, R_2, R_3\}$. Replacing $\theta$ with $2\pi - \theta$ swaps $R_2$ and $R_3$. It is helpful to use the notation $v = \cos\theta$ and $w = \pm\sqrt{1 - v^2} = \sin\theta$.
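The stated properties of the parameterization are easy to confirm numerically. The following Python sketch assumes that each line of (10) carries a factor of $1/3$ (which is what makes $R_1 + R_2 + R_3 = 2 + u$) and that $\Omega = 1 + R_1R_2 + R_2R_3 + R_3R_1 - R_1^2 - R_2^2 - R_3^2$. The function names are illustrative.

```python
import math


def cylinder_point(u, theta):
    """Parameterization (10): returns (R1, R2, R3) with Omega = 0."""
    v, w = math.cos(theta), math.sin(theta)
    k = math.sqrt(3.0) * w
    return ((2 + u - 2 * v) / 3, (2 + u + v + k) / 3, (2 + u + v - k) / 3)


def omega(R1, R2, R3):
    """Omega in terms of the squared distances."""
    return 1 + R1 * R2 + R2 * R3 + R3 * R1 - R1 * R1 - R2 * R2 - R3 * R3


if __name__ == "__main__":
    u, theta = 1.5, 0.7
    R = cylinder_point(u, theta)
    print(omega(*R))                                   # ~0: on the cylinder
    print(sum(R) - 2)                                  # recovers u
    print(cylinder_point(u, theta + 2 * math.pi / 3))  # cyclic shift of R
    print(cylinder_point(u, 2 * math.pi - theta))      # R2 and R3 swapped
```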

Because of the independent sign choices for each of $r_1$, $r_2$ and $r_3$, for given $R_1$, $R_2$ and $R_3$, there are eight identical sections, or “legs,” of $\mathcal{Q}$ in r-space. These correspond to the same semi-cylinder in R-space, with $u \ge 0$. The section for which $r_1$, $r_2$ and $r_3$ are all non-negative will be called the “principal leg” of $\mathcal{Q}$. The points on the principal leg of $\mathcal{Q}$ get mapped, via (5), to points on the principal cone of $\mathcal{S}$. Expressing points on the principal leg in terms of $u$, $v$ and $w$, the mapping (5) becomes the following:

\[
\begin{aligned}
c_1 &= \frac{1 + 2u + 2v}{2\sqrt{2 + u + v + \sqrt{3}\,w}\;\sqrt{2 + u + v - \sqrt{3}\,w}}, \\
c_2 &= \frac{1 + 2u - v - \sqrt{3}\,w}{2\sqrt{2 + u - 2v}\;\sqrt{2 + u + v - \sqrt{3}\,w}}, \\
c_3 &= \frac{1 + 2u - v + \sqrt{3}\,w}{2\sqrt{2 + u - 2v}\;\sqrt{2 + u + v + \sqrt{3}\,w}}.
\end{aligned}
\tag{11}
\]


These formulas then lead immediately to the following formulas:

\[
\begin{aligned}
\sigma &= \frac{6(1 + 3v - 4v^3) + 3u(3 + 12u + 4u^2)}{4(2 + u - 2v)(1 + 4u + u^2 + 4v + 4v^2 + 2uv)}, \\
\tau &= \frac{(1 + 2u + 2v)(-1 + 2u + 2u^2 - v + 2v^2 - 2uv)}{4(2 + u - 2v)(1 + 4u + u^2 + 4v + 4v^2 + 2uv)}, \\
\chi &= \frac{3(1 + 3v - 4v^3)^2 + 3u(1 + 3v - 4v^3)(3 + 3u + 4u^2) + 6u^3(3 + u)(3 + 6u + 2u^2)}{4(2 + u - 2v)^2(1 + 4u + u^2 + 4v + 4v^2 + 2uv)^2}.
\end{aligned}
\tag{12}
\]

As complicated as formulas (11) and (12) are, surprisingly simple inversion formulas exist. Given $c_1$, $c_2$ and $c_3$, and using (4) to compute $\sigma$ and $\tau$, the values of $u$ and $v$ can be readily deduced from the following:

\[
(1 + u)^2 = \frac{3(1 + 8\tau)}{4(1 + 2\tau - \sigma)}, \qquad
(1 + v)^2 = \frac{3(1 + 2\tau - 3c_2^2)(1 + 2\tau - 3c_3^2)}{4(1 + 2\tau - 3c_1^2)(1 + 2\tau - \sigma)}. \tag{13}
\]

These two formulas can be efficiently confirmed using mathematical manipulation software, by making substitutions using (11) and (12).
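Short of a symbolic confirmation, a numerical round trip gives some reassurance. The Python sketch below assumes that (11) expresses each cosine as a numerator over twice the product of two square roots, and that (13) gives $(1+u)^2 = 3(1+8\tau)/(4(1+2\tau-\sigma))$ and $(1+v)^2 = 3(1+2\tau-3c_2^2)(1+2\tau-3c_3^2)/(4(1+2\tau-3c_1^2)(1+2\tau-\sigma))$. The names `forward` and `invert` are illustrative, and inputs for which a denominator vanishes are not handled.

```python
import math


def forward(u, v, w):
    """Formulas (11): (u, v, w) on the principal leg -> (c1, c2, c3)."""
    k = math.sqrt(3.0) * w
    p = 2 + u + v + k      # equals 3 * R2
    m = 2 + u + v - k      # equals 3 * R3
    z = 2 + u - 2 * v      # equals 3 * R1
    c1 = (1 + 2 * u + 2 * v) / (2 * math.sqrt(p * m))
    c2 = (1 + 2 * u - v - k) / (2 * math.sqrt(z * m))
    c3 = (1 + 2 * u - v + k) / (2 * math.sqrt(z * p))
    return c1, c2, c3


def invert(c1, c2, c3):
    """Formulas (13): recover (u, v) from cosines on the principal cone."""
    sigma = c1 * c1 + c2 * c2 + c3 * c3
    tau = c1 * c2 * c3
    T = 1 + 2 * tau - sigma
    q = 1 + 2 * tau
    u = math.sqrt(3 * (1 + 8 * tau) / (4 * T)) - 1
    v = math.sqrt(3 * (q - 3 * c2 * c2) * (q - 3 * c3 * c3)
                  / (4 * (q - 3 * c1 * c1) * T)) - 1
    return u, v


if __name__ == "__main__":
    print(invert(*forward(1.0, 0.6, 0.8)))    # expect values near (1.0, 0.6)
    print(invert(*forward(2.0, 0.0, -1.0)))   # expect values near (2.0, 0.0)
```

Note that (13) determines $v$ but not the sign of $w$, which is why the algorithm needs a separate comparison step to decide whether $r_2$ and $r_3$ should be swapped.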

4 CONCLUSIONS

The Double Solution Algorithm (DSA) presented in

this article gives a very practical, very fast, and highly

accurate method for solving the P3P problem, when

dealing with the setup where the camera’s focal point

is on or near the danger cylinder, and when the ref-

erence points are equidistant from each other. It re-

lies on the inversion of a transformation between two

surfaces. Surprisingly, the inverse mapping turns out

to be simpler to compute than the original mapping.

It would be quite useful to ﬁnd a generalization of

DSA to the situation where the reference points are

no longer assumed to be equidistant from each other.

REFERENCES

Chen C-S, Hung Y-P, Shih S-W, Hsieh C-C, Tang C-Y, Yu C-G and Chang Y-C (1998). Integrating virtual objects into real images for augmented reality. In VRST'98, ACM Symp. Virtual Reality Software and Technology, pp. 1-8. ACM.

Faugère J-C, Moroz G, Rouillier F, El Din M S (2008). Classification of the perspective-three-point problem, discriminant variety and real solving polynomial systems of inequalities. In ISSAC'08, 21st ACM Int. Symp. Symbolic and Algebraic Computation, pp. 79-86. ACM.

Gao X-S, Hou X-R, Tang J, and Cheng H-F (2003). Com-

plete solution classiﬁcation for the perspective-three-

point problem. In IEEE Trans. Pattern Analysis and

Machine Intelligence, v. 25, n. 8, pp. 930-943. IEEE.

Grunert J A (1841). Das pothenotische problem in erweiterter gestalt nebst über seine anwendungen in der geodäsie. In Grunerts Archiv für Mathematik und Physik, Band 1, pp. 238-248. Verlag von C. A. Koch.

Haralick R M, Lee C-N, Ottenberg K and Nölle N (1994). Review and analysis of solutions of the three point perspective pose estimation problem. In J. Computer Vision, v. 13, n. 3, pp. 331-356. Springer Netherlands.

Merritt, E L (1949). Explicit three-point resection in space.

In Photogrammetric Engineering, v. 15, n. 4, pp. 649-

655. Amer. Soc. Photogrammetry.

Müller F J (1925). Direkte (exakte) lösung des einfachen rückwärtseinschneidens im raume. In Allgemeine Vermessungs-Nachrichten. Wichmann Verlag.

Ohayon S and Rivlin E (2006). Robust 3D head tracking

using camera pose estimation. In ICIP’06, Int. Conf.

Image Processing, pp. 1063-1066. IEEE.

Qingxuan J, Ping Z and Hanxu S (2006). The study of po-

sitioning with high-precision by single camera based

on P3P algorithm. In INDIN’06, IEEE Int. Conf. In-

dustrial Informatics, pp. 1385-1388. IEEE.

Smith A D N (1965). The explicit solution of the single

picture resolution problem, with a least squares ad-

justment to redundant control. In Photogrammetric

Record, v. 5, n. 26, pp. 113-122. Wiley-Blackwell.

Thompson E H (1966). Space resection: failure cases. In

Photogrammetric Record, v. 5, n. 27, pp. 201-204.

Wiley-Blackwell.

Wolfe W J, Mathis D, Sklair C W, and Magee M (1991). The perspective view of three points. In IEEE Trans. Pattern Analysis and Machine Intelligence, v. 13, n. 1, pp. 66-73. IEEE.

Zhang C-X and Hu Z-Y (2005). A general sufﬁcient condi-

tion of four positive solutions of the P3P problem. In

J. Comput. Sci. & Technol., v. 20, n. 6, pp. 836-842.

Springer.

Zhang C-X and Hu Z-Y (2006). Why is the danger cylinder

dangerous in the P3P problem? In Acta Automatica

Sinica, v. 32, n. 4, pp. 504-511. Elsevier.
