Figure 1: Image of a Necker cube.
lows. In the next section we present related work and
emphasize the differences compared to our approach.
Two mathematical methods are derived in Section 3.
There, we present an improved and simplified version
of the approach of (Haralick, 1989) before we introduce
a new method based on the direct linear transformation
(DLT). The identification problem as well as the accuracy
analysis are discussed in Section 4. The proposed methods
are evaluated on real-world data before the paper
concludes with an outlook on future work.
2 PREVIOUS WORK
Using rectangular structures to recover information
such as the calibration and orientation of the camera
is not new. Different approaches for single-image-based
reconstruction have been presented in the past. All of
them rely on several constraints, such as parallelism
and orthogonality, in order to retrieve the missing
information (Wilczkowiak et al., 2001). Vanishing
points are used to compute the internal and external
parameters of a camera (Sturm and Maybank, 1999).
This computation tends to become unstable since these
points are often located near infinity. The work
presented in (Haralick, 1989) is partly similar to our
approach. It presents different derivations handling
degenerate scene configurations such as coplanarity.
As shown in Section 3, this is not mandatory. In
contrast to previous efforts, we introduce a novel
method which solves the stated problem using a
standard DLT method. In (Delage et al., 2007),
Markov random fields (MRFs) are used to detect planes
and edges in order to form a 3D reconstruction from
single-image depth cues. In contrast to our work,
they assume orthogonal planes instead of dealing with
the rectangle structure itself. (Micusk et al., 2008)
describes an efficient method for detecting and
matching rectilinear structures; an MRF is used to
label detected line segments. This approach enables
the detection of rectangles even if the four line
segments are not detected accurately. In (Lee et al.,
2009), the scene is reconstructed by building
hypotheses from intersecting line segments.
3 DERIVATION
We present two methods for reconstructing a rectangle
in 3D space. The first method is based on geometric
relations, while the second is a new algebraic
solution. We assume a calibrated camera in both cases.
In this context we are only interested in quadrangles
with a convex shape, since the projection of a
rectangle is never concave. Our primary goal is to
compute the orientation and the aspect ratio of a
rectangle in 3D space from a perspectively distorted
2D image of that rectangle. This is equivalent to
computing the extrinsic parameters of the camera,
e.g. in the local coordinate system defined by the
sides of the rectangle. The secondary goal is to verify
that the observed quadrangle is produced by a rectangle
in 3D space and not by some other planar quadrangle.
We have to exclude as many non-rectangular quadrangles
as possible from further processing, early and
efficiently. The theoretical aspects of this problem
are discussed in Section 4.
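The convexity filter described above can be realized as a simple sign test on consecutive edge cross products. A minimal sketch, assuming the four corner points are given in traversal order (the helper name and this particular test are illustrative, not taken from the paper):

```python
def is_convex_quadrangle(pts):
    """Return True if four 2D points (in traversal order) form a convex quadrangle.

    A quadrangle is convex iff the z-components of the cross products of
    consecutive edge vectors all share the same non-zero sign.
    Hypothetical helper illustrating the convexity filter.
    """
    def cross_z(a, b):
        return a[0] * b[1] - a[1] * b[0]

    signs = []
    for i in range(4):
        p0, p1, p2 = pts[i], pts[(i + 1) % 4], pts[(i + 2) % 4]
        e1 = (p1[0] - p0[0], p1[1] - p0[1])    # edge i -> i+1
        e2 = (p2[0] - p1[0], p2[1] - p1[1])    # edge i+1 -> i+2
        c = cross_z(e1, e2)
        if c == 0:                              # collinear corners: degenerate
            return False
        signs.append(c > 0)
    # Convex iff all turns go the same way (either all left or all right).
    return all(signs) or not any(signs)
```

Running this test before any reconstruction discards a large class of non-rectangular quadrangles at negligible cost.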
3.1 Geometric Method
Fig. 2 shows the arrangement of a 3D rectangle projected
onto an image plane, and Fig. 3 contains the 2D image
representation. For the sake of clarity, we consider a
camera that is placed in the origin and looks in
Z-direction. $P_1 \dots P_4$ are the corner points of the
rectangle and $p'_1 \dots p'_4$ are the corresponding
projections in the image plane of the camera. They can be
expressed in homogeneous coordinates $p'_1 \dots p'_4$.
Neglecting the intrinsic camera parameters, the points
$p'_i$ are transformed to $P'_i$, which are the corner
points of the rectangle's projection in the world
coordinate system. They are connected by the edges of the
rectangle $l_{12}$, $l_{14}$, $l_{23}$ and $l_{34}$.
Opposing edges intersect at the vanishing points $v_1$ and
$v_2$. The center point $M$ is defined as the intersection
of the rectangle's diagonals. $M'$ is the projection of the
center point $M$. The line defined by $v_1$ and $M'$
intersects the rectangle's edges at their centers
$P'_{14}$ and $P'_{23}$, respectively. $P'_{12}$ and
$P'_{34}$ are defined by the second vanishing point $v_2$.
The points $P_i$ and the camera rotation angles $\omega$,
$\varphi$ and $\kappa$ are deduced from the corner points
$P'_i$. The distance $d$ from the projection center to the
rectangle center can be chosen arbitrarily. In the
following we derive the computation of a rectangle based
on a quadrangle. According to Figs. 2 and 3 we can derive
the following simple equations:
$l_{ij} = P'_i \times P'_j$                        (1)

$M' = l_{13} \times l_{24}$                        (2)

$v_1 = l_{12} \times l_{34}, \quad v_2 = l_{14} \times l_{23}$   (3)
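Eqs. (1)–(3) translate directly into cross products on homogeneous coordinates. A minimal NumPy sketch; the corner values are made up purely for illustration:

```python
import numpy as np

# Illustrative corner points P'_1 ... P'_4 of a projected rectangle,
# given in homogeneous coordinates (values chosen for demonstration only).
P1 = np.array([0.0, 0.0, 1.0])
P2 = np.array([2.0, 0.1, 1.0])
P3 = np.array([1.9, 1.1, 1.0])
P4 = np.array([0.1, 1.0, 1.0])

# Eq. (1): the line l_ij through two homogeneous points is their cross product.
l12 = np.cross(P1, P2)
l23 = np.cross(P2, P3)
l34 = np.cross(P3, P4)
l14 = np.cross(P1, P4)
l13 = np.cross(P1, P3)   # diagonal P'_1 P'_3
l24 = np.cross(P2, P4)   # diagonal P'_2 P'_4

# Eq. (2): the projected center M' is the intersection of the diagonals.
M = np.cross(l13, l24)

# Eq. (3): the vanishing points are the intersections of opposing edges.
v1 = np.cross(l12, l34)
v2 = np.cross(l14, l23)

# Dehomogenize M' for its 2D image position (v1, v2 may lie near infinity,
# i.e. have a last component close to zero, so we only dehomogenize M').
M_2d = M[:2] / M[2]
```

Note that intersections of homogeneous lines come out unnormalized; a point lies on a line exactly when their dot product vanishes, which is guaranteed here by the properties of the cross product.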
VISAPP 2011 - International Conference on Computer Vision Theory and Applications