Figure 1: User with mobile device and laptop at a desk.
monocular-camera system with an Inertial Measurement Unit; this improves their estimated trajectory and map.
Newman et al. (Newman et al., 2001) developed a building-wide AR system by combining ultrasound position estimates and rotation estimates from an inertial tracker. This allows the wearer of a head-mounted display to be positioned, but does not provide detailed information about the environment they are in.
The main focus of these systems has been improving overall system robustness or the accuracy of the estimate.
In this work we overcome these limitations by using a second sensor, an absolute positioning system (APS). While the camera observes areas with sufficient visual texture, we simultaneously estimate our global trajectory using the APS and our local trajectories using a monocular-camera system. Using a least-squares approach we estimate the transformation between the global and local trajectories and build a single scalable, consistent global map. During periods when visual information is lost we use the APS to estimate the 3-D position within the global map. This allows the system to relocalise efficiently when returning to a previously built local map, even when there is a very large number of maps.
Results show that by using a second sensor we can overcome the limitations of scalability and the need for continuous texture, and build many local maps whose positions are known relative to one another. This provides a novel AR capability: the ability to track in one local map and see AR content in other local maps that are not joined to the current local map by continuous texture.
The remainder of the paper is organised as fol-
lows. The next section describes the general frame-
work for combining a mapping system with an APS.
The third section describes the implementation we
have developed based on this framework. Conclu-
sions are drawn in the final section.
2 POSITIONING AND VISION
This section describes the core method underlying our work, presented as a general framework. One of the advantages of our method is that it is directly applicable to any camera mapping system (for example, probabilistic or Structure from Motion approaches) and to any absolute positioning system (for example, Ultrasound, Ultra-Wideband or GPS).
Another advantage of this work is that we can create a very large number of local maps, each at an arbitrary position, orientation and scale, and combine all of these maps into a single coordinate frame to provide a single scalable global map. Crucially, the map and trajectory estimated by the local mapping system are locally correct, but differ from the estimates in the global coordinate frame by an unknown transformation. By simultaneously estimating the local trajectory and the global trajectory we are able to estimate this alignment transformation and combine the local maps into the global coordinate frame.
2.1 Estimating Transformations
We begin by estimating the motion of a mobile device using two sensors: a monocular camera and an absolute positioning system. The two sensors are rigidly attached at a known offset. Each sensor estimates the mobile device's trajectory in its own coordinate frame; our goal is to recover the transformation between the two coordinate frames.
To estimate the transformation we need two trajectories, propagated to a common time. After each measurement update a new position is estimated and stored in the trajectory set. This provides us with two sets of data, or trajectories: X for the vision system and Y for the absolute positioning system.
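To make this concrete, the following is a minimal sketch of how the two trajectories might be paired at common times. The function name and the use of per-axis linear interpolation are our illustrative assumptions, not part of the original system; any scheme that propagates both estimates to a shared set of timestamps would serve.

```python
import numpy as np

def pair_trajectories(vision_t, vision_p, aps_t, aps_p):
    """Interpolate APS positions onto the vision timestamps so that
    X[i] and Y[i] refer to the same instant.

    vision_t : (N,) timestamps of vision position estimates
    vision_p : (N, 3) positions from the vision system (local frame)
    aps_t    : (M,) timestamps of APS position estimates
    aps_p    : (M, 3) positions from the APS (global frame)
    """
    # Keep only vision samples inside the APS time range, so we
    # interpolate rather than extrapolate.
    valid = (vision_t >= aps_t[0]) & (vision_t <= aps_t[-1])
    t = vision_t[valid]
    X = vision_p[valid]
    # Linearly interpolate each APS coordinate at the vision times.
    Y = np.stack([np.interp(t, aps_t, aps_p[:, k]) for k in range(3)],
                 axis=1)
    return X, Y
```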
Now that we have the two trajectories, we estimate the transformation between them. We find it best to estimate the transformation parameters over the full trajectories once the local map has been 'finished'. To estimate the desired transformation we use a least-squares approach introduced by Umeyama (Umeyama, 1991). This method is based on Singular Value Decomposition, which is known for its numerical stability. We now give a brief overview of the method. First we find the means and variances of the two trajectories, and then the covariance $\Sigma_{xy}$ between the two. We then determine the singular value decomposition of $\Sigma_{xy}$, which gives $UDV^T$. From these matrices we follow the steps described in Umeyama (Umeyama, 1991) to estimate the rotation $R$, translation $t$ and scale $s$ of the transformation. These are the parameters of the transformation required to convert the local map into the global map.
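As a concrete illustration, here is a short sketch of the closed-form alignment of Umeyama (Umeyama, 1991) for two paired 3-D trajectories. The function name and the use of NumPy are our assumptions; the steps (means, cross-covariance $\Sigma_{xy}$, SVD $UDV^T$, then $R$, $t$ and $s$) follow the method cited in the text.

```python
import numpy as np

def umeyama_alignment(X, Y):
    """Estimate scale s, rotation R and translation t minimising
    sum_i || y_i - (s * R @ x_i + t) ||^2  (Umeyama, 1991).

    X : (N, 3) local-frame trajectory (vision system)
    Y : (N, 3) global-frame trajectory (APS)
    """
    n = X.shape[0]
    mu_x, mu_y = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mu_x, Y - mu_y
    var_x = (Xc ** 2).sum() / n              # variance of X
    Sigma_xy = Yc.T @ Xc / n                 # 3x3 cross-covariance
    U, D, Vt = np.linalg.svd(Sigma_xy)
    # Handle possible reflection: force a proper rotation, det(R) = +1.
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1
    R = U @ S @ Vt
    s = np.trace(np.diag(D) @ S) / var_x
    t = mu_y - s * R @ mu_x
    return s, R, t
```

Applying the recovered parameters to the vision trajectory, $s R x_i + t$, should reproduce the APS trajectory up to measurement noise, which gives a simple sanity check on the estimate.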
2.2 Building the Global Map
Once the transformation parameters $s_j$, $R_j$ and $t_j$ have been estimated for local map $j$, they can be used to transform the local map into the global coordinate frame.
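Applying the estimated parameters to each map point is then a single operation per point; the helper below is a minimal sketch with our own illustrative naming.

```python
import numpy as np

def local_to_global(points, s, R, t):
    """Transform an (N, 3) array of local-map points into the global
    frame: p_global = s * R @ p_local + t."""
    return s * points @ np.asarray(R).T + np.asarray(t)
```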