equipped with or linked to an imaging device, such as
a camera, which is able to view the other device’s dis-
play, such that the camera’s image is registered with
that display. The term registered means that for any
(pixel) position in the viewed display we know its
corresponding position in the captured image of the
display. We call this pairing of a display with a
registered image of that display "display registration",
since it is an instance of image registration. The captured image of
the display, which is registered to the display itself,
can be passed to the display on the camera-equipped
device for the system operator to use in his/her
current task. In most applications, the camera-equipped
device will be the smaller of the two and manoeuvrable
by hand. We call this the client device; movements and
button (or stylus) clicks of this device control the way
in which the system operates, within a certain context.
The other device will, in general, be a larger static
device, and we call this the server device.
2.1 Device Interaction via Registered Display Operations
In typical use of this method, the mobile client de-
vice is moved around by the user, whilst maintaining
at least a small part of the server display in its field of
view. Throughout this motion, the client camera im-
age and hence the client’s display of that image to the
user are registered with the server display. That is,
irrespective of the change in relative position of the
client device, we can always compute where any po-
sition on the server display appears on the client cam-
era image and the display of that client camera image.
Also, since we can easily compute the inverse trans-
formation, we can choose any position on the client
camera image, such as the centre or one of the image
corners, and determine the corresponding position on
the server display. We call the concept of maintaining
the correspondence between client and server displays
maintaining display registration. The fact that the dis-
plays are registered enables a large range of possible
interactions and data exchanges between client and
server devices. It is envisaged that the user may con-
trol this interaction through a variety of modes, which
are effectively different contexts in which to interpret
registered display operations.
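
The forward and inverse mappings described above can be sketched as follows. This is a minimal illustration, assuming that the registration is available as a 3x3 matrix mapping server display positions to client image positions (the representation introduced in Section 3). The matrix values, the 640x480 client image size and the function names are hypothetical and chosen only for the example.

```python
def apply_homography(h, point):
    """Map a 2D point through a 3x3 matrix given as nested lists."""
    x, y = point
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity")
    return (u / w, v / w)


def invert_3x3(m):
    """Invert a 3x3 matrix via its adjugate; raises if it is singular."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("matrix is singular")
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


# Hypothetical registration: server display -> client image.
H_SERVER_TO_CLIENT = [
    [0.5, 0.0, 100.0],
    [0.0, 0.5, 50.0],
    [0.0, 0.0, 1.0],
]
H_CLIENT_TO_SERVER = invert_3x3(H_SERVER_TO_CLIENT)

# Centre of an assumed 640x480 client image, mapped onto the server display.
client_centre = (320.0, 240.0)
server_point = apply_homography(H_CLIENT_TO_SERVER, client_centre)

# Mapping back through the forward transformation recovers the centre.
round_trip = apply_homography(H_SERVER_TO_CLIENT, server_point)
print(server_point, round_trip)
```

In a running system the matrix would be re-estimated as the client moves, so both mappings would be recomputed for each frame.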
3 REGISTRATION METHODS
For a planar client image plane and a planar server
display, we need to find a plane-to-plane
mapping that allows us to compute the display regis-
tration. This mapping encodes the (idealised) imaging
process of the camera (intrinsic parameters) and the
six degree-of-freedom pose of the client image plane
relative to the server display (extrinsic parameters). It
is well-known that this transformation, called a planar
homography, can be represented by a 3x3 matrix, H,
such that $\lambda x_i = H X_i$, where $X_i$ is a point
on the server display, $x_i$ is the corresponding point
in the client image (both in homogeneous coordinates)
and $\lambda$ is a non-zero scale factor. The matrix H is defined up to
a scale factor and hence has eight degrees of freedom.
Thus it can be estimated by standard linear methods if
four corresponding points are known across the client
image and server display, with the constraint that no
three are collinear. In this case we have eight
independent constraints and H is fully defined (up to scale).
More corresponding points can yield a more accurate
estimate of H, using some variant of a least-squares
technique. Various estimation techniques for H are
detailed by Hartley and Zisserman (Hartley and Zis-
serman, 2004).
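
As an illustration of the four-point case, the sketch below builds the eight linear constraints and solves them directly. It is not the estimator used by the authors: it assumes the scale of H is fixed by setting the bottom-right element to 1, which fails in the rare case where that element is truly zero (the SVD-based formulations in Hartley and Zisserman avoid this). The function names and the example coordinates are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations act on both sides at once.
    aug = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        s = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - s) / aug[row][row]
    return x


def estimate_homography(server_pts, client_pts):
    """Estimate H (server -> client) from exactly four correspondences."""
    if len(server_pts) != 4 or len(client_pts) != 4:
        raise ValueError("exactly four correspondences are expected")
    rows, rhs = [], []
    for (X, Y), (x, y) in zip(server_pts, client_pts):
        # Two constraints per correspondence, with H[2][2] fixed to 1.
        rows.append([X, Y, 1, 0, 0, 0, -x * X, -x * Y])
        rhs.append(x)
        rows.append([0, 0, 0, X, Y, 1, -y * X, -y * Y])
        rhs.append(y)
    h = solve_linear_system(rows, rhs)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def project(h, point):
    """Apply H to a server point and divide out the scale factor."""
    X, Y = point
    w = h[2][0] * X + h[2][1] * Y + h[2][2]
    return ((h[0][0] * X + h[0][1] * Y + h[0][2]) / w,
            (h[1][0] * X + h[1][1] * Y + h[1][2]) / w)


# Hypothetical data: corners of a 1920x1080 server display and the
# positions at which they were found in a 640x480 client image.
server_corners = [(0, 0), (1920, 0), (1920, 1080), (0, 1080)]
client_corners = [(100, 80), (540, 60), (560, 400), (90, 420)]
H = estimate_homography(server_corners, client_corners)
for X, x in zip(server_corners, client_corners):
    print(X, "->", project(H, X), "detected at", x)
```

With more than four correspondences the system becomes over-determined, and a least-squares solution would replace the direct solve used here.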
The question now arises: how do we find four or
more corresponding points across the server display
and client image of that display? This problem can be
divided into two categories: (i) marker-based and (ii)
natural (markerless).
In marker-based display registration, the server is
required to maintain a dynamic display of four
distinctively coloured reference targets, no three of which
are collinear, which can easily be detected and
segmented by the client. Once the positions of these
targets have been detected in the client image, they can
be transmitted to the server, which knows where the
targets were displayed on the server display. A planar
homography estimation method can then be applied
to register the displays without any prior calibration
of the camera. Note that, since the homography trans-
formation between the server display and client dis-
play is known when the displays are registered, it is
possible to change the markers in the server display,
such that the shape, size and position of the markers
are constant in the client image irrespective of camera
viewing pose. This leads to more reliable detection
of the markers, since they do not become too small
to detect as the client camera moves away from the
server display.
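
The marker adaptation described above can be sketched as follows, for marker centres only. The sketch assumes that the current registration is available as a client-to-server matrix (the inverse of H) and asks where each marker must be drawn on the server display so that it appears at a fixed position in the client image. The matrix values, target positions, display size and function names are hypothetical; marker shape and size could be handled in the same way by mapping the corners of each marker outline.

```python
def apply_homography(h, point):
    """Map a 2D point through a 3x3 matrix given as nested lists."""
    x, y = point
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity")
    return (u / w, v / w)


def server_marker_positions(h_client_to_server, client_targets, server_size):
    """Return the server position for each desired client-image position.

    An entry is None when the marker would fall outside the server
    display, i.e. it cannot be shown for the current camera pose.
    """
    width, height = server_size
    positions = []
    for target in client_targets:
        sx, sy = apply_homography(h_client_to_server, target)
        on_screen = 0 <= sx < width and 0 <= sy < height
        positions.append((sx, sy) if on_screen else None)
    return positions


# Hypothetical client -> server registration from the previous frame.
H_CLIENT_TO_SERVER = [
    [2.0, 0.0, -200.0],
    [0.0, 2.0, -100.0],
    [0.0, 0.0, 1.0],
]

# Desired marker centres in an assumed 640x480 client image; no three
# of them are collinear.
CLIENT_TARGETS = [(160, 120), (480, 120), (480, 360), (160, 360)]

markers = server_marker_positions(H_CLIENT_TO_SERVER, CLIENT_TARGETS, (1920, 1080))
print(markers)
```

Returning None for off-screen markers is one possible design choice; a system could instead move the affected target inwards in the client image until all four markers fit on the server display.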
In natural display registration, no dynamically
controlled markers are used to aid registration (ho-
mography computation). Registration is achieved by
matching the client image to the unmodified server
display (although one can choose to use textured
backgrounds and windows) and this may be achieved
using one of several techniques in the computer vi-
sion and pattern recognition literature. Perhaps the
simplest approach is to use corner extraction (Harris
and Stephens, 1988), (Smith and Brady, 1995) fol-
VISAPP 2008 - International Conference on Computer Vision Theory and Applications