projection.
Next, candidate blocks of similar colors are
combined, as close as possible to the point where
the gesture was detected, to secure an area whose
size and shape correspond to the projected button.
Figure 3 shows an example of this processing
sequence: (a) is the image input from the camera,
(b) the image divided into blocks, (c) the candidate
blocks indicated in red, and (d) the region selected
as the projection area by combining candidate
blocks, indicated in green. In (d), the point at which
the gesture was recognized is assumed to be the
center of the image.
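For illustration, this combination step can be
sketched as follows. Since the search strategy is not
specified above, the exhaustive window scan below,
together with the function name and its interface, is
an assumption.

    import numpy as np

    def select_projection_area(candidates, gesture_rc, btn_h, btn_w):
        """Choose a button-sized window of candidate blocks nearest
        the gesture point.

        candidates: 2D bool array, True where a block's color is
            uniform enough to serve as part of the projection area.
        gesture_rc: (row, col) block index where the gesture was
            detected.
        btn_h, btn_w: required button size, in blocks.
        Returns (top, left) of the chosen window, or None if no
        window consists entirely of candidate blocks.
        """
        rows, cols = candidates.shape
        best, best_dist = None, np.inf
        for top in range(rows - btn_h + 1):
            for left in range(cols - btn_w + 1):
                if candidates[top:top + btn_h, left:left + btn_w].all():
                    # squared distance from the window center
                    # to the gesture point
                    cr = top + btn_h / 2.0
                    cc = left + btn_w / 2.0
                    d = (cr - gesture_rc[0]) ** 2 + (cc - gesture_rc[1]) ** 2
                    if d < best_dist:
                        best, best_dist = (top, left), d
        return best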
2.2 Touch Detection Function
In (Hartmann et al., 2012), the tip of a hand or a
finger is detected by foreground shape analysis after
separating its shadow. Their system can estimate the
height of the fingertip above the interaction surface
by calculating the 3D distance from the tip of the
shadow on the surface. However, to achieve precise
distance estimation, the camera must be placed far
from the light source (i.e., the projector), a
positional arrangement that cannot be adopted in
our system.
For detecting a fingertip over or on a small
interaction surface called a virtual widget,
Borkowski et al. proposed very simple and effective
methods (Borkowski et al., 2004; Borkowski et al.,
2006). Their metric for touch detection is the ratio
of foreground occupation in the camera view: if this
ratio is very high in the central region of the virtual
widget and sufficiently low in its surrounding
region, the system recognizes a pointing action.
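As an illustration of this metric, the following
sketch tests the occupancy ratios over a virtual
widget; the central/surrounding split and the two
thresholds are assumptions rather than values taken
from the cited papers.

    import numpy as np

    def is_pointing(fg_mask, widget_box, inner_min=0.8, outer_max=0.2):
        """Occupancy-ratio test in the spirit of Borkowski et al.

        fg_mask: 2D bool array, True where foreground was detected.
        widget_box: (top, left, height, width) of the virtual widget
            in image coordinates.
        A pointing is recognized when the central region of the
        widget is almost fully covered by foreground while the
        surrounding region is mostly free of it.
        """
        t, l, h, w = widget_box
        widget = fg_mask[t:t + h, l:l + w]
        # central region: the middle half of the widget (assumed split)
        center = widget[h // 4:h // 4 + h // 2, w // 4:w // 4 + w // 2]
        center_occ = center.mean()
        surround_occ = (widget.sum() - center.sum()) / (widget.size - center.size)
        return center_occ >= inner_min and surround_occ <= outer_max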
However, because the distance of the foreground
from the widget is not estimated, false detection
may occur when the tip of a thin, rod-like object, or
its shadow, happens to be observed over the central
region of the widget in the camera view. Therefore,
in our touch detection, we examine whether a user's
finger (the foreground) is close to the widget by the
ratio of the foreground to its shadow.
In order to make foreground shadows observable,
the camera is installed at a location slightly offset
from the optical axis of the projector (e.g., 50 cm to
the side in the setup described below).
When a finger enters the region of the virtual
button projected by the projector, the finger and its
shadow appear in the camera image, as shown in
Figure 4 (a). The shadow is large while the finger is
not touching the projection plane and almost
disappears when the finger touches it, as shown in
Figure 4 (b).
In this system, touch detection based on the
amount of this shadow is realized by the following
three functions. The first function separates the
background (the button region) from the foreground
(the finger and shadow regions). The second
function further separates the foreground into a
finger region and a shadow region, and the third
function determines, from the area ratio of the
finger and shadow regions, whether a touch
operation has been performed.
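The following compact sketch illustrates how these
three functions could fit together; the segmentation
rules and the decision threshold are assumptions
chosen for clarity, and the actual processing is
described in Sections 2.2.1 to 2.2.4.

    import cv2
    import numpy as np

    def detect_touch(frame, background, touch_ratio=4.0):
        """Three-stage touch test based on the shrinking shadow.

        frame, background: BGR images of the projected button region.
        All thresholds here are illustrative assumptions.
        """
        # 1) foreground/background separation by background subtraction
        diff = cv2.absdiff(frame, background)
        fg = diff.max(axis=2) > 30

        # 2) foreground split: shadow pixels are markedly darker than
        #    the background at the same location, finger pixels are not
        frame_v = frame.max(axis=2).astype(np.int16)
        back_v = background.max(axis=2).astype(np.int16)
        shadow = fg & (frame_v < back_v - 30)
        finger = fg & ~shadow

        # 3) decision by area ratio: a touching finger casts almost no
        #    visible shadow, so the finger area dominates the shadow area
        finger_area, shadow_area = int(finger.sum()), int(shadow.sum())
        if finger_area == 0:
            return False  # nothing over the button
        return finger_area / max(shadow_area, 1) >= touch_ratio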
A certain degree of variation in environmental
brightness must be tolerated because the assumed
environment for the system is a living space such as
a typical living room. Other projector-camera
systems that, like the present study, use fluctuations
in a shadow region include a system that uses
infrared LEDs to cope with fluctuations in
environmental brightness (Dung et al., 2013) and a
system that extracts the shadow region by altering
the color of the projected light (Homma et al., 2014).
However, the infrared LED system (Dung et al.,
2013) requires special equipment in addition to the
projector and camera. Furthermore, temporarily
changing the button color by altering the projected
light in order to separate the shadow region
(Homma et al., 2014) can be mistakenly perceived
by the user as a system response to a touch
operation.
We propose a method for detecting touch
operations in environments with fluctuating
brightness that uses only the image input from a
monocular camera, without altering the projected
light. Our touch detection method consists of the
following processes, described in Sections
2.2.1–2.2.4.
Figure 4: Example of the reduction in shadow consequent
on a touch operation.
2.2.1 Separation of Foreground and
Background
A background subtraction technique is used to
separate the foreground (the region containing the
hand, its shadow, etc.) from the background (the
projection plane onto which the button light is
projected). As detailed above, the method must be
robust to fluctuations in environmental brightness.
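One standard way to obtain such robustness with
plain background subtraction is to let the
background model adapt slowly, so that gradual
ambient changes are absorbed while fast changes,
such as a finger and its shadow, remain foreground.
The following sketch uses a running-average model;
it is a common technique and not necessarily the
exact formulation used in this system.

    import cv2
    import numpy as np

    class AdaptiveBackground:
        """Background subtraction with a slowly adapting model.

        A running average absorbs gradual changes in ambient
        brightness, while fast changes (a finger and its shadow
        entering the button region) remain foreground. The learning
        rate and threshold are illustrative assumptions.
        """

        def __init__(self, first_frame, alpha=0.02, thresh=25):
            self.model = first_frame.astype(np.float32)
            self.alpha = alpha    # adaptation speed of the background
            self.thresh = thresh  # per-pixel difference threshold

        def apply(self, frame):
            frame_f = frame.astype(np.float32)
            diff = cv2.absdiff(frame_f, self.model)
            fg = diff.max(axis=2) > self.thresh
            # update the model only where no foreground is present,
            # so a lingering hand is not absorbed into the background
            bg_mask = (~fg).astype(np.uint8)
            cv2.accumulateWeighted(frame_f, self.model, self.alpha,
                                   mask=bg_mask)
            return fg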
The"EverywhereSwitch"usingaProjectorandCamera
117