the working robot in the currently observed image. The
area covered by the corresponding points can then be
extracted as the region of the working robot in the
image.
In this study, we propose a voting method for efficient
detection. The captured image is divided into several
square regions. The working robot is recognized if
there are matched regions in which the number of
corresponding points exceeds a threshold given in
advance. These matched regions are taken as the area
of the working robot in the image.
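The voting step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `vote_regions`, the cell size, and the threshold value are our assumptions, and the matched point coordinates are assumed to come from a prior descriptor-matching stage.

```python
import numpy as np

def vote_regions(matched_points, image_shape, cell_size=40, threshold=5):
    """Divide the image into square regions and count matched feature
    points per region.  Regions whose count exceeds the threshold are
    taken as the area occupied by the working robot."""
    h, w = image_shape
    rows = (h + cell_size - 1) // cell_size
    cols = (w + cell_size - 1) // cell_size
    votes = np.zeros((rows, cols), dtype=int)
    for x, y in matched_points:          # (x, y) pixel coordinates
        votes[int(y) // cell_size, int(x) // cell_size] += 1
    # Boolean mask of regions with more matches than the threshold
    return votes, votes > threshold
```

The robot is recognized if the returned mask contains any true cell; the union of true cells approximates the robot's image region.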
2.2 3-D Position Detection
The 3-D position of the working robot is calculated
using stereo vision. Even if the observing robot has
only one camera, it can move to change its position
so that two or more images of the working robot are
captured from different views. Corresponding feature
points are then obtained from the images taken at
different angles. We can apply stereo vision techniques
such as the 8-point algorithm (Shi and Tomasi, 1994)
to those corresponding points to compute the 3-D
position.
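As one concrete way to carry out this step, the sketch below recovers a 3-D point from two views by linear (DLT) triangulation, assuming the two camera projection matrices have already been estimated (e.g. after recovering the epipolar geometry with the 8-point algorithm); the function name and interface are illustrative, not taken from the paper.

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one 3-D point from two views.
    P1, P2 : 3x4 camera projection matrices at the two observation
             positions; x1, x2 : corresponding image points (x, y)."""
    # Each image point contributes two linear constraints on the
    # homogeneous 3-D point X:  x * (P[2] @ X) - P[0] @ X = 0, etc.
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The solution is the right singular vector of A with the
    # smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]          # homogeneous -> Euclidean
```

In practice the projection matrices encode the observing robot's two camera poses, so odometry error at the second position directly affects the triangulated result.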
3 FEATURE POINT DETECTION
BASED ON SURF DESCRIPTOR
For fast detection of the feature points, we apply the
SURF (Speeded Up Robust Features) descriptor (Bay
et al., 2008). SURF is an improved version of the SIFT
descriptor (D.G. Lowe, 1999), designed for faster
computation.
The SIFT descriptor is capable of robustly detecting
feature points in an image. It can also describe the
quantities of the detected features robustly against
changes in scale, illumination, and image rotation. It
is, therefore, useful for object detection and
recognition.
The detection of SIFT features consists of extraction
of key points, localization, computation of orientation,
and description of the feature quantities. In the
extraction of key points, DoG (Difference of Gaussians)
is used to search for local extrema, detecting the
positions and scales of features. Some of these points
are then selected by the localization process. The
orientations of those points are then computed, and
their feature quantities are described. To describe the
feature quantities based on the orientation, the region
surrounding a feature point, divided into 4 × 4 blocks,
is rotated to the direction of the orientation. Making
a histogram over 8 directions for each block produces
a 128 (4 × 4 × 8)-dimensional feature vector, which
represents the quantity of the SIFT feature.

Figure 1: Experimental setup.
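The 4 × 4 × 8 descriptor construction can be sketched roughly as follows. This is a simplified illustration rather than the full SIFT algorithm (it omits Gaussian weighting and trilinear interpolation); the function name and the assumption of a pre-rotated 16 × 16 gradient patch are ours.

```python
import numpy as np

def sift_like_descriptor(orientations, magnitudes):
    """128-dimensional SIFT-style descriptor from a 16x16 patch of
    gradient orientations (radians, in [0, 2*pi)) and magnitudes,
    assumed already rotated to the keypoint's dominant orientation."""
    desc = []
    for by in range(4):                      # 4 x 4 grid of blocks
        for bx in range(4):
            block_o = orientations[4*by:4*by+4, 4*bx:4*bx+4]
            block_m = magnitudes[4*by:4*by+4, 4*bx:4*bx+4]
            # 8-bin orientation histogram weighted by gradient magnitude
            hist, _ = np.histogram(block_o, bins=8, range=(0, 2*np.pi),
                                   weights=block_m)
            desc.extend(hist)
    desc = np.asarray(desc)                  # 4 * 4 * 8 = 128 values
    return desc / (np.linalg.norm(desc) + 1e-12)   # normalize
```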
The SURF descriptor speeds up the above processes of
key-point extraction and feature description. In the
extraction of key points, SURF creates the response
image from the determinant of the Hessian, using box
filters instead of the Gaussian function. A box filter
is an approximation of the second-derivative filter of
a Gaussian. With box filters the filtering computation
becomes fast, because each filter consists of regions
of equal value, so its response can be obtained from
an integral image computed in advance. In the
description of feature quantities, the dimension of the
feature vector is reduced from 128 to 64 by dividing
the orientation of each block into 4 directions.
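The reason box filtering is fast can be illustrated with the integral image (summed-area table): once it is computed, the sum of pixel values over any rectangular box takes only four lookups, regardless of the box size. A minimal sketch (function names are ours):

```python
import numpy as np

def integral_image(img):
    """Summed-area table: ii[y, x] = sum of img[:y+1, :x+1]."""
    return np.cumsum(np.cumsum(img, axis=0), axis=1)

def box_sum(ii, y0, x0, y1, x1):
    """Sum of img[y0:y1, x0:x1] using at most four lookups in the
    integral image, independent of the box size."""
    total = ii[y1 - 1, x1 - 1]
    if y0 > 0:
        total -= ii[y0 - 1, x1 - 1]
    if x0 > 0:
        total -= ii[y1 - 1, x0 - 1]
    if y0 > 0 and x0 > 0:
        total += ii[y0 - 1, x0 - 1]
    return total
```

This constant-time box sum is what makes evaluating the box-filter approximation of the Hessian cheap at every position and scale.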
4 EXPERIMENTS
4.1 Experimental Setup
The method described above has been implemented on
two wheeled mobile robots, Pioneer P3-DX (Mobile
Robots Pioneer P3-DX, 2007), each 393 mm in width,
445 mm in length, and 237 mm in height.
One robot has a camera, Canon VC-C50i, which can
rotate in the pan and tilt directions and therefore
serves as the observing robot. A board computer,
Interface PCI-B02PA16W, was also mounted to process
the camera images during observation and to control
the movement of the robot. We used OpenCV to develop
the image-processing software for observation.
Fig. 1 shows the experimental setup. The observing
robot initially stays at P1 to observe and detect the
working robot, which does not have a camera, using
the method described above. The observing robot then
moves from P1 to P2 in Fig. 1 to observe the working
robot from a different visual angle, in order to obtain
corresponding points and calculate its position.