IMAGE SEGMENTATION FOR OBJECT DETECTION
ON A DEEPLY EMBEDDED MINIATURE ROBOT
Alexander Jungmann¹, Thomas Schierbaum² and Bernd Kleinjohann¹
¹ Cooperative Computing & Communication Laboratory, University of Paderborn, Paderborn, Germany
² Product Engineering, Heinz Nixdorf Institute, University of Paderborn, Paderborn, Germany
Keywords:
Image Segmentation, Run-length Encoding, Moments, Robotics.
Abstract:
In this paper, an image segmentation approach for object detection on the miniature robot BeBot - a deeply embedded system - is presented. The approach was developed to enable the robot to detect and identify objects in its environment by means of its camera while respecting the robot's limited resources. The fundamental algorithm is based on the region growing and region merging concept and identifies homogeneous regions consisting of adjacent pixels with similar color. By internally representing a contiguous block of pixels in terms of run-lengths, the computational effort of both the region growing and the region merging operation is minimized. Finally, for subsequent object detection processes, a region is efficiently translated into a statistical feature representation based on discretized moments.
1 INTRODUCTION
Embedded systems are usually very restricted with respect to their memory and computational power. The miniature robot BeBot (see Figure 1), which combines an ARM Cortex-A8 600 MHz processor, 256 MB main memory and a small camera in a 9x9 cm chassis (Herbrechtsmeier et al., 2009), is a mobile representative of an embedded system with such restrictions. In addition, it is able to explore its surroundings by means of its differential chain drive. However, to be able to act autonomously, the robot has to perceive its environment by means of its camera. For this purpose, an efficient image segmentation approach that takes the mentioned restrictions into account was developed and is presented within the scope of this paper. Object detection mechanisms themselves, however, are not part of this particular work.
Figure 1: Miniature robot BeBot, (a) with and (b) without light guide, enabling the robot to express its internal state.
This paper is organized as follows. The fundamental principles of the realized segmentation process are described in Section 2. Section 3 deals with the external feature representation for subsequent object detection processes. Results of the segmentation algorithm running on a BeBot are presented in Section 4. The paper finally concludes with Section 5.
2 SEGMENTATION APPROACH
The basic idea of image segmentation is the identification of contiguous blocks of pixels that are homogeneous with respect to a pre-defined criterion. By doing so, the pixel-based visual information is abstracted in order to get a reduced data representation, which is more convenient on the one hand and less computationally expensive on the other hand. Regarding the computational power of the BeBot, object detection based on raw pixel data is not feasible at all.
2.1 Color as Criterion of Homogeneity
In the context of this paper, the criterion for constructing homogeneous regions is based on color information. Since the camera of the BeBot delivers YUV images, a simple but very efficient heuristic $H_{yuv}$, which is based on the Manhattan distance, is applied in order to decide
whether two color tuples $(y_1, u_1, v_1)$ and $(y_2, u_2, v_2)$ reside in a pre-defined neighborhood within the three-dimensional YUV color space:
$$
H_{yuv} = H((y_1, u_1, v_1), (y_2, u_2, v_2)) =
\begin{cases}
1 & \text{if } |y_1 - y_2| \le c_1 \ \wedge\ |u_1 - u_2| + |v_1 - v_2| \le c_2,\\
0 & \text{else.}
\end{cases}
\qquad (1)
$$
By separating the luma value $Y$ from the chrominance values $U$ and $V$, the different components can be independently weighted by means of the two parameters $c_1$ and $c_2$.
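To illustrate how cheap this check is, the following C++ sketch shows one possible implementation of $H_{yuv}$; it assumes 8-bit YUV components, and the type and function names are illustrative rather than taken from the BeBot code.

#include <cstdint>
#include <cstdlib>

// One YUV color tuple with 8-bit components (assumption for this sketch).
struct Yuv {
    std::uint8_t y, u, v;
};

// Heuristic H_yuv of Equation (1): the luma difference is checked against c1,
// the Manhattan distance of the chrominance components against c2.
bool similarColor(const Yuv& a, const Yuv& b, int c1, int c2) {
    const bool lumaOk   = std::abs(a.y - b.y) <= c1;
    const bool chromaOk = std::abs(a.u - b.u) + std::abs(a.v - b.v) <= c2;
    return lumaOk && chromaOk;
}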
2.2 Internal Data Representation
For the internal representation of the regions during the segmentation process, the Run-Length Encoding concept is incorporated: sequences of adjacent pixels are compactly encoded as so-called run-lengths (or runs), where a single run is defined in terms of three integer values:
$$
run_i = \big\langle (x_i, y_i),\, l_i \big\rangle, \qquad (2)
$$
with $(x_i, y_i)$ being the coordinates of the starting pixel, whereas $l_i$ represents the number of adjacent pixels within the same row and therefore denotes the length of a single run. Furthermore, an entire region $R$ may consist of a sequence of $n$ adjacent runs:
$$
R = \big\langle \langle (x_1, y_1), l_1 \rangle, \langle (x_2, y_2), l_2 \rangle, \ldots, \langle (x_n, y_n), l_n \rangle \big\rangle. \qquad (3)
$$
The computational effort for both the region growing and the region merging operation (cf. Section 2.3) is minimal. While adding a pixel to a region is nothing but incrementing the length $l_i$ of the associated run $run_i$, merging of two regions is realized by simply appending the sequence of runs of the first region to the sequence of runs of the second region.
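A minimal C++ sketch of this internal representation is given below; the data layout is only an assumption for illustration purposes, and the average color maintained for each region during segmentation is omitted.

#include <vector>

// A run: starting pixel (x, y) plus the number of adjacent pixels in that row,
// as defined in Equation (2).
struct Run {
    int x;
    int y;
    int length;
};

// A region is a sequence of adjacent runs (Equation (3)); the average color
// maintained during segmentation is omitted in this sketch.
struct Region {
    std::vector<Run> runs;
};

// Region growing: adding a pixel to a region is a single increment of the
// length of the associated run.
inline void addPixel(Run& run) {
    ++run.length;
}

// Region merging: the runs of one region are simply appended to the runs of
// the other region; the emptied region can then be discarded.
inline void mergeRegions(Region& keep, Region& discard) {
    keep.runs.insert(keep.runs.end(), discard.runs.begin(), discard.runs.end());
    discard.runs.clear();
}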
2.3 Basic Algorithm
The basic segmentation process is depicted in Algorithm 1. The main loop (lines 3-25) iterates row by row over the whole image, starting at the topmost row. The inner iteration loop (lines 4-24) processes each pixel within a row once, starting at the leftmost pixel position. After identifying the left adjacent run $run_{left}$ as well as its associated region $R_{left}$, heuristic $H_{yuv}$ (1) is applied in order to check if the colors of the current pixel and $R_{left}$ are similar (line 7). If so, the region growing step takes place by adding the current pixel
Algorithm 1: Image Segmentation Algorithm.
 1: image = latest camera image
 2: regions = {}                      // set of identified regions
 3: for all row ∈ image do
 4:   for all pixel ∈ row do
 5:     run_left = left adjacent run
 6:     R_left = region(run_left)
 7:     if H(yuv(pixel), yuv(R_left)) then
 8:       // region growing
 9:       add(run_left, pixel)
10:       continue with next pixel
11:     else
12:       regions_top = top adjacent regions
13:       for all R_top ∈ regions_top do
14:         if H(yuv(R_left), yuv(R_top)) then
15:           // region merging
16:           merge(R_left, R_top)
17:           remove(regions, R_top)
18:         end if
19:       end for
20:       run_new = new run(pixel)
21:       R_new = new region(run_new)
22:       add(regions, R_new)
23:     end if
24:   end for
25: end for
26: return regions
pixel to the left adjacent run $run_{left}$ (the length of $run_{left}$ is incremented by 1) and updating the associated region $R_{left}$ with respect to its average color. Afterwards, the algorithm continues with the next pixel of the row (line 10).
If heuristic $H_{yuv}$ fails, the left adjacent run $run_{left}$ is considered to be completed. The algorithm proceeds with its region merging mechanism. For this purpose, all regions bordering run $run_{left}$ at the top are identified. Subsequently, another iteration loop tries to identify every region $R_{top}$ within $regions_{top}$ that can be merged with the region $R_{left}$ associated with $run_{left}$ (lines 13-19) by again applying heuristic $H_{yuv}$. If $H_{yuv}$ succeeds for two regions $R_{top}$ and $R_{left}$, the regions are merged by appending all runs of $R_{top}$ to $R_{left}$. Furthermore, region $R_{left}$ is updated with respect to its average color, whereas region $R_{top}$ is completely discarded by removing it from the set of heretofore identified regions.
Independent of the region merging step, a new run $run_{new}$ with length 1 and the current pixel as its starting position, as well as a new region $R_{new}$ with $run_{new}$ as its first run, are allocated. Finally, $R_{new}$ is added to the set of heretofore identified regions.
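To relate Algorithm 1 to the run-length representation of Section 2.2, the following C++ sketch covers only the in-row part of the scan (lines 4-10 of Algorithm 1): runs are grown as long as the color similarity holds, and new runs are started otherwise. The cross-row region merging of lines 12-19 and the average color bookkeeping are left out, the comparison is made against the color of the left run instead of the region's average color, and all names are illustrative assumptions.

#include <cstdint>
#include <cstdlib>
#include <vector>

struct Yuv { std::uint8_t y, u, v; };
struct Run { int x, y, length; Yuv color; };

// Heuristic H_yuv (Equation (1)), as in the sketch of Section 2.1.
static bool similarColor(const Yuv& a, const Yuv& b, int c1, int c2) {
    return std::abs(a.y - b.y) <= c1 &&
           std::abs(a.u - b.u) + std::abs(a.v - b.v) <= c2;
}

// Scan one image row: while the current pixel is similar to the run to its
// left, that run is grown by one pixel; otherwise a new run of length 1 starts.
std::vector<Run> scanRow(const std::vector<Yuv>& row, int rowIndex,
                         int c1, int c2) {
    std::vector<Run> runs;
    for (int x = 0; x < static_cast<int>(row.size()); ++x) {
        if (!runs.empty() && similarColor(row[x], runs.back().color, c1, c2)) {
            ++runs.back().length;                     // region growing
        } else {
            runs.push_back({x, rowIndex, 1, row[x]}); // new run
        }
    }
    return runs;
}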
3 FEATURE DESCRIPTION
By interpreting a region's associated pixels as a two-dimensional Gaussian distribution in the image plane, a region can be implicitly described by means of statistical parameters, namely the two mean values $m_x$ and $m_y$, the two variances $\sigma_x^2$ and $\sigma_y^2$, and the covariance $\sigma_{xy}$. A generalization of these specific parameters are the statistical moments, which can be directly applied in our context in discretized form (Hu, 1962): the two mean values correspond to the two moments of first order ($m_{10}$ and $m_{01}$), whereas the two variances and the covariance correspond to the centralized (or central) moments of second order ($\mu_{20}$, $\mu_{02}$ and $\mu_{11}$). In addition, the mass of a region is equivalent to the moment of zeroth order ($m_{00}$).
Since the central moments can be directly derived from the geometric moments up to second order (Hu, 1962), the basic feature descriptor for representing an extracted region is given by the following set $M$ of moments:
$$
M = \{ m_{pq} \mid p + q \le 2 \}
  = \{ m_{00}, m_{10}, m_{01}, m_{11}, m_{20}, m_{02} \} \qquad (4)
$$
For efficiently computing the required moments, the runs of a region can be directly used by applying the Delta ($\delta$) Method (Zakaria et al., 1987). In this context, $S1_i$ and $S2_i$ are defined as follows:
$$
S1_i = \sum_{k=0}^{\delta_i - 1} k = \frac{\delta_i^2 - \delta_i}{2}, \qquad
S2_i = \sum_{k=0}^{\delta_i - 1} k^2 = \frac{\delta_i^3}{3} - \frac{\delta_i^2}{2} + \frac{\delta_i}{6}, \qquad (5)
$$
with $\delta_i$ corresponding to the length $l_i$ of a single run $run_i$. The required geometric moments $m_{00_i}$, $m_{10_i}$, $m_{01_i}$, $m_{11_i}$, $m_{20_i}$ and $m_{02_i}$ of $run_i$ can then be computed in the following way:
$$
\begin{aligned}
m_{00_i} &= \delta_i, & m_{01_i} &= \delta_i \cdot y_i, & m_{02_i} &= \delta_i \cdot y_i^2,\\
m_{10_i} &= \delta_i \cdot x_i + S1_i, & m_{11_i} &= y_i \cdot [\delta_i \cdot x_i + S1_i] = y_i \cdot m_{10_i}, & m_{20_i} &= \delta_i \cdot x_i^2 + 2 \cdot S1_i \cdot x_i + S2_i \qquad (6)
\end{aligned}
$$
Finally, the moments of an entire region $R$ correspond to the sums of the particular moments of all $n$ associated runs:
$$
M = \Big\{ m_{pq} \;\Big|\; m_{pq} = \sum_{i=1}^{n} m_{pq_i},\ p + q \le 2 \Big\} \qquad (7)
$$
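As an illustration of how Equations (5) to (7) can be evaluated directly on the run-length representation, the following C++ sketch accumulates the moments of a region from its runs; the Run layout follows the sketch in Section 2.2 and the remaining names are assumptions.

#include <vector>

struct Run { int x, y, length; };

// Geometric moments up to second order of one region (Equation (4)).
struct Moments {
    double m00 = 0, m10 = 0, m01 = 0, m11 = 0, m20 = 0, m02 = 0;
};

// Delta method (Zakaria et al., 1987): per-run moments according to
// Equations (5) and (6), summed over all runs of the region (Equation (7)).
Moments regionMoments(const std::vector<Run>& runs) {
    Moments m;
    for (const Run& r : runs) {
        const double d  = r.length;                          // delta_i
        const double s1 = (d * d - d) / 2.0;                 // S1_i
        const double s2 = d * d * d / 3.0 - d * d / 2.0 + d / 6.0; // S2_i
        const double x = r.x, y = r.y;
        m.m00 += d;
        m.m01 += d * y;
        m.m02 += d * y * y;
        m.m10 += d * x + s1;
        m.m11 += y * (d * x + s1);
        m.m20 += d * x * x + 2.0 * s1 * x + s2;
    }
    return m;
}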
For representing a feature in a more explicit manner, additional geometric attributes can be derived from $M$ and the associated central moments (Teague, 1980; Prokop and Reeves, 1992). In this context, the coordinates $(x, y)$ of the center of mass of an extracted feature in the image plane are defined by the moments of zeroth and first order:
$$
x = \frac{m_{10}}{m_{00}} \quad \text{and} \quad y = \frac{m_{01}}{m_{00}} \qquad (8)
$$
Furthermore, due to the statistical interpretation in terms of a Gaussian distribution, a feature is equivalent to an elliptical disk with constant intensity, having definite size, orientation and eccentricity, and being centered at the origin of the image plane. The lengths of its major and minor axes as well as its angle of inclination can be computed with the use of the associated central moments (Teague, 1980). By combining both the center of mass and the elliptical disk, a feature can be geometrically described in terms of an ellipse that is located at the center of mass of the respective feature (cf. Figure 3).
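A possible C++ sketch for deriving the center of mass and the equivalent ellipse from the moment set $M$ is shown below. The ellipse formulas follow the common moment-based formulation (Teague, 1980); since the exact normalization used on the BeBot is not given in this paper, the code is only an assumption.

#include <cmath>

struct Moments { double m00, m10, m01, m11, m20, m02; };

struct EquivalentEllipse {
    double cx, cy;       // center of mass (Equation (8))
    double major, minor; // half-axis lengths
    double angle;        // inclination angle in radians
};

// Derive center of mass and equivalent ellipse from the moments up to
// second order via the normalized central moments.
EquivalentEllipse toEllipse(const Moments& m) {
    EquivalentEllipse e;
    e.cx = m.m10 / m.m00;
    e.cy = m.m01 / m.m00;
    const double u20 = m.m20 / m.m00 - e.cx * e.cx; // variance in x
    const double u02 = m.m02 / m.m00 - e.cy * e.cy; // variance in y
    const double u11 = m.m11 / m.m00 - e.cx * e.cy; // covariance
    const double d   = std::sqrt((u20 - u02) * (u20 - u02) + 4.0 * u11 * u11);
    e.major = std::sqrt(2.0 * (u20 + u02 + d));
    e.minor = std::sqrt(2.0 * (u20 + u02 - d));
    e.angle = 0.5 * std::atan2(2.0 * u11, u20 - u02);
    return e;
}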
4 RESULTS
The image segmentation approach was implemented in C/C++ and successfully applied to the BeBot. While capturing YUV images of size 320x240 pixels, the algorithm enables us to process at least 10 entire images per second. Regarding the load factor of the BeBot during our experiments, the overall performance of the segmentation algorithm heavily depends on the number of different regions that are constructed, which in turn depends on the values of the calibration parameters in combination with the homogeneity of the original image.
Figure 2 shows different segmentation results with respect to different values of the calibration parameters $c_1$ and $c_2$ of heuristic $H_{yuv}$. Whereas the goal of the algorithm was to clearly extract the colored objects (four balls and one rectangular block), we constructed the background (the horizon) to be as inhomogeneous as possible (see Figure 2(a)). From Figure 2(c) to Figure 2(d), only parameter $c_1$ for modifying the influence of the luma component $Y$ was changed. By increasing $c_1$, the segmented image becomes more and more homogeneous in comparison to Figure 2(b). The same holds true when the second parameter $c_2$ of heuristic $H_{yuv}$ is exclusively modified (Figure 2(e) to Figure 2(f)). However, when comparing Figure 2(c) and Figure 2(f), changing the allowed range of luma similarity ($c_1$) seems to have a greater influence than changing the allowed range of chrominance similarity ($c_2$).
Figure 2: Results of the segmentation algorithm running on the BeBot with different parameters $c_1$ and $c_2$: (a) original, (b) $c_1 = 10$, $c_2 = 10$, (c) $c_1 = 10$, $c_2 = 20$, (d) $c_1 = 10$, $c_2 = 30$, (e) $c_1 = 20$, $c_2 = 10$, (f) $c_1 = 30$, $c_2 = 10$, (g) $c_1 = 30$, $c_2 = 20$, (h) $c_1 = 50$, $c_2 = 50$. In addition, the minimal length of a run was limited to 5 pixels.
Keeping that in mind, Figure 2(g) shows a very good overall result with respect to the homogeneity of the segmented image. In this context, the value of parameter $c_1$ was set to 30, whereas the value of parameter $c_2$ was set to a slightly lower value of 20.
We tried to relax the restriction of color similarity even further by simultaneously increasing both parameters $c_1$ and $c_2$. At some point, the algorithm begins to merge color values that obviously are not similar at all, whereas regions that are of similar color but differ with respect to their intensity (brightness) are not merged (see Figure 2(h)). This issue can be traced back to the characteristics of the YUV color space.
Last but not least, Figure 3 exemplarily depicts the external region representation in terms of equivalent ellipses, which are located at the centers of mass of the extracted features.

Figure 3: Feature representation in terms of the associated centers of mass and equivalent ellipses; (a) original image, (b) segmented image.
5 CONCLUSIONS
In this paper, an efficient color-based image segmentation approach for the deeply embedded miniature robot BeBot is presented. In order to minimize the computational effort as well as the memory consumption during the segmentation process, regions are compactly represented in terms of runs while they are constructed. Furthermore, in order to provide a convenient data representation for subsequent object detection processes, the constructed regions are interpreted as two-dimensional Gaussian distributions in the image plane. Hence, they are efficiently translated into a statistical feature description in terms of discretized moments. Finally, even though the chosen heuristic for deciding whether two color values are similar or not is very simple, it produces sufficiently good results with respect to the separation of objects in a realistic environment.
REFERENCES
Herbrechtsmeier, S., Witkowski, U., and Rückert, U. (2009). BeBot: A modular mobile miniature robot platform supporting hardware reconfiguration and multi-standard communication. In Progress in Robotics, volume 44, pages 346–356. Springer Berlin Heidelberg.

Hu, M.-K. (1962). Visual pattern recognition by moment invariants. Information Theory, IEEE Transactions on, 8(2):179–187.

Prokop, R. J. and Reeves, A. P. (1992). A survey of moment-based techniques for unoccluded object representation and recognition. CVGIP: Graph. Models Image Process., 54(5):438–460.

Teague, M. R. (1980). Image analysis via the general theory of moments. Journal of the Optical Society of America, 70:920–930.

Zakaria, M. F., Vroomen, L. J., Zsombor-Murray, P. J. A., and van Kessel, J. M. H. M. (1987). Fast algorithm for the computation of moment invariants. Pattern Recogn., 20(6):639–643.