Figure 2: (a) An artificial image with low-contrast edges. (b) The result of the Sobel operator (Nixon and Aguado, 2002) with a high threshold. (c) The result of the Sobel operator with a low threshold (in order to extract the weak edges), which produces unwanted edges due to the shading ((c) is scaled independently for the sake of better visibility).
This paper proposes a concrete signal-symbol loop mechanism to improve the extraction of low-contrast edges by making use of motion information, namely the change of a symbolic local edge descriptor under a Rigid Body Motion (RBM). The change of position and orientation of this descriptor under an RBM can be formulated explicitly: after estimating the position of a 3D edge descriptor in a later frame, the image projection of the estimated 3D descriptor provides feedback to the filter processing level. This feedback states that there must be an edge descriptor with certain properties at a certain position. The filter processing level then enhances the information at that position if the feedback is consistent with the original image information. A rough outline of the proposed mechanism is given in figure 1.
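To fix ideas, the loop can be sketched as follows. This is only an illustrative outline; the helper names (extract_3d_primitives, apply_rbm, project_to_image, is_consistent, enhance_filter_response) are hypothetical placeholders for the components described in sections 2 and 3, not functions defined in the paper:

```python
# Illustrative sketch of the proposed signal-symbol loop.
# All helper functions are hypothetical placeholders.
def signal_symbol_loop(frame_t, frame_t1, rbm):
    """Predict symbolic edge descriptors under an RBM and feed the
    predictions back to the filter (signal) level of the later frame."""
    # Symbolic level: 3D edge descriptors extracted at time t.
    primitives_3d = extract_3d_primitives(frame_t)

    enhanced = frame_t1.copy()
    for prim in primitives_3d:
        # Predict the descriptor's pose in the later frame under the RBM.
        predicted = apply_rbm(prim, rbm)

        # Project the predicted 3D descriptor into the image of the later frame.
        x, theta = project_to_image(predicted)

        # Feedback: enhance the filter response at the predicted position
        # only if the prediction is consistent with the image information there.
        if is_consistent(frame_t1, x, theta):
            enhanced = enhance_filter_response(enhanced, x, theta)
    return enhanced
```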
The approach we introduce here is related to 'adaptive thresholding' approaches, which are used, for example, in image segmentation. These can also recover low-contrast edges by adjusting the threshold. This adjustment, however, is based on the local distribution of image intensities (see, e.g., (Gonzalez and Woods, 1992)). Our approach differs from adaptive thresholding in that it makes use of symbolic information, which facilitates a more global and more directed mechanism than the local intensity distribution alone. Moreover, as we discuss at the end of the paper, the novelty of the current paper lies in the proposal of a symbol-to-signal feedback mechanism that can also be applied in other contexts.
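For contrast, a typical local adaptive threshold of this kind can be sketched as follows; this is a minimal illustration based on a local mean (window size and offset are arbitrary) and is not part of the proposed method:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def adaptive_threshold_edges(gradient_magnitude, window=31, offset=0.0):
    """Keep edge responses that exceed the mean response in a local window.

    The decision is based purely on the local distribution of image
    intensities / filter responses; no symbolic information is involved.
    """
    local_mean = uniform_filter(gradient_magnitude.astype(float), size=window)
    return gradient_magnitude > (local_mean + offset)
```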
The idea of using feedback in vision systems is not new (Aloimonos and Shulman, 1989; Angelucci et al., 2002; Galuske et al., 2002; Bullier, 2001). For computational models, the interested reader is directed, for example, to (Bayerl and Neumann, 2007) for motion disambiguation or to (Bullier, 2001) for modelling long-range information exchange between neurons at the neuronal level. Our work differs from the above-mentioned works in that we introduce a feedback mechanism between different layers of processing, i.e., the signal level and the symbol level, and we apply it in a different context.
The paper is organized as follows: In section 2,
we introduce the symbolic edge descriptors and the
concept of RBM that are utilized in this paper. Section
3 describes our feedback mechanism. In section 4,
we present and discuss the results, and the paper is
concluded in section 5.
2 SYMBOLIC DESCRIPTORS
AND PREDICTIONS
In this section, we give a brief description of the image descriptors that we use to represent local scene information at the symbolic level (section 2.1). These descriptors represent local image information in a condensed way and thereby transform the local signal information to a symbolic level. In section 2.2, we briefly comment on Rigid Body Motion, which we use as the underlying regularity for predictions at the symbolic level.
2.1 Multi-modal Primitives
The concept of multi-modal primitives was first introduced in (Krüger et al., 2004). These primitives are local multi-modal scene descriptors, motivated by the hyper-columnar structures in V1 (Hubel and Wiesel, 1969).
In their current form, primitives can be edge-like or homogeneous and carry 2D or 3D information. For the current paper, only edge-like primitives are relevant. An edge-like 2D primitive (figure 3(a)) is defined as:

π = (x, θ, ω, (c_l, c_m, c_r)),    (1)
where x is the image position of the primitive; θ is the 2D orientation; ω represents the local phase; and the color is coded as three vectors (c_l, c_m, c_r), corresponding to the left (c_l), the middle (c_m) and the right side (c_r) of the primitive. See (Krüger et al., 2004) for more information about these modalities and their extraction.
Figure 4 shows the extracted primitives for an exam-
ple scene.
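To make the representation concrete, the tuple in equation (1) can be written as a small record type; the following is a minimal sketch in Python (the class and field names, and the use of NumPy arrays for the position and color vectors, are our choices and do not appear in (Krüger et al., 2004)):

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Primitive2D:
    """Edge-like 2D primitive pi = (x, theta, omega, (c_l, c_m, c_r))."""
    x: np.ndarray        # 2D image position
    theta: float         # 2D orientation
    omega: float         # local phase
    c_left: np.ndarray   # color on the left side of the edge (e.g., RGB)
    c_mid: np.ndarray    # color on the edge itself
    c_right: np.ndarray  # color on the right side of the edge
```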
A primitive π is a 2D descriptor which can be used to find correspondences in a stereo framework in order to create 3D primitives (as introduced in (Krüger et al., 2004)), which have the following formulation:

Π = (X, Θ, Ω, (c_l, c_m, c_r)),    (2)