MAKING STRUCTURAL PATTERN RECOGNITION
TRACTABLE BY LOCAL INHIBITION
E. Michaelsen, L. Doktorski and M. Arens
Forschungsinstitut für Optronik und Mustererkennung, Gutleuthausstr. 1, 76275 Ettlingen, Germany
Keywords:
Structural pattern recognition, Local inhibition.
Abstract:
Declarative knowledge and control decisions on the sequence of interpretation acts are separated in a structural
pattern recognition system. The control can be optimized leaving the knowledge fixed. A simple production
system is used as declarative example knowledge. It is tailored to recognize and locate rectangles in images
where object primitives are several thousand very short contour segments. Different control strategies can
be realized: (i) a simple quality-driven bottom-up control; (ii) a heuristic strategy penalizing object instances
that have already been partners in a performed reduction; and (iii) a new psychologically inspired strategy
that combines local inhibition with less local excitation. These strategies are compared quantitatively on
synthetic data and qualitatively on a real aerial image.
1 INTRODUCTION
Controlling the search for feasible reductions, given a
declarative knowledge structure such as a production system
and a set of measured primitive objects, has been an
interesting topic ever since structural pattern recognition
was proposed decades ago. In particular, production systems
can implicitly define huge combinatorial search spaces that
cannot be explored systematically with feasible effort. We
set the following three major requirements for the control
unit of a production system interpreter suitable for pattern
recognition and machine vision: (i) it should avoid visiting
the whole combinatorial search space; in fact, it should get
along with working out only a very tiny portion of it; (ii) it
should be capable of handling large numbers of object
instances; and (iii) it should have anytime capability in the
sense that it provides some reductions of full depth very
early and uses additional time to check alternatives and to
improve the evidence for the validity of the reductions
already found. If run to completion, or indefinitely, the
system should approach a correct interpreter.
To meet these requirements, the correctness of the
interpretation system may be traded away. The practitioner is
satisfied with approximate correctness as long as it does not
affect the usability and reliability of the system on real
data. In Section 2 of this contribution we give an
approximating interpreter that is tailored to these
requirements. A particularly simple system containing only a
few simple productions is also given in Section 2. It is
complex enough to bring out the differences, yet simple enough
to keep an overview of what is going on. Section 3 describes
three control strategies for this setup, and Section 4
compares them. Quantitative testing is done with synthetic
data, and there is an additional qualitative assessment with a
real aerial image. The remainder of this section sets the work
in relation to the published state of the art in the field.
Automatic "image understanding" has been a major issue in
pattern recognition for many decades (Basu et al., 2005;
Matsuyama and Hwang, 1990; Draper et al., 1989). Most such
proposals were tailored to the automatic recognition of
man-made structure, in particular from remote sensing data.
But structural and cognitive methods for image analysis are
not restricted to remote sensing applications. There are also
examples of similar structure that were originally designed
for medical applications (Niemann et al., 1990) and for
automatic reasoning for safety in robotics (Qureshi et al.,
2005). All of these proposals also contain approaches to
optimizing the control; in particular, ERNEST (Niemann et al.,
1990) came with a sophisticated theory of optimal control.
Syntactic methods for image understanding have recently gained
interest again. E.g., (Zhu and Mumford, 2006b) have introduced
a stochastic grammar for the understanding of images. There
the emphasis is not on efficient control of user-defined fixed
knowledge but on defining a sound description language capable
of capturing and learning diverse and general visual
categories. Biological inspiration for our
inhibition/excitation function presented in Section 3 comes
from interdisciplinary work between perceptual psychology and
computer vision such as (Aziz and Mertsching, 2007). Our work
can be seen as a direct descendant and continuation of the
blackboard-based accumulating interpretation system proposed
in (Stilla and Michaelsen, 1997). There the main application
was seen in the recognition of man-made structures such as
buildings in aerial imagery. The system has also been used for
other purposes (compare (Michaelsen et al., 2006; Michaelsen
et al., 2006)).
Michaelsen E., Doktorski L. and Arens M. (2009).
MAKING STRUCTURAL PATTERN RECOGNITION TRACTABLE BY LOCAL INHIBITION.
In Proceedings of the Fourth International Conference on Computer Vision Theory and Applications, pages 381-384
DOI: 10.5220/0001767203810384
Copyright © SciTePress
Intelligent control has always been a major issue. Using
Gaussian local inhibition in order to obtain a priority
sorting for 3D surface plane patches in laser data has been
reported by (von Hansen et al., 2006a). Although that work
used the same interpretation dispatcher, the inhibition was
only performed after the search in order to rank the resulting
object set. In this contribution, such inhibition is used to
control the search for all kinds of objects during the run.
2 A SIMPLE SYSTEM
The example production system acts on primitive ob-
jects called line, which are extracted from an image
with a gradient filter. For simplicity, the system only
has three non-primitive object classes, namely long-
line, angle and rectangle. The system contains the
three productions given in Table 1.
Table 1: Example production system. The terminal symbol l denotes lines, whereas the non-terminal symbols L, A, and R denote long lines, angles, and rectangles, respectively.
\[ L \xrightarrow[\text{regression}]{\text{colinear overlapping}} l \cdots l \tag{1} \]
\[ A \xrightarrow[\text{intersection}]{\text{rectangular adjacent}} L \; L \tag{2} \]
\[ R \xrightarrow[\text{intersection}]{\text{crosswise adjacent}} A \; A \tag{3} \]
We distinguish two normal forms of productions:
Normal form 1 generates a pair of objects from a sin-
gle non-terminal object. Productions of this form are
used for part-of intentions (compare productions (2)
and (3) in Table 1). Normal form 2 generates a set of
objects of arbitrary (finite) size of the same type from
a single non-terminal object. Such productions are
used for cluster intentions (Michaelsen et al., 2008).
Production (1) for example codes a Hough-type accu-
mulator for straight lines.
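As an illustration of how the productions of Table 1 and their normal forms might be held as data, consider the following Python sketch. The record type `Production`, its field names, and the two helper functions are hypothetical and not part of the described system; the constraint and construction strings are simply the annotations given in Table 1.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Production:
    number: int            # production number as in Table 1
    lhs: str               # non-terminal on the left-hand side
    rhs: Tuple[str, ...]   # object classes on the right-hand side
    normal_form: int       # 1 = pair (part-of), 2 = set of arbitrary size (cluster)
    constraint: str        # relation the right-hand-side objects must fulfil
    construction: str      # how the left-hand-side object is computed


TABLE_1 = (
    Production(1, "L", ("l",), 2, "colinear overlapping", "regression"),
    Production(2, "A", ("L", "L"), 1, "rectangular adjacent", "intersection"),
    Production(3, "R", ("A", "A"), 1, "crosswise adjacent", "intersection"),
)


def triggered_by(object_class: str) -> List[Production]:
    """Productions in which the given class appears on the right-hand side."""
    return [p for p in TABLE_1 if object_class in p.rhs]


def reduction_chain(start: str) -> List[str]:
    """Follow the productions upwards from a class until nothing is triggered."""
    chain = [start]
    while True:
        productions = triggered_by(chain[-1])
        if not productions:
            return chain
        chain.append(productions[0].lhs)


chain = reduction_chain("l")
print(" -> ".join(chain))
```

In this toy representation a rectangle is three reduction levels above the primitive lines, which is what "reductions of full depth" refers to for this example system.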
The system encodes declarative knowledge for searching
rectangles in pictures. The interpretation system we use
approximates correct parsing. The pseudo-code in Algorithm 1
explains the interpreter. It works by accumulating instead of
reducing and thus avoids backtracking. Hypotheses are
constructed from single (triggering) objects and productions
in which such objects appear on the right-hand side. Such
hypotheses are tested by querying for appropriate partner
objects. Once the set fulfilling the query is found, the
procedure differs for the two normal forms: for productions of
normal form 1, all elements of the set are handled separately,
in accordance with the combinatorial nature of the production
system. Each possible partner leads to a new and separate
reduction possibility. For productions of normal form 2, all
elements of the set are used in one minimization calculation.
They lead to exactly one new instance of the cluster object
type on the left-hand side of the production.
Algorithm 1: Approximate interpretation of production systems.
while queue not empty and no other break condition holds do
    sort(queue);
    set_of_hypotheses ← choose_best_n(queue);
    foreach trigger_hypo = (trigger_elem, p) ∈ set_of_hypotheses do
        if p = nil then
            remove_queue(trigger_hypo);
            foreach production q where trigger_elem ∈ right-hand side of q do
                new_priority ← prio(q) * priority(trigger_hypo);
                append_queue(trigger_elem, q, new_priority);
            end
        else
            actual_query ← construct_query(trigger_hypo);
            candidate_set ← pose_query(actual_query);
            if p of normal form 1 then
                foreach partner ∈ candidate_set do
                    p: new_elem ← (trigger_elem, partner);
                    add_database(new_elem);
                    construct_null_hypo(new_elem);
                end
            else
                p: new_elem ← candidate_set;
                add_database(new_elem);
                construct_null_hypo(new_elem);
            end
        end
    end
end
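The control loop of Algorithm 1 can be sketched in Python on a one-dimensional toy problem. This is a minimal sketch under stated assumptions, not the authors' implementation: the types `Obj` and `Production`, the function `interpret`, the toy productions `cluster` and `pair`, and all numeric values are hypothetical, and the n best hypotheses are simply taken off the queue when they are chosen.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Obj:
    kind: str       # object class, e.g. "l" or "L"
    value: float    # one toy attribute standing in for position, orientation, ...
    quality: float  # assessment in [0, 1]


@dataclass(frozen=True)
class Production:
    lhs: str                                # class of the constructed object
    rhs: str                                # class of the right-hand-side objects
    normal_form: int                        # 1 = pair, 2 = cluster
    prio: float                             # priority factor of the production
    compatible: Callable[[Obj, Obj], bool]  # constraint used by the query
    build: Callable[[List[Obj]], Obj]       # construction of the new object


Hypothesis = Tuple[float, Obj, Optional[Production]]  # (priority, trigger, p)


def interpret(primitives, productions, n_best=2, max_cycles=1000):
    database = list(primitives)
    queue: List[Hypothesis] = [(o.quality, o, None) for o in primitives]
    cycles = 0
    while queue and cycles < max_cycles:
        cycles += 1
        queue.sort(key=lambda h: h[0], reverse=True)
        # Simplification (assumption): chosen hypotheses leave the queue here.
        best, queue = queue[:n_best], queue[n_best:]
        for priority, trigger, p in best:
            if p is None:
                # Null hypothesis: attach every production the trigger fits.
                for q in productions:
                    if q.rhs == trigger.kind:
                        queue.append((q.prio * priority, trigger, q))
                continue
            candidates = [o for o in database
                          if o.kind == p.rhs and o != trigger
                          and p.compatible(trigger, o)]
            if p.normal_form == 1:
                # One new object per partner.
                new_objs = [p.build([trigger, partner]) for partner in candidates]
            elif candidates:
                # One new object for the whole candidate set.
                new_objs = [p.build([trigger] + candidates)]
            else:
                new_objs = []
            for new in new_objs:
                if new not in database:  # test for existence
                    database.append(new)
                    queue.append((new.quality, new, None))
    return database


def mean(parts):
    return round(sum(o.value for o in parts) / len(parts), 3)


# Toy productions: cluster nearby "l" into "L"; pair two "L" about 99 apart into "A".
cluster = Production("L", "l", 2, 1.0,
                     lambda a, b: abs(a.value - b.value) < 10,
                     lambda parts: Obj("L", mean(parts), min(1.0, len(parts) / 3)))
pair = Production("A", "L", 1, 1.0,
                  lambda a, b: abs(abs(a.value - b.value) - 99) <= 5,
                  lambda parts: Obj("A", mean(parts), min(o.quality for o in parts)))

primitives = [Obj("l", v, 0.5) for v in (0, 2, 4, 100, 102)]
result = interpret(primitives, [cluster, pair])
print(sorted((o.kind, o.value) for o in result if o.kind != "l"))
```

In this toy run several triggering primitives lead to the same cluster object, which the existence test then discards. The query and construction effort spent on these duplicates is the kind of repetition that the control strategies aim to postpone.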
3 LOCAL INHIBITION
We frequently observed that the dispatcher control unit of our
systems reduced the same or very similar intermediate
non-primitive objects multiple times from the same objects. Of
course, there is a test for existence before new elements are
appended to the set of reducible objects, so as to avoid
listing them multiple times; however, the computational effort
for the query and construction methods is wasted. A more
detailed look at the phenomenon revealed that, e.g.,
neighbouring primitive line segments often get similar
assessments, too. Thus the corresponding working hypotheses
tend to cluster together in the process queue as well and
cause very similar queries and constructions. One way to
remedy these unpleasant repetitions is to include, in the
modules coding the productions, a remove-queue command for the
hypotheses corresponding to all the right-hand-side objects.
This leads to very efficient systems. However, it contradicts
the combinatorial nature of such production systems: only the
actual triggering hypothesis can (and must) be removed; the
others will cause different queries and thus open different
possibilities. With the remove-queue command these
possibilities are cut away, which alters the declarative
semantics of the system. The remove-queue command may instead
be replaced by a re-assess command that lowers the priority of
these hypotheses by a constant factor γ. This does not alter
the declarative semantics of the production system. It shifts
all the repetitions to the end of the interpretation run. If
the interpretation is halted long before the queue runs empty,
they will not be performed at all. For the experiments we used
an inhibition constant γ = 0.5 (compare Algorithm 2).
VISAPP 2009 - International Conference on Computer Vision Theory and Applications
Algorithm 2: Heuristic inhibition.
remove_queue(hypothesis(triggering_element));
foreach object x ∈ candidate_set (with x ≠ triggering_element) do
    priority(hypo(p, x)) ← γ * priority(hypo(p, x));
end
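A minimal Python sketch of the re-assess step in Algorithm 2 is given below. It assumes that the process queue is held as a dictionary from (production, object) pairs to priorities; this layout, the function name `heuristic_inhibition`, and the example values are hypothetical.

```python
GAMMA = 0.5  # inhibition constant used in the experiments


def heuristic_inhibition(priorities, production, trigger, candidates, gamma=GAMMA):
    """Remove the triggering hypothesis and damp the hypotheses of its partners.

    priorities maps (production, object) to the priority of that hypothesis.
    """
    # Only the triggering hypothesis itself is removed from the queue.
    priorities.pop((production, trigger), None)
    # Partner hypotheses of the same production are kept but re-assessed.
    for x in candidates:
        if x == trigger:
            continue
        key = (production, x)
        if key in priorities:
            priorities[key] *= gamma
    return priorities


queue = {("p1", "a"): 0.8, ("p1", "b"): 0.6, ("p1", "c"): 0.4, ("p2", "b"): 0.9}
heuristic_inhibition(queue, "p1", "a", candidates=["b"])
print(queue)
```

The hypothesis for object "b" under production "p1" stays in the queue with half its priority, so it would still be tested if the run continued long enough, whereas a remove-queue command would have discarded it.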
Algorithm 3: ‘Biological’ local inhibition.
foreach hypo(o, p) ≠ hypo(o_t, p_t) do
    priority(hypo(o, p)) ← ω(o, o_t) * priority(hypo(o, p));
end
The sequence of inspection, or saccade or gaze control, has
been a subject of psychological investigation for a long time.
There is also work on incorporating such behaviours into
computer vision systems. E.g., (Aziz and Mertsching, 2007)
recently described a control mode they call "examine
behaviour": a particular part of the space under observation
becomes uninteresting once the focus of interest has been
there. The whole closer neighbourhood is lowered in its
priority. Instead, a less close neighbourhood gets higher
priority (is excited). In particular, objects that are similar
in other attributes, such as orientation or colour, get more
priority. Thus a sequence of observation is achieved which
follows perceptual Gestalts in almost the same way as human
subjects do. Following such ideas, we have implemented a
priority upgrade function ω which lies between zero and one in
a close neighbourhood and is slightly greater than one in the
further neighbourhood:
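\[ \omega = 1 - (1+\alpha)\, e^{-\delta^2} + \alpha\, e^{-\frac{1}{2}\delta^2} \]
where δ² indicates a specific metric distance from the object o_t of the triggering hypothesis
\[ \delta^2 = \sigma_{\mathrm{loc}}\, |\mathrm{loc} - \mathrm{loc}(o_t)|^2 + \sigma_{\mathrm{ori}}\, |\mathrm{ori} - \mathrm{ori}(o_t)|^2 \]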
with weights σ_loc and σ_ori balancing location in the image
against orientation. The values of these parameters were set
to σ_loc = 0.0004 and σ_ori = 0.08 (in 512x512 images and with
orientation measured in degrees) after systematic optimization
of the performance. Thus local inhibition is quite
far-reaching with respect to image location but quite narrow
with respect to orientation. ω becomes zero for δ² = 0, and ω
approaches one for δ² → ∞. The weight of the exciting versus
the inhibiting effect of the function depends on the parameter
α. The experiments reported below used α = 0.9. Whenever a
particular hypothesis (o_t, p_t) is tested, all other
hypotheses of the same object and production type get a
priority upgrade using this function.
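The behaviour of ω can be checked numerically with a short Python sketch. The parameter values are those stated above. The function names, the dictionary layout of the queue, the plain (non-wrapping) orientation difference, and the restriction to hypotheses of the same production are assumptions made for illustration.

```python
import math

SIGMA_LOC = 0.0004  # weight of the squared image distance (pixels)
SIGMA_ORI = 0.08    # weight of the squared orientation difference (degrees)
ALPHA = 0.9         # balance between excitation and inhibition


def delta_sq(loc, ori, loc_t, ori_t):
    """Weighted squared distance to the object of the triggering hypothesis."""
    dx, dy = loc[0] - loc_t[0], loc[1] - loc_t[1]
    d_ori = ori - ori_t  # assumption: plain difference, no wrap-around
    return SIGMA_LOC * (dx * dx + dy * dy) + SIGMA_ORI * d_ori * d_ori


def omega(d2, alpha=ALPHA):
    """Priority upgrade factor: 0 at d2 = 0, above 1 further out, 1 far away."""
    return 1.0 - (1.0 + alpha) * math.exp(-d2) + alpha * math.exp(-0.5 * d2)


def local_inhibition(priorities, attributes, trigger):
    """Rescale all other hypotheses of the trigger's production.

    priorities: {(object_id, production): priority}
    attributes: {object_id: ((x, y), orientation_in_degrees)}
    """
    o_t, p_t = trigger
    loc_t, ori_t = attributes[o_t]
    for (o, p) in list(priorities):
        if (o, p) == trigger or p != p_t:
            continue
        loc, ori = attributes[o]
        priorities[(o, p)] *= omega(delta_sq(loc, ori, loc_t, ori_t))
    return priorities


attributes = {
    "a": ((100.0, 100.0), 0.0),   # object of the triggering hypothesis
    "b": ((110.0, 100.0), 0.0),   # 10 pixels away, same orientation
    "c": ((100.0, 100.0), 10.0),  # same place, 10 degrees off
}
queue = {("a", "p1"): 1.0, ("b", "p1"): 1.0, ("c", "p1"): 1.0, ("b", "p2"): 1.0}
local_inhibition(queue, attributes, ("a", "p1"))

# Location of the maximum of omega, obtained by setting its derivative to zero.
d2_peak = 2.0 * math.log(2.0 * (1.0 + ALPHA) / ALPHA)
print(queue, omega(d2_peak))
```

With these values the close neighbour "b" is strongly inhibited, while "c", which differs by ten degrees in orientation, is slightly excited. For α = 0.9 the maximum of ω is 1 + α²/(4(1+α)), roughly 1.11, which matches the description "slightly greater than one".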
4 EXPERIMENTS AND RESULTS
Experiments were performed with 200 randomly generated images,
each containing one randomly rotated, sized (25-100), and
positioned square with a randomly set grey value on a black
background, two circular disks drawn with the same
specifications, and ten lines of three pixels width drawn
accordingly. The images were blurred and Gaussian noise was
added. Each image results in two or three thousand primitive
lines constructed with a gradient filter. The interpretation
using the production system given in Section 2 was halted when
the queue ran empty or an instance of the class rectangle was
reduced. In the latter case the experiment was counted as a
success if the object was found in the correct position (with
five pixels tolerance).
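The success criterion described above might be evaluated as in the following Python sketch. It assumes that the position of a rectangle is given by its centre and that the tolerance is a Euclidean distance; neither detail is specified in the text, and the function names and example runs are hypothetical.

```python
import math
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]


def is_success(found: Optional[Point], truth: Point, tolerance: float = 5.0) -> bool:
    """True if a rectangle was reduced and lies within the tolerance (pixels)."""
    if found is None:  # the queue ran empty without reducing a rectangle
        return False
    return math.hypot(found[0] - truth[0], found[1] - truth[1]) <= tolerance


def success_rate(runs: Sequence[Tuple[Optional[Point], Point]]) -> float:
    """Fraction of runs counted as success."""
    return sum(is_success(found, truth) for found, truth in runs) / len(runs)


# Four made-up runs: (centre of the reduced rectangle or None, true centre).
runs = [
    ((100.0, 100.0), (103.0, 104.0)),  # 5 pixels off
    ((50.0, 50.0), (60.0, 50.0)),      # 10 pixels off
    (None, (200.0, 200.0)),            # no rectangle reduced
    ((10.0, 10.0), (10.0, 12.0)),      # 2 pixels off
]
rate = success_rate(runs)
print(rate)
```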
We also made experiments with a real aerial image. In the
real-data experiment the interpretation was stopped when
twelve or more rectangles had been reduced. Fig. 1 shows the
results for the three runs (no inhibition, heuristic
inhibition, and Gaussian inhibition/excitation). Red crosses
indicate the centres of the found rectangle objects.
Figure 1: Objects Rectangle found on a real aerial image: a) no inhibition, b) heuristic inhibition, c) Gaussian inhibition/excitation.
Table 2: Results on synthetic images without, with heuristic, and with biologically inspired inhibition.
Control strategy   Success rate   Computational effort
no inh.            27%            100%
heur. inh.         29.3%          61.4%
bio.               30.8%          69.2%
While both heuristic inhibition and Gaussian
inhibition/excitation need far fewer interpretation cycles
than the run without any inhibition (3200 and 3700 versus 5800
cycles), the spread of the resulting rectangle objects looks
quite different: the Gaussian control spreads its interest
much more widely over the image space. It marks several less
salient rectangles as well.
5 CONCLUSIONS
Both the heuristic and the more sophisticated control scheme
are successful. An important point is whether the extra
administration effort spent on the control calculations
cancels out the gain in search effort. This is of course
implementation-dependent. An important drawback is the extra
parameters introduced by the control, which are
domain-dependent.
REFERENCES
M. Z. Aziz and B. Mertsching: An Attentional Approach
for Perceptual Grouping of Spatially Distributed Pat-
terns. In: F. Hamprecht et al. (eds.): Proc. 29th
DAGM-Symposium 2007, LNCS 4713, Springer, pp.
345-354.
M. Basu et al. (eds): PAMI Special Issue on Structural and
Syntactical Methods, 27:7 (2005).
B. Draper, R. Collins, J. Brolio, A. Hanson, E. Riseman:
The Schema System. IJCV, Vol. 2 (1989) 209-250.
T. Matsuyama and V. S.-S. Hwang: Sigma: a Knowledge-based
Image Understanding System. Plenum Press, New York (1990).
E. Michaelsen, L. Doktorski, and M. Arens: Shortcuts in
Production Systems - A way to include clustering in
structural Pattern Recognition. Proceedings of PRIA-
9-2008, Vol. 2, Lobachevsky State University, Nis-
chnij Nowgorod, pp. 30–38.
E. Michaelsen, W. von Hansen, M. Kirchhof, J. Mei-
dow, and U. Stilla: Estimating the Essential Matrix:
GOODSAC versus RANSAC. ISPRS Symposium on
Photogrammetric Computer Vision PCV 2006, Bonn,
Germany, September 20-22 2006, Int. Archives of
Photogrammetry and Remote Sensing. Vol. XXXVI,
Part 3, PPV 2006, pp. 161–166, 2006.
E. Michaelsen, U. Soergel, and U. Thoennessen: Perceptual
Grouping in Automatic Detection of Man-Made Struc-
ture in high resolution SAR data. Pattern Recognition
Letters 27:4(2006) 218–225.
H. Niemann, G. Sagerer, S. Schröder, F. Kummert:
ERNEST: A Semantic Network System for Pattern Un-
derstanding. PAMI 12:9(1990) 883–905.
F. Qureshi, D. Macrini, D. Chung, J. Maclean, S. Dickinson,
P. Jasiobedzki: A Computer Vision System for Space-
borne Safety Monitoring. 8th Int. Symposium on Arti-
ficial Intelligence, Robotics and Automation in Space
(iSAIRAS), Munich (2005).
U. Stilla and E. Michaelsen: Semantic modelling of man-made
objects by production nets. In: A. Grün et al. (eds.):
Automatic Extraction of Man-Made Objects from Aerial and
Space Images (II). Birkhäuser-Verlag, pp. 43–52.
W. von Hansen, E. Michaelsen, U. Thönnessen: Cluster
Analysis and Priority Sorting in Huge Point Clouds
for Building Reconstruction. In Proc. 18th Int. Conf.
on Pattern Recognition (ICPR’06), Vol. 1, pp. 23-26,
2006.
S.-C. Zhu and D. Mumford: A Stochastic Grammar of Images.
Foundations and Trends in Computer Graphics and Vision 2:4
(2006) 259–362.