Leyton (Leyton, 2001) introduced a generative
theory of shape, and his key insight was that the set
of points in a shape may be generated in many dif-
ferent ways, and that these ways can be characterized
technically by a wreath product group. We propose
that such a sensorimotor representation is more suit-
able for an embodied agent than a purely geometric
or static feature method. The wreath product com-
bines two levels of description: (1) a symbolic one
based on group action sequences (i.e., strings), and
(2) shape synthesis based on group actions on other
groups (i.e., motion descriptions). For example, a line
segment may be generated by moving a point along a
line for a certain distance – represented by the wreath
product: e ≀ Z₂ ≀ ℜ; however, in order to realize this
for a specific line segment, an actuation mechanism
in the coordinate frame of the shape must be defined
and actuation commands provided whose appli-
cation results in the kinematic synthesis of the points
in the line segment. For example, eye motion con-
trol to move the fovea along a shape is such a system.
The human arm and its motor control is another. The
abstract form of the wreath product allows either of
these control systems (eye or arm) to generate a line
segment. Thus, shape is a sensorimotor representation, one which supports knowledge transfer between motor systems whose mappings to one another are known, bound together through the abstraction of the wreath product. For example, if you see a square with your eyes, you build a representation which allows the creation of that shape with your finger, say by tracing it in the sand.
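To make this concrete, the following is a minimal sketch (ours, not the cited authors' implementation) of how an abstract line-segment generator can be bound to different actuators; the function name generate_line_segment and the actuate callbacks are hypothetical illustrations only:

    import numpy as np

    def generate_line_segment(start, direction, length, n_steps, actuate):
        """Kinematically synthesize a line segment: repeatedly apply a
        translation (the continuous factor of the wreath product) to a
        single point (the trivial group e), handing each new point to an
        actuator that realizes the motion in its own coordinate frame."""
        point = np.asarray(start, dtype=float)
        step = (length / n_steps) * np.asarray(direction, dtype=float)
        trace = [point.copy()]
        for _ in range(n_steps):
            point = point + step   # group action: translate the point
            actuate(point)         # e.g., saccade the fovea or move the fingertip
            trace.append(point.copy())
        return np.array(trace)

    # The same abstract generator bound to two different motor systems.
    eye_trace = generate_line_segment((0.0, 0.0), (1.0, 0.0), 5.0, 50,
                                      actuate=lambda p: None)  # stub eye controller
    arm_trace = generate_line_segment((0.0, 0.0), (1.0, 0.0), 5.0, 50,
                                      actuate=lambda p: None)  # stub arm controller

Because the generator is independent of the actuator, the same shape description transfers across motor systems, as discussed above.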
Henderson et al. (Joshi et al., 2014) proposed to
directly incorporate and exploit actuation data in the
analysis of shape. A philosophical and psychological
rôle for actuation in perception has been given by Noë (Noë, 2004):
The sensorimotor dependencies that govern the seeing of a cube certainly differ from those that govern the touching of one, that is, the ways cube appearances change as a function of movement is decidedly different for these two modalities. At an appropriate level of abstraction, however, these sensorimotor dependencies are isomorphic to each other, and it is this fact – rather than any fact about the quality of sensations, or their correlation – that explains how sight and touch can share a common spatial content. When you learn to represent spatial properties in touch, you come to learn the transmodal sensorimotor profiles of those spatial properties. Perceptual experience acquires spatial content thanks to the establishment of links between movement and sensory stimulation. At an appropriate level of abstraction, these are the same across the modalities.

Figure 2: Two CAD drawings; left: a text image included with a CAD drawing to explain how to paint the structure; right: a hand-drawn design of a nuclear waste storage facility.
For the basic description of the original work on
the wreath product sensorimotor approach, see (Hen-
derson et al., 2015). Here we go beyond their results
by developing a coherent approach to the semantic
analysis of large sets of CAD drawing images. Figure 2 shows examples of the types of images we analyze: on the left is a text drawing that accompanies an engineering drawing to explain how to paint the structure; on the right is a hand-drawn plan for one of the double-shell nuclear waste storage tanks at Hanford, WA.
The semantic information in such drawings is needed
to develop electronic CAD for automotive parts and
for non-destructive examinations, respectively. The
overall goal is to find the basic character strokes (defined as Wreath Product Primitives), then to classify characters (using Wreath Product Constraint Sets), and finally to recognize words from those characters (by dictionary lookup). Figure 3 shows the Enhanced Non-Deterministic Analysis System (ENDAS), which performs this analysis; ENDAS uses agents to produce a parse of the image. The levels of ENDAS correspond to pre-processing, terminal symbol hypotheses, and nonterminal symbol hypotheses. Every start symbol represents a complete parse of the image (e.g., a Text Image).
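As an illustrative sketch only, and not the actual ENDAS agent interfaces, the levels of this analysis might be organized as follows in Python; the names Hypothesis, find_strokes, classify_characters, recognize_words, and parse_text_image are hypothetical:

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Hypothesis:
        """A symbol hypothesis produced by an agent at some level."""
        label: str                                   # e.g., 'stroke', 'char:A', 'word:A'
        support: List['Hypothesis'] = field(default_factory=list)
        confidence: float = 0.0

    def find_strokes(image) -> List[Hypothesis]:
        """Terminal level: hypothesize basic character strokes
        (Wreath Product Primitives) in the pre-processed image."""
        return [Hypothesis('stroke', confidence=0.9)]        # placeholder detector

    def classify_characters(strokes) -> List[Hypothesis]:
        """Nonterminal level: group strokes into characters using
        Wreath Product Constraint Sets."""
        return [Hypothesis('char:A', support=strokes, confidence=0.8)]

    def recognize_words(chars, dictionary=frozenset({'A'})) -> List[Hypothesis]:
        """Nonterminal level: assemble characters into words and
        validate them by dictionary lookup."""
        text = ''.join(c.label.split(':', 1)[1] for c in chars)
        score = 0.7 if text in dictionary else 0.1
        return [Hypothesis('word:' + text, support=chars, confidence=score)]

    def parse_text_image(image) -> Hypothesis:
        """Start symbol: a complete parse of the image as a Text Image."""
        strokes = find_strokes(image)         # terminal symbol hypotheses
        chars = classify_characters(strokes)  # nonterminal: characters
        words = recognize_words(chars)        # nonterminal: words
        return Hypothesis('TextImage', support=words, confidence=1.0)

In this sketch, each level keeps the lower-level hypotheses that support it, so a start symbol carries the full derivation from strokes to words.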
2 THE ENDAS SYSTEM
Leyton proposed a generative model of shape (Ley-
ton, 2001) based on the wreath product group. (Also
see (Viana, 2008; Weyl, 1952) for a discussion of the