where: (i) $\Sigma$ is a set of terminal symbols that corresponds to CI
types and identifies all the CIs in an SDM. (ii) $N$ is a set of
nonterminal symbols that identifies composed content items, whose
structure is given in Definition 3 below. (iii) $S \in N$ is the grammar
axiom. (iv) $R_{rec}$ is the set of basic cardinal direction relations
introduced in Section 4.1. (v) $\Pi$ is a finite set of spatial
production rules (SPRs) of the form $A \xrightarrow{dir} \alpha\beta$ and
$A \rightarrow \alpha$, where $A \in N$, $\alpha, \beta \in (\Sigma \cup N)$,
and $dir \in R_{rec}$. The sets $\Sigma$, $N$ and $R_{rec}$ are disjoint
and finite.
In order to explain how spatial grammars work, we first define the
concept of composed content item (CCI) as follows.
Definition 3. A composed content item (CCI) is a 6-tuple of the form:
$CCI = \langle \Gamma, A, r_x^-, r_y^-, r_x^+, r_y^+ \rangle$
where:
• Γ is a non empty set of spatially contiguous CIs.
• A is the identifier of the CCI corresponding to a
nonterminal symbol of the SG.
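• $r_x^- = \min(r_{x\gamma}^- \mid \gamma \in \Gamma \wedge \gamma = (\tau, \sigma, r_{x\gamma}^-, r_{y\gamma}^-, r_{x\gamma}^+, r_{y\gamma}^+))$
• $r_y^- = \min(r_{y\gamma}^- \mid \gamma \in \Gamma \wedge \gamma = (\tau, \sigma, r_{x\gamma}^-, r_{y\gamma}^-, r_{x\gamma}^+, r_{y\gamma}^+))$
• $r_x^+ = \max(r_{x\gamma}^+ \mid \gamma \in \Gamma \wedge \gamma = (\tau, \sigma, r_{x\gamma}^-, r_{y\gamma}^-, r_{x\gamma}^+, r_{y\gamma}^+))$
• $r_y^+ = \max(r_{y\gamma}^+ \mid \gamma \in \Gamma \wedge \gamma = (\tau, \sigma, r_{x\gamma}^-, r_{y\gamma}^-, r_{x\gamma}^+, r_{y\gamma}^+))$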
CIs in $\Gamma$ are spatially contiguous when the area defined by
$r_{x\gamma}^-, r_{y\gamma}^-, r_{x\gamma}^+, r_{y\gamma}^+$ does not
overlap with any CI not in $\Gamma$.
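Definition 3 can be illustrated with a minimal Python sketch. It rests on assumptions that are not in the text: the class names `CI` and `CCI`, the field names, and the helpers `make_cci` and `is_contiguous` are hypothetical; $\tau$ and $\sigma$ are taken to be the CI type and its content; and the contiguity test is read as "the bounding box of the CCI does not overlap any CI outside $\Gamma$", with boxes that merely share an edge not counted as overlapping.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class CI:
    tau: str       # CI type (terminal symbol); assumed meaning of tau
    sigma: str     # CI content; assumed meaning of sigma
    x_min: float   # r_x^-
    y_min: float   # r_y^-
    x_max: float   # r_x^+
    y_max: float   # r_y^+


@dataclass(frozen=True)
class CCI:
    gamma: FrozenSet[CI]  # non-empty set of CIs
    symbol: str           # nonterminal identifier A
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def make_cci(symbol: str, cis: Iterable[CI]) -> CCI:
    """Build a CCI whose coordinates are the min/max over its CIs."""
    gamma = frozenset(cis)
    if not gamma:
        raise ValueError("Gamma must be non-empty")
    return CCI(
        gamma,
        symbol,
        min(c.x_min for c in gamma),
        min(c.y_min for c in gamma),
        max(c.x_max for c in gamma),
        max(c.y_max for c in gamma),
    )


def overlaps(cci: CCI, ci: CI) -> bool:
    """True when the two rectangles share interior area."""
    return (cci.x_min < ci.x_max and ci.x_min < cci.x_max
            and cci.y_min < ci.y_max and ci.y_min < cci.y_max)


def is_contiguous(cci: CCI, all_cis: Iterable[CI]) -> bool:
    """Assumed reading: no CI outside Gamma overlaps the CCI's box."""
    return not any(overlaps(cci, ci) for ci in all_cis if ci not in cci.gamma)
```

For two side-by-side CIs the resulting CCI spans both of them, and it stays contiguous as long as no third CI intrudes into that span.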
An SPR of the form $A \rightarrow \alpha$, where $\alpha \in \Sigma$ and
$A \in N$, creates a CCI corresponding to the CI identified by the
terminal symbol $\alpha$. This operation is both a generalization of the
terminal $\alpha$ into the nonterminal $A$ and a transformation of the CI
related to $\alpha$ into the CCI related to $A$. Rules of the form
$A \xrightarrow{dir} \alpha\beta$, where $\alpha, \beta \in \Sigma$ and
$A \in N$, compose the two contiguous CIs related to the terminal symbols
$\alpha$ and $\beta$ along the direction specified by the relation
$\xrightarrow{dir}$, in order to obtain a new CCI $A$ having the
structure described in Definition 3. SPRs of the form
$A \xrightarrow{dir} BC$, where $A, B, C \in N$, compose the sets
$\Gamma_B$ and $\Gamma_C$ of CIs in the CCIs related to the nonterminals
$B$ and $C$ respectively, in order to obtain a new CCI whose coordinates
are computed as specified in Definition 3. Similar considerations apply
to rules that combine terminal and nonterminal symbols.
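The effect of the two kinds of rules on CCIs can be sketched as follows. This is only an illustration under stated assumptions: the names `CI`, `CCI`, `hull`, `apply_unary` and `apply_binary` are hypothetical, CIs are reduced to a type and a box, and the check that the two operands actually satisfy the relation $dir$ is deliberately left out, since the relations are defined in Section 4.1 and not here.

```python
from typing import FrozenSet, Iterable, NamedTuple, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


class CI(NamedTuple):
    tau: str   # CI type (terminal symbol)
    box: Box


class CCI(NamedTuple):
    gamma: FrozenSet[CI]  # CIs covered by this CCI
    symbol: str           # nonterminal on the left-hand side of the rule
    box: Box


def hull(boxes: Iterable[Box]) -> Box:
    """Smallest box enclosing all given boxes (min/max as in Definition 3)."""
    x0, y0, x1, y1 = zip(*boxes)
    return (min(x0), min(y0), max(x1), max(y1))


def apply_unary(head: str, ci: CI) -> CCI:
    """A -> alpha: wrap a single CI into a CCI labelled with A."""
    return CCI(frozenset([ci]), head, ci.box)


def apply_binary(head: str, left: CCI, right: CCI) -> CCI:
    """A -dir-> B C: merge the CI sets and recompute the coordinates.

    The test that `left` and `right` satisfy `dir` is omitted here.
    """
    gamma = left.gamma | right.gamma
    return CCI(gamma, head, hull(ci.box for ci in gamma))
```

A rule that mixes a terminal and a nonterminal can be handled in the same sketch by first wrapping the terminal's CI with `apply_unary`.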
4.2.1 The Spatial CYK Algorithm
In the following we present the pseudo-code of the SCYK algorithm.
Computational complexity aspects are discussed in Section 4.2.2. Before
presenting the algorithm, we define a CNF-like normal form that allows a
shorter and more intuitive pseudo-code. Extending the algorithm to parse
rules of any form is straightforward.
Definition 4. An SG is in SG-normal form if all its production rules are
either of the form $A \xrightarrow{dir} BC$ or $A \rightarrow \alpha$,
where $A$, $B$ and $C$ are nonterminals and $\alpha$ is a terminal
symbol. Productions of type $A \rightarrow \alpha$ are called unary
spatial production rules.
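A check for the normal form of Definition 4 might look like the sketch below. The `Rule` representation (head, body, optional direction label) and the function name `is_sg_normal_form` are assumptions made for illustration only; the paper does not prescribe a concrete encoding of SPRs.

```python
from typing import Iterable, NamedTuple, Optional, Set, Tuple


class Rule(NamedTuple):
    head: str                        # left-hand side nonterminal
    body: Tuple[str, ...]            # right-hand side symbols
    direction: Optional[str] = None  # dir label; None for unary rules


def is_sg_normal_form(rules: Iterable[Rule],
                      terminals: Set[str],
                      nonterminals: Set[str]) -> bool:
    """True when every rule is A -dir-> B C or A -> alpha."""
    for r in rules:
        if r.head not in nonterminals:
            return False
        unary = (len(r.body) == 1
                 and r.body[0] in terminals
                 and r.direction is None)
        binary = (len(r.body) == 2
                  and all(s in nonterminals for s in r.body)
                  and r.direction is not None)
        if not (unary or binary):
            return False
    return True
```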
The algorithm takes as input an SG $Q$ and a set of CIs $D$. In
instruction 1 the algorithm creates two ordered sets $L_x$ and $L_y$ that
contain the coordinates $r_x^-$ and $r_y^-$, respectively, of all CIs
in $D$.
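As a rough illustration of instruction 1, the two ordered sets can be built by collecting the distinct lower coordinates and sorting them. The function name `coordinate_lists` and the encoding of a CI as a plain `(x_min, y_min, x_max, y_max)` tuple are assumptions of this sketch.

```python
from typing import Iterable, List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def coordinate_lists(cis: Iterable[Box]) -> Tuple[List[float], List[float]]:
    """Return (L_x, L_y): sorted distinct r_x^- and r_y^- values."""
    boxes = list(cis)  # materialize so the input can be scanned twice
    l_x = sorted({b[0] for b in boxes})
    l_y = sorted({b[1] for b in boxes})
    return l_x, l_y


# Three CIs, two of which share the same left edge and two the same bottom edge.
D = [(0, 0, 2, 1), (0, 1, 2, 3), (3, 1, 5, 3)]
L_x, L_y = coordinate_lists(D)
```

Because duplicates collapse, CIs that are aligned on a common edge contribute a single entry, so both lists here are shorter than `D`.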
In the worst case $n = |L_x|$ and $m = |L_y|$ can be as large as the
size of the document $|D|$, but in real cases both are smaller than
$|D|$. In instruction 4 the SPRs in $Q$ are acquired in the set $RS$.
Instruction 6 assigns to $RS_U$ all unary SPRs in $RS$. In instruction 7
non-unary SPRs are split into two subsets $RS_H$ and $RS_V$ that contain
rules of the type $V \xrightarrow{dir} XY$, where $dir$ is an RCR that
expresses relations obtained from the subsets of basic RCRs
$\{E, SE, NE, W, SW, NW, B\}$ for $RS_H$ and
$\{N, NW, NE, S, SE, SW, B\}$ for $RS_V$.
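The partition performed by instructions 6 and 7 can be sketched as below. Several details are assumptions: the `Rule` encoding, the function name `partition_rules`, and in particular the choice to represent $dir$ as a set of basic RCR labels and to place a rule in $RS_H$ (or $RS_V$) when all its labels belong to the corresponding subset. Since the two subsets share the diagonal relations and $B$, a rule may land in both.

```python
from typing import FrozenSet, Iterable, List, NamedTuple, Optional, Tuple

H_BASIC = frozenset({"E", "SE", "NE", "W", "SW", "NW", "B"})
V_BASIC = frozenset({"N", "NW", "NE", "S", "SE", "SW", "B"})


class Rule(NamedTuple):
    head: str
    body: Tuple[str, ...]
    direction: Optional[FrozenSet[str]] = None  # basic RCRs making up dir


def partition_rules(rules: Iterable[Rule]
                    ) -> Tuple[List[Rule], List[Rule], List[Rule]]:
    """Return (RS_U, RS_H, RS_V) for a grammar in SG-normal form."""
    rs = list(rules)
    rs_u = [r for r in rs if len(r.body) == 1]
    binary = [r for r in rs if len(r.body) == 2]
    # A rule qualifies when its (non-empty) direction set is a subset.
    rs_h = [r for r in binary if r.direction and r.direction <= H_BASIC]
    rs_v = [r for r in binary if r.direction and r.direction <= V_BASIC]
    return rs_u, rs_h, rs_v
```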
Instructions 8 and 9 generate tables $T_1$ and $T_2$ respectively. The
indices in the first four positions of $T_1$ and $T_2$, namely $i_2$,
$i_1$, $j_2$, and $j_1$, identify the CCI having as bottom-left vertex
$(L_x[i_2], L_y[j_2])$ and as top-right vertex
$(L_x[i_2 + i_1], L_y[j_2 + j_1])$. The last position in table $T_1$
contains a nonterminal symbol.
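The indexing scheme can be made concrete with a short sketch that maps an index quadruple to the two vertices it denotes. The function names `cell_box` and `cells`, zero-based indexing, and the choice to let the offsets $i_1$ and $j_1$ start at 0 are assumptions of this example; the pseudo-code itself may use different bounds.

```python
from typing import Iterator, List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def cell_box(l_x: List[float], l_y: List[float],
             i2: int, i1: int, j2: int, j1: int) -> Box:
    """Bottom-left (L_x[i2], L_y[j2]) and top-right (L_x[i2+i1], L_y[j2+j1])."""
    return (l_x[i2], l_y[j2], l_x[i2 + i1], l_y[j2 + j1])


def cells(l_x: List[float], l_y: List[float]
          ) -> Iterator[Tuple[int, int, int, int]]:
    """Enumerate all (i2, i1, j2, j1) whose indices stay inside L_x and L_y."""
    n, m = len(l_x), len(l_y)
    for i1 in range(n):
        for i2 in range(n - i1):
            for j1 in range(m):
                for j2 in range(m - j1):
                    yield (i2, i1, j2, j1)
```

Under these assumptions the first index of each pair is a start position and the second is an offset, which is the usual CYK layout (start, span length) applied once per axis.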
The general idea that guides the algorithm is that elements in $T_2$
represent CCIs by the corresponding coordinates, while elements in $T_1$
are boolean values that state whether a given nonterminal symbol $V$ in
the grammar is associated with the corresponding CCI in $T_2$ (note that
a CCI can have several associated nonterminal symbols). Instruction 10
creates a two-dimensional table $I$ whose elements $I[i_1, j_1]$ contain
a set of pairs $\langle i_2, j_2 \rangle$ indicating that the CCI in
$T_2[i_2, i_1, j_2, j_1]$ is not null. Instruction 11 creates the table
$res$ that represents the result of the algorithm (i.e. the set of CCIs
that satisfy the axiom $S$ in the grammar $Q$). Instruction 12
initializes tables $T_1$, $T_2$, and $I$ by using the set $D$ of input
CIs and the set $RS_U$ of unary SPRs. The initialization procedure works
as follows: if an area having as vertices $(L_x[i_2], L_y[j_2])$ and
$(L_x[i_2 + i_1], L_y[j_2 + j_1])$ contains only one CI, the pair
$\langle i_2, j_2 \rangle$ is added to $I[i_1, j_1]$ and
$T_2[i_2, i_1, j_2, j_1]$ is filled with the coordinates of that CI. Let
$\alpha$ be a CI type (i.e. a terminal symbol); then for each unary rule
$V \rightarrow \beta$, where $\beta$ isa $\alpha$ (isa is computed by
using the taxonomy of CI types), element $T_1[i_2, i_1, j_2, j_1, V]$ is
set to true (such an operation corresponds to the generation of a CCI
for each initial CI). The remaining values in tables $T_1$, $T_2$ and
$I$ are computed in the main loop (instructions 13-