To address this problem, this work proposes an
extension of markup languages with a markup for
narrative variations. In a first step, authors have
to mark all transitional words and phrases to allow
machines to separate them from the reusable parts of
documents. In a second step, authors have to mark
narrative dependencies between their document parts
and associate these with the transition texts. Finally,
authors can enrich their documents with alternative
narrative dependencies and transition texts.
With this extension, we can now distinguish semantic
dependencies (which are inferred from semantic relations
as defined by formats like OMDOC) and narrative
dependencies (which are added to support narrative
variations). In the terminology of content planning,
these dependencies are referred to as transitions. They
are traversed in order to arrange document parts. Again,
to be processed, transitions have to be represented in a
machine-processable form (Müller, 2010). With
machine-processable representations of transition texts,
such texts no longer reduce the reusability of document
parts but can be flexibly hidden or displayed. The
narrative context of documents, which is formed by
coherent transitions between document parts, thus becomes
dynamic, while the coherence of the adaptation results is
preserved.
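As a minimal sketch of how marked-up transition texts can be hidden or displayed, assume that a document part is stored as a list of segments, each tagged either as reusable content or as transition text. The `Segment` class, the two `kind` labels, and the `render` function are hypothetical names introduced for illustration; they are not part of OMDOC or of the markup proposed in this work.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    text: str
    kind: str  # "content" or "transition" (hypothetical labels)


def render(segments: List[Segment], show_transitions: bool) -> str:
    """Join the segments, optionally dropping the marked transition texts."""
    kept = [s.text for s in segments if show_transitions or s.kind == "content"]
    return " ".join(kept)


part = [
    Segment("As shown in the previous section,", "transition"),
    Segment("every group has a unique identity element.", "content"),
]

narrative = render(part, show_transitions=True)   # transition text displayed
reusable = render(part, show_transitions=False)   # transition text hidden
```

Because the transition text is only marked, not deleted, the same part can be reused in another document without the phrase that ties it to its original position.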
3.3 Modularising Narrative Documents
Most adaptation approaches focus on topic-oriented
documents. If we want to apply these approaches to
narrative documents, we first need to modularize them into
self-contained, independent units. The new markup for
transition texts allows machines to distinguish narrative
from reusable content and to extract self-contained units.
However, even if we could simply decompose narrative
documents into topics, a coherent assembly into narrative
documents would no longer be possible, as topics omit
transitional words and phrases. Consequently, we need a
new adaptation model.
This work proposes to modularize narrative documents into
information units (called infoms) for which all transition
texts are preserved and narrative and semantic
dependencies are marked. These dependencies are modelled
as dependency graphs and processed during content
planning. To increase the variation on the content layer,
variant infoms and variant relations are identified. To
increase the variation on the structure layer, alternative
transition texts and narrative dependencies are marked.
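The role of the dependency graph in content planning can be illustrated with a small sketch. It assumes that every infom lists the infoms that must precede it, each labelled as a semantic or a narrative dependency, and that an arrangement is obtained by a topological sort. The infom names, the `dependencies` table, and the `plan` function are hypothetical; the sketch uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later) and is not the planning algorithm of this work.

```python
from graphlib import TopologicalSorter

# Hypothetical infoms: each maps to the infoms that must come before it,
# labelled with the kind of dependency.
dependencies = {
    "definition": [],
    "example": [("semantic", "definition")],
    "theorem": [("semantic", "definition"), ("narrative", "example")],
    "proof": [("semantic", "theorem")],
}


def plan(dependencies, kinds):
    """Return one ordering of infoms that respects the selected dependency kinds."""
    graph = {
        infom: {pre for kind, pre in deps if kind in kinds}
        for infom, deps in dependencies.items()
    }
    # static_order() yields every node after all of its predecessors.
    return list(TopologicalSorter(graph).static_order())


order = plan(dependencies, kinds={"semantic", "narrative"})
```

Passing only `{"semantic"}` as `kinds` drops the narrative edge, so the example is no longer forced to precede the theorem. This mirrors, in a very reduced form, the idea that narrative dependencies add ordering constraints on top of the semantic ones.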
This work is novel in that it distinguishes the semantic
context of document parts, the narrative context of the
document, and the user context. Users can prioritize
semantic constraints, narrative constraints, or their
individual constraints to guide substitution and
reordering, the two content planning services addressed by
this work. In the following, both workflows are introduced.
3.4 The Substitution Service
The substitution is implemented as a two-stage process.
In the first stage, the document is abstracted, i.e.,
certain parts are made adaptable. In particular, these
parts are replaced with holes. The term ‘hole’ denotes a
placeholder for a document part without substance, which
has to be substituted with a concrete, user-specific
document part. Thus, holes are the content planning
counterpart of content markup in the rendering algorithm.
In the second stage, the abstract document (d2c) is
converted into a user-specific, concrete document
($doc^{\Delta}$) by substituting each hole with an
appropriate document part according to the user’s
extensional and intensional context specification
($ec^{*}$ and $ic^{*}$).
One contribution of this work is the finding that there is
a strong correlation between the rendering workflow for
notations and the content planning methods. Consequently,
the substitution algorithm is specified as a
generalization of the rendering algorithm: the notation
collector is replaced by the general infom collector, and
the rendering grabber is replaced by a variant sorter.
Instead of iterating the subroutines for each mathematical
expression, the substitution is performed on each hole
(2). The infom collector returns a set of infoms
($var^{*}$) rather than notation definitions. These infoms
and the effective intensional context (ic) are passed to
the variant sorter. The variant sorter ranks the infoms
according to how well their metadata matches the user’s
context parameters (ic) and eventually returns the most
appropriate infom (var), which replaces (or fills) the
hole. The substitution returns a concrete document
($doc^{\Delta}$) in which each hole has been substituted
with user-specific content.
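The interplay of infom collector and variant sorter described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the JOMDOC implementation: the classes `Hole` and `Infom`, the functions `collect_infoms`, `sort_variants`, and `substitute`, and the scoring rule (count the metadata entries that equal the corresponding ic parameter) are all hypothetical. The extensional context is reduced here to a plain list of candidate infoms, which is a simplification.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class Hole:
    name: str  # identifies which kind of document part is missing


@dataclass
class Infom:
    text: str
    fills: str                # name of the hole this infom can fill
    metadata: Dict[str, str]  # e.g. language, difficulty


def collect_infoms(hole: Hole, candidates: List[Infom]) -> List[Infom]:
    """Infom collector: gather all variants (var*) that can fill the hole."""
    return [infom for infom in candidates if infom.fills == hole.name]


def sort_variants(variants: List[Infom], ic: Dict[str, str]) -> List[Infom]:
    """Variant sorter: rank variants by how many ic parameters they match."""
    def score(infom: Infom) -> int:
        return sum(1 for key, value in ic.items() if infom.metadata.get(key) == value)
    return sorted(variants, key=score, reverse=True)


def substitute(abstract_doc: List[Union[str, Hole]],
               candidates: List[Infom],
               ic: Dict[str, str]) -> List[str]:
    """Fill each hole with its best-ranked variant; keep other parts unchanged."""
    concrete = []
    for part in abstract_doc:
        if isinstance(part, Hole):
            ranked = sort_variants(collect_infoms(part, candidates), ic)
            # A hole without any candidate stays visible as a marker.
            concrete.append(ranked[0].text if ranked else f"[unfilled: {part.name}]")
        else:
            concrete.append(part)
    return concrete


candidates = [
    Infom("Zeigen Sie, dass 2 + 2 = 4 gilt.", "exercise",
          {"language": "de", "difficulty": "hard"}),
    Infom("Show that 2 + 2 = 4.", "exercise",
          {"language": "en", "difficulty": "easy"}),
]
abstract_doc = ["Exam, Part 1.", Hole("exercise")]
ic = {"language": "en", "difficulty": "easy"}
concrete_doc = substitute(abstract_doc, candidates, ic)
```

In this sketch the English, easy exercise matches both context parameters and therefore fills the hole. A real variant sorter would likely weight parameters and handle ties more carefully than a plain match count.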
The substitution workflow has been implemented as the
abstract document module of the JOMDOC library. The panta
rhei system has integrated JOMDOC to demonstrate the
generation of exams: teaching assistants can simply point
to collections of exercises, specify the context of the
new exam (e.g., in terms of language and difficulty
level), and have the system generate an exam. An
application to a large exercise corpus is currently being
addressed in the TNTBASE project (Zholudev and Kohlhase,
2010).
ADAPTATION OF MATHEMATICAL DOCUMENTS - Exploring Document Structures, Metadata, and Context for the
Generation of User-specific Documents