et al., 2008), the Google Knowledge Graph (Google,
2012), and DBpedia (Mendes et al., 2011). These on-
tologies are often used in conjunction with controlled
vocabularies such as Dublin Core (Weibel et al.,
1998) or FOAF (Brickley and Miller, 2007), which
facilitate the interoperability of data sources, most notably
via the RDF representation language.
Rather than an ensemble of unrelated facts, nar-
ration implies relationships between atomic facts or
events. (Scherp et al., 2009) define a taxonomy
of such links (compositionality, causality, correlation
and documentation). These links are relevant to our
context, but they consider events at a coarse level with
no controlled vocabulary of predicates. Like
(Van der Meij et al., 2010; Segers et al., 2011), they
are mostly concerned with interoperability between
different ontologies. Likewise, (Van Hage et al., 2012)
emphasize link types, with little consideration of
specific data structures for events. Closely following
the CIDOC CRM (Doerr, 2003), (Segers
et al., 2011) explicitly define roles (also known as
facets in (Mulholland et al., 2012)) applying to events
(e.g. actors, dates and locations), which are appro-
priate for historical events in a broad sense (e.g. the
French Revolution in (Segers et al., 2011)), but not
for events as constituents of a first-person narrative.
Their objective is thus to propose a standard metadata
description space for historical artifacts, rather than
to explore the structure of narration.
The contributions by (Zarri, 2009) are the most
closely related to our work. In their Narrative Knowl-
edge Representation Language (NKRL), they define
data structures and controlled vocabularies of predi-
cates and links to support the analysis of non-fictional
and factual narratives. They avoid the use of the
term story as it has led to ambiguities in the literature.
Instead, they define a set of events and facts as
the fabula. The plot level adds chronological, logical
and coherence links between events. The presentation
level is about the form in which plots are
shown. Some related work in narrative analysis and
storytelling is concerned with mapping arbitrary sto-
ries to a classical narrative structure (Tilley, 1992; Ye-
ung et al., 2014). In our work, stories are potentially
made of anecdotal testimonies, and as such cannot
be expected to match these structures. More abstract
properties, such as sentiment attached to stories, were
also extracted in (Min and Park, 2016) in order to an-
alyze the structure of books.
How arbitrary text can be mapped automatically
to taxonomies of entity types, relationships and predicates
is seldom considered in the literature. Some
authors explicitly assume that this mapping has to be
performed manually (Mulholland et al., 2012), or via
crowdsourcing (Bollacker et al., 2008). Wikipedia
page structure has also been exploited in (Suchanek
et al., 2008). Alternatively, a term-based heuristic is
used in (Gaeta et al., 2014) to determine links between
events, and the use of Natural Language Processing
(NLP) techniques such as Named Entity Recognition
(NER) to automatically extract facts and events has
been evaluated in (Segers et al., 2011; Van Hooland
et al., 2015). Entity types in event models such as
SEM (Van Hage et al., 2012) are closely related to
types extracted by standard NER methods such as
(Favre et al., 2005) (e.g. people, locations, dates).
3 NARRATIVE ENTITY MODEL
To suit the needs of the project described in the intro-
duction, and put people and spatio-temporal coordi-
nates at the center of narratives, we developed a sim-
plified variant of NKRL (Zarri, 2009). In a nutshell,
our model can be thought of as a database schema to
enable storage and facilitate indexation of narrative
data.
The root object type is denoted as entity, in reference
to the terminology of Drupal, which supports our
implementation of the model (described further
in Section 5). Apart from primitive types such as text
and numbers, all types (e.g. story) are
specializations of this root object type. Entity labels
in Figure 1 are meant to be unique. Entity references,
i.e. references to other entities in the database (e.g.
a person referred to in a story), are underlined. Arrows
denote typed dependencies (i.e. pointers depend on
pointees), whereas other dependencies may refer to several
kinds of entities. To emphasize the story-centric
aspect of this model, most types, such as person and
location, directly store references to stories that refer
to them. This can be thought of as a kind of reverse
index.
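The entity hierarchy and the reverse index described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Drupal implementation discussed in Section 5: the class names, the field names (text, people, locations, stories) and the Database helper are hypothetical, and entity references are represented simply by the unique label of the target entity.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Entity:
    """Root object type; labels are assumed to be unique."""
    label: str


@dataclass
class Person(Entity):
    # Reverse index: labels of the stories that refer to this person.
    stories: List[str] = field(default_factory=list)


@dataclass
class Location(Entity):
    # Reverse index: labels of the stories that refer to this location.
    stories: List[str] = field(default_factory=list)


@dataclass
class Story(Entity):
    text: str = ""
    # Entity references, stored as labels of Person / Location entities.
    people: List[str] = field(default_factory=list)
    locations: List[str] = field(default_factory=list)


class Database:
    """Hypothetical in-memory store keyed by entity label."""

    def __init__(self) -> None:
        self.entities: Dict[str, Entity] = {}

    def add(self, entity: Entity) -> Entity:
        if entity.label in self.entities:
            raise ValueError(f"duplicate label: {entity.label}")
        if isinstance(entity, Story):
            # Pointers depend on pointees: every referenced entity must exist.
            targets = [self.entities[ref] for ref in entity.people + entity.locations]
            for target in targets:
                target.stories.append(entity.label)
        self.entities[entity.label] = entity
        return entity
```

With this layout, looking up a person or a location immediately yields the stories that mention it, without scanning all stories.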
The data model in Figure 1 is heavily inspired by
the model underlying NKRL (Zarri, 2009), but ex-
hibits decisive distinctions. The proposed structure
was designed with flexibility in mind. For example,
it easily supports partial specification: a typical narrative
may occasionally omit spatial and/or temporal
specifications. Similarly, the proposed custom date
format supports loose specification. The approximate
flag indicates whether the temporal bounds should be
treated as imprecise, and all fields except
year are optional. Both points and intervals in
time can be described with the same format, simply
by setting the from and to fields to the same value.
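As an illustration, the date format could be represented as follows. This is a minimal sketch under assumptions: the text only states that year is mandatory and that an approximate flag exists, so the optional month and day fields, the class names and the point helper are hypothetical. The from and to bounds are named start and end here because from is a reserved word in Python.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DateBound:
    """One temporal bound; only the year is mandatory."""
    year: int
    month: Optional[int] = None  # assumed optional field
    day: Optional[int] = None    # assumed optional field


@dataclass(frozen=True)
class NarrativeDate:
    """A point or an interval in time, using a single format."""
    start: DateBound            # the "from" bound
    end: DateBound              # the "to" bound
    approximate: bool = False   # True if the bounds are imprecise

    @classmethod
    def point(cls, bound: DateBound, approximate: bool = False) -> "NarrativeDate":
        # A point in time is an interval whose two bounds are equal.
        return cls(bound, bound, approximate)

    def is_point(self) -> bool:
        return self.start == self.end
```

For instance, NarrativeDate.point(DateBound(1950)) describes "in 1950", while NarrativeDate(DateBound(1940, 5), DateBound(1945), approximate=True) describes a loosely specified interval.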
Most entities in the model (e.g. artifacts, people,
places) may have alternative writings. This ambiguity