TOWARDS A DISCOURSE LEVEL COHERENCE STRUCTURE
Parma Nand and Wai Yeap
School of Computer and Mathematical Sciences, Auckland University of Technology, Auckland, New Zealand
Keywords:
Discourse processing, NLP coherence, Discourse relations, Discourse structure.
Abstract:
In this paper we argue that coherence relations between discourse units are ultimately based on mentioned
discourse entities embedded in the units participating in the relation. Coherence relations as discussed in
most literature ((Mann and Thompson, 1988), (Hobbs, 1985), (Grosz and Sidner, 1986) inter alia) are defined
between text segments, where a text segment could range from a single utterance to the whole discourse.
We show that these coherence relations are formed either directly or indirectly between embedded discourse
entities. Other semantic entities might be derived via inference/s based on the mentioned entities and the
complexity of these inferences determines some of the types of relations defined in literature. Hence, the
coherence relations as defined by (Mann and Thompson, 1988), (Hobbs, 1985) inter alia, existing between
text units are essentially an abstraction of these fundamental relations formed between embedded entities. We
argue that any representation of discourse coherence structure should entail representation of information
down to the resolution level of these embedded entities in order for such structures to be useful for automated
language processing tasks. We also show that the commonly accepted tree structure ((Hobbs, 1985),(Marcu,
1996) inter alia) is not sufficient to represent discourse relations to such a resolution level, and propose a
semi-constrained directed graph as the alternative.
1 INTRODUCTION
A natural language discourse consists of a sequence
of utterances which convey a meaning with the help
of relations between the utterances, rather than solely
on the basis of meanings of individual words and
sentences. Hence, in order to fully comprehend the
meaning, it is crucial to be able to identify these re-
lations and then draw inferences based on them for
meaning–making. This does not necessarily mean
that the producer (writer or speaker) or the consumer
(reader or listener) is consciously aware of
these relations while progressing with the discourse.
As the discourse progresses, relations are necessar-
ily formed among a subset of embedded entities in
the discourse which act as discourse markers in the
minds of both the producer and the consumer. The
producer has access to previously mentioned entities
which he can use to overlap with new entities in order
to build on the achievement of the discourse objec-
tive. The overlap of entities can be direct as in the
case of the use of referrals or it could be indirect, em-
bedded in layers of inferences as in the case of inten-
tion based relations described by (Grosz and Sidner,
1986). Even when coherence relations are based on
semantic objects via inferences, the originating point
for the inferences is one or more embedded discourse
entities which we will refer to as grounding entities.
This concept of entity based coherence relations is
partially portrayed by backward looking links in the
centering theory described by (Grosz et al., 1995), albeit
only for consecutive utterances. Coherence relations
in other studies (e.g. (Mann and Thompson,
1988), (Hobbs, 1985), (Marcu, 1996), (Grosz and Sidner,
1986), (Webber et al., 1999)) are based on relations
between some form of text units. Discourse
continuity using overlap of entities has been demon-
strated to be crucial in comprehension, as measured
by reading times and recall, when successive sen-
tences refer to the same entities (Gordon and Scearce,
1995). This affirms the theory that entities play a
significant role in derivation of relations between dis-
course units to aid in comprehension. Studies in lan-
guage production show that producers use reduced
forms of entities such as pronouns and ellipsis to refer
to entities that are locally in focus and use unreduced
forms for entities that are not ((Marslen-Wilson et al., 1982),
(Fletcher, 1984)). This aspect of continuity of reference
and coherence is captured by (Grosz et al.,
1995)’s theoretical framework for local coherence as
centering theory. This theory defines semantic ob-
Nand P. and Yeap W. (2010).
TOWARDS A DISCOURSE LEVEL COHERENCE STRUCTURE.
In Proceedings of the 2nd International Conference on Agents and Artificial Intelligence - Artificial Intelligence, pages 11-19
DOI: 10.5220/0002701500110019
Copyright © SciTePress
jects as centers in discourse units (referred to as utter-
ances in the theory) and these centers are linked be-
tween utterances to define coherence relations in the
neighborhood. It applies an even stronger constraint
of “overlap” of the defined grounding entities to classify
successive utterances as shift, retain or continue of
local focus. (Grosz et al., 1995) use a notion of coherence
relation between entities in successive utterances
where an entity in a previous utterance is “realized”
(by a pronoun) in the next utterance. However, their
centering theory framework does not provide any fur-
ther categorization of these relations or provide any
mechanism for extending it beyond successive utter-
ances. On the other hand, (Mann and Thompson,
1988), (Hobbs, 1985) inter alia describe categories of
relations existing between text units solely based on
inferences acting on the semantics of the text unit as
a whole. We propose that even though these relations
hold between two text units, they have to be based on embedded
entities therein, and these entities should be identified in the
inferencing process used to arrive at the relation.
Since the basis of relations in our theory is the
presence of entities as discourse markers placed at the
mercy of the discourse producer, we cannot assume
a highly constrained hierarchical structure such as
that used by proponents of tree structure (see (Marcu,
1996) for a discussion). The popularity of discourse
representation using a tree structure seems to stem
from the fact that trees are easier to process and formalize.
However, in doing so we are essentially trying
to fit the contents of a container to its shape rather
than designing a container to fit the contents. On the
other hand we can represent a discourse using a purely
unconstrained directed graph as suggested by (Wolf
and Gibson, 2005), in which case we can have an infinite
number of nodes and edges, making it difficult and
complex to process and formalize the graph. We believe
that we need a representation which lies in
the middle: a constrained directed graph. In section 2
we discuss the entity based coherence relations and il-
lustrate a constrained directed graph representation of
some relations using a naturally occurring discourse
in section 4.
2 COHERENCE RELATIONS
AND THEIR REPRESENTATION
For the purpose of this discussion let us define an utterance
to be an independent clause; hence a sentence
can consist of one or more utterances. A discourse
consisting of multiple utterances is a linear arrangement
of these utterances in some order as determined
by the producer in a written discourse, or by both the producer
and the consumer in an interactive discourse.
The resulting order of these utterances is determined by
the relations between the utterances. We will classify
a relation to be local if it exists between utterances
in the same sentence or between utterances in adjacent
sentences. Relations formed between utterances
in sentences further than the previous sentence will
be referred to as global relations. Relations will be
represented with nodes as text units and directed lines
(called edges) with the direction of the arrow pointing
from the source to the destination entity/ies corresponding
to the relation as shown in figure 1. Hence
a coherence relation can also be expressed as a tuple:
rel_type{(Source Entities),(Destination Entities)}
Figure 1: New and old (dotted line) definition of coherence.
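The tuple notation and the local/global distinction can be sketched in code. The following is only a minimal illustration and not part of the original formulation: the class name `CoherenceRelation`, its field names and the use of sentence indices are assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CoherenceRelation:
    """rel_type{(Source Entities),(Destination Entities)} plus position info."""
    rel_type: str                      # e.g. "elaboration", "cause-effect"
    source_entities: Tuple[str, ...]   # grounding entities in the source unit
    dest_entities: Tuple[str, ...]     # grounding entities in the destination unit
    source_sentence: int               # hypothetical 1-based sentence index
    dest_sentence: int                 # hypothetical 1-based sentence index

    def scope(self) -> str:
        """Local: same or adjacent sentences; global: anything further apart."""
        gap = abs(self.dest_sentence - self.source_sentence)
        return "local" if gap <= 1 else "global"


# A name in one sentence picked up by a pronoun in the next sentence.
rel = CoherenceRelation("elaboration", ("John",), ("He",), 1, 2)
print(rel.scope())  # local
```

A relation whose two sentences are more than one apart would be reported as global under the same definition.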
Let us consider some utterances and derive co-
herence relations already established by studies such
as (Mann and Thompson, 1988), (Hobbs, 1985),
(Marcu, 2000) inter alia. In the simplest of cases
the Source Entity could be a name and the Destina-
tion Entity could be a pronoun as shown in example
(1).
1) (a) John did not go to school today.
(b) He went to town with his friends.
In this example, according to (Hobbs, 1985),
(Mann and Thompson, 1988) inter alia, there is an
elaboration relation between text spans (1a) and (1b).
We go further and specify that the elaboration relation
is based on the grounding structures “John” and
“He” embedded in the text units. The elaboration relation
is grounded in “John”, and the second utterance
is an explanation as to why he did not go to school.
Identification and representation of the grounding entities
“John” and “He” is crucial in relating these two
utterances in the context of the whole discourse. For
extract (1), the candidates for grounding entities are
”John”, ”school”, ”He”, ”town” and ”friends” and one
has to use one or more rules to identify the grounding
entities for the relation. The most common basis of
coherence relations is overlapping of entities which
can be used as the first rule for grounding entity iden-
tification, however as we will see in later examples,
ICAART 2010 - 2nd International Conference on Agents and Artificial Intelligence
this is not necessarily the case all the time. In such
cases we can define the rule to be: “grounding entities
are those that if removed from the text units would
make it impossible to draw inferences used to arrive
at the coherence relation”. This is a broad criterion for
identifying the grounding entities for coherence relations
that can be used in all cases and is illustrated
throughout the paper with case examples.
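The removal criterion can be read as a simple test over candidate entities. The sketch below is an illustration under stated assumptions: `grounding_entities` is a hypothetical helper, and the predicate `elaboration_holds` is a toy stand-in for the inference step, which the paper does not specify computationally.

```python
from typing import Callable, FrozenSet, Iterable, List


def grounding_entities(
    entities: Iterable[str],
    relation_holds: Callable[[FrozenSet[str]], bool],
) -> List[str]:
    """Return the entities whose removal makes the relation underivable."""
    all_entities = frozenset(entities)
    return [
        e for e in sorted(all_entities)
        if not relation_holds(all_entities - {e})
    ]


# Toy stand-in for the inference step (an assumption): the elaboration in
# extract (1) is taken to be derivable only while "John" and "He" are present.
def elaboration_holds(remaining: FrozenSet[str]) -> bool:
    return {"John", "He"} <= remaining


candidates = ["John", "school", "He", "town", "friends"]
print(grounding_entities(candidates, elaboration_holds))  # ['He', 'John']
```

In a real system the predicate would be replaced by whatever inference mechanism is used to derive the coherence relation.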
2) (a) There were a lot of nails on the driveway.
(b) John’s car’s tyre got punctured.
Although it is common to have pronouns, reduced
noun phrases and repeated noun phrases as grounding
structures, we can also have semantically different en-
tities forming the grounding entities as shown by ex-
ample (2) which would be defined to have a cause-
effect relation. This cause-effect relation is built on
the fact that nails on the driveway caused the puncture
of the car’s tyre. Removing either nails or car’s tyre
would make the cause-effect relation untenable and
hence these two form the grounding entities for the relation.
Any other combination such as John and nails,
car and driveway, car and nails etc. would not support
the cause-effect relation and hence cannot be considered
as the grounding entities.
(Hobbs, 1985) poses a fundamental question,
”What makes a sequence of sentences in a span of
text coherent?”. Frequently, a textual unit is considered
coherent because it discusses a series of
events in the world. Is it enough for two sentences
describing two events to be coherent if they are re-
lated temporally? (Hobbs, 1985) uses the example in
3 to illustrate that temporal succession on its own is
not enough to establish a coherence relation.
3) (a) At 5.00 a train arrived in Chicago.
(b) At 6.00 Ronald Reagan held a press conference.
Example 3 does not appear coherent in the first
instance on the sheer basis of temporal succession;
however, further assumptions can be made in order
to make the units 3a and 3b coherent. For instance,
Ronald Reagan could be holding a press conference
regarding the maiden voyage of a new bullet train, or
someone arrived on the train who had something to
do with the press conference. In any case, the consumer
has to make assumptions about the grounding
entities in the text units before any coherence relation
can be established. In the case when the grounding
entities are not immediately apparent, the consumer
has to try out the different combinations of entities
like train and Ronald Reagan, Chicago and Ronald
Reagan etc. and make inferences based on context
and world knowledge. The consumer can then choose the
combination considered most plausible,
which may not necessarily be the same as the one intended
by the producer.
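The search over entity combinations described above can be sketched as follows. This is a hedged illustration only: `most_plausible_pair` is a hypothetical helper, and the plausibility scores are invented placeholders for one consumer's context and world knowledge, not values from any study.

```python
from itertools import product
from typing import Callable, Sequence, Tuple


def most_plausible_pair(
    source: Sequence[str],
    dest: Sequence[str],
    plausibility: Callable[[str, str], float],
) -> Tuple[str, str]:
    """Try every source/destination entity pair and keep the best-scoring one."""
    return max(product(source, dest), key=lambda pair: plausibility(*pair))


# Hypothetical scores one particular consumer might assign for example 3.
scores = {
    ("train", "Ronald Reagan"): 0.6,
    ("Chicago", "Ronald Reagan"): 0.3,
    ("train", "press conference"): 0.5,
    ("Chicago", "press conference"): 0.2,
}

best = most_plausible_pair(
    ["train", "Chicago"],
    ["Ronald Reagan", "press conference"],
    lambda s, d: scores[(s, d)],
)
print(best)  # ('train', 'Ronald Reagan')
```

A different consumer, with different scores, may select a different pair, which mirrors the point that the chosen grounding entities need not match the producer's intention.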
Consider the utterances in 4 from (Hobbs, 1985):
4) (a) Did you bring your car today?
(b) My car is at the garage.
(Hobbs, 1985) describes the relation between 4a
and 4b as evaluation where he defines an evaluation
relation to be ”From S0 infer that S1 is a step in a plan
for achieving some goal of the discourse.” Hence, for
the case of 4, we can infer from sentence 4b that
the normal plan for getting somewhere in a car won’t
work, and therefore the first sentence is a step in an
alternate plan for achieving that goal. The evalua-
tion relation formed between the two sentences (text
units) is sufficient when analyzed in isolation as above
but would present difficulties when it is analyzed em-
bedded in a whole discourse among numerous other
evaluation relations. Beyond the sheer identification
of coherence relations, it would not be possible to formulate
any abstractions of these relations at the discourse
level. For instance, there might be more coherence
relations involving one or both instances of
”cars” further in the discourse as part of other coher-
ence relations and these need to be identified as ei-
ther same or different from the ”cars” in the previous
coherence relations. Hence we define that coherence
relations are based on one or more discourse entities
embedded in the text units participating in the rela-
tion, which we will call grounding entities. In the case
of the evaluation relation in extract 4, the grounding
entities would be car1 (your car) and car2 (my car)
which would be the two instances of entities from the
superset cars. Identification of these grounding enti-
ties would enable us to distinguish this evaluation re-
lation from other evaluation relations in the discourse
and formulate constraints and rules on grounding en-
tities for this relation participating in other relations
in the discourse.
Grounding entities need to be embedded entities
in the text units which includes ellipses as in the case
of extract 5 from (Wolf and Gibson, 2005).
5) (a) If the new software works,
(b) everyone should be happy.
In this case there is a condition relation between
5a and 5b. The entity that the condition relation is
based upon is “software” in both text units, but it is
not mentioned explicitly in 5b. Hence grounding entities
can also be elliptic objects.
Grounding entities need not be overlapping ob-
jects as shown in the example 6. There is a cause-
effect relation as defined by (Wolf and Gibson, 2005)
and the grounding entities are ”bad weather” in 6a and
”flight” in 6b.
6) (a) There was bad weather at the airport
(b) and so our flight got delayed.
The linear nature of a discourse constrains a pro-
ducer to progress the discourse as a sequence of ut-
terances. It might not always be possible to produce
a complete discourse as a sequence of utterances with
relations either to the previous utterance or the utter-
ances in the previous sentence. In other words, the
producer will need to shift from the continuation of
the current theme and start off from another arbitrary
point in the discourse. (Grosz et al., 1995) capture
this notion of coherence in a very specific context by
use of the term instantiation of a center. They clas-
sify the links between adjacent utterances in terms of
retain, continue and shift in the local attentional state.
If the subsequent utterances are about the same centers
(we call them entities) at the local level then the links
between the utterances are either retain or continue,
and if a subsequent utterance is about a new center then
there is a shift in the attentional state. We propose
that this may be a shift at the local level but at the
discourse level the utterance may form a link with
an utterance which is an arbitrary number of utterances
back in the discourse. These global links between
non-adjacent sentences need to be identified
(in addition to the local ones) for one to be able to
fully understand a discourse. This study uses a much
broader set of relations based on (Hobbs, 1985),
(Mann and Thompson, 1988), (Webber et al., 2003)
rather than the notions of retain, continue and shift by
(Grosz et al., 1995).
3 ANATOMY OF COHERENCE
The coherence relations used in section 2 to illustrate
significance of grounding entities have already been
defined in various studies such as (Hobbs, 1985) and
(Mann and Thompson, 1988), however there needs to
be further refinement of these relations in order for
them to be used for language processing tasks such
as anaphora resolution. Consider the utterances in 7,
each consisting of two text units, all of which can be defined
to be in a cause-effect relation. Although the coherence
relations in the utterances in 7 are all the same, there are
differences in the main entity/ies forming the basis of
the relation which need to be identified and represented
in order for a discourse wide entity map to be
constructed.
7) (a) John drove to the hospital because he was sick
(b) John drove to the hospital because Peter was
sick
(c) John drove to the hospital because his son was
sick
(d) John drove to the hospital because Peter asked
him to.
(e) John drove to the hospital because everyone
else was.
The cause-effect relations in 7 can be represented
as the following tuples corresponding to each of the
utterances from 7a to 7e :
Type A cause-effect{(e_1),(e_1)}
Type B cause-effect{(e_1),(e_2)}
Type C cause-effect{(e_1),(belong(e_1),(e_2))}
Type D cause-effect{(e_1),(event(e_1),(e_2))}
Type E cause-effect{(e_1),(part-of(e_1),(e_2))}
Type A describes the simplest of the cause-effect relations
where the effect is caused by some action of the
same entity, i.e. the “causer” is the same as the “experiencer”.
In the case of 7a, John being sick (cause)
led John (experiencer) to drive to the hospital. In the second
case (Type B) “Peter”, an unrelated entity, causes
an effect on the experiencer, John. We will consider
two entities to be related if the relation can be derived
from surface level semantics without use of external,
world or contextual knowledge. It can be argued that
ultimately a relation can be defined between all en-
tities in the universe and hence a coherence relation
can be formed between any two text units. The com-
plexity of inferences required to derive the relations
can be used as a measure of the strength of relations
between two entities (hence between text units) and is
left as an open issue for further research. In the case
of 7b, we need not know the explicit relation between
”John” and ”Peter” in order to be able to derive the
cause-effect coherence relation, hence we will assert
that ”John” and ”Peter” are unrelated and hence this
forms Type B relation. In utterances 7c, 7d and 7e the
cause-effect relation is effected indirectly via other
entities related by entity-to-entity (e-e) relations. For
instance, in 7c, the embedded ”son” forms the basis
for the cause-effect relation on the back of the e-e (be-
long) relation between the entities ”John” and ”son”.
Two other e-e relations (event and part-of) are illus-
trated but it is evident that this is not an exhaustive
list as it is envisaged many others will be derived with
more extensive case studies. In addition some entities
are intrinsically related such as ”tea” and ”cup” and
should be included in the list of e-e definitions.
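The Type A to Type E distinction can be sketched as a small classifier over unified entity labels. This is an illustration under assumptions: the function `relation_type` and the table `EE_TO_TYPE` are hypothetical names, the e-e relation is assumed to be supplied by an earlier analysis step, and the table covers only the three e-e relations illustrated in 7.

```python
from typing import Optional

# Entity-to-entity (e-e) relation -> Type label, following the five tuples.
# Only the three e-e relations illustrated in 7 are listed; the list is open.
EE_TO_TYPE = {"belong": "C", "event": "D", "part-of": "E"}


def relation_type(source: str, dest: str, ee_relation: Optional[str] = None) -> str:
    """Classify a relation given unified entity labels for both ends."""
    if ee_relation is not None:
        return EE_TO_TYPE[ee_relation]       # indirect, via an e-e relation
    return "A" if source == dest else "B"    # same entity vs. unrelated entities


print(relation_type("John", "John"))                      # A (7a, "he" resolved to John)
print(relation_type("John", "Peter"))                     # B (7b)
print(relation_type("John", "son", "belong"))             # C (7c)
print(relation_type("John", "Peter", "event"))            # D (7d)
print(relation_type("John", "everyone else", "part-of"))  # E (7e)
```

Further e-e relations, such as an intrinsic relation between “tea” and “cup”, would be added as new entries in the table.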
The next logical step is to ask if the Type defini-
tions for cause-effect relations extend to other types
of coherence relations. To be able to illustrate it, we
need to first define the set of coherence relations. The
studies focussing on coherence and discourse structure
(such as (Hobbs, 1985), (Mann and Thompson,
1988), (Webber et al., 1999), (Marcu, 2000) and
(Wolf and Gibson, 2005)) broadly agree on a core
set of coherence relations. The minor variations are
mostly due to consideration of generality/specificity;
for example, evaluation and background defined by
(Hobbs, 1985) are both considered as elaboration by
(Wolf and Gibson, 2005). In other cases the variation is due to
inclusion or exclusion of some types of relations, such as the
inclusion of attribution by (Wolf and Gibson, 2005).
The core set of relations as identified in most of the
mentioned studies are cause-effect, elaboration, con-
dition, violated expectation, example, generalization
and temporal. Extract 1 discussed earlier is an example
exhibiting an elaboration coherence relation where
the grounding entities are ”John” and ”He”, defined
to be Type A. Similarly, extract 2 has grounding enti-
ties ”nails” and ”tyres” which don’t have any relation
without deeper level inferencing, hence would be a
cause-effect relation of Type B. The evaluation relation
in extract 4 would be a Type E since “My car”
and “your car” are part of the superset “cars”. Our
analysis of coherence relations so far suggests that
most coherence relations can be categorized into the
Type definitions except some special case ones such
as attribution defined by (Wolf and Gibson, 2005)
which is meant to identify producers of direct and re-
ported speech. Section 4 uses a naturally occurring
discourse to illustrate representation of coherence relations,
their Type and the corresponding grounding
entities.
4 INTER-SENTENTIAL
COHERENCE
A natural discourse such as a newspaper article is unstructured
and is quite heavily dependent on the producer’s
writing style, context and the publisher’s beliefs,
among numerous other factors. Therefore, if we attempt
to represent such an unstructured phenomenon
with a highly structured data structure such as a con-
strained tree, we are bound to make approximations
and as a consequence suffer loss of information in the
process. (Wolf and Gibson, 2005) have provided em-
pirical evidence of the unstructured nature of natural
discourses in the form of nodes with multiple parents
and crossed dependencies between nodes. In their research,
(Wolf and Gibson, 2005) deal with identification
and representation of coherence relations either
within sentences or between text units in adjacent
sentences. This did not necessitate the definition
of any constraints as the number of relations between
text units was limited. However, if we extend
to representing relations between arbitrary text units
in a discourse, we need some constraints in order to
prevent combinatorial explosion of relations between
text units. In other words, we are proposing definition
and representation of a discourse structure which
is less constrained than the commonly accepted tree
structure but which is also not completely constraint-free
as suggested by (Wolf and Gibson, 2005). Hence,
let us define text units as nodes and directed arrows as
edges connecting the nodes with a relation. The direction
of the arrows will be used to identify, for instance,
the cause and the effect in a cause-effect relation. Let
us define the direction of the edges, similar to (Wolf
and Gibson, 2005), as follows:
cause-effect: direction from cause to effect.
elaboration: direction from elaborated segment to elaborating segment.
condition: direction from condition to consequence.
violated expectation: direction from expectation to the violated effect.
example: direction from segment being exemplified to the example.
generalization: direction from the specific segment to the generalized segment.
temporal: direction from the event occurring first to the event occurring next.
attribution: direction from producer to the attributed segment.
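The direction conventions listed above can be captured in a lookup table so that edges are always built the same way round. The sketch below is illustrative only; the names `DIRECTION`, `Edge` and `make_edge`, and the short role labels, are assumptions introduced here.

```python
from typing import Dict, NamedTuple, Tuple

# relation type -> (role of the edge source, role of the edge destination)
DIRECTION: Dict[str, Tuple[str, str]] = {
    "cause-effect": ("cause", "effect"),
    "elaboration": ("elaborated segment", "elaborating segment"),
    "condition": ("condition", "consequence"),
    "violated expectation": ("expectation", "violated effect"),
    "example": ("exemplified segment", "example"),
    "generalization": ("specific segment", "generalized segment"),
    "temporal": ("earlier event", "later event"),
    "attribution": ("producer", "attributed segment"),
}


class Edge(NamedTuple):
    rel_type: str
    source: str   # text unit the arrow starts from
    dest: str     # text unit the arrow points to


def make_edge(rel_type: str, roles: Dict[str, str]) -> Edge:
    """Build a directed edge from a mapping of role name -> text unit id."""
    src_role, dst_role = DIRECTION[rel_type]
    return Edge(rel_type, roles[src_role], roles[dst_role])


# Extract 6: bad weather (6a) is the cause, the delayed flight (6b) the effect.
edge = make_edge("cause-effect", {"effect": "6b", "cause": "6a"})
print(edge)  # Edge(rel_type='cause-effect', source='6a', dest='6b')
```

Because the roles are looked up by name, the order in which the two text units are supplied does not affect the direction of the resulting edge.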
Extract 8 consists of the first 5 sentences of an online
article from the New Zealand Herald, segmented
into text units along similar lines as (Wolf and Gibson,
2005). We also used coherence relations from the set
used by (Wolf and Gibson, 2005), which is based on
a superset proposed by (Hobbs, 1985); however, the
actual set of coherence relations is immaterial for the
purpose of this paper.
8) (a) A man has been charged with aggravated
robbery and wounding with intent to cause
grievous bodily harm after
(b) an 85-year-old war veteran was attacked last
week.
(c) Veteran Eric Brady was left with severe bruis-
ing and broken jaw bones after
(d) the attempted carjacking in Papatoetoe.
(e) Detective Sergeant Shaun Vickers of the Coun-
ties Manukau Crime Squad said
(f) an 18-year-old Manukau City resident was ar-
rested yesterday on unrelated charges and
(g) later charged over the assault.
(h) He will appear in Manukau district Court on
Monday morning.
(i) Mr Vickers said
(j) Mr Brady’s family had been informed and
(k) he was grateful for a ”great deal of valuable in-
formation from the public”
The directed graph in figure 2 shows local (r1 to
r5) as well as global (R6 to R11) relations for the text
units in 8. The coherence relations defined are based
on embedded entities in the text units, hence it is possible
to distinguish between the elaboration relation
R7, which is based on the grounding entity “veteran”,
and the elaboration relation R8, which is based on “man”.
Figure 2 also shows the set of coherence relations
with unification of the grounding entities. Unification
was done by identifying and labeling all the unique
entities with a name (usually with the name used for
its first occurrence). This effectively involves resolv-
ing all referrals in the discourse. In figure 2, all in-
stances of ”man”, ”resident”, and pronouns referring
to ”resident” were replaced with ”man”. This enables
one to identify threads of text units which participate
in the same type of coherence relations throughout the
discourse. For the case of figure 2, there is a thread
consisting of units a, f, g and h about the entity ”man”
and another thread consisting of units b, c and j about
”veteran”.
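The unification step and the resulting threads can be sketched as follows. This is a simplified illustration: the alias table is hand-made and stands in for referral resolution, each text unit is reduced to a single grounding mention, and the names `ALIASES`, `MENTIONS` and `threads` are assumptions introduced here.

```python
from collections import defaultdict
from typing import Dict, List

# Hand-made alias table standing in for referral resolution (an assumption).
ALIASES = {
    "man": "man", "resident": "man", "He": "man",
    "veteran": "veteran", "Eric Brady": "veteran", "Mr Brady": "veteran",
}

# Text unit -> mention taken as its grounding entity (simplified to one each).
# Unit g has an elided subject, recorded here as "resident".
MENTIONS = {
    "a": "man", "b": "veteran", "c": "Eric Brady", "f": "resident",
    "g": "resident", "h": "He", "j": "Mr Brady",
}


def threads(mentions: Dict[str, str], aliases: Dict[str, str]) -> Dict[str, List[str]]:
    """Group text units by the unified name of their grounding entity."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for unit in sorted(mentions):
        grouped[aliases[mentions[unit]]].append(unit)
    return dict(grouped)


print(threads(MENTIONS, ALIASES))
# {'man': ['a', 'f', 'g', 'h'], 'veteran': ['b', 'c', 'j']}
```

The two groups correspond to the “man” thread and the “veteran” thread described above.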
The coherence relations in figure 2 also illustrate
the need for some level of constraints. For instance,
the relation R9 is from the immediately previous text
unit (g) instead of the other possible units (f and a).
Hence we define the first rule to be:
Rule 1. A coherence relation should be defined be-
tween nearest text units.
This rule effectively forms a chained link between
text units traversing through the discourse and hence restricts
multiple coherence relations of the same type
based on the same grounding entity/ies in a text unit.
For instance, there cannot be two text units elaborating
on the same grounding entity. In the case where
there are multiple elaboration relations of the same
grounding entity, Rule 1 would force the text units to
be chained rather than form multiple branches from
the same node. Therefore Rule 2 is:
Rule 2. There cannot be more than one coherence
relation of the same type from a text unit based on
the same source grounding entity.
It should be noted that Rule 2 does not restrict
one to have a coherence relation of a different type
based on the same grounding entity. For instance
there could be a cause-effect relation as well as an
elaboration relation based on the same grounding en-
tity/ies, however these relations need to be extending
to different text units. We cannot have two different
coherence relations between two text units which are
based on the same grounding entity/ies. In some cases
more than one type of coherence relation might be apparent,
in which case we will choose the most specific
one. As we will discuss later, the set of
coherence relations can be categorized into a taxonomy
where, for instance, the relation elaboration is a
superset of all other types of relations. Hence Rule 3
can be defined as:
Rule 3. There cannot be more than one coherence
relation between two text units.
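The three rules can be sketched as checks over a list of relations. This is a minimal illustration, not a definitive implementation: the names `Relation`, `nearest_source` and `violations` are hypothetical, text units are assumed to be identified by letters in discourse order, and Rule 3 is read here as applying to an unordered pair of units.

```python
from typing import List, NamedTuple, Optional, Sequence


class Relation(NamedTuple):
    rel_type: str
    source: str   # source text unit id
    dest: str     # destination text unit id
    entity: str   # unified source grounding entity


def nearest_source(candidates: Sequence[str], order: Sequence[str], dest: str) -> Optional[str]:
    """Rule 1: among candidate source units, pick the one closest before dest."""
    pos = {unit: i for i, unit in enumerate(order)}
    earlier = [u for u in candidates if pos[u] < pos[dest]]
    return max(earlier, key=pos.__getitem__, default=None)


def violations(relations: Sequence[Relation]) -> List[str]:
    """Report breaches of Rule 2 and Rule 3 in a list of relations."""
    problems: List[str] = []
    seen_out, seen_pair = set(), set()
    for r in relations:
        out_key = (r.source, r.rel_type, r.entity)
        if out_key in seen_out:
            problems.append(f"Rule 2: {r.source} already has a {r.rel_type} on {r.entity}")
        seen_out.add(out_key)
        pair = frozenset((r.source, r.dest))   # unordered pair (an assumption)
        if pair in seen_pair:
            problems.append(f"Rule 3: more than one relation between {r.source} and {r.dest}")
        seen_pair.add(pair)
    return problems


order = list("abcdefghijk")
print(nearest_source(["a", "f", "g"], order, "h"))  # g

chained = [Relation("elaboration", "a", "f", "man"), Relation("elaboration", "g", "h", "man")]
branched = chained + [Relation("elaboration", "a", "h", "man")]
print(violations(chained))        # []
print(len(violations(branched)))  # 1
```

The branched case fails because unit a would carry two elaboration relations on the same grounding entity, which is the situation Rule 1 and Rule 2 together force into a chain.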
5 DISCUSSION
Although the existence of coherence relations is gen-
erally accepted among linguists and computational
linguists, the number, nature and the taxonomy of
the relations are controversial issues. Coherence re-
lations vary from deep relations buried under mul-
tiple layers of inferences to surface level syntactic
ones. In addition, the relations could be based on aspects
such as the intentions the writer had when he
wrote the text (Grosz and Sidner, 1986), the effect
the writer intends to achieve (Mann and Thompson,
1988) and the cognitive resources the reader uses to
process the discourse ((Sanders, 1992) among others).
Even though the approaches to coherence structure
may vary between the discourse theories, most
of them either explicitly or implicitly propose using a
tree as a good mathematical abstraction for coherence
representation. Although we subscribe to a tree being
an appropriate structure to represent a well organized
and planned discourse such as an essay or a book, it is
inadequate for shorter discourses such as newspaper
articles and dialogues. Semi-planned discourses such
as newspaper articles are usually based around a lim-
ited set of entities and these entities keep reappearing,
usually abruptly, in multiple places in the discourse.
Representing this characteristic with a hierarchical
tree structure is infelicitous. The first 5 sentences used
to illustrate the directed graph representation in figure
2 are also represented as a tree in figure 3 for comparison.
The sub-trees a–b, e–g and i–k can be considered
to be representations of local coherence relations
which contain the same information as the directed
graph structure. The difficulty arises in the representation
of global relations R6 to R11 in figure 2. If the
text unit h is considered to be an elaboration of units
e–g, then it is represented as such in the sub-tree e–h.
This might well be sufficient for some discourse processing
applications but for other applications such as
pronoun resolution we need a finer grained representation.
Figure 2: Directed graph representation of the extract in 8. The sentences are labeled from S1 to S5 broken into text units
corresponding to extract 8.
The directed graph, on the other hand, is able
to represent the elaboration relation with a focus on
units g and h instead of the much bigger source text
unit, e–g. The tree representation is essentially
hierarchical with subsequent sub-trees joining as new
branches to the existing tree, embedding the text units
deeper in the tree. Hence, the relations are formed
between larger and larger conflated text units. For a
large document the level of this embedding may be
so deep that it neutralizes any usefulness of the information.
The popularity of discourse representation as a tree
seems to be due to the familiarity of the data struc-
ture in terms of its generation, traversal and analy-
sis of its efficiency calculation because of its use in
numerous other applications. This is not to say that
the tree structure is absolutely infelicitous to
discourse coherence representation. In fact larger, more
planned discourses are fairly well modeled along hierarchical
lines and a tree would be an adequate representation
if the goal of the application demands coherence
relations only between larger spans of text
units. Shorter discourses such as newspaper articles
are more chaotic in terms of planning and organization,
making a highly organized data structure
such as a tree inadequate. (Wolf and Gibson, 2005) have also discussed
the felicity of directed graphs for coherence
data structure, however, they advocate a completely
constraint-free graph. The absence of any constraints
can explode the number of relations, especially at a
global level making it almost impossible to formal-
ize the processing and traversal of the graph. They
provide empirical data for showing the existence of
crossed dependencies between text units as well as
text units being part of multiple destination coherence
relations (they call this nodes with multiple parents).
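Both phenomena are easy to state over an edge list. The sketch below assumes that text units are indexed by their position in the discourse, that two relations cross when their arcs interleave, and that a "parent" is the source of an incoming relation; the example relations and the function names are hypothetical and are not taken from (Wolf and Gibson, 2005).

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical relations: (source unit, destination unit, label).
relations = [
    (0, 2, "elaboration"),
    (1, 3, "similarity"),
    (2, 3, "elaboration"),
    (3, 4, "cause-effect"),
]

def crosses(a, b):
    """True if arcs a and b interleave (assumed definition of crossing)."""
    lo_a, hi_a = sorted(a[:2])
    lo_b, hi_b = sorted(b[:2])
    return lo_a < lo_b < hi_a < hi_b or lo_b < lo_a < hi_b < hi_a

def crossed_dependencies(rels):
    """Return every pair of relations whose arcs cross."""
    return [(a, b) for a, b in combinations(rels, 2) if crosses(a, b)]

def multi_parent_nodes(rels):
    """Map each unit with more than one incoming relation to its sources."""
    parents = defaultdict(set)
    for src, dst, _ in rels:
        parents[dst].add(src)
    return {node: sorted(ps) for node, ps in parents.items() if len(ps) > 1}

print(crossed_dependencies(relations))
print(multi_parent_nodes(relations))
```

Here the relations 0 to 2 and 1 to 3 cross, and unit 3 has two parents (units 1 and 2); neither configuration can be expressed in a single tree over the same units.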
Wolf and Gibson's results show that while crossed
dependencies are present in small amounts in most types
of relations, the proportion is especially high for
similarity (33.18 %) and elaboration (50.52 %) relations.
Wolf and Gibson argue that a possible explanation is
that these two types of relations are more frequent and
hence participate more often in crossed dependencies.
(Knott, 1996), on the other hand, hypothesizes that this
is due to the elaboration relation being less constrained
than the other types of relations. In fact, both
hypotheses indirectly amount to the same thing. That is,
if a relation is less constrained, then more instances of
it can be defined between a given set of text units,
hence the increase in its frequency as well
as the frequency of participation in crossed dependencies.
Figure 3: Tree representation of the extract in 8.
The existence of crossed dependencies can well
be represented using directed graphs (as illustrated by
(Wolf and Gibson, 2005)), but difficulties arise when
using these for natural language processing tasks, for
example when traversing from a node with multiple
outgoing paths. The three rules defined in section 4
restrict Wolf and Gibson's definition of a constraint-free
graph to a chained structure of similar relations, based
on the same grounding entities, traversing through a
discourse. This enables one, for instance, to trace all
elaborations of an entity through a discourse. These
threads can be used for tasks such as document
summarization and the pruning of candidates in pronoun
resolution.
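Tracing such a thread can be sketched as follows. This is only an illustration under stated assumptions: it does not implement the three rules of section 4, it simply assumes their effect is that a unit has at most one outgoing relation for a given relation type and grounding entity. The `Relation` type, the unit names, the entities and `trace_thread` are all hypothetical.

```python
from typing import NamedTuple

class Relation(NamedTuple):
    source: str       # text unit where the relation starts
    destination: str  # text unit where the relation ends
    label: str        # relation type, e.g. "elaboration"
    entity: str       # grounding entity shared by both units

# Hypothetical relations; units and entities are made up for illustration.
graph = [
    Relation("a", "c", "elaboration", "report"),
    Relation("c", "g", "elaboration", "report"),
    Relation("g", "h", "elaboration", "report"),
    Relation("b", "c", "cause-effect", "storm"),
    Relation("c", "d", "elaboration", "storm"),
]

def trace_thread(relations, start, label, entity):
    """Follow the chain of `label` relations grounded in `entity` from `start`."""
    # Assumption: at most one outgoing edge per (unit, label, entity),
    # so the chain never branches and traversal is unambiguous.
    next_unit = {}
    for r in relations:
        if r.label == label and r.entity == entity:
            if r.source in next_unit:
                raise ValueError(f"unit {r.source!r} branches for {label}/{entity}")
            next_unit[r.source] = r.destination
    thread, seen = [start], {start}
    while thread[-1] in next_unit:
        nxt = next_unit[thread[-1]]
        if nxt in seen:  # guard against cycles in malformed input
            break
        thread.append(nxt)
        seen.add(nxt)
    return thread

print(trace_thread(graph, "a", "elaboration", "report"))
```

Unit c takes part in two elaboration relations, yet each thread remains a simple chain because the grounding entity selects which outgoing edge to follow.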
6 CONCLUSIONS
We have presented the case for representing discourse
coherence structure using a directed graph instead of
the commonly accepted tree structure. However, the
directed graph cannot be completely constraint-free, as
presented by (Wolf and Gibson, 2005), since this can
lead to an explosion of crossed coherence relations,
especially at the global level. Hence, we have presented
a middle approach, with the definition of three rules and
the inclusion of grounding entities, which enables us to
derive and store coherence relations at the level of
resolution required for language processing tasks such
as pronoun resolution and document summarization. We
have also shown that coherence relations can be divided
into various categories depending on the source and
destination grounding entities.
REFERENCES
Akman, V. and Ersan, E. (1994). Focusing for pronoun
resolution in discourse: An implementation. Techni-
cal report, Bilkent University, Bilkent, Ankara 06533,
Turkey.
Beaver, D. I. (2004). The optimization of discourse
anaphora. Linguistics and Philosophy, 27(1):3–56.
Clark, H. H. and Sengul, C. J. (1979). In search of
referents for nouns and pronouns. Memory and Cognition,
7(1):35–41.
Cristea, D., Ide, N., Marcu, D., and Tablan, V. (2000). An
empirical investigation of the relation between dis-
course structure and co-reference. In Proceedings
of the 18th conference on Computational linguistics,
pages 208–214, Morristown, NJ, USA. Association
for Computational Linguistics.
Fletcher, C. R. (1984). Markedness and topic continuity in
discourse processing. Journal of Verbal Learning and
Verbal Behavior, 23(4):487–493.
Gordon, P. C. and Scearce, K. A. (1995). Pronominalization
and discourse coherence, discourse structure and pro-
noun interpretation. Memory and Cognition, 23:313–
323.
Grimes, J. E. (1975). The Thread of Discourse. The Hague:
Mouton.
Grosz, B. J., Joshi, A. K., and Weinstein, S. (1995). Center-
ing: A framework for modelling the local coherence
of discourse. Computational Linguistics, 21(2):202–
225.
Grosz, B. J. and Sidner, C. L. (1986). Attention, intention,
and the structure of discourse. Computational Lin-
guistics, 12(3):175–204.
Hahn, U. and Strube, M. (1997). Centering in-the-large:
Computing referential discourse segments. eprint
arXiv:cmp-lg/9704014.
Hirschberg, J. and Nakatani, C. H. (1996). A prosodic anal-
ysis of discourse segments in direction-giving mono-
logues. In Proceedings of the 34th annual meet-
ing on Association for Computational Linguistics,
pages 286–293, Morristown, NJ, USA. Association
for Computational Linguistics.
Hitzeman, J. and Poesio, M. (1998). Long distance
pronominalisation and global focus. In ACL/COLING.
Association for Computational Linguistics, Montreal.
Hobbs, J. R. (1979). Coherence and coreference. Cognitive
Science, 3:67–90.
Hobbs, J. R. (1985). On the coherence and structure of
discourse. Center for the Study of Language and
Information, Stanford University.
Joshi, A. K. and Weinstein, S. (1981). Control of inference:
Role of some aspects of discourse structure - center-
ing. In Proc. International Joint Conference on Arti-
ficial Intelligence, pages 385–387. Morgan Kaufmann
Publishers Inc.
Kameyama, M. (1998). Intra-sentential centering: A case
study. In Walker, M. A., Joshi, A. K., and Prince,
E. F., editors, Centering Theory in Discourse, chap-
ter 6, pages 89–112. Oxford University Press.
Kehler, A. (1993). The effect of establishing coherence in
ellipsis and anaphora resolution. In Proceedings of the
31st Conference of the Association for Computational
Linguistics (ACL-93), pages 62–69. Association for
Computational Linguistics.
Kehler, A. (2002). Coherence, reference, and the theory of
grammar. CSLI publications Stanford, Calif.
Kertz, L., Kehler, A., and Elman, J. L. (2006). Grammat-
ical and coherence-based factors in pronoun interpre-
tation. In Proceedings of the 28th Annual Conference
of the Cognitive Science Society (this volume), pages
26–29.
Knott, A. (1996). A data-driven methodology for motivating
a set of coherence relations. PhD thesis, The Univer-
sity of Edinburgh: College of Science and Engineer-
ing: The School of Informatics.
Mann, W. C. and Thompson, S. A. (1988). Rhetorical struc-
ture theory: Towards a functional theory of text orga-
nization. Text, 8(3):243–281.
Marcu, D. (1996). Building up rhetorical structure trees.
In Proceedings of AAAI-96, pages 1069–1074. American
Association for Artificial Intelligence.
Marcu, D. (1997). The rhetorical parsing of natural lan-
guage texts. In ACL-35: Proceedings of the 35th An-
nual Meeting of the Association for Computational
Linguistics and Eighth Conference of the European
Chapter of the Association for Computational Lin-
guistics, pages 96–103, Morristown, NJ, USA. Asso-
ciation for Computational Linguistics.
Marcu, D. (2000). The theory and practice of discourse
parsing and summarization. MIT Press.
Marslen-Wilson, W., Levy, E., and Komisarjevsky Tyler, L.
(1982). Producing interpretable discourse: The es-
tablishment and maintenance of reference. Language,
pages 339–378.
Merlo, P. and Stevenson, S. (2001). Automatic verb classi-
fication based on statistical distributions of argument
structure. Computational Linguistics, 27(3):373–408.
Passonneau, R. J. (1998). Interaction of discourse
structure with explicitness of discourse anaphoric noun
phrases, pages 327–356. Oxford University Press.
Power, R., Scott, D., and Bouayad-Agha, N. (2003).
Document structure. Computational Linguistics,
29(2):211–260.
Sanders, T. (1992). Toward a Taxonomy of Coherence Re-
lations. Discourse processes, 15(1):1–35.
Strube, M. and Hahn, U. (1999). Functional centering-
grounding referential coherence in information struc-
ture. Computational Linguistics, 25(3):309–344.
Brennan, S. E., Friedman, M. W., and Pollard, C. J. (1987).
A centering approach to pronouns. In Proceedings of the
25th ACL, pages 155–162.
Webber, B. (1988). Discourse deixis: Reference to
discourse segments. In Proceedings of the 26th annual
meeting on Association for Computational Linguistics,
pages 113–122, Morristown, NJ, USA. Association for
Computational Linguistics.
Webber, B., Knott, A., Stone, M., and Joshi, A. (1999).
Discourse relations: A structural and presuppositional
account using lexicalised TAG. In Proceedings of
the 37th annual meeting of the Association for
Computational Linguistics, pages 41–48, Morristown, NJ,
USA. Association for Computational Linguistics.
Webber, B., Stone, M., Joshi, A., and Knott, A. (2003).
Anaphora and discourse structure. Computational
Linguistics, 29(4):545–587.
Wolf, F. and Gibson, E. (2004). Discourse coherence and
pronoun resolution. Language and Cognitive Pro-
cesses, 19(6):665–675.
Wolf, F. and Gibson, E. (2005). Representing discourse co-
herence: A corpus-based study. Computational Lin-
guistics, 31(2):249–287.