TOWARDS A DISCOURSE LEVEL COHERENCE STRUCTURE

Parma Nand and Wai Yeap

School of Computer and Mathematical Sciences, Auckland University of Technology, Auckland, New Zealand

Keywords:

Discourse processing, NLP coherence, Discourse relations, Discourse structure.

Abstract:

In this paper we argue that coherence relations between discourse units are ultimately based on mentioned

discourse entities embedded in the units participating in the relation. Coherence relations as discussed in

most literature ((Mann and Thompson, 1988), (Hobbs, 1985), (Grosz and Sidner, 1986) inter alia) are deﬁned

between text segments, where a text segment could range from a single utterance to the whole discourse.

We show that these coherence relations are formed either directly or indirectly between embedded discourse

entities. Other semantic entities might be derived via inference/s based on the mentioned entities and the

complexity of these inferences determines some of the types of relations deﬁned in literature. Hence, the

coherence relations as deﬁned by (Mann and Thompson, 1988), (Hobbs, 1985) inter alia, existing between

text units are essentially an abstraction of these fundamental relations formed between embedded entities. We

argue that any representation of discourse coherence structure should entail representation of information

down to the resolution level of these embedded entities in order for such structures to be useful for automated

language processing tasks. We also show that the commonly accepted tree structure ((Hobbs, 1985),(Marcu,

1996) inter alia) is not sufﬁcient to represent discourse relations to such a resolution level, and propose a

semi-constrained directed graph as the alternative.

1 INTRODUCTION

A natural language discourse consists of a sequence

of utterances which convey a meaning with the help

of relations between the utterances, rather than solely

on the basis of meanings of individual words and

sentences. Hence, in order to fully comprehend the

meaning, it is crucial to be able to identify these re-

lations and then draw inferences based on them for

meaning–making. This does not necessarily mean

that the producer (writer or speaker) or the consumer

(reader or the listener) is consciously aware of

these relations while progressing with the discourse.

As the discourse progresses, relations are necessar-

ily formed among a subset of embedded entities in

the discourse which act as discourse markers in the

minds of both the producer and the consumer. The

producer has access to previously mentioned entities

which he can use to overlap with new entities in order

to build on the achievement of the discourse objec-

tive. The overlap of entities can be direct as in the

case of the use of referrals or it could be indirect, em-

bedded in layers of inferences as in the case of inten-

tion based relations described by (Grosz and Sidner,

1986). Even when coherence relations are based on

semantic objects via inferences, the originating point

for the inferences is one or more embedded discourse

entities which we will refer to as grounding entities.

This concept of entity based coherence relations is

partially portrayed by backward looking links in the

centering theory described by (Grosz et al., 1995) al-

beit only for consecutive utterances. Coherence relations

in other studies (e.g. (Mann and Thompson,

1988), (Hobbs, 1985),(Marcu, 1996), (Grosz and Sid-

ner, 1986), (Webber et al., 1999)) are based on re-

lations between some form of text units. Discourse

continuity using overlap of entities has been demon-

strated to be crucial in comprehension, as measured

by reading times and recall, when successive sen-

tences refer to the same entities (Gordon and Scearce,

1995). This afﬁrms the theory that entities play a

signiﬁcant role in derivation of relations between dis-

course units to aid in comprehension. Studies in lan-

guage production show that producers use reduced

forms of entities such as pronouns and ellipsis to refer

to entities that are locally in focus and use unreduced

forms for those that are not ((Marslen-Wilson et al., 1982),

(Fletcher, 1984)). This aspect of continuity of reference

and coherence is captured by (Grosz et al.,

1995)’s theoretical framework for local coherence as

centering theory. This theory deﬁnes semantic objects

Nand P. and Yeap W. (2010).

TOWARDS A DISCOURSE LEVEL COHERENCE STRUCTURE.

In Proceedings of the 2nd International Conference on Agents and Artiﬁcial Intelligence - Artiﬁcial Intelligence, pages 11-19

DOI: 10.5220/0002701500110019

SciTePress

as centers in discourse units (referred to as utter-

ances in the theory) and these centers are linked be-

tween utterances to deﬁne coherence relations in the

neighborhood. It applies an even stronger constraint

of ”overlap” of the deﬁned grounding entities to classify

successive utterances as shift, retain or continue of

local focus. (Grosz et al., 1995) use a notion of coherence

relation between entities in successive utterances

where an entity in a previous utterance is ”realized”

(by a pronoun) in the next utterance. However, their

centering theory framework does not provide any fur-

ther categorization of these relations or provide any

mechanism for extending it beyond successive utter-

ances. On the other hand, (Mann and Thompson,

1988), (Hobbs, 1985) inter alia describe categories of

relations existing between text units solely based on

inferences acting on the semantics of the text unit as

a whole. We propose that even though these relations

hold between two text units, they have to be based on embedded

entities therein, which should be identiﬁed in the

inferencing process used to arrive at the relation.

Since the basis of relations in our theory is the

presence of entities as discourse markers placed at the

mercy of the discourse producer, we cannot assume

a highly constrained hierarchical structure such as

that used by proponents of tree structure (see (Marcu,

1996) for a discussion). The popularity of discourse

representation using a tree structure seems to stem

from the fact that trees are easier to process and for-

malize, however in doing so we are essentially trying

to ﬁt the contents of a container to its shape rather

than designing a container to ﬁt the contents. On the

other hand we can represent a discourse using a purely

unconstrained directed graph as suggested by (Wolf

and Gibson, 2005), in which case we can have an inﬁnite

number of nodes and edges, making it difﬁcult and

complex to process and formalize the graph. We be-

lieve that we need to have a representation which is in

the middle, a constrained directed graph. In section 2

we discuss the entity based coherence relations and il-

lustrate a constrained directed graph representation of

some relations using a naturally occurring discourse

in section 4.

2 COHERENCE RELATIONS AND THEIR REPRESENTATION

For the purpose of this discussion let us deﬁne an ut-

terance to be an independent clause, hence a sentence

can consist of one or more utterances. A discourse

consisting of multiple utterances is a linear arrange-

ment of these utterances in some order as determined

by the producer in a written discourse or both the producer

and the consumer in an interactive discourse.

Figure 1: New and old (dotted line) deﬁnition of coherence.

The formed order of these utterances is determined by

the relations between the utterances. We will classify

a relation to be local if it exists between utterances

in the same sentence or between utterances in adja-

cent sentences. Relations formed between utterances

in sentences further than the previous sentence will

be referred to as global relations. Relations will be

represented with nodes as text units and directed lines

(called edges) with the direction of the arrow point-

ing from the source to the destination entity/ies corre-

sponding to the relation as shown in ﬁgure 1. Hence

a coherence relation can also be expressed as a tuple:

rel_type{(Source Entities),(Destination Entities)}
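
The tuple form above can be sketched as a small data structure. This is an illustrative sketch only: the class name CoherenceRelation, its field names and the is_local helper are assumptions introduced here rather than notation from the paper, and the sentence indices used for the local/global test are assumed to come from a prior segmentation step.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CoherenceRelation:
    """Hypothetical container for rel_type{(source entities),(destination entities)}."""
    rel_type: str
    source_entities: Tuple[str, ...]
    destination_entities: Tuple[str, ...]
    source_sentence: int       # index of the sentence holding the source unit (assumed)
    destination_sentence: int  # index of the sentence holding the destination unit (assumed)

    def is_local(self) -> bool:
        # Local: same sentence or adjacent sentences; anything further apart is global.
        return abs(self.destination_sentence - self.source_sentence) <= 1


# Example (1): "John ..." (sentence 0) is elaborated by "He ..." (sentence 1).
rel = CoherenceRelation("elaboration", ("John",), ("He",), 0, 1)
print(rel.is_local())
```

Keeping the grounding entities inside the relation record, rather than only the text units, is what later allows relations of the same type to be told apart by the entity they are based on.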

Let us consider some utterances and derive co-

herence relations already established by studies such

as (Mann and Thompson, 1988), (Hobbs, 1985),

(Marcu, 2000) inter alia. In the simplest of cases

the Source Entity could be a name and the Destina-

tion Entity could be a pronoun as shown in example

(1).

1) (a) John did not go to school today.

(b) He went to town with his friends.

In this example, according to (Hobbs, 1985),

(Mann and Thompson, 1988) inter alia, there is an

elaboration relation between text spans (1a) and (1b).

We go on further to specify that the elaboration rela-

tion is based on the grounding structures ”John” and

”He” embedded in the text units. The elaboration relation

is based on ”John” and the second utterance

is an explanation as to why he did not go to school.

Identiﬁcation and representation of the grounding en-

tities ”John” and ”he” is crucial in relating these two

utterances in the context of the whole discourse. For

extract (1), the candidates for grounding entities are

”John”, ”school”, ”He”, ”town” and ”friends” and one

has to use one or more rules to identify the grounding

entities for the relation. The most common basis of

coherence relations is overlapping of entities which

can be used as the ﬁrst rule for grounding entity iden-

tiﬁcation, however as we will see in later examples,

ICAART 2010 - 2nd International Conference on Agents and Artificial Intelligence

this is not necessarily the case all the time. In such

cases we can deﬁne the rule to be: ”grounding entities

are those that if removed from the text units would

make it impossible to draw inferences used to arrive

at the coherence relation”. This is a broad criterion for

identifying the grounding entities for coherence rela-

tions that can be used in all cases and is illustrated

throughout the paper with case examples.

2) (a) There were a lot of nails on the driveway.

(b) John’s car’s tyre got punctured.

Although it is common to have pronouns, reduced

noun phrases and repeated noun phrases as grounding

structures, we can also have semantically different en-

tities forming the grounding entities as shown by ex-

ample (2) which would be deﬁned to have a cause-

effect relation. This cause-effect relation is built on

the fact that nails on the driveway caused the puncture

of the car’s tyre. Removing either the nails or the car’s tyre

would make the cause-effect relation untenable and

hence these two form the grounding entities for the relation.

Any other combination such as John and nails,

car and driveway, car and nails etc. would not support

the cause-effect relation and hence cannot be considered

as the grounding entities.

(Hobbs, 1985) poses a fundamental question,

”What makes a sequence of sentences in a span of

text coherent?”. Frequently, a textual unit is considered

coherent because it discusses a series of

events in the world. Is it enough for two sentences

describing two events to be coherent if they are re-

lated temporally? (Hobbs, 1985) uses the example in

3 to illustrate that temporal succession on its own is

not enough to establish a coherence relation.

3) (a) At 5.00 a train arrived in Chicago.

(b) At 6.00 Ronald Reagan held a press conference.

Example 3 does not appear coherent in the ﬁrst

instance on the sheer basis of temporal succession,

however further assumptions can be made in order

to make the units 3a and 3b coherent. For instance,

Ronald Reagan could be holding a press conference

regarding a maiden voyage of a new bullet train, or

someone arrived in the train that had something to

do with the press conference. In any case, the con-

sumer has to make assumptions about the grounding

entities in the text units before any coherence relation

can be established. In the case when the grounding

entities are not immediately apparent, the consumer

has to try out the different combinations of entities

like train and Ronald Reagan, Chicago and Ronald

Reagan etc. and make inferences based on context

and world knowledge. The consumer can then choose the combination

considered most plausible,

which may not necessarily be the same as the one intended

by the producer.
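
The search over candidate entity pairs described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function rank_grounding_candidates, the plausibility callback and the toy scores are all hypothetical stand-ins for the consumer's context and world knowledge, which the paper does not formalize.

```python
from itertools import product
from typing import Callable, List, Tuple


def rank_grounding_candidates(
    entities_a: List[str],
    entities_b: List[str],
    plausibility: Callable[[str, str], float],
) -> List[Tuple[str, str]]:
    """Return all cross-unit entity pairs, most plausible first."""
    pairs = list(product(entities_a, entities_b))
    # sorted() is stable, so equally plausible pairs keep their textual order.
    return sorted(pairs, key=lambda p: plausibility(*p), reverse=True)


# Toy scores standing in for context and world knowledge (purely illustrative).
toy_scores = {("train", "Ronald Reagan"): 0.6, ("Chicago", "Ronald Reagan"): 0.3}


def toy_plausibility(a: str, b: str) -> float:
    return toy_scores.get((a, b), 0.0)


ranked = rank_grounding_candidates(
    ["train", "Chicago"], ["Ronald Reagan", "press conference"], toy_plausibility
)
print(ranked[0])
```

Because the scoring function is supplied by the consumer, two consumers may rank the same pairs differently, which mirrors the point that the chosen combination need not match the one intended by the producer.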

Consider the utterances in 4 from (Hobbs, 1985):

4) (a) Did you bring your car today?

(b) My car is at the garage.

(Hobbs, 1985) describes the relation between 4a

and 4b as evaluation where he deﬁnes an evaluation

relation to be ”From S0 infer that S1 is a step in a plan

for achieving some goal of the discourse.” Hence, for

the case of 4, we can infer from sentence 4b that

the normal plan for getting somewhere in a car won’t

work, and therefore the ﬁrst sentence is a step in an

alternate plan for achieving that goal. The evalua-

tion relation formed between the two sentences (text

units) is sufﬁcient when analyzed in isolation as above

but would present difﬁculties when it is analyzed em-

bedded in a whole discourse among numerous other

evaluation relations. Apart from sheer identiﬁcation

of coherence relations it would not be possible to for-

mulate any abstractions of these relations to the dis-

course level. For instance, there might be more co-

herence relations involving one or both instances of

”cars” further in the discourse as part of other coher-

ence relations and these need to be identiﬁed as ei-

ther same or different from the ”cars” in the previous

coherence relations. Hence we deﬁne that coherence

relations are based on one or more discourse entities

embedded in the text units participating in the rela-

tion, which we will call grounding entities. In the case

of the evaluation relation in extract 4, the grounding

entities would be car1 (your car) and car2 (my car)

which would be the two instances of entities from the

superset cars. Identiﬁcation of these grounding enti-

ties would enable us to distinguish this evaluation re-

lation from other evaluation relations in the discourse

and formulate constraints and rules on grounding en-

tities for this relation participating in other relations

in the discourse.

Grounding entities need to be embedded entities

in the text units, which include ellipses as in the case

of extract 5 from (Wolf and Gibson, 2005).

5) (a) If the new software works,

(b) everyone should be happy.

In this case there is a condition relation between

5a and 5b. The entity that the condition relation is

based upon is ”software” in both text units, but it is

not mentioned explicitly in 5b. Hence grounding en-

tities can also be elliptic objects.

Grounding entities need not be overlapping ob-

jects as shown in the example 6. There is a cause-

effect relation as deﬁned by (Wolf and Gibson, 2005)

and the grounding entities are ”bad weather” in 6a and

”ﬂight” in 6b.

6) (a) There was bad weather at the airport

(b) and so our ﬂight got delayed.

The linear nature of a discourse constrains a pro-

ducer to progress the discourse as a sequence of ut-

terances. It might not always be possible to produce

a complete discourse as a sequence of utterances with

relations either to the previous utterance or the utter-

ances in the previous sentence. In other words, the

producer will need to shift from the continuation of

the current theme and start off from another arbitrary

point in the discourse. (Grosz et al., 1995) capture

this notion of coherence in a very speciﬁc context by

use of the term instantiation of a center. They clas-

sify the links between adjacent utterances in terms of

retain, continue and shift in the local attentional state.

If the subsequent utterances are about the same centers

(we call them entities) at the local level then the links

between the utterances are either retain or continue

and if a subsequent utterance is about a new center then

there is a shift in the attentional state. We propose

that this may be a shift at the local level but at the

discourse level the utterance may form a link with

an utterance which is an arbitrary number of utterances

back in the discourse. These global links between

non-adjacent sentences need to be identiﬁed

(in addition to the local ones) for one to be able to

fully understand a discourse. This study uses a much

broader set of relations based on (Hobbs, 1985),

(Mann and Thompson, 1988), (Webber et al., 2003)

rather than the notions of retain, continue and shift by

(Grosz et al., 1995).

3 ANATOMY OF COHERENCE

The coherence relations used in section 2 to illustrate

signiﬁcance of grounding entities have already been

deﬁned in various studies such as (Hobbs, 1985) and

(Mann and Thompson, 1988), however there needs to

be further reﬁnement of these relations in order for

them to be used for language processing tasks such

as anaphora resolution. Consider the utterances in 7,

each consisting of two text units, all of which can be deﬁned

to be in a cause-effect relation. Although the coherence

relations in the utterances in 7 are all the same, there are

differences in the main entity/ies forming the basis of

the relation, which need to be identiﬁed and represented

in order for a discourse wide entity map to be

constructed.

7) (a) John drove to the hospital because he was sick

(b) John drove to the hospital because Peter was

sick

(d) John drove to the hospital because Peter asked

him to.

(e) John drove to the hospital because everyone

else was.

The cause-effect relations in 7 can be represented

as the following tuples corresponding to each of the

utterances from 7a to 7e :

Type A cause-effect{(e_1),(e_1)}

Type B cause-effect{(e_1),(e_2)}

Type C cause-effect{(e_1),(belong(e_2),(e_1))}

Type D cause-effect{(e_1),(event(e_2),(e_1))}

Type E cause-effect{(e_1),(part-of(e_2),(e_1))}

Type A describes the simplest of the cause-effect rela-

tions where the effect is caused by some action of the

same entity, i.e. the ”causer” is the same as the ”experiencer”.

In the case of 7a, John being sick (cause)

led John (experiencer) to drive to the hospital. In the second

case (Type B) ”Peter”, an unrelated entity, causes

an effect on the experiencer, John. We will consider

two entities to be related if the relation can be derived

from surface level semantics without use of external,

world or contextual knowledge. It can be argued that

ultimately a relation can be deﬁned between all en-

tities in the universe and hence a coherence relation

can be formed between any two text units. The com-

plexity of inferences required to derive the relations

can be used as a measure of the strength of relations

between two entities (hence between text units) and is

left as an open issue for further research. In the case

of 7b, we need not know the explicit relation between

”John” and ”Peter” in order to be able to derive the

cause-effect coherence relation, hence we will assert

that ”John” and ”Peter” are unrelated and hence this

forms Type B relation. In utterances 7c, 7d and 7e the

cause-effect relation is effected indirectly via other

entities related by entity-to-entity (e-e) relations. For

instance, in 7c, the embedded ”son” forms the basis

for the cause-effect relation on the back of the e-e (be-

long) relation between the entities ”John” and ”son”.

Two other e-e relations (event and part-of) are illus-

trated but it is evident that this is not an exhaustive

list as it is envisaged many others will be derived with

more extensive case studies. In addition some entities

are intrinsically related such as ”tea” and ”cup” and

should be included in the list of e-e deﬁnitions.
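
The Type distinctions described above can be sketched as a small classifier over a pair of grounding entities. This is a hedged illustration, not the authors' procedure: the function relation_type, the table EE_RELATION_TO_TYPE and the ee_facts lookup are hypothetical names, the entities are assumed to be already unified, and treating an e-e relation as an unordered pair is a simplifying assumption.

```python
from typing import Dict, FrozenSet, Optional

# Mapping from e-e relation name to Type label, following the five Type tuples above.
EE_RELATION_TO_TYPE = {"belong": "C", "event": "D", "part-of": "E"}


def relation_type(
    source: str,
    destination: str,
    ee_relations: Dict[FrozenSet[str], str],
) -> str:
    """Classify a pair of (already unified) grounding entities as Type A to E."""
    if source == destination:
        return "A"  # causer and experiencer are the same entity
    ee: Optional[str] = ee_relations.get(frozenset((source, destination)))
    if ee is None:
        return "B"  # no surface-level e-e relation: the entities are unrelated
    return EE_RELATION_TO_TYPE[ee]


# Hypothetical e-e facts for entities discussed in example (7).
ee_facts = {
    frozenset(("John", "son")): "belong",
    frozenset(("John", "everyone")): "part-of",
}
print(relation_type("John", "John", ee_facts))   # as in 7a
print(relation_type("Peter", "John", ee_facts))  # as in 7b
print(relation_type("son", "John", ee_facts))    # as in 7c
```

Extending the scheme with further e-e relations, as the text anticipates, would only require new entries in the two lookup tables.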

The next logical step is to ask if the Type deﬁni-

tions for cause-effect relations extend to other types

of coherence relations. To be able to illustrate it, we

need to ﬁrst deﬁne the set of coherence relations. The

studies focussing on coherence and discourse structure

(such as (Hobbs, 1985), (Mann and Thomp-

son, 1988), (Webber et al., 1999), (Marcu, 2000) and

(Wolf and Gibson, 2005)) broadly agree on a core

set of coherence relations. The minor variations are

mostly due to consideration of generality/speciﬁcity;

for example, evaluation and background deﬁned by

(Hobbs, 1985) are both considered as elaboration by

(Wolf and Gibson, 2005). In other cases the variations stem from the inclusion

or exclusion of some types of relations, such as the

inclusion of attribution by (Wolf and Gibson, 2005).

The core set of relations as identiﬁed in most of the

mentioned studies are cause-effect, elaboration, con-

dition, violated expectation, example, generalization

and temporal. Extract 1 discussed earlier is an example

exhibiting an elaboration coherence relation where

the grounding entities are ”John” and ”He”, deﬁned

to be Type A. Similarly, extract 2 has grounding enti-

ties ”nails” and ”tyres” which don’t have any relation

without deeper level inferencing, hence would be a

cause-effect relation of Type B. The evaluation relation

in extract 4 would be a Type E since ”My car”

and ”your car” are part of the superset ”cars”. Our

analysis of coherence relations so far suggests that

most coherence relations can be categorized into the

Type deﬁnitions except some special case ones such

as attribution deﬁned by (Wolf and Gibson, 2005)

which is meant to identify producers of direct and re-

ported speech. Section 4 uses a naturally occurring

discourse to illustrate representation of coherence re-

lations, their TYPE and the corresponding grounding

entities.

4 INTER-SENTENTIAL COHERENCE

A natural discourse such as a newspaper article is un-

structured and is quite heavily dependent on the producer’s

writing style, context, the publisher’s beliefs

among numerous other factors. Therefore, if we at-

tempt to represent such an unstructured phenomenon

with a highly structured data structure such as a con-

strained tree, we are bound to make approximations

and as a consequence suffer loss of information in the

process. (Wolf and Gibson, 2005) have provided em-

pirical evidence of the unstructured nature of natural

discourses in the form of nodes with multiple parents

and crossed dependencies between nodes. In their re-

search, (Wolf and Gibson, 2005) deal with identiﬁ-

cation and representation of coherence relations ei-

ther within sentences or between text units in adja-

cent sentences. This did not necessitate any deﬁni-

tion of any constraints as the number of relations between

text units was limited. However, if we extend

to representing relations between arbitrary text units

in a discourse, we need some constraints in order to

prevent combinatorial explosion of relations between

text units. In other words, we are proposing deﬁni-

tion and representation of a discourse structure which

is less constrained than the commonly accepted tree

structure but which is also not completely constraint-

free as suggested by (Wolf and Gibson, 2005). Hence,

let us deﬁne text units as nodes and directed arrows as

edges connecting the nodes with a relation. The direc-

tion of the arrows will be used to identify, for instance,

the cause and the effect in a cause-effect relation. Let

us deﬁne the direction of the edges, similar to (Wolf

and Gibson, 2005), as the following :

cause-effect: direction from cause to effect.

elaboration: direction from elaborated segment to elaborating segment.

condition: direction from condition to consequence.

violated expectation: direction from expectation to the violated effect.

example: direction from segment being exempliﬁed to the example.

generalization: direction from the speciﬁc segment to the generalized segment.

temporal: direction from the event occurring ﬁrst to the event occurring next.

attribution: direction from producer to the attributed segment.
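
The edge directions listed above can be sketched as a lookup table that orients an edge once the role of each text unit is known. This is a minimal sketch under assumptions: the names EDGE_DIRECTION and make_edge and the short role labels are introduced here for illustration, and deciding which unit plays which role is assumed to happen elsewhere.

```python
from typing import Dict, Tuple

# (role at the tail of the arrow, role at the head), as listed above.
EDGE_DIRECTION: Dict[str, Tuple[str, str]] = {
    "cause-effect": ("cause", "effect"),
    "elaboration": ("elaborated segment", "elaborating segment"),
    "condition": ("condition", "consequence"),
    "violated expectation": ("expectation", "violated effect"),
    "example": ("exemplified segment", "example"),
    "generalization": ("specific segment", "generalized segment"),
    "temporal": ("earlier event", "later event"),
    "attribution": ("producer", "attributed segment"),
}


def make_edge(rel_type: str, roles: Dict[str, str]) -> Tuple[str, str, str]:
    """Return (source_unit, destination_unit, rel_type) for units keyed by role."""
    tail_role, head_role = EDGE_DIRECTION[rel_type]
    return roles[tail_role], roles[head_role], rel_type


# Example (6): unit 6a is the cause and unit 6b is the effect.
edge = make_edge("cause-effect", {"effect": "6b", "cause": "6a"})
print(edge)
```

Fixing the direction per relation type in one table means the order in which the two units appear in the text does not affect how the edge is drawn.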

Extract 8 consists of the ﬁrst 5 sentences of an online

article from the New Zealand Herald, which is segmented

into text units along similar lines as (Wolf and Gibson,

2005). We also used coherence relations from the set

used by (Wolf and Gibson, 2005) which is based on

a superset proposed by (Hobbs, 1985), however the

actual set of coherence relations is immaterial for the

purpose of this paper.

8) (a) A man has been charged with aggravated

robbery and wounding with intent to cause

grievous bodily harm after

(b) an 85-year-old war veteran was attacked last

week.

ing and broken jaw bones after

(d) the attempted carjacking in Papatoetoe.

(e) Detective Sergeant Shaun Vickers of the Coun-

ties Manukau Crime Squad said

(f) an 18-year-old Manukau City resident was ar-

rested yesterday on unrelated charges and

(g) later charged over the assault.

(h) He will appear in Manukau district Court on

Monday morning.

(i) Mr Vickers said

(j) Mr Brady’s family had been informed and

(k) he was grateful for a ”great deal of valuable in-

formation from the public”

The directed graph in ﬁgure 2 shows local (r1 to

r5) as well as global (R6 to R11) relations for the text

units in 8. The coherence relations deﬁned are based

on embedded entities in the text units, hence it is possible

to distinguish between the elaboration relation

R7, which is based on the grounding entity ”veteran”,

and the elaboration relation R8, which is based on ”man”.

Figure 2 also shows the set of coherence relations

with uniﬁcation of the grounding entities. Uniﬁcation

was done by identifying and labeling all the unique

entities with a name (usually with the name used for

its ﬁrst occurrence). This effectively involves resolv-

ing all referrals in the discourse. In ﬁgure 2, all in-

stances of ”man”, ”resident”, and pronouns referring

to ”resident” were replaced with ”man”. This enables

one to identify threads of text units which participate

in the same type of coherence relations throughout the

discourse. For the case of ﬁgure 2, there is a thread

consisting of units a, f, g and h about the entity ”man”

and another thread consisting of units b, c and j about

”veteran”.
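
The uniﬁcation step and the resulting threads can be sketched as below. This is an illustrative sketch only: the resolved mapping is hand-written here and assumed to be the output of a separate reference-resolution step, the placeholder "<ellipsis>" for the elided subject of unit g is an assumption, and only some of the units of extract 8 are covered.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (text unit, surface mention) -> canonical entity name. The mapping itself is
# assumed to come from a separate reference-resolution step.
resolved: Dict[Tuple[str, str], str] = {
    ("a", "man"): "man",
    ("b", "veteran"): "veteran",
    ("f", "resident"): "man",
    ("g", "<ellipsis>"): "man",
    ("h", "He"): "man",
    ("j", "Mr Brady"): "veteran",
}


def build_threads(mentions: Dict[Tuple[str, str], str]) -> Dict[str, List[str]]:
    """Group text units by the unified entity they mention, in discourse order."""
    threads: Dict[str, List[str]] = defaultdict(list)
    # Unit labels are single letters here, so sorting gives discourse order.
    for (unit, _surface), entity in sorted(mentions.items()):
        if unit not in threads[entity]:
            threads[entity].append(unit)
    return dict(threads)


threads = build_threads(resolved)
print(threads["man"])
print(threads["veteran"])
```

Each thread is simply the ordered list of units that share a uniﬁed grounding entity, which is the information a downstream task such as pronoun resolution would consult.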

The coherence relations in ﬁgure 2 also illustrate

the need for some level of constraints. For instance,

the relation R9 is from the immediately previous text

unit (g) instead of the other possible units (f and a).

Hence we deﬁne the ﬁrst rule to be:

Rule 1. A coherence relation should be deﬁned be-

tween nearest text units.

This rule effectively forms a chained link between

text units traversing through the discourse hence re-

stricts multiple coherence relations of the same type

based on the same grounding entity/ies in a text unit.

For instance, there can not be two text units elaborat-

ing on the same grounding entity. In the case where

there are multiple elaboration relations of the same

grounding entity, Rule 1 would force the text units to

be chained rather than form multiple branches from

the same node. Therefore rule 2 is:

Rule 2. There can not be more than one coherence

relation of the same type from a text unit based on

the same source grounding entity.

It should be noted that Rule 2 does not restrict

one to have a coherence relation of a different type

based on the same grounding entity. For instance

there could be a cause-effect relation as well as an

elaboration relation based on the same grounding en-

tity/ies, however these relations need to be extending

to different text units. We cannot have two different

coherence relations between two text units which are

based on the same grounding entity/ies. In some cases

more than one type of coherence relation might be apparent,

in which case we will choose the most speciﬁc

one. As we will discuss later, the set of

coherence relations can be categorized into a taxon-

omy where for instance, the relation elaboration is a

superset of all other types of relations. Hence Rule 3

can be deﬁned as :

Rule 3. There can not be more than one coherence

relation between two text units.
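
The three rules can be sketched as checks applied while a graph is being built. This is a minimal sketch under stated assumptions and not the authors' implementation: the class ConstrainedGraph and its method names are hypothetical, text units are identified by their position in the discourse, and Rule 1 is approximated by choosing the closest preceding candidate unit.

```python
from typing import FrozenSet, List, Set, Tuple


class ConstrainedGraph:
    """Directed graph over text-unit indices that rejects edges violating Rules 2 and 3."""

    def __init__(self) -> None:
        self.edges: List[Tuple[int, int, str, str]] = []
        self._linked: Set[FrozenSet[int]] = set()          # Rule 3 bookkeeping
        self._outgoing: Set[Tuple[int, str, str]] = set()  # Rule 2 bookkeeping

    @staticmethod
    def nearest_source(candidates: List[int], destination: int) -> int:
        # Rule 1: among candidate earlier units, pick the one closest to the destination.
        return max(c for c in candidates if c < destination)

    def add_relation(self, source: int, destination: int, rel_type: str, entity: str) -> bool:
        pair = frozenset((source, destination))
        if pair in self._linked:  # Rule 3: at most one relation per pair of units
            return False
        if (source, rel_type, entity) in self._outgoing:  # Rule 2
            return False
        self._linked.add(pair)
        self._outgoing.add((source, rel_type, entity))
        self.edges.append((source, destination, rel_type, entity))
        return True


units = "abcdefghijk"
graph = ConstrainedGraph()
src = graph.nearest_source([units.index(u) for u in "afg"], units.index("h"))
print(units[src])
print(graph.add_relation(src, units.index("h"), "elaboration", "man"))
print(graph.add_relation(src, units.index("h"), "temporal", "man"))
```

Rejecting an edge rather than raising an error is a design choice made for this sketch; an annotator or parser could then fall back to the next candidate unit or to a more speciﬁc relation type.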

5 DISCUSSION

Although the existence of coherence relations is gen-

erally accepted among linguists and computational

linguists, the number, nature and the taxonomy of

the relations are controversial issues. Coherence re-

lations vary from deep relations buried under mul-

tiple layers of inferences to surface level syntactic

ones. In addition the relations could be grounded in different

perspectives, such as the intentions the writer had when he

wrote the text ((Grosz and Sidner, 1986)), the effect

the writer intends to achieve ((Mann and Thompson,

1988)) and the cognitive resources the reader uses to

process the discourse ((Sanders, 1992)), among others.

Even though the approaches to coherence struc-

ture may vary between the discourse theories, most

of them either explicitly or implicitly propose using a

tree as a good mathematical abstraction for coherence

representation. Although we subscribe to a tree being

an appropriate structure to represent a well organized

and planned discourse such as an essay or a book, it is

inadequate for shorter discourses such as newspaper

articles and dialogues. Semi-planned discourses such

as newspaper articles are usually based around a lim-

ited set of entities and these entities keep reappearing,

usually abruptly, in multiple places in the discourse.

Representing this characteristic with a hierarchical

tree structure is infelicitous. The ﬁrst 5 sentences used

to illustrate the directed graph representation in ﬁgure

2 are also represented as a tree in ﬁgure 3 for comparison.

The sub-trees a–b, e–g and i–k can be considered

to be representations of local coherence relations

which contain the same information as the directed

graph structure. The difﬁculty arises in the represen-

tation of global relations R6 to R11 in ﬁgure 2. If the

text unit h is considered to be an elaboration of units

e–g, then it is represented as such in the sub-tree e–h.

This might well be sufﬁcient for some discourse pro-

cessing applications but for other applications such as

pronoun resolution we need a ﬁner grained representation.

Figure 2: Directed graph representation of the extract in 8. The sentences are labeled from S1 to S5 and broken into text units corresponding to extract 8.

The directed graph on the other hand is able

to represent the elaboration relation with a focus speciﬁcally

on units g and h instead of the much bigger source text

unit, e–g. The tree representation is essentially

hierarchical with subsequent sub-trees joining as new

branches to the existing tree embedding the text units

deeper in the tree. Hence, the relations are formed

between larger and larger conﬂated text units. For a

large document the level of this embedding may be

so deep that it neutralizes any usefulness of the information.

The popularity of discourse representation as a tree

seems to be due to the familiarity of the data structure

in terms of its generation, traversal and efﬁciency

analysis, owing to its use in

numerous other applications. This is not to say that

the tree structure is absolutely infelicitous to the discourse

coherence representation. In fact larger, more

planned discourses are fairly well modeled along hier-

archical lines and a tree would be an adequate repre-

sentation if the goal of the application demands co-

herence relations only between larger spans of text

units. Shorter discourses such as newspaper articles

are more chaotic in terms of planning and organization,

making a highly organized data structure

such as a tree inadequate. (Wolf and Gibson, 2005) have also discussed

the felicity of directed graphs for coherence

data structure, however, they advocate a completely

constraint-free graph. The absence of any constraints

can explode the number of relations, especially at a

global level making it almost impossible to formal-

ize the processing and traversal of the graph. They

provide empirical data for showing the existence of

crossed dependencies between text units as well as

text units being part of multiple destination coherence

relations (they call this nodes with multiple parents).
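
The two properties mentioned here can be made concrete with a small sketch. It assumes that text units are numbered in text order, that two relations cross when their endpoints interleave, and that a "parent" is the source of an incoming edge; the edge list and function names are hypothetical and are not data from (Wolf and Gibson, 2005).

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical coherence edges: (source_unit, destination_unit, relation).
# Units are identified by their position in the text.
EDGES = [
    (0, 2, "elaboration"),
    (1, 3, "similarity"),
    (2, 3, "elaboration"),
    (4, 5, "cause-effect"),
]


def crosses(e1, e2):
    """True if the two arcs interleave when units are laid out in text order."""
    a1, b1 = sorted(e1[:2])
    a2, b2 = sorted(e2[:2])
    return a1 < a2 < b1 < b2 or a2 < a1 < b2 < b1


def crossed_pairs(edges):
    """Return every pair of edges that forms a crossed dependency."""
    return [(e1, e2) for e1, e2 in combinations(edges, 2) if crosses(e1, e2)]


def multi_parent_nodes(edges):
    """Return units that are the destination of edges from more than one source."""
    incoming = defaultdict(set)
    for src, dst, _ in edges:
        incoming[dst].add(src)
    return sorted(n for n, parents in incoming.items() if len(parents) > 1)


print(crossed_pairs(EDGES))
print(multi_parent_nodes(EDGES))
```

Neither property can be expressed in a tree, where arcs never interleave and every node has exactly one parent.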

Their results show that while crossed dependencies are

present in small amounts in most types of relations,

they are especially prevalent for the similarity (33.18%) and

elaboration (50.52%) relations. Wolf and Gibson argue that

a possible explanation for this could be that the two

types of relations are more frequent and hence have a high

participation in crossed dependencies. (Knott,

1996), on the other hand, hypothesizes that this is due

to the elaboration relation being less constrained than

the other types of relations. In fact, both Wolf's and

Knott's hypotheses indirectly amount to the same

thing. That is, if a relation is less constrained then

we can define more instances of it between a given set of

text units, hence the increase in its frequency as well

as the frequency of participation in crossed dependencies.

Figure 3: Tree representation of the extract in 8.

The existence of crossed dependencies can well

be represented using directed graphs (as illustrated by

(Wolf and Gibson, 2005)) but difﬁculties arise when

using these for natural language processing tasks such

as traversal from a node with multiple outgoing paths.

The three rules defined in section 4 restrict Wolf's

definition of a constraint-free graph to a chained

structure of similar relations, based on the same grounding

entities, traversing through a discourse. This enables

one, for instance, to trace all elaborations of an entity

through a discourse. These threads can be used for

tasks such as document summarization and pruning

of candidates in pronoun resolution tasks.
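
Tracing such a thread can be sketched as follows. This is a minimal illustration only: it does not implement the three rules of section 4, and it assumes each unit has at most one outgoing relation of a given type for a given grounding entity. The `Relation` record, the example relations and the `thread` function are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    source: str   # label of the source text unit
    dest: str     # label of the destination text unit
    kind: str     # relation type, e.g. "elaboration"
    entity: str   # grounding entity on which the relation is based


# Invented example relations; not taken from the paper's extract.
RELATIONS = [
    Relation("a", "b", "elaboration", "the company"),
    Relation("b", "d", "elaboration", "the company"),
    Relation("b", "c", "cause-effect", "the strike"),
    Relation("d", "g", "elaboration", "the company"),
    Relation("e", "f", "elaboration", "the union"),
]


def thread(relations, entity, kind):
    """Follow the chain of `kind` relations grounded in `entity`.

    Assumes a single chain per (entity, kind); returns the text units
    in chain order, or an empty list if no such chain exists.
    """
    links = {r.source: r.dest for r in relations
             if r.entity == entity and r.kind == kind}
    dests = set(links.values())
    starts = [unit for unit in links if unit not in dests]
    if not starts:
        return []
    chain = [starts[0]]
    # Stop at the end of the chain, or if a unit would repeat.
    while chain[-1] in links and links[chain[-1]] not in chain:
        chain.append(links[chain[-1]])
    return chain


print(thread(RELATIONS, "the company", "elaboration"))
```

For pronoun resolution, the units on such a thread could be used to restrict the candidate antecedents to entities that are still being elaborated, although how the threads are actually consumed is left open here.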

6 CONCLUSIONS

We have presented the case for representation of a

discourse coherence structure using a directed graph

instead of the commonly accepted tree structure.

However, the directed graph cannot be completely

constraint-free as presented by (Wolf and Gibson,

2005), since this can lead to an explosion of crossed

coherence relations, especially at the global level.

Hence, we have presented a middle approach,

with the definition of three rules and the inclusion of grounding

entities, which enables us to derive and store

coherence relations with a sufficient level of resolution

for language processing tasks such as pronoun

resolution and document summarization. We have

also shown that coherence relations can be

divided into various categories depending on the source

and destination grounding entities.

REFERENCES

Akman, V. and Ersan, E. (1994). Focusing for pronoun

resolution in discourse: An implementation. Techni-

cal report, Bilkent University, Bilkent, Ankara 06533,

Turkey.

Beaver, D. I. (2004). The optimization of discourse

anaphora. Linguistics and Philosophy, 27(1):3–56.

Clark, H. H. and Sengul, C. J. (1979). In search of refer-

ents for nouns and pronouns. Memory and Cognition,

7(1):35–41.

Cristea, D., Ide, N., Marcu, D., and Tablan, V. (2000). An

empirical investigation of the relation between dis-

course structure and co-reference. In Proceedings

of the 18th conference on Computational linguistics,

pages 208–214, Morristown, NJ, USA. Association

for Computational Linguistics.

Fletcher, C. R. (1984). Markedness and topic continuity in

discourse processing. Journal of Verbal Learning and

Verbal Behavior, 23(4):487–493.

Gordon, P. C. and Scearce, K. A. (1995). Pronominalization

and discourse coherence, discourse structure and pro-

noun interpretation. Memory and Cognition, 23:313–

323.

Grimes, J. E. (1975). The Thread of Discourse. The Hague:

Mouton.

Grosz, B. J., Joshi, A. K., and Weinstein, S. (1995). Center-

ing: A framework for modelling the local coherence

of discourse. Computational Linguistics, 21(2):202–

225.


Grosz, B. J. and Sidner, C. L. (1986). Attention, intention,

and the structure of discourse. Computational Lin-

guistics, 12(3):175–204.

Hahn, U. and Strube, M. (1997). Centering in-the-large:

Computing referential discourse segments. eprint

arXiv:cmp-lg/9704014.

Hirschberg, J. and Nakatani, C. H. (1996). A prosodic anal-

ysis of discourse segments in direction-giving mono-

logues. In Proceedings of the 34th annual meet-

ing on Association for Computational Linguistics,

pages 286–293, Morristown, NJ, USA. Association

for Computational Linguistics.

Hitzeman, J. and Poesio, M. (1998). Long distance

pronominalisation and global focus. In ACL/COLING.

Association for Computational Linguistics, Montreal.

Hobbs, J. R. (1979). Coherence and coreference. Cognitive

Science, 3(1):67–90.

Hobbs, J. R. (1985). On the coherence and structure of

discourse. Center for the Study of Language and

Information, Stanford University.

Joshi, A. K. and Weinstein, S. (1981). Control of inference:

Role of some aspects of discourse structure - center-

ing. In Proc. International Joint Conference on Arti-

ﬁcial Intelligence, pages 385–387. Morgan Kaufmann

Publishers Inc.

Kameyama, M. (1998). Intra-sentential centering: A case

study. In Walker, M. A., Joshi, A. K., and Prince,

E. F., editors, Centering Theory in Discourse, chap-

ter 6, pages 89–112. Oxford University Press.

Kehler, A. (1993). The effect of establishing coherence in

ellipsis and anaphora resolution. In Proceedings of the

Conference of the Association for Computational

Linguistics (ACL–93), pages 62–69. Association for

Computational Linguistics.

Kehler, A. (2002). Coherence, reference, and the theory of

grammar. CSLI publications Stanford, Calif.

Kertz, L., Kehler, A., and Elman, J. L. (2006). Grammat-

ical and coherence-based factors in pronoun interpre-

tation. In Proceedings of the 28th Annual Conference

of the Cognitive Science Society (this volume), pages

26–29.

Knott, A. (1996). A data-driven methodology for motivating

a set of coherence relations. PhD thesis, The Univer-

sity of Edinburgh: College of Science and Engineer-

ing: The School of Informatics.

Mann, W. C. and Thompson, S. A. (1988). Rhetorical struc-

ture theory: Towards a functional theory of text orga-

nization. Text, 8(3):243–281.

Marcu, D. (1996). Building up rhetorical structure trees. In

Proceedings of AAAI-96, pages 1069–1074. American

Association for Artificial Intelligence.

Marcu, D. (1997). The rhetorical parsing of natural lan-

guage texts. In ACL-35: Proceedings of the 35th An-

nual Meeting of the Association for Computational

Linguistics and Eighth Conference of the European

Chapter of the Association for Computational Lin-

guistics, pages 96–103, Morristown, NJ, USA. Asso-

ciation for Computational Linguistics.

Marcu, D. (2000). The theory and practice of discourse

parsing and summarization. MIT Press.

Marslen-Wilson, W., Levy, E., and Komisarjevsky Tyler, L.

(1982). Producing interpretable discourse: The es-

tablishment and maintenance of reference. Language,

pages 339–378.

Merlo, P. and Stevenson, S. (2001). Automatic verb classi-

ﬁcation based on statistical distributions of argument

structure. Computational Linguistics, 27(3):373–408.

Passonneau, R. J. (1998). Interaction of discourse

structure with explicitness of discourse anaphoric noun

phrases, pages 327–356. Oxford University Press.

Power, R., Scott, D., and Bouayad-Agha, N. (2003).

Document structure. Computational Linguistics,

29(2):211–260.

Sanders, T. (1992). Toward a Taxonomy of Coherence Re-

lations. Discourse processes, 15(1):1–35.

Strube, M. and Hahn, U. (1999). Functional centering-

grounding referential coherence in information struc-

ture. Computational Linguistics, 25(3):309–344.

Brennan, S. E., Friedman, M. W., and Pollard, C. J. (1987). A

centering approach to pronouns. In Proc. of the 25th ACL,

pages 155–162.

Webber, B. (1988). Discourse deixis: Reference to dis-

course segments. In Proceedings of the 26th annual

meeting on Association for Computational Linguis-

tics, pages 113–122. Association for Computational

Linguistics Morristown, NJ, USA.

Webber, B., Knott, A., Stone, M., and Joshi, A. (1999).

Discourse relations: A structural and presuppositional

account using lexicalised TAG. In Proceedings of

the 37th annual meeting of the Association for Com-

putational Linguistics on Computational Linguistics,

pages 41–48. Association for Computational Linguis-

tics Morristown, NJ, USA.

Webber, B., Stone, M., Joshi, A., and Knott, A. (2003).

Anaphora and discourse structure. Computational

Linguistics, 29(4):545–587.

Wolf, F. and Gibson, E. (2004). Discourse coherence and

pronoun resolution. Language and Cognitive Pro-

cesses, 19(6):665–675.

Wolf, F. and Gibson, E. (2005). Representing discourse co-

herence: A corpus-based study. Computational Lin-

guistics, 31(2):249–287.
