the main clause. Gerundives preceded by the preposition “by” are interpreted as Means;
those preceded by “for” are interpreted as Rationale or Purpose, depending on additional
conditions that hold in the main clause – see below.
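The preposition rule for gerundives stated above can be sketched as a simple lookup. This is only an illustration: the function name and the return format are assumptions, and the additional main-clause conditions that separate Rationale from Purpose are not modelled, so both labels are returned as candidates.

```python
from typing import Dict, List

# Candidate relations per preposition, as stated in the text.
# Choosing between Rationale and Purpose needs main-clause conditions
# that this sketch does not model.
GERUNDIVE_RELATIONS: Dict[str, List[str]] = {
    "by": ["Means"],
    "for": ["Rationale", "Purpose"],
}


def gerundive_relation_candidates(preposition: str) -> List[str]:
    """Return candidate relations for a gerundive introduced by `preposition`.

    Hypothetical helper: an empty list means the rule does not apply.
    """
    return list(GERUNDIVE_RELATIONS.get(preposition.lower(), []))
```

A gerundive introduced by “for” therefore stays ambiguous between two labels until the main clause is inspected.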
As to tensed clauses, be they relative clauses, complement clauses, coordinate
clauses, or simply main clauses, they may express the whole gamut of Causal Relations. The
most difficult to assign, as we noted above, is Circumstance. It requires a certain
number of constraints to apply before the interpretation is accepted by the algorithm.
In particular, the SUBJect argument must be non-animate non-human, or else the
proposition must be under the scope of modal operators, or the main predicate must
be a semantically opaque predicate. In the presence of subordinators, specific cue
words or causality markers, the discourse relation is assigned straightforwardly. Without
such markers, the choice is still difficult to make: the algorithm uses the lists of Exceptional
Event predicates and Negative Judgment predicates, as well as the presence of negation
or other indicators of some uncommon or unpredictable situation that requires the
Cause to be explicitly expressed. Finally, if some of these conditions hold, the algorithm
looks for cohesive links by accessing the output of the anaphora resolution algorithm
and checks for the presence of Resolved Nominals or Pronominals, i.e. whether the
current clause contains nouns/pronouns which corefer to some previously mentioned
entity present in the history list. In case this also fails and there is no cohesive link,
the system will have to search WordNet, LCS and FrameNet for some indication that
the Predicate-Argument Structures contained in the two clauses under examination
can be taken to be explicitly in a Causality Relation. We usually start with a lookup
in LCS. LCS entries contain cross-references to Levin verb classes, to WordNet senses
and to PropBank argument lists. These have been mapped to a more explicit label set
of Semantic Roles, which can be regarded as more linguistically motivated than the ones
contained in FrameNet, which are more pragmatically motivated. However, for our
purposes, LCS notation is more perspicuous because of the presence of the CAUSE
operator, and is more general: out of a total of 9000 lexical entries, 5000 contain
the CAUSE operator (roughly 56%). By contrast, in FrameNet, out of a total of
10000 lexical entries, only 333 are related to a CAUSE Frame; if we search for the
word “cause” in the definitions contained in all the Frames, the number increases to
789 (under 8%), which is still too small compared to LCS.
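The cascade of checks described in this paragraph can be sketched as follows. This is one possible reading of the prose, not the actual implementation: the field names, the ordering of the tests, the returned labels and the `lexical_lookup` callback (standing in for the search in LCS, WordNet and FrameNet) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ClauseInfo:
    """Hypothetical clause-level features; all names are illustrative."""
    has_causal_marker: bool = False      # subordinator, cue word, causality marker
    subject_animate: bool = True         # animate/human SUBJect argument
    under_modal_scope: bool = False      # proposition under a modal operator
    opaque_predicate: bool = False       # semantically opaque main predicate
    exceptional_event: bool = False      # Exceptional Event predicate
    negative_judgment: bool = False      # Negative Judgment predicate
    negated: bool = False                # presence of negation
    has_resolved_anaphor: bool = False   # Resolved Nominal or Pronominal


def constraints_hold(c: ClauseInfo) -> bool:
    # At least one of the three constraints named in the text must apply.
    return (not c.subject_animate) or c.under_modal_scope or c.opaque_predicate


def unusual_situation(c: ClauseInfo) -> bool:
    # Indicators that the Cause needs to be expressed explicitly.
    return c.exceptional_event or c.negative_judgment or c.negated


def decide(c: ClauseInfo,
           lexical_lookup: Callable[[ClauseInfo], bool] = lambda _c: False) -> str:
    """Return which step of the cascade licensed a causal reading, or 'rejected'."""
    if c.has_causal_marker:
        return "marker"                  # assigned straightforwardly
    if not constraints_hold(c) or not unusual_situation(c):
        return "rejected"
    if c.has_resolved_anaphor:
        return "cohesive_link"           # coreference with the history list
    # Last resort: consult the lexical resources (stubbed by the callback).
    return "lexical" if lexical_lookup(c) else "rejected"
```

Under this reading, the lexical resources are consulted only when neither an explicit marker nor a cohesive link is available, which matches their role as a fallback in the text.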
3.2 The CRs Algorithm in Detail
We now present the algorithm in detail from a technical and linguistic point of
view. It performs the following actions:
1. collects linguistic information from the parser output;
2. translates linguistic information into Semantic and Pragmatic Classes;
3. assigns Discourse Relations on a clause-by-clause level: some will be
specific, some generic or default;
4. detects the presence of Causality Relations according to the algorithm outlined
above;
5. determines coreference relations between the Discourse Units thus classified.
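The five actions listed above can be read as a sequence of stages applied to the clauses of a text. The skeleton below only illustrates that control flow: every function name, dictionary key and stub body is an assumption, and none of the actual linguistic processing is reproduced.

```python
from typing import Callable, Dict, List

Clause = Dict[str, object]  # hypothetical clause record


def collect_linguistic_info(parsed: List[Clause]) -> List[Clause]:
    # Step 1: copy clause-level items from the parser output.
    return [dict(c) for c in parsed]


def translate_to_classes(clauses: List[Clause]) -> List[Clause]:
    # Step 2: placeholder for the Semantic and Pragmatic Classes.
    for c in clauses:
        c.setdefault("classes", [])
    return clauses


def assign_discourse_relations(clauses: List[Clause]) -> List[Clause]:
    # Step 3: fall back to a generic label when no specific one is known.
    for c in clauses:
        c.setdefault("relation", "default")
    return clauses


def detect_causality(clauses: List[Clause]) -> List[Clause]:
    # Step 4: stub test; only an explicit marker flag is consulted here.
    for c in clauses:
        c["causal"] = bool(c.get("causal_marker", False))
    return clauses


def link_coreference(clauses: List[Clause]) -> List[Clause]:
    # Step 5: placeholder for links between classified Discourse Units.
    for c in clauses:
        c.setdefault("antecedents", [])
    return clauses


STAGES: List[Callable[[List[Clause]], List[Clause]]] = [
    collect_linguistic_info,
    translate_to_classes,
    assign_discourse_relations,
    detect_causality,
    link_coreference,
]


def run_pipeline(parsed: List[Clause]) -> List[Clause]:
    """Apply the five stages in the order given in the list above."""
    data = parsed
    for stage in STAGES:
        data = stage(data)
    return data
```

Keeping each action as a separate stage makes the order explicit: causality detection runs only after every clause has received a specific or default Discourse Relation.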
The input to the algorithm is shown in Table 1: the list of linguistic items
represents clause-level information derived from the dependency structure of the
example “he was not frozen in place by rigid ideology”: