THE INTELLIGENT WEB
F. Javier del Álamo, Raquel Martínez and José Alberto Jaén
Department of Control, Electronics and Computer Science
E.T.S. Ingenieros Industriales-Universidad Politécnica de Madrid, Spain
Keywords: Rhetorical structure theory, Rhetorical-semantic relation, Semantic network, Knowledge node.
Abstract: Many people are working on the semantic web, with the main objective of enhancing web searches. Our
proposal is a new research strategy based on the existence of a discrete set of semantic relations for the
creation and exploitation of semantic networks on the web. To this end, we defined in a previous paper
(Álamo, Martínez, Jaén, 2009) the Rhetorical-Semantic Relation (RSR), based on the results of Rhetorical
Structure Theory. We formulate a general set of RSR capable of building discourse, making it possible
to express any concept, procedure or principle in terms of knowledge nodes and RSR. These knowledge
nodes can then be elaborated in the same way. This network structure in terms of RSR makes it possible
to develop automatic answering systems, as well as other utilities oriented towards the exploitation of
semantic structure, such as the automatic production of web pages or automatic e-learning generation.
1 BASICS: SUMMARY OF THE RHETORICAL SEMANTIC RELATIONS (RSR)
The primary objective of computational linguistics is
the study and computerized treatment of human
language, with the goal of providing a natural
language model.
Based on the significant contributions of
Rhetorical Structure Theory (RST) (William Mann, Sandra
Thompson 1999), we have formulated a set of
Rhetorical Semantic Relations (RSR) for knowledge
representation.
RST provides an explanation of the coherence
of discourse. We assume the results of RST and
propose a set of relations capable of representing
knowledge.
In short, RST holds that reading a text does not
always produce an impression of coherence. There
are texts that are syntactically and semantically
correct but difficult to understand. The theory
explains the coherence of discourse in terms of
relations between blocks of text: the
nucleus-satellite relations and the multinuclear
ones. Without going into further detail, RST
explains that our ability to understand a text is
related both to the presence of rhetorical relations
and to the distance between the fragments of text
(William Mann, Sandra Thompson 1999), (Bosma, 2005).
1.1 RSR Formulation
Based on RST, we ask whether, just as there is a
finite set of rhetorical relations between two
different blocks of text, there is also a reasonably
bounded set of relations (semantic primitives) for
any knowledge representation. This set must include
the rhetorical relations as a subset. We will call this
set RSR (Rhetorical Semantic Relations), and our
goal is to build semantic network definitions in
terms of RSR.
There are situations, such as those studied by
Kathleen McKeown in Text Generation
(McKeown, 1985), in which rhetorical structures are
nested within others. By analyzing this structure in
a large number of cases, we have found that at a
certain point the fragment has a semantic structure
that does not correspond to the relations proposed
by RST.
For this reason, as a second step, we propose to
continue analysing the blocks of text obtained from
the rhetorical analysis by detecting semantic
relations such as “is_a”, in both its
“is_a (individual-category)” and
“is_a (category-category)” forms, the “is_part_of”
relation and the “causal” ones.
Javier del Álamo F., Martínez R. and Alberto Jaén J. (2010).
THE INTELLIGENT WEB.
In Proceedings of the 2nd International Conference on Agents and Artificial Intelligence - Artificial Intelligence, pages 662-666
DOI: 10.5220/0002763106620666
Copyright © SciTePress
In “Basic Methods of Instruction” (Reigeluth, 2007),
the author makes a distinction between common
features, those shared by all the members (or
subclasses) of a class, and differential features,
those which differentiate an individual or subclass
from the rest of the individuals (or subclasses) of
the same class.
The summarized results are shown in the table
below. For each rhetorical-semantic relation, it
gives the canonical expression: a representative
fragment of text that includes both the relation to
be used and, in capital letters, the type of content
of the child node.
Table 1: RSR Canonical expression.
Nr.  Denomination           Canonical expression (*)
1    Transformation         Changes the ‘OBJECT’…
2    Feature                Shows the ‘FEATURE’…
3    Function               Performs the ‘FUNCTION’…
4    Location               Places in the ‘LOCATION’…
5    Objective              Pursues the ‘OBJECTIVE’…
6    Classify               Belongs to the ‘CLASS’…
7    Coincidence            Shows the ‘COINCIDENCE’…
8    Difference             Shows the ‘DIFFERENCE’…
9    Part                   Shows the ‘PART’…
10   Effect                 Produces the ‘EFFECT’…
11   Result                 Yields the ‘RESULT’…
12   Activity               Develops the ‘ACTIVITY’…
13   Method                 Is reached by the ‘METHOD’…
14   Comparison             Is compared to the reference ‘OBJECT’…
15   Taxonomy               Is organized in ‘CLASSES’
16   Cause                  Because of the ‘CAUSE’…
17   Evaluation             Has the ‘VALUE’…
18   Condition              Has the ‘CONDITION’…
19   Elaboration            Is elaborated in the ‘OBJECT’…
20   Antithesis             Is opposed to the ‘OBJECT’…
21   Summary                Is summed up in the ‘OBJECT’…
22   Restatement            Can be expressed as ‘OBJECT’…
23   Background             Is understood because of the ‘OBJECT’…
24   Instrumental relation  Is related to the ‘OBJECT’
25   Interpretation         Must be interpreted in the ‘CONTEXT’
26   Concession             Although the ‘PREDICATE’ can be true…
27   Justify                Is justified by the ‘THESIS’
28   Motivation             Is interesting because of the ‘REASON’…
29   List                   Includes the ‘OBJECT/CLASS’…
30   Following              Follows the ‘ELEMENT’…
(*) Note that some of the relations make sense
only in the singular or only in the plural. A plural
expression is equivalent to a set of several identical
relations between the same knowledge father node
and different knowledge child nodes.
In this way, the definition C = {x, y, z} is
equivalent to: C includes object x; C includes
object y; C includes object z (the List relation in
Table 1). An alternative formulation in RSR uses
the inverse relation: x ∈ C; y ∈ C; z ∈ C. That is
the Classify relation (“Belongs to the ‘CLASS’”).
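The equivalence between a plural expression and a set of singular relations, and between a relation and its inverse, can be sketched in a few lines of Python. This is only an illustrative sketch: the names `Relation`, `expand_plural` and `invert` are assumptions introduced here and do not come from the RSR formulation itself.

```python
from typing import NamedTuple


class Relation(NamedTuple):
    name: str    # RSR denomination, e.g. "List" (relation 29 in Table 1)
    parent: str  # knowledge father node
    child: str   # knowledge child node


def expand_plural(name, parent, children):
    """Turn one plural relation into one singular relation per child node."""
    return [Relation(name, parent, child) for child in children]


def invert(relations, inverse_name):
    """Build the inverse formulation by swapping father and child nodes."""
    return [Relation(inverse_name, r.child, r.parent) for r in relations]


# C = {x, y, z} expressed with the List relation ...
includes = expand_plural("List", "C", ["x", "y", "z"])
# ... and with its inverse, the Classify relation.
belongs = invert(includes, "Classify")
```

Here `includes` holds three List relations with C as father node, and `belongs` holds the three corresponding Classify relations with C as child node.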
1.2 RSR Verification
To verify the set of RSR, we decided to test the
defined set of relations against the categories of
questions defined in the classic theory of QA
(Question Answering), proposed for the generation
of automatic systems.
Of the 13 categories of questions defined in the
conceptual categorization of question answering
(Lehnert, 1978), we have marked the ones that are
supported by the proposed set of RSR, followed by
the corresponding RSR in brackets.
1. Causal antecedent. (Cause)
2. Goal orientation. (Objective)
3. Enablement. (Condition)
4. Causal consequent. (Effect)
5. Verification.
6. Disjunctive. (List)
7. Instrumental / Procedural. (Method)
8. Concept completion.
9. Expectational. (Explanation)
10. Judgmental. (Evaluation)
11. Quantification. (Evaluation)
12. Feature specification. (Feature)
13. Request. (Result)
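The correspondence in the list above can be held in a small lookup table, which makes it easy to see which categories map directly to a relation and which need separate treatment. The sketch below is hypothetical: the dictionary and function names are assumptions, the keys are the category numbers of the list, and "Feature" is written as in Table 1.

```python
# Keys: category numbers from the list above.
# Values: the RSR given in brackets, or None when no relation is marked.
CATEGORY_TO_RSR = {
    1: "Cause",
    2: "Objective",
    3: "Condition",
    4: "Effect",
    5: None,           # verification
    6: "List",
    7: "Method",
    8: None,           # concept completion
    9: "Explanation",
    10: "Evaluation",
    11: "Evaluation",
    12: "Feature",
    13: "Result",
}


def directly_supported(mapping=CATEGORY_TO_RSR):
    """Category numbers answered by looking up a single relation."""
    return sorted(n for n, rsr in mapping.items() if rsr is not None)


def needs_special_handling(mapping=CATEGORY_TO_RSR):
    """Category numbers with no relation marked in the list."""
    return sorted(n for n, rsr in mapping.items() if rsr is None)
```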
Regarding support for verification questions, once
we have expressed a discourse in terms of RSR, it is
possible, for example, to translate it directly into
Prolog predicates. Applying an inference mechanism
to this knowledge, in the form of a Prolog query, is
the basis for the implementation of a question
answering system. If the result of the query is true,
the proposition is true. If it is false, the available
knowledge is not sufficient to confirm whether the
proposition is true or false.
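The behaviour of a verification question can be illustrated with a minimal Python sketch instead of a Prolog engine. It assumes that facts are stored as (relation, father node, child node) triples and that the only inference rule is the List/Classify inverse described in Section 1.1; the names `INVERSES` and `verify` are hypothetical. A failed query returns `None` rather than `False`, to reflect that the proposition can be neither confirmed nor denied.

```python
# Inverse pairs taken from the C = {x, y, z} example; an assumption of this
# sketch, not a complete list of RSR inverses.
INVERSES = {"List": "Classify", "Classify": "List"}


def verify(name, parent, child, facts):
    """Return True if the proposition follows from the facts, else None."""
    if (name, parent, child) in facts:
        return True
    inverse = INVERSES.get(name)
    if inverse is not None and (inverse, child, parent) in facts:
        return True  # derived through the inverse relation
    return None      # unknown: the knowledge base cannot confirm or deny it


facts = {("List", "C", "x"), ("List", "C", "y")}
stored = verify("List", "C", "x", facts)       # stored fact
derived = verify("Classify", "x", "C", facts)  # derived by inversion
unknown = verify("Classify", "w", "C", facts)  # not confirmable
```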
Concept completion requires gathering all the
semantic networks that include this node, together
with the corresponding predicates.
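One possible reading of this step is a scan over the stored relations that collects every one in which the node takes part, either as father or as child. The sketch below assumes the same triple representation (relation, father node, child node); `complete_concept` is a hypothetical helper, and the sample facts reuse the node identifiers of the Archimedes example in Section 2.

```python
def complete_concept(node, facts):
    """Collect every stored relation in which the node participates."""
    return {
        "as_parent": sorted(f for f in facts if f[1] == node),
        "as_child": sorted(f for f in facts if f[2] == node),
    }


facts = {
    ("Produces_the_EFFECT", "id_node_1", "id_node_11"),
    ("Takes_the_VALUE", "id_node_11", "id_node_111"),
    ("Shows_the_FEATURE", "id_node_11", "id_node_114"),
}
summary = complete_concept("id_node_11", facts)
```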
The rest of the possible questions are completely
supported by the set of rhetorical-semantic relations
defined in the present article, concluding that our
model supports in a reasonable way the questions
that an agent can make, either by means of the
relation or by means of its inverse.
1.3 Innovation Aspects
The main innovation of the proposed approach is
the semantic enhancement of the resulting
representation. We can represent a set of linked
pages and resources as a set of knowledge nodes
interconnected solely by means of RSR (or specific
RSR synonyms in specific domains). The set of web
pages then becomes a rhetorical-semantic network
based on these relations.
An ontology corresponds to just one specific RSR
(Classify), but it is not the only one. In this paper,
we propose a set of 30 basic relations to be used as
an alternative and more complete treatment of
ontology, using the same technological support.
An important contribution of the RSR approach to
the exploitation of the semantic web is to provide an
instrument for automatically building knowledge
bases for different applications, such as intelligent
question answering, by taking node names as
variables and RSR as the relationships of the
knowledge base.
The main applications are in the fields of automatic
e-consulting, e-learning generation and automatic
document production.
2 PROOF OF CONCEPT
We have developed a large number of examples to
test our proposal. At the present stage, the didactic
value of an example is as important as its correct
interpretation. With this in mind, we have developed
examples in the fields of mechanical engineering,
instructional design and e-learning production.
Here we show an example from the physical
sciences: the role of RSR in explaining the
Archimedes Principle:
Figure 1: Archimedes Principle.
By automatic identification of nodes, we can obtain
a set of Node IDs, for example:
Figure 2: Archimedes Principle Node Identification.
The set of RSR is valid for knowledge
representation, and it supports the question
categories in Q&A theory. We can express any
content as a network composed of nodes and
relations of the defined set, using the appropriate
synonyms. This means we can translate it
ICAART 2010 - 2nd International Conference on Agents and Artificial Intelligence
directly as a set of Horn clauses in a knowledge
database:
Produces_the_EFFECT(id_node_1, id_node_11),
Takes_the_VALUE(id_node_11, id_node_111),
Produces_the_RESULT(id_node_11, id_node_112),
Produces_the_RESULT(id_node_11, id_node_113),
Shows_the_FEATURE(id_node_11, id_node_114)
This approach is the basis for the e-consulting
application. A question is solved as a query to the
knowledge database, which yields a node
identification. The resulting id_node is then used to
retrieve the associated content (text, text +
graphics, animation, web page, equations, etc.).
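The two steps of the e-consulting application, a query that yields node identifiers followed by content retrieval, might look as follows in Python. This is a minimal sketch under stated assumptions: the facts mirror the clauses listed above, while the `CONTENT` table and its placeholder strings are hypothetical, since the actual node contents are not given in the text.

```python
# Facts mirroring the clauses above, stored as (relation, father, child).
FACTS = [
    ("Produces_the_EFFECT", "id_node_1", "id_node_11"),
    ("Takes_the_VALUE", "id_node_11", "id_node_111"),
    ("Produces_the_RESULT", "id_node_11", "id_node_112"),
    ("Produces_the_RESULT", "id_node_11", "id_node_113"),
    ("Shows_the_FEATURE", "id_node_11", "id_node_114"),
]

# Hypothetical content store: node id -> associated resource.
CONTENT = {
    "id_node_112": "<content of node 112>",
    "id_node_113": "<content of node 113>",
}


def query(relation, parent, facts=FACTS):
    """Step 1: return the child node ids that satisfy the relation."""
    return [c for (r, p, c) in facts if r == relation and p == parent]


def answer(relation, parent):
    """Step 2: recover the content associated with each resulting node."""
    return [CONTENT.get(node, "") for node in query(relation, parent)]


result_nodes = query("Produces_the_RESULT", "id_node_11")
```

In a real deployment the content store would point to text, graphics or web pages rather than to placeholder strings.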
3 THE INTELLIGENT WEB
Many people are working on the semantic web, with
the main objective of simplifying and enhancing
web searches. This work is based on the presence of
certain kinds of tags specifying ‘ontologies’.
That is ultimately the main idea of the semantic
web: if we can specify the classes to which a word
belongs and tag it, we establish absolute
relationships between words and categories, so that
there is an implicit ‘belongs to the class’
relationship between the tagged word and every
class to which it belongs. The inverse relation is
also present, through a mechanism for retrieving all
the words belonging to a certain class.
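The ‘belongs to the class’ relation and its inverse can be pictured as a pair of lookup tables. The sketch below is only illustrative: the function name `build_class_index` and the sample words and classes are hypothetical, and real semantic web tagging relies on dedicated ontology languages rather than plain dictionaries.

```python
from collections import defaultdict


def build_class_index(tags):
    """Invert word -> classes into class -> words (the inverse relation)."""
    index = defaultdict(set)
    for word, classes in tags.items():
        for cls in classes:
            index[cls].add(word)
    return dict(index)


# Hypothetical tags: each word is labelled with the classes it belongs to.
tags = {
    "buoyancy": {"force"},
    "weight": {"force", "magnitude"},
}
class_index = build_class_index(tags)
```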
What we propose is a way of explicitly tagging the
RSR present in every text on a web page. To the
extent that this proposal is satisfied, the resulting
text can be exploited in different ways.
In our goal of representing knowledge, we must
begin by asking: what is knowledge? This is
probably one of the most difficult questions we can
ask, and the most profound philosophical answer is
surely beyond the scope of this paper, but we can
agree that “knowledge is a representation of
reality in our mind”.
We think in terms of ideas that we usually express in
different ways, such as texts, drawings, equations,
images, or sequences of memories, such as videos.
The important point here is that they are connected
in our mind by means of certain kinds of relations.
Our intention is to integrate contributions from
different theories, such as the idea of building
mental maps in Meaningful Learning Theory
(Novak, Ausubel, 2002).
4 CONCLUSIONS
The set of RSR is valid for knowledge
representation and it supports the question categories
in Q&A theory. We can express any content as a
network composed of nodes and relations of the
defined set, using the appropriate synonyms. This
means that the network can be translated directly
into a set of Horn clauses in a knowledge database.
We can use different synonyms for different domain
applications without losing semantic connectivity.
The RSR set provides a means for the development
of natural language answering systems.
It can also be a means for the definition of general
ontologies and relations on the semantic web.
It is possible to automatically generate e-learning
lessons, documents or Q&A systems from any
knowledge base generated automatically from an
RSR expression of contents.
5 FUTURE LINES OF RESEARCH
The main lines of research in which we are
interested and in which we are intensifying our
efforts include the following:
• Operations on RSR (RSR inverses and
plurals, RSR combinations, the treatment
of verbal trends in RSR).
• Creation of a knowledge representation and
storage model and data architecture capable
of supporting the definition of knowledge
networks based on RSR at the same time.
• Fundamental Cognitive Networks:
formulation of a molecular structure of
knowledge by using the patterns most
frequently used by people for discourse
construction.
• The elaboration of a Knowledge
Representation Methodology, by using
rhetorical-semantic networks.
• The application of Walter Bosma’s results
regarding rhetorical distance, and their
treatment as semantic weighted
networks.
REFERENCES
Mann, William C. and Thompson, Sandra A. (1999) “An
Introduction to Rhetorical Structure Theory”.
Marcu, Daniel. (1997) “The Rhetorical Parsing,
Summarization, and Generation of Natural Language
Texts”.
Taboada, M. and Mann, William C. (2006) “Applications
of Rhetorical Structure Theory”. Discourse Studies
8(4): 567-588.
Taboada, M. and Mann, William C. (2006) “Rhetorical
Structure Theory: Looking Back and Moving Ahead”.
Discourse Studies 8(3): 423-459.
Mann, William C. and Matthiessen, Christian M. I. M. (1991)
“Functions of language in two frameworks”. Word
42(3): 231-249.
Mann, William C., Matthiessen, Christian M. I. M. and
Thompson, Sandra A. (1992) “Rhetorical Structure
Theory and Text Analysis”. Discourse Description:
Diverse linguistic analyses of a fund-raising text, ed.
by W. C. Mann and S. A. Thompson. Amsterdam,
John Benjamins: 39-78.
Mann, William C. and Thompson, Sandra A., Eds.
(1992a) “Discourse Description: Diverse linguistic
analyses of a fund-raising text”. Pragmatics &
Beyond, New Series. Amsterdam, John Benjamins.
Mann, William C. and Thompson, Sandra A. (1992b)
“Relational Discourse Structure: A Comparison of
Approaches to Structuring Text by 'Contrast'”.
Language in Context: Essays for Robert E. Longacre,
ed. by S. J. Hwang and W. R. Merrifield. Dallas, SIL:
19-45.
Lehnert, Wendy G. (1978) “The Process of Question
Answering. A Computer Simulation of Cognition”.
Yale University, Lawrence Erlbaum Associates,
J. Wiley & Sons.
Mann, W.C. and Thompson, S.A. (1988) “Rhetorical
Structure Theory: Toward a functional theory of text
organization”. Text 8(3): 243-281.
McKeown, Kathleen (1985) “Text Generation”.
Cambridge University Press.
Reigeluth, Charles. (2007) “Basic Methods of
Instruction”. Indiana University Website.
http://education.indiana.edu ,
http://www.indiana.edu/~ist/faculty/reigelut.html
Bosma, W.E. (2005) “Query-Based Summarization using
Rhetorical Structure Theory”. University of Temple.
15th meeting of CLIN, pp. 29-44. ISBN 90-76864-91-8.
Alamo, F. Javier del. (2007) “Knowledge Modeling and
Automatic Web exploitation”. Polytechnic University
of Madrid, Industrial Engineering School.
Alamo, F. Javier del, Martínez, Raquel and Jaén, José Alberto
(2009) “Rhetorical-Semantic Relations: A proposal
for Atomic Representation of Knowledge”. ITA 09.
Novak, J.D. (2002) “Meaningful Learning: The Essential
Factor for Conceptual Change in Limited or
Inappropriate Propositional Hierarchies Leading to
Empowerment of Learners”. Science Education 86(4):
548-571.