RICAD: TOWARDS AN ARCHITECTURE FOR RECOGNIZING AUTHOR'S TARGETS

Kanso Hassan (1,2), Elhore Ali, Soulé-Dupuy Chantal (1,2) and Tazi Said (1,3)

(1) Université de Toulouse 1, 2 rue du Doyen Gabriel Marty, 31042 Toulouse cedex 9, France

(2) IRIT, CNRS, Université Paul Sabatier, 118 route de Narbonne, 31062 Toulouse cedex 4, France

(3) LAAS, CNRS, 7 Avenue Colonel Roche, 31077 Toulouse cedex 4, France

Keywords: Information Research, Intentional Structure, Intentional Explanation, Ontology.

Abstract: We present RICAD, a system based on a semi-automatic method for analysing a domain-specific corpus, to which classical information retrieval methods cannot be applied. The approach relies on a model of intentional structure and recognizes the author's intentions from written documents in a specific domain. RICAD operates in three stages: 1) semi-automatic segmentation of a document according to the author's intentions, with extraction of the intentional verbs and their associated concepts from each segment through the system's algorithms; 2) ontology building; and 3) updating the ontology of intentions to enrich the knowledge base containing all possible intentions of a domain.

1 INTRODUCTION

The masses of information the researcher is exposed

to make it hard

for her to find the needle in the

haystack as it is impossible to skim-read even a

portion of the potentially relevant material. The

information access and search problem is

particularly acute for researchers in interdisciplinary

subject areas like computational linguistics or

cognitive science, as they must in principle be aware

of articles in a whole range of neighboring fields,

such as computer science, theoretical linguistics,

psychology, philosophy and formal logic.

In this article, we tackle the problem of representing the information contained in documents by relying on the various structures that can be extracted from them. Several types of structure can be identified and used to describe information and to facilitate its retrieval and rendering. The structures most commonly addressed in document engineering, depending on the type of document concerned, cover complementary aspects: physical structure (related to rendering), logical structure (generally a hierarchical organization of the elements composing a document), semantic structure (semantic decomposition of a document), rhetorical structure (a descriptive and functional theory of textual organization based on the recognition of semantic relations between units of text), and spatio-temporal structure (representation in space and time). Exploiting the logical and physical structures has already proven useful for the fragmentation, storage and rendering of documents. However, document structures based on rhetoric, semantics and, in particular, communicative intention have been neither sufficiently studied nor exploited in document systems.

Our work focuses more precisely on the concept of intentional structure, which represents the intentional knowledge of a textual corpus. This knowledge can serve as a basis for any process of document annotation or retrieval, because it provides additional information about the contents of the documents. Building on the theory of intentionality, we developed RICAD, a system whose objective is to find the communicative intentions of authors. RICAD uses existing, natural techniques of deduction, close to those used by a domain expert. Its specificity lies in the fact that it is able to find the author's intentions, to refine its analysis strategies for a new corpus, and to produce an ontology of intentions automatically. The search for and identification of intentions are based on a segmentation of texts, followed by an analysis of each segment to extract the intentional verbs and their associated concepts. The segmentation techniques and the methods for extracting and analysing the intentional verbs are described in this paper.

Hassan K., Ali E., Chantal S. and Said T. (2008). RICAD: TOWARDS AN ARCHITECTURE FOR RECOGNIZING AUTHOR'S TARGETS. In Proceedings of the Tenth International Conference on Enterprise Information Systems - ISAS, pages 374-379. SciTePress.

This article is organized as follows. Section 2 gives an overview of research in plan recognition. Section 3 presents the recognition systems based on intention. Finally, conclusions and future work are presented in Section 4.

2 RELATED WORK (OVERVIEW)

Since Schmidt et al. (Schmidt et al., 78) first identified plan recognition as a problem in its own right, it has been applied to a wide variety of domains, including natural language understanding and generation (Allen et al., 80) (Carberry, 90), story understanding (Wilensky, 78) (Charniak et al., 89, 93), multi-agent coordination (Huber et al., 94), dynamic traffic monitoring (Pynadath et al., 95), collaborative systems (Ferguson et al., 96, 98), adventure games (Albrecht et al., 98), network intrusion detection (Geib et al., 01), and multi-agent team monitoring (Kaminka et al., 02).

Many plan recognition approaches have been proposed. (Kautz et al., 86) presented the first formal theory of plan recognition, using McCarthy's circumscription. They define the plan recognition problem as identifying a minimal set of top-level actions sufficient to explain the observed actions, and use the minimal covering set as a principle for disambiguation. To deal with the uncertainty inherent in plan inference, (Charniak et al., 89, 93) built the first probabilistic model of plan recognition based on Bayesian reasoning. Their system supports the automatic generation of a belief network (BN) from observed actions according to a set of network construction rules. The constructed belief network is then used to understand a character's actions in a story. (Huber et al., 94) used PRS as a general language for plan specification. They gave a dynamic mapping from PRS specifications to belief networks, and applied the approach to the coordination of multi-agent teams.

Pynadath and Wellman proposed a probabilistic method based on parsing. Their approach employs probabilistic state-dependent grammars (PSDGs) to represent an agent's plan generation process. The PSDG representation, together with inference algorithms, supports efficient answering of restricted plan recognition queries. More recently, (Bui et al., 02, 03) proposed an online probabilistic policy recognition method based on the abstract hidden Markov model (AHMM) and on an extension of the AHMM that allows for policies with memories (AHMEM). In their frameworks, scalability in policy recognition is achieved by using an approximate inference scheme (i.e., the Rao-Blackwellised particle filter). Besides Bayesian models, some probabilistic approaches are based on Dempster-Shafer theory, e.g., (Carberry, 90) and (Bauer, 95, 96).

Though the approaches differ, most plan recognition systems infer a hypothesized plan from observed actions. World states and, in particular, state desirability (typically represented as utilities of states) are rarely considered in the recognition. On the other hand, in many real-world applications, the utilities of different outcomes are already known (Blythe, 99). A planning agent usually takes into account that actions may have different outcomes, and that some outcomes are more desirable than others. Therefore, when an agent makes decisions and acts on the world, it needs to balance the different possible outcomes in order to maximize the expected utility of overall goal attainment. Utility and rationality issues have been explored in earlier work in AI (e.g., rational assumptions, (Doyle, 92)). Plan recognition can be viewed as inferring the decision-making strategy of the observed agent, so it is natural to assume that a rational agent will adopt a plan that maximizes the expected utility. While current probabilistic approaches capture how well the observed actions support a hypothesized plan, the missing part is the utility computation.
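The role of the utility computation can be illustrated with a minimal sketch. It assumes a toy setting that is not taken from any of the systems cited above: each hypothesized plan comes with a posterior probability given the observed actions and with a list of possible outcomes, each having a probability and a utility. The class names, the scoring rule (posterior multiplied by expected utility) and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    probability: float  # P(outcome | plan), hypothetical value
    utility: float      # desirability of the outcome for the observed agent


@dataclass
class PlanHypothesis:
    name: str
    posterior: float    # P(plan | observed actions), e.g. from a Bayesian recognizer
    outcomes: list      # list of Outcome


def expected_utility(plan):
    """Expected utility of a plan: sum over outcomes of P(outcome | plan) * U(outcome)."""
    return sum(o.probability * o.utility for o in plan.outcomes)


def rank_plans(plans):
    """Rank hypotheses by posterior weighted by expected utility (one possible combination)."""
    return sorted(plans, key=lambda p: p.posterior * expected_utility(p), reverse=True)


plans = [
    PlanHypothesis("take_highway", 0.6, [Outcome(0.9, 5.0), Outcome(0.1, -10.0)]),
    PlanHypothesis("take_side_road", 0.4, [Outcome(1.0, 10.0)]),
]
ranked = rank_plans(plans)
print([p.name for p in ranked])
```

In this toy example the plan that is best supported by the observations is not ranked first once utilities are taken into account, which is the kind of effect the paragraph above describes.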

As one measure of progress in information retrieval, many systems have been developed that adapt to the circumstances of the information recognition process.

In this paper, we present the architecture of our RICAD system, which recognizes the intentional structure of scientific documents from a specific domain.

In the following section, we present several systems that take into account the concept of intention recognition.

3 INTENTIONAL RETRIEVAL SYSTEMS

Several intentional retrieval systems have been developed by our research team. SABRE (Al-Tawki et al., 02) is an Authoring system Based on Re-use, which helps authors create new documents from fragments of existing documents. These fragments are described in terms of the intentions of their authors and are identified by the main intention of their author. XSEdit ("XML Shared Editor") is a distributed cooperative editing tool that manages and controls the intentions of the writers through metadata (Tazi et al., 06a, 06b). It relies on techniques whose use is currently spreading quickly and is based on a portable language. This tool should allow users to compile and annotate the same document without having to be in the same room at the same moment; it is most useful when the users are not in the same place. Finally, there are the Pero system (Elhore et al., 06) and the RICAD system (Kanso et al. 07): the former applies learning by observation through the reasoning of intentions, and the latter recognizes the author's intentions from written documents in a specific domain.

The following two sections show how the Pero system recognizes the intention of an executed action and how the RICAD system recognizes the authors' intentions from written scientific documents.

3.1 Pero System

The Pero system was developed to implement a model of problem solving based on the concept of intention (Elhore et al., 05a, 05b, 05c, 05d, 06). This model consists of a planner that solves mathematical problems applied to the physical sciences, generating an explanation for each stage that leads to the solution. The model integrates the notion of intention into the problem-solving process in order to add explanatory knowledge about the resolution. This concept represents the knowledge that leads to each resolution action, i.e. the means and the reason used to take the action, as well as the explanatory argument. The resolution graph, in which the nodes correspond to the states of the planner and the arcs to the resolution actions, makes it possible to represent the explanations (Figure 1) as the goal, the means and the justification of the resolution attached to each arc of the graph.

This section shows how the Pero system recognizes the intention of an executed action. This recognition eases the process of explaining the solved exercises.

We adopt the following generic form to represent an intention in the process of scientific problem solving: IA (a1, a2, A, G, M, R)

Here IA represents an intention, belonging to I, that can be carried out by action A. The expression states that agent a1 has the intention I to carry out action A, trying to achieve the goal G by the means M for reason R. Agent a2 is the one for whom the action is intended, generally the learner.

Where

a1: is the author of the action; it is generally the system,

a2: is the agent for whom the explanation of the action is intended; it is generally the learner,

G: (Goal) is an act which expresses what the author wants to do by performing the action;

M: (Means) is an act which expresses the type of action performed on the reasoning;

R: (Reason) is an act which expresses the concepts with which the author performs the action.
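The generic form IA (a1, a2, A, G, M, R) can be sketched as a small record type. This is only an illustrative data structure, not the representation used inside Pero; the field names, the `explain` method and the example values are assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intention:
    author: str     # a1: agent performing the action (generally the system)
    addressee: str  # a2: agent for whom the explanation is intended (generally the learner)
    action: str     # A: the resolution action
    goal: str       # G: what the author wants to do by performing the action
    means: str      # M: the type of action performed on the reasoning
    reason: str     # R: the concepts with which the author performs the action

    def explain(self) -> str:
        """Render the intention as one explanatory sentence (hypothetical wording)."""
        return (f"{self.author} performs '{self.action}' for {self.addressee}: "
                f"goal = {self.goal}; means = {self.means}; reason = {self.reason}.")


# Hypothetical example from a physics exercise.
example = Intention(
    author="system",
    addressee="learner",
    action="substitute v = d / t into EQ1",
    goal="reach the final equation",
    means="Substitute",
    reason="definition of average speed",
)
print(example.explain())
```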

The model we propose here takes into account

the context of actions being performed. Each action

achieved by the planner is contextualized, i.e. we

consider what one may call the intention of the

action. The intention of an operator of the planner is

a set of knowledge representing the goal of the

action, the means used to perform the action and the

reasons that justify the action. This knowledge

depends on the context of the action, so for any

action performed to solve a problem, there is an

intention that could be considered as the explanation

of this contextual action. The whole explanation of

the solution is considered as the set of explanations

of the actions performed to attain the final solution.

In previous work (Tazi, 2001) we have developed

the model of Intentional structures that we recall

briefly here.

Pero generates the knowledge describing what we call the intention of the action (i.e. the goal, the means and the reasons for the action). This knowledge comes from the solution graph. The whole explanation destined for the student is the concatenation of all the intentions of the actions belonging to the solution path.

In order to illustrate this model, the following is a sketch of how the solution is proved and the explanation is generated.

Let EQ1 be the initial state and EQ2 be the final state (EQ1 is the first equation, which leads to the second equation, EQ2, after a certain number of substitutions and/or calculations). When the system passes from one state S1 to the following one (S2), it concatenates the intention of the action that leads from S1 to S2.

ICEIS 2008 - International Conference on Enterprise Information Systems


For the global problem, the intention is defined as:

1. Goal: try all possible combinations of substitutions and calculations to find the solution;

2. Means: the operators used in the actions;

3. Reason: the set of theorems, laws or functions that trigger the operators.

For each action, the intention is defined as:

1. Goal: try to find the final state from the current state;

2. Means: the operators used to perform the action (e.g. Substitute, Calculate, Derive, etc.);

3. Reason: the arguments that trigger and justify the action; these can be theorems, laws, lemmas, functions, etc.
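The way the explanation is built up along the solution path can be sketched as follows. The sketch assumes that the path is already known and that each arc carries its (goal, means, reason) triple; the function name, the intermediate state name and the example reasons are hypothetical and do not come from Pero itself.

```python
def explain_path(path):
    """Concatenate the intentions of the actions along a solution path.

    path: list of (state_from, state_to, (goal, means, reason)) tuples,
    ordered from the initial state to the final state.
    """
    lines = []
    for src, dst, (goal, means, reason) in path:
        lines.append(f"{src} -> {dst}: goal: {goal}; means: {means}; reason: {reason}")
    return "\n".join(lines)


# Hypothetical two-step path from the initial equation EQ1 to the final equation EQ2.
path = [
    ("EQ1", "S2", ("find the final state from the current state", "Substitute", "Ohm's law")),
    ("S2", "EQ2", ("find the final state from the current state", "Calculate", "arithmetic")),
]
explanation = explain_path(path)
print(explanation)
```

Each printed line corresponds to one arc of the resolution graph, so the whole text is the concatenation of the per-action intentions.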

[Figure 1 (diagram): a chain of states S1, S2, ..., Sn-1, Sn leading from the initial state (assumption) to the final state (conclusion). Each action i carries an intention i (Goal i, Means i, Reasons i); the action of the global problem carries the intention of the global problem (G, M, R). Together they form the explanation.]

Figure 1: Explanation process with intention recognition.

In the following section we present our RICAD system and the different stages of the search for intentions.

3.2 RICAD System

RICAD is dedicated to information retrieval in textual corpora (Figure 2). It is based on algorithms that facilitate the search for intentions, and its principle of operation is close to that of a domain expert. At the beginning, after a manual segmentation of the text, it calls several tools: TreeTagger, to extract the verbs of each segment; WordNet, to find the synonyms of the verbs belonging to the same segment in order to minimize the set of verbs; and a knowledge base containing the intentions, to identify the intentional verbs of the segment.
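The chain of tools described above can be sketched on toy data. The sketch does not call TreeTagger or WordNet: it assumes the segment has already been tagged (the tags below are illustrative, and only the convention that verb tags start with "V" is relied upon), it uses a small hand-written synonym table as a stand-in for WordNet, and it uses a hypothetical knowledge base mapping verbs to intentions.

```python
# One segment, already tagged as (word, tag, lemma). Hand-written stand-in for tagger output.
tagged_segment = [
    ("We", "PP", "we"), ("propose", "VVP", "propose"), ("a", "DT", "a"),
    ("model", "NN", "model"), ("and", "CC", "and"), ("suggest", "VVP", "suggest"),
    ("an", "DT", "an"), ("algorithm", "NN", "algorithm"), ("that", "WDT", "that"),
    ("runs", "VVZ", "run"), ("quickly", "RB", "quickly"),
]

# Hypothetical synonym table (stand-in for a WordNet lookup): synonym -> canonical verb.
SYNONYM_TO_CANONICAL = {"suggest": "propose", "advance": "propose"}

# Hypothetical knowledge base: intentional verb -> intention.
KNOWLEDGE_BASE = {"propose": "proposal", "compare": "comparison", "define": "definition"}


def extract_verbs(tagged):
    """Keep the lemmas of the tokens whose tag marks a verb."""
    return [lemma for _, tag, lemma in tagged if tag.startswith("V")]


def reduce_synonyms(verbs):
    """Map each verb to its canonical synonym and drop duplicates, keeping order."""
    seen, reduced = set(), []
    for verb in verbs:
        canonical = SYNONYM_TO_CANONICAL.get(verb, verb)
        if canonical not in seen:
            seen.add(canonical)
            reduced.append(canonical)
    return reduced


def intentional_verbs(verbs):
    """Keep only the verbs known to the knowledge base, with their intention."""
    return {verb: KNOWLEDGE_BASE[verb] for verb in verbs if verb in KNOWLEDGE_BASE}


verbs = extract_verbs(tagged_segment)
reduced = reduce_synonyms(verbs)
found = intentional_verbs(reduced)
print(verbs, reduced, found)
```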

RICAD also counts the occurrences of each intentional verb in order to determine the intention of each segment. It can also generate an ontology of document intentions containing all possible intentions.
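A minimal sketch of the counting step, assuming that the intention of a segment is simply the intention whose verbs occur most often in it. The paper does not specify the exact decision rule, so this majority rule, the function name and the verb table are assumptions.

```python
from collections import Counter

# Hypothetical table: intentional verb -> intention.
VERB_INTENTIONS = {"propose": "proposal", "introduce": "proposal", "compare": "comparison"}


def segment_intention(verbs):
    """Return the most frequent intention among the intentional verbs of a segment.

    Returns None when the segment contains no intentional verb. Ties are resolved
    in favour of the intention encountered first.
    """
    counts = Counter(VERB_INTENTIONS[v] for v in verbs if v in VERB_INTENTIONS)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


print(segment_intention(["propose", "compare", "introduce", "use"]))
```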

The RICAD system is based on the following steps. Initialization: this task sets up the resources needed by all subsequent operations. It is the first task launched and is carried out once, before the other tasks start. Initialization allows us to introduce a corpus annotated by an expert and to enrich the knowledge base containing the verbs, their relative and absolute frequencies, and their intentions.
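The initialization step can be sketched as building a verb table from an expert-annotated corpus. The sketch assumes that each annotated segment is given as an intention label with its list of verb lemmas, and that the relative frequency of a verb is its share of all verb occurrences in the corpus; both the input format and this definition are assumptions, as are all names.

```python
from collections import Counter, defaultdict


def build_knowledge_base(annotated_segments):
    """Build verb -> {absolute, relative, intentions} from (intention, verbs) pairs."""
    absolute = Counter()
    by_intention = defaultdict(Counter)
    for intention, verbs in annotated_segments:
        for verb in verbs:
            absolute[verb] += 1
            by_intention[verb][intention] += 1
    total = sum(absolute.values())
    return {
        verb: {
            "absolute": count,                        # number of occurrences in the corpus
            "relative": count / total,                # share of all verb occurrences
            "intentions": dict(by_intention[verb]),   # occurrences per annotated intention
        }
        for verb, count in absolute.items()
    }


# Hypothetical annotated corpus.
corpus = [
    ("proposal", ["propose", "introduce"]),
    ("comparison", ["compare", "propose"]),
    ("proposal", ["propose"]),
]
kb = build_knowledge_base(corpus)
print(kb["propose"])
```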

[Figure 2 (diagram): the expert works through a graphical interface. In the environment of initialization and treatment, a manually segmented textual corpus (initialization) or a new text passes through syntactic analysis (TreeTagger), semantic analysis (WordNet), extraction of intentional verbs, and comparison of the knowledge base verbs with the verbs of the new corpus. The knowledge layer holds the knowledge base (storage of new intentions), the ontology update and the possible intentions of the textual corpus. The result is produced as XML files structuring the ontology.]

Figure 2: RICAD Architecture.

Introduction of a new corpus: this stage allows the reader to submit a new, unsegmented corpus as input to the system, which analyses it in order to segment it according to the authors' intentions.

Syntactic analysis (TreeTagger): the system performs a syntactic analysis of each logical element of the document. RICAD recognizes the sentences and the verbs using the ontology of verbs.

Semantic analysis (WordNet): after the generation of the text files containing the lists of verbs, the semantic analysis uses WordNet to find the synonyms of the verbs within the same segment of the annotated corpus, in order to avoid redundant verbs and, at the same time, to find the other synonyms of these verbs.

Comparison (knowledge base verbs, new corpus verbs): the comparison is made between the most relevant frequencies of the verbs in the knowledge base and those of the verbs of the new document (an estimate of the probability of the verbs that occur with high frequency in the knowledge base). At this stage we obtain a segmentation by sentences, and we then regroup the sentences by intention: all contiguous sentences whose verbs carry the same intention are grouped into the same segment.

Result in XML files: this stage generates an XML file containing the results of the segmentation together with the intentions.
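The last two stages, regrouping contiguous sentences by intention and writing the result as XML, can be sketched as follows. The sketch assumes that each sentence has already been labelled with one intention by the comparison stage. The element and attribute names (`document`, `segment`, `sentence`, `intention`) are hypothetical, since the paper does not give the XML schema.

```python
from itertools import groupby
import xml.etree.ElementTree as ET


def group_by_intention(labelled_sentences):
    """Merge contiguous sentences that share the same intention into one segment."""
    segments = []
    for intention, group in groupby(labelled_sentences, key=lambda pair: pair[1]):
        segments.append((intention, [sentence for sentence, _ in group]))
    return segments


def to_xml(segments):
    """Serialize the segments and their intentions as an XML string."""
    root = ET.Element("document")
    for intention, sentences in segments:
        segment = ET.SubElement(root, "segment", intention=intention)
        for sentence in sentences:
            ET.SubElement(segment, "sentence").text = sentence
    return ET.tostring(root, encoding="unicode")


# Hypothetical output of the comparison stage: (sentence, intention) pairs in document order.
sentences = [
    ("We propose a model.", "proposal"),
    ("We introduce an algorithm.", "proposal"),
    ("We compare it with prior work.", "comparison"),
    ("We propose an extension.", "proposal"),
]
segments = group_by_intention(sentences)
xml_text = to_xml(segments)
print(xml_text)
```

Because only contiguous sentences are merged, the last sentence forms its own segment even though its intention already appeared earlier in the document.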

When RICAD finds new terms (verbs or concepts) in collections of scientific documents that are not yet included in our knowledge base, it adds them and automatically updates the existing knowledge base. These changes may then be incorporated into the RICAD knowledge base.
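A minimal sketch of this update, assuming the knowledge base is a simple mapping from verbs to intentions and that entries already present are left unchanged. The function name and the data are hypothetical.

```python
def update_knowledge_base(kb, observed):
    """Add terms found in new documents that the knowledge base does not yet contain.

    kb: dict mapping verb -> intention (modified in place).
    observed: dict mapping verb -> intention found in the new documents.
    Returns the list of newly added verbs.
    """
    added = []
    for verb, intention in observed.items():
        if verb not in kb:
            kb[verb] = intention
            added.append(verb)
    return added


kb = {"propose": "proposal"}
new_terms = update_knowledge_base(kb, {"propose": "proposal", "contrast": "comparison"})
print(new_terms, kb)
```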


4 CONCLUSIONS

This article presented some related work and several systems developed by our research team that are based on the concept of intention. We used existing structures to restructure document collections in order to solve the problems of information retrieval that arise within these collections. We relied on the concept of intentional structure to establish a semi-automatic system of segmentation according to the author's intentions.

We presented some of our research into the development of tools that use natural language processing to analyse scientific and problem-solving texts and to extract intentional information, together with the different relationships between local and global intentions.

Ontologies are used with a machine-readable knowledge representation language and are exploited through their inference capabilities.

REFERENCES

Albrecht D. W., Zukerman I. and Nicholson A. E. Bayesian Models for Keyhole Plan Recognition in an Adventure Game. User Modeling and User-Adapted Interaction, 8(1-2):5-47, 1998.

Al-Hawamdeh, S."Knowledge management: re-thinking

information management and facing the challenge of

managing tacit knowledge" Information Research,

8(1), paper no. 143 [Available at

http://InformationR.net/ir/8-1/paper143.html], 2002.

Allen J. F. and Perrault R. Analyzing Intention in

Utterances. Artificial Intelligence, 15(3):143-178,

1980.

Al-Tawki, Y. Création par réutilisation de documents

décrits par les intentions de l’auteur. Doctorat de

l'Université de Toulouse 1. 2002.

Bauer M. A Dempster-Shafer Approach to Modeling

Agent Preferences for Plan Recognition. User

Modeling and User-Adapted Interaction, 5(3-4):317-

348, 1995.

Bauer M. Acquisition of User Preferences for Plan

Recognition. Proceedings of the Fifth International

Conference on User Modeling, 1996.

Blythe J. Decision-Theoretic Planning. AI Magazine,

20(2):37-54, 1999.

Bui H. H. A General Model for Online Probabilistic Plan

Recognition. Proceedings of the Eighteenth

International Joint Conference on Artificial

Intelligence, 2003.

Bui H. H., Venkatesh S. and G. West. Policy Recognition

in the Abstract Hidden Markov Model. Journal of

Artificial Intelligence Research, 17:451-499, 2002.

Carberry S. Plan Recognition in Natural Language Dialogue. The MIT Press, 1990.

Charniak E. and Goldman R. A Bayesian Model of Plan

Recognition. Artificial Intelligence, 64(1):53-79, 1993.

Charniak E. and Goldman R. A Semantics for Probabilistic Quantifier-Free First-Order Languages, with Particular Application to Story Understanding. Proceedings of the Eleventh International Joint Conference on Artificial Intelligence, 1989.

Doyle J. Rationality and Its Roles in Reasoning.

Computational Intelligence, 8(2):376-409, 1992.

Elhore, A., Tazi, S. “Pero a planning system for the explanation of problem solving in physics”. Mixed Language Explanation in Learning Environment (MLELE’05) in conjunction with AIED’05, Amsterdam, p. 80-81, 2005.

Elhore, A., Tazi, S. “Explaining and Indexing Solutions for Physics Learning”. IEEE, International Conference CELDA’05, p. 532-538, 2005.

Elhore, A., Tazi, S. “Planning solution for physics learning”. International conference CAPS’05 (http://europia.org/ICHSL05/), Marrakech, Morocco, 2005.

Elhore, A., Tazi, S. “Planning and explaining solutions for physics learning”. IEEE international conference on Machine Intelligence (http://www.acidcaicmi2005.org/), The 2nd ACIDCA-ICMI'2005, Tozeur, Tunisia, 2005.

Elhore, A., Tazi, S. “Apprentissage par observation à travers les intentions du raisonnement”. Conference on Information Technologies, MCSEAI’06, 2006.

Ferguson G. and Allen J. F. TRIPS: An Intelligent

Integrated Problem-solving Assistant. Proceedings of

the Fifteenth National Conference on Artificial

Intelligence, 1998.

Ferguson G., Allen J. F. and Miller B. TRAINS-95: Towards a Mixed-Initiative Planning Assistant. Proceedings of the Third Conference on Artificial Intelligence Planning Systems, 1996.

Geib C. W. and Goldman R. P. Plan Recognition in

Intrusion Detection Systems. Proceedings of the

Second DARPA Information Survivability Conference

and Exposition, 2001.

Goldman R. P., Geib C. W. and Miller C. A. A New Model of Plan Recognition. Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence, 1999.

Haddawy P. and Suwandi M. Decision-Theoretic

Refinement Planning Using Inheritance Abstraction.

Proceedings of the Second International Conference

on Artificial Intelligence Planning, 1994.

Huber M. J., Durfee E. H. and Wellman M. P. The

Automated Mapping of Plans for Plan Recognition.

Proceedings of the Tenth International Conference on

Uncertainty in Artificial Intelligence, 1994.

Kaminka G., Pynadath D. V. and Tambe M. Monitoring

Teams by Overhearing: A Multiagent Plan

Recognition Approach. Journal of Artificial

Intelligence Research, 17:83-135, 2002.


Kautz H. A. and Allen J. F. Generalized Plan Recognition.

Proceedings of the Fifth National Conference on

Artificial Intelligence, 1986.

Mao W. and Gratch J. Decision-Theoretic Approaches to Plan Recognition. ICT Technical Report (http://www.ict.usc.edu/publications/ICT-TR-01-2004.pdf), 2004.

Marsella S. and Gratch J. Modeling Coping Behavior in

Virtual Humans: Don’t Worry, Be Happy.

Proceedings of the Second International Joint

Conference on Autonomous Agents and Multiagent

Systems, 2003.

Pynadath D. V. and Wellman M. P. Accounting for

Context in Plan Recognition, with Application to

Traffic Monitoring. Proceedings of the Eleventh

International Conference on Uncertainty in Artificial

Intelligence, 1995.

Rickel J., Marsella S., Gratch J., Hill R., Traum D. and

Swartout W. Toward a New Generation of Virtual

Humans for Interactive Experiences. IEEE Intelligent

Systems, 17(4):32-38, 2002.

Schmidt C. F., Sridharan N. S. and Goodson J. L. The Plan Recognition Problem: An Intersection of Psychology and Artificial Intelligence. Artificial Intelligence, 11(1-2):45-83, 1978.

Tazi S."Description de documents multimédias, standards

et tendances". dans Imad Saleh (Editeur), Conception

et Réalisation, Hermès, 2004.

Tazi S., Drira K. and Essajidi K. Enabling consistency of communicative intentions in cooperative authoring. 2nd International Conference on Innovative Views of NET Technologies (IVNET'2006), Florianopolis (Brésil), pp. 237-246, 2006.

Tazi S., Drira K. and Essajidi K. Maintien de la cohérence des intentions de communication dans la rédaction coopérative. 9ème Colloque International sur le Document Electronique (CIDE'2006), Fribourg (Suisse), pp. 151-168, 2006.

Wilensky R. Understanding Stories Involving Recurring

Goals. Cognitive Science, 2:235-266, 1978.
