THE USE OF THE NATURAL LANGUAGE UNDERSTANDING
AGENTS WITH CONCEPTUAL MODELS
Olegas Vasilecas, Algirdas Laukaitis
Vilnius Gediminas Technical University, Saulėtekio al. 11, Vilnius – 40, LT-10223 Lithuania
Keywords: Natural language interfaces, conceptual modelling, neural network.
Abstract: This paper presents AI agents for natural language interfaces in the data exploration domain. An
experiment with the IBM natural language toolbox showed that the black box approach leads to
misclassification in this case. The unsatisfactory results triggered the present research, aimed at
improving the user interface with the natural language modality at the architectural and algorithmic
levels. We extend traditional natural language interfaces in the data exploration domain in the following
direction: feedforward neural networks are suggested as concept indexes in the user's natural language
interface. All presented concepts are realized in the open source project JMining Dialog.
1 INTRODUCTION
The corporate data environment is becoming more
and more complex as the amount of information
constantly grows. Since the early 1980s many
efforts have been made to investigate the use of
natural language for information extraction from
database management systems (DBMS). Some
efforts were successful and some commercial
applications emerged, but NLP techniques have
not become widely used for DBMS interfaces. As
mentioned by researchers in (Androutsopoulos,
1995), this is due to the following:
1. Graphical and menu-driven interfaces have
reached a level of sophistication at which a data
analyst can do the job without deep knowledge of a
data querying language (e.g. SQL), while NLP
techniques have not been able to deliver interfaces
of adequate sophistication.
2. Most research reports describe generating only
one data querying script (in most cases a single
SQL statement) from one natural language
sentence. They do not support complex dialog,
which is the most usual case in real life when we
query a data analysis expert.
3. Some systems are commercial products; they
are closed systems that are difficult to extend.
We think that only open source projects can bring
more attention from researchers to the natural
language database interface systems (NLDBIS)
field.
4. In available systems only system
administrators are able to parameterise the system.
We think that recent advances in building personal
assistants, in fields such as adaptive information
search on the internet or personalized learning
knowledge maps, will help to renew researchers'
interest in the NLDBIS field.
To respond to these challenges, the system
JMining Dialog (Laukaitis, 2005) was developed. It
uses a dialog rather than a single sentence and is
a constituent part of the open source information
delivery web portal JMining (Laukaitis, 2005). The
suggested solution presents an agent architecture
consisting of a set of asynchronously operating
agents. This architecture enables us to perform
sophisticated data and interaction analysis without
losing the short response time essential for
interactive real-time operation. Several
well-established Java toolboxes were used in the
system. For text pre-processing we applied GATE
(Cunningham, 2000), a general natural language
architecture and toolbox, and WordNet (Miller,
1985), a dictionary of the English language.
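The idea of asynchronously operating agents can be
illustrated with a minimal Python sketch. It is not
taken from JMining Dialog (which is built on Java
toolboxes): the agent names, the delays and the
function handle are hypothetical and only show how a
fast agent can answer while a slower analysis is
still running.

```python
import asyncio


async def dialog_agent(query):
    """Fast agent (hypothetical): acknowledges the query almost at once."""
    await asyncio.sleep(0.01)  # stands in for lightweight dialog handling
    return f"dialog: looking into '{query}'"


async def analysis_agent(query):
    """Slow agent (hypothetical): stands in for heavier data analysis."""
    await asyncio.sleep(0.05)  # stands in for a long-running computation
    return f"analysis: finished '{query}'"


async def handle(query):
    """Run both agents concurrently; collect replies in completion order."""
    tasks = [
        asyncio.create_task(dialog_agent(query)),
        asyncio.create_task(analysis_agent(query)),
    ]
    replies = []
    for finished in asyncio.as_completed(tasks):
        replies.append(await finished)
    return replies


replies = asyncio.run(handle("show all arrangements"))
for reply in replies:
    print(reply)
```

Because the agents do not wait for each other, the
reply of the dialog agent is available before the
slower analysis completes, which is the property the
architecture relies on for short response times.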
The contribution of this paper is as follows.
First, the conceptual model used in the
experiments is described. Next, the experiment with
the IBM natural language understanding solution
WebSphere Voice Server, considered as a black box
approach to natural language supporting systems, is
presented. The problems encountered stimulated
research into the use of a hybrid neural network
for natural language understanding. The main idea
behind this new proposal is to simulate a neural
network architecture by an ontological knowledge
base structure.
Vasilecas O. and Laukaitis A. (2006).
THE USE OF THE NATURAL LANGUAGE UNDERSTANDING AGENTS WITH CONCEPTUAL MODELS.
In Proceedings of the Eighth International Conference on Enterprise Information Systems - AIDSS, pages 308-311
DOI: 10.5220/0002446203080311
Copyright © SciTePress
2 CONCEPTUAL MODEL
DRIVEN NLU
UNDERSTANDING
Ambiguity and vagueness raise many problems in
developing information systems (IS). In business
applications, ambiguity and vagueness arise because
at all stages of the IS lifecycle (analysis, design,
testing, etc.) natural language is an essential part
of communication between the people involved in
business activities (policies, regulations, laws,
etc.). Conceptual data-centric modelling can be an
effective tool for eliminating ambiguity and
vagueness from IS business applications. It can
extend the analyst's capabilities, enabling
him/her to define business concepts, characteristics,
behaviours, and interactions. However, conceptual
data-centric enterprise models are rarely built, and
few organizations have even tried to use them with
information systems and in business activities. The
problem with conceptual data-centric enterprise
models is that they are difficult to understand.
Their abstract and generic concepts are unfamiliar
to both business users and IS professionals, being
removed from their local organizational contexts.
We have found this in several Baltic and
Scandinavian banks working with the IBM financial
services data model (FSDM) (IBM, 2005), which is a
domain-specific model based on the ideas of experts
from the IBM banking solution centre. To boost
awareness and a project-centric approach, we
integrated the model into the data exploration and
information extraction framework JMining
(Laukaitis, 2005). The model consists of a
high-level strategic classification of domain
classes integrated with particular business
solutions (e.g. Credit Risk Analysis) and logical
and physical data entity-relationship (ER) models.
In the JMining Dialog system the user identifies
concepts by using natural language at all levels of
the conceptual model: the 'A' level identifies nine
data concepts that define the scope of the
enterprise model (involved party, product,
arrangement, event, location, resource item,
condition, classification, business); the 'B' level
contains business concept hierarchies (more than
3000 concepts); the 'A/B' level contains business
solutions (integrating more than 6000 concepts with
more than 50 solutions); and the 'C' level is the
entity-relationship (ER) diagram with about 6000
entities, relationships and attributes.
Figure 1: Part of the conceptual model used to support
the natural language modality in the data querying
system JMining.
Figure 1 shows a small part of the conceptual
model. If the user enters the input "show all
arrangements with the type loan", the system
activates paths in the conceptual model graph with
a different probability for each concept, e.g.:
1) Arrangement (0.59) -> Arrangement Family (0.42)
-> Account Arrangements (0.40) -> Loan
Arrangements (0.14),
2) Arrangement (0.59) -> Arrangement Family (0.42)
-> Arrangements Type (0.25) -> Product
Arrangements Type (0.23), etc.
As we see, the user's natural language input
activates not just one concept but a path in the
conceptual graph. Intelligent agents can then act on
that information: e.g. the agent responsible for SQL
understanding can build SQL statements from the
identified databases, and the agent responsible for
dialog handling can propose several options to the
user and ask the user to specify more accurately
what he/she has in mind.
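The path activation described above can be sketched
as follows. This is a minimal Python illustration,
not the JMining Dialog implementation: the class
Concept, the function activated_paths and the
threshold parameter are assumptions introduced here,
and the activation values are copied from the example
in the text rather than computed.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """A node of the conceptual model graph with an activation score."""
    name: str
    activation: float
    children: list = field(default_factory=list)


def activated_paths(node, threshold=0.1, prefix=()):
    """Yield maximal downward paths whose nodes all score >= threshold."""
    if node.activation < threshold:
        return
    path = prefix + ((node.name, node.activation),)
    active = [c for c in node.children if c.activation >= threshold]
    if not active:
        # No child is active enough, so the path ends at this node.
        yield path
        return
    for child in active:
        yield from activated_paths(child, threshold, path)


# Activation values taken from the example in the text.
model = Concept("Arrangement", 0.59, [
    Concept("Arrangement Family", 0.42, [
        Concept("Account Arrangements", 0.40, [
            Concept("Loan Arrangements", 0.14),
        ]),
        Concept("Arrangements Type", 0.25, [
            Concept("Product Arrangements Type", 0.23),
        ]),
    ]),
])

paths = list(activated_paths(model))
for path in paths:
    print(" -> ".join(f"{name} ({score:.2f})" for name, score in path))
```

With the default threshold the sketch prints the two
paths listed in the text. Raising the threshold to
0.3 leaves a single path that stops at Account
Arrangements, which is the kind of situation in which
a dialog agent could ask the user to be more
specific.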
3 NATURAL LANGUAGE
UNDERSTANDING WITH IBM
NLU TOOLBOX
At the beginning of the research we looked for
state-of-the-art natural language understanding
(NLU) systems available on the market that could be
used as a plug-in to our concept identification
system. We made a preliminary evaluation of
WebSphere Voice Server, which is part of the IBM
WebSphere software platform. From the IBM
presentation (IBM, 2004) it appeared that the system
is primarily intended for the telecommunication
market. It was a challenging task to test it on a
more complex system, e.g. a full conceptual model
for financial services. The IBM NLU system uses
statistically based models which, as IBM claims,
provide more flexibility and robustness than
traditional grammar-based methods. Much of the
algorithm is unknown because the product is
proprietary. In the present research the black box
approach was used: supply the training data,
compile, and test the system's response to newly
arriving data. For statistical learning, sets of
pairs consisting of a concept and the description
of the concept were provided.
The following experiment conducted with the IBM
NLU solution revealed some basic problems with
current state-of-the-art technologies when we
want to apply them beyond ordinary telephony voice
applications. A group of 3 students was instructed
about the above data model. They queried the system
with about 20 questions and tried to identify the
"Involved Party" concept. The number of concepts put
into the IBM NLU model for learning was increased
step by step. At the beginning only the 9 top 'A'
level concepts were considered. In this case the
descriptions of these concepts, extracted from the
original IBM model, served as training data. At the
second stage, the descriptions of child concepts
were added to the training data for these 9 top
parent concepts (see the second row of Table 1).
Next the number of concepts was increased to 50, and
finally 500 concepts with their descriptions were
extracted and put into the IBM NLU statistical
training data. Table 1 shows the results of the
experiment. Classification performance was measured
as the proportion of correctly identified concepts.
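The two training-data stages and the evaluation
measure can be sketched in Python as below. This is
an illustration under stated assumptions, not the
format accepted by the IBM toolbox: the function
names are introduced here, and the miniature
descriptions are hypothetical stand-ins for the
descriptions extracted from the IBM model.

```python
def build_training_pairs(descriptions, children, include_children=False):
    """Return (concept, text) training pairs for the parent concepts.

    descriptions: concept name -> description text
    children:     parent concept name -> list of child concept names
    """
    pairs = []
    for parent in children:
        pairs.append((parent, descriptions[parent]))
        if include_children:
            # Second stage: child descriptions are labelled with the parent.
            for child in children[parent]:
                pairs.append((parent, descriptions[child]))
    return pairs


def proportion_correct(predicted, expected):
    """Proportion of queries whose predicted concept equals the expected one."""
    if not expected:
        raise ValueError("no queries to score")
    hits = sum(p == e for p, e in zip(predicted, expected))
    return hits / len(expected)


# Hypothetical miniature data, for illustration only.
descriptions = {
    "Involved Party": "a person or organization the bank deals with",
    "Customer": "an involved party that holds arrangements",
    "Arrangement": "an agreement between involved parties",
    "Loan Arrangement": "an arrangement under which money is lent",
}
children = {
    "Involved Party": ["Customer"],
    "Arrangement": ["Loan Arrangement"],
}

stage1 = build_training_pairs(descriptions, children)
stage2 = build_training_pairs(descriptions, children, include_children=True)
print(len(stage1), len(stage2))

score = proportion_correct(
    ["Involved Party", "Arrangement", "Arrangement"],
    ["Involved Party", "Involved Party", "Arrangement"],
)
print(round(score, 4))
```

In the second stage every parent concept receives
more text to learn from while the number of target
classes stays the same.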
We were faced with a critical scalability
problem. There were several instances in training
when the system diverged from any reasonable
acceptance level. While it was possible to make the
training successful through manual intervention by
adding more training data, the problem of
divergence remained when the number of concepts
increased up to the full conceptual model. The
present research has shown that descriptive power
for entity identification is lacking when the
training data include only brief descriptions of
the conceptual model entities (as in IBM FDWM).
Table 1: Concepts identification experiment (CN - number
of concepts for identification).
Method                                       CN=9    CN=50   CN=500
1. IBM NLU                                   0.1521  0.0405  0.0152
2. IBM NLU (child nodes descriptions added)  0.3682  0.1726  0.0822
3. Hybrid modular FF NN (NL parsers
   integrated in the network structure)      0.4590  0.2814  0.1874
To increase concept identification accuracy, we
experimented with separate multi-layer feedforward
(MLF) networks, each with one hidden layer. The
novelty of this experiment is that a feedforward
network represents each node (concept) in the
conceptual model. To train the network unit that
represents one node, we suggest providing a
different dictionary for each network. For parent
nodes, the children's training data used in the IBM
NLU experiment were employed. In the presented
architecture each network concentrates on the
identification of one entity, but each network is
connected to other networks representing different
concepts.
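A minimal Python sketch of the "one network per
concept" idea is given below. It is an illustration
only: the class ConceptNetwork, the dictionary sizes
and the input patterns are hypothetical, the weights
are random rather than trained, and the connections
between networks are omitted for brevity.

```python
import math
import random


class ConceptNetwork:
    """One-hidden-layer feedforward network that scores a single concept."""

    def __init__(self, concept, n_inputs, n_hidden=4, seed=0):
        rng = random.Random(seed)
        self.concept = concept
        # Untrained weights, drawn at random only to make the sketch runnable.
        self.w_hidden = [[rng.uniform(-1, 1) for _ in range(n_inputs)]
                         for _ in range(n_hidden)]
        self.w_out = [rng.uniform(-1, 1) for _ in range(n_hidden)]

    @staticmethod
    def _sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    def score(self, pattern):
        """Forward pass: input pattern -> hidden layer -> output in (0, 1)."""
        if len(pattern) != len(self.w_hidden[0]):
            raise ValueError("pattern length must match the dictionary size")
        hidden = [self._sigmoid(sum(w * x for w, x in zip(row, pattern)))
                  for row in self.w_hidden]
        return self._sigmoid(sum(w * h for w, h in zip(self.w_out, hidden)))


# Each concept has its own dictionary, hence its own input size (assumed).
dictionary_sizes = {"Arrangement": 5, "Involved Party": 4, "Product": 6}
networks = {
    name: ConceptNetwork(name, n_inputs=size, seed=i)
    for i, (name, size) in enumerate(dictionary_sizes.items())
}

# The same user input encoded against each network's own dictionary.
patterns = {
    "Arrangement": [1, 0, 0, 1, 0],
    "Involved Party": [0, 0, 1, 0],
    "Product": [0, 1, 0, 0, 0, 0],
}
scores = {name: networks[name].score(patterns[name]) for name in networks}
best = max(scores, key=scores.get)
print(scores, best)
```

Keeping one small network per concept means that a
new concept can be added by training one more
network instead of retraining a single large
classifier over all concepts.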
It has been found that such "weak"
connectionism between separate neural networks can
improve concept identification. First the modular
network was tested without symbolic pre-processing.
In the training process concept maps were
constructed from the training examples. These
concept maps relate each input sentence/phrase to a
specific concept in the problem domain. All patterns
consist of a unipolar representation of the training
sentence or phrase. For example, for the sentence
"Show all my arrangements", the pattern for the
concept arrangement would be: 1 0 0 0 … 0 0 … .
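The unipolar representation can be sketched in
Python as below. The dictionary shown is a
hypothetical five-word example, not the dictionary
used in the experiment, and the function names are
introduced here for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def unipolar_pattern(text, dictionary):
    """Return a 0/1 vector: 1 where the dictionary word occurs in the text."""
    tokens = set(tokenize(text))
    return [1 if word in tokens else 0 for word in dictionary]


# Hypothetical dictionary of the network representing "Arrangement".
arrangement_dictionary = ["arrangements", "loan", "account", "agreement", "type"]

pattern = unipolar_pattern("Show all my arrangements.", arrangement_dictionary)
print(pattern)  # [1, 0, 0, 0, 0]
```

The encoding depends on exact word matches: with
this dictionary the singular form "arrangement"
would not set the first position.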
It has also been found that without symbolic
preprocessing the textual input must match the
network dictionary exactly. This was the main
reason why we decided to improve the performance
of the system by transforming our dictionary input
into the Vector Space Model (VSM). For this purpose,
the methodology presented in (Wermter, 1995), based
on WordNet (Miller, 1985), was used for additional
semantic mapping. Term weighting is a well-known
representation approach in text processing that
transforms a term into a weight vector. For neural
models, this representation plays a key role in
model performance. The most common term-weighting
method is based on the bag-of-words approach,
which ignores the linear ordering of words within
the context and uses basic occurrence information.
In addition, GATE (Cunningham, 2000) was used to
extend the WordNet semantic mapping initially used
by other researchers (Wermter, 1995). With the GATE
toolboxes, natural language processing techniques
such as tagging, parsing, and word sense
disambiguation can be integrated with statistical
word knowledge.
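As an illustration of bag-of-words term weighting
with an additional semantic mapping, the Python
sketch below uses tf-idf, one common weighting
scheme; the text does not state which scheme was
actually used. The SYNONYMS table is a hypothetical
stand-in for a WordNet lookup, and all names are
introduced here.

```python
import math
from collections import Counter

# Hypothetical stand-in for a WordNet-style mapping: word -> canonical term.
SYNONYMS = {
    "loans": "loan",
    "credit": "loan",
    "arrangements": "arrangement",
    "contract": "arrangement",
    "customers": "customer",
    "client": "customer",
}


def terms(text):
    """Bag of words: lowercase, split on whitespace, map synonyms."""
    return [SYNONYMS.get(word, word) for word in text.lower().split()]


def tfidf_vectors(documents):
    """Return (vocabulary, vectors) with one tf-idf vector per document."""
    bags = [Counter(terms(doc)) for doc in documents]
    vocabulary = sorted(set().union(*bags))
    n_docs = len(documents)
    doc_freq = {t: sum(1 for bag in bags if t in bag) for t in vocabulary}
    vectors = []
    for bag in bags:
        total = sum(bag.values())
        vectors.append([
            (bag[t] / total) * math.log(n_docs / doc_freq[t]) if bag[t] else 0.0
            for t in vocabulary
        ])
    return vocabulary, vectors


documents = ["show all loans", "list credit arrangements", "show customers"]
vocabulary, vectors = tfidf_vectors(documents)
print(vocabulary)
for vector in vectors:
    print([round(weight, 3) for weight in vector])
```

Because "loans" and "credit" are mapped to the same
term, the first two inputs share a non-zero
dimension even though they have no surface word in
common, which is what exact dictionary matching
cannot provide.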
Table 1 shows the results of the experiment. Row
3 demonstrates that symbolic natural language
processing combined with the connectionist
paradigm can improve concept prediction accuracy.
REFERENCES
Androutsopoulos, I., Ritchie, G.D., Thanisch, P., 1995.
Natural Language Interfaces to Databases - An
Introduction. Natural Language Engineering, 1(1):29-
81.
Androutsopoulos, I., Ritchie, G.D., Thanisch, P., 1995.
Experience Using TSQL2 in a Natural Language
Interface. In J. Clifford and A. Tuzhilin, editors,
Recent Advances in Temporal Databases -
Proceedings of the International Workshop on
Temporal Databases, Zurich, Switzerland, Workshops
in Computing, pages 113-132. Springer-Verlag,
Berlin.
Cunningham, H., Maynard, D., Bontcheva, K., Tablan, V.,
Wilks, Y., 2000. Experience of using GATE for NLP
R/D. In Proceedings of the Workshop on Using
Toolsets and Architectures To Build NLP Systems
at COLING-2000, Luxembourg.
IBM, 2005. IBM Banking Data Warehouse General
Information Manual. Available from the IBM corporate
site http://www.ibm.com (accessed July 2005).
IBM, 2004. An Introduction to IBM Natural Language
Understanding. An IBM White Paper. Available from
the IBM corporate site http://www.ibm.com (accessed
July 2004).
Laukaitis, A., Vasilecas, O., Berniunas, R., 2005. JMining
- information delivery web portal architecture and
open source implementation. In O. Vasilecas et al.,
editors, Information Systems Development: Advances
in Theory, Practice and Education. Springer.
Laukaitis, A., Vasilecas, O., 2005. An architecture for
natural language dialog applications in data
exploration and presentation domain. ADBIS.
Miller, G.A., 1985. WordNet: A Dictionary Browser,
Proc. 1st Int'l Conf. Information in Data, pp. 25-28.
Wermter, S., 1995. Hybrid Connectionist Natural
Language Processing, Neural Computing Series,
Chapman & Hall.