
 
system. We made an initial evaluation of WebSphere Voice Server, which is part of the IBM WebSphere software platform. From the IBM presentation (IBM, 2004) it appeared that the system is primarily intended for the telecommunications market. It was therefore a challenging task to test it on a more complex system, e.g. a full conceptual model for financial services. The IBM NLU system uses statistically based models, which, as IBM claims, provide more flexibility and robustness than traditional grammar-based methods. Much of the algorithm is unknown because the product is proprietary. In the present research a black-box approach was used: supply the training data, compile, and test the system's response to newly arriving data. For statistical learning, sets of pairs consisting of a concept and a description of that concept were provided.
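For concreteness, the training input can be pictured as a list of (concept, description) pairs. The following sketch only illustrates the format; the labels and descriptions are hypothetical placeholders, not the actual IBM model content.

# Hypothetical (concept, description) pairs used as statistical training data.
training_pairs = [
    ("InvolvedParty", "an individual or organization that takes part "
                      "in a financial arrangement"),
    ("Arrangement",   "a potential or actual agreement between the bank "
                      "and involved parties"),
    ("Product",       "goods or services that can be offered to customers"),
]

for concept, description in training_pairs:
    print(concept, "->", description)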
The following experiment, conducted with the IBM NLU solution, revealed some basic problems with current state-of-the-art technologies when we want to apply them beyond ordinary telephony voice applications. A group of 3 students was instructed about the above data model. They queried the system with about 20 questions and tried to identify the "Involved Party" concept. The number of concepts put into the IBM NLU model for learning was steadily increased. At the beginning, only the 9 top 'A'-level concepts were considered. In this case, descriptions of these concepts extracted from the original IBM model served as training data. At the second stage, the descriptions from child concepts were added to the training data for these 9 top parent concepts (see the second row in the table). Next, the number of concepts was increased to 50, and finally 500 concepts with their descriptions were extracted and put into the IBM NLU statistical training data.
Table 1 shows the results of the experiment. To measure classification error, the proportion of correctly identified concepts was used.
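The reported numbers are thus plain identification accuracy; a minimal sketch of the metric (all identifiers are ours):

def identification_accuracy(predicted, expected):
    # Proportion of test queries whose concept was identified correctly.
    correct = sum(1 for p, e in zip(predicted, expected) if p == e)
    return correct / len(expected)

# e.g. identification_accuracy(["Arrangement", "Product"],
#                              ["Arrangement", "InvolvedParty"])  -> 0.5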
We were faced with a critical scalability problem. There were several instances in training when the system diverged from any reasonable acceptance level. While it was possible to make training succeed through manual intervention by adding more training data, the problem of divergence remained as the number of concepts increased toward the full conceptual model. The present research has shown that there is a lack of descriptive power for entity identification when the training data include only brief descriptions of the conceptual model entities (as in the IBM FDWM).
 
 
Table 1: Concept identification experiment (CN = number of concepts for identification).

                                           CN=9    CN=50   CN=500
1. IBM NLU                                0.1521  0.0405  0.0152
2. IBM NLU (child node descriptions
   added)                                 0.3682  0.1726  0.0822
3. Hybrid modular FF NN (NL parsers
   integrated in the network structure)   0.4590  0.2814  0.1874
 
To increase concept identification accuracy, we experimented with Separate Multi-Layer Feedforward (MLF) networks with one hidden layer. The novelty of this experiment is that a separate feedforward network represents each node (concept) in the conceptual model. To train the network unit representing one node, we provided a different dictionary for each network. For parent nodes, the children's training data used in the IBM NLU experiment was employed. In the presented architecture, each network concentrates on the identification of one entity, but each network is connected with the other networks representing different concepts.
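A minimal sketch of this modular layout is given below, assuming numpy; the network sizes, the form of the inter-network coupling, and all identifiers are our illustrative assumptions rather than the original implementation. Training of the weights (e.g. by backpropagation) is omitted for brevity.

import numpy as np

class ConceptNet:
    # One single-hidden-layer feedforward network per concept,
    # each with its own input dictionary.
    def __init__(self, dictionary, hidden=16, seed=0):
        rng = np.random.default_rng(seed)
        self.index = {w: i for i, w in enumerate(dictionary)}
        self.W1 = rng.normal(0.0, 0.1, (hidden, len(dictionary)))
        self.W2 = rng.normal(0.0, 0.1, (1, hidden))

    def encode(self, text):
        # Unipolar (binary) encoding over this network's own dictionary.
        x = np.zeros(len(self.index))
        for w in text.lower().split():
            if w in self.index:
                x[self.index[w]] = 1.0
        return x

    def score(self, text):
        h = np.tanh(self.W1 @ self.encode(text))
        return 1.0 / (1.0 + np.exp(-(self.W2 @ h)[0]))  # sigmoid output

def identify(nets, relations, text, coupling=0.1):
    # "Weak" connectionism (assumed form): each network's score is
    # blended with the scores of the networks of related concepts.
    raw = {c: net.score(text) for c, net in nets.items()}
    blended = {c: (1.0 - coupling) * s
                  + coupling * np.mean([raw[r] for r in relations.get(c, [c])])
               for c, s in raw.items()}
    return max(blended, key=blended.get)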
It has been found that such "weak" connectionism between separate neural networks can increase concept identification accuracy. First, the modular network was tested without symbolic preprocessing. In the training process, concept maps were constructed from the training examples. These concept maps relate each input sentence/phrase to a specific concept in the problem domain. All patterns consist of a unipolar representation of the training sentence or phrase. For example, the sentence could be: "Show all my arrangements." The pattern for the concept arrangement would then be: 1 0 0 0 … 0 0 …
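Read literally, the single leading 1 suggests a one-hot pattern over the concepts; a plausible reading of the encoding, with hypothetical dictionaries and identifiers of our own, is:

word_dict = ["arrangements", "show", "party", "account"]       # per-network input dictionary (hypothetical)
concept_dict = ["arrangement", "involved_party", "product", "location"]  # concept labels (hypothetical)

def unipolar(tokens, vocab):
    # 1 at every vocabulary position present in the tokens, 0 elsewhere.
    return [1 if v in tokens else 0 for v in vocab]

x = unipolar("show all my arrangements".lower().split(), word_dict)  # input:  [1, 1, 0, 0]
t = unipolar(["arrangement"], concept_dict)                          # target: [1, 0, 0, 0]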
It has also been found that, without symbolic preprocessing, the textual input must accurately match the network dictionary. This was the main reason why we decided to improve the performance of the system by transforming our dictionary input into a Vector Space Model (VSM). For this purpose, the methodology presented in (Wermter, 1995), based on WordNet (Miller, 1985), was used for additional semantic mapping. Term weighting is a well-known representation approach in text processing that transforms a term into a weight vector. For neural models, this representation plays a key role in model performance. The most common term-weighting method is based on the bag-of-words approach, which ignores the linear ordering of words within a sentence.
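To illustrate this kind of term weighting, the sketch below builds bag-of-words TF-IDF vectors; the corpus, tokenization, and exact weighting formula are our assumptions, not necessarily the VSM used in the experiments.

import math
from collections import Counter

corpus = [
    "show all my arrangements",
    "list involved parties for this account",
    "which products does the customer hold",
]  # hypothetical training phrases

vocab = sorted({w for doc in corpus for w in doc.split()})
df = Counter(w for doc in corpus for w in set(doc.split()))

def tfidf_vector(doc):
    # Bag-of-words TF-IDF: word order inside the phrase is ignored.
    tf = Counter(doc.split())
    return [tf[w] * math.log(len(corpus) / df[w]) for w in vocab]

print(tfidf_vector("show my arrangements"))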