C, G, or U. The results are as follows:
$P_G = 0.18$, $P_A = 0.34$, $P_C = 0.27$, $P_U = 0.21$
Probabilities are inserted into the grammar rules by
generating a random variable in the guard section of a
rule, which is the only part that accepts Prolog
predicates. This random variable is then tested against
the probabilities: for instance, if the random variable
in the guard of a rule that assigns nucleotides to
positions known to be paired is less than 0.53, the rule
assigns a GC pair. The average error is estimated to be
about 18%, meaning that 18% of the nucleotides might be
paired with a nucleotide in the wrong position (in the
original structure they might be either unpaired or
paired with a different nucleotide).
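The guard test described above can be illustrated outside
the grammar formalism. The following is a minimal Python
sketch, not the authors' CHR implementation: it assumes the
random variable is uniform on [0, 1) and that the four
nucleotide probabilities are tested as cumulative
thresholds. The function names and the "other" label are
hypothetical; the text gives only the GC threshold for
paired positions.

```python
import random

# Nucleotide probabilities as reported in the text (they sum to 1).
NUCLEOTIDE_PROBS = [("G", 0.18), ("A", 0.34), ("C", 0.27), ("U", 0.21)]

# Threshold quoted in the text for assigning a GC pair at a paired position.
GC_PAIR_THRESHOLD = 0.53


def pick_nucleotide(r):
    """Map a draw r in [0, 1) to a nucleotide using cumulative thresholds."""
    cumulative = 0.0
    for base, prob in NUCLEOTIDE_PROBS:
        cumulative += prob
        if r < cumulative:
            return base
    # Fallback in case floating-point rounding leaves r just above the sum.
    return NUCLEOTIDE_PROBS[-1][0]


def pick_pair(r):
    """Return "GC" when the draw is below the threshold, else "other".

    Assumption: the remaining pair types are not specified in the text,
    so they are lumped together under a hypothetical "other" label.
    """
    return "GC" if r < GC_PAIR_THRESHOLD else "other"


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only to make the demo repeatable
    print([pick_nucleotide(rng.random()) for _ in range(10)])
    print([pick_pair(rng.random()) for _ in range(10)])
```

In the grammar itself the same comparison would sit in the
guard of a rule, so a rule fires only when its draw falls
in the interval assigned to it.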
4.2.2 Illness Diagnosis Agents
In previous work by one of the authors with Alma
Barranco-Mendoza, specialized concept formation rules
were used to represent knowledge for diagnosing diseases
such as lung cancer (Barranco-Mendoza et al., 2004).
Following this work, yet another kind of probabilistic
agent materializes as an additional parameter of each
constraint in the specialized concept formation rules of
our multi-agent system. The application introduced in
this paper aims to aid in the early-stage detection of
some types of cancer, such as lung and oral cancer, which
have a poor prognosis because they are very difficult to
diagnose at an early stage.
Our concept formation methodology assists in the
integration and analysis of multidisciplinary agents
containing genetic and molecular information along with
the radiological, serum and sputum data. In particular,
it provides a diagnosis even when patient information is
incomplete, since not all tests can or will be done on a
given patient at a given time. This is achieved by
relaxing certain properties, so that the analysis
completes even when some information is missing. The list
of violated properties can then serve as a list of
suggested follow-up tests to improve the accuracy of the
diagnosis.
The input concepts include the patient’s age, smoking
history, malignancy history, and radiological, serum and
sputum data. The knowledge store includes the properties
that should be evaluated for each input data element as
well as the relations among them. The diagnosis is given
as a probability of cancer, calculated as a function of
the concepts used in the analysis. The diagnosis also
lists which diagnostic properties were satisfied and
which were not. For example:
const(Prob),age(x,A),history(x,smoker,T),
serum_data(x,marker_type,in_range)<=>
marker(x,marker_type,in_range,P,B),
acceptable(marker(x,marker_type,in_range, P),B),
probability(P,Prob,x, B),
acceptable(probability(P,Prob,x),B)|
possible_lung_cancer(yes,Prob,x).
relax(marker(x,marker_type,in_range,P,B)).
This rule evaluates, for a patient x, whether a specific
biomarker, marker_type, found in the serum data is within
a certain value range for a patient of age A who is a
type T smoker (T depends on the number of cigarettes or
cigars smoked daily). If so, the diagnosis of possible
lung cancer holds with a probability increase of P (where
P is a function of the patient’s age, health history, and
the presence of this particular biomarker). If instead we
relax the requirement that the biomarker be present, the
system can also evaluate patient records that lack this
information and report in the diagnosis listing that it
was not included in the record, which can be valuable as
a recommended follow-up test for that patient.
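The effect of relaxing the biomarker property can be
sketched in a few lines of Python. This is an illustration
under stated assumptions, not the authors' rule engine: the
record layout, the function name evaluate_marker_rule, the
marker name marker_a, and the additive update with fixed
numbers are all hypothetical, since the text states only
that P is a function of age, health history, and the
biomarker.

```python
def evaluate_marker_rule(record, marker_type, prior, increase, relaxed=True):
    """Evaluate one serum-marker property for a patient record.

    record      : dict; record["serum"][marker_type] is True when the marker
                  is in the range of interest, False when it is not, and the
                  key is absent when the test was never done (assumed layout).
    prior       : probability before this property is considered (hypothetical).
    increase    : probability increase P when the property holds (hypothetical).
    relaxed     : when True, a missing marker does not block the diagnosis.

    Returns None if the strict rule cannot fire, otherwise a dict holding the
    probability and the lists of satisfied, unsatisfied, and missing properties.
    """
    in_range = record.get("serum", {}).get(marker_type)  # None if not measured
    result = {"probability": prior, "satisfied": [],
              "not_satisfied": [], "follow_up": []}

    if in_range is None:
        if not relaxed:
            return None  # strict rule needs the marker data to fire
        # Relaxed: finish the analysis and flag the missing test.
        result["follow_up"].append(marker_type)
    elif in_range:
        result["satisfied"].append(marker_type)
        result["probability"] = min(1.0, prior + increase)
    else:
        result["not_satisfied"].append(marker_type)
    return result


if __name__ == "__main__":
    complete = {"age": 61, "smoker_type": "heavy", "serum": {"marker_a": True}}
    incomplete = {"age": 61, "smoker_type": "heavy"}
    print(evaluate_marker_rule(complete, "marker_a", 0.10, 0.25))
    print(evaluate_marker_rule(incomplete, "marker_a", 0.10, 0.25))
    print(evaluate_marker_rule(incomplete, "marker_a", 0.10, 0.25, relaxed=False))
```

With the property relaxed, the incomplete record still
yields a result, and the missing marker appears in the
follow-up list rather than blocking the analysis.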
5 CONCLUSIONS
We have abstracted, from different recent realizations of
the linguistically inspired Concept Formation paradigm, a
multi-agent model for Biological Concept Formation which
can be considered a computational metaphor for the
(biological) mind, with direct executability
implications. Because the approach relies throughout on
Constraint Handling Rules or their grammatical
counterpart, we can integrate human language processing
techniques into it. These techniques are useful for all
types of concept formation, and they also allow a smooth
integration of human language processing agents and of
their interactions with the knowledge base agents.
Another interesting feature of our proposal is its
robustness: because some of the properties involved in
concept formation can be relaxed, useful results are
provided even when not all the information “necessary” to
form the concepts in question is available.
Concept formation rules are applicable to many
other AI and cognitive problems as well, most no-
tably, those involving the need to reason with incom-
plete or incorrect concepts.
In our initial paper (Dahl and Voll, 2004), we argued
that relaxing properties offers a more appealing way to
achieve flexibility than the two main alternatives,
namely probabilities and fuzzy logic. The probabilistic
approach had been discounted as inappropriate for
measuring the meaning of information, although ad-