interface or directly on the telephone, having a
phone conversation with a virtual assistant. For the
moment this is quite hard to do, since considerable
expertise is required to translate needs expressed in
natural language into a set of Forth scripts and
programs written in other languages. Our method makes
it possible to create semantic dependencies in both
explicitly stated expressions and vague ones,
according to the user's geo-information needs.
This paper is organized as follows: the next
section reviews related work in NLP and fuzzy
semantics; we then explain our method and describe
the interface; finally, we present a use case and
highlight the interest of this work.
2 SEMANTIC AND FUZZY LOGIC ANALYSIS
As a first step in discourse analysis, we can use
part-of-speech (PoS) tagging to (try to)
disambiguate words (e.g. “cross” can be a noun, an
adjective or a verb) (Winograd, 1971). However,
although these techniques make it possible to
“understand” sentences without ambiguity in a
closed-domain context, they do not consider any
imprecision or vagueness in the meaning. The first
approaches to deal with this come from Zadeh, who
introduced fuzzy set theory, fuzzy logic and the
concept of linguistic variables in 1965 (Zadeh, 1965).
Fuzzy sets can be employed to integrate vagueness
throughout the relational structure of meaning,
including both the concept of structure and the
reference that a term denotes.
Since 1965, many models have been proposed,
mainly based on empirical or possibility theory,
which handles incomplete information (Zadeh,
1978). More recently, one model seems the most
appropriate in our case: the 2-tuple fuzzy linguistic
model [9], because it deals with words and uses a
simple internal representation of them. Indeed, the
idea is to deal only with words or linguistic
expressions by translating them into a linguistic pair
(s_i, α), where s_i is a triangular-shaped fuzzy set
and α a symbolic translation. If α is positive then
s_i is reinforced; otherwise s_i is weakened. If the
information is perfectly balanced (i.e. the distance
between words is exactly the same), then all the s_i
values are equally distributed on the axis. If it is
not, the s_i values may not be equally distributed on
the axis. This may happen when talking about
distance: for instance, “almost arrived” and “close
to” are closer to each other than “near” and “out of
the route”. That is why another model has been
proposed by the same team to deal with such
information, which they call multi-granular
linguistic information (see Martínez et al., 2010, for
a deeper review of these models).
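As an illustration of the linguistic pair (s_i, α), the following is a minimal sketch of the standard balanced 2-tuple conversion, in which a value β on the axis [0, g] is mapped to the closest term s_i with α = β - i in [-0.5, 0.5). The term list, its ordering and its equal spacing are assumptions made only for the example (the text above notes that distance expressions are in fact often not equally spaced), the function names are hypothetical, and the triangular fuzzy sets themselves are not modelled, only their labels.

```python
from typing import List, Tuple

# Hypothetical ordered term set s_0..s_g (assumed equally spaced for this sketch).
TERMS: List[str] = ["out of the route", "near", "close to", "almost arrived"]


def to_two_tuple(beta: float, terms: List[str]) -> Tuple[str, float]:
    """Map beta in [0, g] to (s_i, alpha), with i the closest index."""
    g = len(terms) - 1
    if not 0 <= beta <= g:
        raise ValueError(f"beta must lie in [0, {g}]")
    # Round half up so that alpha stays in [-0.5, 0.5).
    i = min(g, int(beta + 0.5))
    alpha = beta - i
    return terms[i], alpha


def from_two_tuple(term: str, alpha: float, terms: List[str]) -> float:
    """Inverse mapping: (s_i, alpha) back to the numeric value i + alpha."""
    return terms.index(term) + alpha


if __name__ == "__main__":
    term, alpha = to_two_tuple(2.3, TERMS)
    # A positive alpha reinforces the term, a negative one weakens it.
    print(term, round(alpha, 2))
```

Under these assumptions, a positive α shifts the meaning towards the next term on the axis and a negative α towards the previous one.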
In the next sections we explain the methodology,
with a use case to show the interest of the approach.
3 LINGUISTIC 2-TUPLES MODEL AND OUR NLP APPROACH
In recent papers, it has been shown that, despite its
advantages, the 2-tuple model (or unbalanced
linguistic term sets) does not fit our needs perfectly,
especially when one (or more) linguistic expression
is far away from its next neighbor (Abchir and
Truck, 2011). The new model we propose takes full
advantage of the symbolic translations α, which
become a very important element for generating the
data set.
Our 2-tuples are twofold. Indeed, except for the
first and the last ones of the partition, they are all
composed of two half 2-tuples: an upside and a
downside 2-tuple. The choice of our 2-tuple model is
relevant since the linguistic terms used in the
geolocation context are usually unbalanced.
The methodology we use to deal with
imprecision in natural language is inspired by
part-of-speech (PoS) recognition and tagging
(Pappa, 2009). We simplify the analysis by using
semantic tags, because the context (geolocation
software) is known. Here is an example: “I want to
create an alert when the truck gets very close to the
warehouse” (see below).
<tokens>
<token gram="PRON">I</token>
...
<token gram="NOUN" sem="ALERT">alert</token>
...
<token gram="VERB" sem="ZONE_ENTRY">gets</token>
<token gram="ADV" sem="FUZZY_MODIF_+">very</token>
<token gram="ADJ" sem="DISTANCE">close</token>
</tokens>
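To show how such a tagged sentence might be consumed, here is a minimal sketch that reads the token list with Python's standard XML parser and keeps only the tokens carrying a semantic tag. The function name `semantic_tokens` is hypothetical, and the sample contains only the tokens listed in the example (the elided ones are not reconstructed); this is not the authors' implementation.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Only the tokens shown in the example; elided tokens are left out.
TOKENS_XML = """<tokens>
<token gram="PRON">I</token>
<token gram="NOUN" sem="ALERT">alert</token>
<token gram="VERB" sem="ZONE_ENTRY">gets</token>
<token gram="ADV" sem="FUZZY_MODIF_+">very</token>
<token gram="ADJ" sem="DISTANCE">close</token>
</tokens>"""


def semantic_tokens(xml_text: str) -> List[Tuple[str, str]]:
    """Return (semantic tag, word) pairs for tokens that have a 'sem' attribute."""
    root = ET.fromstring(xml_text)
    return [
        (tok.attrib["sem"], tok.text or "")
        for tok in root.iter("token")
        if "sem" in tok.attrib
    ]


if __name__ == "__main__":
    for sem, word in semantic_tokens(TOKENS_XML):
        print(f"{sem:15s} {word}")
```

Tokens without a `sem` attribute, such as the pronoun, are ignored here, which reflects the idea that only domain-relevant words need semantic tags in a known context.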
A tree is then created using a simplified
tree-adjoining grammar (TAG), where each leaf node
represents the semantic tag of a token from the
lexicon. This grammar describes the components of
a geolocation alert that can be created by the end
user:
ALERT=TYPE,MOBILE,PLACE,NOTIFICATION
TYPE=ZONE_ENTRY|ZONE_EXIT|CORRIDOR
...
PLACE=TOWN|ADDRESS|POI|ZOI
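The rules above can be loaded into a small lookup structure, as in the following sketch. It assumes that `,` separates the components of a sequence and `|` separates alternatives, which is an interpretation of the notation rather than something stated in the text; the helper names `parse_rules` and `category_of` are hypothetical, and only the rules actually listed are included (the elided ones are not filled in).

```python
from typing import Dict, List, Optional, Tuple

# Only the rules shown in the grammar excerpt; elided rules are omitted.
RULES_TEXT = """ALERT=TYPE,MOBILE,PLACE,NOTIFICATION
TYPE=ZONE_ENTRY|ZONE_EXIT|CORRIDOR
PLACE=TOWN|ADDRESS|POI|ZOI"""

Rule = Tuple[str, List[str]]  # ("sequence" | "choice", symbols)


def parse_rules(text: str) -> Dict[str, Rule]:
    """Parse 'HEAD=BODY' lines; '|' is read as a choice, ',' as a sequence."""
    rules: Dict[str, Rule] = {}
    for line in text.splitlines():
        line = line.strip()
        if "=" not in line:
            continue  # skip blank or non-rule lines
        head, body = line.split("=", 1)
        if "|" in body:
            rules[head] = ("choice", body.split("|"))
        else:
            rules[head] = ("sequence", body.split(","))
    return rules


def category_of(tag: str, rules: Dict[str, Rule]) -> Optional[str]:
    """Return the rule head whose alternatives contain the semantic tag, if any."""
    for head, (kind, symbols) in rules.items():
        if kind == "choice" and tag in symbols:
            return head
    return None


if __name__ == "__main__":
    rules = parse_rules(RULES_TEXT)
    print(category_of("ZONE_ENTRY", rules))
```

With such a table, the semantic tag of a leaf token (for instance `ZONE_ENTRY`) can be attached to the alert component it fills (`TYPE`), while tags that match no listed rule are left unresolved.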
Once we defined the lexicon (list of tagged tokens)
IJCCI 2012 - International Joint Conference on Computational Intelligence