Natural Language Interfaces to Databases: Simple Tips Towards Usability

Luísa Coheur, Ana Guimarães and Nuno Mamede

F/INESC-ID Lisboa

Rua Alves Redol, 9, 1000-029 Lisboa, Portugal

Abstract. Natural Language Interfaces to Databases can be an easy way to obtain
information: the user simply writes a question in his/her own language to get the
desired answer. Nevertheless, this kind of application also presents some problems.
Many of them arise because whoever develops the interface does so according to
his/her own idea of usability, which is sometimes far from the real interaction the
interface will have to support. Moreover, even when a question is syntactically
supported, it can be misunderstood and a wrong answer can be provided to the user.
In this paper we present some simple tips intended to minimize these situations.

1 Introduction

During the implementation of JaTeDigo [1, 2], a Natural Language Interface to
Databases (NLIDB) in Portuguese for a cinema database, we had to deal with many
problems related to usability, and we realized that some simple solutions can be
implemented to minimize these problems and their effects. Therefore, in this paper
we focus on some tips intended to make NLIDBs more user-friendly and trustworthy,
improving their usability.

The paper is organized as follows: in Section 2 some related work is presented; in
Section 3 we present some tips towards usability; in Section 4 we evaluate one of
those tips, namely the importance of presenting examples of questions that are
understood by the system, as well as questions that the system is not able to
answer; finally, in Section 5 we present some conclusions and future work.

2 Related Work

Communicating with computers is a long-standing goal of Artificial Intelligence
research. Although the first NLIDBs emerged in the 1970s, they had their golden era
in the 1980s and mid-1990s. Nowadays, NLIDBs are considered a particular case of
question answering (QA) systems. In recent years there have been several attempts
to merge QA systems with dialogue systems, improving system results by allowing
interaction with the user. For instance, HITIQA (High-Quality Interactive Question
Answering) [3] is an interactive question answering system that answers complex
open-domain questions in natural language, such as What has been Russia’s reaction
to U.S. bombing of Kosovo?, and narrows the search space through a clarification
dialogue with the user. Another example is the RITEL (Recherche d’Informations par
TELéphone) project [4]. Its goal is to integrate conversational and spoken
capabilities into information retrieval systems accessed by phone and, in
particular, into QA systems. We also highlight the TV-Guide and BirdQuest projects
[5–7]. In TV-Guide, a multimodal system gives access to public domain information,
namely television programming; the user can formulate a vague question that is then
refined in a dialogue. BirdQuest answers questions about Nordic birds, combining
dialogue capabilities with information extraction.

Coheur L., Guimarães A. and Mamede N. (2008).
Natural Language Interfaces to Databases: Simple Tips Towards Usability.
In Proceedings of the 5th International Workshop on Natural Language Processing and Cognitive Science, pages 147-152. SciTePress.

JaTeDigo follows some of the ideas behind these applications, as we also believe
that interacting with the user, even in a very simple manner that does not require
the development of a true dialogue system, can improve the application’s results.

3 Tips

As said before, many problems arise because whoever develops an NLIDB does so
according to his/her own idea of usability. In the following we present some tips
concerning this problem:

– Before starting the NLIDB implementation, a corpus containing questions that
users would like to ask should be built. This corpus can be used to identify the
questions in which the developer should invest, that is, frequently asked questions
with the same syntax and/or topic, but also to confront developers with inventive
and unusual questions. In the case of JaTeDigo, before starting the development of
the interface, a corpus of around 80 questions was collected from 8 users. At that
point we realized that there was a set of questions we could not answer given the
information we had in the database. For instance, the question Qual o maior êxito
de bilheteira dos últimos 5 anos? (Which was the biggest box office hit of the last
5 years?) could not be answered because the database had no box office information.
Another problem detected in this phase resulted from the fact that questions were
written in Portuguese while the information regarding characters was in English.
Thus, for instance, the question De quem é a voz do burro no “Shrek”? (Whose is the
donkey’s voice in “Shrek”?) could not be answered, because we had no means to
translate burro into donkey.

– Present examples of successful and unsuccessful questions to the user. The
examples obtained in the previous step can be used to show the user which types of
questions he/she may or may not submit.

– A first evaluation should be carried out as early as possible, without fear of
embarrassment, and by as many different users as possible.

– When the interface is in use, if there is no way for the system to perform a safe
disambiguation, it is better to rely on the user to do it. In JaTeDigo, since
sometimes there is no way to disambiguate without risking a wrong choice, we opt to
ask for the user’s opinion. Figure 1 illustrates this disambiguation step for the
question Who directed King Kong?.

– Unnecessary interactions should be avoided. For instance, consider the question
Who plays with Emma Watson in Harry Potter?. There are two actresses named Emma
Watson; nevertheless, only one of them plays in Harry Potter. This ambiguity should
therefore be resolved by the system, as there is no need to ask the user to
disambiguate: the information is all there.

– Identify situations where the user will use the “wrong” words to ask the question
that he/she has in mind, and adapt the system to them. For instance, JaTeDigo was
asked Quem contracena com Hugo Weaving em The Lord of the Rings? (Who plays with
Hugo Weaving in The Lord of the Rings?), and an early version of JaTeDigo answered
Hugo Weaving does not participate in the movie The Lord Of The Rings. Why? Because
none of the movies from the Tolkien trilogy is called exactly The Lord of the Rings
(one of them is, for instance, The Lord of the Rings, the two towers). Besides,
there is an animated movie from 1978 with exactly that name (and Hugo Weaving does
not participate in it). As a result, JaTeDigo understood that the user was asking
about the 1978 movie.

– As the previous step is not always possible, try to minimize the trouble caused
by a wrong answer by providing information that helps the user validate the answer
or understand that the question was misinterpreted. In JaTeDigo, the film’s release
year is provided, as well as the main cast. If JaTeDigo’s answer had been Hugo
Weaving does not participate in the movie The Lord Of The Rings from 1978, the user
would have understood that something was wrong.
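The last two tips can be illustrated with a minimal Python sketch. It is not JaTeDigo's implementation: the Movie class, the MOVIES list, and the functions candidate_movies and co_stars are hypothetical names introduced here, and the data is a small illustrative sample rather than the real cinema database. The sketch assumes that a user's wording may be only the beginning of a title, so it collects every movie whose title starts with that wording, and it always includes the year and main cast in the answer so that a misinterpretation becomes visible.

```python
from dataclasses import dataclass, field


@dataclass
class Movie:
    title: str
    year: int
    cast: list[str] = field(default_factory=list)


# Hypothetical sample data; a real system would query the cinema database.
MOVIES = [
    Movie("The Lord of the Rings", 1978, ["Christopher Guard", "William Squire"]),
    Movie("The Lord of the Rings: The Two Towers", 2002, ["Elijah Wood", "Hugo Weaving"]),
    Movie("The Lord of the Rings: The Return of the King", 2003, ["Elijah Wood", "Hugo Weaving"]),
]


def candidate_movies(title: str, movies: list[Movie]) -> list[Movie]:
    """Return every movie whose title starts with the user's wording,
    instead of keeping only an exact title match."""
    wanted = title.casefold()
    return [m for m in movies if m.title.casefold().startswith(wanted)]


def co_stars(actor: str, title: str, movies: list[Movie]) -> str:
    """Answer 'Who plays with <actor> in <title>?' and expose the
    interpretation (title, year, cast) so the user can validate it."""
    candidates = candidate_movies(title, movies)
    with_actor = [m for m in candidates if actor in m.cast]
    if with_actor:
        # One line per matching movie: full title, year, and the other actors.
        lines = []
        for m in with_actor:
            others = ", ".join(a for a in m.cast if a != actor)
            lines.append(f"{m.title} ({m.year}): {others}")
        return "\n".join(lines)
    if candidates:
        # Negative answer: still say which movie was assumed.
        m = candidates[0]
        return (f"{actor} does not participate in the movie {m.title} "
                f"from {m.year} (main cast: {', '.join(m.cast)}).")
    return f"No movie matching '{title}' was found."


if __name__ == "__main__":
    print(co_stars("Hugo Weaving", "The Lord of the Rings", MOVIES))
```

With this sample data, the question about Hugo Weaving is answered from the trilogy entries rather than from the 1978 movie, and any negative answer names the year of the movie that was assumed.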

Fig. 1. Disambiguation step.
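The two disambiguation tips (ask the user when no safe choice exists, but avoid unnecessary interactions) can be combined in a single resolution step. The following Python sketch is only an illustration under stated assumptions: Person, PEOPLE, CAST, resolve_person, and the ask_user callback are hypothetical names, and the toy data merely mimics the Emma Watson example. The system first tries to resolve the ambiguity with the rest of the question, and only falls back to the user when more than one candidate remains.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    person_id: int
    name: str


# Hypothetical toy data standing in for the cinema database.
PEOPLE = [Person(1, "Emma Watson"), Person(2, "Emma Watson")]
CAST = {"Harry Potter": {1}}  # movie title -> ids of the people in its cast


def resolve_person(name, movie, people, cast, ask_user):
    """Return the person meant by `name`, asking the user only as a last resort.

    `ask_user` is a callback that receives the remaining candidates and
    returns the one chosen by the user (for example, through a web form).
    """
    matches = [p for p in people if p.name == name]
    if len(matches) <= 1:
        # Zero or one candidate: nothing to disambiguate.
        return matches[0] if matches else None
    # Use the rest of the question (the movie) to filter the candidates.
    in_movie = [p for p in matches if p.person_id in cast.get(movie, set())]
    if len(in_movie) == 1:
        return in_movie[0]  # the information is all there
    # Still ambiguous: hand the decision to the user.
    return ask_user(in_movie or matches)


if __name__ == "__main__":
    chosen = resolve_person("Emma Watson", "Harry Potter", PEOPLE, CAST,
                            ask_user=lambda options: options[0])
    print(chosen)
```

Passing the user interaction as a callback keeps the resolution logic independent of the interface, so the same function could serve a web page or a test harness.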

In the following we show some preliminary results of an evaluation concerning one
of these tips: presenting example questions to the user.

4 How Important are Example-questions?

JaTeDigo’s interface is a web page (Figure 2). As in START [8], examples of
successful and unsuccessful questions are presented in order to give the user a
picture of the system’s capabilities and limitations.


From this, although this is a preliminary evaluation, we can say that the user is
influenced by the examples shown (mainly by their topics or syntax) but, apparently,
he/she does not read the presented examples carefully enough to avoid misspellings.
In any case, we can say that it is worth investing in examples in the interface.

5 Conclusions and Future Work

We have presented some tips intended to make NLIDBs more user-friendly and
trustworthy. First, we have highlighted the user’s role: the NLIDB can benefit from
potential users’ feedback during the development process, which makes it possible
to understand the questions that will actually be asked to the system (and not only
what the development team has in mind). The NLIDB can also benefit from user
feedback when the interface is running, for instance for disambiguation purposes.

Secondly, we have presented some tips to increase (or at least not to decrease) the
user’s confidence: the system should try to avoid unnecessary questions and should
provide information in its answers that helps the user understand whether the
question was correctly interpreted. Also, particular situations where it is known
that the user will formulate the question in an “incorrect way” should be
identified.

Moreover, we have presented an experiment intended to show the importance of
guiding the user with successful and unsuccessful examples, and we have shown that
this guidance leads to a considerable increase in successfully answered questions,
although it does not help to avoid misspellings.

A system such as JaTeDigo, like any NLIDB, needs constant improvement. As future
work we intend to continue extending its understanding capabilities and to make it
more robust: if only part of the request is understood, a dialogue with the user
should be established in order to refine the question. Moreover, we intend to
incorporate some of these tips in a QA system.

Acknowledgments

This work was funded by PRIME National Project TECNOVOZ number 03/165.

References

1. Guimarães, R.: Játedigo – uma interface em língua natural para uma base de dados de cinema.
Master’s thesis, Instituto Superior Técnico (2007)

2. Coheur, L., Guimarães, R., Mamede, N.: Supporting named entity recognition and syntactic
analysis with full-text queries. In: Proceedings of the 13th International Conference on
Applications of Natural Language to Information Systems (NLDB 2008), London, Springer-Verlag
(2008)

3. Small, S., Strzalkowski, T., Liu, T., Ryan, S., Salkin, R., Shimizu, N., Kantor, P., Kelly, D.,
Rittman, R., Wacholder, N., Yamrom, B.: HITIQA: Scenario based question answering. In
Harabagiu, S., Lacatusu, F., eds.: HLT-NAACL 2004: Workshop on Pragmatics of Question
Answering, Boston, Massachusetts, USA, Association for Computational Linguistics
(May 2 - May 7 2004) 52–59


4. Rosset, S., Galibert, O., Illouz, G., Max, A.: Interaction et recherche d’information : le projet
Ritel. Traitement Automatique des Langues 46(46-3) (2006)

5. Jönsson, A., Merkel, M.: Some issues in dialogue-based question-answering. In Maybury,
M.T., ed.: New Directions in Question Answering, AAAI Press (2003) 45–48

6. Jönsson, A., Merkel, M.: Extending QA systems to dialogue systems. In: Working Notes from
NoDaLiDa 03, Iceland (2003)

7. Jönsson, A., Andén, F., Degerstedt, L., Flycht-Eriksson, A., Merkel, M., Norberg, S.:
Experiences from combining dialogue system development with information extraction techniques.
In: New Directions in Question Answering. (2004) 153–168

8. Katz, B., Lin, J.: Annotating the semantic web using natural language. In: NLPXML ’02:
Proceedings of the 2nd workshop on NLP and XML, Morristown, NJ, USA, Association for
Computational Linguistics (2002) 1–8
