intelligent humans who appear less than bright
whereas there are some machines that clearly appear
smart.
The examples given suggest that some judges were more susceptible to deception than others, and that some held biased notions of ‘humanlike conversation’. This may have led judges to misclassify hidden interlocutors, even though the judges themselves initiated the conversations and could ask about or discuss whatever they wished: the conversations were ‘unrestricted’. The ‘hidden humans’ were asked not to behave like machines and not to reveal their identity; however, each hidden human interpreted the instruction to act as a ‘foil for the machines’ in their own way.
Not all the invited machines were designed to imitate humans: Elbot, from Artificial Solutions, for example, has a robot personality. However, all are designed to mimic human conversation and to avoid correctly answering mathematical questions, as Turing had suggested. Essentially, the machines are merely trying to respond in the sort of way that a human might.
Whatever the standing of the Turing test in the reader’s mind, we hope the transcripts presented in this paper make evident that it is not a trivial, simple exercise. Indeed, it reveals much about how humans communicate, and how easily human judges can be fooled by their own assumptions and individual ideas about intelligence. These insights can inform the design of intelligent agents, making their conversation more humanlike and building trust between the natural and the artificial conversational agent.
4 CONCLUSIONS
How humans talk in stranger-to-stranger situations suggests general techniques for successful human–intelligent agent interaction, in e-commerce for example. We suggest that intelligent agents ask more questions, not just to improve their conversational ability, but to better understand the human user. We recommend that developers
a) Do not assume knowledge held by human interlocutors;
b) Appreciate that humans cannot always formulate their enquiry clearly;
c) Develop the intelligent agent to probe further, asking more questions to encourage human interlocutors to clarify their needs;
d) Be prepared for mischievous users who will lie to confuse the intelligent agent.
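Recommendations (a)–(c) can be sketched in code. The following is a minimal, hypothetical illustration (the function names and the vague-term heuristic are our own assumptions, not part of the experiments described above) of an e-commerce agent that probes for clarification rather than assuming it has understood an under-specified query:

```python
def needs_clarification(query: str) -> bool:
    """Heuristic: treat very short or pronoun-heavy queries as
    under-specified, since humans cannot always formulate their
    enquiry clearly (recommendation b)."""
    vague_terms = {"it", "that", "something", "stuff", "thing"}
    words = query.lower().split()
    return len(words) < 3 or any(w in vague_terms for w in words)

def respond(query: str) -> str:
    """Ask a follow-up question instead of guessing the user's
    intent (recommendations a and c)."""
    if needs_clarification(query):
        return "Could you tell me a bit more about what you are looking for?"
    return f"Searching our catalogue for: {query}"

print(respond("something nice"))            # agent probes further
print(respond("red running shoes size 9"))  # specific enough to act on
```

A production agent would of course use a richer model of under-specification, but the control flow is the point: when in doubt, ask, so the human interlocutor is encouraged to clarify their needs.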
Lastly, the authors are continuing their Turing test work, following the third event at The Royal Society, London, in June 2014. The results from that experiment are being analysed and will be submitted for peer review.
ACKNOWLEDGEMENTS
The authors thank Harjit Mehroke for Figure 1, and Bletchley Park, UK, the judges and hidden humans, and the developers of the machines that took part in the 2012 experiment.