Last week headlines announced that a computer program, known as Eugene Goostman, had passed the Turing Test at a competition held at the Royal Society of London by University of Reading researchers. It was heralded (by one of the competition organizers) as a milestone in artificial intelligence, with the implication that a computer program had shown some significant amount of intelligence and fooled people into believing it was human after a robust interrogation. Turing originally predicted that by about the year 2000 a computer might be able to hold up in a conversational test as well as a human for five minutes in 30 percent of trials, which added a sense of officialness to the claim that Eugene passed. Critics quickly appeared to call into question whether Eugene would really have fooled anyone in a normal conversation.
Eugene managed to fool the judges in the competition about a third of the time. However, this was achieved by Eugene presenting the persona of a 13-year-old Ukrainian with imperfect English, and the competition was a speed test, with only five minutes to evaluate multiple potential humans or machines at once via computer-relayed chat (you can see examples here). Critics pointed out that this shows no real intellectual achievement: the program instead relies on convincing the judges that the agent is merely confused. They also argued that a longer test would be more informative.
In the 1950 paper “Computing Machinery and Intelligence” Alan Turing asked the question “Can machines think?” He then declared the terms of the question (machine and think) too vague to admit of a good answer and changed it to ask whether some digital computer could successfully play the imitation game. The imitation game he imagined was one where two participants were hidden from the view of a third and conversed by passed notes or some other intermediary device. One of the two hidden participants would imitate a woman, the other would in fact be a woman, and the third participant would have to guess which was which after conversing with them for some time. Turing imagined the computer in place of the man. It is ambiguous how exactly the game would be modified with the change, and some have argued that it makes a difference which way we take the game to be played. Since Turing does not precisely define his test, all subsequent uses are in a sense their own version of a Turing Test. Modern versions of the Turing Test tend to assume that judges will converse with multiple participants, some of whom are computers and others humans, and will have to guess which is which. In any case, the point of redefining the question was that it had, as Turing put it, the advantage of “drawing a fairly sharp line between the physical and the intellectual capacities of a man”. Turing imagined the discussions ranging from physical appearance through mathematics, chess, and poetry writing; every imaginable skill or piece of knowledge might be called upon by the participants. Although put in terms of “thought”, the original question seems to have been meant in the spirit of “can machines possess intelligence” or “can machines engage in intelligent behaviour”. It seems as though Turing was trying to demonstrate to his incredulous audience, by example, what he thought an intelligent machine would look like, more than trying to define thought or intelligence as such.
In some ways Turing anticipated that a machine might succeed at the imitation without showing any intelligence. He notes that “the best strategy for the machine may possibly be something other than imitation of the behaviour of a man”, but he thought this unlikely and outside the scope of the essay, and stipulated that for the purposes of the essay we should assume that the best strategy really was to imitate a man’s behaviour. This suggests that Turing was more concerned with illustrating how future machines might earn the appellation of thinking or intelligent than with devising a strict test for success.
Despite these ambiguities, Turing’s paper is widely cited and created an interest, among both academic AI researchers and a wider public, in the idea of a computer convincing humans that it was human as a test of its intellectual ability. A Google search finds a first instance of “Turing’s Test” in 1959; in 1962 it is noted that “Turing’s Test” has become standard nomenclature in the computer field; and I find an instance of the name shortened to “Turing Test” in 1964. Over the years, some in the popular imagination have transformed the Turing Test into the idea that a computer that can pass the test is an autonomous intellect on par with a human person. Competitions like the one that crowned Eugene have been going on for some time, such as the Loebner Prize, an annual competition since 1991.
The diversity of things covered by the name Turing Test is best illustrated by the most ubiquitous example of a Turing Test. CAPTCHA stands for Completely Automated Public Turing Test To Tell Computers and Humans Apart, and the term was invented in 2000 by Luis von Ahn, Manuel Blum, Nicholas Hopper and John Langford of Carnegie Mellon University. Here the idea is to find a simple one-question test, administered by a computer, that distinguishes humans from currently available computers by taking advantage of a specific skill (such as recognizing distorted text) that humans are good at but current computers find impossible. Passing this test would not require the variety of abilities Turing imagined, but the test serves to deter computer programs that might otherwise spread unwanted advertising in internet forums or do other dubious or nefarious work.
The question of whether machines can think actually has an older pedigree than Turing or the modern computer. An example of this is a 1939 essay in Astounding Science Fiction, “Tools for Brains”, which begins with the line: “CAN machines think? The question keeps coming up every time a new kind of calculating machine is invented…” However, Turing’s imitation game has left an indelible mark on the question.
Eugene is not the first machine that some have claimed passed the Turing Test. In 2011 Cleverbot made similar headlines when it achieved a rating of humanness close to that of humans from a large number of volunteer judges at a tech festival in India. You can chat to a low-powered version of Cleverbot on-line. Perhaps the first machine that some claimed passed the Turing Test was unveiled in an academic paper in 1966. ELIZA was a program intended to explore natural language interfaces by using a limited variety of stock phrases and pattern recognition to imitate a therapy session. If a user mentioned hating their mother, ELIZA would ask whether they hated any other relatives, and so on. Despite the very limited capabilities of the machine, some users claimed an emotional connection with ELIZA and felt that it had a human quality. ELIZA’s creator Joseph Weizenbaum was dismayed by the credulity of users taken in by ELIZA. Programming versions of ELIZA for academic and home computers became a pastime for some through the 70s and 80s.
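To give a feel for how little machinery this kind of stock-phrase pattern matching needs, here is a toy sketch in Python. It is not Weizenbaum’s program or script: the rules, the reply templates and the `respond` function are all hypothetical, made up for illustration only.

```python
import re

# Hypothetical rules for illustration only; none of these patterns or
# replies are taken from Weizenbaum's original ELIZA script.
RULES = [
    (re.compile(r"\bI hate my (\w+)", re.IGNORECASE),
     "Do you hate anyone else in your family besides your {0}?"),
    (re.compile(r"\bI feel (.+)", re.IGNORECASE),
     "Why do you feel {0}?"),
    (re.compile(r"\bmy (mother|father|sister|brother)\b", re.IGNORECASE),
     "Tell me more about your {0}."),
]

# Stock phrase used when no pattern matches.
FALLBACK = "Please go on."


def respond(utterance: str) -> str:
    """Return a canned reply built from the first rule that matches."""
    text = utterance.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            # Echo the captured words back inside the stock phrase.
            return template.format(*(g.lower() for g in match.groups()))
    return FALLBACK


if __name__ == "__main__":
    for line in ["I hate my mother.", "I feel sad today!", "The weather is nice"]:
        print(">", line)
        print(respond(line))
```

Nothing here models meaning: the program only reflects fragments of the user’s own sentence back at them, which is part of why the reactions of some users were so striking.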
In my experience of on-line discussion, if someone brought up the Turing Test as a measure of machine intelligence, some wag would counter by saying that people’s responses to ELIZA showed that the concept was bankrupt. This is a very large leap, but the success of programs like ELIZA has given many people pause. With more rigour, some academics have made influential criticisms of the Turing Test for focusing on behaviour, and some have argued that it is in principle impossible for a digital computer to think. The most prominent such argument is probably John R. Searle’s “Chinese Room argument” (explicated in his paper “Minds, Brains, and Programs”). Searle took up the question “Could a machine think?” and gave an intricate argument that electronic digital computers lack the causal powers to instantiate mental states or a mind, whatever their behaviour. There is a vast literature in philosophy and cognitive science discussing and disputing Searle’s argument.
In 1950 Turing anticipated many objections to his example of an intelligent machine. A pertinent qualification arose around the objection: “May not machines carry out something which ought to be described as thinking but which is very different from what a man does?” He said that this was a strong objection, but that if the machine succeeds in the imitation game, that should obviate the objection. This implies that Turing accepted that an intelligent machine might not have a human mind or human thought processes. In fact he turned to the imitation game as a way to avoid the prejudices of a conventional definition, under which “thinking” may simply be, by definition, a human activity carried out in an idiosyncratic human way, so that one might even be able to say that a computer does not “really” do such things as arithmetic or play chess. Note also that Turing justified his focus on a behavioural criterion because he thought we necessarily evaluate intelligence and understanding in others behaviourally (citing the example of someone being quizzed about a sonnet they wrote, and asked about their use of metaphor, meter and so on, to show that they understood the poem). In a way Turing seemed to think that we are all constantly being subjected to the Turing Test in our daily lives as people try to judge our character as intellectual beings.
The Turing Test is part of Turing’s legacy, and one that, at least in its emphasis on the imitation game, might have surprised Turing. Turing saw the way forward for machine intelligence as on the one hand attempting to program specific intellectual tasks such as chess playing into computers, and on the other hand attempting to create learning machines capable of developing capacities and knowledge like a human child. The past 64 years have seen dramatic successes in programming machines to solve some specific tasks, such as playing a winning chess game. The success in creating learning machines of the type Turing imagined has been far less dramatic and more limited. Also, success in tasks such as chess has not translated into success in producing machines with generally applicable skills in the way Turing might have anticipated.
There will almost certainly be many more headlines announcing another contender as the first computer to pass the Turing Test. When, or if, it will be definitively passed by a computer is a very difficult question. Also, just what that will mean remains obscure.
A good entry-level discussion and survey of the questions around the Turing Test is found in the first part of the CBC Radio Ideas episode Mind and Machine by my friend Dan Falk (the second part is concerned with the implications of artificial intelligence for our present and future).
The Royal Society where the competition was performed is about 39 miles east of Reading (taking University of Reading as the zero point) or 15 megabytes east of Reading in terms of the length of 5-bit tape required to store that much data.
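For anyone who wants to check the tape arithmetic, here is a small sketch in Python. It assumes the common punched-tape pitch of 10 characters per inch, which is my assumption and is not stated above, and it counts a megabyte as one million 8-bit bytes; the function name `tape_megabytes` is just an illustrative label.

```python
INCHES_PER_MILE = 63_360   # 5,280 feet x 12 inches
CHARS_PER_INCH = 10        # assumption: common punched-tape pitch
BITS_PER_CHAR = 5          # 5-bit tape, one character per row


def tape_megabytes(miles: float) -> float:
    """Data held by `miles` of 5-bit tape, in megabytes (10**6 bytes)."""
    chars = miles * INCHES_PER_MILE * CHARS_PER_INCH
    bits = chars * BITS_PER_CHAR
    return bits / 8 / 1_000_000


if __name__ == "__main__":
    print(round(tape_megabytes(39), 1))  # prints 15.4
```

Under those assumptions 39 miles of tape comes out at a little over 15 megabytes, which agrees with the figure quoted.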