Last week headlines announced that a computer program, known as Eugene Goostman, had passed the Turing Test at a competition held at the Royal Society of London and run by University of Reading researchers. One of the competition organizers heralded it as a milestone in artificial intelligence, and the coverage implied that a computer program had shown some significant amount of intelligence and fooled people into believing it was human after a robust interrogation. Turing originally predicted that by the year 2000 a computer might be able to hold up in a conversational test as well as a human for five minutes in 30 percent of trials, which added a sense of officialness to the claim that Eugene passed. Critics quickly appeared to call into question whether Eugene would really have fooled anyone in a normal conversation.
Eugene managed to fool the judges in the competition about a third of the time. However, it achieved this by presenting the persona of a Ukrainian 13-year-old with imperfect English. The competition was also a speed test, with only five minutes to evaluate multiple potential humans or machines at once via computer-relayed chat (you can see examples here). Critics pointed out that this means the program shows no real intellectual achievement and instead relies on convincing the judges that the agent is confused. They also argued that a longer test would be more informative.
In the 1950 paper “Computing Machinery and Intelligence” Alan Turing asked the question “Can machines think?” He then declared the terms of the question (machine and think) too vague to admit of a good answer and changed it to ask whether some digital computer could successfully play the imitation game. The imitation game he imagined was one where two participants hid from the view of a third and conversed by passing notes or through some other intermediary device. One of the two hidden participants would imitate a woman, the other would in fact be a woman, and the third participant would have to guess which was which after conversing with them for some time. Turing imagined the computer in place of the man. It is ambiguous how exactly the game would be modified by this change, and some have argued that it makes a difference which way we take the game to be played. Since Turing did not precisely define his test, all subsequent uses are in a sense their own version of a Turing Test. Modern versions tend to assume that judges will converse with multiple participants, some of whom are computers and others humans, and will have to guess which is which. In any case, the redefinition of the question had the advantage, as Turing put it, of “drawing a fairly sharp line between the physical and the intellectual capacities of a man”. Turing imagined the discussions ranging from physical appearance through mathematics, chess, and poetry writing; every imaginable skill or piece of knowledge might be called upon by the participants. Although put in terms of “thought”, the original question seems to have been meant in the spirit of “can machines possess intelligence” or “can machines engage in intelligent behaviour”. It seems as though Turing was trying to show his incredulous audience, by example, what he thought an intelligent machine would look like, more than he was trying to define thought or intelligence as such.
In some ways Turing anticipated that a machine might succeed at the imitation game without showing any intelligence. He notes that “the best strategy for the machine may possibly be something other than imitation of the behaviour of a man”, but he thought this unlikely and outside the scope of the essay, and stipulated that for the purposes of the essay we should assume that the best strategy really was that of imitating a man’s behaviour. This suggests that Turing was more concerned with illustrating how future machines might earn the appellation of thinking or intelligent than with devising a strict test for success.
Despite these ambiguities, Turing’s paper is widely cited, and it created an interest among both academic AI researchers and the wider public in the idea of a computer convincing humans that it was human as a test of its intellectual ability. A Google search finds a first instance of “Turing’s Test” in 1959. In 1962 it is noted that “Turing’s Test” has become standard nomenclature in the computer field, and I find an instance of the name shortened to “Turing Test” in 1964. Over the years, some in the popular imagination have transformed the Turing Test into the idea that a computer that can pass the test is an autonomous intellect on par with a human person. Competitions like the one that crowned Eugene have been going on for some time, such as the Loebner Prize, an annual competition since 1991.
The diversity of things covered by the name Turing Test is best illustrated by its most ubiquitous example. CAPTCHA stands for Completely Automated Public Turing Test To Tell Computers and Humans Apart, and the term was invented in 2000 by Luis von Ahn, Manuel Blum, Nicholas Hopper and John Langford of Carnegie Mellon University. Here the idea is to find a simple one-question test, administered by a computer, that distinguishes humans from currently available computers by taking advantage of a specific skill (such as recognizing distorted text) that humans are good at but current computers find impossible. Passing this test would not require the variety of abilities Turing imagined, but the test serves to deter computer programs that might otherwise spread unwanted advertising in internet forums or do other dubious or nefarious work.
The question of whether machines can think actually has an older pedigree than Turing or the modern computer. One example is a 1939 essay in Astounding Science Fiction, “Tools for Brains”, which begins with the line: “CAN machines think? The question keeps coming up every time a new kind of calculating machine is invented…” However, Turing’s imitation game has left an indelible mark on the question.…