“Turing” the landscape of the Imitation Game

A chatbot designed to respond (in English) like a 13-year-old Ukrainian boy (with limited English skills) was recently reported to have passed the Turing Test. Many commentators were quick to demonstrate that the ’bot emphatically did not – and cannot – pass any version of the Turing Test that has any meaningful connection to intelligence.

In my view, the chatbot, “Eugene Goostman”, merely entered the wrong competition. If it were to enter a competition in which every entrant was introduced as a 13-year-old from Ukraine, and all entrants had been either programmed or coached to impersonate a 13-year-old from Ukraine, we could more validly compare its results to those of its competitors. In the Reading University competition, however, other entrants were (tacitly) assigned different roles – roles arguably more difficult to pull off, such as “full-grown, educated, native English-speaking person.”

When Alan Turing famously proposed that the Imitation Game be used as the “gold standard” for machine intelligence, he did so with a challenging version of the game in mind – the version that humans play. To be considered decently good at the Imitation Game, a player (say, a man or a computer program charged with imitating a woman) would have to prove indistinguishable from “the genuine article” (a variety of actual women players) a great deal of the time. The perfect “female impersonator” would be able to convince an impartial judge just as often, on average, as a woman can. (Bear in mind that even a genuine woman will be mistaken for a non-woman pretending to be a woman a certain percentage of the time – especially by a judge wary of being duped.)

If a computer program could convince a panel of skeptical judges that it is a person just as often as the average person could do, that program would be a Master Im-person-ator – and indisputably intelligent.
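To make that criterion concrete, here is a minimal sketch – my own illustration, not anything drawn from Turing’s paper or from the Reading competition rules – of how such a contest might be scored. The function names and the 5% tolerance are arbitrary assumptions; the point is simply that the program “passes” only if judges certify it as human about as often as they certify the genuine human confederates, whose own rate falls short of 100 per cent.

```python
# Illustrative sketch only: a toy scoring rule for the pass criterion
# described above. Names and the `margin` tolerance are assumptions,
# not part of any official Turing Test protocol.

def convince_rate(verdicts):
    """Fraction of judging sessions in which the judge said 'human'."""
    return sum(verdicts) / len(verdicts)

def passes_imitation_game(machine_verdicts, human_verdicts, margin=0.05):
    """
    machine_verdicts, human_verdicts: lists of booleans, one per session
    (True = the judge believed they were conversing with a person).
    On this reading, the machine passes only if its convince rate is
    within `margin` of – or better than – the genuine humans' rate.
    Note that the human rate is itself below 1.0: skeptical judges
    sometimes mistake real people for machines.
    """
    return convince_rate(machine_verdicts) >= convince_rate(human_verdicts) - margin

# Hypothetical numbers: humans fooled judges 18 times out of 20,
# the machine only 6 times out of 20 - nowhere near a pass.
humans = [True] * 18 + [False] * 2
machine = [True] * 6 + [False] * 14
print(passes_imitation_game(machine, humans))  # False
```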

That’s a big “if”. The best we can (generously) allow, at present, is that a computer program may now perform adequately well at an extremely limited and “dumbed-down” version of the Turing Test. We might envision the space of all Turing-Related Tests as a plane, of which a few tiny slivers – representing such roles as “ignorant, sassy teenager” and “abusive, paranoid weirdo” – are coloured either yellow or a very faint shade of green, signalling marginal success at best. All other “tiles” in the plane – “full-grown, educated, native English-speaking person”, “woman”, “award-winning journalist”, etc. – are either red or uncoloured: roles at which programs have failed outright, or which they have yet to seriously attempt.