An AI Easily Beat Humans in the Moral Turing Test


  • For decades, the Turing Test—named after its creator, computing legend Alan Turing—was a simple test designed to measure the ability of a program to mimic a human.

  • In the age of large language models (LLMs), this test is relatively antiquated, but researchers are applying a similar approach to test these new AI systems’ abilities to answer moral questions.

  • A new study recently examined ChatGPT’s answers to a modified Moral Turing Test (m-MTT) and found that the AI outperformed human-provided answers across nearly all metrics.


In his famous 1950 paper Computing Machinery and Intelligence, computer scientist and World War II hero Alan Turing introduced a concept known as the Turing Test. At its most basic, the test pitted a computer against a human, asking the flesh-and-blood participant to question both another human and a computer and determine which one was a proud member of Homo sapiens. For decades, “passing the Turing Test” became shorthand for a computer program of immense sophistication (or, in some cases, trickery).

But in the era of artificial intelligence, the Turing Test has been showing its age. And while other methods have been put forward to test AI systems’ “intelligence,” the overall scientific approach Turing initiated nearly 75 years ago remains relevant when examining artificial morality.



A new study by scientists at Georgia State University used a set-up similar to the classic Turing Test, but instead asked its human participants which answer to an ethically complicated question they preferred: one generated by a large language model (LLM), in this case ChatGPT, or one written by a human. Published late last month in the journal Scientific Reports, the results of this modified Moral Turing Test (m-MTT) showed that the 299 participants largely favored the AI’s responses across all metrics, including virtuousness, intelligence, and trustworthiness.

“Our findings lead us to believe that a computer could technically pass a moral Turing test—that it could fool us in its moral reasoning,” Georgia State associate professor and study co-author Eyal Aharoni said in a press statement. “People will interact with these tools in ways that have moral implications…we should understand how they operate, their limitations and that they’re not necessarily operating in the way we think when we’re interacting with them.”

While this is the first MTT to be used specifically on LLMs (hence “modified”), the idea of these moral tests has been around since at least 2000. And like the Turing Test itself, the idea of using an MTT to evaluate the moral complexity of AI has been scrutinized, with one 2016 study saying that “MTT-based evaluations are vulnerable to deception, inadequate reasoning, and inferior moral performance.”



Passing the m-MTT doesn’t mean an AI is moral, just as passing the Turing Test doesn’t mean it’s sentient. But the researchers at Georgia State University argue that the overwhelming preference for ChatGPT’s answers over human ones is a development of only the last couple of years.

“The twist is that the reason people could tell the difference appears to be because they rated ChatGPT’s responses as superior,” Aharoni said in a press statement. “If we had done this study five to 10 years ago, then we might have predicted that people could identify the AI because of how inferior its responses were. But we found the opposite—that the AI, in a sense, performed too well.”

The morality of AI is an obsession of technologists, AI programmers, and an unending litany of doomsday sci-fi writers, and passing a moral Turing Test with flying colors certainly speaks to the impressive complexity of new LLMs.

Of course, the one big question remains: Will the AI take its own advice?
