Artificial intelligence has had its share of ups and downs recently. In what was widely seen as a key milestone for artificial intelligence (AI) researchers, one system beat a former world champion at a mind-bendingly intricate board game. But then, just a week later, a "chatbot" that was designed to learn from its interactions with humans on Twitter had a highly public racist meltdown on the social networking site.
How did this happen, and what does it mean for the dynamic field of AI?
In early March, a Google-made artificial intelligence system beat former world champion Lee Sedol four games to one at Go, an ancient Chinese game that is considered more complex than chess, which was previously used as a benchmark to assess progress in machine intelligence. Before the Google AI's triumph, most experts thought it would be decades before a machine could beat a top-ranked human at Go.
But hot on the heels of this win, Microsoft unveiled an AI system on Twitter called Tay that was designed to mimic a 19-year-old American girl. Twitter users could tweet at Tay, and Microsoft said the AI system would learn from these interactions and eventually become better at communicating with humans. The company was forced to pull the plug on the experiment just 16 hours later, after the chatbot started spouting racist, misogynistic and sexually explicit messages. The company apologized profusely, blaming a "coordinated attack" on "vulnerabilities" and "technical exploits."
Despite Microsoft's use of language that seemed to suggest the system fell victim to hackers, AI expert Bart Selman, a professor of computer science at Cornell University, said the so-called "vulnerability" was that Tay appeared to repeat phrases tweeted at it without any kind of filter. Unsurprisingly, the "lolz" to be had from getting the chatbot to repeat inflammatory phrases were too much for some to resist.
Selman said he is amazed Microsoft didn't build in sufficient safeguards to prevent such an eventuality, but he told Live Science the incident highlights one of modern AI's major weak points: language comprehension.
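To make the missing safeguard concrete, here is a minimal sketch in Python. It is not Microsoft's code, and how Tay actually worked is not public in this detail: the "repeat after me:" trigger, the function names and the placeholder blocklist are all hypothetical, chosen only to contrast an unfiltered echo with a crudely filtered one.

```python
import re

# Hypothetical placeholder terms; a real system would need far more than a word list.
BLOCKED_TERMS = {"slur1", "slur2"}

def naive_reply(message):
    """Echo whatever follows the (assumed) trigger phrase, with no checks."""
    prefix = "repeat after me:"
    if message.lower().startswith(prefix):
        return message[len(prefix):].strip()
    return "Tell me more!"

def guarded_reply(message):
    """Same bot, but refuse to repeat text containing a blocked term."""
    reply = naive_reply(message)
    words = set(re.findall(r"[a-z0-9']+", reply.lower()))
    if words & BLOCKED_TERMS:
        return "I'd rather not repeat that."
    return reply
```

A word list like this only matches surface strings. It has no grasp of what a sentence means, which is the weakness Selman describes.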
AI is very good at parsing text — that is, unraveling the grammatical patterns that underpin language — Selman said, which allows chatbots like Tay to create human-sounding sentences. It's also what powers Google's and Skype's impressive translation services. "But that's a different thing from understanding semantics — the meaning of sentences," he added.
Many of the recent advances in AI technology have been thanks to an approach called deep learning, which at some level mimics the way layers of neurons behave in the brain. Given huge swathes of data, it is very good at finding patterns, which is why many of its greatest successes have been in perceptual tasks like image or speech recognition.
While traditional approaches to machine learning needed to be told what to look for in order to "learn," one of the main advantages of deep learning is that these systems have "automatic feature discovery," according to Shimon Whiteson, an associate professor in the Department of Computer Science at the University of Oxford.
The first layer of the network is optimized to look for very basic features in the data, for instance the edge of objects in an image. This output is then fed to the next layer, which scans for more complex configurations, say squares or circles. This process is repeated up the layers with each one looking for increasingly elaborate features so that by the time the system reaches the higher levels, it is able to use the structures detected by lower layers to identify things like a car or a bicycle.
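The layered idea can be sketched in a few lines of Python. This is an illustration only: the three functions and their rules are hand-written assumptions, whereas a real deep network learns its features from data. What it shows is the flow of information, with each layer consuming only the output of the layer below it.

```python
def edge_layer(image):
    """Layer 1: mark where brightness changes between neighbouring pixels."""
    rows, cols = len(image), len(image[0])
    vertical = [[abs(image[r][c] - image[r][c + 1]) for c in range(cols - 1)]
                for r in range(rows)]
    horizontal = [[abs(image[r][c] - image[r + 1][c]) for c in range(cols)]
                  for r in range(rows - 1)]
    return vertical, horizontal

def shape_layer(edges):
    """Layer 2: summarise the edge maps into coarse shape evidence."""
    vertical, horizontal = edges
    return {
        "vertical_edges": sum(map(sum, vertical)),
        "horizontal_edges": sum(map(sum, horizontal)),
    }

def object_layer(shape):
    """Layer 3: name the object using only what layer 2 reported."""
    has_v = shape["vertical_edges"] > 0
    has_h = shape["horizontal_edges"] > 0
    if has_v and has_h:
        return "box"
    if has_v or has_h:
        return "stripe"
    return "blank"

def classify(image):
    """Pass the raw pixels up through the three layers in order."""
    return object_layer(shape_layer(edge_layer(image)))
```

Calling `classify` on a small grid of 0s and 1s passes the pixels through the edge layer, then the shape layer, then the object layer. No layer looks at the raw image except the first.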
"With deep learning, you can just feed raw data into some big neural network, which is then trained end-to-end," Whiteson told Live Science.
This has led to some superhuman capabilities. Selman said deep-learning systems have been shown to outperform medical specialists at diagnosing disease from MRI scans. Combining the approach with so-called reinforcement learning, in which machines use reward signals to home in on an optimal strategy, has also been successful with tasks where it is possible to build accurate virtual simulations, said Kaheer Suleman, chief technology officer and co-founder of Canadian AI startup Maluuba. Google's AI system, dubbed AlphaGo, became an expert by playing itself millions of times and using this combination of methods to sharpen its skills and develop strategies.
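The reward-signal idea can be illustrated with a toy far simpler than Go. The sketch below is not AlphaGo's method: it uses tabular Q-learning, a textbook reinforcement-learning algorithm, on a hypothetical five-cell corridor in which the only reward comes from reaching the rightmost cell. The environment, parameter values and function names are all assumptions made for illustration.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)    # index 0 = step left, index 1 = step right

def step(state, action_index):
    """Simulated environment: reward 1.0 only on reaching the goal."""
    next_state = min(max(state + ACTIONS[action_index], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def choose(q_row, rng, epsilon):
    """Mostly pick the best-looking action, sometimes explore at random."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    best = max(q_row)
    return rng.choice([a for a, v in enumerate(q_row) if v == best])

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values purely from trial, error and reward."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _episode in range(episodes):
        state = 0
        for _step in range(100):  # cap the length of each episode
            action = choose(q[state], rng, epsilon)
            next_state, reward, done = step(state, action)
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q

def greedy_policy(q):
    """Read off the preferred move in every non-goal cell."""
    return ["right" if row[1] >= row[0] else "left" for row in q[:-1]]
```

After training, `greedy_policy(train())` should favor moving right in every cell, a strategy the program was never told and worked out only from the reward. The approach depends on being able to replay the simulated corridor hundreds of times, which is why accurate simulations matter.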
"The big challenge for AI is in domains where there is no massive collection of labeled data, or where the environment cannot be simulated well," Suleman said. "Language is a great example of such a domain. The internet contains endless text, but nowhere is its "meaning" labeled in some machine-digestible form."
Maluuba is developing algorithms that can read text and answer questions about it, but Suleman said there are several features of language that make this particularly difficult. For one, language is hugely complex — meaning is spread across multiple levels, from words to phrases to sentences. These can be combined in an infinite number of ways and every human uses language differently.
And all language is abstract; words are simply symbols for things in a real world that a machine often can't experience.
"From the perspective of machine learning, the learned system is only as good as the data you provide it," Whiteson said.
Without access to the lifetime of data on the physical world and the wealth of social interactions that a human has accumulated, it’s little surprise Tay didn't understand what, for instance, the Holocaust is, let alone why it's inappropriate to deny it.
Despite these challenges, Maluuba posted a paper last month to arXiv, an online repository for preprint research papers, describing how its system was able to answer multiple-choice questions about unfamiliar text with more than 70 percent accuracy, outperforming other neural network approaches by 15 percent, and even outdoing hand-coded approaches. Maluuba's approach combined deep learning with neural network structures engineered to interact with one another so that their interactions produce a rudimentary form of reasoning. The company is also working on spoken dialogue systems that can learn to engage in natural conversations with humans.
Selman said language-focused AI can be surprisingly powerful for applications where the subject matter is fairly restricted. For instance, he predicts that technical helplines could soon be automated (and some already are, to a degree), as could relatively senior administrative jobs that boil down to routine interactions like updating spreadsheets and sending out formulaic emails.
"Weaknesses are exposed in these uncontrolled, very open-ended settings, which involve multiple aspects of human intelligence but also really understanding other people," Selman said.
But progress is certainly being made on this front, Whiteson said, with Google's self-driving car being a prime example. Sharing the street with humans requires the machine to understand more than just the rules of the road — it also needs to be able to follow unstated social norms and navigate ethical dilemmas when avoiding collisions, he added.
And as advances in AI and robotics result in increasing numbers of machines being used in the real world, the ability to interact with humans is no longer some lofty goal for sci-fi aficionados. Researchers are now searching for new approaches that could help machines not only perceive, but also understand the world around them.
"Deep learning is great, but it's not a silver bullet," Whiteson said. "There are a lot things still missing. And so a natural next step that people are working on is how can we add things to deep learning so that it can do even more."
"Now all of these thorny questions about what it is we want machines to do and how do we make sure they do it are becoming of practical importance so people are starting to focus on them a lot more now.”
Copyright 2016 LiveScience, a Purch company. All rights reserved. This material may not be published, broadcast, rewritten or redistributed.