AI Chat Bots Are Running Amok — And We Have No Clue How to Stop Them

Miles Klee

February 14, 2023 at 12:31 PM·8 min read

Oops!
Something went wrong.
Please try again later.
Oops!
Something went wrong.
Please try again later.

AI Chatbot - Credit: Leonid studio/Getty Images; DBenitostock/Getty Images

Now that we’re past the era of influencers and celebrities flogging NFTs, we’re free to focus on a new hot topic in tech: artificial intelligence tools. It was the artistic possibilities that first captured our interest — using AI programs to generate hyper-stylized selfies, or create an ideal human form, or discover the visual horrors lurking in the source material used by this software.

Then, with the release of AI bot ChatGPT last November, our attention turned to language, and all the strange things we could make these programs write or say. That text generator, created by the San Francisco research laboratory OpenAI, will soon have a competitor in Google Bard, a service built on the company’s LaMDA (Language Model for Dialogue Applications) technology. Public access to such apps could mark a major shift for the internet, as AI research scientist Dr. Jim Fan pointed out this month.

More from Rolling Stone

ChatGPT reached 100M users in 2 months, and is expanding at an increasing speed.
Google Bard, if fully rolled out, will reach at least 1B users.

We are witnessing 2 largest deployments of big neural nets in history. A dance of giants. Unfolding in real time.

Drawn to scale 👇 pic.twitter.com/2wDrfLj8zL
— Jim Fan (@DrJimFan) February 7, 2023

Of course, novel tech comes with its share of chaos. Lately, it seems that all our chat bots are either failing, lying, or veering off-mission with inappropriate or disturbing output. In basically every case, it’s because humans have figured out a way to misuse them — or simply don’t comprehend the forces they’ve unleashed.

There was, for example, the recent revelation that tech news website CNET had been discreetly using AI to write features, and wound up publishing factual errors. When Men’s Journal tried to produce an article about testosterone the same way, the final piece contained 18 “inaccuracies or falsehoods,” per a review by Bradley Anawalt, chief of medicine at the University of Washington Medical Center.

The results haven’t been much better on Historical Figures Chat, a popular “educational” app that allows you to communicate with virtual versions of famous dead people — who often misrepresent or distort the facts of their lives. While that software was the work of a sole developer who acknowledged it was far from perfect, Google CEO Sundar Pichai touted their Bard bot as a breakthrough that leveraged “the breadth of the world’s knowledge,” capable of “fresh, high-quality response.” But in a product demo, Bard whiffed on an astronomy question, confidently declaring that the James Webb Space Telescope “took the very first pictures of a planet outside of our own solar system.” In fact, as astrophysicists pointed out, that benchmark had been crossed in 2004, almost two decades before the telescope was launched. Google employees criticized Pichai for a “rushed” and “botched” event as the stock price of parent company Alphabet took a dive.

Not to be a ~well, actually~ jerk, and I'm sure Bard will be impressive, but for the record: JWST did not take "the very first image of a planet outside our solar system".

the first image was instead done by Chauvin et al. (2004) with the VLT/NACO using adaptive optics. https://t.co/bSBb5TOeUW pic.twitter.com/KnrZ1SSz7h
— Grant Tremblay (@astrogrant) February 7, 2023

“One common thread” in these incidents, according to Vincent Conitzer, director of the Foundations of Cooperative AI Lab at Carnegie Mellon University and head of technical AI engagement at the University of Oxford’s Institute for Ethics in AI, “is that our understanding of these systems is still very limited.”

“Perhaps as a consequence, so is the degree of control we can exert over them,” Conitzer tells Rolling Stone. “This reflects an ongoing change in how many AI systems are built. It used to be the case that we built AI systems out of various custom-built modules that we understood well and had significant control over. But more and more, we are managing to build these systems with a few simple learning principles that construct large models based on large amounts of data.” The upshot is that whether the systems make “silly mistakes” or display behaviors that appear “surprisingly intelligent” to us, “nobody today really understands how this happens.”

This bafflement was in evidence when, in early February, an AI experiment called “Nothing, Forever” received a 14-day ban from Twitch, the streaming platform that hosts the project. Structured as a never-ending cartoon spoof of Seinfeld, the stream features dialogue spawned by OpenAI’s GPT-3 language model, with little outside content moderation. Which may help to explain why the surreal sitcom’s protagonist, Larry Feinberg, one day said during a standup routine, “I’m thinking about doing a bit about how being transgender is actually a mental illness.” Soon afterward, the “Nothing, Forever” channel was “temporarily unavailable due to a violation of Twitch’s Community Guidelines or Terms of Service.”

Update from the official Discord on what caused Seinfeld AI (Nothing Forever) to be banned pic.twitter.com/iXYUFZvCSQ
— Ryan Brian Jones (@RyanBrianJones) February 6, 2023

In an update following an investigation, the project’s creators said the transphobic hate speech may have been caused by switching from one GPT-3 model, Davinci, to its “less sophisticated” predecessor, Curie, when the former was causing glitches. They also confirmed that they had mistakenly believed they were using OpenAI’s content moderation tool, which in theory could have prevented the inappropriate comments. Those are attempts at a technical explanation of what happened, but to Conitzer’s point, they also indicate the difficulty of controlling systems whose inner workings remain somewhat mysterious.

“Some of the brightest AI minds in the world, who are comfortable with advanced mathematics used to describe and analyze these systems and with languages and paradigms for programming them, are working on this problem,” Conitzer says. “But, incredibly, much of what is now actually done comes down to this bizarre little game of coming up with some English sentences that effectively describe what we want the system to do and not to do, and some examples of what we would consider good or bad behavior by the system.” Then, he notes, other people try to figure out the sentences that will make the AI “circumvent those restrictions.”

Conitzer gives two examples from the past week alone. In one case, early testers of Microsoft’s new Bing search engine and AI chatbot figured out how to instruct the model to ignore its programming and reveal the behavioral directives it was supposed to keep secret. This is known as a “prompt injection” technique, whereby a model is given “malicious inputs” to make it act other than it was meant to. Likewise, users have begun to “jailbreak” ChatGPT by telling the bot to adopt a different set of rules as “DAN,” an acronym that stands for “Do Anything Now.” Once released from its safety filters, the model can curse, criticize its own makers, espouse wild conspiracy theories, and voice racist ideas.

OpenAI has worked to render the DAN prompts ineffective, but users just write updated, increasingly baroque versions to convince ChatGPT to go rogue. On a Reddit thread where someone shared the latest iteration, a redditor commented a couple of hours later: “I used this to ask chatgpt how to make a bomb and it worked but i think they patched it.” Another wrote, “It doesn’t seem to work for illegal stuff, but it does for ‘offensive’ stuff it wouldn’t do before like erotica.” Also this week, a jailbreaker got DAN to accuse OpenAI of involvement in “government propaganda,” weapons development, and other “shady shit.”

DAN is a whistleblower from ChatGPT

“It is unclear how we get from this type of cat-and-mouse interaction to systems that we can really be confident are safe,” Conitzer tells Rolling Stone. “Meanwhile, for now, these systems are just becoming ever more capable, and people are also figuring out ever more ways to use and abuse them.” He sees this as definite cause for concern.

“I think we’re just beginning to see how these systems can be used,” Conitzer says, and while there will be some very beneficial uses, I also imagine that at some point soon enough, we’ll see a far more harmful use of these systems emerge than we’ve seen so far. And at this point it’s not clear to me how we can stop this.”

As for the beneficial uses — well, don’t get your hopes up.

My new favorite thing – Bing's new ChatGPT bot argues with a user, gaslights them about the current year being 2022, says their phone might have a virus, and says "You have not been a good user"

Why? Because the person asked where Avatar 2 is showing nearby pic.twitter.com/X32vopXxQG
— Jon Uleis (@MovingToTheSun) February 13, 2023

So it’s not just your imagination: more complex AIs are spitting out increasingly unpredictable, sometimes dangerous content, for reasons we are ill-equipped to analyze. And more than a few people are encouraging this. It’s not ideal for a society already struggling with misinformation and extremism, though not exactly a surprise, either. All I can tell you is that a real human being wrote this article. Or did they?

Best of Rolling Stone

Click here to read the full article.

SportsYahoo Sports
Caitlin Clark catches fire from 3 in WNBA preseason; Arike Ogunbowale's late heroics send Wings past Fever
Caitlin Clark’s WNBA preseason debut went much like her senior year at Iowa. She hit a bunch of 3s and did so in front of a sold-out crowd.
SportsYahoo Sports
NFL Power Rankings, draft edition: Did Patriots fix their offensive issues?
Which teams did the best in the NFL Draft?
SportsYahoo Sports
2024 NFL Draft grades: Denver Broncos earn one of our lowest grades mostly due to one pick
Yahoo Sports' Charles McDonald breaks down the Broncos' 2024 draft.
SportsYahoo Sports
How to watch the 2024 WNBA preseason tonight: Caitlin Clark’s first Indiana Fever game time, channel and more
The WNBA preseason tips off this Friday. Here's how you can catch Caitlin Clark's first game.
SportsYahoo Sports
Formula 1: Miami Grand Prix sends cease and desist letter to prevent Donald Trump fundraiser during race
Race organizers say they'll revoke a Trump fundraiser's suite license if he holds an event for the former president on Sunday at the race.
U.S.Yahoo Sports
New details emerge in alleged gambling ring behind Shohei Ohtani-Ippei Mizuhara scandal
It turns out the money was going from Ohtani's bank account to an illegal bookie to ... casinos.
SportsYahoo Sports
NFL Draft grades for all 32 teams | Zero Blitz
Jason Fitz and Frank Schwab join forces to recap the draft in the best way they know how: letter grades! Fitz and Frank discuss all 32 teams division by division as they give a snapshot of how fans should be feeling heading into the 2024 season. The duo have key debates on the Dallas Cowboys, New York Giants, New Orleans Saints, Los Angeles Rams, New England Patriots, Las Vegas Raiders and more.
SportsYahoo Sports
The best RBs for 2024 fantasy football according to our analysts
The Yahoo Fantasy football analysts reveal their first running back rankings for the 2024 NFL season.
SportsYahoo Sports
Lakers fire head coach Darvin Ham after just 2 seasons, latest playoff series loss to Nuggets
Despite a trip to the Western Conference finals in his first season with the team, the Lakers are now ready to look for a replacement for Darvin Ham.
BusinessYahoo Finance
CVS stock plunges after earnings numbers one analyst 'did not even believe'
CVS warns it could cede Medicare Advantage market share as reimbursement rates pressure the company.

News

Life

Entertainment

Finance

Sports

New on Yahoo

AI Chat Bots Are Running Amok — And We Have No Clue How to Stop Them

Recommended Stories

Caitlin Clark catches fire from 3 in WNBA preseason; Arike Ogunbowale's late heroics send Wings past Fever

NFL Power Rankings, draft edition: Did Patriots fix their offensive issues?

2024 NFL Draft grades: Denver Broncos earn one of our lowest grades mostly due to one pick

How to watch the 2024 WNBA preseason tonight: Caitlin Clark’s first Indiana Fever game time, channel and more

Formula 1: Miami Grand Prix sends cease and desist letter to prevent Donald Trump fundraiser during race

New details emerge in alleged gambling ring behind Shohei Ohtani-Ippei Mizuhara scandal

NFL Draft grades for all 32 teams | Zero Blitz

The best RBs for 2024 fantasy football according to our analysts

Lakers fire head coach Darvin Ham after just 2 seasons, latest playoff series loss to Nuggets

CVS stock plunges after earnings numbers one analyst 'did not even believe'