OpenAI previews voice generator, acknowledging election risks

Artificial intelligence startup OpenAI released a preview Friday of a digital voice generator that it said could produce natural-sounding speech based on a single 15-second audio sample.

The software is called Voice Engine. It’s the latest product to come out of the San Francisco startup that’s also behind the popular chatbot ChatGPT and the image generator DALL-E.

The company said in a blog post that it had tested Voice Engine in an array of possible uses, including reading assistance for children, language translation and voice restoration for cancer patients.

Some social media users reacted by highlighting possible misuses, including fraud aided by unauthorized voice imitation, or deepfakes.

But OpenAI said it was holding off for now on a wider release of the software because of the potential for misuse, including during an election year. It said it first developed the product in late 2022 and had been using it behind the scenes in other products.

“We are taking a cautious and informed approach to a broader release due to the potential for synthetic voice misuse,” the company said in the unsigned post.

“We hope to start a dialogue on the responsible deployment of synthetic voices, and how society can adapt to these new capabilities,” it said. “Based on these conversations and the results of these small scale tests, we will make a more informed decision about whether and how to deploy this technology at scale.”

The 2024 election has already witnessed its first fake voice: a robocall in New Hampshire in January that imitated President Joe Biden. A Democratic operative later said he had commissioned the fake voice using artificial intelligence and the help of a New Orleans street magician.

After that call, the Federal Communications Commission voted unanimously to ban unsolicited AI robocalls.

OpenAI acknowledged the political risks in its blog post.

“We recognize that generating speech that resembles people’s voices has serious risks, which are especially top of mind in an election year,” it said.

The company said it was “engaging with U.S. and international partners from across government, media, entertainment, education, civil society and beyond to ensure we are incorporating their feedback as we build.”

It said its usage policies prohibit impersonation without consent or legal right, and that broad deployment should be accompanied by “voice authentication experiences” to verify that the original speaker knowingly added their voice to the service. It also called for a “no-go voice list” to prevent the creation of voices that are too similar to those of prominent figures.
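OpenAI has not published implementation details, but a no-go voice list could plausibly work by comparing a speaker embedding of an uploaded sample against embeddings of protected voices. The sketch below is a hypothetical illustration of that idea; the embedding size, similarity threshold and `violates_no_go_list` helper are all assumptions, not anything the company has described.

```python
# Hypothetical sketch of a "no-go voice list" check: reject an uploaded
# voice sample if its speaker embedding is too similar to the embedding
# of any protected (e.g., prominent) voice. The embedding model, vector
# size, and threshold are assumptions for illustration only.
import numpy as np

SIMILARITY_THRESHOLD = 0.85  # assumed cutoff; a real system would tune this


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two speaker-embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def violates_no_go_list(sample_embedding: np.ndarray,
                        protected_embeddings: list[np.ndarray]) -> bool:
    """Return True if the sample is too close to any protected voice."""
    return any(
        cosine_similarity(sample_embedding, protected) > SIMILARITY_THRESHOLD
        for protected in protected_embeddings
    )


# Toy usage with random stand-ins for real embeddings.
rng = np.random.default_rng(0)
protected = [rng.standard_normal(256) for _ in range(3)]
upload = protected[0] + 0.1 * rng.standard_normal(256)  # near-copy of a protected voice
print(violates_no_go_list(upload, protected))  # True: upload would be rejected
```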

But finding a way to detect and label AI-generated content has proved difficult for the tech industry. Proposed solutions such as “watermarking” have been easy to remove or bypass.
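One reason watermarks are fragile is that ordinary processing of a file can erase them. The toy example below, a deliberately naive scheme and not any vendor's actual method, hides a watermark in the least significant bits of 16-bit audio samples and shows how a simple requantization, standing in for lossy re-encoding, wipes it out.

```python
# Toy illustration of watermark fragility: a one-bit-per-sample watermark
# hidden in the least significant bits of 16-bit audio survives a lossless
# copy but is destroyed by light requantization (a crude stand-in for
# lossy compression). This is a naive scheme for illustration only.
import numpy as np

rng = np.random.default_rng(1)
audio = rng.integers(-2000, 2000, size=1000, dtype=np.int16)
watermark = rng.integers(0, 2, size=1000, dtype=np.int16)

# Embed: overwrite each sample's least significant bit with a watermark bit.
marked = (audio & ~np.int16(1)) | watermark

# Extract after a lossless copy: every watermark bit survives.
print(((marked & 1) == watermark).mean())  # 1.0

# "Re-encode" by zeroing the two lowest bits: the watermark is gone.
reencoded = (marked >> 2) << 2
print(((reencoded & 1) == watermark).mean())  # ~0.5, i.e., chance level
```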

Geoffrey Miller, an associate professor of psychology at the University of New Mexico, responded to OpenAI on X, asking what it would do about potential misuse by criminals.

“When millions of older adults are defrauded out of billions of dollars by these deepfake voices, will @OpenAI be ready for the tsunami of litigation that follows?” he asked. The company did not immediately reply to him.
