Your conversations with ChatGPT are about to get way more personal.
OpenAI, the creator of ChatGPT, announced yesterday (Monday) that it will be launching new voice and image features for the AI chatbot over the next two weeks.
Those who pay for a ChatGPT Plus subscription, as well as Enterprise users, will soon be able to have back-and-forth voice conversations with ChatGPT. Those using the free version will still be limited to text input. The speech features will include a set of human-like voices created with real voice actors. A new text-to-speech model, paired with the open-source speech recognition system Whisper, will power these life-like conversations.
OpenAI certainly put its best foot forward when it released short samples of what ChatGPT’s new voices sound like reading a poem or a speech. They’re an audible step up from the generic AI voices some websites serve up to (robotically) voice their long-read pieces.
Having trouble finding the right words when talking to ChatGPT? The second big upgrade on the way is image chat. If you momentarily forget that the plastic or metal tips on your running shoes' laces are called aglets, but you urgently need to ask ChatGPT whether they can be replaced, simply snap a photo and send it to the chat. You can discuss multiple images or use the drawing tool to point the AI to the specific part of an image you're referring to.
The processing of the images will be powered by GPT-3.5 and GPT-4 models that can apply their language reasoning skills to different image types such as photographs, screenshots and documents containing both text and images, according to OpenAI.
Purposefully dumbed down
In its announcement about these new features, OpenAI acknowledged that they create the potential for people to impersonate public figures or commit fraud.
“This is why we are using this technology to power a specific use case — voice chat. Voice chat was created with voice actors we have directly worked with,” said OpenAI.
When it comes to image processing, ChatGPT’s ability to analyze and make statements about people in photos has been purposefully limited “since ChatGPT is not always accurate and these systems should respect individuals’ privacy,” the company said.
Voice and image features are being rolled out to ChatGPT Plus and Enterprise users over the next two weeks. Voice will be available to iOS and Android users, provided they opt in. Image features can be used on all platforms.