ChatGPT rolls out voice and image prompts

OpenAI is adding new voice and image capabilities to ChatGPT. 

The new features will allow you to either engage in voice conversations with the chatbot or share images to express your thoughts, rather than relying solely on typing out prompts.

ChatGPT’s new voice and image functionalities will be rolled out to paying subscribers over the next two weeks, with plans to expand these services to the free version of the app “soon after”.

Why we care. ChatGPT’s ability to create synthetic voices and images spells huge creative opportunities. However, it also has some concerning potential uses, such as impersonation and fraud, according to OpenAI. These risks could have huge implications for advertisers.

Speak with ChatGPT

ChatGPT’s new voice functionality allows you to have back-and-forth conversations with it using just your voice.

Getting started. To get started with voice, follow these simple steps:

  • Head to “Settings” in the mobile app.
  • Scroll down and tap “New Features”.
  • Opt into voice conversations.
  • Tap the headphone button located in the top-right corner of the home screen.
  • Choose your preferred voice out of five different voices.

How it works. To use the voice functionality, you press a button and ask your question aloud. ChatGPT transcribes your speech into text, sends it to the large language model, gets a response, converts that response back into speech and then vocalizes the answer. It should be as smooth as having a conversation with Alexa.
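For readers who want to picture that flow, here is a minimal Python sketch of one voice turn: speech to text, text to the language model, and the reply back to speech. It is an illustration only, not OpenAI’s implementation. The functions transcribe, ask_model and synthesize are hypothetical stand-ins that fake each stage with plain strings, and the voice names are placeholders, since only the number of voices (five) is known here.

```python
# Illustrative sketch of the voice flow described above.
# Every function and voice name here is a hypothetical placeholder,
# not part of any OpenAI API.

# Five placeholder voices; the real voice names are not assumed.
VOICES = tuple(f"voice-{i}" for i in range(1, 6))


def transcribe(audio: bytes) -> str:
    """Stand-in for speech-to-text: treats the 'audio' as UTF-8 text."""
    return audio.decode("utf-8")


def ask_model(prompt: str) -> str:
    """Stand-in for the language model: echoes the prompt back."""
    return f"You asked: {prompt}"


def synthesize(text: str, voice: str) -> bytes:
    """Stand-in for text-to-speech: tags the text with the chosen voice."""
    return f"[{voice}] {text}".encode("utf-8")


def voice_turn(audio: bytes, voice: str = "voice-1") -> bytes:
    """Run one spoken question through the three stages in order."""
    if voice not in VOICES:
        raise ValueError(f"unknown voice: {voice}")
    text = transcribe(audio)          # 1. speech -> text
    reply = ask_model(text)           # 2. text -> model response
    return synthesize(reply, voice)   # 3. response -> speech


if __name__ == "__main__":
    spoken = "What should I cook tonight?".encode("utf-8")
    print(voice_turn(spoken, voice="voice-2").decode("utf-8"))
```

In a real app, each stand-in would be replaced by an actual speech recognition, language model or speech synthesis service, while the order of the three steps would stay the same.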

You’ll then be able to choose from five different voices for ChatGPT, although OpenAI believes the model has much greater potential beyond just these options.

Chat about images

You can now show ChatGPT one or more images to communicate a query. You can also focus on a specific part of the image by using the drawing tool in the mobile app.

Getting started. To get started with image prompts, follow these simple steps:

  • Tap the photo button to capture or choose an image.
  • If you’re on iOS or Android, tap the plus button first.
  • You can also discuss multiple images or use the drawing tool to guide the assistant.

How it works. Image search will work in a similar way to Google Lens. You take or upload a photo, then ChatGPT tries to figure out your query and provide an answer.

To clarify your intent, you will also be able to use the app’s drawing tool, as well as speak or type questions to go with your uploaded image.

If ChatGPT provides an answer you’re not happy with, you don’t need to start another search. You can keep talking with the bot and refine the answer as you chat. It’s similar to what Google does with its multimodal search.

What has OpenAI said? An OpenAI spokesperson said in a statement:

  • “Voice and image give you more ways to use ChatGPT in your life. Snap a picture of a landmark while traveling and have a live conversation about what’s interesting about it. When you’re home, snap pictures of your fridge and pantry to figure out what’s for dinner (and ask follow up questions for a step by step recipe).”
  • “After dinner, help your child with a math problem by taking a photo, circling the problem set, and having it share hints with both of you.”
  • “We’re rolling out voice and images in ChatGPT to Plus and Enterprise users over the next two weeks. Voice is coming on iOS and Android (opt-in in your settings) and images will be available on all platforms.”

Deep dive. Read OpenAI’s announcement in full for more information.

