Using ChatGPT with Voices and Images

   
Image from ChatGPT website

A major update to OpenAI's ChatGPT came out a week ago. It lets users interact with the model in more ways than just text, namely by speaking and by sending images. In addition, ChatGPT can respond to the user's voice with its own voice, or produce pictures if requested. Similar features had previously been available through a variety of third-party extensions, but they have finally been brought natively into ChatGPT. The community has anticipated this advancement for over a year.

Users can now talk to ChatGPT through a microphone, and the model responds aloud using text-to-speech technology whose quality has been well received. This essentially means that people can now talk to ChatGPT much as they would talk to a real person. It also makes ChatGPT easier to use for people with certain disabilities. Another advantage is efficiency: people can interact with the model faster and follow its responses more easily with audio on top of text.

Users can also upload images from their devices and ask ChatGPT questions about them. ChatGPT can also respond with images generated by DALL-E 3. Although image input can be unreliable at times, it can be very useful for giving ChatGPT context in situations that the user would find hard to explain fully in words. Like voice input, image input can also increase accessibility.

One interesting accessibility example given by the developers combines both features: a blind person takes a photo of two t-shirts and asks ChatGPT which one is red and which one is blue. Although this is just one example, it is easy to foresee many similar cases where this would be useful.

There is one issue, though. All of these features have been added to GPT-4, and this version of ChatGPT is not free of charge. Today, users have to pay a monthly fee of $20 to access GPT-4 on the ChatGPT website. The default plan only covers GPT-3.5, which has fewer features and does not include voice or images.

For those who do not want to pay that much for GPT-4, whether because they just want to try it out or because they simply don't think it's worth it, there is a way to access GPT-4 features for free: Microsoft's Bing search engine. Although its capabilities and reliability are limited compared to the ChatGPT website, it is a great way to try out the new features and do test runs at no cost. Note that text-to-speech is not available on Bing right now.

I definitely recommend that everyone check out GPT-4. Go toy around with it and see what interesting ideas you can come up with. It is very inspiring.

Thanks for reading!

Oğuz Arslan
