OpenAI releases free GPT-4o that can talk, laugh, sing and see
Keep in mind that it may be able to tell when you are lying
On 13 May, OpenAI announced GPT-4o, a new version of its flagship model that, according to the company, is "a step towards much more natural human-computer interaction". The new neural network accepts any combination of text, audio, and images and generates responses in any of those formats. According to the company, the AI recognises emotions, can be interrupted mid-sentence, and responds to speech almost as quickly as a human (OpenAI cites an average of about 320 milliseconds for audio replies).
Say hello to GPT-4o, our new flagship model that can reason across audio, vision, and text in real time: https://t.co/MYHZB79UqN
- OpenAI (@OpenAI) 13 May 2024
Text and image input rolling out today in API and ChatGPT with voice and video in the coming weeks. pic.twitter.com/uuthKZyzYx
The letter "o" in GPT-4o's name not only makes it read like "40" but also stands for "omni", Latin for "all", a nod to the model handling every input and output format. OpenAI CTO Mira Murati said that GPT-4o brings GPT-4-level artificial intelligence to everyone, including users without a paid subscription.
At the presentation, GPT-4o solved a linear equation written on paper and gave deep-breathing tips simply by listening to a presenter's breathing.
Previous models, GPT-3.5 and GPT-4, could also hold voice conversations, but only through a pipeline: first one neural network transcribed the speech into text, and then another processed that text. GPT-4o instead handles both the sound and the information it carries in a single network. With this approach, OpenAI aims to extract more context from the raw audio, such as the emotional state of the speaker, and the conversation is also much faster because there is no hand-off between models.
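The difference is easiest to see in code. Below is a minimal sketch in Python against OpenAI's v1 SDK: the pipeline function mirrors the old multi-model approach using API calls that exist today (Whisper transcription, a chat completion, text-to-speech), while the single-call function is illustrative only, since audio input was not yet in the public API at the time of the announcement; it assumes the later audio-capable chat model `gpt-4o-audio-preview`, and the function names and file paths are hypothetical.

```python
# Sketch: voice chat as a pipeline (GPT-4 era) vs. one multimodal call (GPT-4o era).
# Assumes the official `openai` v1 Python SDK and OPENAI_API_KEY in the environment.
import base64
from openai import OpenAI

client = OpenAI()

def voice_reply_pipeline(audio_path: str) -> bytes:
    """Old approach: separate models, chained together."""
    # 1. Speech -> text (a dedicated speech-recognition model).
    with open(audio_path, "rb") as f:
        text_in = client.audio.transcriptions.create(model="whisper-1", file=f).text
    # 2. Text -> text (the language model never hears the audio itself).
    reply = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": text_in}],
    ).choices[0].message.content
    # 3. Text -> speech (a third model reads the reply aloud).
    return client.audio.speech.create(model="tts-1", voice="alloy", input=reply).content

def voice_reply_omni(audio_path: str) -> bytes:
    """New approach (illustrative): one model consumes and produces audio directly.
    `gpt-4o-audio-preview` is a later audio-capable API model, an assumption here."""
    with open(audio_path, "rb") as f:
        audio_b64 = base64.b64encode(f.read()).decode()
    resp = client.chat.completions.create(
        model="gpt-4o-audio-preview",
        modalities=["text", "audio"],           # ask for a spoken reply as well
        audio={"voice": "alloy", "format": "wav"},
        messages=[{
            "role": "user",
            "content": [{"type": "input_audio",
                         "input_audio": {"data": audio_b64, "format": "wav"}}],
        }],
    )
    # The spoken reply comes back base64-encoded on the message.
    return base64.b64decode(resp.choices[0].message.audio.data)
```

The practical consequence of the pipeline design is that the language model only ever sees a transcript: laughter, tone, and pauses are discarded before it answers, which is exactly the context the unified model is meant to keep.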