Yesterday, OpenAI announced GPT-4o, a new model that will roll out to the general public over the coming weeks. It offers the premium capabilities of GPT-4 along with an updated web interface. During the announcement, OpenAI CTO Mira Murati demonstrated some of the new model’s features. So let’s take a look.
According to the company, GPT-4o takes “a step towards a more natural human-computer interaction.” The new model can process text, images, and audio, and can assist users across all three. Voice mode now works more smoothly, with faster responses and better understanding. Previously, voice mode relied on a pipeline of three separate models to transcribe speech, process the text, and convert the reply back into speech, which introduced noticeable delays. GPT-4o handles all of these tasks natively in a single model.
By using the camera on your phone, you can share what you see with the model and ask it questions through voice mode. The new model can reportedly respond to voice input in as little as 232 milliseconds, which is comparable to human response times in conversation. The model can also respond in different tones according to the user’s preference. Compared with GPT-4 Turbo, GPT-4o understands non-English languages better and faster, and it can also serve as a translator.
Importantly, GPT-4o will also be available through the API, so developers will be able to build AI applications on the new model’s capabilities.
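To give a sense of what that looks like, here is a minimal sketch of calling GPT-4o through OpenAI’s Python SDK. The helper function and prompt are illustrative, not from the announcement; it assumes the `openai` package is installed and an `OPENAI_API_KEY` environment variable is set.

```python
# Sketch: calling the GPT-4o model via the OpenAI Python SDK.
# Assumes the `openai` package is installed and OPENAI_API_KEY is set.
import os


def build_request(prompt: str) -> dict:
    """Assemble a chat-completion payload targeting GPT-4o."""
    return {
        "model": "gpt-4o",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
    }


if __name__ == "__main__":
    payload = build_request("Translate 'good morning' into French.")
    if os.environ.get("OPENAI_API_KEY"):
        from openai import OpenAI

        client = OpenAI()  # reads OPENAI_API_KEY from the environment
        response = client.chat.completions.create(**payload)
        print(response.choices[0].message.content)
    else:
        # No key available: just show the payload we would send.
        print(payload)
```

The same endpoint shape applies to other SDK languages; only the model name changes from earlier GPT-4 calls.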
Although the new model’s features are available to free users, premium subscribers will get message limits up to five times higher than those on the free tier.
The company also launched a ChatGPT desktop app for Apple’s macOS. The app offers deeper integration with the platform, as OpenAI wants to make it easier for users to fold the tool into their workflows. With a keyboard shortcut (Option + Space), users can quickly bring up a conversation window.