On Monday (May 13), OpenAI announced GPT-4o, a new AI model that can hold realistic voice conversations and interact across text and images, marking the company’s latest move in the AI technology race.
OpenAI researchers demonstrated at a livestream event that the new audio capabilities let users speak to ChatGPT and get responses in real time, as well as interrupt ChatGPT while it is speaking. Both are hallmarks of realistic conversation that AI voice assistants have found challenging.
“It feels like AI from the movies … Talking to a computer has never felt really natural for me; now it does,” OpenAI CEO Sam Altman wrote in a blog post.
Microsoft-backed OpenAI faces growing competition and pressure to expand the user base of ChatGPT, its popular chatbot product that wowed the world with its ability to produce human-like written content and top-notch software code.
At the livestream event, OpenAI researchers showed off ChatGPT’s new voice assistant capabilities. In one demo, ChatGPT used its vision and voice capabilities to talk a researcher through solving a math equation on a sheet of paper.
In another demo, researchers showed the GPT-4o model translating between languages in real time.
The unveiling of OpenAI’s new AI model underscores the fierce competition driving innovation in the AI sector. With the technology evolving rapidly, companies are pushing boundaries to develop more powerful and efficient AI models. The increased competition not only benefits consumers through technological advancements but also encourages collaboration and knowledge sharing among industry players.