The AI world is buzzing! The giant's mysterious new product is here, and is Apple the one "winning"?

After a long and much-anticipated wait, OpenAI has finally released something new.

OpenAI livestreamed its product update at 1:00 a.m. Beijing time on May 14. During the half-hour online launch, Mira Murati, OpenAI's chief technology officer, announced a series of upgrades to GPT-4. The main highlights of the event are as follows:

A new model, GPT-4o, was introduced; the "o" stands for "omni" (comprehensive, all-capable). GPT-4o is also free for all users.

The new model has strong multimodal interaction capabilities. In the demonstration, GPT-4o communicated smoothly with humans through text, images, video and voice, and could read information on the screen.

The ChatGPT desktop application has been released. It is currently available for macOS, and a Windows version will be released later this year.

The AI assistant is beginning to take shape

Before the launch, the reporter noticed that OpenAI's official website had changed its description of GPT-4 from "state-of-the-art model" to "advanced model", apparently in preparation for the release of GPT-4o.

As OpenAI's most advanced model, GPT-4o is distinctive in that it can accept any combination of text, audio and images as input and generate output in any of these modalities. This means that GPT-4o already has the basic shape of an AI assistant, a step forward on the road to artificial general intelligence.

At the event, Murati demonstrated the real-time voice conversation feature together with Mark Chen, OpenAI's head of frontier research, and Barret Zoph, head of the post-training team. Judging from the demonstration, interaction between GPT-4o and humans has become more immediate and natural. According to reports, GPT-4o can respond to audio input in as little as 232 milliseconds, which is close to human response time in conversation. Previously, talking to ChatGPT in voice mode involved an average latency of 2.8 seconds (GPT-3.5) and 5.4 seconds (GPT-4). GPT-4o can not only respond in real time without long, awkward pauses, but also generate speech in different emotional styles.

For example, when asked "How have you been?", GPT-4o not only answered "I'm fine" but also asked "How are you?" in return. When asked to tell a "bedtime story about robots and love", GPT-4o was interrupted shortly after it began and asked to tell the story in a more emotional and dramatic way. Its tone then became more varied and expressive, and it could even end the conversation with a song.

From now on, putting children to bed may become much easier for parents.

Beyond that, GPT-4o also supports combined visual and voice interaction and can read handwritten equations. Zoph started a video call on his phone and said to GPT-4o, "I'm going to write down a linear equation on a piece of paper. Don't tell me the answer, just walk me through the process of solving it." Zoph then wrote down the equation 3x + 1 = 4 and asked how to solve it. Whenever Zoph asked for help or raised a question, GPT-4o suggested the next step through guiding hints until he arrived at the correct result.

From now on, helping children with their homework may become easier for parents too.

In addition, GPT-4o can read on-screen information in real time, helping to answer coding questions and analyze charts. It can translate across languages in real time: when the speakers talked in Italian and English, it translated into the other language without delay and imitated the speaker's tone. It can also recognize and analyze human emotions. When a speaker showed a selfie and asked it to judge the mood in the picture, GPT-4o replied, "You look very happy, maybe a little excited, and you should be in a good mood."

Although OpenAI CEO Sam Altman did not appear at the event, he posted live updates about the launch on his personal social media account. After the event, he published a post containing only the word "her". According to earlier foreign media reports, Altman has said that Her is one of his favorite films about artificial intelligence, and that his ultimate goal is to develop a virtual AI assistant like the one in the film, in an effort to make existing voice assistants such as Apple's Siri more practical and intelligent.

Upstaging Google, showing goodwill to Apple

As early as a week before the event, there was widespread speculation that OpenAI would launch a new product. Some reports said OpenAI would release GPT-5; others said it would soon launch an AI search engine based on ChatGPT to challenge Google. Altman denied the rumors on his personal social media account on May 11, saying: "It's not GPT-5, it's not a search engine, but we've been trying to develop something new that we think people will like! It feels like magic to me!"

It is worth noting that Google is about to hold its I/O developer conference on May 14 to announce updates to Android, Google Search and more. OpenAI chose to hold its launch the day before the I/O developer conference, no doubt because it did not want Google to steal its thunder. This is not the first time such a thing has happened. On February 16 this year, OpenAI released its text-to-video model Sora without any advance publicity, attracting worldwide attention. Google had just upgraded its Gemini Pro model at the time, but the news was overshadowed by Sora's popularity.

Now that OpenAI has thrown down the gauntlet again, the pressure falls squarely on Google, which is about to meet it head-on. According to a Huafu Securities research report, ChatGPT still ranks first in total visits among mainstream overseas AI models. Among other large models, visits to Claude, Perplexity and Character.ai all rose to some extent in April, while visits to Google's Gemini fell 1.4% from the previous month. Google is evidently facing increasingly fierce competition from OpenAI in the large-model race.

By contrast, the behind-the-scenes winner of this launch is Apple. The reporter noticed that the demonstrations were run on an iPhone and a MacBook Pro throughout the event, and that the desktop version of ChatGPT for Mac was also released, which seems to hint at the possibility of OpenAI working with Apple to bring large models to Apple devices.

In fact, signs of this cooperation could already be seen in OpenAI's earlier moves and in media reports. According to those reports, Apple and OpenAI are in talks to finalize an agreement to bring OpenAI's large-model technology to the iPhone this year. Under the deal, Apple would be able to offer a chatbot powered by ChatGPT as part of the artificial intelligence features in iOS 18. However, the reports also noted that Apple has held talks with Google about licensing the Gemini chatbot, but no agreement has been reached.

Altman recently appeared on the "All-In Podcast", where he discussed many hot topics and directions in artificial intelligence. He said OpenAI will continue to improve the quality of its voice features, and that he believes voice interaction may be an important clue to how we will interact in the future. When the host asked whether he had worked with Jony Ive, Apple's former chief designer, known as the "father of the iPhone", Altman said, "Yes, we are exchanging some ideas."

In February, Apple CEO Tim Cook revealed that the company was developing generative AI software features that would bring new Siri capabilities powered by large language models to iOS 18, but he did not say whether Apple would work with OpenAI. Apple is reportedly set to host its Worldwide Developers Conference (WWDC) in June to showcase the latest innovations in iOS, iPadOS, macOS, watchOS, tvOS and visionOS.

Analysts believe that if Apple reaches a partnership with OpenAI, it could both shorten its product development cycle and quickly make its products more intelligent. Whether Apple, which has lagged far behind in the generative AI era, can stage a strong comeback by building the world's leading models into its hardware may also become clear in June.
