Why is it so difficult to develop AI native applications?

AI native applications are proving harder and harder to develop. Why? Let's take a look at the author's view.

AI native applications are proving hard to deliver. After the "war of a hundred models," a group of exhausted entrepreneurs has gradually realized that China's real opportunity lies in the application layer, and that AI native applications are the most fertile soil for the next round.

Li Yanhong, Wang Xiaochuan, Zhou Hongyi, Fu Sheng: review the speeches of these industry leaders over the past few months and every one of them emphasized the huge opportunity at the application layer.

Internet giants keep talking about AI native: Baidu released more than 20 AI native applications in one go; ByteDance set up a new team focused on the application layer; Tencent embedded large models into mini programs; Alibaba wants to rebuild all of its applications with Tongyi Qianwen; WPS is handing out AI trial cards by the fistful...

Startups are even more enthusiastic: a single hackathon can produce nearly 200 AI native projects. This year, organizers including Qiji Chuangtan, Baidu, and Founder Park have held dozens of events covering thousands of projects, yet not a single breakout application has emerged.

We have to face the fact that, although everyone sees the huge opportunity at the application layer, large models have not overturned existing applications; most products have only been through a superficial, "painless" retrofit. China may have the best product managers, but this time they too have "failed."

Nine months have passed since Midjourney took off in April. Why are domestic AI native applications, carrying "the hopes of the whole village," still so hard to deliver?

Choosing well matters more than working hard. At this moment, perhaps we need to step back calmly and find the right "posture" for opening up AI native applications.

1. AI native, not end-to-end

Why are AI native applications so hard to produce? We may find some answers in how they are "produced."

"We usually run four or five models at the same time and pick whichever performs best." A large-model entrepreneur in Silicon Valley told "Self-Quadrant" in an interview that they build AI applications on top of foundation models but avoid binding to any single model early on; instead, they let each model run and finally choose the most suitable one.

Simply put, the "horse racing" mechanism has now reached large models too.
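The "horse racing" selection can be sketched in a few lines. Everything below is hypothetical: the lambdas stand in for real model clients, and `score` for whatever evaluation a team actually uses.

```python
# Hypothetical sketch of a model "horse race": run the same prompt
# through several candidate models and keep the best-scoring output.

def race(candidates, prompt, score):
    """candidates: {name: callable(prompt) -> str}; score: callable(str) -> float."""
    results = {name: model(prompt) for name, model in candidates.items()}
    best = max(results, key=lambda name: score(results[name]))
    return best, results[best]

# Toy stand-ins for real model clients and a toy scoring function.
candidates = {
    "model_a": lambda p: p.upper(),                      # placeholder "model"
    "model_b": lambda p: p + " ... with more detail",    # placeholder "model"
}
best_name, best_output = race(candidates, "summarize this report",
                              score=lambda text: len(text))
```

In practice the scoring step is the hard part (human review, benchmarks, or an evaluator model), but the selection loop itself is this simple.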

However, this approach still has a drawback: although several large models are trialed, the application eventually becomes deeply coupled to one of them. It is still an "end-to-end" development approach, in which one application corresponds to one large model.

An underlying large model, unlike an application, serves many applications at once, so applications built on the same model for the same scenario end up barely differentiated. The bigger problem is that the foundation models on the market each have strengths and weaknesses; none is a "hexagonal warrior" that leads in every field, so an application built on a single model struggles to balance all of its functions.

In this context, decoupling large models from applications has become a new idea.

The so-called "decoupling" actually happens at two levels.

The first level is decoupling the large model from the application. The large model is the underlying driving force of an AI native application, and the relationship between the two can be compared to the automotive industry.

For an AI native application, the large model is like a car's engine. The same engine can be fitted to different car models, and the same car model can be paired with different engines; through different tuning, it can serve anything from microcars to luxury cars.

So for the vehicle as a whole, the engine is only one part of the configuration; it cannot be the core that defines the entire car.

By analogy, the foundation model is the key to driving an AI native application, but it should not be completely bound to the application's implementation. One large model can drive different applications, and the same application can also be driven by different large models.

There are already such cases in practice. Feishu and DingTalk at home, and Slack abroad, can all adapt to different foundation models, and users can choose according to their needs.
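Decoupling at this first level usually means the application depends on an abstract model interface rather than a concrete vendor SDK. A minimal sketch, with all class and method names hypothetical:

```python
from typing import Protocol

class ChatModel(Protocol):
    """The abstract 'engine' interface the application depends on."""
    def complete(self, prompt: str) -> str: ...

class VendorAModel:
    def complete(self, prompt: str) -> str:
        return f"[vendor-a] {prompt}"   # stand-in for a real API call

class VendorBModel:
    def complete(self, prompt: str) -> str:
        return f"[vendor-b] {prompt}"   # stand-in for a different vendor

def summarize(doc: str, model: ChatModel) -> str:
    # The application logic never names a vendor; swapping the
    # "engine" becomes a one-line configuration change.
    return model.complete(f"Summarize: {doc}")

result = summarize("quarterly report", VendorAModel())
```

Letting a user choose their model, as Feishu or Slack do, then amounts to picking which concrete class is injected.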

The second level is that, within a specific application, the large model should be decoupled, layer by layer, from each link in the application.

A typical example is HeyGen, the AI video company that has taken off abroad: its annual recurring revenue reached US$1 million in March this year and US$18 million by November.

HeyGen has just 25 employees, yet it has built its own video AI model and integrated large language models from OpenAI and Anthropic as well as audio products from Eleven Labs. When making a video, HeyGen uses different models at different stages, such as ideation, script generation (text), and voice.

A more direct example is ChatGPT's plug-in ecosystem. The domestic video-editing application Jianying recently joined it: when a user asks ChatGPT to call the Jianying plug-in to make a video, Jianying generates the video automatically, driven by ChatGPT.

In other words, the many-to-many matching between large models and applications can be refined to the point where the most suitable model is selected at each stage. An application is driven not by one large model but by several, or even a whole group.
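This per-stage matching can be sketched as a stage-to-model routing table. The stage names and model stubs below are illustrative only, loosely mirroring HeyGen's split between script, voice, and video models:

```python
# Hypothetical per-stage routing: each production stage is handled by
# whichever model suits it best, rather than one model end to end.

def script_model(brief):  return f"script for: {brief}"   # e.g. an LLM
def voice_model(script):  return f"audio of: {script}"    # e.g. a TTS model
def video_model(script):  return f"frames for: {script}"  # e.g. a video model

ROUTES = {"script": script_model, "voice": voice_model, "video": video_model}

def produce(brief):
    # One application, several models: each stage looks up its own engine.
    script = ROUTES["script"](brief)
    return {"script": script,
            "voice": ROUTES["voice"](script),
            "video": ROUTES["video"](script)}

result = produce("a 30s product demo")
```

Swapping the model behind any one stage is then a change to `ROUTES`, not to the application logic.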

Multiple large models serving one application, combining the strengths of many vendors: under this model, the division of labor in the AI industry chain will also be redefined.

Just as in today's automotive industry chain, where the engine, battery, components, and body each have dedicated manufacturers, the OEM only needs to select and assemble them to bring differentiated products to market.

Re-divide the labor, break down and reorganize: without destruction there is no construction.

2. The prototype of the new ecology

The multi-model and multi-application model will give birth to a new ecosystem.

Following this thread, we tried to envision the architecture of the new ecosystem, drawing on the experience of the Internet era.

When mini programs first appeared, everyone was confused about their capabilities, architecture, and application scenarios. In the early days each company had to learn mini programs from scratch, so development was slow and their numbers could not grow quickly.

That changed with the emergence of WeChat service providers. They connected to the WeChat ecosystem and became familiar with the underlying architecture of mini programs; they worked with corporate customers to build bespoke mini programs around their needs; and they tied into the playbook of the whole WeChat ecosystem, acquiring and retaining customers through mini programs. This group of service providers included Weimob and Youzan.

In other words, the market may not need large vertical models, but it does need large model service providers.

Similarly, each large model must be actually used and operated before anyone can truly understand its characteristics and how to apply it. Service providers sit in the middle layer: downward they can interface with multiple large models, and upward they can work with enterprises to build a healthy ecosystem.

Based on past experience, we can roughly divide service providers into three categories:

The first type is experience-based service providers, who understand and master the characteristics and application scenarios of each large model, match them to the industry's segmented scenarios, and open up the market through service teams;

The second type is resource-based service providers. Weimob, for example, was able to obtain low-priced advertising inventory inside WeChat and resell it. In the future, access to large models will not be open to everyone equally; service providers who secure sufficient permissions will build early barriers.

The third type is technology service providers. When an application embeds several different large models in its underlying layer, someone must solve technical problems such as how to call and connect multiple models while ensuring stability and security.
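One concrete problem such a technology service provider faces, keeping an application up when an individual model endpoint fails, can be sketched as ordered fallback with retries. All the "clients" here are hypothetical stand-ins:

```python
import time

def call_with_fallback(models, prompt, retries=2, delay=0.0):
    """Try each model in priority order, retrying transient failures.

    models: list of (name, callable) pairs standing in for real clients.
    """
    errors = []
    for name, model in models:
        for attempt in range(retries):
            try:
                return name, model(prompt)
            except Exception as exc:   # production code would narrow this
                errors.append((name, attempt, str(exc)))
                time.sleep(delay)      # simple backoff between retries
    raise RuntimeError(f"all models failed: {errors}")

def flaky(_prompt):
    raise TimeoutError("upstream timeout")  # simulated outage

name, answer = call_with_fallback(
    [("primary", flaky), ("backup", lambda p: f"ok: {p}")],
    "hello", delay=0.0)
```

Real middleware adds rate limiting, auth, and output validation on top, but this priority-ordered fallback is the core of the "stability" problem named above.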

By "Self-quadrant's" observation, prototypes of large model service providers have begun to appear in the past six months, mostly in the form of enterprise services that teach companies how to apply the various large models. The application methods are also slowly solidifying into workflows.

"When I make a video now, I first pitch a script idea to Claude and ask it to write the story. Then I copy and paste that into ChatGPT and use its logical ability to break it down into a shot-by-shot script. I connect the Jianying plug-in to convert the text to video and generate the video directly. If some of the frames are off, I regenerate them with Midjourney and finally finish the video. An application that can call on all of these capabilities at once is a truly native application," an entrepreneur told us.
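The workflow the entrepreneur describes is exactly a pipeline of model calls. A hedged sketch with each step stubbed out; the function names are placeholders, not real APIs, and the comments only indicate which tool each stage corresponds to in the quote:

```python
# Hypothetical pipeline mirroring the quoted workflow:
# idea -> story -> shot-by-shot script -> video, with an optional
# image-regeneration step for frames that came out wrong.

def write_story(idea):        return f"story({idea})"    # e.g. Claude
def break_into_script(story): return f"script({story})"  # e.g. ChatGPT
def render_video(script):     return f"video({script})"  # e.g. a Jianying plug-in
def regen_image(frame):       return f"fixed({frame})"   # e.g. Midjourney

def make_video(idea, bad_frames=()):
    script = break_into_script(write_story(idea))
    video = render_video(script)
    fixes = [regen_image(f) for f in bad_frames]  # patch only the bad frames
    return video, fixes

video, fixes = make_video("a cat learns to code", bad_frames=["frame_7"])
```

Today the entrepreneur performs the copy-and-paste between stages by hand; a "truly native" application is one that runs this chain automatically.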

Of course, many problems must be solved before a multi-model, multi-application ecosystem can truly land: how multiple models communicate, how to schedule model calls algorithmically for the best result, and how to coordinate them toward an optimal solution. These are challenges, but also opportunities.

Based on past experience, AI applications will likely first appear in a scattered, piecemeal way and then gradually be unified and integrated.

For example, answering questions, generating images, and making slides may each be a separate application today, but in the future they may be integrated into one platform-level product, much as ride-hailing, food delivery, and ticket booking were gradually folded into a single super app. These diverse needs will in turn pose further challenges to model capabilities.

In addition, AI native applications will upend current business models, and the money flowing through the industry chain will be redistributed: Baidu becomes a knowledge shelf, Alibaba a commodity shelf. Every business model will return to its essence, meeting consumers' real needs, and redundant links will be replaced.

On this basis, creating value is one thing; how to rebuild the business model is the more important question for investors and entrepreneurs to think through.

For now, we are still on the eve of the explosion of AI native applications. With foundation models at the bottom, large model service providers in the middle, and startups of every kind on top, only a clear division of labor and healthy collaboration will let AI native applications launch in batches.

Author: Luo Ji WeChat public account: Self-quadrant
