The price war of this year's 618 shopping festival was started by large models. ByteDance fired the first shot, and BAT followed close behind. Flagship models were cut by 97%, with one million tokens selling for as little as 1 yuan, pushing large models into the "li era" (prices measured in thousandths of a yuan): free, completely free, permanently free... Price cuts on this scale have not been seen for a long time. It feels like a replay of the group-buying "Hundred Regiments War", the O2O melee, the ride-hailing subsidy war, and the 1-yuan bids for cloud projects, igniting a new industry-wide battle.

1. Eight large models cut prices collectively

ByteDance and BAT have all entered the large-model price war. According to incomplete statistics from IT Times, since May, eight large-model companies at home and abroad have announced major price cuts: Huanfang Quantitative (High-Flyer), Zhipu, OpenAI's GPT-4o, ByteDance, Alibaba's Tongyi Qianwen, Baidu's Wenxin Yiyan (ERNIE Bot), Tencent's Hunyuan, and iFlytek Spark.

On May 15, ByteDance opened the price war. The input price of the Doubao general model Pro-32k dropped to 0.0008 yuan per thousand tokens, and that of the Pro-128k version to 0.005 yuan per thousand tokens. This means 1 yuan buys 1.25 million tokens of Doubao's flagship model, roughly 2 million Chinese characters, or about three copies of "Romance of the Three Kingdoms". Large models are usually billed in "yuan per thousand tokens"; ByteDance cut the unit cost from the fen (cent) level to the li (tenth-of-a-cent) level, genuinely kicking off a price war.

Alibaba and Baidu followed around May 21. The input price of Qwen-Long, the flagship model of Alibaba's Tongyi Qianwen benchmarked against GPT-4, fell to 0.0005 yuan per thousand tokens, a direct 97% cut. After the reduction it is about 1/400 of GPT-4's price, a new global floor. One yuan now buys 2 million tokens, roughly the text of five Xinhua Dictionaries.
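The per-token arithmetic quoted above is easy to check. Below is a minimal Python sketch using the prices and the characters-per-token ratio implied by this article's own figures (these are the article's numbers, not official vendor documentation):

```python
# Convert a "yuan per thousand tokens" price into tokens (and approximate
# Chinese characters) per yuan, using the prices quoted in the article.

def tokens_per_yuan(price_per_1k_tokens: float) -> float:
    """How many tokens 1 yuan buys at a given yuan/1k-token price."""
    return 1000 / price_per_1k_tokens

# Rough ratio implied by the article: 1.25M tokens ~ 2M Chinese characters.
CHARS_PER_TOKEN = 1.6

doubao_pro_32k = 0.0008  # yuan per thousand tokens (Doubao Pro-32k input)
qwen_long = 0.0005       # yuan per thousand tokens (Qwen-Long input)

print(tokens_per_yuan(doubao_pro_32k))                    # ~1.25 million tokens
print(tokens_per_yuan(doubao_pro_32k) * CHARS_PER_TOKEN)  # ~2 million characters
print(tokens_per_yuan(qwen_long))                         # ~2 million tokens
```

The same conversion reproduces the article's "1/400 of GPT-4" claim only approximately, since it depends on which GPT-4 tier and exchange rate one assumes.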
Baidu's Wenxin Yiyan announced that two of its main models are fully free, effective immediately: the lightweight models ERNIE Speed and ERNIE Lite, launched in March this year, which support 8k and 128k context lengths.

On May 22, Tencent cut prices across its Hunyuan models. One main model, Hunyuan-lite, went from 0.008 yuan per thousand tokens to completely free, and its total API input-plus-output length is planned to be upgraded from the current 4k to 256k. API input and output prices were cut across three other models: Hunyuan-standard, Hunyuan-standard-256k (able to process ultra-long texts of more than 380,000 characters), and the top-spec trillion-parameter Hunyuan-pro, with the largest cut reaching 87.5%.

On the same day, iFlytek launched the industry's first "permanently free" large model, Spark Lite. The top-end iFlytek Spark Max API is priced as low as 0.21 yuan per 10,000 tokens; by comparison, Baidu's ERNIE 4.0 and Alibaba's Qwen-Max are priced at 1.2 yuan per 10,000 tokens, so the top-end Spark costs about one-fifth as much.

What really makes domestic large-model makers nervous is that OpenAI has cut prices four times since the beginning of 2023. GPT-4o, released on May 13, not only delivered a leap in performance but also a 50% price drop.

2. Giants spend money to buy data

"The current pricing of large models can no longer cover costs, so why do vendors keep cutting prices? The main purpose is to collect data."
In the view of Zhou Jian, CEO of the AI-agent company Lanma Technology, GPT-4o's strategy is to offer free service to the public and halve fees for developers so that more people use it, thereby collecting interactive data such as multi-turn conversations, which improves model capability faster than static data can. Domestic large models are in the same "burning money for data" stage.

In the past, large language models had no concept of time, but GPT-4o has solved the short-term-memory problem: it can perceive emotions, follow instructions or be interrupted mid-conversation, and tell stories in voices carrying different emotions. Its long-term memory and social intelligence, however, are still lacking. Zhou Jian offered an analogy: even if GPT-4o were as smart as Einstein, it could not serve as CFO of a listed company, because that role demands strong long-term memory and the ability to synthesize the company's organizational and power structures from many different communications. At this stage it remains a "brain in a vat".

Beyond free public access, OpenAI is also using hardware to collect data from offline sales, headhunting, and other customer communications. After releasing GPT-4o, OpenAI partnered with Reddit, the well-known American forum and home base of U.S. retail stock investors, to use community content to train its models. The point of collecting such real, dynamic data is to make large models more human-like: capable of multi-turn dialogue, equipped with long-term memory, and fluent in the logic of social interaction.

In the view of Xu Hongyi, senior R&D manager at the Shanghai Artificial Intelligence Research Institute, data is likewise the key to large models winning the market: competition has shifted from computing power to high-quality data.
High-quality Chinese data is scarce, and dynamic interactive data is scarcer still. Chinese text corpora are only about one-tenth the size of English ones, so the closed loop of static knowledge data in Chinese is naturally weaker than in English; breakthroughs in volume can only come from dynamic data, and dynamic data can only be collected by getting real people to use the models. In addition, about 70% of the world's data exists only as free, public datasets. If large models are to master professional knowledge, vendors must keep attracting developers across industries to inject vertical-domain datasets. A low price is undoubtedly the best way to attract them, provided quality stays the same or speed even improves. From last year's GPT-4 to this year's GPT-4o, time to first token is 6 times faster while the API price is 12 times cheaper. Foreign vendors compete on speed; domestic vendors compete on price. Some industry insiders and media, however, have questioned whether it is "unscientific" for vendors to talk price without talking concurrency: without support for high concurrency, output speed and quality cannot be guaranteed.

"The collective price cut of large models is not only a market strategy but a signal that a turning point has arrived," says Yang Xiaojing, a specially appointed expert in Beijing who once led the country's first credit-bond risk model based on spatiotemporal data. Yang sees three reasons for the collective cuts: first, technology dividends, as policies such as unified subsidies reduce the cost of cloud services and computing chips; second, vendors' confidence in scale growth.
At the beginning of 2024, daily API calls across all domestic large models totaled no more than 100 million, but the figure is expected to grow 100-fold by year-end. Third, the cuts attract developers and thereby quickly reach thousands of industries; AIGC's user penetration rate in China is still only about 6%.

"ByteDance wants to promote Volcano Engine and its cloud services through the Doubao model; in fact, video and other content is where the gold lies." In Yang Xiaojing's view, cloud, computing power, large models, content, and data form a chain; once connected, they close into an ecosystem. That is the internal logic of giants spending money in exchange for data.

3. The price war may spread to the consumer side

The "war of a hundred models" has truly entered the combat stage. As the IT Times reporter observed, this round of cuts mainly covers text models and targets developers and enterprises; it has not yet reached consumer (C-end) users. In the next stage, domestic vendors may cut prices for C-end users and for multimodal models, making large models affordable or even free for consumers and thus ever more useful.

Abroad, OpenAI has nearly formed a monopoly on the strength of its performance. Its latest multimodal model, GPT-4o, currently offers only text and image capabilities but is set to be fully opened to C-end users for free, with voice and video input and output also supported.

[GPT-4o demo]

For now, experiencing the Plus version of ChatGPT still requires a $19.99/month membership. According to the application-intelligence firm Appfigures, ChatGPT's net app revenue jumped 22% on the day GPT-4o was released, reaching $900,000 on May 21, nearly double the app's average daily revenue.
"Competition in the domestic large-model market is getting fiercer. Until an absolute winner emerges, the price-cutting trend will not stop, and prices may even fall exponentially," Zhou Jian believes. Yang Xiaojing likewise believes that a massive user base and a huge consumer market will accelerate cost reduction.

Behind the price war is the desire of China's large-model makers to seize the dividends of data and scenarios, closing the gap with, or even surpassing, the pace of large-model development in the United States. Data as the core and scenarios as the driver were China's "magic weapons" for overtaking on the curve in mobile internet and 5G. In the large-model era, is that overtaking path still open?

2024 is considered the first year of large-model commercialization. IDC predicts that China's AI large-model market will reach US$21.1 billion by 2026, as artificial intelligence enters a critical period of large-scale application. By Xu Hongyi's observation, domestic large models lean toward going deep into application scenarios and laying a commercial foundation.

The size of the China-US gap in large-model development has long been debated: some say a year and a half, others half a year. Stanford University's recently released "2024 Artificial Intelligence Index Report" shows that of the 149 notable large models released worldwide in 2023, the United States accounted for 61 and China for 15, second in the world and catching up fast. China also accounts for about 60% of the world's AI patents, leading all countries. To narrow the gap, China's large models must win through application scenarios, and price cuts directly motivate companies to use them.
Drawing on past experience building credit-risk models and robo-advisors, Yang Xiaojing calculated that from 2005 to 2022, the A-share market accumulated 825,000 brokerage research reports. At roughly 10,000 characters per report, that is about 8.25 billion characters, equivalent to about 340 million tokens. For developers of intelligent investment-research models, calling a general model's API at the original price would cost 34,000 yuan per full pass; now it costs only 1,700 yuan. In Yang Xiaojing's judgment, within finance, intelligent customer service, where demand is most urgent, will feel the effects of AI adoption and price cuts first; as calling costs fall, the user base of such services will also grow rapidly. In addition, China is the world's largest installer of industrial robots, accounting for half of global installations, and Chinese AI large-model companies should seize the opportunity of industrial upgrading. "AI services must become as easy to use and as available as water and electricity, and as ubiquitous as 5G; only then can China overtake on the curve as it did with 5G and achieve global leadership," said Yang Xiaojing.
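Yang Xiaojing's cost estimate can be reproduced with simple arithmetic. In the sketch below, the corpus size (340 million tokens) comes from the article, while the two per-thousand-token prices are assumptions back-solved from the article's 34,000-yuan and 1,700-yuan figures; the article does not name the specific model or price tier behind them:

```python
# Back-of-the-envelope API cost for the investment-research corpus described
# above (~825k brokerage reports, ~340 million tokens, per the article).

def api_cost_yuan(total_tokens: int, price_per_1k_tokens: float) -> float:
    """Cost in yuan to run total_tokens through an API billed per 1k tokens."""
    return total_tokens / 1000 * price_per_1k_tokens

CORPUS_TOKENS = 340_000_000  # from the article

# Assumed prices, back-solved from the article's cost figures (not quoted in it):
old_price = 0.1    # yuan per thousand tokens, pre-cut
new_price = 0.005  # yuan per thousand tokens, post-cut

print(api_cost_yuan(CORPUS_TOKENS, old_price))  # ~34,000 yuan, matching the article
print(api_cost_yuan(CORPUS_TOKENS, new_price))  # ~1,700 yuan
```

Note that 340 million tokens for 8.25 billion characters implies a much higher characters-per-token ratio than the roughly 1.6 implied earlier in the article; the figures are reproduced here as given.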