Text-to-video: Chinese companies join the battle

Over the past six months, AI video generation has advanced in fits and starts, and text-to-video has become a major track that Chinese companies keep entering.

Can a video be generated from a text description alone? That is now becoming a reality.

After OpenAI released its text-to-video model Sora, domestic companies rushed into the market, and the development of domestic text-to-video models accelerated.

Over the past six months, AI-generated videos have been advancing in fits and starts.

Vidu, billed as China's first self-developed large video model, and the video generation models later launched by ByteDance, Tencent, and many other domestic companies have drawn outside attention from time to time.

Recently, another domestic large video model joined the battle: the official website of Kuaishou's "Keling" video generation model went live.

On the 21st, Kuaishou released a major Keling update. The image-to-video function was officially opened, converting static images into 5-second videos, with users able to control the movement of objects in the image through prompt text. The video continuation function launched at the same time, supporting one-click continuation and repeated consecutive continuation of generated videos, up to about 3 minutes in total.

Earlier large video models from other companies were mostly shown through demo videos only. The Keling model, by contrast, is claimed to match Sora's results and has also been opened for invitation-only testing in Kuaishou's Kuaiying App.

According to Kuaishou, the Keling model was developed by the Kuaishou AI team. It follows a technical route similar to Sora's and combines it with a number of self-developed innovations. It generates video at up to 1080p resolution and up to 2 minutes in length (at 30fps), and supports free aspect ratios.

In addition, Kuaishou claims that the Keling model can generate large-scale movements that are physically plausible and conform to the laws of motion.

In the official video example, an astronaut runs on the moon. As the camera slowly rises, the astronaut's gait and shadow remain plausible and consistent.

Almost at the same time, Meitu announced that it will launch a new product, MOKI, at the end of July. The product is built on the video generation capabilities of Meitu's large model and helps users generate AI short films.

However, there is also a view that, compared with the large language models that emerged in droves, large video models have been slower to heat up and lack the presence of the giants.

Why is this so?

Are the big companies not interested?

Meanwhile, Kuaishou and Meitu kept a low profile in the last round of large language model competition.

So what are these two companies' biggest advantages in the field of large video models?

Beijing Business Daily reporter Wei Wei and Shu Le discussed these questions. My view is as follows:

Large companies that are still preparing for the "college entrance examination" will not go straight for the "postdoctoral" level.

Making a video is not a matter of assembling a pile of pictures into a slideshow. Big companies are in no hurry to push into this area, because doing so now is not very practical. It would be little more than a show of muscle.

After all, video generation isn’t just about stringing a bunch of AI drawings together into a cartoon.

Beyond details such as image consistency, fidelity to the description, handling of light and shadow, and storyboard execution, the model also needs the ability to understand and recreate a plot.

All of this requires deep learning across multiple vertical fields, such as video structure, content analysis, shooting techniques, and narrative methods.

It is far harder than chatting, drawing, or specializing in chess, which can be achieved by accumulating data and letting users correct errors.

Even masters of film and television often make mistakes. One can imagine how difficult it is to make a film with artificial intelligence that is still at the "college entrance examination" stage.

But Kuaishou and Meitu need to show off their muscles, even if it’s just a show.

For both Kuaishou and Meitu, the biggest advantage in the field of large video models is that they hold rich "learning materials" for training artificial intelligence.

Relying on these "learning materials" avoids certain copyright issues. In addition, thanks to years of content accumulation, vertical segmentation, and labeling in the video field, their models can "retrieve" knowledge more effectively, and their algorithm design carries a degree of video expertise.

But that is all. Technically, they still lack a foundational accumulation of artificial intelligence algorithms.

Moreover, even once large video models mature, they will struggle to deliver a major breakthrough in the film and television industry.

Short dramas, advertisements, long videos, and movies will all be able to have "blockbuster special effects".

But what ultimately attracts an audience is the content, from the screenplay to the camera work and the actors' performances.

These are the keys to large-scale commercial monetization.

I believe large video models may find business opportunities more easily in the animation field.
