2026-05-26

Generative AI Video in 2026: Why It Fascinates, Annoys, and Still Needs Strategy

Generative AI video is making audiovisual production easier to access, but it does not solve attention, brand coherence, or editorial strategy. A 2026 field report.

Théo Willems

TJCW Content Factory

Editorial collage about generative AI video in 2026, between traditional shoots, AI micro-drama and social formats.

The history of media is full of moments when audiovisual production suddenly becomes easier to access. DV cameras in the 1990s, smartphones in the 2010s, generative AI today.

Each time, part of the production chain becomes available to many more people. Sometimes an entire crew is replaced by three or four people in an apartment. And each time, the same opposition returns: experimenters versus traditionalists, people excited by the opportunity versus people mourning the loss of established craft.

I do not want to be blindly pro-AI, and I do not want to reject the tool on principle either. The interesting question is elsewhere: where does AI video actually work, where does it still fail, and what does it change for brands producing content in 2026?

Why AI video still triggers rejection

Why do some viewers reject an AI-generated video almost immediately? After all, it is an image like any other. But that objection forgets something important: even after Photoshop, VFX and AI, an image still carries an implicit promise. Something was there, in front of a lens, at some point.

AI video breaks that promise. It does not document reality. It manufactures the appearance of reality.

In a tech documentary or an explainer video, a generated face can work. It represents "a generic engineer", "a typical customer", a functional character whose role is to illustrate an idea. The viewer reads it as a diagram. The reality of the image is not central.

The problem appears when the format changes. A feature film, a character we have to follow for 90 minutes, an emotion that has to carry a whole scene: that is where things become fragile. We need to believe that something real happened somewhere. And when a generated image aims for realism but does not fully reach it, it falls into the uncanny valley: that space where a human figure looks almost right, but not right enough to be accepted.

AI video works better when it embraces a plastic, stylized or absurd form. In animation, parody or a deliberately grotesque universe, it is not trying to replace reality. It becomes another way of representing.

In 2026, the hardest limits are not only visual. They are about continuity, direction, the coherence of a character or world over time, and fine control over tools that still produce a lot of randomness.

Why short-form content is a better fit

AI video works much better in short formats, especially on social platforms.

First, because the contract is often illustrative or deliberately absurd. We are not asking the image to be true. We are asking it to be readable, surprising, funny, strange or shareable.

Second, because short formats reduce the cost of breaking the illusion. An inconsistent hand, face or movement matters less in a twenty-second video than in long-form storytelling. Attention is more mobile, the screen is smaller, and the narrative standard is different.

That does not mean everything works. It means AI video becomes more credible when it accepts the constraints of the channel instead of trying to imitate the most demanding forms of audiovisual production.

The Chinese factory: when micro-drama becomes a business model

This is where some studios, especially in China on Douyin, have industrialized AI video. Entire series, with characters, sets, voices and editing, can be produced without a camera or visible human actors in just a few weeks. This is no longer just an experiment. It is a market.

Chinese vertical micro-fiction is already a multibillion-dollar market according to recent estimates, and AI is becoming a major production layer. Some projections put China's micro-drama market at around 120 billion yuan in 2026, roughly $16.5 billion. Exports from Chinese platforms to Western markets, through apps such as DramaBox, ReelShort, NetShort or GoodShort, have also accelerated sharply.

The production chain in these studios relies on a role some observers have started calling the "card-pulling technician". The name captures the process: generating a shot with Kling, Seedance, Sora or another model can feel like pulling a card. You launch the generation, wait, and see what comes out. Most shots are unusable: distorted hands, drifting faces, incoherent movement, impossible continuity. You keep one in five, sometimes one in ten, and rerun the rest.

In practice, their day is simple: launch a generation, wait, watch the result for a few seconds, keep it or discard it, then repeat. The work is repetitive, low-skilled and paid accordingly. It is not an editorial role. It is an industrial one.
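To get a feel for what a keep rate of one in five or one in ten means in volume, here is a rough back-of-the-envelope sketch in Python. It assumes each generation is an independent draw with a fixed chance of being usable, which real tools only approximate. The 40 shots per episode and the 10 seconds of review per result are illustrative assumptions, not figures from any studio.

```python
# Hypothetical illustration: shot counts and review times are assumptions,
# not measurements from a real production.


def expected_generations(shots_needed: int, keep_rate: float) -> float:
    """Average number of generations needed to collect `shots_needed` usable shots.

    Assumes every generation is an independent draw that is kept with
    probability `keep_rate`, so the expected number of tries per usable
    shot is 1 / keep_rate.
    """
    if not 0 < keep_rate <= 1:
        raise ValueError("keep_rate must be in (0, 1]")
    return shots_needed / keep_rate


def review_hours(generations: float, seconds_per_review: float) -> float:
    """Time spent only watching results, ignoring generation wait time."""
    return generations * seconds_per_review / 3600


if __name__ == "__main__":
    SHOTS_PER_EPISODE = 40  # assumed, for illustration only
    for keep_rate in (1 / 5, 1 / 10):
        total = expected_generations(SHOTS_PER_EPISODE, keep_rate)
        hours = review_hours(total, seconds_per_review=10)
        print(f"keep rate {keep_rate:.0%}: about {total:.0f} generations, "
              f"{hours:.1f} h of review")
```

Under these assumptions, halving the keep rate doubles both the number of generations and the review time, which is why the role scales as a volume job rather than a creative one.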

The business model looks like advertising arbitrage: produce a complete series at a much lower cost than live action, buy audience aggressively on platforms, then monetize through paid episode unlocks. The margin comes from the gap between production cost, acquisition cost and revenue per user. AI video is not a magic tool here. It is a machine for lowering the cost of a format already optimized for distribution.
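The arbitrage logic can be written down as a small unit-economics sketch. The class name, field names and every number below are hypothetical, chosen only to show how the three variables interact; they do not describe any real studio or platform.

```python
from dataclasses import dataclass


@dataclass
class SeriesEconomics:
    """Toy model of a paid micro-drama series (all values are assumptions)."""

    production_cost: float            # total cost to produce the series
    acquisition_cost_per_user: float  # ad spend per acquired viewer
    revenue_per_user: float           # average paid unlocks per acquired viewer

    def contribution_per_user(self) -> float:
        """What one acquired viewer leaves after paying for their acquisition."""
        return self.revenue_per_user - self.acquisition_cost_per_user

    def profit(self, users: int) -> float:
        """Profit after production cost for a given number of acquired viewers."""
        return users * self.contribution_per_user() - self.production_cost

    def break_even_users(self) -> float:
        """Viewers needed to repay production; infinite if each viewer loses money."""
        contribution = self.contribution_per_user()
        if contribution <= 0:
            return float("inf")
        return self.production_cost / contribution


if __name__ == "__main__":
    # Same audience economics, two assumed production budgets.
    live_action = SeriesEconomics(100_000, 1.5, 2.0)
    ai_produced = SeriesEconomics(20_000, 1.5, 2.0)
    for label, series in (("live action", live_action), ("AI", ai_produced)):
        print(label, "breaks even at", round(series.break_even_users()), "viewers")
```

In this toy model, lowering production cost does not change what each viewer is worth. It only lowers the number of viewers needed before the series pays for itself, which is the sense in which AI is a cost lever rather than a distribution advantage.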

Skibidi Tentafruit: the case study

The phenomenon had a Western equivalent with L'ile de la Skibidi Tentafruit: two French students, anthropomorphic fruit characters, reality-TV codes, short AI-generated episodes and massive TikTok virality.

What matters is not just the success of the format. It is what happened immediately after. The concept was copied instantly and massively. Dozens of AI reality-show fruit formats flooded feeds within days. Some were decently executed. None captured the same level of attention.

Not because they were dramatically worse on a technical level. Because they arrived later.

When a format is fully replicable, and with AI it almost always is, the remaining differentiator is identity: the name, the characters, narrative continuity, precedence. The original's characters, Fraisita and Banano, existed before the others. The imitators arrived in a mental space that was already occupied. Audiences do not remember the copy once the original has defined the reference.

There is also an uncomfortable legal dimension. By borrowing the mechanics of an existing show, creators operate in a real intellectual-property gray zone. AI-generated content makes the question even more complex: who owns the rights to an AI character that borrows from a TV format, brand codes and a recognizable aesthetic? The question is still open, but probably not for long.

AI video agencies mostly sell coherence

While the general public discovers viral formats, something less visible is happening on the professional side. The French market for AI video agencies has quickly become crowded. The same tools appear everywhere: Sora, Veo, Kling, Runway, Midjourney, ElevenLabs, CapCut, After Effects.

For most players, there is no proprietary stack that fundamentally changes the equation. Technical differentiation is weak.

What is really monetized is brand coherence. Generating one AI video is accessible to almost anyone. Generating one hundred videos that feel like they come from the same visual universe, with the same palette, movement style, lighting treatment and recognizable characters from one episode to the next, is much harder.

Generic tools do not solve that problem on their own. It requires upstream work on assets, prompt engineering applied to a specific identity, proprietary references, naming conventions, art direction and production discipline that many brands cannot internalize.

The agencies that understand this are not only selling AI video. They are selling the ability to industrialize a visual identity in a format AI tools can reproduce at scale.

The reference-frame test

I tested this myself. A fully AI-generated video, sent internally to colleagues, just to see. The reaction came on WhatsApp at 2:08 p.m. from a 21-year-old intern, exactly the kind of viewer we were trying to reach:

"This is so baaaaad, it's AI."

A hot take, unfiltered.

What it reveals is that detection does not happen only at the level of technical quality. It happens at the level of reference frames. A 21-year-old who consumes video content all day has precise, intuitive standards that are hard to verbalize and very hard to fool.

Knowing your audience means knowing where they have that reference frame and where they do not. On some formats, targets and distribution contexts, AI video passes without friction. On others, it exposes the sender within seconds.

The real problem is not volume

Many brands treat AI video as the answer to a volume problem. Not enough content? AI can produce more. Problem solved.

Except volume is rarely the real problem.

A brand that was producing content nobody remembered does not capture ten times more attention by producing ten times more of it. It saturates an already crowded space. Human attention does not expand with supply. It gets redistributed.

What people call AI slop, the mass-generated, interchangeable flow of content that looks like content without having its properties, is already visible on TikTok, YouTube Shorts and Instagram Reels. Platforms are starting to treat it as noise. Users are developing an intuitive recognition of content made without intent.

They do not reject it because it was generated by AI. They reject it because it is boring.

The barrier changed shape, but it did not disappear

When it comes to democratization, reality is more nuanced than it looks. AI video has a real cost. Producing a clean full-AI video can be more expensive than filming a TikTok on a phone, especially once you count failed attempts, art direction, voice, editing, fixes and the coherence of a series.

The blocker is not only technical. It is the ability to build narratives that hold attention, then use AI tools to execute them. It is a technical-artistic problem, and it requires both sides at once.

Full-AI video has not removed the barrier to entry of traditional video. It has moved it. It is a new mode of expression with its own constraints, but the same questions return: why would someone watch, why would they stay, why would they recognize the brand, why would they want to see the next piece?

AI video is a production tool, not a strategy tool. When the strategy is clear, it multiplies it. When the strategy is blurry, it amplifies the blur.

Indicative sources: The Next Web on China's micro-drama market, China.org.cn on micro-drama exports, VML on AI micro-dramas, Officielles on Skibidi Tentafruit and Oasis.