Innovate Futures @ Benji



Qwen Image Edit & Wan 2.2 - Create Coherent AI Video Scenes With This!

Tutorial Video : https://youtu.be/YQLq--X--HY

In this video, we explore an AI storytelling pipeline that combines language models, text-to-image generation, and image-to-video workflows to create structured, multi-scene AI videos. Instead of relying on a single reference image or generating random clips, the creator demonstrates how to use Qwen 3 Max to generate a sequence of detailed text prompts, each describing a specific scene with subject, action, and environment, for a cohesive 30-second narrative. These prompts are then used to generate consistent character images via Flux Context, and each image is turned into a 5-second video clip using Wan 2.2 MoE and the LightX2V image-to-video LoRAs. The result is a cinematic-style AI video composed of six distinct but visually coherent scenes, complete with sound design. This method offers far more control than traditional long-form AI video generation, avoiding issues like prompt drift and visual inconsistency.
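The scene arithmetic described above (5-second clips adding up to a 30-second narrative, each prompt built from subject, action, and environment) can be sketched in a few lines of Python. This is only an illustrative planning helper, not part of the attached workflows: the `Scene` class, the function names, and the placeholder scene text are all assumptions made for the example.

```python
from dataclasses import dataclass

CLIP_SECONDS = 5     # length of each image-to-video clip, per the description above
TARGET_SECONDS = 30  # total narrative length, per the description above


@dataclass
class Scene:
    """Hypothetical container for one scene prompt (not from the workflows)."""
    subject: str
    action: str
    environment: str

    def to_prompt(self) -> str:
        # Join the three parts into one text prompt for the image model.
        return f"{self.subject}, {self.action}, {self.environment}"


def scenes_needed(target_seconds: int, clip_seconds: int) -> int:
    """Number of clips required to cover the target length (rounded up)."""
    if clip_seconds <= 0:
        raise ValueError("clip_seconds must be positive")
    return -(-target_seconds // clip_seconds)  # ceiling division


def build_plan(scenes: list[Scene], clip_seconds: int = CLIP_SECONDS) -> list[dict]:
    """Assign each scene a start and end time on the final timeline."""
    return [
        {
            "scene": i + 1,
            "start": i * clip_seconds,
            "end": (i + 1) * clip_seconds,
            "prompt": scene.to_prompt(),
        }
        for i, scene in enumerate(scenes)
    ]


if __name__ == "__main__":
    count = scenes_needed(TARGET_SECONDS, CLIP_SECONDS)
    # Placeholder scenes; in practice the prompts would come from the LLM.
    scenes = [
        Scene("a placeholder character", f"placeholder action {n}", "placeholder setting")
        for n in range(1, count + 1)
    ]
    for entry in build_plan(scenes):
        print(entry)
```

Keeping the character description in `subject` identical across scenes is one simple way to carry the same wording into every prompt; whether that is enough for visual consistency depends on the image model and LoRAs used.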

Who is This Content Suitable For?

This content is ideal for:

Why Does This Matter?

Most AI video models struggle with long-term coherence, often breaking down after 10 to 15 seconds with random objects, shifting styles, or illogical transitions. This video presents a smarter alternative: treating AI video creation like real filmmaking by planning scenes, maintaining character consistency, and editing clips together. By leveraging LLMs for script breakdowns, controlled image generation, and modular video synthesis, creators can produce high-quality, meaningful narratives instead of chaotic clips. This approach represents a shift from experimental AI demos to practical, repeatable content creation systems, making it easier to produce professional-grade AI videos for storytelling, marketing, or entertainment.

lovis93/next-scene-qwen-image-lora-2509

https://huggingface.co/lovis93/next-scene-qwen-image-lora-2509

lightx2v/Wan2.2-I2V-A14B-Moe-Distill-Lightx2v

https://huggingface.co/lightx2v/Wan2.2-I2V-A14B-Moe-Distill-Lightx2v

HunyuanVideo-Foley Custom Node:

https://github.com/phazei/ComfyUI-HunyuanVideo-Foley

HunyuanVideo-Foley Model Download:

https://huggingface.co/phazei/HunyuanVideo-Foley/tree/main

SRPO LoRA

https://huggingface.co/Alissonerdx/flux.1-dev-SRPO-LoRas/tree/main

Attached are the 3 workflows mentioned in this tutorial:

Comments

I'd like to thank you for this amazing workflow! This is just something that I needed right now, and this workflow saved me from lots of headaches! This is just perfect! May I ask you for another workflow that could do these morphing videos? Same idea, but WAN VACE and WAN combined together. Like in this video: https://www.youtube.com/live/lPMhXfNne0E?si=9YyawfcWejxHIX0p, at 37:19.

Minna

Fantastic! Forgive me for the question, how many GB do the models weigh for the entire workflow?

Enzo Brand

