
Training-free Motion Factorization for Compositional Video Generation

arXiv, March 31, 2026 (2 min read)
🧒 Explain Like I'm 5 (simple language)

Hey there, little explorer! 🚀

Imagine you have a magic drawing machine that can make videos! 🎨✨

Sometimes, when you ask it to draw a video, like a puppy running and a ball bouncing, it gets a little confused about how things should move.

This new magic trick helps the machine understand movement much better! It teaches the machine to see three kinds of moves:

  1. Standing still (like a tree). 🌳
  2. Moving all together (like a car driving). 🚗
  3. Wiggly moves (like a flag flapping or a puppy's tail wagging). 🐶

So, before the machine draws, it plans all the wiggles and jiggles. This way, when it makes your video, everything moves just right, making super cool and realistic videos! Isn't that neat? 🎬🤩

arXiv:2603.09104v2 (announce type: replace)

Authors: Zixuan Wang, Ziqin Zhou, Feng Chen, Duo Peng, Yixin Hu, Changsheng Li, Yinjie Lei


Abstract: Compositional video generation aims to synthesize multiple instances with diverse appearance and motion. However, current approaches mainly focus on binding semantics, neglecting to understand diverse motion categories specified in prompts. In this paper, we propose a motion factorization framework that decomposes complex motion into three primary categories: motionlessness, rigid motion, and non-rigid motion. Specifically, our framework follows a planning before generation paradigm. (1) During planning, we reason about motion laws on the motion graph to obtain frame-wise changes in the shape and position of each instance. This alleviates semantic ambiguities in the user prompt by organizing it into a structured representation of instances and their interactions. (2) During generation, we modulate the synthesis of distinct motion categories in a disentangled manner. Conditioned on the motion cues, guidance branches stabilize appearance in motionless regions, preserve rigid-body geometry, and regularize local non-rigid deformations. Crucially, our two modules are model-agnostic, which can be seamlessly incorporated into various diffusion model architectures. Extensive experiments demonstrate that our framework achieves impressive performance in motion synthesis on real-world benchmarks. Code is available at this https URL.
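To make the three motion categories concrete, here is a minimal Python sketch of how a planned, frame-wise layout could be sorted into motionless, rigid, and non-rigid instances. It is not the authors' implementation: the `Box` format, the `InstancePlan` container, the `classify_motion` function, and the tolerance are all hypothetical, and the rule "position changes mean rigid motion, shape changes mean non-rigid motion" is a deliberate simplification of what the abstract describes.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple

# Hypothetical per-frame layout: (x_center, y_center, width, height),
# all normalized to [0, 1]. The paper's actual representation may differ.
Box = Tuple[float, float, float, float]


class MotionCategory(Enum):
    MOTIONLESS = "motionless"
    RIGID = "rigid"
    NON_RIGID = "non_rigid"


@dataclass
class InstancePlan:
    """Illustrative container: one planned box per frame for one instance."""
    name: str
    boxes: List[Box]


def classify_motion(plan: InstancePlan, tol: float = 1e-3) -> MotionCategory:
    """Assumed rule of thumb, not the paper's method:
    shape changes -> non-rigid, position-only changes -> rigid,
    no change -> motionless."""
    x0, y0, w0, h0 = plan.boxes[0]
    moved = any(abs(x - x0) > tol or abs(y - y0) > tol
                for x, y, _, _ in plan.boxes)
    reshaped = any(abs(w - w0) > tol or abs(h - h0) > tol
                   for _, _, w, h in plan.boxes)
    if reshaped:
        return MotionCategory.NON_RIGID
    if moved:
        return MotionCategory.RIGID
    return MotionCategory.MOTIONLESS


# Toy plans echoing the tree / car / flag examples above.
tree = InstancePlan("tree", [(0.2, 0.5, 0.1, 0.4)] * 3)
car = InstancePlan("car", [(0.1, 0.7, 0.2, 0.1),
                           (0.3, 0.7, 0.2, 0.1),
                           (0.5, 0.7, 0.2, 0.1)])
flag = InstancePlan("flag", [(0.8, 0.2, 0.10, 0.10),
                             (0.8, 0.2, 0.12, 0.09),
                             (0.8, 0.2, 0.09, 0.11)])

categories = {p.name: classify_motion(p) for p in (tree, car, flag)}
```

In this toy setup the tree never changes, the car only translates, and the flag changes shape in place. A real rigid object can also change its bounding box when it rotates, so a box-based rule like this one would only be a rough first pass.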

Comments: Accepted by CVPR 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)

Cite as: arXiv:2603.09104 [cs.CV] (or arXiv:2603.09104v2 [cs.CV] for this version)

DOI: https://doi.org/10.48550/arXiv.2603.09104 (arXiv-issued DOI via DataCite)

Submission history

From: Zixuan Wang
[v1] Tue, 10 Mar 2026 02:27:48 UTC (9,154 KB)
[v2] Mon, 30 Mar 2026 04:59:21 UTC (24,758 KB)
