
LAMP: Language-Assisted Motion Planning for Controllable Video Generation

arXiv, March 31, 2026

arXiv:2512.03619v3 Announce Type: replace

Authors: Muhammed Burak Kizil, Enes Sanli, Niloy J. Mitra, Erkut Erdem, Aykut Erdem, Duygu Ceylan


Abstract: Video generation has achieved remarkable progress in visual fidelity and controllability, enabling conditioning on text, layout, or motion. Among these, motion control, which specifies object dynamics and camera trajectories, is essential for composing complex, cinematic scenes, yet existing interfaces remain limited. We introduce LAMP, which leverages large language models (LLMs) as motion planners to translate natural language descriptions into explicit 3D trajectories for dynamic objects and (relatively defined) cameras. LAMP defines a motion domain-specific language (DSL) inspired by cinematography conventions. By harnessing the program synthesis capabilities of LLMs, LAMP generates structured motion programs from natural language, which are deterministically mapped to 3D trajectories. We construct a large-scale procedural dataset pairing natural text descriptions with corresponding motion programs and 3D trajectories. Experiments demonstrate LAMP's improved performance in motion controllability and alignment with user intent compared to state-of-the-art alternatives, establishing the first framework for generating both object and camera motions directly from natural language specifications. Code, models, and data are available on our project page.
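The abstract does not spell out LAMP's DSL, so the following is only a toy illustration of the general idea that a structured motion program can be expanded deterministically into a 3D trajectory. The primitives `Move` and `Orbit`, the function `run_program`, the sampling rate `fps`, and the y-up coordinate convention are all hypothetical choices made for this sketch and are not taken from the paper.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]

@dataclass(frozen=True)
class Move:
    """Hypothetical primitive: travel in a straight line at constant speed."""
    direction: Vec3   # need not be unit length; it is normalized below
    speed: float      # scene units per second
    duration: float   # seconds

@dataclass(frozen=True)
class Orbit:
    """Hypothetical primitive: circle around the vertical (y) axis through `center`."""
    center: Vec3
    degrees: float    # total angle swept
    duration: float   # seconds

def run_program(program, start: Vec3, fps: int = 4) -> List[Vec3]:
    """Expand a list of primitives into sampled 3D points.

    The mapping is deterministic: the same program and start point
    always produce the same trajectory.
    """
    points = [start]
    for step in program:
        n = max(1, round(step.duration * fps))  # samples for this step
        origin = points[-1]                     # each step starts where the last ended
        for i in range(1, n + 1):
            t = i / n
            if isinstance(step, Move):
                norm = math.sqrt(sum(c * c for c in step.direction))
                dist = step.speed * step.duration * t
                p = tuple(o + dist * c / norm for o, c in zip(origin, step.direction))
            elif isinstance(step, Orbit):
                cx, _, cz = step.center
                dx, dz = origin[0] - cx, origin[2] - cz
                a = math.radians(step.degrees) * t
                p = (cx + dx * math.cos(a) - dz * math.sin(a),
                     origin[1],
                     cz + dx * math.sin(a) + dz * math.cos(a))
            else:
                raise TypeError(f"unknown primitive: {step!r}")
            points.append(p)
    return points

# A two-step program: move forward along +z, then sweep a quarter circle around the origin.
program = [Move(direction=(0.0, 0.0, 1.0), speed=1.0, duration=1.0),
           Orbit(center=(0.0, 0.0, 0.0), degrees=90.0, duration=1.0)]
trajectory = run_program(program, start=(1.0, 0.0, 0.0))
```

In a pipeline of the kind the abstract describes, an LLM would emit the program and a fixed interpreter like this one would produce the trajectory, so the language model never has to output raw coordinates.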

Comments: CVPR 2026. Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)

Cite as: arXiv:2512.03619 [cs.CV]

(or arXiv:2512.03619v3 [cs.CV] for this version)

https://doi.org/10.48550/arXiv.2512.03619

arXiv-issued DOI via DataCite

Submission history

From: Muhammed Burak Kızıl
[v1] Wed, 3 Dec 2025 09:51:13 UTC (39,114 KB)
[v2] Mon, 8 Dec 2025 11:47:40 UTC (39,114 KB)
[v3] Sun, 29 Mar 2026 17:38:57 UTC (39,583 KB)

Original source

arXiv

https://arxiv.org/abs/2512.03619


