
VERTIGO: Visual Preference Optimization for Cinematic Camera Trajectory Generation

arXiv cs.CV · by Mengtian Li, Yuwei Lu, Feifei Li, Chenqi Gan, Zhifeng Xie, Xi Wang · April 6, 2026


Abstract: Cinematic camera control relies on a tight feedback loop between director and cinematographer, where camera motion and framing are continuously reviewed and refined. Recent generative camera systems can produce diverse, text-conditioned trajectories, but they lack this "director in the loop" and have no explicit supervision of whether a shot is visually desirable. This results in in-distribution camera motion but poor framing, off-screen characters, and undesirable visual aesthetics. In this paper, we introduce VERTIGO, the first framework for visual preference optimization of camera trajectory generators. Our framework leverages a real-time graphics engine (Unity) to render 2D visual previews from generated camera motion. A cinematically fine-tuned vision-language model then scores these previews using our proposed cyclic semantic similarity mechanism, which aligns renders with text prompts. This process provides the visual preference signals for Direct Preference Optimization (DPO) post-training. Both quantitative evaluations and user studies on Unity renders and diffusion-based Camera-to-Video pipelines show consistent gains in condition adherence, framing quality, and perceptual realism. Notably, VERTIGO reduces the character off-screen rate from 38% to nearly 0% while preserving the geometric fidelity of camera motion. User study participants further prefer VERTIGO over baselines across composition, consistency, prompt adherence, and aesthetic quality, confirming the perceptual benefits of our visual preference post-training.
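The abstract describes turning vision-language model scores into preference signals for DPO post-training, but does not say how pairs are formed or how trajectory likelihoods are computed. The sketch below illustrates the general idea under stated assumptions and should not be read as the authors' implementation. The helper `build_preference_pair` is hypothetical and simply picks the highest- and lowest-scored candidates for one prompt. `dpo_loss` is the standard per-pair DPO objective, taking sequence log-probabilities from the policy and a frozen reference model as given; `beta=0.1` is an illustrative default.

```python
import math
from typing import List, Tuple


def build_preference_pair(scores: List[float]) -> Tuple[int, int]:
    """Return (chosen_index, rejected_index) for one prompt.

    Assumption: the best-scored candidate is "chosen" and the worst-scored
    is "rejected". The abstract does not specify the pairing rule.
    """
    if len(scores) < 2:
        raise ValueError("need at least two scored candidates")
    chosen = max(range(len(scores)), key=scores.__getitem__)
    rejected = min(range(len(scores)), key=scores.__getitem__)
    if chosen == rejected:
        raise ValueError("all candidates tie; no preference signal")
    return chosen, rejected


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,
) -> float:
    """Standard DPO loss for a single preference pair.

    loss = -log(sigmoid(beta * ((pi_w - ref_w) - (pi_l - ref_l))))
    where each term is a log-probability of the whole generated sequence.
    """
    margin = beta * (
        (policy_chosen_logp - ref_chosen_logp)
        - (policy_rejected_logp - ref_rejected_logp)
    )
    # -log(sigmoid(m)) equals softplus(-m); this form avoids overflow.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# Usage: three rendered candidates for one prompt, scored by a (hypothetical) scorer.
chosen, rejected = build_preference_pair([0.2, 0.9, 0.5])
loss = dpo_loss(-1.0, -5.0, -2.0, -2.0)
```

The loss falls as the policy raises the likelihood of the chosen trajectory relative to the reference model and lowers that of the rejected one. When policy and reference agree exactly, it equals log 2.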

Comments: 28 pages, 10 figures, ECCV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)

Cite as: arXiv:2604.02467 [cs.CV]

(or arXiv:2604.02467v1 [cs.CV] for this version)

https://doi.org/10.48550/arXiv.2604.02467

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Yuwei Lu
[v1] Thu, 2 Apr 2026 18:58:56 UTC (32,475 KB)
