
Generation Is Compression: Zero-Shot Video Coding via Stochastic Rectified Flow

arXiv, March 30, 2026

Authors: Ziyue Zeng, Xun Su, Haoyuan Liu, Bingyu Lu, Yui Tatsumi, Hiroshi Watanabe


Abstract: Existing generative video compression methods use generative models only as post-hoc reconstruction modules atop conventional codecs. We propose Generative Video Codec (GVC), a zero-shot framework that turns a pretrained video generative model into the codec itself: the transmitted bitstream directly specifies the generative decoding trajectory, with no retraining required. To enable this, we convert the deterministic rectified-flow ODE of modern video foundation models into an equivalent SDE at inference time, unlocking per-step stochastic injection points for codebook-driven compression. Building on this unified backbone, we instantiate three complementary conditioning strategies: Image-to-Video (I2V) with adaptive tail-frame atom allocation, Text-to-Video (T2V) operating at near-zero side information as a pure generative prior, and First-Last-Frame-to-Video (FLF2V) with boundary-sharing GOP chaining for dual-anchor temporal control. Together, these variants span a principled trade-off space between spatial fidelity, temporal coherence, and compression efficiency. Experiments on standard benchmarks show that GVC achieves high-quality reconstruction below 0.002 bpp (bits per pixel) while supporting flexible bitrate control through a single hyperparameter.
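To give a sense of scale for the 0.002 bpp figure, the short sketch below converts a bits-per-pixel budget into a bitrate. It is only illustrative arithmetic: the 1280x720 resolution and 30 fps frame rate are assumptions chosen for the example and do not come from the paper, and the helper names are hypothetical.

```python
def bits_per_pixel(total_bits: float, width: int, height: int, frames: int) -> float:
    """Average number of bits spent per pixel over a whole clip."""
    return total_bits / (width * height * frames)


def bitrate_kbps(bpp: float, width: int, height: int, fps: float) -> float:
    """Bitrate in kilobits per second implied by a bits-per-pixel budget."""
    return bpp * width * height * fps / 1000.0


# Assumed clip parameters (not taken from the paper): 720p video at 30 fps.
WIDTH, HEIGHT, FPS = 1280, 720, 30.0

# The abstract's 0.002 bpp operating point under these assumptions.
kbps = bitrate_kbps(0.002, WIDTH, HEIGHT, FPS)
print(f"0.002 bpp at {WIDTH}x{HEIGHT}, {FPS:.0f} fps is about {kbps:.1f} kbps")
```

Under these assumed parameters the budget works out to roughly 55 kbps, which is why operating points in this range are usually described as ultra-low bitrate.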

Comments: 9 pages, 3 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)

Cite as: arXiv:2603.26571 [cs.CV] (or arXiv:2603.26571v1 [cs.CV] for this version)

https://doi.org/10.48550/arXiv.2603.26571

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Ziyue Zeng
[v1] Fri, 27 Mar 2026 16:33:20 UTC (4,543 KB)
