
$R_{dm}$: Re-conceptualizing Distribution Matching as a Reward for Diffusion Distillation

arXiv, March 31, 2026

arXiv:2603.28460v1 Announce Type: cross

Authors: Linqian Fan, Peiqin Sun, Tiancheng Wen, Shun Lu, Chengru Song


Abstract: Diffusion models achieve state-of-the-art generative performance but are fundamentally bottlenecked by their slow iterative sampling process. While diffusion distillation techniques enable high-fidelity few-step generation, traditional objectives often restrict the student's performance by anchoring it solely to the teacher. Recent approaches have attempted to break this ceiling by integrating Reinforcement Learning (RL), typically through a simple summation of distillation and RL objectives. In this work, we propose a novel paradigm by reconceptualizing distribution matching as a reward, denoted as $R_{dm}$. This unified perspective bridges the algorithmic gap between Distribution Matching Distillation (DMD) and RL, providing several key benefits. (1) Enhanced optimization stability: we introduce Group Normalized Distribution Matching (GNDM), which adapts standard RL group normalization to stabilize $R_{dm}$ estimation. By leveraging group-mean statistics, GNDM establishes a more robust and effective optimization direction. (2) Seamless reward integration: our reward-centric formulation inherently supports adaptive weighting mechanisms, allowing flexible combination of DMD with external reward models. (3) Improved sampling efficiency: by aligning with RL principles, the framework readily incorporates importance sampling (IS), leading to a significant boost in sampling efficiency. Extensive experiments demonstrate that GNDM outperforms vanilla DMD, reducing the FID by 1.87. Furthermore, our multi-reward variant, GNDMR, surpasses existing baselines by achieving a strong balance between aesthetic quality and fidelity, reaching a peak HPS of 30.37 and a low FID-SD of 12.21. Overall, $R_{dm}$ provides a flexible, stable, and efficient framework for real-time high-fidelity synthesis. Code will be released upon publication.
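To make the "standard RL group normalization" mentioned in the abstract concrete, here is a minimal Python sketch of the generic technique: rewards for a group of samples drawn from the same prompt are centered by the group mean and scaled by the group standard deviation. This is not the paper's GNDM implementation, which has not been released. The function names `group_normalize` and `combine_rewards`, the `eps` constant, and the fixed weights `w_dm` and `w_ext` are all hypothetical and chosen for illustration; the fixed weights in particular are only a stand-in for the adaptive weighting the abstract refers to, whose details are not given there.

```python
from statistics import fmean, pstdev
from typing import List, Sequence


def group_normalize(rewards: Sequence[float], eps: float = 1e-8) -> List[float]:
    """Center rewards by the group mean and scale by the group std.

    Generic RL-style group normalization; `eps` (an assumption) guards
    against division by zero when every reward in the group is equal.
    """
    mean = fmean(rewards)
    std = pstdev(rewards)  # population standard deviation over the group
    return [(r - mean) / (std + eps) for r in rewards]


def combine_rewards(
    r_dm: Sequence[float],
    r_ext: Sequence[float],
    w_dm: float = 1.0,
    w_ext: float = 1.0,
) -> List[float]:
    """Weighted sum of two group-normalized reward signals.

    `r_dm` stands for per-sample distribution-matching rewards and `r_ext`
    for scores from an external reward model. The fixed weights are a
    hypothetical placeholder, not the paper's adaptive scheme.
    """
    if len(r_dm) != len(r_ext):
        raise ValueError("both reward lists must cover the same group")
    a_dm = group_normalize(r_dm)
    a_ext = group_normalize(r_ext)
    return [w_dm * a + w_ext * b for a, b in zip(a_dm, a_ext)]


if __name__ == "__main__":
    # Toy group of four samples generated for one prompt (made-up values).
    dm_rewards = [0.2, 0.5, 0.1, 0.9]
    ext_rewards = [3.0, 2.5, 2.0, 4.0]
    print(group_normalize(dm_rewards))
    print(combine_rewards(dm_rewards, ext_rewards, w_dm=1.0, w_ext=0.5))
```

Normalizing each signal within its group before summing puts rewards with very different scales on a comparable footing, which is one plausible reason a reward-centric view makes combination with external reward models straightforward.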

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Cite as: arXiv:2603.28460 [cs.CV] (or arXiv:2603.28460v1 [cs.CV] for this version)

https://doi.org/10.48550/arXiv.2603.28460

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Linqian Fan
[v1] Mon, 30 Mar 2026 14:01:31 UTC (9,573 KB)
