
Manifold Generalization Provably Proceeds Memorization in Diffusion Models

arXiv, March 24, 2026

Zebang Shen, Ya-Ping Hsieh, Niao He


Abstract: Diffusion models often generate novel samples even when the learned score is only \emph{coarse} -- a phenomenon not accounted for by the standard view of diffusion training as density estimation. In this paper, we show that, under the \emph{manifold hypothesis}, this behavior can instead be explained by coarse scores capturing the \emph{geometry} of the data while discarding the fine-scale distributional structure of the population measure~$\mu_{\scriptscriptstyle\mathrm{data}}$. Concretely, whereas estimating the full data distribution $\mu_{\scriptscriptstyle\mathrm{data}}$ supported on a $k$-dimensional manifold is known to require the classical minimax rate $\tilde{\mathcal{O}}(N^{-1/k})$, we prove that diffusion models trained with coarse scores can exploit the \emph{regularity of the manifold support} and attain a near-parametric rate toward a \emph{different} target distribution. This target distribution has density uniformly comparable to that of~$\mu_{\scriptscriptstyle\mathrm{data}}$ throughout any $\tilde{\mathcal{O}}\bigl(N^{-\beta/(4k)}\bigr)$-neighborhood of the manifold, where $\beta$ denotes the manifold regularity. Our guarantees therefore depend only on the smoothness of the underlying support, and are especially favorable when the data density itself is irregular, for instance non-differentiable. In particular, when the manifold is sufficiently smooth, we obtain that \emph{generalization} -- formalized as the ability to generate novel, high-fidelity samples -- occurs at a statistical rate strictly faster than that required to estimate the full population distribution~$\mu_{\scriptscriptstyle\mathrm{data}}$.
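To get a feel for the gap between the two rates quoted in the abstract, the sketch below evaluates them numerically. It is only an illustration, not the authors' analysis: it drops all constants and logarithmic factors, and it assumes that "near-parametric" can be read as $N^{-1/2}$, an exponent the abstract does not state. The function names and the values of $N$, $k$ and $\beta$ are hypothetical, and the two rates may be measured in different metrics in the paper, so only the exponents are being compared.

```python
import math


def minimax_rate(n: int, k: int) -> float:
    """Classical rate N^(-1/k) from the abstract (log factors and constants dropped)."""
    return n ** (-1.0 / k)


def parametric_rate(n: int) -> float:
    """ASSUMPTION: 'near-parametric' is read as N^(-1/2); the abstract gives no exponent."""
    return n ** -0.5


def neighborhood_radius(n: int, k: int, beta: float) -> float:
    """Neighborhood scale N^(-beta/(4k)) from the abstract (log factors dropped)."""
    return n ** (-beta / (4.0 * k))


def samples_needed(target: float, exponent: float) -> float:
    """Solve N^(-exponent) = target for N, i.e. N = target^(-1/exponent)."""
    return target ** (-1.0 / exponent)


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    n, k, beta = 10**6, 10, 2.0
    print(f"N^(-1/k)        = {minimax_rate(n, k):.4f}")
    print(f"N^(-1/2)        = {parametric_rate(n):.4f}")
    print(f"N^(-beta/(4k))  = {neighborhood_radius(n, k, beta):.4f}")
    print(f"N for 0.1 at exponent 1/k: {samples_needed(0.1, 1.0 / k):.3g}")
    print(f"N for 0.1 at exponent 1/2: {samples_needed(0.1, 0.5):.3g}")
```

Under these assumptions, reaching a target of 0.1 at exponent $1/k$ with $k = 10$ takes on the order of $10^{10}$ samples, against roughly $10^{2}$ at exponent $1/2$, which is the sense in which a rate tied to the manifold's regularity rather than to full density estimation is far less sample-hungry.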

Comments: The first two authors contributed equally

Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

Cite as: arXiv:2603.23792 [cs.LG]

(or arXiv:2603.23792v1 [cs.LG] for this version)

https://doi.org/10.48550/arXiv.2603.23792

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Zebang Shen
[v1] Tue, 24 Mar 2026 23:50:09 UTC (708 KB)

Original source

arXiv

https://arxiv.org/abs/2603.23792v1
