DADP: Domain Adaptive Diffusion Policy

arXiv, March 31, 2026. Submitted on 3 Feb 2026 (v1), last revised 30 Mar 2026 (this version, v2).

Authors: Pengcheng Wang, Qinghang Liu, Haotian Lin, Yiheng Li, Guojian Zhan, Masayoshi Tomizuka, Yixiao Wang

Abstract: Learning domain-adaptive policies that generalize to unseen transition dynamics remains a fundamental challenge in learning-based control. Substantial progress has been made through domain representation learning, which captures domain-specific information and thus enables domain-aware decision making. We analyze the process of learning domain representations through dynamical prediction and find that selecting contexts adjacent to the current step causes the learned representations to entangle static domain information with varying dynamical properties. Such a mixture can confuse the conditioned policy, thereby constraining zero-shot adaptation. To tackle this challenge, we propose DADP (Domain Adaptive Diffusion Policy), which achieves robust adaptation through unsupervised disentanglement and domain-aware diffusion injection. First, we introduce Lagged Context Dynamical Prediction, a strategy that conditions future state estimation on a historical offset context; by increasing this temporal gap, we disentangle static domain representations without supervision by filtering out transient properties. Second, we integrate the learned domain representations directly into the generative process by biasing the prior distribution and reformulating the diffusion target. Extensive experiments on challenging benchmarks across locomotion and manipulation demonstrate the superior performance and generalizability of DADP over prior methods. More visualization results are available at the link given in the arXiv listing.
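The abstract describes Lagged Context Dynamical Prediction only at a high level, so the following is a minimal sketch of one way the sampling step could look, not the authors' implementation. It assumes a trajectory stored as a list of (state, action, next_state) tuples and shows how a context window can be taken from a block of steps that ends `lag` steps before the current step `t`. The function name `lagged_context_sample` and the parameters `context_len` and `lag` are hypothetical; with `lag=0` the context is adjacent to the current step, which is the setting the abstract says leads to entangled representations.

```python
from typing import List, Sequence, Tuple

# One transition: (state, action, next_state). This layout is an assumption
# made for illustration; the paper's data format is not given in the abstract.
Transition = Tuple[Sequence[float], Sequence[float], Sequence[float]]


def lagged_context_sample(
    trajectory: List[Transition],
    t: int,
    context_len: int,
    lag: int,
) -> Tuple[List[Transition], Tuple[Sequence[float], Sequence[float]], Sequence[float]]:
    """Return (context, query, target) for dynamical prediction at step t.

    context: `context_len` transitions ending `lag` steps before step t.
    query:   the (state, action) pair at step t.
    target:  the next state at step t, which a predictor would estimate
             from the query and an encoding of the context.
    """
    if context_len <= 0:
        raise ValueError("context_len must be positive")
    if lag < 0:
        raise ValueError("lag must be non-negative")

    end = t - lag              # exclusive end of the context window
    start = end - context_len  # inclusive start of the context window
    if start < 0 or t >= len(trajectory):
        raise IndexError("trajectory too short for this t, context_len, and lag")

    context = trajectory[start:end]
    state, action, next_state = trajectory[t]
    return context, (state, action), next_state


if __name__ == "__main__":
    # Toy trajectory: the state at step i is [i], the action is always [0.0].
    toy = [([float(i)], [0.0], [float(i + 1)]) for i in range(10)]

    adjacent, _, _ = lagged_context_sample(toy, t=8, context_len=3, lag=0)
    lagged, _, _ = lagged_context_sample(toy, t=8, context_len=3, lag=4)
    print("adjacent context states:", [s for s, _, _ in adjacent])
    print("lagged context states:  ", [s for s, _, _ in lagged])
```

Under this reading, increasing `lag` moves the context further from the predicted step, so only information that stays constant over the gap (the static domain properties) remains useful for prediction. How the context is encoded and how large the gap should be are not specified in the abstract.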

Subjects: Machine Learning (cs.LG); Robotics (cs.RO)

Cite as: arXiv:2602.04037 [cs.LG] (or arXiv:2602.04037v2 [cs.LG] for this version)

DOI: https://doi.org/10.48550/arXiv.2602.04037 (arXiv-issued DOI via DataCite)

Submission history

From: Pengcheng Wang
[v1] Tue, 3 Feb 2026 22:04:46 UTC (11,618 KB)
[v2] Mon, 30 Mar 2026 06:42:29 UTC (11,619 KB)

Original source: arXiv, https://arxiv.org/abs/2602.04037
