
ABot-PhysWorld: Interactive World Foundation Model for Robotic Manipulation with Physics Alignment

arXiv, March 30, 2026


Authors: Yuzhi Chen, Ronghan Chen, Dongjie Huo, Yandan Yang, Dekang Qi, Haoyun Liu, Tong Lin, Shuang Zeng, Junjin Xiao, Xinyuan Chang, Feng Xiong, Xing Wei, Zhiheng Ma, Mu Xu


Abstract: Video-based world models offer a powerful paradigm for embodied simulation and planning, yet state-of-the-art models often generate physically implausible manipulations (such as object penetration and anti-gravity motion) due to training on generic visual data and likelihood-based objectives that ignore physical laws. We present ABot-PhysWorld, a 14B Diffusion Transformer model that generates visually realistic, physically plausible, and action-controllable videos. Built on a curated dataset of three million manipulation clips with physics-aware annotation, it uses a novel DPO-based post-training framework with decoupled discriminators to suppress unphysical behaviors while preserving visual quality. A parallel context block enables precise spatial action injection for cross-embodiment control. To better evaluate generalization, we introduce EZSbench, the first training-independent embodied zero-shot benchmark combining real and synthetic unseen robot-task-scene combinations. It employs a decoupled protocol to separately assess physical realism and action alignment. ABot-PhysWorld achieves new state-of-the-art performance on PBench and EZSbench, surpassing Veo 3.1 and Sora v2 Pro in physical plausibility and trajectory consistency. We will release EZSbench to promote standardized evaluation in embodied video generation.
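For readers unfamiliar with the term, the abstract's "DPO-based post-training" refers to Direct Preference Optimization. The abstract does not give the paper's actual objective, its diffusion-specific formulation, or how the decoupled discriminators produce preference pairs, so the following is only a minimal sketch of the generic, textbook DPO loss for a single preference pair. The function name, argument names, and the default `beta` are illustrative assumptions, not taken from the paper.

```python
import math


def dpo_loss(policy_logp_preferred, policy_logp_rejected,
             ref_logp_preferred, ref_logp_rejected, beta=0.1):
    """Generic DPO loss for one (preferred, rejected) pair.

    Hypothetical helper for illustration only: the inputs are scalar
    log-likelihoods under the trained policy and a frozen reference model.
    In a setting like the abstract's, "preferred" might be a physically
    plausible clip and "rejected" an implausible one (an assumption).
    """
    # How much more the policy favors each sample than the reference does.
    preferred_gain = policy_logp_preferred - ref_logp_preferred
    rejected_gain = policy_logp_rejected - ref_logp_rejected
    margin = beta * (preferred_gain - rejected_gain)

    # loss = -log(sigmoid(margin)), written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

The loss equals log 2 when the policy matches the reference and falls below that as the policy shifts probability toward the preferred sample relative to the rejected one.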

Comments: Code: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)

Cite as: arXiv:2603.23376 [cs.CV]

(or arXiv:2603.23376v2 [cs.CV] for this version)

https://doi.org/10.48550/arXiv.2603.23376

arXiv-issued DOI via DataCite

Submission history

From: Yuzhi Chen
[v1] Tue, 24 Mar 2026 16:07:09 UTC (13,213 KB)
[v2] Fri, 27 Mar 2026 09:50:16 UTC (36,900 KB)
