
Object-Centric World Models for Causality-Aware Reinforcement Learning

arXiv, March 31, 2026

arXiv:2511.14262v3, announce type: replace-cross
Authors: Yosuke Nishimoto, Takashi Matsubara


Abstract: World models have been developed to support sample-efficient deep reinforcement learning agents. However, it remains challenging for world models to accurately replicate environments that are high-dimensional, non-stationary, and composed of multiple objects with rich interactions, since most world models learn holistic representations of all environmental components. By contrast, humans perceive the environment by decomposing it into discrete objects, facilitating efficient decision-making. Motivated by this insight, we propose Slot Transformer Imagination with CAusality-aware reinforcement learning (STICA), a unified framework in which object-centric Transformers serve as the world model and as the causality-aware policy and value networks. STICA represents each observation as a set of object-centric tokens, together with tokens for the agent action and the resulting reward, enabling the world model to predict token-level dynamics and interactions. The policy and value networks then estimate token-level cause-effect relations and use them in the attention layers, yielding causality-guided decision-making. Experiments on object-rich benchmarks demonstrate that STICA consistently outperforms state-of-the-art agents in both sample efficiency and final performance.
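The abstract describes two ideas that a small example can make concrete: an observation is represented as object tokens plus an action token and a reward token, and attention is guided by estimated cause-effect relations between tokens. The sketch below is illustrative only and is not the authors' implementation. The function names (`build_tokens`, `causal_attention`), the binary mask, and the lack of learned projections are all assumptions made for brevity; the abstract does not say how STICA estimates the relations or how it injects them into the attention layers.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats (-inf entries get weight 0)."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def build_tokens(slots, action, reward):
    """Hypothetical helper: object slots followed by one action and one reward token."""
    return list(slots) + [action, reward]


def causal_attention(tokens, causal_mask):
    """Single-head dot-product attention restricted by a binary cause-effect mask.

    causal_mask[i][j] == 1 means token j is treated as a cause of token i,
    so token i may attend to token j. Each row must allow at least one token.
    Queries, keys, and values are the raw tokens (an assumption: a real model
    would use learned projections and learned relation estimates).
    """
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    outputs = []
    for i, query in enumerate(tokens):
        logits = []
        for j, key in enumerate(tokens):
            score = scale * sum(q * k for q, k in zip(query, key))
            logits.append(score if causal_mask[i][j] else float("-inf"))
        weights = softmax(logits)
        outputs.append(
            [sum(w * tok[d] for w, tok in zip(weights, tokens)) for d in range(dim)]
        )
    return outputs


# Two object slots, one action token, one reward token (toy 2-D embeddings).
slots = [[1.0, 0.0], [0.0, 1.0]]
action = [1.0, 1.0]
reward = [0.0, 0.0]
tokens = build_tokens(slots, action, reward)

# Assumed relations: every token depends on itself, slot 0 is affected by the
# action (index 2), and the reward (index 3) is affected by slot 0.
mask = [
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 1],
]
out = causal_attention(tokens, mask)
```

In this toy setup, slot 1 has no assumed causes other than itself, so its output equals its input, while the reward token mixes in information from slot 0 only. A soft variant would add a relation score to the attention logits instead of masking them.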

Comments: Accepted by AAAI-26. Code is available at this https URL

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)

Cite as: arXiv:2511.14262 [cs.LG]

(or arXiv:2511.14262v3 [cs.LG] for this version)

https://doi.org/10.48550/arXiv.2511.14262

arXiv-issued DOI via DataCite

Submission history

From: Yosuke Nishimoto
[v1] Tue, 18 Nov 2025 08:53:09 UTC (6,367 KB)
[v2] Thu, 25 Dec 2025 07:22:11 UTC (6,533 KB)
[v3] Mon, 30 Mar 2026 07:20:18 UTC (6,533 KB)
