Object-Centric World Models for Causality-Aware Reinforcement Learning
Authors: Yosuke Nishimoto, Takashi Matsubara
Abstract: World models have been developed to support sample-efficient deep reinforcement learning agents. However, it remains challenging for world models to accurately replicate environments that are high-dimensional, non-stationary, and composed of multiple objects with rich interactions, since most world models learn holistic representations of all environmental components. By contrast, humans perceive the environment by decomposing it into discrete objects, facilitating efficient decision-making. Motivated by this insight, we propose Slot Transformer Imagination with CAusality-aware reinforcement learning (STICA), a unified framework in which object-centric Transformers serve as the world model and as the causality-aware policy and value networks. STICA represents each observation as a set of object-centric tokens, together with tokens for the agent action and the resulting reward, enabling the world model to predict token-level dynamics and interactions. The policy and value networks then estimate token-level cause-effect relations and use them in the attention layers, yielding causality-guided decision-making. Experiments on object-rich benchmarks demonstrate that STICA consistently outperforms state-of-the-art agents in both sample efficiency and final performance.
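The abstract says the policy and value networks use estimated token-level cause-effect relations inside their attention layers. As a rough illustration only, the sketch below shows one way such relations could gate attention: a boolean matrix restricts which tokens each token may attend to. This is not the authors' implementation. The function name `masked_attention`, the token layout (two object slots, one action token, one reward token), and the hand-written `causal_mask` are all assumptions made for this example; in STICA the relations are estimated by the networks rather than fixed by hand.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a non-empty list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def masked_attention(
    queries: List[Vector],
    keys: List[Vector],
    values: List[Vector],
    causal_mask: List[List[bool]],
) -> List[Vector]:
    """Scaled dot-product attention restricted by a boolean mask.

    Token i attends to token j only if causal_mask[i][j] is True.
    The mask is a hypothetical stand-in for estimated cause-effect relations.
    """
    d = len(keys[0])
    outputs = []
    for i, q in enumerate(queries):
        allowed = [j for j in range(len(keys)) if causal_mask[i][j]]
        if not allowed:
            raise ValueError(f"token {i} has no allowed source tokens")
        scores = [
            sum(a * b for a, b in zip(q, keys[j])) / math.sqrt(d)
            for j in allowed
        ]
        weights = softmax(scores)
        # Weighted sum of the value vectors of the allowed tokens only.
        out = [
            sum(w * values[j][c] for w, j in zip(weights, allowed))
            for c in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs


# Hypothetical token set: [slot_0, slot_1, action, reward], each a 2-d vector.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]

# Hand-written relations (assumed): every token depends on itself,
# slot_0 is also affected by the action, and the reward depends on slot_0.
causal_mask = [
    [True, False, True, False],   # slot_0 <- slot_0, action
    [False, True, False, False],  # slot_1 <- slot_1
    [False, False, True, False],  # action <- action
    [True, False, False, True],   # reward <- slot_0, reward
]

out = masked_attention(tokens, tokens, tokens, causal_mask)
```

In this toy setup, `slot_1` has no cause other than itself, so its output equals its own value vector, while `slot_0` becomes a mixture of itself and the action token. A learned model would typically use soft or sampled relations and separate query, key, and value projections, which are omitted here for brevity.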
Comments: Accepted by AAAI-26. Code is available at this https URL
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as: arXiv:2511.14262 [cs.LG]
(or arXiv:2511.14262v3 [cs.LG] for this version)
https://doi.org/10.48550/arXiv.2511.14262
arXiv-issued DOI via DataCite
Submission history
From: Yosuke Nishimoto
[v1] Tue, 18 Nov 2025 08:53:09 UTC (6,367 KB)
[v2] Thu, 25 Dec 2025 07:22:11 UTC (6,533 KB)
[v3] Mon, 30 Mar 2026 07:20:18 UTC (6,533 KB)