OccSim: Multi-kilometer Simulation with Long-horizon Occupancy World Models
Abstract: Data-driven autonomous driving simulation has long been constrained by its heavy reliance on pre-recorded driving logs or spatial priors, such as HD maps. This dependency severely limits scalability, restricting open-ended generation to the finite scale of existing collected datasets. To break this bottleneck, we present OccSim, the first occupancy world model-driven 3D simulator. OccSim removes the requirement for continuous logs or HD maps: conditioned only on a single initial frame and a sequence of future ego-actions, it can stably generate over 3,000 continuous frames, enabling the continuous construction of large-scale 3D occupancy maps spanning over 4 kilometers for simulation. This represents a more than 80x improvement in stable generation length over previous state-of-the-art occupancy world models. OccSim is powered by two modules: a W-DiT-based static occupancy world model and a Layout Generator. W-DiT handles the ultra-long-horizon generation of static environments by explicitly introducing known rigid transformations into the architecture design, while the Layout Generator populates the dynamic foreground with reactive agents based on the synthesized road topology. With these designs, OccSim can synthesize massive, diverse simulation streams. Extensive experiments demonstrate its downstream utility: data collected directly from OccSim can pre-train 4D semantic occupancy forecasting models to achieve up to 67% zero-shot performance on unseen data, outperforming the previous asset-based simulator by 11%. When the OccSim dataset is scaled to 5x its size, zero-shot performance increases to about 74%, and the improvement over asset-based simulators expands to 22.1%.
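The abstract states that generation is conditioned on a sequence of ego-actions and that W-DiT relies on known rigid transformations. As a rough illustration only, and not the authors' architecture, the sketch below shows how per-frame ego motions can be composed as planar rigid (SE(2)) transforms into a world-frame trajectory. The function names, the (dx, dy, dyaw) action format, and the constant 4/3 m step (4 km divided evenly over 3,000 frames) are all assumptions made for this example.

```python
import math

def se2(dx, dy, dyaw):
    """Return a 3x3 homogeneous matrix for a planar rigid motion (hypothetical helper)."""
    c, s = math.cos(dyaw), math.sin(dyaw)
    return [[c, -s, dx], [s, c, dy], [0.0, 0.0, 1.0]]

def compose(a, b):
    """Matrix product a @ b for 3x3 nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rollout(actions):
    """Accumulate ego-frame motions (dx, dy, dyaw) into world-frame poses."""
    pose = se2(0.0, 0.0, 0.0)  # start at the origin, facing +x
    poses = [pose]
    for dx, dy, dyaw in actions:
        # Right-multiplying applies each motion in the current ego frame.
        pose = compose(pose, se2(dx, dy, dyaw))
        poses.append(pose)
    return poses

def path_length(poses):
    """Sum the straight-line distance between consecutive pose positions."""
    total = 0.0
    for p, q in zip(poses, poses[1:]):
        total += math.hypot(q[0][2] - p[0][2], q[1][2] - p[1][2])
    return total

# Assumed input: 3,000 frames, each moving 4/3 m straight ahead.
actions = [(4.0 / 3.0, 0.0, 0.0)] * 3000
poses = rollout(actions)
print(len(poses) - 1, "frames,", round(path_length(poses)), "m travelled")
```

Under these assumed numbers, an average ego displacement of roughly 1.3 m per frame is enough to cover 4 km in 3,000 frames; the actual step size and frame rate used by OccSim are not given in the abstract.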
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
Cite as: arXiv:2603.28887 [cs.CV]
(or arXiv:2603.28887v1 [cs.CV] for this version)
https://doi.org/10.48550/arXiv.2603.28887
arXiv-issued DOI via DataCite (pending registration)
Submission history
From: Tianran Liu
[v1] Mon, 30 Mar 2026 18:13:51 UTC (22,936 KB)