Efficient Camera Pose Augmentation for View Generalization in Robotic Policy Learning

arXiv cs.RO [Submitted on 31 Mar 2026]

Abstract: Prevailing 2D-centric visuomotor policies exhibit a pronounced deficiency in novel view generalization, as their reliance on static observations hinders consistent action mapping across unseen views. In response, we introduce GenSplat, a feed-forward 3D Gaussian Splatting framework that facilitates view-generalized policy learning through novel view rendering. GenSplat employs a permutation-equivariant architecture to reconstruct high-fidelity 3D scenes from sparse, uncalibrated inputs in a single forward pass. To ensure structural integrity, we design a 3D-prior distillation strategy that regularizes the 3DGS optimization, preventing the geometric collapse typical of purely photometric supervision. By rendering diverse synthetic views from these stable 3D representations, we systematically augment the observational manifold during training. This augmentation forces the policy to ground its decisions in underlying 3D structures, thereby ensuring robust execution under severe spatial perturbations where baselines severely degrade.
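The abstract does not say how the synthetic viewpoints are chosen, so the following is only a minimal sketch of one common way to do camera pose augmentation, not GenSplat's actual procedure. It assumes a nominal camera position and a fixed look-at target in the workspace, and jitters the camera's azimuth, elevation, and distance around that target. The function names (`look_at`, `sample_augmented_poses`), the jitter ranges, and the OpenGL-style axis convention are all hypothetical choices made for illustration.

```python
import math
import random


def look_at(eye, target, up=(0.0, 0.0, 1.0)):
    """Build a 4x4 camera-to-world matrix for a camera at `eye` facing `target`.

    Assumed convention (not from the paper): columns are right, up, and
    -forward, so the camera looks down its local -z axis.
    """
    def cross(a, b):
        return [a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0]]

    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v]

    forward = normalize([target[i] - eye[i] for i in range(3)])
    right = normalize(cross(forward, up))
    true_up = cross(right, forward)  # already unit length
    return [
        [right[0], true_up[0], -forward[0], eye[0]],
        [right[1], true_up[1], -forward[1], eye[1]],
        [right[2], true_up[2], -forward[2], eye[2]],
        [0.0, 0.0, 0.0, 1.0],
    ]


def sample_augmented_poses(eye, target, n_views,
                           max_azimuth_deg=30.0,
                           max_elevation_deg=15.0,
                           max_radius_scale=0.1,
                           seed=None):
    """Sample `n_views` camera poses perturbed around a nominal viewpoint.

    The jitter ranges are illustrative defaults, not values from the paper.
    """
    rng = random.Random(seed)
    offset = [eye[i] - target[i] for i in range(3)]
    radius = math.sqrt(sum(x * x for x in offset))
    azimuth = math.atan2(offset[1], offset[0])
    elevation = math.asin(max(-1.0, min(1.0, offset[2] / radius)))
    limit = math.radians(85.0)  # stay away from the poles, where look_at degenerates

    poses = []
    for _ in range(n_views):
        az = azimuth + math.radians(rng.uniform(-max_azimuth_deg, max_azimuth_deg))
        el = elevation + math.radians(rng.uniform(-max_elevation_deg, max_elevation_deg))
        el = max(-limit, min(limit, el))
        r = radius * (1.0 + rng.uniform(-max_radius_scale, max_radius_scale))
        new_eye = [
            target[0] + r * math.cos(el) * math.cos(az),
            target[1] + r * math.cos(el) * math.sin(az),
            target[2] + r * math.sin(el),
        ]
        poses.append(look_at(new_eye, target))
    return poses


if __name__ == "__main__":
    views = sample_augmented_poses((1.0, 0.0, 0.5), (0.0, 0.0, 0.0), n_views=4, seed=0)
    for pose in views:
        print([round(row[3], 3) for row in pose[:3]])  # sampled camera position
```

In a pipeline of the kind the abstract describes, each sampled pose would presumably be handed to the renderer of the reconstructed 3D scene, and the resulting image paired with the original demonstration's action. That pairing only makes sense if actions are expressed in a camera-independent frame, which is an assumption here rather than something the abstract states.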

Subjects: Robotics (cs.RO)

Cite as: arXiv:2603.29192 [cs.RO]

(or arXiv:2603.29192v1 [cs.RO] for this version)

https://doi.org/10.48550/arXiv.2603.29192

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Sen Wang
[v1] Tue, 31 Mar 2026 02:56:10 UTC (4,572 KB)

Original source

arXiv cs.RO

https://arxiv.org/abs/2603.29192
