BAT: Balancing Agility and Stability via Online Policy Switching for Long-Horizon Whole-Body Humanoid Control

arXiv cs.RO | by Donghoon Baek, Sang-Hun Kim, Sehoon Ha | April 2, 2026



Abstract: Despite recent advances in control, reinforcement learning, and imitation learning, developing a unified framework that can achieve agile, precise, and robust whole-body behaviors, particularly in long-horizon tasks, remains challenging. Existing approaches typically follow two paradigms: coupled whole-body policies for global coordination and decoupled policies for modular precision. However, without a systematic method to integrate both, the trade-off between agility, robustness, and precision remains unresolved. In this work, we propose BAT, an online policy-switching framework that dynamically selects between two complementary whole-body RL controllers to balance agility and stability across different motion contexts. Our framework consists of two complementary modules: a switching policy learned via hierarchical RL with expert guidance from sliding-horizon policy pre-evaluation, and an option-aware VQ-VAE that predicts option preference from discrete motion token sequences for improved generalization. The final decision is obtained via confidence-weighted fusion of the two modules. Extensive simulations and real-world experiments on the Unitree G1 humanoid robot demonstrate that BAT enables versatile long-horizon loco-manipulation and outperforms prior methods across diverse tasks.
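The abstract does not spell out how the confidence-weighted fusion is computed, so the following is only an illustrative sketch of one way such a fusion could work, not the authors' method. It assumes each module outputs a probability distribution over the two controllers (options), and it assumes a hypothetical confidence score of one minus normalized entropy. The function names `confidence`, `fuse`, and `select_option` are invented for this example.

```python
import math


def confidence(probs):
    """Hypothetical confidence: 1 - normalized entropy (1 = certain, 0 = uniform)."""
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    # Clamp to guard against tiny negative values from floating-point error.
    return max(0.0, 1.0 - entropy / math.log(len(probs)))


def fuse(p_switch, p_vqvae, eps=1e-8):
    """Blend two option-preference distributions, weighted by their confidences.

    p_switch: preference from the switching policy (assumed to be a distribution)
    p_vqvae:  preference from the option-aware VQ-VAE (assumed to be a distribution)
    """
    c_a, c_b = confidence(p_switch), confidence(p_vqvae)
    # eps keeps the weight defined when both modules are maximally uncertain.
    w_a = (c_a + eps) / (c_a + c_b + 2.0 * eps)
    return [w_a * a + (1.0 - w_a) * b for a, b in zip(p_switch, p_vqvae)]


def select_option(p_switch, p_vqvae):
    """Return the index of the controller with the highest fused preference."""
    fused = fuse(p_switch, p_vqvae)
    return max(range(len(fused)), key=fused.__getitem__)


if __name__ == "__main__":
    # Index 0 and index 1 stand for the two whole-body controllers (labels assumed).
    print(select_option([0.9, 0.1], [0.4, 0.6]))
```

In this sketch a module that is close to uniform contributes almost nothing, so the more decisive module dominates the final choice. Other confidence measures (for example, the maximum probability or a learned score) would slot into `confidence` without changing the rest.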

Subjects: Robotics (cs.RO)

Cite as: arXiv:2604.01064 [cs.RO]

(or arXiv:2604.01064v1 [cs.RO] for this version)

https://doi.org/10.48550/arXiv.2604.01064

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Donghoon Baek
[v1] Wed, 1 Apr 2026 16:03:27 UTC (4,124 KB)
