
COMPASS-Hedge: Learning Safely Without Knowing the World

arXiv, March 30, 2026

arXiv:2603.22348v2 (announce type: replace)

Authors: Ting Hu, Luanda Cai, Manolis Vlatakis


Abstract: Online learning algorithms often face a fundamental trilemma: balancing regret guarantees between adversarial and stochastic settings while also providing baseline safety against a fixed comparator. While existing methods excel in one or two of these regimes, they typically fail to unify all three without sacrificing optimal rates or requiring oracle access to problem-dependent parameters. In this work, we bridge this gap by introducing COMPASS-Hedge. Our algorithm is the first full-information method to simultaneously achieve: i) minimax-optimal regret in adversarial environments; ii) instance-optimal, gap-dependent regret in stochastic environments; and iii) $\tilde{\mathcal{O}}(1)$ regret relative to a designated baseline policy, up to logarithmic factors. Crucially, COMPASS-Hedge is parameter-free and requires no prior knowledge of the environment's nature or the magnitude of the stochastic suboptimality gaps. Our approach hinges on a novel integration of adaptive pseudo-regret scaling and phase-based aggression, coupled with a comparator-aware mixing strategy. To the best of our knowledge, this provides the first "best-of-three-world" guarantee in the full-information setting, establishing that baseline safety does not have to come at the cost of worst-case robustness or stochastic efficiency.
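The abstract does not spell out the COMPASS-Hedge update, so the following is background only: a minimal sketch of classic Hedge (exponential weights) in the full-information setting, which is the family the algorithm's name points to. It is not the authors' method. The fixed learning rate `eta` is an assumption made for illustration and requires knowing the horizon `T`, so unlike COMPASS-Hedge this sketch is not parameter-free. The function names (`hedge`, `regret_vs_best`, `regret_vs_baseline`) and the demo's loss means are hypothetical. The two regret helpers correspond to the two kinds of comparator the abstract mentions: the best expert in hindsight and a designated baseline.

```python
import math
import random


def hedge(loss_rounds, num_experts, eta):
    """Classic Hedge with a fixed learning rate (illustrative, not COMPASS-Hedge).

    loss_rounds: iterable of loss vectors, one per round, entries in [0, 1].
    Returns (learner's cumulative expected loss, per-expert cumulative losses).
    """
    log_weights = [0.0] * num_experts  # log-domain weights for numerical stability
    expert_totals = [0.0] * num_experts
    learner_total = 0.0
    for losses in loss_rounds:
        # Turn the weights into a probability distribution over experts.
        top = max(log_weights)
        unnorm = [math.exp(w - top) for w in log_weights]
        z = sum(unnorm)
        probs = [u / z for u in unnorm]
        # Full information: every expert's loss is observed each round.
        learner_total += sum(p * l for p, l in zip(probs, losses))
        for i, l in enumerate(losses):
            expert_totals[i] += l
            log_weights[i] -= eta * l  # multiplicative-weights update
    return learner_total, expert_totals


def regret_vs_best(learner_total, expert_totals):
    """Regret against the best single expert in hindsight."""
    return learner_total - min(expert_totals)


def regret_vs_baseline(learner_total, expert_totals, baseline):
    """Regret against one designated baseline expert (index `baseline`)."""
    return learner_total - expert_totals[baseline]


def demo(T=2000, seed=0):
    # Hypothetical stochastic environment: Bernoulli losses with fixed means.
    rng = random.Random(seed)
    means = [0.3, 0.5, 0.5, 0.6]
    K = len(means)
    rounds = [[1.0 if rng.random() < m else 0.0 for m in means] for _ in range(T)]
    eta = math.sqrt(8 * math.log(K) / T)  # standard tuning when T is known
    total, experts = hedge(rounds, K, eta)
    print("regret vs best expert:", regret_vs_best(total, experts))
    print("regret vs baseline (expert 1):", regret_vs_baseline(total, experts, 1))
    print("worst-case bound sqrt(T ln K / 2):", math.sqrt(T * math.log(K) / 2))


if __name__ == "__main__":
    demo()
```

With this tuning, the standard analysis of Hedge bounds the regret against the best expert by $\sqrt{T \ln K / 2}$ for any loss sequence in $[0, 1]$. Fixed-rate Hedge does not by itself give the gap-dependent or baseline guarantees listed in the abstract; those are the paper's contribution.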

Subjects: Machine Learning (cs.LG); Computer Science and Game Theory (cs.GT)

Cite as: arXiv:2603.22348 [cs.LG]

(or arXiv:2603.22348v2 [cs.LG] for this version)

https://doi.org/10.48550/arXiv.2603.22348

arXiv-issued DOI via DataCite

Submission history

From: Ting Hu

[v1] Sun, 22 Mar 2026 04:17:43 UTC (1,167 KB)

[v2] Fri, 27 Mar 2026 16:39:05 UTC (1,167 KB)

Original source: arXiv, https://arxiv.org/abs/2603.22348

