
COMPASS-Hedge: Learning Safely Without Knowing the World

arXiv · March 30, 2026 · 10 min read

arXiv:2603.22348v2 (announce type: replace)

Authors: Ting Hu, Luanda Cai, Manolis Vlatakis

Abstract: Online learning algorithms often face a fundamental trilemma: balancing regret guarantees between adversarial and stochastic settings while also providing baseline safety against a fixed comparator. While existing methods excel in one or two of these regimes, they typically fail to unify all three without sacrificing optimal rates or requiring oracle access to problem-dependent parameters. In this work, we bridge this gap by introducing COMPASS-Hedge. Our algorithm is the first full-information method to simultaneously achieve: i) minimax-optimal regret in adversarial environments; ii) instance-optimal, gap-dependent regret in stochastic environments; and iii) $\tilde{\mathcal{O}}(1)$ regret relative to a designated baseline policy, up to logarithmic factors. Crucially, COMPASS-Hedge is parameter-free and requires no prior knowledge of the environment's nature or the magnitude of the stochastic suboptimality gaps. Our approach hinges on a novel integration of adaptive pseudo-regret scaling and phase-based aggression, coupled with a comparator-aware mixing strategy. To the best of our knowledge, this provides the first "best-of-three-worlds" guarantee in the full-information setting, establishing that baseline safety does not have to come at the cost of worst-case robustness or stochastic efficiency.
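For readers unfamiliar with the setting, the sketch below may help make the regret notions in the abstract concrete. It implements classical Hedge (exponential weights) with a fixed learning rate, which is NOT COMPASS-Hedge: the abstract does not give the algorithm's details, so none of its adaptive scaling, phases, or comparator-aware mixing is reproduced here. The function names `hedge` and `regrets`, the Bernoulli loss model, and the choice of expert 0 as the "baseline" are all illustrative assumptions. Note also that the learning rate used here depends on the horizon, whereas the abstract describes COMPASS-Hedge as parameter-free.

```python
import math
import random


def hedge(losses, eta):
    """Classical Hedge on a T x K matrix of losses in [0, 1].

    Returns (learner's cumulative expected loss, per-expert cumulative losses).
    This is the textbook baseline, not the COMPASS-Hedge algorithm.
    """
    n_experts = len(losses[0])
    cum = [0.0] * n_experts
    learner_loss = 0.0
    for round_losses in losses:
        # Weights proportional to exp(-eta * cumulative loss);
        # subtracting the minimum keeps the exponentials numerically stable.
        m = min(cum)
        w = [math.exp(-eta * (c - m)) for c in cum]
        z = sum(w)
        p = [x / z for x in w]
        # Expected loss of playing the mixture p this round.
        learner_loss += sum(pi * li for pi, li in zip(p, round_losses))
        cum = [c + l for c, l in zip(cum, round_losses)]
    return learner_loss, cum


def regrets(losses, baseline=0):
    """Regret against the best expert and against a designated baseline expert."""
    horizon, n_experts = len(losses), len(losses[0])
    # Standard fixed-horizon tuning; it requires knowing the horizon in advance.
    eta = math.sqrt(8 * math.log(n_experts) / horizon)
    learner_loss, cum = hedge(losses, eta)
    return learner_loss - min(cum), learner_loss - cum[baseline]


# Hypothetical stochastic instance: expert 2 has the lowest mean loss,
# and expert 0 plays the role of the designated baseline.
T, K = 2000, 3
rng = random.Random(0)
means = [0.5, 0.5, 0.3]
losses = [[float(rng.random() < mu) for mu in means] for _ in range(T)]

regret_best, regret_base = regrets(losses, baseline=0)
print("regret vs best expert:", regret_best)
print("regret vs baseline   :", regret_base)
```

With this tuning, classical Hedge guarantees regret at most $\sqrt{T \ln K / 2}$ against the best expert on any loss sequence in $[0, 1]$. Regret against the baseline can never exceed regret against the best expert, but plain Hedge offers no stronger baseline guarantee than that; the abstract's third claim is that COMPASS-Hedge tightens this to $\tilde{\mathcal{O}}(1)$ while keeping the other two guarantees.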

Subjects: Machine Learning (cs.LG); Computer Science and Game Theory (cs.GT)

Cite as: arXiv:2603.22348 [cs.LG]

(or arXiv:2603.22348v2 [cs.LG] for this version)

https://doi.org/10.48550/arXiv.2603.22348

arXiv-issued DOI via DataCite

Submission history

From: Ting Hu
[v1] Sun, 22 Mar 2026 04:17:43 UTC (1,167 KB)
[v2] Fri, 27 Mar 2026 16:39:05 UTC (1,167 KB)
