Few Batches or Little Memory, But Not Both: Simultaneous Space and Adaptivity Constraints in Stochastic Bandits

arXiv · March 31, 2026

arXiv:2603.13742v2 · Announce Type: replace
Authors: Ruiyuan Huang, Zicheng Lyu, Xiaoyi Zhu, Zengfeng Huang

Abstract: We study stochastic multi-armed bandits under simultaneous constraints on space and adaptivity: the learner interacts with the environment in $B$ batches and has only $W$ bits of persistent memory. Prior work shows that each constraint alone is surprisingly mild: near-minimax regret $\widetilde{O}(\sqrt{KT})$ is achievable with $O(\log T)$ bits of memory under fully adaptive interaction, and with a $K$-independent $O(\log\log T)$-type number of batches when memory is unrestricted. We show that this picture breaks down in the simultaneously constrained regime. We prove that any algorithm with a $W$-bit memory constraint must use at least $\Omega(K/W)$ batches to achieve near-minimax regret $\widetilde{O}(\sqrt{KT})$, even under adaptive grids. In particular, logarithmic memory rules out $O(K^{1-\varepsilon})$ batch complexity. Our proof is based on an information bottleneck. We show that near-minimax regret forces the learner to acquire $\Omega(K)$ bits of information about the hidden set of good arms under a suitable hard prior, whereas an algorithm with $B$ batches and $W$ bits of memory allows only $O(BW)$ bits of information. A key ingredient is a localized change-of-measure lemma that yields probability-level arm exploration guarantees, which is of independent interest. We also give an algorithm that, for any bit budget $W$ with $\Omega(\log T) \le W \le O(K\log T)$, uses at most $W$ bits of memory and $\widetilde{O}(K/W)$ batches while achieving regret $\widetilde{O}(\sqrt{KT})$, nearly matching our lower bound up to polylogarithmic factors.
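The bottleneck argument in the abstract can be read as a counting inequality: if $\Omega(K)$ bits are needed and at most $O(BW)$ bits can be carried, then $B$ must be on the order of at least $K/W$. The Python sketch below is only an illustration of that scaling, not the paper's algorithm or proof. The function names `batch_scale` and `tradeoff_table` are hypothetical, all hidden constants are set to 1, polylogarithmic factors are dropped, and the use of base-2 logarithms is an assumption made for concreteness.

```python
import math


def batch_scale(K: int, W: float) -> float:
    """Order-of-magnitude batch count K / W.

    Assumption: constants and polylog factors are ignored, and at least
    one batch is always needed, so the value is clamped below at 1.
    """
    if K <= 0 or W <= 0:
        raise ValueError("K and W must be positive")
    return max(1.0, K / W)


def tradeoff_table(K: int, T: int, num_points: int = 4):
    """Sweep W geometrically over the range log2(T) .. K * log2(T).

    The range mirrors the abstract's Omega(log T) <= W <= O(K log T),
    with both hidden constants taken to be 1 (an assumption).
    Returns a list of (W, K / W) pairs.
    """
    if num_points < 2:
        raise ValueError("num_points must be at least 2")
    w_min = math.log2(T)
    w_max = K * math.log2(T)
    rows = []
    for i in range(num_points):
        frac = i / (num_points - 1)
        W = w_min * (w_max / w_min) ** frac
        rows.append((W, batch_scale(K, W)))
    return rows


if __name__ == "__main__":
    for W, B in tradeoff_table(K=1024, T=2**20):
        print(f"W ~ {W:10.0f} bits  ->  batches ~ K/W = {B:8.2f}")
```

At the low-memory end of the sweep the batch count scales like $K/\log T$, which is the regime where the abstract says $O(K^{1-\varepsilon})$ batches are ruled out. At the high-memory end the ratio falls to a constant, consistent with few batches sufficing when memory is plentiful.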

Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

Cite as: arXiv:2603.13742 [cs.LG]

(or arXiv:2603.13742v2 [cs.LG] for this version)

https://doi.org/10.48550/arXiv.2603.13742

arXiv-issued DOI via DataCite

Submission history

From: Ruiyuan Huang
[v1] Sat, 14 Mar 2026 04:02:50 UTC (41 KB)
[v2] Mon, 30 Mar 2026 11:03:48 UTC (41 KB)
