Multi-Agent LLMs for Adaptive Acquisition in Bayesian Optimization
Abstract: The exploration-exploitation trade-off is central to sequential decision-making and black-box optimization, yet how Large Language Models (LLMs) reason about and manage this trade-off remains poorly understood. Unlike Bayesian Optimization, where exploration and exploitation are explicitly encoded through acquisition functions, LLM-based optimization relies on implicit, prompt-based reasoning over historical evaluations, making search behavior difficult to analyze or control. In this work, we present a metric-level study of LLM-mediated search policy learning, studying how LLMs construct and adapt exploration-exploitation strategies under multiple operational definitions of exploration, including informativeness, diversity, and representativeness. We show that single-agent LLM approaches, which jointly perform strategy selection and candidate generation within a single prompt, suffer from cognitive overload, leading to unstable search dynamics and premature convergence. To address this limitation, we propose a multi-agent framework that decomposes exploration-exploitation control into strategic policy mediation and tactical candidate generation. A strategy agent assigns interpretable weights to multiple search criteria, while a generation agent produces candidates conditioned on the resulting weight-defined search policy. This decomposition renders exploration-exploitation decisions explicit, observable, and adjustable. Empirical results across various continuous optimization benchmarks indicate that separating strategic control from candidate generation substantially improves the effectiveness of LLM-mediated search.
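The strategy/generation decomposition described in the abstract can be sketched in a few lines of Python. This is a hypothetical, non-LLM stub, not the authors' implementation: the strategy agent's weight schedule and the generation agent's criterion proxies (nearest-neighbor value for exploitation, distance-to-data for informativeness, average spread for diversity) are illustrative assumptions standing in for the paper's LLM prompts.

```python
import random

def strategy_agent(step, budget):
    """Stub strategy agent: emits interpretable weights over search criteria,
    shifting mass from exploration criteria toward exploitation over time.
    In the paper this role is played by an LLM; here it is a fixed schedule."""
    explore = max(0.0, 1.0 - step / budget)
    return {"exploit": 1.0 - explore,
            "informativeness": 0.6 * explore,
            "diversity": 0.4 * explore}

def generation_agent(history, weights, bounds, n_candidates=64, rng=None):
    """Stub generation agent: samples candidates and ranks them under the
    weight-defined search policy (illustrative proxies for each criterion)."""
    rng = rng or random.Random(0)
    lo, hi = bounds
    xs, ys = zip(*history)

    def score(x):
        # Exploitation proxy: value of the nearest evaluated point.
        y_nearest = ys[min(range(len(xs)), key=lambda i: abs(x - xs[i]))]
        # Informativeness proxy: distance to the closest evaluated point.
        d_nearest = min(abs(x - xi) for xi in xs)
        # Diversity proxy: average distance to all evaluated points.
        spread = sum(abs(x - xi) for xi in xs) / len(xs)
        return (weights["exploit"] * y_nearest
                + weights["informativeness"] * d_nearest
                + weights["diversity"] * spread)

    candidates = [rng.uniform(lo, hi) for _ in range(n_candidates)]
    return max(candidates, key=score)

def optimize(f, bounds, budget=20, seed=0):
    """Maximize f over bounds with the two-agent loop."""
    rng = random.Random(seed)
    x0 = rng.uniform(*bounds)
    history = [(x0, f(x0))]
    for step in range(budget):
        weights = strategy_agent(step, budget)
        x = generation_agent(history, weights, bounds, rng=rng)
        history.append((x, f(x)))
    return max(history, key=lambda p: p[1])
```

The point of the decomposition is visible in the interface: the weights passed between the two agents are the explicit, observable exploration-exploitation decision, which a single-prompt agent would make implicitly.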
Comments: Proceedings of the IISE Annual Conference & Expo 2026
Subjects:
Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as: arXiv:2603.28959 [cs.LG]
(or arXiv:2603.28959v1 [cs.LG] for this version)
https://doi.org/10.48550/arXiv.2603.28959
arXiv-issued DOI via DataCite (pending registration)
Submission history
From: Mohammadsina Almasi [v1] Mon, 30 Mar 2026 20:05:30 UTC (4,169 KB)