Multi-Agent LLMs for Adaptive Acquisition in Bayesian Optimization
Abstract: The exploration-exploitation trade-off is central to sequential decision-making and black-box optimization, yet how Large Language Models (LLMs) reason about and manage this trade-off remains poorly understood. Unlike Bayesian Optimization, where exploration and exploitation are explicitly encoded through acquisition functions, LLM-based optimization relies on implicit, prompt-based reasoning over historical evaluations, making search behavior difficult to analyze or control. In this work, we present a metric-level study of LLM-mediated search policy learning, studying how LLMs construct and adapt exploration-exploitation strategies under multiple operational definitions of exploration, including informativeness, diversity, and representativeness. We show that single-agent LLM approaches, which jointly perform strategy selection and candidate generation within a single prompt, suffer from cognitive overload, leading to unstable search dynamics and premature convergence. To address this limitation, we propose a multi-agent framework that decomposes exploration-exploitation control into strategic policy mediation and tactical candidate generation. A strategy agent assigns interpretable weights to multiple search criteria, while a generation agent produces candidates conditioned on the resulting weight-defined search policy. This decomposition renders exploration-exploitation decisions explicit, observable, and adjustable. Empirical results across various continuous optimization benchmarks indicate that separating strategic control from candidate generation substantially improves the effectiveness of LLM-mediated search.
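The strategy/generation decomposition described in the abstract can be sketched in a few lines of Python. This is a hypothetical, non-LLM stub, not the authors' implementation: the strategy agent's weight schedule and the generation agent's criterion proxies (nearest-neighbor value for exploitation, distance-to-data for informativeness, average spread for diversity) are illustrative assumptions standing in for the paper's LLM prompts.

```python
import random

def strategy_agent(step, budget):
    """Stub strategy agent: emits interpretable weights over search criteria,
    shifting mass from exploration criteria toward exploitation over time.
    In the paper this role is played by an LLM; here it is a fixed schedule."""
    explore = max(0.0, 1.0 - step / budget)
    return {"exploit": 1.0 - explore,
            "informativeness": 0.6 * explore,
            "diversity": 0.4 * explore}

def generation_agent(history, weights, bounds, n_candidates=64, rng=None):
    """Stub generation agent: samples candidates and ranks them under the
    weight-defined search policy (illustrative proxies for each criterion)."""
    rng = rng or random.Random(0)
    lo, hi = bounds
    xs, ys = zip(*history)

    def score(x):
        # Exploitation proxy: value of the nearest evaluated point.
        y_nearest = ys[min(range(len(xs)), key=lambda i: abs(x - xs[i]))]
        # Informativeness proxy: distance to the closest evaluated point.
        d_nearest = min(abs(x - xi) for xi in xs)
        # Diversity proxy: average distance to all evaluated points.
        spread = sum(abs(x - xi) for xi in xs) / len(xs)
        return (weights["exploit"] * y_nearest
                + weights["informativeness"] * d_nearest
                + weights["diversity"] * spread)

    candidates = [rng.uniform(lo, hi) for _ in range(n_candidates)]
    return max(candidates, key=score)

def optimize(f, bounds, budget=20, seed=0):
    """Maximize f over bounds with the two-agent loop."""
    rng = random.Random(seed)
    x0 = rng.uniform(*bounds)
    history = [(x0, f(x0))]
    for step in range(budget):
        weights = strategy_agent(step, budget)
        x = generation_agent(history, weights, bounds, rng=rng)
        history.append((x, f(x)))
    return max(history, key=lambda p: p[1])
```

The point of the decomposition is visible in the interface: the weights passed between the two agents are the explicit, observable exploration-exploitation decision, which a single-prompt agent would make implicitly.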
Comments: Proceedings of the IISE Annual Conference & Expo 2026
Subjects:
Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as: arXiv:2603.28959 [cs.LG]
(or arXiv:2603.28959v1 [cs.LG] for this version)
https://doi.org/10.48550/arXiv.2603.28959
arXiv-issued DOI via DataCite (pending registration)
Submission history
From: Mohammadsina Almasi [v1] Mon, 30 Mar 2026 20:05:30 UTC (4,169 KB)