
Mimosa Framework: Toward Evolving Multi-Agent Systems for Scientific Research

ArXiv CS.AI, by Martin Legrand, Tao Jiang, Matthieu Feraud, Benjamin Navet, Yousouf Taghzouti, Fabien Gandon, Elise Dumont, Louis-Félix Nothias. April 1, 2026



Abstract: Current Autonomous Scientific Research (ASR) systems, despite leveraging large language models (LLMs) and agentic architectures, remain constrained by fixed workflows and toolsets that prevent adaptation to evolving tasks and environments. We introduce Mimosa, an evolving multi-agent framework that automatically synthesizes task-specific multi-agent workflows and iteratively refines them through experimental feedback. Mimosa leverages the Model Context Protocol (MCP) for dynamic tool discovery, generates workflow topologies via a meta-orchestrator, executes subtasks through code-generating agents that invoke available tools and scientific software libraries, and scores executions with an LLM-based judge whose feedback drives workflow refinement. On ScienceAgentBench, Mimosa achieves a success rate of 43.1% with DeepSeek-V3.2, surpassing both single-agent baselines and static multi-agent configurations. Our results further reveal that models respond heterogeneously to multi-agent decomposition and iterative learning, indicating that the benefits of workflow evolution depend on the capabilities of the underlying execution model. Beyond these benchmarks, Mimosa's modular architecture and tool-agnostic design make it readily extensible, and its fully logged execution traces and archived workflows support auditability by preserving every analytical step for inspection and potential replication. Combined with domain-expert guidance, the framework has the potential to automate a broad range of computationally accessible scientific tasks across disciplines. Released as a fully open-source platform, Mimosa aims to provide an open foundation for community-driven ASR.
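The refinement loop described in the abstract (discover tools, propose a workflow, execute it, score it, refine) can be illustrated with a toy sketch. This is not Mimosa's code or API: every name here (`discover_tools`, `propose_workflow`, `judge`, `evolve`) is hypothetical, the "tools" are plain string functions standing in for MCP-discovered tools, and the "judge" is a simple coverage check standing in for an LLM-based judge.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# All names below are hypothetical stand-ins, not Mimosa's actual API.
Tool = Callable[[str], str]
Trace = List[Tuple[str, str]]  # (step name, step output) pairs


@dataclass(frozen=True)
class Workflow:
    """A linear workflow topology: an ordered tuple of tool names."""
    steps: Tuple[str, ...]


def discover_tools() -> Dict[str, Tool]:
    """Stand-in for dynamic tool discovery (assumed to come from an MCP server)."""
    return {
        "load": lambda s: s + "|loaded",
        "analyze": lambda s: s + "|analyzed",
        "plot": lambda s: s + "|plotted",
    }


def propose_workflow(tools: Dict[str, Tool],
                     previous: Optional[Workflow],
                     missing: List[str]) -> Workflow:
    """Stand-in for the meta-orchestrator: start small, then add what the judge flags."""
    if previous is None:
        return Workflow(steps=(next(iter(tools)),))
    for name in missing:
        if name in tools:
            return Workflow(steps=previous.steps + (name,))
    return previous  # nothing actionable in the feedback


def execute(workflow: Workflow, tools: Dict[str, Tool], task: str) -> Trace:
    """Run each step in order and log every intermediate output."""
    state, trace = task, []
    for name in workflow.steps:
        state = tools[name](state)
        trace.append((name, state))
    return trace


def judge(trace: Trace, required: List[str]) -> Tuple[float, List[str]]:
    """Stand-in for the LLM-based judge: score is the fraction of required steps run."""
    done = {name for name, _ in trace}
    missing = [r for r in required if r not in done]
    return 1.0 - len(missing) / len(required), missing


def evolve(task: str, required: List[str], max_iters: int = 5):
    """Iteratively refine a workflow using judge feedback; archive every attempt."""
    tools = discover_tools()
    archive: List[Tuple[Workflow, float]] = []
    workflow = propose_workflow(tools, None, [])
    for _ in range(max_iters):
        score, missing = judge(execute(workflow, tools, task), required)
        archive.append((workflow, score))
        if not missing:
            break
        workflow = propose_workflow(tools, workflow, missing)
    best = max(archive, key=lambda item: item[1])[0]
    return best, archive


if __name__ == "__main__":
    best, archive = evolve("dataset", ["load", "analyze", "plot"])
    print(best.steps, [round(score, 2) for _, score in archive])
```

The archive of (workflow, score) pairs loosely mirrors the archived workflows and logged traces the abstract mentions; a real system would replace the coverage check with model-generated feedback and would likely explore non-linear topologies.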

Comments: 48 pages, 4 figures, 1 table. Clean arXiv version prepared. Includes main manuscript plus appendix/supplementary-style implementation details and prompt listings. Dated 30 March 2026

Subjects: Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multiagent Systems (cs.MA)

MSC classes: 68T01, 93A16

ACM classes: I.2.11; I.2.6; I.2.8

Cite as: arXiv:2603.28986 [cs.AI]

(or arXiv:2603.28986v1 [cs.AI] for this version)

https://doi.org/10.48550/arXiv.2603.28986

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Louis-Felix Nothias
[v1] Mon, 30 Mar 2026 20:35:57 UTC (1,548 KB)
