
Mimosa Framework: Toward Evolving Multi-Agent Systems for Scientific Research

ArXiv CS.AI | by Martin Legrand, Tao Jiang, Matthieu Feraud, Benjamin Navet, Yousouf Taghzouti, Fabien Gandon, Elise Dumont, Louis-Félix Nothias | April 1, 2026


arXiv:2603.28986v1 Announce Type: new


Abstract: Current Autonomous Scientific Research (ASR) systems, despite leveraging large language models (LLMs) and agentic architectures, remain constrained by fixed workflows and toolsets that prevent adaptation to evolving tasks and environments. We introduce Mimosa, an evolving multi-agent framework that automatically synthesizes task-specific multi-agent workflows and iteratively refines them through experimental feedback. Mimosa leverages the Model Context Protocol (MCP) for dynamic tool discovery, generates workflow topologies via a meta-orchestrator, executes subtasks through code-generating agents that invoke available tools and scientific software libraries, and scores executions with an LLM-based judge whose feedback drives workflow refinement. On ScienceAgentBench, Mimosa achieves a success rate of 43.1% with DeepSeek-V3.2, surpassing both single-agent baselines and static multi-agent configurations. Our results further reveal that models respond heterogeneously to multi-agent decomposition and iterative learning, indicating that the benefits of workflow evolution depend on the capabilities of the underlying execution model. Beyond these benchmarks, Mimosa's modular architecture and tool-agnostic design make it readily extensible, and its fully logged execution traces and archived workflows support auditability by preserving every analytical step for inspection and potential replication. Combined with domain-expert guidance, the framework has the potential to automate a broad range of computationally accessible scientific tasks across disciplines. Released as a fully open-source platform, Mimosa aims to provide an open foundation for community-driven ASR.
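The refinement loop described in the abstract (discover tools, synthesize a workflow, execute it, score it with a judge, feed the critique back) can be illustrated with a toy sketch. This is not Mimosa's code or API: every name below (Workflow, discover_tools, synthesize, execute, judge, evolve) is hypothetical, the tool registry is a plain dictionary standing in for MCP discovery, and the judge is a fixed rule standing in for an LLM.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Workflow:
    """Hypothetical workflow topology: an ordered list of (agent, tool) steps."""
    steps: List[Tuple[str, str]]


def discover_tools() -> Dict[str, Callable[[str], str]]:
    # Stand-in for dynamic tool discovery (assumption: a static dict, not MCP).
    return {
        "load": lambda text: text.strip(),
        "analyze": lambda text: text.upper(),
    }


def synthesize(tools: Dict[str, Callable[[str], str]], feedback: str) -> Workflow:
    # Stand-in for the meta-orchestrator: start minimal, then add a step
    # when the judge's feedback names a tool that is available.
    steps = [("loader", "load")]
    if "analyze" in feedback and "analyze" in tools:
        steps.append(("analyst", "analyze"))
    return Workflow(steps)


def execute(workflow: Workflow, tools: Dict[str, Callable[[str], str]], task: str) -> str:
    # Run each step's tool on the output of the previous step.
    data = task
    for _agent, tool_name in workflow.steps:
        data = tools[tool_name](data)
    return data


def judge(output: str) -> Tuple[float, str]:
    # Stand-in for the LLM-based judge: a fixed rule returning (score, critique).
    if output.isupper():
        return 1.0, ""
    return 0.5, "missing analyze step"


def evolve(task: str, max_iters: int = 3, threshold: float = 0.9):
    """Iterate synthesize -> execute -> judge until the score passes the threshold."""
    tools = discover_tools()
    feedback = ""
    archive: List[Tuple[Workflow, float]] = []  # every attempt is kept for inspection
    for _ in range(max_iters):
        workflow = synthesize(tools, feedback)
        output = execute(workflow, tools, task)
        score, feedback = judge(output)
        archive.append((workflow, score))
        if score >= threshold:
            break
    best_workflow, best_score = max(archive, key=lambda pair: pair[1])
    return best_workflow, best_score, archive
```

Calling evolve("  spectra  ") takes two iterations in this toy setup: the first workflow only loads the input and is scored low, and the judge's critique leads the second attempt to add the analyze step. The archive keeps both attempts, loosely mirroring the archived workflows the abstract mentions for auditability.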

Comments: 48 pages, 4 figures, 1 table. Clean arXiv version prepared. Includes main manuscript plus appendix/supplementary-style implementation details and prompt listings. Dated 30 March 2026

Subjects: Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multiagent Systems (cs.MA)

MSC classes: 68T01, 93A16

ACM classes: I.2.11; I.2.6; I.2.8

Cite as: arXiv:2603.28986 [cs.AI] (or arXiv:2603.28986v1 [cs.AI] for this version)

https://doi.org/10.48550/arXiv.2603.28986

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Louis-Felix Nothias
[v1] Mon, 30 Mar 2026 20:35:57 UTC (1,548 KB)

Original source

ArXiv CS.AI

https://arxiv.org/abs/2603.28986



