
COvolve: Adversarial Co-Evolution of Large-Language-Model-Generated Policies and Environments via Two-Player Zero-Sum Game

arXiv, March 31, 2026
Authors: Alkis Sygkounas, Rishi Hazra, Andreas Persson, Pedro Zuidberg Dos Martires, Amy Loutfi

Abstract: A central challenge in building continually improving agents is that training environments are typically static or manually constructed. This restricts continual learning and generalization beyond the training distribution. We address this with COvolve, a co-evolutionary framework that leverages large language models (LLMs) to generate both environments and agent policies, expressed as executable Python code. We model the interaction between environment and policy designers as a two-player zero-sum game, ensuring adversarial co-evolution in which environments expose policy weaknesses and policies adapt in response. This process induces an automated curriculum in which environments and policies co-evolve toward increasing complexity. To guarantee robustness and prevent forgetting as the curriculum progresses, we compute the mixed-strategy Nash equilibrium (MSNE) of the zero-sum game, thereby yielding a meta-policy. This MSNE meta-policy ensures that the agent does not forget to solve previously seen environments while learning to solve previously unseen ones. Experiments in urban driving, symbolic maze-solving, and geometric navigation showcase that COvolve produces progressively more complex environments. Our results demonstrate the potential of LLM-driven co-evolution to achieve open-ended learning without predefined task distributions or manual intervention.
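To make the MSNE meta-policy idea concrete, here is a minimal sketch of solving a small zero-sum game between a policy player (rows, maximizing) and an environment player (columns, minimizing). The abstract does not say how the equilibrium is computed, so this uses fictitious play as a stand-in technique that only approximates the equilibrium. The function name `fictitious_play` and the payoff numbers are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def fictitious_play(
    payoff: Matrix, iterations: int = 20000
) -> Tuple[List[float], List[float], float]:
    """Approximate a mixed-strategy Nash equilibrium of a zero-sum game.

    payoff[i][j] is the score of policy i on environment j (assumed layout).
    The row player maximizes it and the column player minimizes it.
    Returns (policy mixture, environment mixture, estimated game value).
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    row_counts = [0] * n_rows
    col_counts = [0] * n_cols
    # Cumulative payoff of each row action against the column player's history.
    row_scores = [0.0] * n_rows
    # Cumulative payoff conceded by each column action against the row history.
    col_scores = [0.0] * n_cols
    row_action, col_action = 0, 0

    for _ in range(iterations):
        row_counts[row_action] += 1
        col_counts[col_action] += 1
        for i in range(n_rows):
            row_scores[i] += payoff[i][col_action]
        for j in range(n_cols):
            col_scores[j] += payoff[row_action][j]
        # Each player best-responds to the opponent's empirical play so far.
        row_action = max(range(n_rows), key=lambda i: row_scores[i])
        col_action = min(range(n_cols), key=lambda j: col_scores[j])

    row_mix = [c / iterations for c in row_counts]
    col_mix = [c / iterations for c in col_counts]
    value = sum(
        row_mix[i] * payoff[i][j] * col_mix[j]
        for i in range(n_rows)
        for j in range(n_cols)
    )
    return row_mix, col_mix, value


# Hypothetical success rates of three policies on three environments.
example_payoff = [
    [0.9, 0.2, 0.1],
    [0.3, 0.8, 0.2],
    [0.2, 0.3, 0.7],
]
meta_policy, env_mix, value = fictitious_play(example_payoff)
print("meta-policy weights:", [round(w, 3) for w in meta_policy])
print("estimated game value:", round(value, 3))
```

In this reading, the returned row mixture plays the role of the meta-policy: a distribution over previously generated policies that hedges against the hardest mix of environments seen so far. Fictitious play converges slowly, so a linear-programming solver would be the usual choice when an exact equilibrium is needed.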

Comments: Accepted at GECCO 2026

Subjects: Artificial Intelligence (cs.AI)

Cite as: arXiv:2603.28386 [cs.AI] (or arXiv:2603.28386v1 [cs.AI] for this version)

https://doi.org/10.48550/arXiv.2603.28386

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Alkis Sygkounas
[v1] Mon, 30 Mar 2026 12:56:54 UTC (12,971 KB)
