
APEX-EM: Non-Parametric Online Learning for Autonomous Agents via Structured Procedural-Episodic Experience Replay

arXiv cs.CL, by Pratyay Banerjee, Masud Moshtaghi, Ankit Chadha. April 1, 2026



Abstract: LLM-based autonomous agents lack persistent procedural memory: they re-derive solutions from scratch even when structurally identical tasks have been solved before. We present APEX-EM, a non-parametric online learning framework that accumulates, retrieves, and reuses structured procedural plans without modifying model weights. APEX-EM introduces: (1) a structured experience representation encoding the full procedural-episodic trace of each execution -- planning steps, artifacts, iteration history with error analysis, and quality scores; (2) a Plan-Retrieve-Generate-Iterate-Ingest (PRGII) workflow with Task Verifiers providing multi-dimensional reward signals; and (3) a dual-outcome Experience Memory with hybrid retrieval combining semantic search, structural signature matching, and plan DAG traversal -- enabling cross-domain transfer between tasks sharing no lexical overlap but analogous operational structure. Successful experiences serve as positive in-context examples; failures as negative examples with structured error annotations. We evaluate on BigCodeBench, KGQAGen-10k, and Humanity's Last Exam using Claude Sonnet 4.5 and Opus 4.5. On KGQAGen-10k, APEX-EM achieves 89.6% accuracy versus 41.3% without memory (+48.3pp), surpassing the oracle-retrieval upper bound (84.9%). On BigCodeBench, it reaches 83.3% SR from a 53.9% baseline (+29.4pp), exceeding MemRL's +11.0pp gain under comparable frozen-backbone conditions (noting backbone differences controlled for in our analysis). On HLE, entity graph retrieval reaches 48.0% from 25.2% (+22.8pp). Ablations show component value is task-dependent: rich judge feedback is negligible for code generation but critical for structured queries (+10.3pp), while binary-signal iteration partially compensates for weaker feedback.
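To make the dual-outcome memory and hybrid retrieval idea more concrete, here is a minimal Python sketch. It is not the authors' implementation: the `Experience` fields, the `ExperienceMemory` class, the equal score weights, and the use of token-level Jaccard overlap as a stand-in for semantic search are all assumptions made for illustration, and plan DAG traversal is left out entirely. It only shows how a structural signature can surface past experiences that share almost no wording with the new task, and how retrieved items split into positive and negative examples.

```python
from dataclasses import dataclass


@dataclass
class Experience:
    """Hypothetical record; field names are illustrative, not from the paper."""
    task: str              # natural-language task description
    signature: frozenset   # assumed structural signature: set of operation types
    plan: list             # ordered plan steps
    success: bool          # dual outcome: True = positive, False = negative example
    error_note: str = ""   # error annotation attached to failures


def jaccard(a, b):
    """Set overlap in [0, 1]; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def lexical_similarity(a, b):
    """Stand-in for semantic search: token Jaccard, not an embedding model."""
    return jaccard(set(a.lower().split()), set(b.lower().split()))


class ExperienceMemory:
    def __init__(self, w_semantic=0.5, w_structural=0.5):
        self.items = []
        self.w_semantic = w_semantic      # assumed weights, chosen arbitrarily
        self.w_structural = w_structural

    def ingest(self, exp):
        """Store an experience; model weights are never touched."""
        self.items.append(exp)

    def retrieve(self, task, signature, k=2):
        """Return (positives, negatives) among the top-k hybrid-scored items."""
        def score(exp):
            return (self.w_semantic * lexical_similarity(task, exp.task)
                    + self.w_structural * jaccard(signature, exp.signature))

        ranked = sorted(self.items, key=score, reverse=True)[:k]
        positives = [e for e in ranked if e.success]
        negatives = [e for e in ranked if not e.success]
        return positives, negatives


memory = ExperienceMemory()
ops = frozenset({"load", "join", "aggregate"})
memory.ingest(Experience("join customer and order tables then sum revenue",
                         ops, ["load", "join", "sum"], True))
memory.ingest(Experience("merge sensor logs with station metadata and average temperature",
                         ops, ["load", "merge", "mean"], False,
                         "joined on the wrong key"))
memory.ingest(Experience("translate a paragraph into French",
                         frozenset({"translate"}), ["translate"], True))

# The new task shares almost no wording with the stored ones, only structure.
positives, negatives = memory.retrieve(
    "combine patient visits with lab results and count tests", ops)
print([e.task for e in positives])
print([e.error_note for e in negatives])
```

In this toy setup the translation task is never retrieved, because it matches neither lexically nor structurally, while the failed run comes back as a negative example carrying its error annotation.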

Comments: 17 pages, 13 figures

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR)

Cite as: arXiv:2603.29093 [cs.CL]

(or arXiv:2603.29093v2 [cs.CL] for this version)

https://doi.org/10.48550/arXiv.2603.29093

arXiv-issued DOI via DataCite

Submission history

From: Pratyay Banerjee
[v1] Tue, 31 Mar 2026 00:24:56 UTC (1,096 KB)
[v2] Thu, 2 Apr 2026 21:09:27 UTC (1,096 KB)
