
SkillReducer: Optimizing LLM Agent Skills for Token Efficiency

arXiv cs.SE · Yudong Gao, Zongjie Li, Yuanyuanyuan, Zimo Ji, Pingchuan Ma, Shuai Wang · April 1, 2026

arXiv:2603.29919v1 Announce Type: new


Abstract: LLM-based coding agents rely on skills, pre-packaged instruction sets that extend agent capabilities, yet every token of skill content injected into the context window incurs both monetary cost and attention dilution. To understand the severity of this problem, we conduct a large-scale empirical study of 55,315 publicly available skills and find systemic inefficiencies: 26.4% lack routing descriptions entirely, over 60% of body content is non-actionable, and reference files can inject tens of thousands of tokens per invocation. Motivated by these findings, we present SkillReducer, a two-stage optimization framework. Stage 1 optimizes the routing layer by compressing verbose descriptions and generating missing ones via adversarial delta debugging. Stage 2 restructures skill bodies through taxonomy-driven classification and progressive disclosure, separating actionable core rules from supplementary content loaded on demand, validated by faithfulness checks and a self-correcting feedback loop. Evaluated on 600 skills and the SkillsBench benchmark, SkillReducer achieves 48% description compression and 39% body compression while improving functional quality by 2.8%, revealing a less-is-more effect where removing non-essential content reduces distraction in the context window. These benefits transfer across five models from four families with a mean retention of 0.965, and generalize to an independent agent framework.
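The abstract's Stage 1 compresses routing descriptions via "adversarial delta debugging": shrink a description while some check confirms the skill still routes correctly. The paper does not give its algorithm, but the core idea can be sketched with classic ddmin-style delta debugging. Everything here is a hypothetical illustration — `still_routes` stands in for whatever routing oracle (e.g. a model-based check) the real system uses, and the sample description and keyword oracle are invented for the demo.

```python
def ddmin(tokens, still_routes):
    """Greedily minimize `tokens` while `still_routes(candidate)` stays True.

    Delta-debugging sketch: repeatedly try deleting chunks of the
    description, keep any deletion that preserves routing, and halve
    the chunk size when no deletion succeeds.
    """
    n = 2  # number of chunks to split into
    while len(tokens) >= 2:
        chunk = max(1, len(tokens) // n)
        reduced = False
        for i in range(0, len(tokens), chunk):
            candidate = tokens[:i] + tokens[i + chunk:]
            if candidate and still_routes(candidate):
                tokens = candidate          # deletion preserved routing: keep it
                n = max(n - 1, 2)
                reduced = True
                break
        if not reduced:
            if n >= len(tokens):
                break                       # already at single-token granularity
            n = min(n * 2, len(tokens))     # refine: try smaller chunks
    return tokens

# Hypothetical routing oracle: the description still "routes" as long as
# it mentions both keywords the router matches on.
desc = "This skill helps you carefully extract tables from pdf files fast".split()
minimal = ddmin(desc, lambda t: "pdf" in t and "extract" in t)
print(minimal)  # → ['extract', 'pdf']
```

In the paper's setting the oracle would be far more expensive than a keyword check, which is presumably why compression is done once, offline, rather than per invocation.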

Subjects:

Software Engineering (cs.SE)

Cite as: arXiv:2603.29919 [cs.SE]

(or arXiv:2603.29919v1 [cs.SE] for this version)

https://doi.org/10.48550/arXiv.2603.29919

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Zongjie Li [view email] [v1] Tue, 31 Mar 2026 15:57:53 UTC (997 KB)
