I Built npm for AI Skills — Here's Why AI Needs a Package Manager
AI skills are stuck in copy-paste hell. spm fixes this — install reusable AI instructions with one command, works with Claude, Cursor, VS Code, and 11 more clients via MCP.
Every developer knows this pain: you find a perfect AI prompt. Maybe it's a code review checklist that catches bugs your linter misses. Or a set of prompt engineering techniques that dramatically improve your LLM outputs.
You save it somewhere. A note. A file. A Slack message to yourself.
A week later you need it again and can't find it. So you rewrite it from scratch. Or worse — you find a version, but it's outdated, and you don't remember which copy is current.
This is 2026 and we're still managing AI knowledge with copy-paste.
The problem is obvious once you see it
Software development solved this decades ago. Before npm, developers emailed JavaScript files to each other. Before pip, Python libraries lived on random FTP servers. Package managers brought versioning, dependencies, discovery, and sharing to code.
AI skills — the structured instructions that make LLMs actually useful at specific tasks — have no equivalent. Right now, if you want to give Claude a solid code review methodology, or a set of prompt engineering best practices, you either:
- Write a custom system prompt yourself (and maintain it forever)
- Copy someone's prompt from GitHub/Reddit/Twitter (no versioning, no updates)
- Use a platform-specific solution that locks you into one client
None of these scale. None of these compose. None of these let you share what you've built in a way others can reliably reuse.
So I built spm
spm (Skills Package Manager) does for AI skills what npm did for JavaScript modules. It's a CLI tool that installs, manages, and shares reusable AI instructions across any MCP-compatible client.
Here's what the workflow looks like:
```bash
# Install spm
npm install -g @skillbase/spm

# Initialize
spm init

# Connect to your AI client
spm connect claude   # or cursor, vscode, windsurf, zed...

# Install a skill
spm add skillbase/prompt-engineering-craft
```

That's it. Ask your AI to write a prompt. It auto-loads the skill — chain-of-thought, few-shot, structured output techniques — all available instantly.
No manual prompt engineering. No copy-pasting. The skill loads into context automatically when the AI needs it, via MCP.
What is a "skill" exactly?
A skill is a directory with a SKILL.md file at its core — structured instructions that tell an AI model how to perform a specific task. But a skill isn't limited to just instructions. The directory can also contain auxiliary scripts, templates, example files, and any other resources the AI might need. Think of SKILL.md as package.json — it's the entry point, but the whole directory is the package.
The format is deliberately simple and extensible. Today a skill might be pure instructions. Tomorrow it could include a Python validation script, a Jinja template for generating reports, or reference data the AI consults during execution. The skill format grows with your needs.
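To make that concrete, here is one hypothetical layout. Only SKILL.md is required; the other file names are illustrative, not part of any mandated structure:

```
my-skill/
├── SKILL.md          # entry point: metadata + instructions
├── scripts/
│   └── validate.py   # auxiliary script the AI can run
├── templates/
│   └── report.j2     # Jinja template for generated output
└── examples/
    └── sample.md     # reference material the AI can consult
```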
```markdown
---
name: arch-code-review
description: "Architecture-aware code review: coupling, cohesion, SOLID, complexity"
tags: [code-review, architecture, solid, complexity, refactoring]
---

# Code Review

## What to evaluate

- Coupling/cohesion at module and class level
- SOLID principle adherence
- Naming quality and consistency
- Cyclomatic complexity hotspots
- Pull request design issues

## Review process

[step-by-step methodology follows]
```
Skills have:
- Semver versioning — each skill is published as `name@version`, so updates don't break your workflow
- Dependencies — a skill can depend on other skills
- Auxiliary files — scripts, templates, and resources bundled alongside instructions
- Triggers — descriptions that help the AI decide when to load the skill
- Confidence scores — computed from real user feedback, so popular and effective skills rise to the top
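The article doesn't say how confidence scores are computed, so here is one common way such a score could work, purely as an illustration: a Laplace-smoothed positive-feedback rate, which keeps a skill with three perfect votes from outranking one with 95 positives out of 100.

```python
def confidence(positive: int, total: int) -> float:
    """Illustrative confidence score: Laplace-smoothed positive-feedback rate.

    Adding one pseudo-positive and one pseudo-negative vote pulls
    low-sample skills toward 0.5, so a handful of votes can't dominate.
    This is an assumption for illustration, not spm's actual formula.
    """
    return (positive + 1) / (total + 2)

print(round(confidence(3, 3), 3))    # 4/5 positive-ish, small sample
print(round(confidence(95, 100), 3)) # strong score, large sample
```

The design choice here is standard for ranking by user feedback: raw averages reward tiny sample sizes, smoothed ones reward sustained quality.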
Skills compose into personas
This is where it gets interesting. A persona bundles multiple skills into a complete AI identity:
```bash
spm add @skillbase/prompt-engineer
```
This persona combines prompt engineering best practices, SKILL.md format knowledge, and quality evaluation — creating an AI that writes and reviews AI skills. It's skills all the way down.
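The article doesn't show a persona's manifest, but conceptually a persona is just a named bundle of skill dependencies. A hypothetical manifest might look like this — the field names and all skill identifiers except `prompt-engineering-craft` are assumptions, not spm's actual schema:

```yaml
name: prompt-engineer
description: "Writes and reviews AI skills"
skills:
  - skillbase/prompt-engineering-craft   # real skill from the registry
  - skillbase/skill-format               # hypothetical
  - skillbase/skill-quality-eval         # hypothetical
```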
The registry currently has 52 skills and 16 personas. Examples include:
- Development: skillbase/python-backend (FastAPI, async, Pydantic), skillbase/arch-code-review, skillbase/arch-api-design
- Security: skillbase/smart-contract-audit, skillbase/appsec (OWASP Top 10), skillbase/prompt-injection-detector
- DeFi/Trading: skillbase/yield-analysis, skillbase/leverage-calc, skillbase/onchain-signals
- Meta-skills: skillbase/prompt-engineering-craft — techniques for writing better prompts
Works with 14 AI clients
spm uses the Model Context Protocol (MCP) as its transport layer. One spm connect command and your skills work with:
- Claude Desktop & Claude Code
- Cursor
- VS Code (Copilot)
- Windsurf
- JetBrains IDEs
- Zed
- And 8 more
Write a skill once, use it everywhere. No vendor lock-in.
How it works under the hood
When you run spm connect claude, spm registers itself as an MCP server. The AI client gets access to a set of tools:
```text
spm serve
├── skill_list     → Compact index of installed skills
├── skill_load     → Load full skill into context on demand
├── skill_search   → Find skills locally or in the registry
├── skill_install  → Install new skills at runtime
└── skill_feedback → Record whether a skill worked well
```
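Presumably, `spm connect` writes a server entry into the client's MCP configuration. For Claude Desktop, MCP servers are declared under the `mcpServers` key of `claude_desktop_config.json`; a generated entry might look roughly like this (the exact command and args are an assumption, inferred from the `spm serve` tool listing above):

```json
{
  "mcpServers": {
    "spm": {
      "command": "spm",
      "args": ["serve"]
    }
  }
}
```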
The key design decision: lazy loading. The AI model doesn't load all skills upfront — that would blow up the context window. Instead, it gets a compact index with names and trigger descriptions. When it encounters a task that matches a skill, it loads just that one. This keeps context lean and relevant.
Publishing your own skills
Creating and sharing a skill takes minutes:
```bash
# Create a skill scaffold
spm create my-skill

# Edit SKILL.md with your instructions

# Then publish to the registry
spm publish
```
Your skill gets a page on the Explore registry, versioning, download stats, and a confidence score that improves as people use it and leave feedback.
Why this matters now
Three trends are converging:
MCP is becoming the standard. Anthropic's Model Context Protocol is now supported by every major AI client. This creates a universal transport layer — and spm sits right on top of it.
AI capabilities are fragmenting. Every team has their own prompts, their own instructions, their own "secret sauce." Without a sharing mechanism, every organization reinvents the wheel.
Agents need composable skills. As AI agents become more autonomous (OpenClaw, Paperclip, and others), they need a way to discover and use capabilities on demand. A skill registry with MCP-native access is exactly this infrastructure.
Get started
```bash
npm install -g @skillbase/spm
spm init
spm connect claude                  # or your preferred client
spm add skillbase/arch-code-review  # try your first skill
```
Browse the registry: skillbase.space/explore
Read the docs: skillbase.space/docs
Star on GitHub: github.com/useskillbase/spm
spm is open source (MIT), model-agnostic, and free to use. The registry is free to publish to.
If you've ever wished you could npm install an AI capability — now you can.