I built an npm middleware that scores your LLM prompts before they hit your agent workflow
The problem with most LLM agent workflows is that nobody is checking the quality of the prompts going in.
Garbage in, garbage out. But at scale, with agents firing hundreds of prompts per day, the garbage compounds fast.
I built x402-pqs to fix this. It's an Express middleware that intercepts prompts before they hit any LLM endpoint, scores them for quality, and adds the score to the request headers.
Install
npm install x402-pqs
Usage
const express = require("express");
const { pqsMiddleware } = require("x402-pqs");

const app = express();
app.use(express.json());

app.use(pqsMiddleware({
  threshold: 10,       // warn if prompt scores below 10/40
  vertical: "crypto",  // scoring context
  onLowScore: "warn",  // warn | block | ignore
}));

app.post("/api/chat", (req, res) => {
  console.log("Prompt score:", req.pqs.score, req.pqs.grade);
  res.json({ message: "ok" });
});
Every request gets these headers added automatically:
- X-PQS-Score: numeric score (0-40)
- X-PQS-Grade: letter grade (A-F)
- X-PQS-Out-Of: maximum score (40)
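To make the flow concrete, here is a minimal sketch of how a middleware of this shape could work. It is not the x402-pqs source: `scorePrompt` is a hypothetical placeholder scorer (it just counts words), the grade cutoffs and the 422 status for blocked prompts are assumptions, and the sketch assumes the prompt arrives in `req.body.prompt`. Only the header names, the options, and the `req.pqs` fields come from the usage above.

```javascript
// Sketch only: NOT the x402-pqs implementation.
const MAX_SCORE = 40;

// Hypothetical grade cutoffs (assumed for illustration).
function gradeFor(score) {
  if (score >= 32) return "A";
  if (score >= 24) return "B";
  if (score >= 16) return "C";
  if (score >= 8) return "D";
  return "F";
}

// Hypothetical placeholder scorer: one point per word, capped at MAX_SCORE.
function scorePrompt(prompt) {
  const words = prompt.trim().split(/\s+/).filter(Boolean).length;
  return Math.min(MAX_SCORE, words);
}

function sketchMiddleware({ threshold = 10, onLowScore = "warn" } = {}) {
  return function (req, res, next) {
    const prompt = (req.body && req.body.prompt) || "";
    const score = scorePrompt(prompt);
    const grade = gradeFor(score);

    // Expose the result to downstream handlers.
    req.pqs = { score, grade, outOf: MAX_SCORE };

    // Node lowercases incoming header names, so the keys are lowercase here.
    req.headers["x-pqs-score"] = String(score);
    req.headers["x-pqs-grade"] = grade;
    req.headers["x-pqs-out-of"] = String(MAX_SCORE);

    if (score < threshold) {
      if (onLowScore === "block") {
        // Assumed status code; stop here without calling next().
        return res.status(422).json({ error: "Prompt quality too low", score, grade });
      }
      if (onLowScore === "warn") {
        console.warn(`Low prompt score: ${score}/${MAX_SCORE} (${grade})`);
      }
    }
    next();
  };
}
```

Because the function has the standard `(req, res, next)` signature, it could be mounted with `app.use(sketchMiddleware({ onLowScore: "block" }))` in the same place as the real middleware.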
How the scoring works
PQS scores prompts across 8 dimensions using 5 cited academic frameworks:
Prompt-side (4 dimensions):
- Specificity: does the prompt define what it wants precisely?
- Context: does it give the model enough to work with?
- Clarity: are the directives unambiguous?
- Predictability: would different runs produce consistent results?
Output-side (4 dimensions):
- Completeness, Relevancy, Reasoning depth, Faithfulness
Source frameworks: PEEM (Dongguk University, 2026) · RAGAS · MT-Bench · G-Eval · ROUGE
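Eight dimensions and a 40-point maximum suggest five points per dimension, though that per-dimension scale is an inference and not something stated here. Under that assumption, aggregation could look like the sketch below. The key names are hypothetical labels for the dimensions listed above.

```javascript
// Assumption: each of the 8 dimensions is scored 0-5, so the total is 0-40.
const DIMENSIONS = [
  "specificity", "context", "clarity", "predictability",         // prompt-side
  "completeness", "relevancy", "reasoningDepth", "faithfulness", // output-side
];
const MAX_PER_DIMENSION = 5;

function totalScore(scores) {
  let total = 0;
  for (const name of DIMENSIONS) {
    const value = scores[name];
    // Reject missing or out-of-range values instead of silently skipping them.
    if (!Number.isFinite(value) || value < 0 || value > MAX_PER_DIMENSION) {
      throw new RangeError(`${name} must be a number from 0 to ${MAX_PER_DIMENSION}`);
    }
    total += value;
  }
  return { score: total, outOf: DIMENSIONS.length * MAX_PER_DIMENSION };
}
```

Validating every dimension up front means a missing score surfaces as an error, not as a quietly deflated total.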
Real example
This prompt: "who are the smartest wallets on solana right now"
It scored 9/40, a Grade D.
The optimized version scored 35/40, a Grade A.
+84% improvement.
Same model. Same API. Completely different output quality.
The payment layer
The scoring API uses x402, an HTTP-native micropayment protocol now governed by the Linux Foundation, with Coinbase, Cloudflare, AWS, Stripe, Google, Microsoft, Visa, and Mastercard as founding members.
Agents can call and pay for scoring autonomously — no API keys, no subscriptions. Just a wallet and $0.001 USDC per score.
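In outline, an x402 exchange is a retry loop: the client calls the endpoint, receives HTTP 402 with the payment requirements, then repeats the request with a signed payment attached. The sketch below shows that shape only and is not a full client. `signPayment` is a hypothetical callback standing in for wallet signing, `fetchImpl` is injectable so the flow can be exercised without a network, and the `X-PAYMENT` header name is an assumption based on the original x402 spec that should be checked against the version your server speaks.

```javascript
// Sketch of the 402-then-retry shape. Not a full x402 client.
const PAYMENT_HEADER = "X-PAYMENT"; // assumed header name; verify against the spec

async function payAndFetch(url, init = {}, signPayment, fetchImpl = fetch) {
  const first = await fetchImpl(url, init);
  if (first.status !== 402) return first; // no payment was requested

  // The 402 body describes what the server accepts (amount, asset, recipient).
  const requirements = await first.json();

  // Hypothetical callback: returns an encoded, signed payment as a string.
  const payment = await signPayment(requirements);

  // Retry the same request with the payment attached.
  return fetchImpl(url, {
    ...init,
    headers: { ...(init.headers || {}), [PAYMENT_HEADER]: payment },
  });
}
```

Keeping signing behind a callback means the wallet logic stays out of the HTTP code, so an agent can swap wallets without touching the request path.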
There's also a free tier with no payment required:
curl -X POST https://pqs.onchainintel.net/api/score/free \
  -H "Content-Type: application/json" \
  -d '{"prompt": "your prompt here", "vertical": "general"}'
Returns:
{
  "score": 11,
  "out_of": 40,
  "grade": "D",
  "upgrade": "Get full dimension breakdown at /api/score for $0.001 USDC"
}
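The same call from Node might look like the following sketch. It assumes Node 18+ for the global `fetch`. `buildFreeScoreRequest` and `scoreFree` are hypothetical helper names, while the URL, request body fields, and response fields are the ones shown above.

```javascript
const FREE_ENDPOINT = "https://pqs.onchainintel.net/api/score/free";

// Build the fetch options separately so they can be inspected or logged.
function buildFreeScoreRequest(prompt, vertical = "general") {
  if (typeof prompt !== "string" || prompt.trim() === "") {
    throw new TypeError("prompt must be a non-empty string");
  }
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ prompt, vertical }),
  };
}

// Call the free tier and return only the fields shown in the sample response.
async function scoreFree(prompt, vertical) {
  const res = await fetch(FREE_ENDPOINT, buildFreeScoreRequest(prompt, vertical));
  if (!res.ok) throw new Error(`Scoring failed with HTTP ${res.status}`);
  const { score, out_of, grade } = await res.json();
  return { score, outOf: out_of, grade };
}

// Usage:
// scoreFree("your prompt here").then((r) => console.log(`${r.score}/${r.outOf} (${r.grade})`));
```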
The data angle
Every scored prompt pair goes into a corpus. At scale, this becomes training data for a domain-specific prompt quality model. The thesis is similar to what Andrej Karpathy described recently about LLM knowledge bases: the data compounds in value over time.
Links
- npm: x402-pqs
- GitHub: OnChainAIIntel/x402-pqs
- API: pqs.onchainintel.net
- Free endpoint: POST https://pqs.onchainintel.net/api/score/free
Would love feedback from anyone building agent workflows. What scoring dimensions would you add?