
I built an npm middleware that scores your LLM prompts before they hit your agent workflow

Dev.to AI · by OnChainAIIntel · April 4, 2026 · 3 min read


The problem with most LLM agent workflows is that nobody is checking the quality of the prompts going in.

Garbage in, garbage out. At scale, with agents firing hundreds of prompts per day, the garbage compounds fast.

I built x402-pqs to fix this. It's an Express middleware that intercepts prompts before they hit any LLM endpoint, scores them for quality, and adds the score to the request headers.

Install

```shell
npm install x402-pqs
```


Usage

```javascript
const express = require("express");
const { pqsMiddleware } = require("x402-pqs");

const app = express();
app.use(express.json());

app.use(pqsMiddleware({
  threshold: 10,      // warn if prompt scores below 10/40
  vertical: "crypto", // scoring context
  onLowScore: "warn", // warn | block | ignore
}));

app.post("/api/chat", (req, res) => {
  console.log("Prompt score:", req.pqs.score, req.pqs.grade);
  res.json({ message: "ok" });
});
```
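The package's internals aren't shown in this post, so here is a toy sketch of what a middleware of this shape could do. The `scorePrompt` heuristic below is a placeholder invented for illustration, not the real x402-pqs scoring; only the option names (`threshold`, `onLowScore`), the `req.pqs` shape, and the 0-40 scale come from the article.

```javascript
// Placeholder scorer: four crude checks worth 10 points each, summing to 0-40.
// The real x402-pqs uses 8 dimensions and cited frameworks; this is a stand-in.
function scorePrompt(prompt) {
  let score = 0;
  if (prompt.length > 20) score += 10;                              // some context
  if (/\b(list|explain|compare|format)\b/i.test(prompt)) score += 10; // clear directive
  if (/\d/.test(prompt)) score += 10;                               // concrete specifics
  if (!/\b(something|stuff|things)\b/i.test(prompt)) score += 10;   // low ambiguity
  const grades = ["F", "D", "C", "B", "A"];
  return { score, outOf: 40, grade: grades[Math.min(4, Math.floor(score / 10))] };
}

// Sketch of the middleware contract: attach req.pqs, set the X-PQS-* headers,
// and act on low scores according to onLowScore.
function pqsMiddlewareSketch({ threshold = 10, onLowScore = "warn" } = {}) {
  return (req, res, next) => {
    const pqs = scorePrompt(String((req.body && req.body.prompt) || ""));
    req.pqs = pqs;
    res.setHeader("X-PQS-Score", String(pqs.score));
    res.setHeader("X-PQS-Grade", pqs.grade);
    res.setHeader("X-PQS-Out-Of", String(pqs.outOf));
    if (pqs.score < threshold && onLowScore === "block") {
      return res.status(422).json({ error: "prompt below quality threshold", pqs });
    }
    if (pqs.score < threshold && onLowScore === "warn") {
      console.warn(`low prompt score: ${pqs.score}/${pqs.outOf}`);
    }
    next();
  };
}
```

The key design point, whatever the real scoring looks like, is that the middleware never mutates the prompt; it only annotates the request and optionally short-circuits it.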


Every request gets these headers added automatically:

  • X-PQS-Score → numeric score (0-40)

  • X-PQS-Grade → letter grade (A-F)

  • X-PQS-Out-Of → maximum score (40)
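Because the scores travel as plain headers, anything downstream can react to them without knowing about the scorer. A hypothetical consumer (the helper name and the 50% threshold are mine, not part of x402-pqs; Node lowercases incoming header names):

```javascript
// Decide whether a prompt is worth rewriting before resubmitting,
// based only on the X-PQS-* headers described above.
function shouldRetryPrompt(headers) {
  const score = Number(headers["x-pqs-score"]);
  const outOf = Number(headers["x-pqs-out-of"] || 40);
  // Arbitrary policy for illustration: under half marks means rewrite first.
  return score / outOf < 0.5;
}
```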

How the scoring works

PQS scores prompts across 8 dimensions using 5 cited academic frameworks:

Prompt-side (4 dimensions):

  • Specificity → does the prompt define what it wants precisely?

  • Context → does it give the model enough to work with?

  • Clarity → are the directives unambiguous?

  • Predictability → would different runs produce consistent results?

Output-side (4 dimensions):

  • Completeness, Relevancy, Reasoning depth, Faithfulness

Source frameworks: PEEM (Dongguk University, 2026) · RAGAS · MT-Bench · G-Eval · ROUGE
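Eight dimensions summing to a 40-point ceiling suggests 5 points per dimension; that per-dimension scale and the equal weighting in this sketch are my assumptions, since the post doesn't specify how the total is aggregated:

```javascript
// The 8 dimensions from the article, prompt-side then output-side.
const DIMENSIONS = [
  "specificity", "context", "clarity", "predictability",
  "completeness", "relevancy", "reasoningDepth", "faithfulness",
];

// Aggregate per-dimension scores (assumed 0-5 each) into the 0-40 total.
// Missing dimensions count as 0.
function aggregate(scores) {
  const total = DIMENSIONS.reduce((sum, d) => sum + (scores[d] || 0), 0);
  return { score: total, outOf: DIMENSIONS.length * 5 };
}
```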

Real example

This prompt: "who are the smartest wallets on solana right now"

Scored 9/40 → Grade D.

The optimized version scored 35/40 → Grade A.

A 26-point jump from the same underlying request.

Same model. Same API. Completely different output quality.

The payment layer

The scoring API uses x402, an HTTP-native micropayment protocol now governed by the Linux Foundation, with Coinbase, Cloudflare, AWS, Stripe, Google, Microsoft, Visa, and Mastercard as founding members.

Agents can call and pay for scoring autonomously — no API keys, no subscriptions. Just a wallet and $0.001 USDC per score.

There's also a free tier with no payment required:

```shell
curl -X POST https://pqs.onchainintel.net/api/score/free \
  -H "Content-Type: application/json" \
  -d '{"prompt": "your prompt here", "vertical": "general"}'
```


Returns:

```json
{
  "score": 11,
  "out_of": 40,
  "grade": "D",
  "upgrade": "Get full dimension breakdown at /api/score for $0.001 USDC"
}
```
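The same call from Node, assuming Node 18+ where `fetch` is global. The URL and response shape come from the article; `scoreFree` and `formatScore` are names I've made up for the sketch, with the formatting split out so it can be exercised without the network:

```javascript
// POST a prompt to the free scoring endpoint and return the parsed JSON.
async function scoreFree(prompt, vertical = "general") {
  const res = await fetch("https://pqs.onchainintel.net/api/score/free", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ prompt, vertical }),
  });
  if (!res.ok) throw new Error(`scoring failed: HTTP ${res.status}`);
  return res.json(); // { score, out_of, grade, upgrade }
}

// Pure helper: render the response shape shown above as "11/40 (D)".
function formatScore({ score, out_of, grade }) {
  return `${score}/${out_of} (${grade})`;
}
```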


The data angle

Every scored prompt pair goes into a corpus. At scale this becomes training data for a domain-specific prompt quality model. The thesis is similar to what Andrej Karpathy described recently about LLM knowledge bases: the data compounds in value over time.


Would love feedback from anyone building agent workflows. What scoring dimensions would you add?

