
I built an npm middleware that scores your LLM prompts before they hit your agent workflow

Dev.to AI · by OnChainAIIntel · April 4, 2026 · 3 min read

The problem with most LLM agent workflows is that nobody is checking the quality of the prompts going in.

Garbage in, garbage out. But at scale, with agents firing hundreds of prompts per day, the garbage compounds fast.

I built x402-pqs to fix this. It's an Express middleware that intercepts prompts before they hit any LLM endpoint, scores them for quality, and adds the score to the request headers.

Install

npm install x402-pqs

Usage

const express = require("express");
const { pqsMiddleware } = require("x402-pqs");

const app = express();
app.use(express.json());

app.use(pqsMiddleware({
  threshold: 10,       // warn if prompt scores below 10/40
  vertical: "crypto",  // scoring context
  onLowScore: "warn",  // warn | block | ignore
}));

app.post("/api/chat", (req, res) => {
  console.log("Prompt score:", req.pqs.score, req.pqs.grade);
  res.json({ message: "ok" });
});

Every request gets these headers added automatically:

X-PQS-Score —> numeric score (0-40)
X-PQS-Grade —> letter grade (A-F)
X-PQS-Out-Of —> maximum score (40)
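To make the warn | block | ignore options concrete, here is a minimal sketch of how a middleware with this shape could behave. It is not the x402-pqs source. The factory name makePolicyMiddleware, the injected scorePrompt function, the fakeResponse helper, and the 422 status code are all hypothetical. The header names come from the list above, written in lowercase because Node lowercases incoming request header names.

```javascript
// Hypothetical sketch, NOT the x402-pqs implementation.
// scorePrompt is an injected stand-in assumed to return { score, grade }.
function makePolicyMiddleware({ threshold = 10, onLowScore = "warn", scorePrompt }) {
  return function pqsLikeMiddleware(req, res, next) {
    const result = scorePrompt(req.body && req.body.prompt);
    req.pqs = result; // handlers can read req.pqs.score and req.pqs.grade

    // Attach the score to the request headers, as the article describes.
    req.headers["x-pqs-score"] = String(result.score);
    req.headers["x-pqs-grade"] = result.grade;
    req.headers["x-pqs-out-of"] = "40";

    const isLow = result.score < threshold;
    if (isLow && onLowScore === "block") {
      res.statusCode = 422; // assumed status code for a rejected prompt
      return res.end(JSON.stringify({ error: "prompt score below threshold" }));
    }
    if (isLow && onLowScore === "warn") {
      console.warn(`Low prompt score: ${result.score}/40`);
    }
    next(); // "ignore", or a score at or above the threshold
  };
}

// Tiny stand-in for a response object so the sketch runs without Express.
function fakeResponse() {
  return {
    statusCode: 200,
    body: null,
    end(payload) { this.body = payload; },
  };
}
```

Injecting scorePrompt keeps the policy logic testable on its own, without network calls or a running server.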

How the scoring works

PQS scores prompts across 8 dimensions using 5 cited academic frameworks:

Prompt-side (4 dimensions):

Specificity —> does the prompt define what it wants precisely?
Context —> does it give the model enough to work with?
Clarity —> are the directives unambiguous?
Predictability —> would different runs produce consistent results?

Output-side (4 dimensions):

Completeness, Relevancy, Reasoning depth, Faithfulness

Source frameworks: PEEM (Dongguk University, 2026) · RAGAS · MT-Bench · G-Eval · ROUGE
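As a rough illustration of how 8 dimensions can add up to a score out of 40, here is a sketch that assumes each dimension is rated 0 to 5 and weighted equally. The article does not state the per-dimension range, the weighting, or the grade cutoffs. The cutoffs below are invented and chosen only so that they agree with the examples in this post (9/40 and 11/40 are D, 35/40 is A).

```javascript
// Dimension names follow the list above; the keys themselves are illustrative.
const DIMENSIONS = [
  "specificity", "context", "clarity", "predictability",       // prompt-side
  "completeness", "relevancy", "reasoningDepth", "faithfulness" // output-side
];
const MAX_PER_DIMENSION = 5; // assumption: 8 dimensions x 5 points = 40

// Sum the per-dimension ratings, rejecting missing or out-of-range values.
function totalScore(scores) {
  return DIMENSIONS.reduce((sum, name) => {
    const value = scores[name];
    if (!Number.isInteger(value) || value < 0 || value > MAX_PER_DIMENSION) {
      throw new RangeError(`${name} must be an integer from 0 to ${MAX_PER_DIMENSION}`);
    }
    return sum + value;
  }, 0);
}

// Hypothetical cutoffs, NOT the published PQS grading scale.
function gradeFor(total) {
  if (total >= 32) return "A";
  if (total >= 24) return "B";
  if (total >= 16) return "C";
  if (total >= 8) return "D";
  return "F";
}
```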

Real example

This prompt: "who are the smartest wallets on solana right now"

Scored 9/40 —> Grade D.

The optimized version scored 35/40 —> Grade A.

+84% improvement (a 26-point jump on the 40-point scale).

Same model. Same API. Completely different output quality.
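The post does not say how the 84% figure is derived. The sketch below computes three common ways of expressing the change from 9 to 35 out of 40. Only one of them, the share of the remaining headroom that was closed, rounds to 84, so that may be the intended reading, but this is a guess rather than a documented formula. The function name improvement and its field names are illustrative.

```javascript
// Three ways to express a before/after change on a bounded scale.
function improvement(before, after, max) {
  const pointGain = after - before;
  return {
    pointGain,                                               // raw points gained
    relativeGainPct: (pointGain / before) * 100,             // gain relative to the starting score
    headroomClosedPct: (pointGain / (max - before)) * 100,   // share of the gap to max that was closed
  };
}

const change = improvement(9, 35, 40);
console.log(change.pointGain);                       // 26
console.log(Math.round(change.relativeGainPct));     // 289
console.log(Math.round(change.headroomClosedPct));   // 84
```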

The payment layer

The scoring API uses x402, an HTTP-native micropayment protocol now governed by the Linux Foundation, with Coinbase, Cloudflare, AWS, Stripe, Google, Microsoft, Visa, and Mastercard as founding members.

Agents can call and pay for scoring autonomously — no API keys, no subscriptions. Just a wallet and $0.001 USDC per score.

There's also a free tier with no payment required:

curl -X POST https://pqs.onchainintel.net/api/score/free \
  -H "Content-Type: application/json" \
  -d '{"prompt": "your prompt here", "vertical": "general"}'

Returns:

{
  "score": 11,
  "out_of": 40,
  "grade": "D",
  "upgrade": "Get full dimension breakdown at /api/score for $0.001 USDC"
}
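The same free-tier call can be made from Node. This is a minimal sketch that assumes Node 18 or later, where fetch is available globally. The URL, request body, and response fields are taken from the curl example and response above. The helper names buildScoreRequest, scorePrompt, and isBelowThreshold are hypothetical and are not part of the x402-pqs package.

```javascript
const FREE_ENDPOINT = "https://pqs.onchainintel.net/api/score/free"; // URL from the article

// Build the request separately so it can be inspected without a network call.
function buildScoreRequest(prompt, vertical = "general") {
  return {
    url: FREE_ENDPOINT,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ prompt, vertical }),
    },
  };
}

// Send the prompt for scoring. Expected response shape, per the article:
// { score, out_of, grade, upgrade }
async function scorePrompt(prompt, vertical) {
  const { url, options } = buildScoreRequest(prompt, vertical);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`Scoring request failed: HTTP ${response.status}`);
  }
  return response.json();
}

// Mirrors the middleware's threshold option: true when the score is too low.
function isBelowThreshold(result, threshold = 10) {
  return result.score < threshold;
}
```

A caller would typically await scorePrompt("your prompt here") and then decide whether to send the prompt on to the model based on isBelowThreshold(result).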

The data angle

Every scored prompt pair goes into a corpus. At scale, this becomes training data for a domain-specific prompt quality model. The thesis is similar to what Andrej Karpathy described recently about LLM knowledge bases: the data compounds in value over time.
