This CLI Rewrites Your AI Prompts — No LLM, No API, 50ms (Open Source)
I score every prompt I send to Claude Code. My average is 38 out of 100.
Not because I'm bad at prompting — because I'm human. At 2am debugging an auth bug, I don't carefully structure my request. I type "fix the auth bug" and hit enter.
I built a scoring engine. Then a compression engine. They told me what was wrong but didn't fix anything. So I built the part I actually wanted: a rewrite engine that takes a lazy prompt and makes it better. No LLM. No API call. Just rules extracted from NLP papers.
Before / After
```
$ reprompt rewrite "I was wondering if you could maybe help me fix the authentication bug that seems to be kind of broken"

34 → 56 (+22)

╭─ Rewritten ────────────────────────────────────────╮
│ Help me fix the authentication bug that seems to   │
│ be broken.                                         │
╰────────────────────────────────────────────────────╯

Changes
  ✓ Removed filler (18% shorter)
  ✓ Removed hedging language

You should also
  → Add actual code snippets or error messages for context
  → Reference specific files or functions by name
  → Add constraints (e.g., "Do not modify existing tests")
```
The "You should also" section is honestly the most useful part. The machine handles what it can — filler removal, restructuring — and tells you what only a human can add.
What the Rewriter Does
Four transformations, applied in order:

1. Strip filler. "Please help me with", "basically what I need is", "I would like you to" — these add tokens without adding information. 40+ English rules and 40+ Chinese rules, reused from the compression engine.

2. Front-load instructions. If your key ask is buried in the middle, it moves to the front. This matters: Stanford's "Lost in the Middle" paper found models recall instructions at the start 2-3x better than instructions in the middle.

3. Echo key requirements. For long prompts (40+ words) with low repetition, the main instruction gets repeated at the end. Google Research (arXiv:2512.14982) found moderate repetition improves recall by up to 76%. This pass only fires when the prompt is long enough that the model might lose the thread.

4. Remove hedging. "Maybe", "perhaps", "I was wondering", "kind of", "sort of" — these weaken the instruction signal without adding information. 12 regex patterns.
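To make the pipeline concrete, here is a minimal Python sketch of how rule-based passes like these compose. It's illustrative only, not reprompt's actual source: the pattern lists, the imperative-verb list, and the thresholds are stand-ins for the real rule tables.

```python
import re

# Stand-in rule tables (the real engine has 40+ per language).
FILLER = [
    r"\bplease help me with\b",
    r"\bbasically what I need is\b",
    r"\bI would like you to\b",
    r"\bI was wondering if you could\b",
]
HEDGES = [r"\bmaybe\b", r"\bperhaps\b", r"\bkind of\b", r"\bsort of\b"]

def strip_patterns(text: str, patterns: list[str]) -> str:
    for p in patterns:
        text = re.sub(p, "", text, flags=re.IGNORECASE)
    return re.sub(r"\s{2,}", " ", text).strip()

def front_load(text: str) -> str:
    # Move the first imperative sentence to the front (toy heuristic).
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    verbs = ("fix", "add", "write", "refactor", "explain")
    for i, s in enumerate(sentences):
        if s.lower().startswith(verbs):
            return " ".join([sentences[i]] + sentences[:i] + sentences[i + 1:])
    return text

def echo_instruction(text: str, min_words: int = 40) -> str:
    # For long prompts only, repeat the leading instruction at the end.
    if len(text.split()) < min_words:
        return text
    first = re.split(r"(?<=[.!?])\s+", text)[0]
    return f"{text}\n\nReminder: {first}"

def rewrite(prompt: str) -> str:
    out = strip_patterns(prompt, FILLER)   # 1. strip filler
    out = strip_patterns(out, HEDGES)      # 4. remove hedging
    out = front_load(out)                  # 2. front-load the ask
    return echo_instruction(out)           # 3. echo on long prompts
```

Because every pass is a pure function of its input, the whole rewrite stays deterministic and runs in microseconds per rule.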
Why Not Use an LLM to Rewrite?
I thought about it. Three reasons I went rule-based:
It's fast. Under 50ms. You can run it in a pre-commit hook or CI pipeline and nobody notices.
It's deterministic. Same input, same output. I actually use reprompt lint in CI with a score threshold — if I used an LLM rewriter, my CI would randomly fail on Tuesdays because GPT was feeling creative.
It's private. My prompts contain production error messages, internal file paths, sometimes API keys I forgot to redact. That's exactly the kind of thing I don't want to send to another LLM for "improvement."
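To see why determinism matters for a CI gate, consider a toy scorer. The features and weights below are invented for illustration — reprompt's real scoring is calibrated differently — but the property that matters is the same: identical input always yields an identical score, so a threshold gate never flakes.

```python
import re

def score(prompt: str) -> int:
    """Toy heuristic score 0-100; higher = more actionable prompt."""
    s = 50
    if len(prompt.split()) < 5:
        s -= 20                                           # too terse to act on
    if re.search(r"\b(maybe|perhaps|kind of)\b", prompt, re.I):
        s -= 10                                           # hedging weakens the ask
    if re.search(r"\b\w+\.(py|ts|js|go|rs)\b", prompt):
        s += 15                                           # names a concrete file
    if re.search(r"(Traceback|Error|Exception)", prompt):
        s += 15                                           # includes an error message
    return max(0, min(100, s))

def ci_gate(prompt: str, threshold: int = 50) -> bool:
    # Deterministic: safe to fail a build on, unlike an LLM judge.
    return score(prompt) >= threshold
```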
The Broader Toolkit
rewrite is one command. Here's what else is in the box:
```
reprompt check "your prompt"          # full diagnostic: score + lint + rewrite
reprompt build "task" --file auth.ts  # assemble a prompt from components
reprompt compress "your prompt"       # save 40-60% tokens
reprompt scan                         # discover sessions from 9 AI tools
reprompt privacy --deep               # find leaked API keys in sessions
reprompt lint --score-threshold 50    # CI quality gate (GitHub Action included)
```
Auto-discovers sessions from Claude Code, Cursor, Aider, Codex CLI, Gemini CLI, Cline, and OpenClaw; ChatGPT and Claude.ai are supported via export. The browser extension shows a live score badge as you type; click it for inline suggestions.
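Session discovery can be as simple as probing known per-tool directories for transcript files. The paths below are illustrative guesses, not reprompt's actual discovery table — each tool's real location may differ:

```python
from pathlib import Path

# Illustrative candidate session locations (hypothetical paths).
CANDIDATE_DIRS = {
    "claude-code": "~/.claude/projects",
    "cursor": "~/.cursor",
    "aider": "~/.aider",
}

def discover_sessions(candidates: dict[str, str] = CANDIDATE_DIRS) -> dict[str, int]:
    """Return {tool: transcript_count} for the directories that exist."""
    found = {}
    for tool, raw in candidates.items():
        root = Path(raw).expanduser()
        if root.is_dir():
            # Count anything that looks like a session transcript (.json/.jsonl).
            found[tool] = sum(1 for p in root.rglob("*.json*") if p.is_file())
    return found
```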
What I still haven't figured out
The rewriter handles maybe 30% of what makes a good prompt. The other 70% is stuff only you know — the error message you're staring at, the file you just edited, the thing you tried that didn't work. No tool can add that for you.
I also don't think the scoring is "right" yet. A 3-word prompt from someone deep in a debugging session can be more effective than a beautifully structured 200-word request from someone who doesn't understand the codebase. Context that lives in your head doesn't show up in a score.
The weights are calibrated against 4 NLP papers, but papers study prompts in isolation. Real prompting happens in the middle of a conversation, at 2am, when you've already explained the problem three times. I'm not sure how to score that.
Try it
```
pip install reprompt-cli
reprompt check "your worst prompt"
reprompt rewrite "your worst prompt"
```
MIT, local-only, 1,800+ tests. GitHub · PyPI
Honestly curious: do you think about your prompts before sending them, or is it more stream-of-consciousness? I've been tracking mine for months and I still default to lazy prompts when I'm tired. Starting to think that's just how humans work.
DEV Community
https://dev.to/chrishohoho/this-cli-rewrites-your-ai-prompts-no-llm-no-api-50ms-open-source-30p6
