How to Actually Monitor Your LLM Costs (Without a Spreadsheet)
I used to think I had a handle on my AI spending. I had a rough mental model: Claude is cheap, GPT-4 is expensive, Gemini is somewhere in the middle. Good enough, right?
Then I started actually logging what I was burning through. The gap between my mental model and reality was embarrassing.
The problem with just watching your bill
Every major AI provider gives you a monthly bill. That's fine for accounting. It's useless for actually understanding your costs.
By the time the invoice shows up, the context is gone. You don't remember which project, which feature, which dumb experiment ate half your budget. You just see a number and try to feel bad about it.
What you actually need is visibility at the call level. How many tokens did that chat completion use? How expensive was that context window? Is the cost per feature trending up as my codebase grows?
None of the dashboards the providers give you answer these questions in real time.
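To make the call-level math concrete: most chat APIs return a usage block with input and output token counts, and the cost is just those counts times a per-token rate. Here's a minimal sketch, with placeholder prices rather than any provider's real rates (always check the current pricing page):

```python
# (input, output) USD per million tokens. These numbers are illustrative
# placeholders, NOT real rates -- look up your provider's pricing page.
PRICE_PER_MTOK = {
    "haiku": (0.80, 4.00),
    "sonnet": (3.00, 15.00),
}

def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one call from its token counts."""
    in_rate, out_rate = PRICE_PER_MTOK[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000
```

Trivial arithmetic, but seeing it per call is exactly what the monthly invoice can't give you.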
What I tried first
Spreadsheets. Obviously. I had a tab for each provider, manually entered rough token counts after each session, tried to estimate costs.
This lasted about a week before I stopped maintaining it. The friction was too high. I'd forget to log things. I'd ballpark numbers. The data became meaningless noise.
I also tried building a lightweight proxy that logged every API call. That actually worked technically, but then I had to maintain a piece of infrastructure just to track my own costs. As a solo dev building two apps simultaneously, I don't have the bandwidth for that.
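For what it's worth, the core of that proxy never needed to be real infrastructure. It boiled down to something like this append-only JSONL logger, where the shape of the usage dict and the tag scheme are my own assumptions, not any provider's format:

```python
import json
import time

def log_call(path: str, model: str, usage: dict, tag: str) -> None:
    """Append one API call's token usage to a JSONL file, tagged by
    project/feature so the number on the invoice has context later."""
    record = {
        "ts": time.time(),
        "model": model,
        "tag": tag,
        "input_tokens": usage.get("input_tokens", 0),
        "output_tokens": usage.get("output_tokens", 0),
    }
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")
```

Ten lines, no server. The maintenance burden wasn't the code; it was remembering to wire it into every call site.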
The habit that actually worked
I started paying attention to token counts in real time, at the point of use, not after the fact.
This sounds obvious but there's a specific reason it works: when you see the number immediately, you can actually connect cause and effect. Oh, that system prompt is 2,000 tokens every single call. Oh, I'm re-sending the entire conversation history when I only needed the last three messages.
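That second mistake, re-sending the whole conversation, has a mechanical fix. Here's a minimal sketch of history trimming, assuming the usual role/content message shape rather than any specific API's format: keep the system prompt, drop everything but the last few turns.

```python
def trim_history(messages: list[dict], keep_last: int = 3) -> list[dict]:
    """Keep the system prompt (if any) plus only the last few turns,
    instead of re-sending the entire conversation on every call."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-keep_last:]
```

If your system prompt is 2,000 tokens, that part you're stuck with; the forty-message tail you're dragging along out of habit, you're not.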
For my Mac menu bar workflows, I ended up using TokenBar — it shows live token counts and estimated cost right in the menu bar as I work. The thing about having it persistent and always visible is that it changes how you think. You start making micro-decisions constantly: is this context worth the extra tokens? Is this feature request worth spinning up a full Claude session or can I handle it with a lighter model?
The three questions I ask now
After a few months of actually paying attention, I've settled into asking three things about every AI interaction:
- What's the token density? Not just how many tokens, but how much useful work per token. A 5,000-token call that produces a complete working feature is cheap. A 1,000-token call that produces a vague response I have to iterate on three more times is expensive.
- Is this the right model for this job? I was defaulting to Claude Sonnet for everything for a long time. Then I realized: for quick validation tasks, formatting, or simple transformations, Haiku costs a fraction as much and is fast enough that it doesn't matter. I probably cut my costs by roughly 40% just by routing tasks properly.
- Am I paying for laziness? This one stings. A lot of my token burn came from not thinking carefully about prompts before sending them. I'd throw a messy, half-formed request at the API, get a mediocre response, and iterate three more times. A little upfront clarity would have made it one call instead of four.
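The second question is the easiest one to automate. A toy router along these lines captures the idea, with task names and model labels made up purely for illustration:

```python
# Hypothetical task-based router: cheap, fast model for mechanical work,
# the heavier model only when the task actually needs it.
CHEAP_TASKS = {"format", "validate", "extract", "summarize"}

def pick_model(task: str) -> str:
    """Route simple transformations to the cheap model, everything
    else to the default workhorse."""
    return "haiku" if task in CHEAP_TASKS else "sonnet"
```

Even a lookup table this dumb beats defaulting everything to the expensive model, which is what I was doing.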
What actual monitoring looks like
I check my token usage the same way I check my git diffs — regularly, not obsessively, but with intention.
The key insight is that monitoring shouldn't require effort. If you have to go somewhere to check it, you won't. The visibility needs to be ambient — always there, not intrusive.
Menu bar for live tracking. Provider dashboards for weekly reviews. That's the whole stack.
The spreadsheet phase was necessary because it forced me to pay attention, even if the data was garbage. What replaced it is better because it's automatic — the numbers are just there, and over time you develop intuitions about what's normal and what's a red flag.
The meta-lesson
AI costs are weird because they feel like they should be predictable (it's just API calls!) but they're actually highly variable based on how you're working, what you're building, and how carefully you're thinking.
You can't optimize what you're not measuring. And you can't sustain measurement that requires manual effort.
Find the lowest-friction way to make costs visible in your actual workflow, not in a separate dashboard you have to remember to check. That's the whole game.