Asthenosphere
================================================================
ASTHENOSPHERE NPU INFERENCE METRICS

Hardware:
  Device:      AMD Phoenix XDNA gen1 (AIE2)
  Tiles:       12/12 (complete transformer pipeline)
  Device ID:   /dev/accel/accel0
  Status:      ACTIVE
  Reliability: 100%

Pipeline:
  PreScale > Q proj > RoPE > Attention > O proj > Attn ResAdd
  PreScale2 > Gate+SiLU+Up > EltMul > Down > FFN ResAdd > Score Head
  14 ops, zero CPU/GPU during NPU compute

SESSION AVERAGES (7 messages)
  Avg tokens/msg:  64.7
  Avg elapsed/msg: 83ms
  Avg eff tok/s:   3866
  Avg acceptance:  91.8%
  Avg cost/msg:    21.3 Motes

ALL-TIME AVERAGES (7 messages)
  Avg tokens/msg:  64.7
  Avg elapsed/msg: 83ms
  Avg eff tok/s:   3866
  Avg acceptance:  91.8%
  Avg cost/msg:    21.3 Motes

PER-DISPATCH LOG (7 entries)
  Time   Tokens   Dispatches   Elapsed   Eff tok/s   Accept%   Motes
  16:
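The session and all-time averages above read as straight per-message means over the per-dispatch log. The sketch below shows that roll-up in Python; the DispatchEntry type, its field names, and the session_averages helper are illustrative assumptions, not Asthenosphere's actual implementation.

    # Minimal sketch (assumed shape, not the real Asthenosphere code) of rolling
    # per-message dispatch log entries up into the session averages shown above.
    from dataclasses import dataclass
    from statistics import mean

    @dataclass
    class DispatchEntry:
        tokens: int          # tokens produced for the message
        dispatches: int      # NPU kernel dispatches used for the message
        elapsed_ms: float    # wall-clock time for the message
        eff_tok_s: float     # effective tokens/s as reported by the runtime
        accept_pct: float    # acceptance rate, in percent
        motes: float         # cost of the message in Motes

    def session_averages(log: list[DispatchEntry]) -> dict[str, float]:
        """Per-message means over the dispatch log, one value per dashboard field."""
        return {
            "avg_tokens_per_msg":      mean(e.tokens for e in log),
            "avg_elapsed_ms_per_msg":  mean(e.elapsed_ms for e in log),
            "avg_eff_tok_s":           mean(e.eff_tok_s for e in log),
            "avg_acceptance_pct":      mean(e.accept_pct for e in log),
            "avg_cost_motes_per_msg":  mean(e.motes for e in log),
        }

    # Example: one hypothetical entry shaped like the log columns above.
    log = [DispatchEntry(tokens=65, dispatches=14, elapsed_ms=83.0,
                         eff_tok_s=3866.0, accept_pct=91.8, motes=21.3)]
    print(session_averages(log))

Each entry in this sketch corresponds to one row of the PER-DISPATCH LOG, so the session block and the all-time block differ only in how many rows are averaged.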


