Exclusive | Caltech Researchers Claim Radical Compression of High-Fidelity AI Models - WSJ
Exclusive | Caltech Researchers Claim Radical Compression of High-Fidelity AI Models WSJ
Could not retrieve the full article text.
Read on Google News: LLM →Sign in to highlight and annotate this article

Conversation starters
Daily AI Digest
Get the top 5 AI stories delivered to your inbox every morning.
More about
modelresearch
Show HN: Gemma Gem – AI model embedded in a browser – no API keys, no cloud
Gemma Gem is a Chrome extension that loads Google's Gemma 4 (2B) through WebGPU in an offscreen document and gives it tools to interact with any webpage: read content, take screenshots, click elements, type text, scroll, and run JavaScript. You get a small chat overlay on every page. Ask it about the page and it (usually) figures out which tools to call. It has a thinking mode that shows chain-of-thought reasoning as it works. It's a 2B model in a browser. It works for simple page questions and running JavaScript, but multi-step tool chains are unreliable and it sometimes ignores its tools entirely. The agent loop has zero external dependencies and can be extracted as a standalone library if anyone wants to experiment with it. Comments URL: https://news.ycombinator.com/item?id=47655367 Poi

Knowledge Map
Connected Articles — Knowledge Graph
This article is connected to other articles through shared AI topics and tags.
More in Models

Show HN: Gemma Gem – AI model embedded in a browser – no API keys, no cloud
Gemma Gem is a Chrome extension that loads Google's Gemma 4 (2B) through WebGPU in an offscreen document and gives it tools to interact with any webpage: read content, take screenshots, click elements, type text, scroll, and run JavaScript. You get a small chat overlay on every page. Ask it about the page and it (usually) figures out which tools to call. It has a thinking mode that shows chain-of-thought reasoning as it works. It's a 2B model in a browser. It works for simple page questions and running JavaScript, but multi-step tool chains are unreliable and it sometimes ignores its tools entirely. The agent loop has zero external dependencies and can be extracted as a standalone library if anyone wants to experiment with it. Comments URL: https://news.ycombinator.com/item?id=47655367 Poi

![[R] Reference model free behavioral discovery of AudiBench model organisms via Probe-Mediated Adaptive Auditing](https://d2xsxph8kpxj0f.cloudfront.net/310419663032563854/konzwo8nGf8Z4uZsMefwMr/default-img-graph-nodes-a2pnJLpyKmDnxKWLd5BEAb.webp)
[R] Reference model free behavioral discovery of AudiBench model organisms via Probe-Mediated Adaptive Auditing
Anthropic's AuditBench - 56 Llama 3.3 70B models with planted hidden behaviors - their best agent detects the behaviros 10-13% of the time (42% with a super-agent aggregating many parallel runs). a central finding is the "tool-to-agent gap" - white-box interpretability tools that work in standalone evaluation fail to help the agent in practice. most auditing work uses the base model as a reference to compare against. i wanted to know if you can detect these modifications blind - no reference model, no training data, just the target model itself. maybe you can? and the method is embarrassingly simple. LoRA fine-tuning tends to modify later layers more than earlier ones. so i train a Ridge regression from early-layer activations (~L12) to late-layer activations (~L60) and look at the residua



Discussion
Sign in to join the discussion
No comments yet — be the first to share your thoughts!