
How Pruning Reshapes Features: Sparse Autoencoder Analysis of Weight-Pruned Language Models

arXiv, March 26, 2026

Authors: Hector Borobia, Elies Seguí-Mas, Guillermina Tormo-Carbó


Abstract: Weight pruning is a standard technique for compressing large language models, yet its effect on learned internal representations remains poorly understood. We present the first systematic study of how unstructured pruning reshapes the feature geometry of language models, using Sparse Autoencoders (SAEs) as interpretability probes. Across three model families (Gemma 3 1B, Gemma 2 2B, Llama 3.2 1B), two pruning methods (magnitude and Wanda), and six sparsity levels (0-60%), we investigate five research questions spanning seed stability, feature survival, SAE transferability, feature fragility, and causal relevance. Our most striking finding is that rare SAE features (those with low firing rates) survive pruning far better than frequent ones, with within-condition Spearman correlations of rho = -1.0 in 11 of 17 experimental conditions. This counter-intuitive result suggests that pruning acts as implicit feature selection, preferentially destroying high-frequency generic features while preserving specialized rare ones. We further show that Wanda pruning preserves feature structure up to 3.7x better than magnitude pruning, that pre-trained SAEs remain viable on Wanda-pruned models up to 50% sparsity, and that geometric feature survival does not predict causal importance, a dissociation with implications for interpretability under compression.
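As a rough illustration of how the two pruning criteria named in the abstract can disagree, the sketch below builds keep-masks for a single toy weight row. It is not the authors' code. The Wanda score used here, |w_ij| times the L2 norm of input feature j over calibration samples and compared within each output row, follows the published Wanda criterion. Applying magnitude pruning per row (rather than per layer or globally), the function names, and the toy values of `W` and `X` are all assumptions made for illustration.

```python
import math

def _row_mask(scores, sparsity):
    """Per output row, drop the lowest-scoring fraction `sparsity` of weights."""
    mask = []
    for row in scores:
        k = int(len(row) * sparsity)  # number of weights to drop in this row
        drop = set(sorted(range(len(row)), key=lambda j: row[j])[:k])
        mask.append([0 if j in drop else 1 for j in range(len(row))])
    return mask

def magnitude_mask(W, sparsity):
    """Magnitude pruning: score each weight by |w| alone.
    Assumption: comparison group is one output row, to mirror Wanda."""
    return _row_mask([[abs(w) for w in row] for row in W], sparsity)

def wanda_mask(W, X, sparsity):
    """Wanda-style pruning: score = |w_ij| * ||x_j||_2, where x_j is
    input feature j collected over the calibration samples in X."""
    n_in = len(W[0])
    norms = [math.sqrt(sum(x[j] ** 2 for x in X)) for j in range(n_in)]
    scores = [[abs(w) * norms[j] for j, w in enumerate(row)] for row in W]
    return _row_mask(scores, sparsity)

# Hypothetical toy data: one output neuron with four inputs,
# and two calibration samples of input activations.
W = [[0.5, -0.1, 0.3, 0.2]]
X = [[0.1, 4.0, 0.1, 1.0],
     [0.1, 3.0, 0.2, 1.0]]

print(magnitude_mask(W, 0.5))  # [[1, 0, 1, 0]]
print(wanda_mask(W, X, 0.5))   # [[0, 1, 0, 1]]
```

On this toy row the two criteria keep opposite weights: magnitude pruning discards the small weight at index 1, while the activation-aware score retains it because its input feature has a large norm.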

Comments: 27 pages, 6 figures, 6 tables. Analysis covers Gemma 3 1B, Gemma 2 2B, and Llama 3.2 1B across 22 experimental runs. Code and data available at this https URL

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)

MSC classes: 68T07, 68T50

ACM classes: I.2.7; I.2.6

Cite as: arXiv:2603.25325 [cs.LG]

(or arXiv:2603.25325v1 [cs.LG] for this version)

https://doi.org/10.48550/arXiv.2603.25325

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Hector Borobia
[v1] Thu, 26 Mar 2026 11:12:42 UTC (1,431 KB)

Original source: arXiv, https://arxiv.org/abs/2603.25325v1
