
How Pruning Reshapes Features: Sparse Autoencoder Analysis of Weight-Pruned Language Models

arXiv, March 26, 2026

Hector Borobia, Elies Seguí-Mas, Guillermina Tormo-Carbó


Abstract: Weight pruning is a standard technique for compressing large language models, yet its effect on learned internal representations remains poorly understood. We present the first systematic study of how unstructured pruning reshapes the feature geometry of language models, using Sparse Autoencoders (SAEs) as interpretability probes. Across three model families (Gemma 3 1B, Gemma 2 2B, Llama 3.2 1B), two pruning methods (magnitude and Wanda), and six sparsity levels (0-60%), we investigate five research questions spanning seed stability, feature survival, SAE transferability, feature fragility, and causal relevance. Our most striking finding is that rare SAE features (those with low firing rates) survive pruning far better than frequent ones, with within-condition Spearman correlations of rho = -1.0 in 11 of 17 experimental conditions. This counter-intuitive result suggests that pruning acts as implicit feature selection, preferentially destroying high-frequency generic features while preserving specialized rare ones. We further show that Wanda pruning preserves feature structure up to 3.7x better than magnitude pruning, that pre-trained SAEs remain viable on Wanda-pruned models up to 50% sparsity, and that geometric feature survival does not predict causal importance, a dissociation with implications for interpretability under compression.
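To make the rho = -1.0 figure concrete, the following is a minimal Python sketch of a Spearman rank correlation between feature firing rate and feature survival. The abstract does not say how features are grouped or how survival is scored, so the function names (`average_ranks`, `spearman_rho`) and the four data points are hypothetical and chosen only for illustration; they are not results from the paper.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the window over any run of equal values.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks.

    Assumes neither input is constant (otherwise the denominator is zero).
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical illustration only: mean firing rate per feature group and the
# fraction of each group that survives pruning. NOT data from the paper.
firing_rate = [0.001, 0.01, 0.05, 0.2]
survival = [0.90, 0.70, 0.45, 0.20]

rho = spearman_rho(firing_rate, survival)
print(rho)
```

Because Spearman's rho depends only on rank order, a value of exactly -1.0 means the ordering by survival is the exact reverse of the ordering by firing rate. It says nothing about how large the survival gap between rare and frequent features is.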

Comments: 27 pages, 6 figures, 6 tables. Analysis covers Gemma 3 1B, Gemma 2 2B, and Llama 3.2 1B across 22 experimental runs. Code and data available at this https URL

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)

MSC classes: 68T07, 68T50

ACM classes: I.2.7; I.2.6

Cite as: arXiv:2603.25325 [cs.LG]

(or arXiv:2603.25325v1 [cs.LG] for this version)

https://doi.org/10.48550/arXiv.2603.25325

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Hector Borobia
[v1] Thu, 26 Mar 2026 11:12:42 UTC (1,431 KB)
