
TinyLoRA – Learning to Reason in 13 Parameters

Hacker News · [Submitted on 4 Feb 2026]

Abstract: Recent research has shown that language models can learn to reason, often via reinforcement learning. Some work even trains low-rank parameterizations for reasoning, but conventional LoRA cannot scale below the model dimension. We question whether even rank-1 LoRA is necessary for learning to reason and propose TinyLoRA, a method for scaling low-rank adapters down to sizes as small as a single parameter. Within our new parameterization, we are able to train the 8B-parameter version of Qwen2.5 to 91% accuracy on GSM8K with only 13 trained parameters in bf16 (26 total bytes). We find this trend holds in general: we are able to recover 90% of the performance improvements while training 1000x fewer parameters across a suite of more difficult learning-to-reason benchmarks such as AIME, AMC, and MATH500. Notably, we are only able to achieve such strong performance with RL: models trained using SFT require 100-1000x larger updates to reach the same performance.
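The abstract does not spell out the parameterization, but one way to get below rank-1 LoRA's cost (which still trains two full d-dimensional vectors per adapted matrix) is to freeze a random rank-1 direction and learn only its scale. The sketch below is a minimal illustration of that idea in PyTorch, not the paper's actual method: the class name TinyLoRALinear, the frozen u/v buffers, and the single scalar s are all assumptions made here for illustration.

```python
# Hypothetical sketch of a sub-rank-1 adapter in the spirit of TinyLoRA.
# Conventional rank-1 LoRA trains both rank-1 factors; here the factors
# u and v are frozen random buffers and the ONLY trained parameter is
# the scalar s. This is an assumption, not the paper's published method.

import torch
import torch.nn as nn


class TinyLoRALinear(nn.Module):  # hypothetical name
    def __init__(self, base: nn.Linear, seed: int = 0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # base weights stay frozen, as in LoRA
        d_out, d_in = base.weight.shape
        g = torch.Generator().manual_seed(seed)
        # Frozen random rank-1 directions; only their shared scale is learned.
        self.register_buffer("u", torch.randn(d_out, generator=g) / d_out**0.5)
        self.register_buffer("v", torch.randn(d_in, generator=g) / d_in**0.5)
        self.s = nn.Parameter(torch.zeros(1))  # the single trained parameter

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # (W + s * u v^T) x = W x + s * (v . x) * u,
        # so the rank-1 update u v^T is never materialized.
        return self.base(x) + self.s * (x @ self.v).unsqueeze(-1) * self.u
```

Under this (assumed) scheme, wrapping 13 linear layers of a frozen model and optimizing only their s parameters yields exactly 13 trained scalars, which at 2 bytes each in bf16 matches the abstract's 26-byte count; whether the paper actually distributes its parameters this way is not stated here.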

Subjects: Machine Learning (cs.LG)

Cite as: arXiv:2602.04118 [cs.LG]

(or arXiv:2602.04118v1 [cs.LG] for this version)

https://doi.org/10.48550/arXiv.2602.04118

arXiv-issued DOI via DataCite

Submission history

From: John Morris [v1] Wed, 4 Feb 2026 01:20:04 UTC (1,595 KB)
