AI NEWS HUB by EIGENVECTOR

TinyLoRA – Learning to Reason in 13 Parameters

Hacker News · Submitted on 4 Feb 2026 · March 27, 2026 · 1 min read

Abstract: Recent research has shown that language models can learn to reason, often via reinforcement learning. Some work even trains low-rank parameterizations for reasoning, but conventional LoRA cannot scale below the model dimension. We question whether even rank=1 LoRA is necessary for learning to reason and propose TinyLoRA, a method for scaling low-rank adapters to sizes as small as one parameter. Within our new parameterization, we are able to train the 8B parameter size of Qwen2.5 to 91% accuracy on GSM8K with only 13 trained parameters in bf16 (26 total bytes). We find this trend holds in general: we are able to recover 90% of performance improvements while training 1000x fewer parameters across a suite of more difficult learning-to-reason benchmarks such as AIME, AMC, and MATH500. Notably, we are only able to achieve such strong performance with RL: models trained using SFT require 100-1000x larger updates to reach the same performance.
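To make the sizes in the abstract concrete, here is a minimal back-of-the-envelope sketch in Python. It is not the paper's method: the abstract does not describe the TinyLoRA parameterization, so the code only counts parameters for a standard LoRA adapter (W + B·A) and checks the bf16 byte arithmetic. The hidden size of 4096 is an illustrative assumption, not a figure from the paper, and the function names are hypothetical.

```python
BF16_BYTES = 2  # bfloat16 stores each value in 16 bits, i.e. 2 bytes


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters in a standard LoRA adapter W + B @ A,
    where A has shape (rank, d_in) and B has shape (d_out, rank)."""
    return rank * (d_in + d_out)


def bf16_bytes(n_params: int) -> int:
    """Storage needed for n_params values kept in bf16."""
    return n_params * BF16_BYTES


# Assumption: an illustrative hidden size, not taken from the paper.
hidden = 4096

# Even at rank 1, a standard adapter on one square weight matrix
# needs on the order of the model dimension in parameters.
rank1_params = lora_param_count(hidden, hidden, rank=1)
print("rank-1 LoRA params for one matrix:", rank1_params)

# The abstract's figure: 13 trained parameters in bf16.
print("bytes for 13 bf16 params:", bf16_bytes(13))

# Ratio between one rank-1 adapter (under the assumed size) and 13 parameters.
print("ratio:", rank1_params / 13)
```

Under these assumptions a single rank-1 adapter already has thousands of parameters, which is the floor the abstract refers to when it says conventional LoRA cannot scale below the model dimension, while 13 parameters at 2 bytes each give the quoted 26 bytes.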

Subjects: Machine Learning (cs.LG)

Cite as: arXiv:2602.04118 [cs.LG] (or arXiv:2602.04118v1 [cs.LG] for this version)

DOI: https://doi.org/10.48550/arXiv.2602.04118 (arXiv-issued DOI via DataCite)

Submission history

From: John Morris
[v1] Wed, 4 Feb 2026 01:20:04 UTC (1,595 KB)

Original source

Hacker News

https://arxiv.org/abs/2602.04118

More in Open Source AI

v4.3.1

Changes

• Gemma 4 support with full tool-calling in the API and UI.
• 🆕 ik_llama.cpp support: Add ik_llama.cpp as a new backend through new textgen-portable-ik portable builds and a new --ik flag for full installs. ik_llama.cpp is a fork by the author of the imatrix quants, including support for new quant types, significantly more accurate KV cache quantization (via Hadamard KV cache rotation, enabled by default), and optimizations for MoE models and CPU inference.
• API: Add echo + logprobs for /v1/completions. The completions endpoint now supports the echo and logprobs parameters, returning token-level log probabilities for both prompt and generated tokens. Token IDs are also included in the output via a new top_logprobs_ids field.
• Further optimize my custom gradio fork, saving up to 50 ms

text-gen-webui Releases

3m · 23 minutes ago

From SWE-ZERO to SWE-HERO: Execution-free to Execution-based Fine-tuning for Software Engineering Agents

arXiv:2604.01496v1

Abstract: We introduce SWE-ZERO to SWE-HERO, a two-stage SFT recipe that achieves state-of-the-art results on SWE-bench by distilling open-weight frontier LLMs. Our pipeline replaces resource-heavy dependencies with an evolutionary refinement strategy: (1) SWE-ZERO utilizes large-scale, execution-free trajectories to master code semantics and repository-level reasoning, and (2) SWE-HERO applies targeted, execution-backed refinement to transition these semantic intuitions into rigorous engineering workflows. Our empirical results set a new benchmark for open-source models of comparable size. We release a dataset of 300k SWE-ZERO and 13k SWE-HERO trajectories distilled from Qwen3-Coder-480B, alongside a suite of agents based on the Qwen2.5-Coder series.

1m · about 2 hours ago

A Quick Note on Gemma 4 Image Settings in Llama.cpp

In my last post, I mentioned using --image-min-tokens to increase the quality of image responses from Qwen3.5. I went to load Gemma 4 the same way, and hit an error:

[58175] srv process_chun: processing image...
[58175] encoding image slice...
[58175] image slice encoded in 7490 ms
[58175] decoding image batch 1/2, n_tokens_batch = 2048
[58175] /Users/socg/llama.cpp-b8639/src/llama-context.cpp:1597: GGML_ASSERT((cparams.causal_attn || cparams.n_ubatch >= n_tokens_all) && "non-causal attention requires n_ubatch >= n_tokens") failed
[58175] WARNING: Using native backtrace. Set GGML_BACKTRACE_LLDB for more info.
[58175] WARNING: GGML_BACKTRACE_LLDB may cause native MacOS Terminal.app to crash.
[58175] See: https://github.com/ggml-org/llama.cpp/pull/17869
[58175] 0 libggml-base.0.9.11.dylib 0

3m · about 4 hours ago

Building an AI-Powered DevSecOps Guardrail Pipeline with GitHub Actions

Learn how to build an AI-powered DevSecOps guardrail pipeline using GitHub Actions to automatically detect security vulnerabilities before deployment.

1m · about 5 hours ago