
Your Models Have Thought Enough: Training Large Reasoning Models to Stop Overthinking

arXiv, March 31, 2026

Authors: Jinyi Han, Ying Huang, Ying Liao, Zishang Jiang, Xikun Lu, Haiquan Zhao, Xinyi Wang, Guanghao Zhou, Sihang Jiang, Jiaqing Liang, Weikang Zhou, Zeye Sun, Fei Yu, Yanghua Xiao

Abstract: Large Reasoning Models (LRMs) have achieved impressive performance on challenging tasks, yet their deep reasoning often incurs substantial computational costs. Existing reinforcement learning methods that aim for efficient reasoning still struggle to construct short reasoning paths during the rollout stage, which limits effective learning. Inspired by Evidence Accumulation Models, we find that LRMs have accumulated sufficient information early in reasoning, making further reasoning steps redundant. Based on this insight, we propose Just-Enough Thinking (JET), which trains models to proactively terminate unnecessary reasoning. JET performs trajectory truncation during rollout to expose the model to short, distributionally consistent reasoning paths. In addition, it uses a quality-controlled length reward to encourage concise reasoning while maintaining correctness. Extensive experiments demonstrate that JET significantly improves reasoning efficiency without sacrificing accuracy. Notably, DeepSeek-Distill-Qwen-1.5B achieves a 4.6% accuracy gain while reducing output length by 46.3% on the Olympiad benchmark. Our code is available on GitHub.
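The abstract names two mechanisms, trajectory truncation during rollout and a quality-controlled length reward, without giving their exact form. The Python sketch below is only an illustration of how such components could look, not the authors' implementation: the function names `truncate_trajectory` and `gated_length_reward`, the `keep_fraction` parameter, and the specific reward formula are all assumptions introduced here.

```python
from typing import List


def truncate_trajectory(steps: List[str], keep_fraction: float) -> List[str]:
    """Keep only the leading part of a reasoning trajectory.

    Hypothetical helper: the paper's actual truncation rule is not
    specified in the abstract. At least one step is kept when available.
    """
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    n_keep = max(1, int(len(steps) * keep_fraction))
    return steps[:n_keep]


def gated_length_reward(is_correct: bool, length: int, max_length: int) -> float:
    """Reward brevity only when the final answer is correct.

    Assumed formula (not from the paper): incorrect answers score 0.0;
    correct answers score 1.0 plus a bonus in [0, 1] that grows as the
    output gets shorter relative to max_length.
    """
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    if not is_correct:
        return 0.0
    clipped = min(max(length, 0), max_length)
    return 1.0 + (1.0 - clipped / max_length)


if __name__ == "__main__":
    steps = ["set up equation", "solve for x", "double-check", "re-derive"]
    short_path = truncate_trajectory(steps, keep_fraction=0.5)
    print(short_path)
    print(gated_length_reward(True, length=50, max_length=100))
```

Gating the length bonus on correctness is one simple way to keep a brevity incentive from rewarding short but wrong answers; the paper may define its quality control differently.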

Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)

Cite as: arXiv:2509.23392 [cs.AI]

(or arXiv:2509.23392v3 [cs.AI] for this version)

https://doi.org/10.48550/arXiv.2509.23392

arXiv-issued DOI via DataCite

Submission history

From: Jinyi Han
[v1] Sat, 27 Sep 2025 16:25:06 UTC (579 KB)
[v2] Sun, 5 Oct 2025 13:54:32 UTC (579 KB)
[v3] Mon, 30 Mar 2026 15:21:37 UTC (780 KB)
