Research Papers research paper arxiv machine-learning deep-learning

Arithmetic OOD Failure Unfolds in Stages in Minimal GPTs

arXivby [Submitted on 27 Mar 2026]March 31, 20262 min read1 views

arXiv:2603.26828v1 Announce Type: cross Abstract: Arithmetic benchmarks are often reduced to a single held-out score, but that score can conflate qualitatively different failures. We study a controlled minimal GPT trained on exhaustive 2-digit addition, where all local digit transitions are already present in training, and ask why 3-digit generalization still fails. The failure is staged. First, there is a layout barrier: a learned absolute-position model collapses under a pure 3-digit layout shift, and mixed-layout exposure is the only intervention that materially weakens this barrier. Second — Seine A. Shintani

View PDF HTML (experimental)

Abstract:Arithmetic benchmarks are often reduced to a single held-out score, but that score can conflate qualitatively different failures. We study a controlled minimal GPT trained on exhaustive 2-digit addition, where all local digit transitions are already present in training, and ask why 3-digit generalization still fails. The failure is staged. First, there is a layout barrier: a learned absolute-position model collapses under a pure 3-digit layout shift, and mixed-layout exposure is the only intervention that materially weakens this barrier. Second, after layout repair, the hundreds position behaves like a carry flag rather than a semantic hundreds digit; targeted carry probes reverse the relevant logit margin, whereas a matched extra-data control does not. Third, after carry repair, the main remaining bottleneck is conditional recomposition: high-conditioned tail data outperforms a matched control, high-only data, and tail-only data on all true-3-digit suites, and the same ordering reappears in a larger 2-layer bridge experiment. The residual errors after recomposition are then overwhelmingly tens-only, and a separate 10-seed late-stage study shows that a sign-aware tens repair raises exact match on the hardest thousands-carry suite from 0.664 to 0.822. We therefore provide an experimentally testable decomposition of arithmetic OOD failure into layout, carry-semantics, recomposition, and late tens-residual stages.

Comments: 16 pages, 4 figures

Subjects:

Computation and Language (cs.CL); Machine Learning (cs.LG)

ACM classes: I.2.7

Cite as: arXiv:2603.26828 [cs.CL]

(or arXiv:2603.26828v1 [cs.CL] for this version)

https://doi.org/10.48550/arXiv.2603.26828

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Seine A. Shintani [view email] [v1] Fri, 27 Mar 2026 03:40:27 UTC (1,038 KB)

Original source

arXiv

https://arxiv.org/abs/2603.26828

Was this article helpful?

Ask AI about this article

Ready

Conversation starters

Ask anything about this article…

Daily AI Digest

Get the top 5 AI stories delivered to your inbox every morning.

More about

researchpaperarxiv

Research PapersLive

Illinois Tech computer science researcher honored by IEEE Chicago Section - EurekAlert!

<a href="https://news.google.com/rss/articles/CBMiXEFVX3lxTE13OVpWMEk1Z3hlMkR2bHNBQ2dkazFwb3VqN3hCa29GWGJvSVlPa00zd2xUakRmYXFqQmc5OWU0eGl4a21FMDAwWUN2Q3p0M3FrbXBkNV8zN0cxaG1s?oc=5" target="_blank">Illinois Tech computer science researcher honored by IEEE Chicago Section</a> EurekAlert!

Google News: Machine Learning

1m42 minutes ago

ProductsLive

My Journey to becoming a Quantum Engineer

I have procrastinated on documenting this process for the longest time. But I think i am ready now (maybe). Coming from a front end engineering background, I am fascinated by the work being done by the quantum engineers at IBM. I am not that great with maths and statistics but I believe anything can be learned with tons of practice and consistency. I want to use this platform to hold myself accountable (that is if i don't give up half way and delete all my posts. I'll try not to btw). This is an article describing <a href="https://www.ibm.com/think/topics/quantum-computing" rel="noopener noreferrer">what quantum computing is</a> and some of it's use cases. I became an IBM qiskit advocate late last year and I have been exposed to a lot of resources and networked a bun

DEV Community

2mabout 1 hour ago

ReleasesLive

Google offers researchers early access to Willow quantum processor

The Early Access Program invites researchers to design and propose quantum experiments that push the boundaries of what current hardware can achieve. It is a selective program – the processor will not be publicly available – and Google is setting firm deadlines for participation. Research teams have until May 15,... Read Entire Article

TechSpot

1mabout 1 hour ago

Knowledge Map

TopicsEntitiesSource

Connected Articles — Knowledge Graph

This article is connected to other articles through shared AI topics and tags.

Knowledge Graph100 articles · 171 connections

Scroll to zoom · drag to pan · click to open

Discussion

No comments yet — be the first to share your thoughts!

More in Research Papers

Research PapersLive

Illinois Tech computer science researcher honored by IEEE Chicago Section - EurekAlert!

Google News: Machine Learning

1m42 minutes ago

Research PapersFresh

Research roundup: 7 cool science stories we almost missed

Ars Technica

1mabout 2 hours ago

Research PapersFresh

AI maps science papers to predict research trends two to three years ahead - Tech Xplore

<a href="https://news.google.com/rss/articles/CBMie0FVX3lxTE5aTkZYTWdaRDZwTXNRMldpMG1WZ1YzWDZTOHN5M183Z3A1ZTFYbnhEWTdPRmpvZnZFU0xodlRsNWxFaGxTcEpwalhJNmJpQWE5VjhaRS1tOXJIeTc5Z0JNblJ3dFd4WjRYZGJOX0NrWGt6ZmZJVTBpRm5wWQ?oc=5" target="_blank">AI maps science papers to predict research trends two to three years ahead</a> Tech Xplore

Google News: Machine Learning

1mabout 3 hours ago

Research PapersFresh

AI inspires new research topics in materials science - Nanowerk

<a href="https://news.google.com/rss/articles/CBMiZ0FVX3lxTFBPWlJSM2ExeVQ3LVppTm45NHpEMW9YVkxscThCNDd2OVB0c3J1ZmVCbWNSZWZ0TjZwSzlOdEFXN2UtRk5LU1hxdXd4ZklldGxoM0FZSnhCd19PWkNHQ1ZRVDNwSHNUSk0?oc=5" target="_blank">AI inspires new research topics in materials science</a> Nanowerk

GNews AI materials

1mabout 10 hours ago