Do All Vision Transformers Need Registers? A Cross-Architectural Reassessment
arXiv:2603.25803v1 Announce Type: cross Abstract: Training Vision Transformers (ViTs) presents significant challenges, one of which is the emergence of artifacts in attention maps, hindering their interpretability. Darcet et al. (2024) investigated this phenomenon and attributed it to the need of ViTs to store global information beyond the [CLS] token. They proposed a novel solution involving the addition of empty input tokens, named registers, which successfully eliminate artifacts and improve the clarity of attention maps. In this work, we reproduce the findings of Darcet et al. (2024) and e — Spiros Baxevanakis, Platon Karageorgis, Ioannis Dravilas, Konrad Szewczyk
View PDF HTML (experimental)
Abstract:Training Vision Transformers (ViTs) presents significant challenges, one of which is the emergence of artifacts in attention maps, hindering their interpretability. Darcet et al. (2024) investigated this phenomenon and attributed it to the need of ViTs to store global information beyond the [CLS] token. They proposed a novel solution involving the addition of empty input tokens, named registers, which successfully eliminate artifacts and improve the clarity of attention maps. In this work, we reproduce the findings of Darcet et al. (2024) and evaluate the generalizability of their claims across multiple models, including DINO, DINOv2, OpenCLIP, and DeiT3. While we confirm the validity of several of their key claims, our results reveal that some claims do not extend universally to other models. Additionally, we explore the impact of model size, extending their findings to smaller models. Finally, we untie terminology inconsistencies found in the original paper and explain their impact when generalizing to a wider range of models.
Comments: Preprint. Submitted to Transactions on Machine Learning Research (TMLR). 26 pages, 17 figures
Subjects:
Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
ACM classes: I.2.10; I.4.8; I.5.4
Cite as: arXiv:2603.25803 [cs.CV]
(or arXiv:2603.25803v1 [cs.CV] for this version)
https://doi.org/10.48550/arXiv.2603.25803
arXiv-issued DOI via DataCite
Submission history
From: Spiros Baxevanakis [view email] [v1] Thu, 26 Mar 2026 18:09:12 UTC (27,638 KB)
Sign in to highlight and annotate this article

Conversation starters
Daily AI Digest
Get the top 5 AI stories delivered to your inbox every morning.
More about
researchpaperarxiv
Is Turboquant really a game changer?
I am currently utilizing qwen3.5 and Gemma 4 model. Realized Gemma 4 requires 2x ram for same context length. As far as I understand, what turbo quant gives is quantizing kv cache into about 4 bit and minimize the loses But Q8 still not lose the context that much so isn't kv cache ram for qwen 3.5 q8 and Gemma 4 truboquant is the same? Is turboquant also applicable in qwen's cache architecture? because as far as I know they didn't tested it in qwen3.5 style kv cache in their paper. Just curious, I started to learn local LLM recently submitted by /u/Interesting-Print366 [link] [comments]

Found how to toggle reasoning mode for Gemma in LM-Studio!
I’ve figured out how to trigger the reasoning process by adding "/think" to the system prompt. Heads up: the thought tags have an unusual pipe ( | ) placement, which is why many LLM fail to parse the reasoning section correctly. So Start String is : " thought" And End String is " " Here is the Jinja template: https://pastebin.com/MGmD8UiC Tested and working with the 26B and 31B versions. submitted by /u/Adventurous-Paper566 [link] [comments]

AI companions can comfort lonely users but may deepen distress over time
AI companions are always available, never judge, never tire and never demand anything in return. If someone is struggling with loneliness, this frictionlessness can seem profoundly appealing. However, new research shows that in the long term, seeking emotional support from an AI companion can pull users away from important human relationships.
Knowledge Map
Connected Articles — Knowledge Graph
This article is connected to other articles through shared AI topics and tags.
More in Research Papers

New Rowhammer attack can grant kernel-level control on Nvidia workstation GPUs
A study from researchers at UNC Chapel Hill and Georgia Tech shows that GDDR6-based Rowhammer attacks can grant kernel-level access to Linux systems equipped with GPUs based on Nvidia's Ampere and Ada Lovelace architectures. The vulnerability appears significantly more severe than what was outlined in a paper last year. Read Entire Article
![[D] ICML Reviewer Acknowledgement](https://d2xsxph8kpxj0f.cloudfront.net/310419663032563854/konzwo8nGf8Z4uZsMefwMr/default-img-matrix-rain-CvjLrWJiXfamUnvj5xT9J9.webp)
[D] ICML Reviewer Acknowledgement
Hi, I'm a little confused about ICML discussion period Does the period for reviewer acknowledging responses have already ended? One of the four reviewers did not present any answer to a paper of mine. Do you know if the reviewer can still change their score before April 7th? There is a reviewer comment that I will answer on Monday. Will the reviewer be able to update the score after seeing my answer? Thanks! submitted by /u/Massive_Horror9038 [link] [comments]

Considerations for growing the pie
Recently some friends and I were comparing growing the pie interventions to an increasing our friends' share of the pie intervention, and at first we mostly missed some general considerations against the latter type. 1. Decision-theoretic considerations The world is full of people with different values working towards their own ends; each of them can choose to use their resources to increase the total size of the pie or to increase their share of the pie. All of them would significantly prefer a world in which resources were used to increase the size of the pie, and this leads to a number [of] compelling justifications for each individual to cooperate. . . . by increasing the size of the pie we create a world which is better for people on average, and from behind the veil of ignorance we s


Discussion
Sign in to join the discussion
No comments yet — be the first to share your thoughts!