How to Train your Tactile Model: Tactile Perception with Multi-fingered Robot Hands
arXiv:2604.00744v1 Announce Type: new
Abstract: Rapid deployment of new tactile sensors is essential for scalable robotic manipulation, especially in multi-fingered hands equipped with vision-based tactile sensors. However, current methods for inferring contact properties rely heavily on convolutional neural networks (CNNs), which, while effective on known sensors, require large, sensor-specific datasets. Furthermore, they require retraining for each new sensor due to differences in lens properties, illumination, and sensor wear. Here we introduce TacViT, a novel tactile perception model based on Vision Transformers, designed to generalize to new sensor data. TacViT leverages global self-attention mechanisms to extract robust features from tactile images, enabling accurate contact property inference even on previously unseen sensors. This capability significantly reduces the need for data collection and retraining, accelerating the deployment of new sensors. We evaluate TacViT on sensors for a five-fingered robot hand and demonstrate its superior generalization performance compared to CNNs. Our results highlight TacViT's potential to make tactile sensing more scalable and practical for real-world robotic applications.
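The abstract attributes TacViT's robustness to global self-attention over tactile-image patches, in contrast to the local receptive fields of CNNs. The paper does not give implementation details here, but the core ViT ingredients it names, patchifying the image into tokens and attending globally across all of them, can be illustrated with a minimal NumPy sketch. The patch size, image size, and single-head attention below are illustrative assumptions, not the paper's actual architecture:

```python
import numpy as np

def patchify(img, patch=8):
    # Split an H x W tactile image into non-overlapping patch tokens,
    # each flattened into a vector of length patch * patch.
    H, W = img.shape
    p = img.reshape(H // patch, patch, W // patch, patch)
    return p.transpose(0, 2, 1, 3).reshape(-1, patch * patch)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def global_self_attention(tokens, Wq, Wk, Wv):
    # Every token attends to every other token, so features at one
    # contact location can depend on the whole sensor image.
    Q, K, V = tokens @ Wq, tokens @ Wk, tokens @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    return softmax(scores) @ V

rng = np.random.default_rng(0)
img = rng.standard_normal((32, 32))   # stand-in for a tactile image
tokens = patchify(img)                 # 16 tokens of dimension 64
d = tokens.shape[1]
Wq, Wk, Wv = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))
out = global_self_attention(tokens, Wq, Wk, Wv)
print(out.shape)  # one attended feature vector per patch: (16, 64)
```

Because the attention weights are computed from the token contents rather than fixed spatial kernels, a model of this form can, in principle, adapt its feature mixing to sensor-specific appearance shifts (illumination, lens distortion) without the hard-wired locality of a convolution.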
Comments: Accepted for publication at the International Conference on Robotics and Automation (ICRA) 2026, Vienna
Subjects:
Robotics (cs.RO)
Cite as: arXiv:2604.00744 [cs.RO]
(or arXiv:2604.00744v1 [cs.RO] for this version)
https://doi.org/10.48550/arXiv.2604.00744
arXiv-issued DOI via DataCite (pending registration)
Submission history
From: Christopher Ford [view email] [v1] Wed, 1 Apr 2026 11:15:27 UTC (3,800 KB)