How to Train your Tactile Model: Tactile Perception with Multi-fingered Robot Hands
arXiv:2604.00744v1 Announce Type: new
Abstract: Rapid deployment of new tactile sensors is essential for scalable robotic manipulation, especially in multi-fingered hands equipped with vision-based tactile sensors. However, current methods for inferring contact properties rely heavily on convolutional neural networks (CNNs), which, while effective on known sensors, require large, sensor-specific datasets. Furthermore, they require retraining for each new sensor due to differences in lens properties, illumination, and sensor wear. Here we introduce TacViT, a novel tactile perception model based on Vision Transformers, designed to generalize to new sensor data. TacViT leverages global self-attention mechanisms to extract robust features from tactile images, enabling accurate contact property inference even on previously unseen sensors. This capability significantly reduces the need for data collection and retraining, accelerating the deployment of new sensors. We evaluate TacViT on sensors for a five-fingered robot hand and demonstrate its superior generalization performance compared to CNNs. Our results highlight TacViT's potential to make tactile sensing more scalable and practical for real-world robotic applications.
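The abstract attributes TacViT's robustness to global self-attention over tactile-image patches, in contrast to the local receptive fields of CNNs. The paper does not give implementation details here, but the core ViT ingredients it names, patchifying the image into tokens and attending globally across all of them, can be illustrated with a minimal NumPy sketch. The patch size, image size, and single-head attention below are illustrative assumptions, not the paper's actual architecture:

```python
import numpy as np

def patchify(img, patch=8):
    # Split an H x W tactile image into non-overlapping patch tokens,
    # each flattened into a vector of length patch * patch.
    H, W = img.shape
    p = img.reshape(H // patch, patch, W // patch, patch)
    return p.transpose(0, 2, 1, 3).reshape(-1, patch * patch)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def global_self_attention(tokens, Wq, Wk, Wv):
    # Every token attends to every other token, so features at one
    # contact location can depend on the whole sensor image.
    Q, K, V = tokens @ Wq, tokens @ Wk, tokens @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    return softmax(scores) @ V

rng = np.random.default_rng(0)
img = rng.standard_normal((32, 32))   # stand-in for a tactile image
tokens = patchify(img)                 # 16 tokens of dimension 64
d = tokens.shape[1]
Wq, Wk, Wv = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))
out = global_self_attention(tokens, Wq, Wk, Wv)
print(out.shape)  # one attended feature vector per patch: (16, 64)
```

Because the attention weights are computed from the token contents rather than fixed spatial kernels, a model of this form can, in principle, adapt its feature mixing to sensor-specific appearance shifts (illumination, lens distortion) without the hard-wired locality of a convolution.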
Comments: Accepted for publication at the International Conference on Robotics and Automation (ICRA) 2026, Vienna
Subjects:
Robotics (cs.RO)
Cite as: arXiv:2604.00744 [cs.RO]
(or arXiv:2604.00744v1 [cs.RO] for this version)
https://doi.org/10.48550/arXiv.2604.00744
arXiv-issued DOI via DataCite (pending registration)
Submission history
From: Christopher Ford [view email] [v1] Wed, 1 Apr 2026 11:15:27 UTC (3,800 KB)