Toward Phonology-Guided Sign Language Motion Generation: A Diffusion Baseline and Conditioning Analysis
arXiv:2603.17388v2 (announce type: replace)
Authors: Rui Hong, Jana Kosecka
Abstract: Generating natural, correct, and visually smooth 3D avatar sign language motion conditioned on text inputs remains challenging. In this work, we train a generative model of 3D body motion and explore the role of phonological attribute conditioning for sign language motion generation, using ASL-LEX 2.0 annotations such as hand shape, hand location, and movement. We first establish a strong diffusion baseline using an MDM-style (Human Motion Diffusion Model) diffusion model with an SMPL-X representation, which outperforms SignAvatar, a state-of-the-art CVAE method, on gloss discriminability metrics. We then systematically study the role of text conditioning along three axes: text encoder (CLIP vs. T5), conditioning mode (gloss-only vs. gloss+phonological attributes), and attribute notation format (symbolic vs. natural language). Our analysis reveals that translating symbolic ASL-LEX notations to natural language is necessary for effective CLIP-based attribute conditioning, while T5 is largely unaffected by this translation. Furthermore, our best-performing variant (CLIP with mapped attributes) outperforms SignAvatar across all metrics. These findings highlight input representation as a critical factor for text-encoder-based attribute conditioning, and motivate structured conditioning approaches in which gloss and phonological attributes are encoded through independent pathways.
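The difference between the conditioning modes and notation formats compared in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the attribute codes, the English phrases in the lookup tables, the prompt templates, and the function names are hypothetical and are not taken from the paper or from ASL-LEX 2.0. The sketch only builds the text string that would be passed to a text encoder such as CLIP or T5; it does not perform any encoding or motion generation.

```python
# Hypothetical lookup tables. The symbolic codes and their English wording
# are illustrative assumptions, not the mapping used in the paper.
HANDSHAPE_WORDS = {
    "5": "open hand with spread fingers",
    "1": "extended index finger",
    "s": "closed fist",
}
LOCATION_WORDS = {
    "Head": "near the head",
    "Body": "in front of the torso",
    "Neutral": "in neutral space",
}
MOVEMENT_WORDS = {
    "Straight": "moving in a straight line",
    "Circular": "moving in a circle",
    "BackAndForth": "moving back and forth",
}


def gloss_only_prompt(gloss: str) -> str:
    """Gloss-only conditioning: the prompt is just the gloss label."""
    return gloss


def symbolic_prompt(gloss: str, handshape: str, location: str, movement: str) -> str:
    """Gloss + attributes, with the attribute codes left in symbolic form."""
    return (
        f"{gloss} | handshape={handshape} | "
        f"location={location} | movement={movement}"
    )


def natural_prompt(gloss: str, handshape: str, location: str, movement: str) -> str:
    """Gloss + attributes, with each code mapped to an English phrase.

    Codes missing from the tables fall back to the raw symbol.
    """
    parts = [
        HANDSHAPE_WORDS.get(handshape, handshape),
        LOCATION_WORDS.get(location, location),
        MOVEMENT_WORDS.get(movement, movement),
    ]
    return f"the sign {gloss}: " + ", ".join(parts)


if __name__ == "__main__":
    example = ("MOTHER", "5", "Head", "Straight")  # made-up attribute values
    print(gloss_only_prompt(example[0]))
    print(symbolic_prompt(*example))
    print(natural_prompt(*example))
```

Under these assumptions, the symbolic and natural-language prompts carry the same attribute information; only the surface form differs, which is the factor the abstract reports as decisive for CLIP and largely irrelevant for T5.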
Comments: 8 pages, 4 figures
Subjects:
Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:2603.17388 [cs.CV]
(or arXiv:2603.17388v2 [cs.CV] for this version)
https://doi.org/10.48550/arXiv.2603.17388
arXiv-issued DOI via DataCite
Submission history
From: Rui Hong
[v1] Wed, 18 Mar 2026 06:10:05 UTC (553 KB)
[v2] Sun, 29 Mar 2026 01:50:01 UTC (553 KB)