Live
Black Hat USADark ReadingBlack Hat AsiaAI BusinessAssessing Marvell Technology (MRVL) After Nvidia’s US$2b AI Partnership And Connectivity Push - simplywall.stGNews AI NVIDIADutchess to host artificial intelligence summit at Marist in Poughkeepsie - Daily FreemanGoogle News: AIAnthropic’s Catastrophic Leak May Have Just Handed China the Blueprints to Claude Al - TipRanksGoogle News: ClaudeOpenAI's Fidji Simo Is Taking Medical Leave Amid an Executive Shake-UpWired AIMeta's AI push is reshaping how work gets done inside the companyBusiness InsiderOpenAI's Fidji Simo Is Taking Medical Leave Amid an Executive Shake-Up - WIREDGoogle News: OpenAI[P] Remote sensing foundation models made easy to use.Reddit r/MachineLearningFirst time NeurIPS. How different is it from low-ranked conferences? [D]Reddit r/MachineLearningScaling Trust: How Salesforce s Security Team Uses Agentforce to Triage Security Reports at Speedengineering.salesforce.com$18M Salary: UBTech Robotics Wants to Hire a Humanoid AI Mastermind - eWeekGoogle News - AI roboticsAI & Tech brief: Ireland ascendant - The Washington PostGNews AI EUPeople would rather have an Amazon warehouse in their backyard than a data centerTechCrunch AIBlack Hat USADark ReadingBlack Hat AsiaAI BusinessAssessing Marvell Technology (MRVL) After Nvidia’s US$2b AI Partnership And Connectivity Push - simplywall.stGNews AI NVIDIADutchess to host artificial intelligence summit at Marist in Poughkeepsie - Daily FreemanGoogle News: AIAnthropic’s Catastrophic Leak May Have Just Handed China the Blueprints to Claude Al - TipRanksGoogle News: ClaudeOpenAI's Fidji Simo Is Taking Medical Leave Amid an Executive Shake-UpWired AIMeta's AI push is reshaping how work gets done inside the companyBusiness InsiderOpenAI's Fidji Simo Is Taking Medical Leave Amid an Executive Shake-Up - WIREDGoogle News: OpenAI[P] Remote sensing foundation models made easy to use.Reddit r/MachineLearningFirst time NeurIPS. How different is it from low-ranked conferences? [D]Reddit r/MachineLearningScaling Trust: How Salesforce s Security Team Uses Agentforce to Triage Security Reports at Speedengineering.salesforce.com$18M Salary: UBTech Robotics Wants to Hire a Humanoid AI Mastermind - eWeekGoogle News - AI roboticsAI & Tech brief: Ireland ascendant - The Washington PostGNews AI EUPeople would rather have an Amazon warehouse in their backyard than a data centerTechCrunch AI
AI NEWS HUBbyEIGENVECTOREigenvector

Realistic Lip Motion Generation Based on 3D Dynamic Viseme and Coarticulation Modeling for Human-Robot Interaction

arXiv cs.ROby [Submitted on 2 Apr 2026]April 3, 20262 min read1 views
Source Quiz

arXiv:2604.01756v1 Announce Type: new Abstract: Realistic lip synchronization is essential for the natural human-robot non-verbal interaction of humanoid robots. Motivated by this need, this paper presents a lip motion generation framework based on 3D dynamic viseme and coarticulation modeling. By analyzing Chinese pronunciation theory, a 3D dynamic viseme library is constructed based on the ARKit standard, which offers coherent prior trajectories of lips. To resolve motion conflicts within continuous speech streams, a coarticulation mechanism is developed by incorporating initial-final (Shengmu-Yunmu) decoupling and energy modulation. After developing a strategy to retarget high-dimensional spatial lip motion to a 14-DOF lip actuation system of a humanoid head platform, the efficiency and

View PDF HTML (experimental)

Abstract:Realistic lip synchronization is essential for the natural human-robot non-verbal interaction of humanoid robots. Motivated by this need, this paper presents a lip motion generation framework based on 3D dynamic viseme and coarticulation modeling. By analyzing Chinese pronunciation theory, a 3D dynamic viseme library is constructed based on the ARKit standard, which offers coherent prior trajectories of lips. To resolve motion conflicts within continuous speech streams, a coarticulation mechanism is developed by incorporating initial-final (Shengmu-Yunmu) decoupling and energy modulation. After developing a strategy to retarget high-dimensional spatial lip motion to a 14-DOF lip actuation system of a humanoid head platform, the efficiency and accuracy of the proposed architecture is experimentally validated and demonstrated with quantitative ablation experiments using the metrics of the Pearson Correlation Coefficient (PCC) and the Mean Absolute Jerk (MAJ). This research offers a lightweight, efficient, and highly practical paradigm for the speech-driven lip motion generation of humanoid robots. The 3D dynamic viseme library and real-world deployment videos are available at {this https URL}

Comments: 8 pages,7 figures

Subjects:

Robotics (cs.RO)

Cite as: arXiv:2604.01756 [cs.RO]

(or arXiv:2604.01756v1 [cs.RO] for this version)

https://doi.org/10.48550/arXiv.2604.01756

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Min Li [view email] [v1] Thu, 2 Apr 2026 08:24:49 UTC (2,942 KB)

Was this article helpful?

Sign in to highlight and annotate this article

AI
Ask AI about this article
Powered by Eigenvector · full article context loaded
Ready

Conversation starters

Ask anything about this article…

Daily AI Digest

Get the top 5 AI stories delivered to your inbox every morning.

Knowledge Map

Knowledge Map
TopicsEntitiesSource
Realistic L…modelannounceavailableplatformpaperarxivarXiv cs.RO

Connected Articles — Knowledge Graph

This article is connected to other articles through shared AI topics and tags.

Knowledge Graph100 articles · 121 connections
Scroll to zoom · drag to pan · click to open

Discussion

Sign in to join the discussion

No comments yet — be the first to share your thoughts!

More in Research Papers