
Gelina: Unified Speech and Gesture Synthesis via Interleaved Token Prediction

arXiv, March 30, 2026

arXiv:2510.12834v3 Announce Type: replace-cross

Authors: Téo Guichoux, Théodor Lemerle, Shivam Mehta, Jonas Beskow, Gustav Eje Henter, Laure Soulier, Catherine Pelachaud, Nicolas Obin


Abstract: Human communication is multimodal, with speech and gestures tightly coupled, yet most computational methods for generating speech and gestures synthesize them sequentially, weakening synchrony and prosody alignment. We introduce Gelina, a unified framework that jointly synthesizes speech and co-speech gestures from text using interleaved token sequences in a discrete autoregressive backbone, with modality-specific decoders. Gelina supports multi-speaker and multi-style cloning and enables gesture-only synthesis from speech inputs. Subjective and objective evaluations demonstrate competitive speech quality and improved gesture generation over unimodal baselines.
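As an illustration of what "interleaved token sequences" can look like in general, the sketch below merges a speech token stream and a gesture token stream into one sequence that a single autoregressive model could predict, then splits it back so each stream could go to its own decoder. It is a minimal sketch under stated assumptions, not Gelina's actual scheme: the vocabulary sizes, the fixed chunk sizes, the vocabulary-offset trick, and the function names `interleave` and `deinterleave` are all hypothetical and are not taken from the abstract.

```python
from typing import List, Tuple

# Hypothetical vocabulary sizes; the abstract does not specify any.
SPEECH_VOCAB = 1024
GESTURE_VOCAB = 512


def interleave(speech: List[int], gesture: List[int],
               speech_per_chunk: int, gesture_per_chunk: int) -> List[int]:
    """Merge two token streams into one sequence, chunk by chunk.

    Gesture ids are shifted by SPEECH_VOCAB so that both modalities
    share a single vocabulary without colliding (an assumption made
    for this sketch).
    """
    out: List[int] = []
    s = g = 0
    while s < len(speech) or g < len(gesture):
        # Emit the next block of speech tokens, then the gesture tokens
        # assumed to cover the same time span.
        out.extend(speech[s:s + speech_per_chunk])
        out.extend(t + SPEECH_VOCAB
                   for t in gesture[g:g + gesture_per_chunk])
        s += speech_per_chunk
        g += gesture_per_chunk
    return out


def deinterleave(tokens: List[int]) -> Tuple[List[int], List[int]]:
    """Split a merged sequence back into per-modality streams."""
    speech = [t for t in tokens if t < SPEECH_VOCAB]
    gesture = [t - SPEECH_VOCAB for t in tokens if t >= SPEECH_VOCAB]
    return speech, gesture


if __name__ == "__main__":
    merged = interleave([1, 2, 3, 4], [7, 8],
                        speech_per_chunk=2, gesture_per_chunk=1)
    print(merged)                # [1, 2, 1031, 3, 4, 1032]
    print(deinterleave(merged))  # ([1, 2, 3, 4], [7, 8])
```

Keeping temporally aligned tokens adjacent in one sequence is what lets a single next-token predictor condition each gesture token on the speech tokens around it, which is the general motivation for interleaving over sequential generation.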

Comments: Paper accepted at ICASSP 2026, 5 pages

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)

MSC classes: 68T07

Cite as: arXiv:2510.12834 [cs.SD] (or arXiv:2510.12834v3 [cs.SD] for this version)

https://doi.org/10.48550/arXiv.2510.12834

arXiv-issued DOI via DataCite

Submission history

From: Téo Guichoux
[v1] Mon, 13 Oct 2025 09:51:26 UTC (586 KB)
[v2] Thu, 27 Nov 2025 13:21:14 UTC (583 KB)
[v3] Fri, 27 Mar 2026 10:01:19 UTC (583 KB)
