
An Empirical Recipe for Universal Phone Recognition

arXiv cs.CL · by Shikhar Bharadwaj, Chin-Jou Li, Kwanghee Choi, Eunjung Yeo, William Chen, Shinji Watanabe, David R. Mortensen · April 1, 2026



Abstract: Phone recognition (PR) is a key enabler of multilingual and low-resource speech processing tasks, yet robust performance remains elusive. Highly performant English-focused models do not generalize across languages, while multilingual models underutilize pretrained representations. It also remains unclear how data scale, architecture, and training objective contribute to multilingual PR. We present PhoneticXEUS -- trained on large-scale multilingual data and achieving state-of-the-art performance on both multilingual (17.7% PFER) and accented English speech (10.6% PFER). Through controlled ablations with evaluations across 100+ languages under a unified scheme, we empirically establish our training recipe and quantify the impact of SSL representations, data scale, and loss objectives. In addition, we analyze error patterns across language families, accented speech, and articulatory features. All data and code are released openly.

Comments: Submitted to Interspeech 2026. Code: this https URL

Subjects: Computation and Language (cs.CL)

Cite as: arXiv:2603.29042 [cs.CL]

(or arXiv:2603.29042v1 [cs.CL] for this version)

https://doi.org/10.48550/arXiv.2603.29042

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Shikhar Bharadwaj
[v1] Mon, 30 Mar 2026 22:12:48 UTC (99 KB)
