
PHONOS: PHOnetic Neutralization for Online Streaming Applications

arXiv, March 31, 2026

arXiv:2603.27001v1 Announce Type: cross

Authors: Waris Quamer, Mu-Ruei Tseng, Ghady Nasrallah, Ricardo Gutierrez-Osuna


Abstract: Speaker anonymization (SA) systems modify timbre while leaving regional or non-native accents intact, which is problematic because accents can narrow the anonymity set. To address this issue, we present PHONOS, a streaming module for real-time SA that neutralizes non-native accents so that speech sounds native-like. Our approach pre-generates golden speaker utterances that preserve source timbre and rhythm but replace foreign segmentals with native ones using silence-aware DTW alignment and zero-shot voice conversion. These utterances supervise a causal accent translator that maps non-native content tokens to native equivalents with at most 40 ms of look-ahead, trained using joint cross-entropy and CTC losses. Our evaluations show an 81% reduction in non-native accent confidence, with listening-test ratings consistent with this shift, and reduced speaker linkability as accent-neutralized utterances move away from the original speaker in embedding space, while keeping latency under 241 ms on a single GPU.
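As a rough illustration of the alignment step mentioned in the abstract, the sketch below implements classic dynamic time warping (DTW) between two 1-D feature sequences. It is not the authors' method: the abstract does not specify how the silence-aware variant handles pauses, so that part is not reproduced here. The function name `dtw_path`, the absolute-difference frame distance, and the toy inputs are all assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple


def dtw_path(
    src: Sequence[float], tgt: Sequence[float]
) -> Tuple[float, List[Tuple[int, int]]]:
    """Classic DTW (hypothetical helper): returns (total cost, alignment path)."""
    n, m = len(src), len(tgt)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")

    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning src[:i] with tgt[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Assumed frame distance: absolute difference of scalar features.
            d = abs(src[i - 1] - tgt[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j - 1],  # match both frames
                cost[i - 1][j],      # repeat the target frame
                cost[i][j - 1],      # repeat the source frame
            )

    # Backtrack from the final cell to recover the frame-to-frame alignment.
    path: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates, key=lambda c: c[0])
    path.reverse()
    return cost[n][m], path


# Toy example: the target holds the middle value for one extra frame.
total_cost, alignment = dtw_path([0.0, 1.0, 2.0], [0.0, 1.0, 1.0, 2.0])
```

In this toy case the second source frame is aligned to both repeated target frames. A real system would align multi-dimensional acoustic or content features rather than scalars, and would need some explicit treatment of silent regions.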

Comments: The paper is submitted to Interspeech 2026 and currently under review

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG)

Cite as: arXiv:2603.27001 [eess.AS] (or arXiv:2603.27001v1 [eess.AS] for this version)

https://doi.org/10.48550/arXiv.2603.27001

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Waris Quamer [view email] [v1] Fri, 27 Mar 2026 21:24:18 UTC (217 KB)
