Habibi: Laying the Open-Source Foundation of Unified-Dialectal Arabic Speech Synthesis
Abstract: Arabic spans over 30 spoken varieties, yet no open-source text-to-speech system unifies them. Key barriers include substantial cross-dialect lexical and phonological divergence, scarce synthesis-grade data, and the absence of a standardized multi-dialect evaluation benchmark. We present Habibi, a unified-dialectal Arabic TTS framework that addresses all three. Through a multi-step curation pipeline, we repurpose open-source ASR corpora into TTS training data covering 12+ regional dialects. A linguistically informed curriculum learning strategy, progressing from Modern Standard Arabic to dialectal data, enables robust zero-shot synthesis without text diacritization. We further release the first standardized multi-dialect Arabic TTS benchmark, comprising over 11,000 utterances across 7 dialect subsets with manually verified transcripts. On this benchmark, our unified model matches or surpasses per-dialect specialized models. Both automatic metrics and human evaluations confirm that Habibi is highly competitive with ElevenLabs' Eleven v3 (alpha) in intelligibility, speaker similarity, and naturalness. Extensive ablations (~8,000 H100 GPU hours, 30+ configurations) validate each design choice. We open-source all checkpoints, training and inference code, and benchmark data, the first such release for multi-dialect Arabic TTS, at this https URL.
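The curriculum described in the abstract, progressing from Modern Standard Arabic (MSA) to dialectal data, can be sketched as a data-mixing schedule. The sampler below is a hypothetical illustration of that idea only; the function names, warmup/ramp parameters, and linear schedule are assumptions, not the paper's implementation.

```python
import random

def dialect_fraction(step, warmup_steps=10_000, ramp_steps=40_000):
    """Fraction of each batch drawn from dialectal data.

    Starts at 0.0 (pure MSA) during warmup, then ramps linearly
    to 1.0 -- a hypothetical schedule illustrating the
    MSA-to-dialect curriculum idea.
    """
    if step < warmup_steps:
        return 0.0
    return min(1.0, (step - warmup_steps) / ramp_steps)

def sample_batch(msa_pool, dialect_pool, step, batch_size=8, rng=random):
    """Mix MSA and dialectal utterances per the current schedule."""
    frac = dialect_fraction(step)
    n_dialect = round(batch_size * frac)
    batch = rng.sample(dialect_pool, n_dialect)
    batch += rng.sample(msa_pool, batch_size - n_dialect)
    rng.shuffle(batch)  # avoid a fixed dialect/MSA ordering in the batch
    return batch
```

Early in training every batch is pure MSA; after the ramp completes, batches are fully dialectal. A real pipeline would likely mix per-dialect pools and weight them by data availability.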
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
Cite as: arXiv:2601.13802 [cs.CL]
(or arXiv:2601.13802v2 [cs.CL] for this version)
https://doi.org/10.48550/arXiv.2601.13802
arXiv-issued DOI via DataCite
Submission history
From: Yushen Chen [view email] [v1] Tue, 20 Jan 2026 10:02:11 UTC (922 KB) [v2] Tue, 31 Mar 2026 12:16:39 UTC (935 KB)