Sommelier: Scalable Open Multi-turn Audio Pre-processing for Full-duplex Speech Language Models

HuggingFace Papers · March 20, 2026 · 8 min read

Full-duplex speech language models require high-quality multi-speaker conversational data, which is scarce, necessitating a robust open-source data processing pipeline to address challenges in natural dialogue dynamics and system accuracy. (2 upvotes on HuggingFace)

Published on Mar 20

Abstract


As the paradigm of AI shifts from text-based LLMs to Speech Language Models (SLMs), there is a growing demand for full-duplex systems capable of real-time, natural human-computer interaction. However, the development of such models is constrained by the scarcity of high-quality, multi-speaker conversational data, as existing large-scale resources are predominantly single-speaker or limited in volume. Handling the complex dynamics of natural dialogue, such as overlapping speech and back-channeling, remains a challenge, and standard processing pipelines suffer from diarization errors and ASR hallucinations. To bridge this gap, we present a robust and scalable open-source data processing pipeline designed for full-duplex models.
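To make the dialogue dynamics named in the abstract concrete, here is a minimal Python sketch of how overlapping speech and back-channel candidates might be located in diarization output. It is an illustration only, not the Sommelier pipeline: the `Segment` type, both function names, and the 1.0-second `max_duration` threshold are assumptions introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One diarized speech segment (hypothetical type for this sketch)."""
    speaker: str
    start: float  # seconds
    end: float    # seconds


def find_overlaps(segments):
    """Return (seg_a, seg_b, overlap_start, overlap_end) for every pair of
    segments from different speakers that overlap in time."""
    ordered = sorted(segments, key=lambda s: s.start)
    overlaps = []
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if b.start >= a.end:
                break  # sorted by start, so no later segment can overlap a
            if a.speaker != b.speaker:
                overlaps.append((a, b, b.start, min(a.end, b.end)))
    return overlaps


def backchannel_candidates(segments, max_duration=1.0):
    """Flag short segments that lie entirely inside another speaker's turn.

    max_duration is an illustrative threshold, not a value from the paper.
    """
    candidates = []
    for a, b, _, _ in find_overlaps(segments):
        for inner, outer in ((a, b), (b, a)):
            contained = outer.start <= inner.start and inner.end <= outer.end
            short = (inner.end - inner.start) <= max_duration
            if contained and short and inner not in candidates:
                candidates.append(inner)
    return candidates


if __name__ == "__main__":
    demo = [
        Segment("A", 0.0, 5.0),   # long turn by speaker A
        Segment("B", 2.0, 2.4),   # brief interjection inside A's turn
        Segment("B", 4.5, 8.0),   # B starts before A has finished
    ]
    for seg_a, seg_b, start, end in find_overlaps(demo):
        print(f"{seg_a.speaker}/{seg_b.speaker} overlap: {start:.1f}s to {end:.1f}s")
    print(backchannel_candidates(demo))
```

In this toy input, the brief segment from speaker B falls wholly within A's turn and is flagged as a back-channel candidate, while B's later segment merely overlaps the end of A's turn. A real pipeline would also have to cope with the diarization errors the abstract mentions, which this sketch ignores.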

arXiv: arxiv.org/abs/2603.25750

Original source

HuggingFace Papers
