
Asymmetric Encoder-Decoder Based on Time-Frequency Correlation for Speech Separation

arXiv eess.AS | Submitted on 31 Mar 2026 | April 1, 2026

Explain Like I'm 5

Imagine you're at a super noisy birthday party! Lots of friends are talking at once, music is playing, and grown-ups are chatting. It's hard to hear just one friend, right?

Scientists made a special computer helper, like a superhero ear! Its name is SR-CorrNet. This superhero ear helps computers listen to all the messy sounds and pick out just one person's voice, even when it's super loud and messy.

It's like having a magic sieve that sifts out all the noise and only lets the voice you want come through. So, computers can understand us better, even when there's a big party happening! Yay for clear listening!



Abstract: Speech separation in realistic acoustic environments remains challenging because overlapping speakers, background noise, and reverberation must be resolved simultaneously. Although recent time-frequency (TF) domain models have shown strong performance, most still rely on late-split architectures, where speaker disentanglement is deferred to the final stage, creating an information bottleneck and weakening discriminability under adverse conditions. To address this issue, we propose SR-CorrNet, an asymmetric encoder-decoder framework that introduces the separation-reconstruction (SepRe) strategy into a TF dual-path backbone. The encoder performs coarse separation from mixture observations, while the weight-shared decoder progressively reconstructs speaker-discriminative features with cross-speaker interaction, enabling stage-wise refinement. To complement this architecture, we formulate speech separation as a structured correlation-to-filter problem: spatio-spectro-temporal correlations computed from the observations are used as input features, and the corresponding deep filters are estimated to recover target signals. We further incorporate an attractor-based dynamic split module to adapt the number of output streams to the actual speaker configuration. Experimental results on WSJ0-2/3/4/5Mix, WHAMR!, and LibriCSS demonstrate consistent improvements across anechoic, noisy-reverberant, and real-recorded conditions in both single- and multi-channel settings, highlighting the effectiveness of TF-domain SepRe with correlation-based filter estimation for speech separation.
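The correlation-to-filter idea in the abstract can be illustrated with a small NumPy sketch. This is not the paper's implementation: the function names, tensor layouts, and the number of lags are assumptions made for illustration, only the spatial and temporal parts of the correlation are shown (cross-frequency terms are omitted), and the filter taps are set by hand where SR-CorrNet would estimate them with its network.

```python
import numpy as np

def delay_frames(X, lag):
    """Shift an STFT (channels, frames, freqs) by `lag` frames, zero-padding the start."""
    T = X.shape[1]
    delayed = np.zeros_like(X)
    delayed[:, lag:, :] = X[:, :T - lag, :]
    return delayed

def temporal_correlation_features(X, num_lags):
    """Hypothetical input features: x_c(t, f) * conj(x_c'(t - lag, f)).

    X is a complex STFT of shape (channels, frames, freqs).
    Returns an array of shape (channels, channels, num_lags, frames, freqs).
    """
    C, T, F = X.shape
    feats = np.zeros((C, C, num_lags, T, F), dtype=complex)
    for lag in range(num_lags):
        delayed = delay_frames(X, lag)
        # Broadcast over the two channel axes to get every channel pair.
        feats[:, :, lag] = X[:, None] * np.conj(delayed[None, :])
    return feats

def apply_deep_filter(X, W):
    """Multi-frame, multi-channel filtering:
    Y(t, f) = sum_c sum_lag conj(W[c, lag, t, f]) * X[c, t - lag, f].

    W has shape (channels, num_lags, frames, freqs).
    """
    C, T, F = X.shape
    Y = np.zeros((T, F), dtype=complex)
    for lag in range(W.shape[1]):
        Y += np.sum(np.conj(W[:, lag]) * delay_frames(X, lag), axis=0)
    return Y

# Toy usage with random data standing in for a 2-channel mixture STFT.
rng = np.random.default_rng(0)
X = rng.standard_normal((2, 10, 5)) + 1j * rng.standard_normal((2, 10, 5))
feats = temporal_correlation_features(X, num_lags=3)

# Placeholder filter: pass channel 0 through unchanged (a model would predict W).
W = np.zeros((2, 3, 10, 5), dtype=complex)
W[0, 0] = 1.0
Y = apply_deep_filter(X, W)
```

In a separation system of this kind, one such filter would be estimated per output speaker, so the dynamic split module mentioned above would determine how many filters are produced.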

Comments: Submitted to IEEE/ACM Transactions on Audio, Speech, and Language Processing (T-ASLP)

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)

Cite as: arXiv:2603.29097 [eess.AS] (or arXiv:2603.29097v1 [eess.AS] for this version)

https://doi.org/10.48550/arXiv.2603.29097

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Ui-Hyeop Shin
[v1] Tue, 31 Mar 2026 00:37:15 UTC (1,538 KB)
