Asymmetric Encoder-Decoder Based on Time-Frequency Correlation for Speech Separation
Imagine you're at a super noisy birthday party! Lots of friends are talking at once, music is playing, and grown-ups are chatting. It's hard to hear just one friend, right?
Scientists made a special computer helper, like a superhero ear! Its name is SR-CorrNet. This superhero ear helps computers listen to all the messy sounds and pick out just one person's voice, even when it's super loud and messy.
It's like having a magic sieve that sifts out all the noise and only lets the voice you want come through. So, computers can understand us better, even when there's a big party happening! Yay for clear listening!
Abstract: Speech separation in realistic acoustic environments remains challenging because overlapping speakers, background noise, and reverberation must be resolved simultaneously. Although recent time-frequency (TF) domain models have shown strong performance, most still rely on late-split architectures, where speaker disentanglement is deferred to the final stage, creating an information bottleneck and weakening discriminability under adverse conditions. To address this issue, we propose SR-CorrNet, an asymmetric encoder-decoder framework that introduces the separation-reconstruction (SepRe) strategy into a TF dual-path backbone. The encoder performs coarse separation from mixture observations, while the weight-shared decoder progressively reconstructs speaker-discriminative features with cross-speaker interaction, enabling stage-wise refinement. To complement this architecture, we formulate speech separation as a structured correlation-to-filter problem: spatio-spectro-temporal correlations computed from the observations are used as input features, and the corresponding deep filters are estimated to recover target signals. We further incorporate an attractor-based dynamic split module to adapt the number of output streams to the actual speaker configuration. Experimental results on WSJ0-2/3/4/5Mix, WHAMR!, and LibriCSS demonstrate consistent improvements across anechoic, noisy-reverberant, and real-recorded conditions in both single- and multi-channel settings, highlighting the effectiveness of TF-domain SepRe with correlation-based filter estimation for speech separation.
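The correlation-to-filter idea in the abstract can be illustrated with a toy, single-channel sketch. This is not the SR-CorrNet implementation: the abstract gives no formulas, so the correlation definition (each TF bin multiplied by the conjugate of a delayed frame), the multi-frame filter form, and all names (`shift_frames`, `temporal_correlations`, `apply_deep_filter`) are assumptions made for illustration. The network that would map correlations to filter taps is omitted, and only the temporal part of the "spatio-spectro-temporal" correlations is shown. Pure Python is used to keep the sketch dependency-free; a real system would use an array library.

```python
def shift_frames(spec, tau):
    """Delay a [T][F] complex spectrogram by tau frames, zero-padding the start."""
    n_freq = len(spec[0])
    pad = [[0j] * n_freq for _ in range(min(tau, len(spec)))]
    return (pad + [row[:] for row in spec])[:len(spec)]


def temporal_correlations(spec, n_taps):
    """Assumed feature: corr[tau][t][f] = X[t][f] * conj(X[t - tau][f])."""
    features = []
    for tau in range(n_taps):
        delayed = shift_frames(spec, tau)
        features.append([
            [x * d.conjugate() for x, d in zip(row, delayed_row)]
            for row, delayed_row in zip(spec, delayed)
        ])
    return features


def apply_deep_filter(spec, taps):
    """Assumed filter form: Y[t][f] = sum over tau of taps[tau][t][f] * X[t - tau][f]."""
    n_frames, n_freq = len(spec), len(spec[0])
    out = [[0j] * n_freq for _ in range(n_frames)]
    for tau, tap in enumerate(taps):
        delayed = shift_frames(spec, tau)
        for t in range(n_frames):
            for f in range(n_freq):
                out[t][f] += tap[t][f] * delayed[t][f]
    return out


# Toy mixture spectrogram: 3 frames x 2 frequency bins (made-up values).
mixture = [[1 + 1j, 2 + 0j], [0 + 2j, 1 - 1j], [3 + 0j, 0 + 1j]]

# Input features a model would consume in this sketch.
corr = temporal_correlations(mixture, n_taps=2)

# Stand-in for model output: an identity filter (tap 0 is 1, tap 1 is 0),
# which passes the mixture through unchanged.
identity_taps = [
    [[1 + 0j] * 2 for _ in range(3)],
    [[0j] * 2 for _ in range(3)],
]
estimate = apply_deep_filter(mixture, identity_taps)
```

In this sketch a separate set of taps would be estimated per speaker, so the number of filter sets would follow the number of output streams chosen by the split module described in the abstract.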
Comments: Submitted to IEEE/ACM Transactions on Audio, Speech, and Language Processing (T-ASLP)
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
Cite as: arXiv:2603.29097 [eess.AS]
(or arXiv:2603.29097v1 [eess.AS] for this version)
https://doi.org/10.48550/arXiv.2603.29097
arXiv-issued DOI via DataCite (pending registration)
Submission history
From: Ui-Hyeop Shin
[v1] Tue, 31 Mar 2026 00:37:15 UTC (1,538 KB)