GAP-URGENet: A Generative-Predictive Fusion Framework for Universal Speech Enhancement
Abstract: We introduce GAP-URGENet, a generative-predictive fusion framework developed for Track 1 of the ICASSP 2026 URGENT Challenge. The system integrates a generative branch, which performs full-stack speech restoration in a self-supervised representation domain and reconstructs the waveform via a neural vocoder, along with a predictive branch that performs spectrogram-domain enhancement, providing complementary cues. Outputs from both branches are fused by a post-processing module, which also performs bandwidth extension to generate the enhanced waveform at 48 kHz, later downsampled to the original sampling rate. This generative-predictive fusion improves robustness and perceptual quality, achieving top performance in the blind-test phase and ranking 1st in the objective evaluation. Audio examples are available at this https URL.
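
The data flow described in the abstract (two parallel branches, a fusion stage that outputs 48 kHz audio, then downsampling to the input rate) can be sketched roughly as follows. This is a minimal illustration of the pipeline shape only, not the authors' implementation: the names `enhance`, `stub_branch`, `stub_fuse`, and `resample_linear` are hypothetical, both branches are identity placeholders, the fusion is a plain average, and the "bandwidth extension" here is just naive upsampling that adds no new high-frequency content.

```python
from typing import Callable, List, Sequence

Waveform = List[float]


def resample_linear(x: Sequence[float], src_rate: int, dst_rate: int) -> Waveform:
    """Naive linear-interpolation resampler (illustrative; no anti-aliasing filter)."""
    if src_rate == dst_rate or len(x) == 0:
        return list(x)
    n_out = max(1, round(len(x) * dst_rate / src_rate))
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate          # fractional index into x
        lo = min(int(pos), len(x) - 1)
        hi = min(lo + 1, len(x) - 1)
        frac = pos - lo
        out.append((1.0 - frac) * x[lo] + frac * x[hi])
    return out


def stub_branch(noisy: Sequence[float], sample_rate: int) -> Waveform:
    """Placeholder for either branch (assumption): returns the input unchanged."""
    return list(noisy)


def stub_fuse(gen: Sequence[float], pred: Sequence[float], sample_rate: int,
              target_rate: int = 48000) -> Waveform:
    """Placeholder fusion (assumption): average both outputs, then upsample to 48 kHz.

    A real bandwidth-extension module would synthesize missing high-frequency
    content; this stub only changes the sampling rate.
    """
    mixed = [0.5 * (g + p) for g, p in zip(gen, pred)]
    return resample_linear(mixed, sample_rate, target_rate)


def enhance(noisy: Sequence[float], sample_rate: int,
            generative_branch: Callable = stub_branch,
            predictive_branch: Callable = stub_branch,
            fuse: Callable = stub_fuse,
            target_rate: int = 48000) -> Waveform:
    """Run both branches, fuse at 48 kHz, then return to the original rate."""
    gen = generative_branch(noisy, sample_rate)    # restoration + vocoder in the paper
    pred = predictive_branch(noisy, sample_rate)   # spectrogram-domain enhancement in the paper
    fused_48k = fuse(gen, pred, sample_rate, target_rate)
    return resample_linear(fused_48k, target_rate, sample_rate)


if __name__ == "__main__":
    noisy = [0.0, 0.5, 1.0, 0.5] * 4000            # 1 second of toy audio at 16 kHz
    enhanced = enhance(noisy, 16000)
    print(len(noisy), len(enhanced))
```

Passing the branches and the fusion step as callables keeps the sketch agnostic about the actual models, which the abstract does not specify in detail.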
Comments: Awarded 1st place in the URGENT 2026 Challenge (objective phase), accepted by ICASSP 2026
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
Cite as: arXiv:2604.01832 [eess.AS]
(or arXiv:2604.01832v1 [eess.AS] for this version)
https://doi.org/10.48550/arXiv.2604.01832
arXiv-issued DOI via DataCite (pending registration)
Submission history
From: Xiaobin Rong
[v1] Thu, 2 Apr 2026 09:45:58 UTC (88 KB)



