
A Human-Inspired Decoupled Architecture for Efficient Audio Representation Learning

arXiv [Submitted on 27 Mar 2026] | March 30, 2026

arXiv:2603.26098v1 | Announce Type: cross | Authors: Harunori Kawano, Takeshi Sasaki


Abstract: While self-supervised learning (SSL) has revolutionized audio representation, the excessive parameterization and quadratic computational cost of standard Transformers limit their deployment on resource-constrained devices. To address this bottleneck, we propose HEAR (Human-inspired Efficient Audio Representation), a novel decoupled architecture. Inspired by the human cognitive ability to isolate local acoustic features from global context, HEAR splits the processing pipeline into two dedicated modules: an Acoustic Model for local feature extraction and a Task Model for global semantic integration. Coupled with an Acoustic Tokenizer trained via knowledge distillation, our approach enables robust Masked Audio Modeling (MAM). Extensive experiments demonstrate that HEAR requires only 15M parameters and 9.47 GFLOPs for inference, operating at a fraction of the computational cost of conventional foundation models (which typically require 85M-94M parameters). Despite this high efficiency, HEAR achieves highly competitive performance across diverse audio classification benchmarks. The code and pre-trained models are available at this https URL
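To put the abstract's efficiency figures in perspective, the short sketch below computes the parameter reduction implied by the quoted counts (15M versus 85M-94M) and illustrates why separating local from global processing can ease the quadratic cost of attention. The second part is only an illustration under stated assumptions: the window-based split, the function names, and the sequence length and window size are hypothetical and are not taken from the paper, which does not describe HEAR's internals at this level.

```python
# Parameter counts quoted in the abstract (in millions).
HEAR_PARAMS_M = 15
BASELINE_PARAMS_M = (85, 94)  # range quoted for conventional foundation models


def reduction_factor(baseline_m: float, compact_m: float) -> float:
    """How many times fewer parameters the compact model has."""
    return baseline_m / compact_m


def full_attention_pairs(n_tokens: int) -> int:
    """Token pairs scored by global self-attention over the whole sequence."""
    return n_tokens * n_tokens


def decoupled_attention_pairs(n_tokens: int, window: int) -> int:
    """Hypothetical local/global split (an assumption, not HEAR's actual design):
    attention inside non-overlapping windows, followed by attention across
    one summary token per window."""
    n_windows = -(-n_tokens // window)  # ceiling division
    local_pairs = n_windows * window * window
    global_pairs = n_windows * n_windows
    return local_pairs + global_pairs


low, high = (reduction_factor(b, HEAR_PARAMS_M) for b in BASELINE_PARAMS_M)
print(f"Parameter reduction: {low:.1f}x to {high:.1f}x")

n, w = 1024, 16  # hypothetical sequence length and window size
print("Full attention pairs:     ", full_attention_pairs(n))
print("Decoupled attention pairs:", decoupled_attention_pairs(n, w))
```

With the quoted counts, the reduction works out to roughly 5.7x to 6.3x fewer parameters. The pair counts are not a FLOP estimate for HEAR; they only show how a local-then-global arrangement grows more slowly with sequence length than full attention.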

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)

Cite as: arXiv:2603.26098 [cs.SD]

(or arXiv:2603.26098v1 [cs.SD] for this version)

https://doi.org/10.48550/arXiv.2603.26098

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Harunori Kawano
[v1] Fri, 27 Mar 2026 06:09:03 UTC (557 KB)
