
Can pre-trained Deep Learning models predict groove ratings?

arXiv, March 31, 2026

arXiv:2603.27237v1, announce type: cross. Authors: Axel Marmoret, Nicolas Farrugia, Jan Alexander Stupacher


Abstract: This study explores the extent to which deep learning models can predict groove and its related perceptual dimensions directly from audio signals. We critically examine the effectiveness of seven state-of-the-art deep learning models in predicting groove ratings and responses to groove-related queries through the extraction of audio embeddings. Additionally, we compare these predictions with traditional handcrafted audio features. To better understand the underlying mechanics, we extend this methodology to analyze predictions based on source-separated instruments, thereby isolating the contributions of individual musical elements. Our analysis reveals a clear separation of groove characteristics driven by the underlying musical style of the tracks (funk, pop, and rock). These findings indicate that deep audio representations can successfully encode complex, style-dependent groove components that traditional features often miss. Ultimately, this work highlights the capacity of advanced deep learning models to capture the multifaceted concept of groove, demonstrating the strong potential of representation learning to advance predictive Music Information Retrieval methodologies.
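The abstract does not say how embeddings are mapped to ratings. One common setup, assumed here purely for illustration, is a linear probe: fit a ridge regression on frozen per-track embeddings, score it with cross-validated R², and run the same protocol on handcrafted features. The sketch below uses only the Python standard library and synthetic stand-in data; the function names, the ridge penalty, the fold scheme, and the zero-mean ratings are all hypothetical choices, not the authors' method. Because the synthetic ratings are generated from the stand-in embedding matrix, the printed scores say nothing about real models or about the paper's results.

```python
import random


def ridge_fit(X, y, alpha=1.0):
    """Solve (X^T X + alpha*I) w = X^T y with Gauss-Jordan elimination."""
    d = len(X[0])
    # Augmented normal-equation matrix [X^T X + alpha*I | X^T y].
    A = [[sum(row[i] * row[j] for row in X) + (alpha if i == j else 0.0)
          for j in range(d)] + [sum(row[i] * t for row, t in zip(X, y))]
         for i in range(d)]
    for col in range(d):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, d), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(d):
            if r != col:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][d] for i in range(d)]


def predict(X, w):
    return [sum(xi * wi for xi, wi in zip(row, w)) for row in X]


def r2_score(y_true, y_pred):
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def cross_val_r2(X, y, k=5, alpha=1.0):
    """Mean held-out R^2 over k interleaved folds."""
    n = len(X)
    scores = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        X_tr = [X[i] for i in range(n) if i not in test_idx]
        y_tr = [y[i] for i in range(n) if i not in test_idx]
        X_te = [X[i] for i in sorted(test_idx)]
        y_te = [y[i] for i in sorted(test_idx)]
        w = ridge_fit(X_tr, y_tr, alpha)
        scores.append(r2_score(y_te, predict(X_te, w)))
    return sum(scores) / k


# Synthetic stand-ins (ASSUMPTION): 120 "tracks", 8-dim "embeddings",
# zero-mean "groove ratings" built from the embeddings plus noise.
rng = random.Random(0)
true_w = [1.0, -0.8, 0.6, 0.5, -0.4, 0.3, 0.2, -0.1]
embeddings = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(120)]
ratings = [sum(e * w for e, w in zip(row, true_w)) + rng.gauss(0, 0.5)
           for row in embeddings]
# "Handcrafted" features here are noisy views of two embedding dimensions.
handcrafted = [[row[0] + rng.gauss(0, 0.5), row[1] + rng.gauss(0, 0.5)]
               for row in embeddings]

r2_emb = cross_val_r2(embeddings, ratings)
r2_hand = cross_val_r2(handcrafted, ratings)
print(f"embedding probe   R^2: {r2_emb:.2f}")
print(f"handcrafted probe R^2: {r2_hand:.2f}")
```

With real data, `embeddings` would be replaced by the per-track output of a pre-trained audio model and `handcrafted` by a matrix of classical descriptors; the source-separated analysis mentioned in the abstract would presumably repeat the same protocol once per isolated stem.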

Comments: Submitted to the SMC 2026 conference. 3 figures and 2 tables

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)

ACM classes: H.5.5

Cite as: arXiv:2603.27237 [cs.SD]

(or arXiv:2603.27237v1 [cs.SD] for this version)

https://doi.org/10.48550/arXiv.2603.27237

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Axel Marmoret
[v1] Sat, 28 Mar 2026 11:20:38 UTC (501 KB)

Original source

arXiv

https://arxiv.org/abs/2603.27237



