Distilling Conversations: Abstract Compression of Conversational Audio Context for LLM-based ASR

arXiv, March 30, 2026

arXiv:2603.26246v1 (announce type: cross)

Authors: Shashi Kumar, Esaú Villatoro-Tello, Sergio Burdisso, Kadri Hacioglu, Thibault Bañeras-Roux, Hasindri Watawana, Dairazalia Sanchez-Cortes, Srikanth Madikeri, Petr Motlicek, Andreas Stolcke

Abstract: Standard LLM-based speech recognition systems typically process utterances in isolation, limiting their ability to leverage conversational context. In this work, we study whether multimodal context from prior turns improves LLM-based ASR and how to represent that context efficiently. We find that, after supervised multi-turn training, conversational context mainly helps with the recognition of contextual entities. However, conditioning on raw context is expensive because the prior-turn audio token sequence grows rapidly with conversation length. To address this, we propose Abstract Compression, which replaces the audio portion of prior turns with a fixed number of learned latent tokens while retaining corresponding transcripts explicitly. On both in-domain and out-of-domain test sets, the compressed model recovers part of the gains of raw-context conditioning with a smaller prior-turn audio footprint. We also provide targeted analyses of the compression setup and its trade-offs.
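
To make the footprint argument in the abstract concrete, here is a minimal Python sketch of how a variable-length sequence of prior-turn audio tokens could be reduced to a fixed number of latent tokens. The abstract does not describe the compression mechanism, so everything here is an assumption for illustration: the cross-attention pooling with learned queries, the per-turn application of the latent budget, the function names `compress_audio` and `prior_audio_footprint`, and the sizes `d`, `K`, and `turn_lengths`. It is not the authors' implementation.

```python
import numpy as np


def compress_audio(audio_tokens, latent_queries):
    """Pool T audio tokens into K latent tokens via cross-attention.

    Assumed mechanism (not from the paper): K query vectors attend
    over the audio tokens of one prior turn.
    audio_tokens:   (T, d) array, T varies with turn duration
    latent_queries: (K, d) array, K is fixed
    returns:        (K, d) array
    """
    d = audio_tokens.shape[1]
    scores = latent_queries @ audio_tokens.T / np.sqrt(d)  # (K, T)
    scores = scores - scores.max(axis=1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)  # softmax over T
    return weights @ audio_tokens                           # (K, d)


def prior_audio_footprint(turn_lengths, num_latents=None):
    """Count prior-turn audio positions that would be fed to the LLM."""
    if num_latents is None:
        # Raw context: every audio token of every prior turn is kept.
        return sum(turn_lengths)
    # Compressed context (assumed per-turn budget): K latents per prior turn.
    return num_latents * len(turn_lengths)


rng = np.random.default_rng(0)
d, K = 16, 4                       # hypothetical embedding size and latent count
queries = rng.normal(size=(K, d))  # would be learned in training; random here
turn_lengths = [120, 75, 200]      # hypothetical audio token counts per prior turn

compressed = [
    compress_audio(rng.normal(size=(T, d)), queries) for T in turn_lengths
]
raw_positions = prior_audio_footprint(turn_lengths)
compressed_positions = prior_audio_footprint(turn_lengths, num_latents=K)
```

With these made-up numbers, the raw context occupies 395 audio positions while the compressed context occupies 12, regardless of how long each prior turn was. In the setup the abstract describes, the prior-turn transcripts would still be passed to the model as ordinary text alongside the latent tokens.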

Comments: 11 pages

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)

Cite as: arXiv:2603.26246 [cs.CL] (or arXiv:2603.26246v1 [cs.CL] for this version)

https://doi.org/10.48550/arXiv.2603.26246

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Shashi Kumar

[v1] Fri, 27 Mar 2026 10:09:30 UTC (1,818 KB)

Original source

arXiv

https://arxiv.org/abs/2603.26246
