Can LLM Agents Identify Spoken Dialects like a Linguist?
Abstract: Due to the scarcity of labeled dialectal speech, audio dialect classification is a challenging task for most languages, including Swiss German. In this work, we explore how well large language models (LLMs), used as agents, understand dialects and whether they can match the performance of models such as HuBERT in dialect classification. In addition, we provide an LLM baseline and a human linguist baseline. Our approach combines phonetic transcriptions produced by ASR systems with linguistic resources such as dialect feature maps, vowel history, and rules. Our findings indicate that LLM predictions improve when linguistic information is provided. The human baseline shows that automatically generated transcriptions can be beneficial for such classifications, but also leaves room for improvement.
Comments: Accepted to DialRes Workshop @ LREC 2026
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2603.29541 [cs.CL] (or arXiv:2603.29541v1 [cs.CL] for this version)
https://doi.org/10.48550/arXiv.2603.29541
arXiv-issued DOI via DataCite (pending registration)
Submission history
From: Akbar Karimi [v1] Tue, 31 Mar 2026 10:24:20 UTC (960 KB)