
KazByte: Adapting Qwen models to Kazakh via Byte-level Adapter

arXiv, March 31, 2026

Rauan Akylzhanov


Abstract: Large language models fragment Kazakh text into many more tokens than equivalent English text, because their tokenizers were built for high-resource languages. This tokenizer tax inflates compute, shortens the effective context window, and weakens the model's grip on Kazakh morphology. We propose to bypass the tokenizer entirely by feeding raw bytes through a small adapter that learns to speak the internal language of a frozen Qwen2.5-7B. Once the adapter is trained, we freeze it and fine-tune only the attention layers of Qwen on Kazakh text. Our central hypothesis is that this two-stage process -- first teach the interface, then adapt the model -- should match or exceed the accuracy of the original Qwen2.5-7B on standard Kazakh benchmarks. This report describes the ByteKaz architecture and training protocol. Empirical validation is ongoing; this version stakes the design and hypotheses for the record.
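The two ideas in the abstract, tokenizer-free byte input and a two-stage freezing schedule, can be illustrated with a minimal sketch. This is not the paper's code: the helper names, the `byte_adapter.` prefix, and the `.self_attn.` marker for attention weights are assumptions made for illustration, and the abstract does not say how the adapter itself is built or how it handles the longer byte sequences.

```python
# Illustrative sketch only; names below are assumptions, not the paper's code.

def to_byte_ids(text: str) -> list[int]:
    """Map text to its UTF-8 byte values (each in 0..255); no tokenizer involved."""
    return list(text.encode("utf-8"))


def from_byte_ids(ids: list[int]) -> str:
    """Invert to_byte_ids by decoding the bytes as UTF-8."""
    return bytes(ids).decode("utf-8")


ADAPTER_PREFIX = "byte_adapter."  # hypothetical prefix for the adapter's parameters
ATTENTION_MARKER = ".self_attn."  # assumed marker for attention weights in the base model


def is_trainable(param_name: str, stage: int) -> bool:
    """Decide whether a parameter is updated in a given training stage.

    Stage 1: only the adapter is trained; the base model stays frozen.
    Stage 2: the adapter is frozen; only attention layers of the base model train.
    """
    is_adapter = param_name.startswith(ADAPTER_PREFIX)
    if stage == 1:
        return is_adapter
    if stage == 2:
        return (not is_adapter) and ATTENTION_MARKER in param_name
    raise ValueError(f"unknown stage: {stage}")


if __name__ == "__main__":
    kazakh = "Қазақстан"
    english = "Kazakhstan"
    # Kazakh Cyrillic letters take two bytes each in UTF-8, Latin letters one.
    print(len(kazakh), "characters ->", len(to_byte_ids(kazakh)), "bytes")
    print(len(english), "characters ->", len(to_byte_ids(english)), "bytes")

    # Hypothetical parameter names, used only to show the schedule.
    names = [
        "byte_adapter.proj.weight",
        "model.layers.0.self_attn.q_proj.weight",
        "model.layers.0.mlp.up_proj.weight",
    ]
    for stage in (1, 2):
        trainable = [n for n in names if is_trainable(n, stage)]
        print(f"stage {stage} trains:", trainable)
```

In a real training setup, a predicate like `is_trainable` would typically drive which parameters have gradients enabled. The byte vocabulary is fixed at 256 values, so no language-specific tokenizer is needed, at the cost of longer input sequences than character counts suggest.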

Comments: Technical announcement

Subjects: Computation and Language (cs.CL); Numerical Analysis (math.NA)

Cite as: arXiv:2603.27859 [cs.CL] (or arXiv:2603.27859v1 [cs.CL] for this version)

https://doi.org/10.48550/arXiv.2603.27859

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Rauan Akylzhanov
[v1] Sun, 29 Mar 2026 20:27:58 UTC (14 KB)
