KALAVAI: Predicting When Independent Specialist Fusion Works -- A Quantitative Model for Post-Hoc Cooperative LLM Training
arXiv:2603.22755v2 (announce type: replace-cross)
Ramchand Kumaresan
Abstract: Independently trained domain specialists can be fused post-hoc into a single model that outperforms any individual specialist, and the gain is predictable: gain = 0.82 × divergence - 2.72 (R^2 = 0.856, n = 6, 3-26% divergence). This enables practitioners to estimate cooperative value before committing compute. Below ~3.3% divergence, gains approach zero.

In the KALAVAI protocol, contributors fine-tune copies of a shared checkpoint independently, then submit them for lightweight MoE routing (500 steps). Gains are consistent: +7.72% at 410M (±0.02%, 3 seeds), +7.49% at 1B (±0.01%, 3 seeds), and +6.53% at 6.9B, each over the best specialist. The router matches domain-oracle routing to within 10^{-5} nats. Cross-lingual fusion (Tamil/Yoruba/Welsh/Code) achieves +21.76%, with Yoruba perplexity falling from 41.9 to 7.7. A 20-contributor federation achieves +16.71% (±0.07pp, 3 seeds).

Three requirements bound the protocol. Shared initialisation is necessary: checkpoint mismatch degrades routing. Frozen layers are optional below ~10,000 steps and beneficial beyond. Learned routing is essential: uniform averaging underperforms the best specialist by 1.2%, while any trained router achieves oracle-optimal assignment.
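The linear fit quoted in the abstract can be evaluated directly to get a rough estimate before committing compute. The sketch below is only an illustration of that arithmetic: the function and constant names are hypothetical, the abstract does not say how divergence is measured, and the fit rests on six points covering 3-26% divergence, so values outside that range are extrapolations.

```python
# Minimal sketch of the abstract's linear model:
#   gain = 0.82 * divergence - 2.72   (both in percent)
# All names below are illustrative assumptions, not taken from the paper.

SLOPE = 0.82             # coefficient quoted in the abstract
INTERCEPT = -2.72        # intercept quoted in the abstract
FIT_RANGE = (3.0, 26.0)  # divergence range (%) the fit was reported on


def predicted_gain(divergence_pct: float) -> float:
    """Predicted gain (%) over the best specialist for a given divergence (%)."""
    return SLOPE * divergence_pct + INTERCEPT


def break_even_divergence() -> float:
    """Divergence (%) at which the fitted line predicts zero gain."""
    return -INTERCEPT / SLOPE


def in_fitted_range(divergence_pct: float) -> bool:
    """True if the divergence lies inside the range the fit was reported on."""
    low, high = FIT_RANGE
    return low <= divergence_pct <= high


if __name__ == "__main__":
    for d in (2.0, 10.0, 26.0, 40.0):
        note = "" if in_fitted_range(d) else " (extrapolated)"
        print(f"divergence {d:5.1f}% -> predicted gain {predicted_gain(d):6.2f}%{note}")
    print(f"break-even divergence: {break_even_divergence():.2f}%")
```

Under this fit, the zero crossing is 2.72 / 0.82 ≈ 3.3%, which agrees with the abstract's statement that gains approach zero below roughly 3.3% divergence, and a 10% divergence corresponds to a predicted gain of about 5.48%.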
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as: arXiv:2603.22755 [cs.CL]
(or arXiv:2603.22755v2 [cs.CL] for this version)
https://doi.org/10.48550/arXiv.2603.22755
arXiv-issued DOI via DataCite
Submission history
From: Ramchand Kumaresan
[v1] Tue, 24 Mar 2026 03:32:04 UTC (1,623 KB)
[v2] Fri, 27 Mar 2026 15:25:04 UTC (1,792 KB)