KARMA: Knowledge-Action Regularized Multimodal Alignment for Personalized Search at Taobao
Abstract: Large Language Models (LLMs) carry rich semantic knowledge, making them a natural choice for injecting semantic generalization into personalized search systems. In practice, however, we find that directly fine-tuning LLMs on industrial personalized tasks (e.g., next-item prediction) often yields suboptimal results. We attribute this bottleneck to a critical Knowledge-Action Gap: the inherent conflict between preserving pre-trained semantic knowledge and aligning with specific personalized actions through discriminative objectives. Empirically, action-only training objectives induce Semantic Collapse, which shows up as symptoms such as attention "sinks". This degradation severely cripples the LLM's generalization, so the model fails to improve personalized search systems. We propose KARMA (Knowledge-Action Regularized Multimodal Alignment), a unified framework that treats semantic reconstruction as a train-only regularizer. KARMA optimizes a next-interest embedding for retrieval (Action) while enforcing semantic decodability (Knowledge) through two complementary objectives: (i) history-conditioned semantic generation, which anchors optimization to the LLM's native next-token distribution, and (ii) embedding-conditioned semantic reconstruction, which constrains the interest embedding to remain semantically recoverable. On the Taobao search system, KARMA mitigates semantic collapse (as measured by attention-sink analysis) and improves both action metrics and semantic fidelity. In ablations, semantic decodability yields up to +22.5 HR@200. With KARMA, we achieve +0.25 CTR AUC in ranking, +1.86 HR in pre-ranking, and +2.51 HR in recall. Deployed online with low inference overhead at the ranking and pre-ranking stages, KARMA drives a +0.9% increase in GMV.
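To make the structure of the objective concrete, the following is a minimal sketch of how an action loss and two semantic regularizers could be combined. The abstract does not give the loss forms or weights, so everything here is an assumption made for illustration: the softmax retrieval loss, the token-level negative log-likelihood used for both semantic terms, the weights `w_gen` and `w_recon`, and all function names are hypothetical and are not taken from the paper.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def action_loss(interest_emb, item_embs, target_idx, temperature=1.0):
    """Assumed Action term: softmax cross-entropy over dot-product scores
    between the next-interest embedding and candidate item embeddings."""
    scores = [
        sum(a * b for a, b in zip(interest_emb, item)) / temperature
        for item in item_embs
    ]
    return -log_softmax(scores)[target_idx]


def token_nll(token_logits, target_tokens):
    """Assumed Knowledge term: mean next-token negative log-likelihood.
    Used here for both (i) history-conditioned generation and
    (ii) embedding-conditioned reconstruction; only the conditioning differs."""
    total = 0.0
    for logits, tok in zip(token_logits, target_tokens):
        total += -log_softmax(logits)[tok]
    return total / len(target_tokens)


def combined_loss(action, gen_nll, recon_nll, w_gen=1.0, w_recon=1.0):
    """Hypothetical weighted sum of the action loss and the two regularizers."""
    return action + w_gen * gen_nll + w_recon * recon_nll


# Toy inputs: a zero interest embedding scores all 4 items equally,
# and uniform logits over a 3-token vocabulary.
act = action_loss([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]], 2)
gen = token_nll([[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]], [1, 2])
rec = token_nll([[0.0, 0.0, 0.0]], [0])
total = combined_loss(act, gen, rec, w_gen=0.5, w_recon=0.5)
```

Because the abstract describes the semantic terms as train-only, a sketch like this would apply them only during training; at serving time only the interest embedding and the retrieval scores would be needed, which is consistent with the low inference overhead the abstract reports.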
Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as: arXiv:2603.22779 [cs.IR]
(or arXiv:2603.22779v2 [cs.IR] for this version)
https://doi.org/10.48550/arXiv.2603.22779
arXiv-issued DOI via DataCite
Submission history
From: Zhi Sun
[v1] Tue, 24 Mar 2026 04:13:30 UTC (1,195 KB)
[v2] Tue, 31 Mar 2026 09:40:53 UTC (1,195 KB)