Drift-Aware Continual Tokenization for Generative Recommendation

arXiv cs.IR · by Yuebo Feng, Jiahao Liu, Mingzhe Han, Dongsheng Li, Hansu Gu, Peng Zhang, Tun Lu, Ning Gu · April 1, 2026 · 2 min read

arXiv:2603.29705v1 Announce Type: new


Abstract: Generative recommendation commonly adopts a two-stage pipeline in which a learnable tokenizer maps items to discrete token sequences (i.e., identifiers) and an autoregressive generative recommender model (GRM) performs prediction based on these identifiers. Recent tokenizers further incorporate collaborative signals so that items with similar user-behavior patterns receive similar codes, substantially improving recommendation quality. However, real-world environments evolve continuously: new items cause identifier collisions and shifts, while new interactions induce collaborative drift in existing items (e.g., changing co-occurrence patterns and popularity). Fully retraining both the tokenizer and the GRM is often prohibitively expensive, yet naively fine-tuning the tokenizer can alter the token sequences of the majority of existing items, undermining the GRM's learned token-embedding alignment. To balance plasticity and stability for collaborative tokenizers, we propose DACT, a Drift-Aware Continual Tokenization framework with two stages: (i) tokenizer fine-tuning, augmented with a jointly trained Collaborative Drift Identification Module (CDIM) that outputs item-level drift confidence and enables differentiated optimization for drifting and stationary items; and (ii) hierarchical code reassignment using a relaxed-to-strict strategy to update token sequences while limiting unnecessary changes. Experiments on three real-world datasets with two representative GRMs show that DACT consistently outperforms baselines, demonstrating effective adaptation to collaborative evolution with reduced disruption to prior knowledge. Our implementation is publicly available at this https URL for reproducibility.
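Since the abstract names DACT's two stages only at a high level, a minimal PyTorch-style sketch may help make them concrete. Everything below is an assumption-laden illustration, not the authors' released implementation: the module architecture, the loss weighting, and the per-level margin thresholds (`CDIM`, `drift_weighted_loss`, `relaxed_to_strict_reassign`) are hypothetical readings of "drift confidence", "differentiated optimization", and "relaxed-to-strict reassignment".

```python
import torch
import torch.nn as nn

class CDIM(nn.Module):
    """Sketch of a Collaborative Drift Identification Module: scores how much
    an item's collaborative signal has drifted, given old vs. new embeddings."""
    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(2 * dim, dim),
            nn.ReLU(),
            nn.Linear(dim, 1),
            nn.Sigmoid(),
        )

    def forward(self, old_emb: torch.Tensor, new_emb: torch.Tensor) -> torch.Tensor:
        # Drift confidence in [0, 1] per item: ~1 = drifting, ~0 = stationary.
        return self.score(torch.cat([old_emb, new_emb], dim=-1)).squeeze(-1)


def drift_weighted_loss(recon_loss: torch.Tensor,
                        stability_loss: torch.Tensor,
                        drift_conf: torch.Tensor) -> torch.Tensor:
    """Stage (i), differentiated optimization: drifting items follow the new
    reconstruction objective, stationary items are anchored to their old codes."""
    return (drift_conf * recon_loss + (1.0 - drift_conf) * stability_loss).mean()


def relaxed_to_strict_reassign(old_codes, new_logits, thresholds):
    """Stage (ii), hierarchical reassignment: at each code level, keep an item's
    old code unless the fine-tuned tokenizer prefers another code by more than
    that level's margin; margins grow across levels (relaxed -> strict) so
    deeper levels change only when clearly necessary."""
    new_codes = []
    for old_c, logits, tau in zip(old_codes, new_logits, thresholds):
        best = logits.argmax(dim=-1)  # (batch,) preferred code per item
        gap = (logits.gather(-1, best.unsqueeze(-1))
               - logits.gather(-1, old_c.unsqueeze(-1))).squeeze(-1)
        new_codes.append(torch.where(gap < tau, old_c, best))
    return new_codes
```

In this reading, the drift confidence interpolates between plasticity (reconstructing fresh collaborative signals) and stability (staying near previous codes), while the reassignment pass bounds how many identifiers the GRM actually sees change.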

Subjects:

Information Retrieval (cs.IR)

Cite as: arXiv:2603.29705 [cs.IR]

(or arXiv:2603.29705v1 [cs.IR] for this version)

https://doi.org/10.48550/arXiv.2603.29705

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Yuebo Feng [v1] Tue, 31 Mar 2026 13:02:47 UTC (720 KB)
