UK backs off default AI training on copyrighted material - theregister.com

More about: training

TRIMS: Trajectory-Ranked Instruction Masked Supervision for Diffusion Language Models

arXiv:2604.00666v1. Abstract: Diffusion language models (DLMs) offer a promising path toward low-latency generation through parallel decoding, but their practical efficiency depends heavily on the decoding trajectory. In practice, this advantage often fails to fully materialize because standard training does not provide explicit supervision over token reveal order, creating a train-inference mismatch that leads to suboptimal decoding behavior. We propose Trajectory-Ranked Instruction Masked Supervision (TRIMS), a simple trajectory-guided supervised fine-tuning framework that injects trajectory supervision into standard Masked Diffusion Language Model (MDLM) training with minimal overhead. Instead of relying on costly DLM-based distillation, TRIMS uses lightweight signals…

arXiv cs.CL · 1m · about 4 hours ago

More Human, More Efficient: Aligning Annotations with Quantized SLMs

arXiv:2604.00586v1. Abstract: As Large Language Model (LLM) capabilities advance, the demand for high-quality annotation of exponentially increasing text corpora has outpaced human capacity, leading to the widespread adoption of LLMs in automatic evaluation and annotation. However, proprietary LLMs often exhibit systematic biases that diverge from human expert consensus, lack reproducibility, and raise data privacy concerns. Our work examines the viability of fine-tuning a quantized Small Language Model with 1.7B parameters on limited human-annotated data to serve as a highly aligned, deterministic evaluator and annotator. By implementing a custom, multi-dimensional rubric framework and simple augmentation and regularization techniques, the proposed approach achieves h…

arXiv cs.CL · 1m · about 4 hours ago

Adapting Text LLMs to Speech via Multimodal Depth Up-Scaling

arXiv:2604.00489v1. Abstract: Adapting pre-trained text Large Language Models (LLMs) into Speech Language Models (Speech LMs) via continual pretraining on speech data is promising, but often degrades the original text capabilities. We propose Multimodal Depth Up-Scaling, an extension of an emerging strategy in continual LLM pre-training, where new transformer layers are inserted into a frozen text LLM and only the added layers are trained on speech data. Experiments with SmolLM2-360M and SmolLM2-1.7B on 48k hours of English Automatic Speech Recognition (ASR) data show that depth up-scaling achieves ASR comparable to full fine-tuning while causing far less text degradation than both full fine-tuning and Low-Rank Adaptation (LoRA). We further show that incorporating E-Branch…

arXiv cs.CL · 1m · about 4 hours ago

More in Models

English to Central Kurdish Speech Translation: Corpus Creation, Evaluation, and Orthographic Standardization

arXiv:2604.00613v1. Abstract: We present KUTED, a speech-to-text translation (S2TT) dataset for Central Kurdish, derived from TED and TEDx talks. The corpus comprises 91,000 sentence pairs, including 170 hours of English audio, 1.65 million English tokens, and 1.40 million Central Kurdish tokens. We evaluate KUTED on the S2TT task and find that orthographic variation significantly degrades Kurdish translation performance, producing nonstandard outputs. To address this, we propose a systematic text standardization approach that yields substantial performance gains and more consistent translations. On a test set separated from TED talks, a fine-tuned Seamless model achieves 15.18 BLEU, and we improve the Seamless baseline by 3.0 BLEU on the FLEURS benchmark. We also train a Tra…

arXiv cs.CL · 1m · about 4 hours ago

Speech LLMs are Contextual Reasoning Transcribers

arXiv:2604.00610v1. Abstract: Despite extensions to speech inputs, effectively leveraging the rich knowledge and contextual understanding of large language models (LLMs) in automatic speech recognition (ASR) remains non-trivial, as the task primarily involves direct speech-to-text mapping. To address this, this paper proposes chain-of-thought ASR (CoT-ASR), which constructs a reasoning chain that enables LLMs to first analyze the input speech and generate contextual analysis, thereby fully exploiting their generative capabilities. With this contextual reasoning, CoT-ASR then performs more informed speech recognition and completes both reasoning and transcription in a single pass. Moreover, CoT-ASR naturally supports user-guided transcription: while designed to self-genera…

arXiv cs.CL · 1m · about 4 hours ago
