Diffutron: A Masked Diffusion Language Model for Turkish Language

HuggingFace Papers · March 20, 2026

Masked diffusion language models offer a resource-efficient, non-autoregressive approach to text generation for Turkish through progressive instruction tuning and LoRA-based pre-training.

arXiv:2603.20466

Published on Mar 20 · Submitted by Talha Rüzgar Akkuş on Mar 30 · Diffutron

Authors: Şuayp Talha Kocabay, Talha Rüzgar Akkuş

Abstract

Masked Diffusion Language Models (MDLMs) have emerged as a compelling non-autoregressive alternative to standard large language models; however, their application to morphologically rich languages remains limited. In this paper, we introduce Diffutron, a masked diffusion language model specifically designed for Turkish. Our approach leverages a resource-efficient training pipeline, starting with LoRA-based continual pre-training of a multilingual encoder on a large-scale corpus. To enable generative capabilities, we employ a progressive instruction-tuning strategy, sequentially adapting the model on general and task-specific instruction sets. Experimental results across comprehensive benchmarks demonstrate that, despite its compact size, our model achieves competitive performance compared to existing multi-billion-parameter baselines. These findings validate the effectiveness of masked diffusion modeling combined with multi-stage tuning for non-autoregressive text generation in Turkish.
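To make the non-autoregressive idea in the abstract concrete, here is a minimal sketch of how masked diffusion language models are commonly described: a forward process that replaces tokens with a mask symbol at rate t, and a reverse process that starts from a fully masked sequence and reveals the most confident predictions over a few steps. This is a generic illustration, not the Diffutron implementation. The names corrupt, generate, and toy_predict, the example sentence, and the confidence-ordered unmasking schedule are all assumptions made for this sketch; a trained model would take the place of toy_predict.

```python
import random

MASK = "[MASK]"

# Hypothetical example sentence ("the weather is very nice today").
TARGET = ["bugün", "hava", "çok", "güzel"]


def corrupt(tokens, t, rng):
    """Forward process: mask each token independently with probability t."""
    return [MASK if rng.random() < t else tok for tok in tokens]


def toy_predict(seq):
    """Stand-in for a trained denoiser (assumption, not a real model).

    Returns {position: (token, confidence)} for every masked position.
    Confidence simply decreases with position so the run is deterministic.
    """
    return {
        i: (TARGET[i], 1.0 - 0.1 * i)
        for i, tok in enumerate(seq)
        if tok == MASK
    }


def generate(length, predict, steps):
    """Reverse process: start fully masked, reveal tokens over `steps` rounds."""
    seq = [MASK] * length
    for step in range(steps):
        masked = [i for i, tok in enumerate(seq) if tok == MASK]
        if not masked:
            break
        proposals = predict(seq)
        # Spread the remaining masked positions over the remaining steps.
        k = -(-len(masked) // (steps - step))  # ceiling division
        # Commit the k most confident proposals; the rest stay masked.
        best = sorted(masked, key=lambda i: proposals[i][1], reverse=True)[:k]
        for i in best:
            seq[i] = proposals[i][0]
    return seq


if __name__ == "__main__":
    rng = random.Random(0)
    print(corrupt(TARGET, 0.5, rng))  # a partially masked training input
    print(generate(len(TARGET), toy_predict, steps=2))
```

Unlike left-to-right decoding, each round fills several positions at once, so the number of model calls is set by steps rather than by sequence length. During training, a model of this kind is typically asked to recover the original tokens at the masked positions of a corrupted input.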

Community

librarian-bot · 2 days ago

This is an automated message from the Librarian Bot. I found the following papers similar to this paper.

The following papers were recommended by the Semantic Scholar API:

EstLLM: Enhancing Estonian Capabilities in Multilingual LLMs via Continued Pretraining and Post-Training (2026)
Raising Bars, Not Parameters: LilMoo Compact Language Model for Hindi (2026)
AraModernBERT: Transtokenized Initialization and Long-Context Encoder Modeling for Arabic (2026)
ViCLSR: A Supervised Contrastive Learning Framework with Natural Language Inference for Natural Language Understanding Tasks (2026)
MzansiText and MzansiLM: An Open Corpus and Decoder-Only Language Model for South African Languages (2026)
MrBERT: Modern Multilingual Encoders via Vocabulary, Domain, and Dimensional Adaptation (2026)
Dicta-LM 3.0: Advancing The Frontier of Hebrew Sovereign LLMs (2026)

Original source

HuggingFace Papers

https://huggingface.co/papers/2603.20466
