StreamDiT: Real-Time Streaming Text-to-Video Generation
arXiv:2507.03745v4 Announce Type: replace-cross Abstract: Recently, great progress has been achieved in text-to-video (T2V) generation by scaling transformer-based diffusion models to billions of parameters, which can generate high-quality videos. However, existing models typically produce only short clips offline, restricting their use cases in interactive and real-time applications. This paper addresses these challenges by proposing StreamDiT, a streaming video generation model. StreamDiT training is based on flow matching by adding a moving buffer. We design mixed training with different pa — Akio Kodaira, Tingbo Hou, Ji Hou, Markos Georgopoulos, Felix Juefei-Xu, Masayoshi Tomizuka, Yue Zhao
View PDF HTML (experimental)
Abstract:Recently, great progress has been achieved in text-to-video (T2V) generation by scaling transformer-based diffusion models to billions of parameters, which can generate high-quality videos. However, existing models typically produce only short clips offline, restricting their use cases in interactive and real-time applications. This paper addresses these challenges by proposing StreamDiT, a streaming video generation model. StreamDiT training is based on flow matching by adding a moving buffer. We design mixed training with different partitioning schemes of buffered frames to boost both content consistency and visual quality. StreamDiT modeling is based on adaLN DiT with varying time embedding and window attention. To practice the proposed method, we train a StreamDiT model with 4B parameters. In addition, we propose a multistep distillation method tailored for StreamDiT. Sampling distillation is performed in each segment of a chosen partitioning scheme. After distillation, the total number of function evaluations (NFEs) is reduced to the number of chunks in a buffer. Finally, our distilled model reaches real-time performance at 16 FPS on one GPU, which can generate video streams at 512p resolution. We evaluate our method through both quantitative metrics and human evaluation. Our model enables real-time applications, e.g. streaming generation, interactive generation, and video-to-video. We provide video results and more examples in our project website: this https URL
Comments: CVPR 2026
Subjects:
Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
Cite as: arXiv:2507.03745 [cs.CV]
(or arXiv:2507.03745v4 [cs.CV] for this version)
https://doi.org/10.48550/arXiv.2507.03745
arXiv-issued DOI via DataCite
Submission history
From: Akio Kodaira [view email] [v1] Fri, 4 Jul 2025 18:00:01 UTC (29,321 KB) [v2] Tue, 8 Jul 2025 03:10:13 UTC (29,321 KB) [v3] Fri, 14 Nov 2025 10:50:56 UTC (29,316 KB) [v4] Fri, 27 Mar 2026 15:59:42 UTC (29,316 KB)
Sign in to highlight and annotate this article

Conversation starters
Daily AI Digest
Get the top 5 AI stories delivered to your inbox every morning.
More about
researchpaperarxivIllinois Tech computer science researcher honored by IEEE Chicago Section - EurekAlert!
<a href="https://news.google.com/rss/articles/CBMiXEFVX3lxTE13OVpWMEk1Z3hlMkR2bHNBQ2dkazFwb3VqN3hCa29GWGJvSVlPa00zd2xUakRmYXFqQmc5OWU0eGl4a21FMDAwWUN2Q3p0M3FrbXBkNV8zN0cxaG1s?oc=5" target="_blank">Illinois Tech computer science researcher honored by IEEE Chicago Section</a> <font color="#6f6f6f">EurekAlert!</font>

My Journey to becoming a Quantum Engineer
<p>I have procrastinated on documenting this process for the longest time. But I think i am ready now (maybe). <br> Coming from a front end engineering background, I am fascinated by the work being done by the quantum engineers at IBM. I am not that great with maths and statistics but I believe anything can be learned with tons of practice and consistency. I want to use this platform to hold myself accountable (that is if i don't give up half way and delete all my posts. I'll try not to btw). </p> <p>This is an article describing <a href="https://www.ibm.com/think/topics/quantum-computing" rel="noopener noreferrer">what quantum computing is</a> and some of it's use cases.</p> <p>I became an IBM qiskit advocate late last year and I have been exposed to a lot of resources and networked a bun
Google offers researchers early access to Willow quantum processor
The Early Access Program invites researchers to design and propose quantum experiments that push the boundaries of what current hardware can achieve. It is a selective program – the processor will not be publicly available – and Google is setting firm deadlines for participation. Research teams have until May 15,... Read Entire Article
Knowledge Map
Connected Articles — Knowledge Graph
This article is connected to other articles through shared AI topics and tags.
More in Research Papers
Illinois Tech computer science researcher honored by IEEE Chicago Section - EurekAlert!
<a href="https://news.google.com/rss/articles/CBMiXEFVX3lxTE13OVpWMEk1Z3hlMkR2bHNBQ2dkazFwb3VqN3hCa29GWGJvSVlPa00zd2xUakRmYXFqQmc5OWU0eGl4a21FMDAwWUN2Q3p0M3FrbXBkNV8zN0cxaG1s?oc=5" target="_blank">Illinois Tech computer science researcher honored by IEEE Chicago Section</a> <font color="#6f6f6f">EurekAlert!</font>
AI maps science papers to predict research trends two to three years ahead - Tech Xplore
<a href="https://news.google.com/rss/articles/CBMie0FVX3lxTE5aTkZYTWdaRDZwTXNRMldpMG1WZ1YzWDZTOHN5M183Z3A1ZTFYbnhEWTdPRmpvZnZFU0xodlRsNWxFaGxTcEpwalhJNmJpQWE5VjhaRS1tOXJIeTc5Z0JNblJ3dFd4WjRYZGJOX0NrWGt6ZmZJVTBpRm5wWQ?oc=5" target="_blank">AI maps science papers to predict research trends two to three years ahead</a> <font color="#6f6f6f">Tech Xplore</font>
AI inspires new research topics in materials science - Nanowerk
<a href="https://news.google.com/rss/articles/CBMiZ0FVX3lxTFBPWlJSM2ExeVQ3LVppTm45NHpEMW9YVkxscThCNDd2OVB0c3J1ZmVCbWNSZWZ0TjZwSzlOdEFXN2UtRk5LU1hxdXd4ZklldGxoM0FZSnhCd19PWkNHQ1ZRVDNwSHNUSk0?oc=5" target="_blank">AI inspires new research topics in materials science</a> <font color="#6f6f6f">Nanowerk</font>

Discussion
Sign in to join the discussion
No comments yet — be the first to share your thoughts!