
MPDiT: Multi-Patch Global-to-Local Transformer Architecture For Efficient Flow Matching and Diffusion Model

arXiv, March 30, 2026

arXiv:2603.26357v1
Authors: Quan Dao, Dimitris Metaxas

Abstract: Transformer architectures, particularly Diffusion Transformers (DiTs), have become widely used in diffusion and flow-matching models due to their strong performance compared to convolutional UNets. However, the isotropic design of DiTs processes the same number of patchified tokens in every block, which makes training relatively expensive. In this work, we introduce a multi-patch transformer design in which early blocks operate on larger patches to capture coarse global context, while later blocks use smaller patches to refine local details. This hierarchical design can reduce computational cost by up to 50% in GFLOPs while achieving good generative performance. In addition, we propose improved designs for time and class embeddings that accelerate training convergence. Extensive experiments on the ImageNet dataset demonstrate the effectiveness of our architectural choices. Code is released at this https URL
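To make the token-count argument in the abstract concrete, here is a minimal sketch of a per-block cost model. It is not the authors' code. The latent size, hidden width, block counts, and patch sizes are hypothetical values chosen for illustration, and the cost formula is a common rough estimate that ignores normalization, embeddings, conditioning, and whatever mechanism the paper uses to move from coarse to fine tokens.

```python
"""Rough cost model for a multi-patch transformer stack (illustrative only)."""


def num_tokens(latent_size: int, patch_size: int) -> int:
    """Tokens produced by patchifying a square latent_size x latent_size input."""
    if latent_size % patch_size != 0:
        raise ValueError("latent_size must be divisible by patch_size")
    return (latent_size // patch_size) ** 2


def block_macs(n_tokens: int, dim: int, mlp_ratio: int = 4) -> int:
    """Approximate multiply-accumulates for one transformer block.

    Counts the QKV and output projections (4 * n * d^2), the attention
    score and value products (2 * n^2 * d), and a two-layer MLP
    (2 * mlp_ratio * n * d^2). Everything else is ignored.
    """
    projections = 4 * n_tokens * dim * dim
    attention = 2 * n_tokens * n_tokens * dim
    mlp = 2 * mlp_ratio * n_tokens * dim * dim
    return projections + attention + mlp


def stack_macs(schedule, latent_size: int, dim: int) -> int:
    """Total MACs for a list of (num_blocks, patch_size) stages."""
    return sum(
        blocks * block_macs(num_tokens(latent_size, patch), dim)
        for blocks, patch in schedule
    )


# Hypothetical settings, not taken from the paper.
LATENT_SIZE = 32                      # assumed 32x32 latent grid
DIM = 1152                            # assumed hidden width
ISOTROPIC = [(28, 2)]                 # every block sees patch size 2
MULTI_PATCH = [(14, 4), (14, 2)]      # coarse patches first, fine patches later


def cost_ratio() -> float:
    """Multi-patch cost divided by isotropic cost under the settings above."""
    iso = stack_macs(ISOTROPIC, LATENT_SIZE, DIM)
    multi = stack_macs(MULTI_PATCH, LATENT_SIZE, DIM)
    return multi / iso


if __name__ == "__main__":
    print(f"coarse tokens: {num_tokens(LATENT_SIZE, 4)}")
    print(f"fine tokens:   {num_tokens(LATENT_SIZE, 2)}")
    print(f"multi-patch / isotropic cost: {cost_ratio():.3f}")
```

Doubling the patch size cuts the token count by four, so each coarse block costs a fraction of a fine block, and the overall saving depends on how many blocks run at the coarse resolution. The "up to 50%" figure is the paper's own claim and is not derived from this sketch.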

Comments: Accepted at CVPR 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)

Cite as: arXiv:2603.26357 [cs.CV] (or arXiv:2603.26357v1 [cs.CV] for this version)

https://doi.org/10.48550/arXiv.2603.26357

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Quan Dao
[v1] Fri, 27 Mar 2026 12:30:10 UTC (5,241 KB)
