
ParaUni: Enhance Generation in Unified Multimodal Model with Reinforcement-driven Hierarchical Parallel Information Interaction

arXiv, March 31, 2026

Authors: Jiangtong Tan, Lin Liu, Jie Huanng, Xiaopeng Zhang, Qi Tian, Feng Zhao


Abstract: Unified multimodal models significantly improve visual generation by combining vision-language models (VLMs) with diffusion models. However, existing methods struggle to balance sufficient interaction with flexible implementation because of the vast difference in representations. Considering the abundant, hierarchical information in a VLM's layers, from low-level details to high-level semantics, we propose ParaUni. It extracts features from various VLM layers in a Parallel way for comprehensive information interaction and retains a flexible separation architecture to enhance generation in a Unified multimodal model. Concretely, visual features from all VLM layers are fed in parallel into a Layer Integration Module (LIM), which efficiently integrates fine-grained details and semantic abstractions and provides the fused representation as a condition to the diffusion model. To further enhance performance, we reveal that these hierarchical layers respond unequally to different rewards in Reinforcement Learning (RL). Crucially, we design a Layer-wise Dynamic Adjustment Mechanism (LDAM) to facilitate improvements across multiple rewards by aligning with the hierarchical properties of these layers using RL. Extensive experiments show that ParaUni leverages complementary multi-layer features to substantially improve generation quality and shows strong potential for advancing multiple rewards during the RL stages. Code is available at this https URL.
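The abstract does not describe the internals of the Layer Integration Module, so the following is only a minimal sketch of the general idea of fusing per-layer features into a single conditioning vector. It assumes a softmax-weighted sum over layers; the names fuse_layers and layer_scores are hypothetical and do not come from the paper or its code release.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_layers(layer_features, layer_scores):
    """Fuse per-layer feature vectors into one conditioning vector.

    ASSUMPTION: a softmax-weighted sum stands in for the paper's LIM,
    whose actual design is not given in the abstract.

    layer_features: list of L vectors, each of length D (one per VLM layer)
    layer_scores:   list of L unnormalized scores (hypothetical parameters)
    Returns (fused_vector, layer_weights).
    """
    if not layer_features:
        raise ValueError("need at least one layer")
    if len(layer_features) != len(layer_scores):
        raise ValueError("one score per layer is required")

    weights = softmax(layer_scores)
    dim = len(layer_features[0])
    fused = [0.0] * dim
    for weight, features in zip(weights, layer_features):
        if len(features) != dim:
            raise ValueError("all layers must share the same feature size")
        for i, value in enumerate(features):
            fused[i] += weight * value
    return fused, weights


# Toy example: a "low-level" layer and a "high-level" layer, equal scores.
fused, weights = fuse_layers([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
print(fused, weights)
```

In this sketch, raising one layer's score shifts the fused vector toward that layer's features. A layer-wise adjustment scheme could in principle act on such scores, but whether LDAM works this way is not stated in the abstract.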

Subjects: Computer Vision and Pattern Recognition (cs.CV)

Cite as: arXiv:2512.05422 [cs.CV] (or arXiv:2512.05422v2 [cs.CV] for this version)

https://doi.org/10.48550/arXiv.2512.05422

arXiv-issued DOI via DataCite

Submission history

From: Jiangtong Tan
[v1] Fri, 5 Dec 2025 04:41:57 UTC (1,300 KB)
[v2] Sat, 28 Mar 2026 13:02:42 UTC (1,278 KB)



