Rethinking Structure Preservation in Text-Guided Image Editing with Visual Autoregressive Models
arXiv:2603.28367v1 Announce Type: new
Authors: Tao Xia, Jiawei Liu, Yukun Zhang, Ting Liu, Wei Wang, Lei Zhang
Abstract: Visual autoregressive (VAR) models have recently emerged as a promising family of generative models, enabling a wide range of downstream vision tasks such as text-guided image editing. By shifting the editing paradigm from noise manipulation in diffusion-based methods to token-level operations, VAR-based approaches achieve better background preservation and significantly faster inference. However, existing VAR-based editing methods still face two key challenges: accurately localizing editable tokens and maintaining structural consistency in the edited results. In this work, we propose a novel text-guided image editing framework rooted in an analysis of intermediate feature distributions within VAR models. First, we introduce a coarse-to-fine token localization strategy that refines editable regions, balancing editing fidelity and background preservation. Second, we analyze the intermediate representations of VAR models and identify structure-related features, from which we design a simple yet effective feature injection mechanism to enhance structural consistency between the edited and source images. Third, we develop a reinforcement learning-based adaptive feature injection scheme that automatically learns scale- and layer-specific injection ratios to jointly optimize editing fidelity and structure preservation. Extensive experiments demonstrate that our method achieves superior structural consistency and editing quality compared with state-of-the-art approaches, across both local and global editing scenarios.
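The abstract does not say how source features are injected, so the following is only a minimal sketch under a stated assumption: that injection is a linear blend between a source feature and an edited feature, with one ratio per (scale, layer) pair. The names `inject_features`, `Feature`, and `Key`, the flattened-list feature layout, and the example ratio values are all hypothetical and are not taken from the paper. In the paper the ratios are learned by reinforcement learning; here they are fixed by hand purely for illustration.

```python
from typing import Dict, List, Tuple

Feature = List[float]   # one flattened feature vector (hypothetical layout)
Key = Tuple[int, int]   # (scale index, layer index)


def inject_features(
    source: Dict[Key, Feature],
    edited: Dict[Key, Feature],
    ratios: Dict[Key, float],
) -> Dict[Key, Feature]:
    """Blend source features into edited features, one ratio per (scale, layer).

    A ratio of 0.0 keeps the edited feature untouched; 1.0 copies the source.
    Keys missing from `ratios` default to 0.0 (no injection).
    Assumption: injection is a linear blend; the paper may use another rule.
    """
    blended: Dict[Key, Feature] = {}
    for key, edit_feat in edited.items():
        r = ratios.get(key, 0.0)
        if not 0.0 <= r <= 1.0:
            raise ValueError(f"ratio for {key} must lie in [0, 1], got {r}")
        src_feat = source[key]
        if len(src_feat) != len(edit_feat):
            raise ValueError(f"feature length mismatch at {key}")
        blended[key] = [r * s + (1.0 - r) * e for s, e in zip(src_feat, edit_feat)]
    return blended


# Toy features for two scales, one layer each (example values only).
source = {(0, 0): [1.0, 1.0], (1, 0): [2.0, 2.0]}
edited = {(0, 0): [0.0, 0.0], (1, 0): [0.0, 4.0]}
ratios = {(0, 0): 1.0, (1, 0): 0.5}
result = inject_features(source, edited, ratios)
```

With these toy values, the first entry is copied entirely from the source and the second is an even mix of source and edited features. A higher ratio favors structure preservation and a lower one favors the edit, which is the trade-off the adaptive scheme is described as optimizing.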
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:2603.28367 [cs.CV]
(or arXiv:2603.28367v1 [cs.CV] for this version)
https://doi.org/10.48550/arXiv.2603.28367
arXiv-issued DOI via DataCite (pending registration)
Submission history
From: Tao Xia [v1] Mon, 30 Mar 2026 12:35:33 UTC (7,417 KB)