
The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding

arXiv, March 31, 2026

arXiv:2512.19693v3 (announce type: replace)
Authors: Weichen Fan, Haiwen Diao, Quan Wang, Dahua Lin, Ziwei Liu


Abstract: Deep representations across modalities are inherently intertwined. In this paper, we systematically analyze the spectral characteristics of various semantic and pixel encoders. Our study uncovers a rarely explored correspondence between an encoder's feature spectrum and its functional role: semantic encoders primarily capture low-frequency components that encode abstract meaning, whereas pixel encoders additionally retain high-frequency information that conveys fine-grained detail. This heuristic finding offers a unifying perspective that ties encoder behavior to its underlying spectral structure. We call it the Prism Hypothesis: each data modality can be viewed as a projection of the natural world onto a shared feature spectrum, just like a prism. Building on this insight, we propose Unified Autoencoding (UAE), a model that harmonizes semantic structure and pixel details via a frequency-band modulator, enabling the two to coexist. Extensive experiments demonstrate that UAE effectively unifies semantic abstraction and pixel-level fidelity within a single latent space, achieving state-of-the-art performance. Moreover, we show that UAE can be directly applied to pixel-space modeling, significantly improving both FID and IS over the vanilla JIT baseline. Our code is available at: this https URL.
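To make the low-frequency versus high-frequency distinction concrete, the sketch below splits a 2D map into radial frequency bands with NumPy's FFT and reports how much energy falls in each band. This is only an illustration of the kind of spectral analysis the abstract describes; it is not the paper's frequency-band modulator or the UAE model. The function names, the cutoff values, and the synthetic test maps are all assumptions made for this example.

```python
import numpy as np


def radial_frequency_grid(height, width):
    """Radial frequency (cycles per sample) of every bin of a 2D FFT."""
    fy = np.fft.fftfreq(height)[:, None]
    fx = np.fft.fftfreq(width)[None, :]
    return np.sqrt(fy ** 2 + fx ** 2)


def split_into_bands(feature_map, cutoffs):
    """Split a real 2D map into frequency bands that sum back to the input.

    `cutoffs` is an increasing list of radial frequencies separating the
    bands. The values used below are hypothetical, not taken from the paper.
    """
    height, width = feature_map.shape
    radius = radial_frequency_grid(height, width)
    spectrum = np.fft.fft2(feature_map)
    edges = [0.0, *cutoffs, np.inf]
    bands = []
    for low, high in zip(edges[:-1], edges[1:]):
        # Each FFT bin belongs to exactly one band, so the masks partition
        # the spectrum and the bands add up to the original map.
        mask = (radius >= low) & (radius < high)
        bands.append(np.fft.ifft2(spectrum * mask).real)
    return bands


def band_energy_fractions(feature_map, cutoffs):
    """Fraction of total energy carried by each band, lowest band first."""
    bands = split_into_bands(feature_map, cutoffs)
    energies = np.array([np.sum(band ** 2) for band in bands])
    return energies / energies.sum()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y, x = np.mgrid[0:64, 0:64]
    # Stand-in for a "semantic-like" map: smooth, low-frequency structure.
    smooth = np.sin(2 * np.pi * x / 64) + np.cos(2 * np.pi * y / 64)
    # Stand-in for a "pixel-like" map: the same structure plus fine detail.
    detailed = smooth + 0.5 * rng.standard_normal((64, 64))

    cutoffs = [0.1, 0.25]  # hypothetical band edges
    print("smooth  :", band_energy_fractions(smooth, cutoffs))
    print("detailed:", band_energy_fractions(detailed, cutoffs))
```

In this toy setting the smooth map concentrates its energy in the lowest band, while the detailed map spreads a visible share into the higher bands. Real encoder features would need to be analyzed per channel and per layer, which this sketch does not attempt.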

Comments: Code link: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)

Cite as: arXiv:2512.19693 [cs.CV] (or arXiv:2512.19693v3 [cs.CV] for this version)

DOI: https://doi.org/10.48550/arXiv.2512.19693 (arXiv-issued DOI via DataCite)

Submission history

From: Weichen Fan
[v1] Mon, 22 Dec 2025 18:59:57 UTC (2,410 KB)
[v2] Fri, 23 Jan 2026 04:55:24 UTC (2,410 KB)
[v3] Sun, 29 Mar 2026 13:58:03 UTC (2,274 KB)
