ConceptPrism: Concept Disentanglement in Personalized Diffusion Models via Residual Token Optimization
arXiv:2602.19575v2 Announce Type: replace
Authors: Minseo Kim, Minchan Kwon, Dongyeun Lee, Yunho Jeon, Junmo Kim
Abstract: Personalized text-to-image (T2I) generation has emerged as a key application for creating user-specific concepts from a few reference images. The core challenge is concept disentanglement: separating the target concept from irrelevant residual information. Lacking such disentanglement, capturing high-fidelity features often incorporates undesired attributes that conflict with user prompts, compromising the trade-off between concept fidelity and text alignment. While existing methods rely on manual guidance, they often fail to represent intricate visual details and lack scalability. We introduce ConceptPrism, a framework that extracts shared features exclusively through cross-image comparison without external information. We jointly optimize a target token and image-wise residual tokens via reconstruction and exclusion losses. By suppressing shared information in residual tokens, the exclusion loss creates an information vacuum that forces the target token to capture the common concept. Extensive evaluations demonstrate that ConceptPrism achieves accurate concept disentanglement and significantly improves overall performance across diverse and complex visual concepts. The code is available at this https URL.
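The joint optimization described in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration, not the authors' implementation: images are plain vectors rather than diffusion latents, reconstruction is a squared error on `target + residual_i`, and the exclusion loss is stood in for by a penalty on the mean of the residual tokens (a simple proxy for "information shared across residuals"). The names `fit_tokens`, `lam`, `lr`, and `steps` are hypothetical.

```python
# Toy analogue of target/residual token optimization (illustrative only).
# Assumption: each "image" is a vector equal to a shared concept plus an
# image-specific residual. The real method operates on a diffusion model.

def fit_tokens(images, lam, lr=0.05, steps=500):
    """Jointly fit one shared target token and one residual token per image.

    Loss = sum_i ||target + residual_i - image_i||^2      (reconstruction)
         + lam * ||mean_i residual_i||^2                  (toy exclusion term)
    """
    n, d = len(images), len(images[0])
    target = [0.0] * d
    residuals = [[0.0] * d for _ in range(n)]
    for _ in range(steps):
        # Reconstruction error of each image: (target + residual_i) - image_i.
        errors = [[target[k] + residuals[i][k] - images[i][k] for k in range(d)]
                  for i in range(n)]
        # Information shared by the residual tokens: their mean across images.
        shared = [sum(residuals[i][k] for i in range(n)) / n for k in range(d)]
        # Gradient of the loss with respect to the target token.
        grad_t = [2 * sum(errors[i][k] for i in range(n)) for k in range(d)]
        for i in range(n):
            for k in range(d):
                # Reconstruction gradient plus the toy exclusion gradient.
                grad_r = 2 * errors[i][k] + 2 * lam * shared[k] / n
                residuals[i][k] -= lr * grad_r
        target = [target[k] - lr * grad_t[k] for k in range(d)]
    return target, residuals


# Four toy images sharing one concept; the image-specific parts sum to zero.
concept = [1.0, 2.0, 3.0]
specific = [[0.5, -0.5, 0.0], [-0.5, 0.5, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, -1.0]]
images = [[c + s for c, s in zip(concept, spec)] for spec in specific]

target_with, _ = fit_tokens(images, lam=10.0)    # with the toy exclusion term
target_without, _ = fit_tokens(images, lam=0.0)  # reconstruction only

print([round(v, 3) for v in target_with])
print([round(v, 3) for v in target_without])
```

In this toy setting, reconstruction alone leaves part of the shared concept inside the residual tokens (the target token converges to roughly 0.8 of the concept vector), while adding the penalty pushes the whole shared component into the target token. This mirrors the "information vacuum" intuition only at a toy level; the paper's actual loss definitions are not given in the abstract.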
Comments: Accepted to CVPR 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:2602.19575 [cs.CV]
(or arXiv:2602.19575v2 [cs.CV] for this version)
https://doi.org/10.48550/arXiv.2602.19575
arXiv-issued DOI via DataCite
Submission history
From: Minseo Kim
[v1] Mon, 23 Feb 2026 07:46:19 UTC (3,249 KB)
[v2] Mon, 30 Mar 2026 12:03:06 UTC (3,688 KB)