
UniSER: A Foundation Model for Unified Soft Effects Removal

arXiv | Submitted on 18 Nov 2025 (v1), last revised 27 Mar 2026 (this version, v2) | March 30, 2026


Authors: Jingdong Zhang, Lingzhi Zhang, Qing Liu, Mang Tik Chiu, Connelly Barnes, Yizhou Wang, Haoran You, Xiaoyang Liu, Yuqian Zhou, Zhe Lin, Eli Shechtman, Sohrab Amirghodsi, Xin Li, Wenping Wang, Xiaohang Zhan


Abstract: Digital images are often degraded by soft effects such as lens flare, haze, shadows, and reflections, which reduce aesthetics even though the underlying pixels remain partially visible. The prevailing works address these degradations in isolation, developing highly specialized, specialist models that lack scalability and fail to exploit the shared underlying essences of these restoration problems. Meanwhile, although recent large-scale generalist models (e.g., GPT-4o, Flux Kontext, Nano Banana) offer powerful text-driven editing capabilities, they heavily rely on detailed prompts and often fail to achieve robust removal on such fine-grained tasks while preserving the scene's identity. Leveraging the common essence of soft effects, i.e., semi-transparent occlusions, we introduce a foundational versatile model UniSER, capable of addressing diverse degradations caused by soft effects within a single framework. Our methodology centers on curating a massive 3.8M-pair dataset to ensure robustness and generalization, which includes novel, physically-plausible data to fill critical gaps in public benchmarks, and a tailored training pipeline that fine-tunes a Diffusion Transformer to learn robust restoration priors from this diverse data, integrating fine-grained mask and strength controls. This synergistic approach allows UniSER to significantly outperform both specialist and generalist models, achieving robust, high-fidelity restoration in the wild.
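The abstract describes soft effects as semi-transparent occlusions but does not give an image-formation model. As a rough illustration only, one common way to think about such an occlusion is alpha compositing: each degraded pixel is a blend of the clean scene and an effect layer. The sketch below synthesizes a degraded image that way. The function name `composite_soft_effect`, the `mask` and `strength` parameters, and the blending formula are assumptions made for this example; they are not UniSER's interface or data pipeline, and the model's mask and strength controls apply to removal rather than synthesis.

```python
from typing import List

# Hypothetical representation: a grayscale image as rows of floats in [0, 1].
Image = List[List[float]]


def composite_soft_effect(
    background: Image, effect: Image, mask: Image, strength: float
) -> Image:
    """Blend `effect` over `background` with per-pixel opacity mask * strength.

    Assumed model (not taken from the paper):
        degraded = (1 - alpha) * background + alpha * effect
    where alpha = mask * strength.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0, 1]")
    degraded: Image = []
    for b_row, e_row, m_row in zip(background, effect, mask):
        row = []
        for b, e, m in zip(b_row, e_row, m_row):
            alpha = m * strength  # local opacity of the occlusion
            row.append((1.0 - alpha) * b + alpha * e)
        degraded.append(row)
    return degraded


# Toy 2x2 example: a white "flare" layer covering the left column.
background = [[0.2, 0.4], [0.6, 0.8]]
effect = [[1.0, 1.0], [1.0, 1.0]]
mask = [[1.0, 0.0], [0.5, 0.0]]
degraded = composite_soft_effect(background, effect, mask, strength=0.5)
```

Because the opacity stays below 1 wherever `mask * strength < 1`, the background still contributes to every degraded pixel, which matches the abstract's observation that the underlying pixels remain partially visible.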

Subjects: Computer Vision and Pattern Recognition (cs.CV)

Cite as: arXiv:2511.14183 [cs.CV]

(or arXiv:2511.14183v2 [cs.CV] for this version)

https://doi.org/10.48550/arXiv.2511.14183

arXiv-issued DOI via DataCite

Submission history

From: Jingdong Zhang
[v1] Tue, 18 Nov 2025 06:39:39 UTC (43,492 KB)
[v2] Fri, 27 Mar 2026 07:15:47 UTC (55,091 KB)

Original source

arXiv

https://arxiv.org/abs/2511.14183
