Transferring Physical Priors into Remote Sensing Segmentation via Large Language Models
arXiv:2603.27504v1 Announce Type: new
Authors: Yuxi Lu, Kunqi Li, Zhidong Li, Xiaohan Su, Biao Wu, Chenya Huang, Bin Liang
Abstract: Semantic segmentation of remote sensing imagery is fundamental to Earth observation. Achieving accurate results requires integrating not only optical images but also physical variables such as the Digital Elevation Model (DEM), Synthetic Aperture Radar (SAR), and Normalized Difference Vegetation Index (NDVI). Recent foundation models (FMs) leverage pre-training to exploit these variables but still depend on spatially aligned data and require costly retraining when new sensors are introduced. To overcome these limitations, we introduce a novel paradigm for integrating domain-specific physical priors into segmentation models. We first construct a Physical-Centric Knowledge Graph (PCKG) by prompting large language models to extract physical priors from 1,763 vocabularies, and use it to build a heterogeneous, spatially aligned dataset, Phy-Sky-SA. Building on this foundation, we develop PriorSeg, a physics-aware residual refinement model trained with a joint visual-physical strategy that incorporates a novel physics-consistency loss. Experiments in heterogeneous settings demonstrate that PriorSeg improves segmentation accuracy and physical plausibility without retraining the FMs. Ablation studies verify the effectiveness of the Phy-Sky-SA dataset, the PCKG, and the physics-consistency loss.
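As a rough illustration of what a physical variable and a physics-consistency term can look like, the sketch below computes NDVI with its standard definition, (NIR - Red) / (NIR + Red), and then applies a toy penalty to "vegetation" predictions on low-NDVI pixels. Only the NDVI formula is standard; the function `toy_consistency_penalty`, the threshold of 0.2, and the sample reflectances are hypothetical and are not taken from the paper. This is not the PriorSeg loss, whose form the abstract does not specify.

```python
from typing import List


def ndvi(nir: float, red: float, eps: float = 1e-6) -> float:
    """Standard NDVI: (NIR - Red) / (NIR + Red); eps avoids division by zero."""
    return (nir - red) / (nir + red + eps)


def toy_consistency_penalty(
    p_vegetation: List[float],
    ndvi_values: List[float],
    threshold: float = 0.2,  # hypothetical cut-off, chosen only for this example
) -> float:
    """Hypothetical penalty (an assumption, not the paper's loss).

    Sums the predicted vegetation probability over pixels whose NDVI falls
    below the threshold, then averages over all pixels.
    """
    total = 0.0
    for p, v in zip(p_vegetation, ndvi_values):
        if v < threshold:
            total += p
    return total / max(len(p_vegetation), 1)


# Made-up per-pixel reflectances and predicted vegetation probabilities.
nir = [0.50, 0.30, 0.05]
red = [0.10, 0.25, 0.04]
p_veg = [0.9, 0.8, 0.7]

ndvi_values = [ndvi(n, r) for n, r in zip(nir, red)]
penalty = toy_consistency_penalty(p_veg, ndvi_values)
print(ndvi_values, penalty)
```

In this toy setup, only the first pixel has NDVI above the threshold, so the confident vegetation predictions on the other two pixels are what contribute to the penalty.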
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:2603.27504 [cs.CV]
(or arXiv:2603.27504v1 [cs.CV] for this version)
https://doi.org/10.48550/arXiv.2603.27504
arXiv-issued DOI via DataCite (pending registration)
Submission history
From: Yuxi Lu
[v1] Sun, 29 Mar 2026 03:55:11 UTC (2,014 KB)