Cross-Scale MAE: A Tale of Multi-Scale Exploitation in Remote Sensing
arXiv:2401.15855v1 Announce Type: cross Abstract: Remote sensing images present unique challenges to image analysis due to the extensive geographic coverage, hardware limitations, and misaligned multi-scale images. This paper revisits the classical multi-scale representation learning problem but under the general framework of self-supervised learning for remote sensing image understanding. We present Cross-Scale MAE, a self-supervised model built upon the Masked Auto-Encoder (MAE).During pre-training, Cross-Scale MAE employs scale augmentation techniques and enforces cross-scale consistency constraints through both contrastive and generative losses to ensure consistent and meaningful representations well-suited for a wide range of downstream tasks. Further, our implementation leverages the
View PDF
Abstract:Remote sensing images present unique challenges to image analysis due to the extensive geographic coverage, hardware limitations, and misaligned multi-scale images. This paper revisits the classical multi-scale representation learning problem but under the general framework of self-supervised learning for remote sensing image understanding. We present Cross-Scale MAE, a self-supervised model built upon the Masked Auto-Encoder (MAE).During pre-training, Cross-Scale MAE employs scale augmentation techniques and enforces cross-scale consistency constraints through both contrastive and generative losses to ensure consistent and meaningful representations well-suited for a wide range of downstream tasks. Further, our implementation leverages the xFormers library to accelerate network pre-training on a single GPU while maintaining the quality of learned representations. Experimental evaluations demonstrate that Cross-Scale MAE exhibits superior performance compared to standard MAE and other state-of-the-art remote sensing MAE methods.
Subjects:
Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Image and Video Processing (eess.IV)
Cite as: arXiv:2401.15855 [cs.CV]
(or arXiv:2401.15855v1 [cs.CV] for this version)
https://doi.org/10.48550/arXiv.2401.15855
arXiv-issued DOI via DataCite
Submission history
From: Maofeng Tang [view email] [v1] Mon, 29 Jan 2024 03:06:19 UTC (950 KB)
Sign in to highlight and annotate this article

Conversation starters
Daily AI Digest
Get the top 5 AI stories delivered to your inbox every morning.
More about
modeltrainingannounce
From Broad Exploration to Stable Synthesis: Entropy-Guided Optimization for Autoregressive Image Generation
arXiv:2604.02355v1 Announce Type: new Abstract: Combining Chain-of-Thought (CoT) with Reinforcement Learning (RL) improves text-to-image (T2I) generation, yet the underlying interaction between CoT's exploration and RL's optimization remains unclear. We present a systematic entropy-based analysis that yields three key insights: (1) CoT expands the generative exploration space, while RL contracts it toward high-reward regions; (2) final reward is strongly negatively correlated with both the mean and variance of image-token entropy, highlighting the need to reduce uncertainty and instability; and (3) the entropy of the textual CoT directly governs downstream image quality, with lower-entropy CoTs leading to better generations. Motivated by these findings, we propose Entropy-Guided Group Rela

Prism: Policy Reuse via Interpretable Strategy Mapping in Reinforcement Learning
arXiv:2604.02353v1 Announce Type: new Abstract: We present PRISM (Policy Reuse via Interpretable Strategy Mapping), a framework that grounds reinforcement learning agents' decisions in discrete, causally validated concepts and uses those concepts as a zero-shot transfer interface between agents trained with different algorithms. PRISM clusters each agent's encoder features into $K$ concepts via K-means. Causal intervention establishes that these concepts directly drive - not merely correlate with - agent behavior: overriding concept assignments changes the selected action in 69.4% of interventions ($p = 8.6 \times 10^{-86}$, 2500 interventions). Concept importance and usage frequency are dissociated: the most-used concept (C47, 33.0% frequency) causes only a 9.4% win-rate drop when ablated

YC Bench: a Live Benchmark for Forecasting Startup Outperformance in Y Combinator Batches
arXiv:2604.02378v1 Announce Type: new Abstract: Forecasting startup success is notoriously difficult, partly because meaningful outcomes, such as exits, large funding rounds, and sustained revenue growth, are rare and can take years to materialize. As a result, signals are sparse and evaluation cycles are slow. Y Combinator batches offer a unique mitigation: each batch comprises around 200 startups, funded simultaneously, with evaluation at Demo Day only three months later. We introduce YC Bench, a live benchmark for forecasting early outperformance within YC batches. Using the YC W26 batch as a case study (196 startups), we measure outperformance with a Pre-Demo Day Score, a KPI combining publicly available traction signals and web visibility. This short-term metric enables rapid evaluati
Knowledge Map
Connected Articles — Knowledge Graph
This article is connected to other articles through shared AI topics and tags.
More in Models

Modeling and Controlling Deployment Reliability under Temporal Distribution Shift
arXiv:2604.02351v1 Announce Type: new Abstract: Machine learning models deployed in non-stationary environments are exposed to temporal distribution shift, which can erode predictive reliability over time. While common mitigation strategies such as periodic retraining and recalibration aim to preserve performance, they typically focus on average metrics evaluated at isolated time points and do not explicitly model how reliability evolves during deployment. We propose a deployment-centric framework that treats reliability as a dynamic state composed of discrimination and calibration. The trajectory of this state across sequential evaluation windows induces a measurable notion of volatility, allowing deployment adaptation to be formulated as a multi-objective control problem that balances re

An Initial Exploration of Contrastive Prompt Tuning to Generate Energy-Efficient Code
arXiv:2604.02352v1 Announce Type: new Abstract: Although LLMs are capable of generating functionally correct code, they also tend to produce less energy-efficient code in comparison to human-written solutions. As these inefficiencies lead to higher computational overhead, they are in direct conflict with Green Software Development (GSD) efforts, which aim to reduce the energy consumption of code. To support these efforts, this study aims to investigate whether and how LLMs can be optimized to promote the generation of energy-efficient code. To this end, we employ Contrastive Prompt Tuning (CPT). CPT combines Contrastive Learning techniques, which help the model to distinguish between efficient and inefficient code, and Prompt Tuning, a Parameter-Efficient Fine Tuning (PEFT) approach that r

Differentiable Symbolic Planning: A Neural Architecture for Constraint Reasoning with Learned Feasibility
arXiv:2604.02350v1 Announce Type: new Abstract: Neural networks excel at pattern recognition but struggle with constraint reasoning -- determining whether configurations satisfy logical or physical constraints. We introduce Differentiable Symbolic Planning (DSP), a neural architecture that performs discrete symbolic reasoning while remaining fully differentiable. DSP maintains a feasibility channel (phi) that tracks constraint satisfaction evidence at each node, aggregates this into a global feasibility signal (Phi) through learned rule-weighted combination, and uses sparsemax attention to achieve exact-zero discrete rule selection. We integrate DSP into a Universal Cognitive Kernel (UCK) that combines graph attention with iterative constraint propagation. Evaluated on three constraint rea


Discussion
Sign in to join the discussion
No comments yet — be the first to share your thoughts!