Robust Multi-Agent Reinforcement Learning for Small UAS Separation Assurance under GPS Degradation and Spoofing
Abstract: We address robust separation assurance for small Unmanned Aircraft Systems (sUAS) under GPS degradation and spoofing via Multi-Agent Reinforcement Learning (MARL). In cooperative surveillance, each aircraft (or agent) broadcasts its GPS-derived position; when such position broadcasts are corrupted, the entire observed air traffic state becomes unreliable. We cast this state observation corruption as a zero-sum game between the agents and an adversary: with probability R, the adversary perturbs the observed state to maximally degrade each agent's safety performance. We derive a closed-form expression for this adversarial perturbation, bypassing adversarial training entirely and enabling linear-time evaluation in the state dimension. We show that this expression approximates the true worst-case adversarial perturbation with second-order accuracy. We further bound the safety performance gap between clean and corrupted observations, showing that it degrades at most linearly with the corruption probability under Kullback-Leibler regularization. Finally, we integrate the closed-form adversarial policy into a MARL policy gradient algorithm to obtain a robust counter-policy for the agents. In a high-density sUAS simulation, we observe near-zero collision rates under corruption levels up to 35%, outperforming a baseline policy trained without adversarial perturbations.
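The abstract does not spell out the closed-form perturbation, so the following is only a minimal sketch of the corruption model it describes: with probability R, an adversary replaces the clean observation with a worst-case perturbed one, computable in time linear in the state dimension. As an illustrative stand-in (an assumption, not the paper's formula), we use a linear value function V(s) = w·s, for which the exact minimizer over an L2 perturbation ball has a closed form.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for an agent's learned value function: V(s) = w @ s.
# The paper's actual value network and closed-form expression are not
# given in the abstract; this linear model is purely illustrative.
w = rng.normal(size=8)

def worst_case_perturbation(w, eps):
    """Closed-form perturbation minimizing the linear value V(s) = w @ s.

    For linear V, the minimizer of V(s + delta) over ||delta||_2 <= eps is
    delta = -eps * w / ||w||, computable in O(d) time -- consistent with
    the abstract's linear-time claim (the paper's formula may differ).
    """
    return -eps * w / np.linalg.norm(w)

def corrupt_observation(s, w, eps=0.5, R=0.35):
    """With probability R, replace the clean observation by its
    adversarially perturbed counterpart; otherwise pass it through."""
    if rng.random() < R:
        return s + worst_case_perturbation(w, eps)
    return s

s = rng.normal(size=8)
delta = worst_case_perturbation(w, eps=0.5)
assert np.isclose(np.linalg.norm(delta), 0.5)  # budget is saturated
assert w @ (s + delta) < w @ s                 # value strictly degraded
```

Under this toy model, a robust counter-policy would be trained against `corrupt_observation` rather than the clean state stream, mirroring the paper's integration of the adversarial policy into MARL policy-gradient training.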
Comments: This work has been submitted to the IEEE for possible publication
Subjects:
Robotics (cs.RO); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Systems and Control (eess.SY)
Cite as: arXiv:2603.28900 [cs.RO]
(or arXiv:2603.28900v1 [cs.RO] for this version)
https://doi.org/10.48550/arXiv.2603.28900
arXiv-issued DOI via DataCite (pending registration)
Submission history
From: Alex Zongo [view email] [v1] Mon, 30 Mar 2026 18:26:59 UTC (686 KB)