
GraSP-STL: A Graph-Based Framework for Zero-Shot Signal Temporal Logic Planning via Offline Goal-Conditioned Reinforcement Learning

arXiv cs.RO · [Submitted on 31 Mar 2026] · April 1, 2026



Abstract: This paper studies offline, zero-shot planning under Signal Temporal Logic (STL) specifications. We assume access only to an offline dataset of state-action-state transitions collected by a task-agnostic behavior policy, with no analytical dynamics model, no further environment interaction, and no task-specific retraining. The objective is to synthesize a control strategy whose resulting trajectory satisfies an arbitrary unseen STL specification. To this end, we propose GraSP-STL, a graph-search-based framework for zero-shot STL planning from offline trajectories. The method learns a goal-conditioned value function from offline data and uses it to induce a finite-horizon reachability metric over the state space. Based on this metric, it constructs a directed graph abstraction whose nodes represent representative states and whose edges encode feasible short-horizon transitions. Planning is then formulated as a graph search over waypoint sequences, evaluated using arithmetic-geometric mean robustness and its interval semantics, and executed by a learned goal-conditioned policy. The proposed framework separates reusable reachability learning from task-conditioned planning, enabling zero-shot generalization to unseen STL tasks and long-horizon planning through the composition of short-horizon behaviors from offline data. Experimental results demonstrate its effectiveness on a range of offline STL planning tasks.
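The graph-abstraction step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: `reach_steps` is a hypothetical stand-in (plain Euclidean distance) for the reachability metric that the paper derives from a learned goal-conditioned value function, and the search below is an ordinary shortest-path search for a single "reach the goal" objective rather than the robustness-guided search over waypoint sequences the paper describes. All function names, node coordinates, and the `horizon` threshold are assumptions chosen for illustration.

```python
import heapq
import math


def reach_steps(a, b):
    """ASSUMPTION: stand-in for a learned reachability metric.

    Returns an estimated number of steps from state a to state b.
    Here it is plain Euclidean distance, not a learned value function.
    """
    return math.dist(a, b)


def build_graph(nodes, horizon):
    """Add a directed edge i -> j when j is estimated reachable within `horizon`."""
    graph = {i: [] for i in range(len(nodes))}
    for i, a in enumerate(nodes):
        for j, b in enumerate(nodes):
            if i == j:
                continue
            cost = reach_steps(a, b)
            if cost <= horizon:
                graph[i].append((j, cost))
    return graph


def shortest_waypoints(graph, start, goal):
    """Dijkstra search; returns a list of node indices, or None if unreachable."""
    best = {start: 0.0}
    parent = {}
    frontier = [(0.0, start)]
    while frontier:
        cost, node = heapq.heappop(frontier)
        if node == goal:
            # Walk parent pointers back to the start to recover the path.
            path = [node]
            while node in parent:
                node = parent[node]
                path.append(node)
            return path[::-1]
        if cost > best.get(node, math.inf):
            continue  # stale queue entry
        for nxt, step in graph[node]:
            new_cost = cost + step
            if new_cost < best.get(nxt, math.inf):
                best[nxt] = new_cost
                parent[nxt] = node
                heapq.heappush(frontier, (new_cost, nxt))
    return None


# Hypothetical representative states in a 2-D state space.
nodes = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (5.0, 5.0)]
graph = build_graph(nodes, horizon=1.5)
path = shortest_waypoints(graph, start=0, goal=3)
print(path)  # [0, 1, 3]
```

In this toy setup the long-horizon route from node 0 to node 3 is composed of two short-horizon hops, each of which a goal-conditioned policy would be asked to execute in turn. Node 4 lies beyond the horizon of every other node, so a search toward it returns `None`.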

Subjects: Robotics (cs.RO)

Cite as: arXiv:2603.29533 [cs.RO]

(or arXiv:2603.29533v1 [cs.RO] for this version)

https://doi.org/10.48550/arXiv.2603.29533

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Ancheng Hou
[v1] Tue, 31 Mar 2026 10:15:42 UTC (735 KB)
