
Koopman-based surrogate modeling for reinforcement-learning-control of Rayleigh-Bénard convection

arXiv, March 31, 2026

Authors: Tim Plotzki, Sebastian Peitz


Abstract: Training reinforcement learning (RL) agents to control fluid dynamics systems is computationally expensive due to the high cost of direct numerical simulations (DNS) of the governing equations. Surrogate models offer a promising alternative by approximating the dynamics at a fraction of the computational cost, but their feasibility as training environments for RL is limited by distribution shifts, as policies induce state distributions not covered by the surrogate training data. In this work, we investigate the use of Linear Recurrent Autoencoder Networks (LRANs) for accelerating RL-based control of 2D Rayleigh-Bénard convection. We evaluate two training strategies: a surrogate trained on precomputed data generated with random actions, and a policy-aware surrogate trained iteratively using data collected from an evolving policy. Our results show that while surrogate-only training leads to reduced control performance, combining surrogates with DNS in a pretraining scheme recovers state-of-the-art performance while reducing training time by more than 40%. We demonstrate that policy-aware training mitigates the effects of distribution shift, enabling more accurate predictions in policy-relevant regions of the state space.
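To make the surrogate idea more concrete, the following is a minimal sketch of a Koopman-style linear recurrent autoencoder: a nonlinear encoder and decoder wrapped around linear latent dynamics. The abstract does not specify the architecture, so everything here is an assumption for illustration only: the class name `TinyLRAN`, the additive control term `B`, the layer sizes, and the random (untrained) weights are all hypothetical and not taken from the paper.

```python
import numpy as np


class TinyLRAN:
    """Toy linear recurrent autoencoder (illustrative, untrained).

    Assumed structure: z = encode(x), z_next = z @ K + a @ B, x_next = decode(z_next).
    """

    def __init__(self, state_dim, action_dim, latent_dim, hidden_dim=32, seed=0):
        rng = np.random.default_rng(seed)
        scale = 0.1
        # Encoder and decoder: one tanh hidden layer each, no biases.
        self.W_enc1 = rng.normal(scale=scale, size=(state_dim, hidden_dim))
        self.W_enc2 = rng.normal(scale=scale, size=(hidden_dim, latent_dim))
        self.W_dec1 = rng.normal(scale=scale, size=(latent_dim, hidden_dim))
        self.W_dec2 = rng.normal(scale=scale, size=(hidden_dim, state_dim))
        # Linear latent dynamics (Koopman-like matrix K) and assumed control matrix B.
        self.K = np.eye(latent_dim) + rng.normal(scale=0.01, size=(latent_dim, latent_dim))
        self.B = rng.normal(scale=scale, size=(action_dim, latent_dim))

    def encode(self, x):
        return np.tanh(x @ self.W_enc1) @ self.W_enc2

    def decode(self, z):
        return np.tanh(z @ self.W_dec1) @ self.W_dec2

    def rollout(self, x0, actions):
        """Encode once, step linearly in latent space, decode every step."""
        z = self.encode(x0)
        preds = []
        for a in actions:
            z = z @ self.K + a @ self.B
            preds.append(self.decode(z))
        return np.stack(preds)


def multi_step_loss(model, states, actions):
    """Mean squared error between a latent rollout and the reference trajectory.

    states has shape (T + 1, state_dim); actions has shape (T, action_dim).
    """
    preds = model.rollout(states[0], actions)
    return float(np.mean((preds - states[1:]) ** 2))
```

In a setup like the one the abstract describes, the reference trajectories passed to `multi_step_loss` would come from DNS, generated either with random actions or with the current policy in the policy-aware variant. The weights would then be fitted by gradient descent in an autodiff framework, which this sketch leaves out.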

Subjects: Machine Learning (cs.LG); Dynamical Systems (math.DS)

Cite as: arXiv:2603.28074 [cs.LG]

(or arXiv:2603.28074v1 [cs.LG] for this version)

https://doi.org/10.48550/arXiv.2603.28074

arXiv-issued DOI via DataCite

Submission history

From: Sebastian Peitz [v1] Mon, 30 Mar 2026 06:23:03 UTC (662 KB)
