
Automatic feature identification in least-squares policy iteration using the Koopman operator framework

arXiv · March 30, 2026

arXiv:2603.26464v1 (Announce Type: new)

Authors: Christian Mugisho Zagabe, Sebastian Petiz


Abstract: In this paper, we present a Koopman autoencoder-based least-squares policy iteration (KAE-LSPI) algorithm in reinforcement learning (RL). The KAE-LSPI algorithm is based on reformulating the so-called least-squares fixed-point approximation method in terms of extended dynamic mode decomposition (EDMD), thereby enabling automatic feature learning via the Koopman autoencoder (KAE) framework. The approach is motivated by the lack of a systematic choice of features or kernels in linear RL techniques. We compare the KAE-LSPI algorithm with two previous works, the classical least-squares policy iteration (LSPI) and the kernel-based least-squares policy iteration (KLSPI), using stochastic chain walk and inverted pendulum control problems as examples. Unlike previous works, no features or kernels need to be fixed a priori in our approach. Empirical results show the number of features learned by the KAE technique remains reasonable compared to those fixed in the classical LSPI algorithm. The convergence to an optimal or a near-optimal policy is also comparable to the other two methods.
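The link between the least-squares fixed-point approximation and EDMD mentioned in the abstract can be illustrated with a small sketch. It is not the paper's KAE-LSPI algorithm: the features are fixed by hand (as in classical LSPI) rather than learned by an autoencoder, and the chain dynamics, reward, feature map, and all function names are assumptions made for illustration. The sketch relies only on a standard algebraic identity: if Φ and Φ' stack the features of sampled (s, a) pairs and of their successors under the current policy, the fixed-point weights solving Φᵀ(Φ − γΦ')w = Φᵀr also solve (I − γK)w = r̂, where K is the EDMD-style least-squares matrix with ΦK ≈ Φ' and r̂ is the least-squares fit of the rewards in feature space (assuming Φ has full column rank).

```python
import numpy as np

N_STATES, N_ACTIONS, GAMMA = 4, 2, 0.9  # illustrative values, not taken from the paper

def features(s, a):
    """Hand-picked features [1, s, s^2], one block per action (an assumption)."""
    phi = np.zeros(3 * N_ACTIONS)
    phi[3 * a:3 * a + 3] = [1.0, s, s * s]
    return phi

def step(s, a, rng, p_success=0.9):
    """Noisy chain: action 0 moves left, 1 moves right; the move is reversed w.p. 0.1."""
    direction = 1 if a == 1 else -1
    if rng.random() > p_success:
        direction = -direction
    s_next = min(max(s + direction, 0), N_STATES - 1)
    reward = 1.0 if s_next in (1, 2) else 0.0  # hypothetical reward on the middle states
    return s_next, reward

def collect(n, rng):
    """Roll out a uniformly random behaviour policy and record (s, a, r, s')."""
    samples, s = [], int(rng.integers(N_STATES))
    for _ in range(n):
        a = int(rng.integers(N_ACTIONS))
        s_next, r = step(s, a, rng)
        samples.append((s, a, r, s_next))
        s = s_next
    return samples

def design_matrices(samples, policy):
    """Stack features of (s, a), features of (s', policy(s')), and rewards row by row."""
    phi = np.array([features(s, a) for s, a, _, _ in samples])
    phi_next = np.array([features(sn, policy(sn)) for _, _, _, sn in samples])
    rewards = np.array([r for _, _, r, _ in samples])
    return phi, phi_next, rewards

def lstdq_direct(samples, policy):
    """Least-squares fixed point: solve Phi^T (Phi - gamma Phi') w = Phi^T r."""
    phi, phi_next, rewards = design_matrices(samples, policy)
    return np.linalg.solve(phi.T @ (phi - GAMMA * phi_next), phi.T @ rewards)

def lstdq_edmd(samples, policy):
    """Same fixed point via an EDMD-style matrix K with Phi K ~= Phi'."""
    phi, phi_next, rewards = design_matrices(samples, policy)
    K = np.linalg.lstsq(phi, phi_next, rcond=None)[0]     # least-squares operator estimate
    r_hat = np.linalg.lstsq(phi, rewards, rcond=None)[0]  # rewards projected onto features
    return np.linalg.solve(np.eye(K.shape[0]) - GAMMA * K, r_hat)

def greedy(w):
    """Policy that picks the action with the largest approximate Q-value."""
    return lambda s: int(np.argmax([features(s, a) @ w for a in range(N_ACTIONS)]))

def lspi(samples, solver, n_iter=20):
    """Policy iteration: evaluate the greedy policy, then improve, until weights settle."""
    w = np.zeros(3 * N_ACTIONS)
    for _ in range(n_iter):
        w_new = solver(samples, greedy(w))
        if np.allclose(w_new, w, atol=1e-8):
            break
        w = w_new
    return w

rng = np.random.default_rng(0)
samples = collect(2000, rng)
always_right = lambda s: 1  # arbitrary fixed policy used to compare the two solvers
w_direct = lstdq_direct(samples, always_right)
w_edmd = lstdq_edmd(samples, always_right)
w_lspi = lspi(samples, lstdq_edmd)
policy = [greedy(w_lspi)(s) for s in range(N_STATES)]
```

In this sketch the two solvers agree up to numerical error, so the choice of features is the only modelling decision left. That is the decision the abstract says the Koopman autoencoder takes over; how the learned features enter the algorithm is described in the paper itself and is not reproduced here.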

Comments: 6 pages

Subjects: Machine Learning (cs.LG); Dynamical Systems (math.DS)

Cite as: arXiv:2603.26464 [cs.LG] (or arXiv:2603.26464v1 [cs.LG] for this version)

DOI: https://doi.org/10.48550/arXiv.2603.26464 (arXiv-issued DOI via DataCite)

Submission history

From: Christian Mugisho Zagabe
[v1] Fri, 27 Mar 2026 14:31:31 UTC (3,176 KB)
