Policy Gradient Algorithms

Lilian Weng Blog · April 8, 2018

In this post, we take a deep look at policy gradient methods, why they work, and many new policy gradient algorithms proposed in recent years: vanilla policy gradient, actor-critic, off-policy actor-critic, A3C, A2C, DPG, DDPG, D4PG, MADDPG, TRPO, PPO, ACER, ACKTR, SAC, TD3, and SVPG.

[Updated on 2018-06-30: added two new policy gradient methods, SAC and D4PG.]
[Updated on 2018-09-30: added a new policy gradient method, TD3.]
[Updated on 2019-02-09: added SAC with automatically adjusted temperature.]
