
Scale-Adaptive Balancing of Exploration and Exploitation in Classical Planning

arXiv, March 30, 2026

arXiv:2305.09840v4 (announce type: replace)
Authors: Stephen Wissow, Masataro Asai


Abstract: Balancing exploration and exploitation has been an important problem in both game tree search and automated planning. However, while the problem has been extensively analyzed within the Multi-Armed Bandit (MAB) literature, the planning community has had limited success when attempting to apply those results. We show that a more detailed theoretical understanding of the MAB literature helps improve existing planning algorithms that are based on Monte Carlo Tree Search (MCTS) / Trial Based Heuristic Tree Search (THTS). In particular, THTS uses the UCB1 MAB algorithm in an ad hoc manner, as UCB1's theoretical requirement of reward distributions with fixed bounded support is not satisfied within heuristic search for classical planning. The core issue lies in UCB1's lack of adaptation to the different scales of the rewards. We propose GreedyUCT-Normal, an MCTS/THTS algorithm with the UCB1-Normal bandit for agile classical planning, which handles distributions with different scales by taking the reward variance into consideration. This results in improved algorithmic performance (more plans found with fewer node expansions), outperforming Greedy Best First Search and existing MCTS/THTS-based algorithms (GreedyUCT, GreedyUCT*).
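The scale issue named in the abstract can be illustrated with the two bandit indices as they are usually stated in the MAB literature (Auer, Cesa-Bianchi, and Fischer, 2002). The following is a minimal sketch, not the paper's planner: the function names are hypothetical, rewards are assumed to be maximized, and UCB1-Normal's additional forced-exploration rule for rarely played arms is omitted.

```python
import math


def ucb1_index(rewards, total_pulls):
    """UCB1 index for one arm: mean + sqrt(2 ln n / n_j).

    The exploration bonus does not depend on the reward values at all,
    which is why UCB1 assumes rewards with a fixed bounded support.
    """
    n_j = len(rewards)
    mean = sum(rewards) / n_j
    return mean + math.sqrt(2.0 * math.log(total_pulls) / n_j)


def ucb1_normal_index(rewards, total_pulls):
    """UCB1-Normal index for one arm:
    mean + sqrt(16 * sample_variance * ln(n - 1) / n_j).

    Requires at least two samples for the arm and total_pulls >= 2.
    The bonus is proportional to the sample standard deviation, so it
    follows the scale of the observed rewards.
    """
    n_j = len(rewards)
    mean = sum(rewards) / n_j
    sample_var = sum((r - mean) ** 2 for r in rewards) / (n_j - 1)
    return mean + math.sqrt(16.0 * sample_var * math.log(total_pulls - 1) / n_j)


def exploration_bonus(index_fn, rewards, total_pulls):
    """Hypothetical helper: the part of the index added on top of the mean."""
    return index_fn(rewards, total_pulls) - sum(rewards) / len(rewards)


# Illustrative data only: the same arm observed at two different reward scales.
small_scale = [0.2, 0.5, 0.9, 0.4]
large_scale = [100.0 * r for r in small_scale]

for label, rewards in (("small", small_scale), ("large", large_scale)):
    print(
        label,
        exploration_bonus(ucb1_index, rewards, total_pulls=20),
        exploration_bonus(ucb1_normal_index, rewards, total_pulls=20),
    )
```

Multiplying every reward by 100 leaves the UCB1 bonus unchanged while the mean grows a hundredfold, so exploration becomes negligible relative to exploitation. The UCB1-Normal bonus grows by the same factor as the rewards, which keeps the two terms in proportion.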

Comments: Outstanding paper award in ECAI 2024

Subjects: Artificial Intelligence (cs.AI)

Cite as: arXiv:2305.09840 [cs.AI] (or arXiv:2305.09840v4 [cs.AI] for this version)

https://doi.org/10.48550/arXiv.2305.09840

arXiv-issued DOI via DataCite

Submission history

From: Masataro Asai
[v1] Tue, 16 May 2023 22:46:37 UTC (249 KB)
[v2] Mon, 3 Jul 2023 20:00:03 UTC (1,744 KB)
[v3] Fri, 30 Aug 2024 15:57:01 UTC (3,722 KB)
[v4] Thu, 26 Mar 2026 19:23:28 UTC (3,723 KB)
