Integrating Deep RL and Bayesian Inference for ObjectNav in Mobile Robotics
Authors: João Castelo-Branco, José Santos-Victor, Alexandre Bernardino
Abstract: Autonomous object search is challenging for mobile robots operating in indoor environments due to partial observability, perceptual uncertainty, and the need to trade off exploration and navigation efficiency. Classical probabilistic approaches explicitly represent uncertainty but typically rely on handcrafted action-selection heuristics, while deep reinforcement learning enables adaptive policies but often suffers from slow convergence and limited interpretability. This paper proposes a hybrid object-search framework that integrates Bayesian inference with deep reinforcement learning. The method maintains a spatial belief map over target locations, updated online through Bayesian inference from calibrated object detections, and trains a reinforcement learning policy to select navigation actions directly from this probabilistic representation. The approach is evaluated in realistic indoor simulation using Habitat 3.0 and compared against developed baseline strategies. Across two indoor environments, the proposed method improves success rate while reducing search effort. Overall, the results support the value of combining Bayesian belief estimation with learned action selection to achieve more efficient and reliable object-search behavior under partial observability.
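As a rough illustration of the belief-map idea described in the abstract, the sketch below applies one Bayes update to a grid of target-location probabilities. It is not the authors' implementation: the function name `update_belief`, the single-target assumption, and the detector rates `p_tp` and `p_fp` are hypothetical choices made only for this example.

```python
import numpy as np


def update_belief(belief, visible, detected_cell=None, p_tp=0.8, p_fp=0.05):
    """Return the posterior belief map after one observation.

    belief        : 2D array, prior P(target in cell), sums to 1.
    visible       : 2D boolean mask of cells in the current field of view.
    detected_cell : (row, col) of a detection, or None if nothing was detected.
    p_tp, p_fp    : assumed detector true-positive and false-positive rates.
    """
    if detected_cell is None:
        # No detection: a visible cell holding the target would usually have
        # fired, so visible cells become less likely than unseen ones.
        likelihood = np.where(visible, 1.0 - p_tp, 1.0 - p_fp)
    else:
        # Detection: likely a true positive if the target is in that cell,
        # otherwise it has to be explained as a false positive.
        likelihood = np.full(belief.shape, p_fp)
        likelihood[detected_cell] = p_tp

    posterior = likelihood * belief      # Bayes rule, unnormalized
    return posterior / posterior.sum()   # renormalize to a distribution


# Usage: uniform prior over a 1x4 corridor, robot sees the first two cells.
prior = np.full((1, 4), 0.25)
fov = np.array([[True, True, False, False]])
after_miss = update_belief(prior, fov)               # nothing detected
after_hit = update_belief(prior, fov, (0, 1))        # detection in cell (0, 1)
```

In the paper's framing, a map like this would be the input from which a learned policy chooses navigation actions; the sketch covers only the belief update, not the policy.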
Comments: Accepted and to be published in the ICARSC 2026 26th IEEE International Conference on Autonomous Robot Systems and Competitions
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:2603.25366 [cs.RO]
(or arXiv:2603.25366v1 [cs.RO] for this version)
https://doi.org/10.48550/arXiv.2603.25366
arXiv-issued DOI via DataCite (pending registration)
Submission history
From: João Castelo-Branco [v1] Thu, 26 Mar 2026 12:15:12 UTC (6,753 KB)

