Neuro-Cognitive Reward Modeling for Human-Centered Autonomous Vehicle Control
arXiv:2603.25968v1 Announce Type: new Abstract: Recent advancements in computer vision have accelerated the development of autonomous driving. Despite these advancements, training machines to drive in a way that aligns with human expectations remains a significant challenge. Human factors are still essential, as humans possess a sophisticated cognitive system capable of rapidly interpreting scene information and making accurate decisions. Aligning machine with human intent has been explored with Reinforcement Learning with Human Feedback (RLHF). Conventional RLHF methods rely on collecting hum — Zhuoli Zhuang, Yu-Cheng Chang, Yu-Kai Wang, Thomas Do, Chin-Teng Lin
View PDF HTML (experimental)
Abstract:Recent advancements in computer vision have accelerated the development of autonomous driving. Despite these advancements, training machines to drive in a way that aligns with human expectations remains a significant challenge. Human factors are still essential, as humans possess a sophisticated cognitive system capable of rapidly interpreting scene information and making accurate decisions. Aligning machine with human intent has been explored with Reinforcement Learning with Human Feedback (RLHF). Conventional RLHF methods rely on collecting human preference data by manually ranking generated outputs, which is time-consuming and indirect. In this work, we propose an electroencephalography (EEG)-guided decision-making framework to incorporate human cognitive insights without behaviour response interruption into reinforcement learning (RL) for autonomous driving. We collected EEG signals from 20 participants in a realistic driving simulator and analyzed event-related potentials (ERP) in response to sudden environmental changes. Our proposed framework employs a neural network to predict the strength of ERP based on the cognitive information from visual scene information. Moreover, we explore the integration of such cognitive information into the reward signal of the RL algorithm. Experimental results show that our framework can improve the collision avoidance ability of the RL algorithm, highlighting the potential of neuro-cognitive feedback in enhancing autonomous driving systems. Our project page is: this https URL.
Subjects:
Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:2603.25968 [cs.CV]
(or arXiv:2603.25968v2 [cs.CV] for this version)
https://doi.org/10.48550/arXiv.2603.25968
arXiv-issued DOI via DataCite
Submission history
From: Zhuoli Zhuang [view email] [v1] Thu, 26 Mar 2026 23:16:16 UTC (32,017 KB) [v2] Tue, 31 Mar 2026 01:56:42 UTC (32,017 KB)
Sign in to highlight and annotate this article

Conversation starters
Daily AI Digest
Get the top 5 AI stories delivered to your inbox every morning.
More about
researchpaperarxiv
Akira Hackers Shrink Encryption Timeline to Under One Hour
A notorious ransomware group has been observed leveraging long‑standing exploits and stolen credentials to slip past MFA protections and execute attacks in as little as one hour. Tracking the well-known Akira ransomware group, security researchers from Halcyon witnessed hackers abusing CVE-2024-40766 to gain unauthorised access to SonicWall management interfaces and configuration backups on unpatched devices. [ ] The post Akira Hackers Shrink Encryption Timeline to Under One Hour appeared first on DIGIT .
Knowledge Map
Connected Articles — Knowledge Graph
This article is connected to other articles through shared AI topics and tags.
More in Research Papers
DynaVid: Learning to Generate Highly Dynamic Videos using Synthetic Motion Data
DynaVid addresses limitations in video diffusion models by using synthetic motion data represented as optical flow to improve realistic video synthesis with dynamic motions and fine-grained motion control. (2 upvotes on HuggingFace)
Omni123: Exploring 3D Native Foundation Models with Limited 3D Data by Unifying Text to 2D and 3D Generation
Omni123 is a 3D-native foundation model that unifies text-to-2D and text-to-3D generation using a shared sequence space with cross-modal consistency as an implicit structural constraint. (1 upvotes on HuggingFace)




Discussion
Sign in to join the discussion
No comments yet — be the first to share your thoughts!