Beyond Localization: Recoverable Headroom and Residual Frontier in Repository-Level RAG-APR
Abstract: Repository-level automated program repair (APR) increasingly treats stronger localization as the main path to better repair. We ask a more targeted question: once localization is strengthened, which post-localization levers still provide recoverable gains, which are bounded within our protocol, and what residual frontier remains? We study this question on SWE-bench Lite with three representative repository-level RAG-APR paradigms, Agentless, KGCompass, and ExpeRepair. Our protocol combines Oracle Localization, within-pool Best-of-K, fixed-interface added context probes with per-condition same-token filler controls and same-repository hard negatives, and a common-wrapper oracle check. Oracle Localization improves all three systems, but Oracle success still stays below 50%. Extra candidate diversity still helps inside the sampled 10-patch pools, but that headroom saturates quickly. Under the two fixed interfaces, most informative added context conditions still outperform their own matched controls. The common-wrapper check shows different system responses: under a common wrapper, gains remain large for KGCompass and ExpeRepair, while Agentless changes more with builder choice. Prompt-level fusion still leaves a large residual frontier: the best fixed probe adds only 6 solved instances beyond the native three-system Solved@10 union. Overall, stronger localization, bounded search, evidence quality, and interface design all shape repository-level repair outcomes.
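To make the Solved@K and union bookkeeping in the abstract concrete, here is a minimal sketch. It is not the authors' code. It assumes each system's pool is stored as an ordered list of pass/fail outcomes per instance, and that "within-pool Best-of-K" means "at least one of the first K sampled patches passes". The instance ids, the toy outcomes, and the helper names `solved_at_k`, `union_solved`, and `added_beyond_union` are all hypothetical, and the pools here hold 3 patches rather than 10 to keep the example short.

```python
from typing import Dict, List, Set

# Hypothetical layout: system name -> instance id -> ordered pass/fail
# outcomes of the sampled patches (toy pools of size 3, not 10).
Pools = Dict[str, List[bool]]


def solved_at_k(pools: Pools, k: int) -> Set[str]:
    """Instances where at least one of the first k patches passes (assumed definition)."""
    return {iid for iid, outcomes in pools.items() if any(outcomes[:k])}


def union_solved(systems: Dict[str, Pools], k: int) -> Set[str]:
    """Instances solved by at least one system within its first k patches."""
    solved: Set[str] = set()
    for pools in systems.values():
        solved |= solved_at_k(pools, k)
    return solved


def added_beyond_union(probe_solved: Set[str], union: Set[str]) -> Set[str]:
    """Instances a probe solves that no native system solved."""
    return probe_solved - union


# Toy, made-up outcomes for illustration only.
systems = {
    "Agentless": {"i1": [False, True, False], "i2": [False, False, False], "i3": [False, False, False]},
    "KGCompass": {"i1": [True, False, False], "i2": [False, False, True], "i3": [False, False, False]},
    "ExpeRepair": {"i1": [False, False, False], "i2": [False, False, False], "i3": [False, False, False]},
}

# Union size as K grows; a flattening curve is what saturating headroom looks like.
curve = [len(union_solved(systems, k)) for k in (1, 2, 3)]

native_union = union_solved(systems, 3)
probe = {"i1", "i3"}  # hypothetical set solved by a fused-prompt probe
extra = added_beyond_union(probe, native_union)
```

Under these assumptions, the "adds only 6 solved instances" figure in the abstract corresponds to the size of the `added_beyond_union` set when the native union is taken at K = 10 across the three systems.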
Subjects: Software Engineering (cs.SE)
Cite as: arXiv:2603.29067 [cs.SE]
(or arXiv:2603.29067v1 [cs.SE] for this version)
https://doi.org/10.48550/arXiv.2603.29067
arXiv-issued DOI via DataCite (pending registration)
Submission history
From: Pengtao Zhao
[v1] Mon, 30 Mar 2026 23:10:19 UTC (621 KB)