From Guessing to Placeholding: A Cost-Theoretic Framework for Uncertainty-Aware Code Completion
arXiv:2604.01849v1 Announce Type: new Abstract: While Large Language Models (LLMs) have demonstrated exceptional proficiency in code completion, they typically adhere to a Hard Completion (HC) paradigm, compelling the generation of fully concrete code even amidst insufficient context. Our analysis of 3 million real-world interactions exposes the limitations of this strategy: 61% of the generated suggestions were either edited after acceptance or rejected despite exhibiting over 80% similarity to the user's subsequent code, suggesting that models frequently make erroneous predictions at specific token positions. Motivated by this observation, we propose Adaptive Placeholder Completion (APC), a collaborative framework that extends HC by strategically outputting explicit placeholders at high-
View PDF HTML (experimental)
Abstract:While Large Language Models (LLMs) have demonstrated exceptional proficiency in code completion, they typically adhere to a Hard Completion (HC) paradigm, compelling the generation of fully concrete code even amidst insufficient context. Our analysis of 3 million real-world interactions exposes the limitations of this strategy: 61% of the generated suggestions were either edited after acceptance or rejected despite exhibiting over 80% similarity to the user's subsequent code, suggesting that models frequently make erroneous predictions at specific token positions. Motivated by this observation, we propose Adaptive Placeholder Completion (APC), a collaborative framework that extends HC by strategically outputting explicit placeholders at high-entropy positions, allowing users to fill directly via IDE navigation. Theoretically, we formulate code completion as a cost-minimization problem under uncertainty. Premised on the observation that filling placeholders incurs lower cost than correcting errors, we prove the existence of a critical entropy threshold above which APC achieves strictly lower expected cost than HC. We instantiate this framework by constructing training data from filtered real-world edit logs and design a cost-based reward function for reinforcement learning. Extensive evaluations across 1.5B--14B parameter models demonstrate that APC reduces expected editing costs from 19% to 50% while preserving standard HC performance. Our work provides both a theoretical foundation and a practical training framework for uncertainty-aware code completion, demonstrating that adaptive abstention can be learned end-to-end without sacrificing conventional completion quality.
Subjects:
Computation and Language (cs.CL)
Cite as: arXiv:2604.01849 [cs.CL]
(or arXiv:2604.01849v1 [cs.CL] for this version)
https://doi.org/10.48550/arXiv.2604.01849
arXiv-issued DOI via DataCite (pending registration)
Submission history
From: Liang Zhu [view email] [v1] Thu, 2 Apr 2026 10:03:32 UTC (126 KB)
Sign in to highlight and annotate this article

Conversation starters
Daily AI Digest
Get the top 5 AI stories delivered to your inbox every morning.
More about
modellanguage modeltraining
Voice AI Agents: Building Speech-to-Speech Apps with TypeScript
Voice AI Agents: Building Speech-to-Speech Apps with TypeScript Voice is the most natural interface for AI. In 2026, speech-to-speech applications are transforming customer service, virtual assistants, and real-time translation. But building voice AI pipelines traditionally requires stitching together multiple SDKs: one for Speech-to-Text (STT), another for LLM inference, and a third for Text-to-Speech (TTS). NeuroLink unifies this entire pipeline into a single TypeScript SDK. In this guide, you'll learn how to build real-time voice AI agents using NeuroLink's streaming architecture. We'll cover speech-to-text integration, streaming LLM responses, text-to-speech synthesis, and practical patterns for production voice applications. Why Voice AI Is Hard (And How NeuroLink Solves It) Building

Semantic Search with TypeScript: Using embed() and embedMany() for Vector Search
Semantic Search with TypeScript: Using embed() and embedMany() for Vector Search In the age of information overload, keyword-based search often falls short. Users aren't just looking for exact matches; they're looking for meaning . This is where semantic search shines, allowing systems to understand the intent behind a query and retrieve results that are conceptually similar, even if they don't contain the exact keywords. At the heart of semantic search lies the concept of embeddings – dense numerical representations of text that capture its meaning. NeuroLink, the universal AI SDK for TypeScript, simplifies the process of generating and utilizing these embeddings, making it straightforward to build powerful semantic search capabilities into your applications. This article will guide you t

I Built 3 APIs for Turkey’s Used-Car Market with Apify
Turkey’s used-car market is massive, fragmented, and surprisingly hard to work with if you want structured data. Listings live across marketplaces, dealer pages are inconsistent, pricing changes fast, and even simple questions like “What is this car worth?” or “Which dealers dominate Istanbul for this brand?” are harder than they should be. So I built three focused APIs on top of Apify to solve different layers of the problem: A listing extraction API for Arabam A valuation API for Arabam + Sahibinden A dealer intelligence API for Arabam + Sahibinden All three are built for developers, analysts, insurers, lenders, marketplaces, and automotive businesses that need clean Turkish vehicle data instead of brittle scraping scripts. 1. Arabam.com Vehicle Scraper API The first API is the raw data
Knowledge Map
Connected Articles — Knowledge Graph
This article is connected to other articles through shared AI topics and tags.
More in Models

Google AI Edge Gallery
Google AI Edge Gallery Terrible name, really great app: this is Google's official app for running their Gemma 4 models (the E2B and E4B sizes, plus some members of the Gemma 3 family) directly on your iPhone. It works really well. The E2B model is a 2.54GB download and is both fast and genuinely useful. The app also provides "ask questions about images" and audio transcription (up to 30s) with the two small Gemma 4 models, and has an interesting "skills" demo which demonstrates tool calling against eight different interactive widgets, each implemented as an HTML page (though sadly the source code is not visible): interactive-map, kitchen-adventure, calculate-hash, text-spinner, mood-tracker, mnemonic-password, query-wikipedia, and qr-code. (That demo did freeze the app when I tried to add

Alibaba s Qwen team built HopChain to fix how AI vision models fall apart during multi-step reasoning
When AI models reason about images, small perceptual errors compound across multiple steps and produce wrong answers. Alibaba's HopChain framework tackles this by generating multi-stage image questions that break complex problems into linked individual steps, forcing models to verify each visual detail before drawing conclusions. The approach improves 20 out of 24 benchmarks. The article Alibaba s Qwen team built HopChain to fix how AI vision models fall apart during multi-step reasoning appeared first on The Decoder .

OpenAI reveals 600,000 weekly health queries from hospital deserts as seven in ten come after hours
ChatGPT gets millions of health queries per week in the US, especially from areas where doctors are hard to reach. The article OpenAI reveals 600,000 weekly health queries from hospital deserts as seven in ten come after hours appeared first on The Decoder .


Discussion
Sign in to join the discussion
No comments yet — be the first to share your thoughts!