
From Guessing to Placeholding: A Cost-Theoretic Framework for Uncertainty-Aware Code Completion

arXiv cs.CL | by Liang Zhu, Haolin Chen, Lidong Zhao, Xian Wu | April 4, 2026



Abstract: While Large Language Models (LLMs) have demonstrated exceptional proficiency in code completion, they typically adhere to a Hard Completion (HC) paradigm, compelling the generation of fully concrete code even amidst insufficient context. Our analysis of 3 million real-world interactions exposes the limitations of this strategy: 61% of the generated suggestions were either edited after acceptance or rejected despite exhibiting over 80% similarity to the user's subsequent code, suggesting that models frequently make erroneous predictions at specific token positions. Motivated by this observation, we propose Adaptive Placeholder Completion (APC), a collaborative framework that extends HC by strategically outputting explicit placeholders at high-entropy positions, allowing users to fill them in directly via IDE navigation. Theoretically, we formulate code completion as a cost-minimization problem under uncertainty. Premised on the observation that filling placeholders incurs lower cost than correcting errors, we prove the existence of a critical entropy threshold above which APC achieves strictly lower expected cost than HC. We instantiate this framework by constructing training data from filtered real-world edit logs and designing a cost-based reward function for reinforcement learning. Extensive evaluations across 1.5B to 14B parameter models demonstrate that APC reduces expected editing costs by 19% to 50% while preserving standard HC performance. Our work provides both a theoretical foundation and a practical training framework for uncertainty-aware code completion, demonstrating that adaptive abstention can be learned end-to-end without sacrificing conventional completion quality.
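The cost argument in the abstract can be illustrated with a toy calculation. The sketch below is not the paper's formulation: it assumes a single token position, a hypothetical per-error correction cost (`correct_cost`), a hypothetical cheaper placeholder-filling cost (`fill_cost`), and that Hard Completion always commits to the most likely token. Under those assumptions, a placeholder is preferred whenever its fixed cost is lower than the expected cost of a wrong guess, which tends to happen at high-entropy positions.

```python
import math
from typing import Sequence


def shannon_entropy(probs: Sequence[float]) -> float:
    """Entropy (in nats) of a next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def expected_cost_hard(probs: Sequence[float], correct_cost: float) -> float:
    """Toy HC cost: commit to the argmax token and pay correct_cost if it is wrong."""
    return (1.0 - max(probs)) * correct_cost


def expected_cost_placeholder(fill_cost: float) -> float:
    """Toy APC cost: the user always fills the placeholder, at a fixed cost."""
    return fill_cost


def choose_action(
    probs: Sequence[float],
    correct_cost: float = 3.0,  # hypothetical cost of fixing a wrong token
    fill_cost: float = 1.0,     # hypothetical cost of filling a placeholder
) -> str:
    """Return 'placeholder' when abstaining is cheaper in expectation, else 'token'."""
    hc = expected_cost_hard(probs, correct_cost)
    apc = expected_cost_placeholder(fill_cost)
    return "placeholder" if apc < hc else "token"


# Two illustrative (made-up) next-token distributions.
confident = [0.9, 0.05, 0.05]
uncertain = [0.4, 0.3, 0.3]

for name, dist in (("confident", confident), ("uncertain", uncertain)):
    print(name, round(shannon_entropy(dist), 3), choose_action(dist))
```

With these made-up numbers the confident position keeps its concrete token and the uncertain one becomes a placeholder. The cost values are arbitrary; the abstract states only that filling a placeholder is cheaper than correcting an error, not what the actual costs or threshold are.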

Subjects: Computation and Language (cs.CL)

Cite as: arXiv:2604.01849 [cs.CL]

(or arXiv:2604.01849v1 [cs.CL] for this version)

https://doi.org/10.48550/arXiv.2604.01849

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Liang Zhu

[v1] Thu, 2 Apr 2026 10:03:32 UTC (126 KB)

Original source

arXiv cs.CL

https://arxiv.org/abs/2604.01849



