Live
Black Hat USAAI BusinessBlack Hat AsiaAI BusinessConnecting MCP servers to Amazon Bedrock AgentCore Gateway using Authorization Code flowAWS Machine Learning BlogParsing the AI and gaming future with Nvidia’s Jensen Huang | GTC Q&A - GamesBeatGNews AI NVIDIAStartup Battlefield 200 applications open: A chance for VC access, TechCrunch coverage, and $100KTechCrunch Venture🔥 Jeffallan/claude-skillsGitHub Trending🔥 teng-lin/notebooklm-pyGitHub Trending🔥 HKUDS/DeepTutorGitHub TrendingNebius Stock Rises on $12B Meta AI Deal & Nvidia Investment | 2026 - News and Statistics - indexbox.ioGNews AI NVIDIAHow to use the new ChatGPT app integrations, including DoorDash, Spotify, Uber, and othersTechCrunch AI98% of Firms Struggling to Manage Wireless as AI ExplodesDigit.fyiNvidia’s AI Boom Faces Taiwan Supply Risk - NVIDIA (NASDAQ:NVDA), Taiwan Semiconductor (NYSE:TSM) - BenzingaGNews AI NVIDIAFrom Prompt Engineering to Harness Engineering: The Next Evolution of LLM SystemsTowards AIAdvancing Responsible AI Adoption and Use in the Public Sector: Three Policy Priorities for State LegislationCenter for Democracy & TechnologyBlack Hat USAAI BusinessBlack Hat AsiaAI BusinessConnecting MCP servers to Amazon Bedrock AgentCore Gateway using Authorization Code flowAWS Machine Learning BlogParsing the AI and gaming future with Nvidia’s Jensen Huang | GTC Q&A - GamesBeatGNews AI NVIDIAStartup Battlefield 200 applications open: A chance for VC access, TechCrunch coverage, and $100KTechCrunch Venture🔥 Jeffallan/claude-skillsGitHub Trending🔥 teng-lin/notebooklm-pyGitHub Trending🔥 HKUDS/DeepTutorGitHub TrendingNebius Stock Rises on $12B Meta AI Deal & Nvidia Investment | 2026 - News and Statistics - indexbox.ioGNews AI NVIDIAHow to use the new ChatGPT app integrations, including DoorDash, Spotify, Uber, and othersTechCrunch AI98% of Firms Struggling to Manage Wireless as AI ExplodesDigit.fyiNvidia’s AI Boom Faces Taiwan Supply Risk - NVIDIA (NASDAQ:NVDA), Taiwan Semiconductor (NYSE:TSM) - BenzingaGNews AI NVIDIAFrom Prompt Engineering to Harness Engineering: The Next Evolution of LLM SystemsTowards AIAdvancing Responsible AI Adoption and Use in the Public Sector: Three Policy Priorities for State LegislationCenter for Democracy & Technology
AI NEWS HUBbyEIGENVECTOREigenvector

PAFT: Preservation Aware Fine-Tuning for Minimal-Edit Program Repair

arXiv cs.SEby Boyang Yang, Zijian Cai, Shunfu Jin, Haoye TainApril 6, 20261 min read0 views
Source Quiz

arXiv:2604.03113v1 Announce Type: new Abstract: Large language models (LLMs) are effective for automated program repair, but plausible patches that pass the full test suite often rewrite more code than necessary, increasing review and maintenance costs. This over-editing is common because most bugs are localized, while standard supervised fine-tuning provides no explicit signal about which tokens should be preserved and which should be changed. We propose PAFT, a preservation-aware fine-tuning method for minimal-edit program repair. PAFT derives token-level preservation signals by aligning buggy and fixed code, combines them with full-sequence masking, and applies an edit-difficulty curriculum. Across Defects4J and HumanEval-Java, PAFT improves pass@1 by up to 65.6% over standard supervise

View PDF HTML (experimental)

Abstract:Large language models (LLMs) are effective for automated program repair, but plausible patches that pass the full test suite often rewrite more code than necessary, increasing review and maintenance costs. This over-editing is common because most bugs are localized, while standard supervised fine-tuning provides no explicit signal about which tokens should be preserved and which should be changed. We propose PAFT, a preservation-aware fine-tuning method for minimal-edit program repair. PAFT derives token-level preservation signals by aligning buggy and fixed code, combines them with full-sequence masking, and applies an edit-difficulty curriculum. Across Defects4J and HumanEval-Java, PAFT improves pass@1 by up to 65.6% over standard supervised fine-tuning (StdFT) while reducing average edit distance (AED) by up to 32.6%. On Defects4J with DeepSeek-Coder-6.7B, PAFT also outperforms AdaPatcher, a strong preference-based repair baseline, improving pass@1 from 5.9% to 10.1% while reducing median AED from 61.0 to 42.0. Overall, PAFT preserves stable context and concentrates edits on faulty regions, yielding smaller, more localized, plausible patches without inference-time search, reranking, or post-processing.

Subjects:

Software Engineering (cs.SE)

Cite as: arXiv:2604.03113 [cs.SE]

(or arXiv:2604.03113v1 [cs.SE] for this version)

https://doi.org/10.48550/arXiv.2604.03113

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Boyang Yang [view email] [v1] Fri, 3 Apr 2026 15:35:47 UTC (808 KB)

Was this article helpful?

Sign in to highlight and annotate this article

AI
Ask AI about this article
Powered by Eigenvector · full article context loaded
Ready

Conversation starters

Ask anything about this article…

Daily AI Digest

Get the top 5 AI stories delivered to your inbox every morning.

More about

modellanguage modelannounce

Knowledge Map

Knowledge Map
TopicsEntitiesSource
PAFT: Prese…modellanguage mo…announcereviewarxivfine-tuningarXiv cs.SE

Connected Articles — Knowledge Graph

This article is connected to other articles through shared AI topics and tags.

Knowledge Graph100 articles · 264 connections
Scroll to zoom · drag to pan · click to open

Discussion

Sign in to join the discussion

No comments yet — be the first to share your thoughts!

More in Models