
ProofBridge: Auto-Formalization of Natural Language Proofs in Lean via Joint Embeddings

arXiv · March 31, 2026

arXiv:2510.15681v3 · Announce Type: replace-cross

Authors: Prithwish Jana, Kaan Kale, Ahmet Ege Tanriverdi, Cruise Song, Sriram Vishwanath, Vijay Ganesh


Abstract: Translating human-written mathematical theorems and proofs from natural language (NL) into formal languages (FLs) like Lean 4 has long been a significant challenge for AI. Most state-of-the-art methods either focus on theorem-only NL-to-FL auto-formalization or on FL proof synthesis from FL theorems. In practice, auto-formalization of both theorem and proof still requires human intervention, as seen in AlphaProof's silver-medal performance at the 2024 IMO, where problem statements were manually translated before automated proof synthesis. We present ProofBridge, a unified framework for automatically translating entire NL theorems and proofs into Lean 4. At its core is a joint embedding model that aligns NL and FL (NL-FL) theorem+proof pairs in a shared semantic space, enabling cross-modal retrieval of semantically relevant FL examples to guide translation. ProofBridge integrates retrieval-augmented fine-tuning with iterative proof repair, leveraging Lean's type checker and semantic equivalence feedback to ensure both syntactic correctness and semantic fidelity. Experiments show substantial improvements in proof auto-formalization over strong baselines (including GPT-5, Gemini-2.5, Kimina-Prover, DeepSeek-Prover), with our retrieval-augmented approach yielding significant gains in semantic correctness (SC, via proving bi-directional equivalence) and type correctness (TC, via type-checking theorem+proof) across pass@k metrics on miniF2F-Test-PF, a dataset we curated. In particular, ProofBridge improves cross-modal retrieval quality by up to 3.28x Recall@1 over all-MiniLM-L6-v2, and achieves +31.14% SC and +1.64% TC (pass@32) compared to the baseline Kimina-Prover-RL-1.7B.
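The two evaluation metrics named in the abstract, Recall@1 for cross-modal retrieval and pass@k for auto-formalization, can be illustrated with a minimal Python sketch. This is not the authors' code: the toy embedding vectors, the function names, and the assumption that the i-th NL embedding pairs with the i-th FL embedding are all hypothetical. The abstract also does not say how pass@k is computed; the sketch assumes the commonly used unbiased estimator 1 - C(n-c, k) / C(n, k) for n samples of which c are correct.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve_top_k(query: Sequence[float],
                   corpus: Sequence[Sequence[float]],
                   k: int = 1) -> List[int]:
    """Indices of the k corpus vectors most similar to the query."""
    ranked = sorted(range(len(corpus)),
                    key=lambda i: cosine(query, corpus[i]),
                    reverse=True)
    return ranked[:k]


def recall_at_k(nl_embs: Sequence[Sequence[float]],
                fl_embs: Sequence[Sequence[float]],
                k: int = 1) -> float:
    """Fraction of NL queries whose paired FL item appears in the top k.

    Assumption: nl_embs[i] and fl_embs[i] embed the same theorem+proof pair.
    """
    hits = sum(1 for i, q in enumerate(nl_embs)
               if i in retrieve_top_k(q, fl_embs, k))
    return hits / len(nl_embs)


def pass_at_k(n: int, c: int, k: int) -> float:
    """Assumed unbiased pass@k estimator: 1 - C(n-c, k) / C(n, k)."""
    if n - c < k:
        # Fewer than k incorrect samples: any k-subset contains a correct one.
        return 1.0
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)


# Hypothetical 2-D embeddings standing in for a learned joint NL-FL space.
nl_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fl_embeddings = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.7]]

toy_recall_at_1 = recall_at_k(nl_embeddings, fl_embeddings, k=1)
toy_pass_at_1 = pass_at_k(n=4, c=1, k=1)
```

In a real pipeline the toy vectors would be replaced by the outputs of the NL and FL encoders, and a sample would count as correct only after passing the relevant check (type-checking for TC, a proved bi-directional equivalence for SC).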

Comments: Published as a conference paper at the 14th International Conference on Learning Representations (ICLR 2026), Rio de Janeiro, Brazil, April 23-27, 2026

Subjects: Logic in Computer Science (cs.LO); Artificial Intelligence (cs.AI)

ACM classes: I.2.3; I.2.7; F.4; F.3.1; I.2.6

Cite as: arXiv:2510.15681 [cs.LO] (or arXiv:2510.15681v3 [cs.LO] for this version)

https://doi.org/10.48550/arXiv.2510.15681

arXiv-issued DOI via DataCite

Submission history

From: Prithwish Jana
[v1] Fri, 17 Oct 2025 14:20:50 UTC (1,993 KB)
[v2] Sun, 7 Dec 2025 23:34:50 UTC (2,078 KB)
[v3] Sun, 29 Mar 2026 12:53:42 UTC (2,112 KB)
