
ProofBridge: Auto-Formalization of Natural Language Proofs in Lean via Joint Embeddings

arXiv, March 31, 2026

arXiv:2510.15681v3 Announce Type: replace-cross

Authors: Prithwish Jana, Kaan Kale, Ahmet Ege Tanriverdi, Cruise Song, Sriram Vishwanath, Vijay Ganesh


Abstract: Translating human-written mathematical theorems and proofs from natural language (NL) into formal languages (FLs) like Lean 4 has long been a significant challenge for AI. Most state-of-the-art methods either focus on theorem-only NL-to-FL auto-formalization or on FL proof synthesis from FL theorems. In practice, auto-formalization of both theorem and proof still requires human intervention, as seen in AlphaProof's silver-medal performance at the 2024 IMO, where problem statements were manually translated before automated proof synthesis. We present ProofBridge, a unified framework for automatically translating entire NL theorems and proofs into Lean 4. At its core is a joint embedding model that aligns NL and FL (NL-FL) theorem+proof pairs in a shared semantic space, enabling cross-modal retrieval of semantically relevant FL examples to guide translation. ProofBridge integrates retrieval-augmented fine-tuning with iterative proof repair, leveraging Lean's type checker and semantic equivalence feedback to ensure both syntactic correctness and semantic fidelity. Experiments show substantial improvements in proof auto-formalization over strong baselines (including GPT-5, Gemini-2.5, Kimina-Prover, DeepSeek-Prover), with our retrieval-augmented approach yielding significant gains in semantic correctness (SC, via proving bi-directional equivalence) and type correctness (TC, via type-checking theorem+proof) across pass@k metrics on miniF2F-Test-PF, a dataset we curated. In particular, ProofBridge improves cross-modal retrieval quality by up to 3.28x Recall@1 over all-MiniLM-L6-v2, and achieves +31.14% SC and +1.64% TC (pass@32) compared to the baseline Kimina-Prover-RL-1.7B.
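The abstract reports two kinds of metrics: Recall@1 for cross-modal retrieval and pass@k for SC and TC. The sketch below illustrates how such metrics are commonly computed; it is not taken from the paper. The function names (`recall_at_1`, `pass_at_k`), the toy embedding vectors, the use of cosine similarity, and the choice of the standard unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def recall_at_1(nl_embs: Sequence[Vector], fl_embs: Sequence[Vector]) -> float:
    """Fraction of NL queries whose most similar FL embedding is their own pair.

    Assumes nl_embs[i] and fl_embs[i] embed the same theorem+proof
    (hypothetical data layout, not the paper's).
    """
    hits = 0
    for i, query in enumerate(nl_embs):
        best = max(range(len(fl_embs)), key=lambda j: cosine(query, fl_embs[j]))
        hits += int(best == i)
    return hits / len(nl_embs)


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate from n samples of which c are correct."""
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # math.comb returns 0 when k > n - c, so the estimate is then 1.0.
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)


# Toy 2-D vectors standing in for joint-embedding outputs.
nl = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fl = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.2]]
retrieval_score = recall_at_1(nl, fl)
sample_score = pass_at_k(n=4, c=1, k=1)
```

Here "correct" would mean passing the type check for TC, or passing the bi-directional equivalence check for SC; the counting logic is the same in both cases.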

Comments: Published as a conference paper at the 14th International Conference on Learning Representations (ICLR 2026), Rio de Janeiro, Brazil, April 23-27, 2026

Subjects: Logic in Computer Science (cs.LO); Artificial Intelligence (cs.AI)

ACM classes: I.2.3; I.2.7; F.4; F.3.1; I.2.6

Cite as: arXiv:2510.15681 [cs.LO]

(or arXiv:2510.15681v3 [cs.LO] for this version)

https://doi.org/10.48550/arXiv.2510.15681

arXiv-issued DOI via DataCite

Submission history

From: Prithwish Jana
[v1] Fri, 17 Oct 2025 14:20:50 UTC (1,993 KB)
[v2] Sun, 7 Dec 2025 23:34:50 UTC (2,078 KB)
[v3] Sun, 29 Mar 2026 12:53:42 UTC (2,112 KB)

Original source

arXiv

https://arxiv.org/abs/2510.15681
