Gemini provides automated feedback for theoretical computer scientists at STOC 2026
Algorithms & Theory
The pursuit of truth in theoretical computer science and mathematics relies on the highest standards of proof, rigor, and clarity. While peer review is the crucial final check, the process of drafting and refining complex theoretical work often takes months, with simple errors, inconsistent variables, or subtle logical gaps frequently slowing down the entire research pipeline. But could a highly specialized AI tool act as a fast, rigorous collaborator, helping authors pre-vet their work before it ever reaches human reviewers?
To test this potential, we created an experimental program for the Annual ACM Symposium on Theory of Computing (STOC 2026) — one of the most prestigious venues in theoretical computer science. This program offered authors automated, pre-submission feedback generated by a specialized Gemini AI tool. Our objective was to provide constructive suggestions and identify potential technical issues within 24 hours of submission, helping authors polish their final drafts before the submission deadline.
The responses were very positive: the tool successfully identified a variety of issues, including calculation and logic errors. Here we report how we developed the tool and the results of its use.
Optimized for mathematical rigor
The feedback tool leveraged inference scaling methods in an advanced version of Gemini 2.5 Deep Think. Rather than pursuing a single, linear chain of thought, this setup enables the model to explore and combine multiple candidate solutions simultaneously before giving a final answer. By combining different reasoning and evaluation traces, the method reduces hallucinations and focuses on the most salient issues.
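As a rough illustration of this explore-and-aggregate pattern, consider the minimal Python sketch below. Everything in it is an assumption for illustration: `generate_trace` stands in for a call to the underlying reasoning model, and the mock findings and probabilities are invented, so none of it reflects the actual Deep Think implementation. The idea it demonstrates is that findings corroborated across independent traces survive aggregation, while one-off hallucinations rarely do.

```python
import random
from collections import Counter

# Mock findings, each with the probability that any single independent
# reasoning trace reports it. These values are invented for this sketch:
# robust issues tend to recur across traces, hallucinations do not.
POSSIBLE_FINDINGS = {
    "Lemma 4.2: union bound appears off by a factor of 2": 0.9,
    "Eq. (7): inequality applied in the wrong direction": 0.8,
    "Spurious issue a single trace might invent": 0.1,  # a 'hallucination'
}

def generate_trace(paper_text: str, rng: random.Random) -> list[str]:
    """One independent reasoning pass: returns the issues it 'found'.
    (paper_text is unused in this mock; a real system would read it.)"""
    return [f for f, p in POSSIBLE_FINDINGS.items() if rng.random() < p]

def review(paper_text: str, n_traces: int = 8, quorum: int = 4) -> list[str]:
    rng = random.Random(0)
    # Explore: sample several independent reasoning traces.
    traces = [generate_trace(paper_text, rng) for _ in range(n_traces)]
    # Combine: keep only findings that a quorum of traces agree on,
    # filtering out issues that appear in just one or two traces.
    counts = Counter(finding for trace in traces for finding in trace)
    return [finding for finding, c in counts.items() if c >= quorum]

print(review("Theorem 3.1 states that ..."))
```

With these mock probabilities, the two recurring findings typically pass the quorum while the spurious one is filtered out, mirroring how cross-trace agreement can suppress hallucinations.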
Feedback format
Authors received structured feedback divided into key sections: a summary of the paper's contributions, a list of potential mistakes and improvements (often analyzing specific lemmas or theorems), and a list of minor corrections and typos. See some feedback examples.
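As a concrete illustration of this three-part structure, the feedback could be represented with a simple schema like the Python sketch below. The field names are assumptions chosen for readability, not the tool's actual output format.

```python
from dataclasses import dataclass, field

# Illustrative schema for the three feedback sections described above.
# Field names are assumptions for this sketch, not the tool's actual format.
@dataclass
class ReviewFeedback:
    summary: str  # summary of the paper's contributions
    potential_mistakes: list[str] = field(default_factory=list)  # often per-lemma/theorem analyses
    minor_corrections: list[str] = field(default_factory=list)   # typos and small fixes

feedback = ReviewFeedback(
    summary="Gives a faster approximation algorithm for problem X.",
    potential_mistakes=["Lemma 4.2: the union bound seems to need an extra factor of 2."],
    minor_corrections=["p. 3: 'recieve' should be 'receive'."],
)
print(feedback.potential_mistakes)
```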
Impact and technical depth
The tool successfully identified a wide range of issues, from inconsistent variable names to complex problems like calculation errors, incorrect application of inequalities, and logical gaps in proofs. As one author noted, the tool found "a critical bug... that made our proof entirely incorrect," further adding that it was an "embarrassingly simple bug that evaded us for months."
Over 120 participants responded to our post-experiment survey and gave their consent. The responses were very positive, with individuals citing the model's success at finding critical errors and its insightful commentary. In summary:
- 80% of papers submitted by the time our experiment ended had opted in to our AI review
- 97% found the feedback helpful
- 97% would use this tool again for future submissions
- 81% found the model improved clarity or readability of the paper
The user experience
Beyond technical accuracy, authors valued the speed and neutrality of the AI review. Some participants noted receiving feedback in just two days; others praised the "neutral tone and rigor" of the output, finding it a useful complement to human readers.
Interpreting the output
Because participants were experts in their respective fields, they could readily distinguish helpful insights from occasional "hallucinations". While the model sometimes struggled, particularly with parsing complex notation or interpreting figures, authors weren't dismissive of its output. Rather, they filtered out the noise, extracted the important and correct parts, and used the feedback as a starting point for verification. This outcome demonstrates the potential for AI to serve as a collaborative partner, augmenting the research workflow by helping human experts make informed decisions based on the model's outputs.
Educational impact and future outlook
The research community surveyed in this experiment saw significant potential for this tool in training the next generation. 75% of surveyed authors believed the tool has educational value for students by offering immediate feedback on mathematical rigor and presentation clarity.
This pilot demonstrated the potential for specialized AI tools to serve as collaborative partners in fundamental areas, establishing a target for potential future research initiatives. Our overall goal is not to replace the critical peer review process, but rather to augment and enhance it. Reflecting this, 88% of participants expressed strong interest in having continuous access to such a tool throughout their entire research process.
Acknowledgements
Vincent Cohen-Addad, Rajesh Jayaram, Jon Schneider, and David Woodruff co-led this project, with key contributions by Lalit Jain, Jieming Mao, and Vahab Mirrokni. We also thank the STOC 2026 PC chair Artur Czumaj and the many other authors who participated in this experiment and provided their valuable feedback, helpful suggestions, and discussions, including Mohammad Taghi Hajiaghayi, Ravi Kumar, Yossi Matias, and Sergei Vassilvitskii. Finally, this work builds on the efforts of the Deep Think team: Garrett Bingham, Irene Cai, Heng-Tze Cheng, Yong Cheng, Kristen Chiafullo, Vincent Cohen-Addad, Paul Covington, Golnaz Ghiasi, Chenjie Gu, Huan Gui, Ana Hosseini, Dawsen Hwang, Lalit Jain, Vihan Jain, Ragha Kotikalapudi, Chenkai Kuang, Maciej Kula, Nate Kushman, Jane Labanowski, Quoc Le, Jonathan Lee, Zhaoqi Leng, Steve Li, YaGuang Li, Hanzhao (Maggie) Lin, Evan Liu, Yuan Liu, Thang Luong, Jieming Mao, Vahab Mirrokni, Pol Moreno, Nigamaa Nayakanti, Aroonalok Pyne, Shubha Raghvendra, Sashank Reddi, Nikunj Saunshi, Siamak Shakeri, Archit Sharma, Xinying Song, Qijun Tan, Yi Tay, Trieu Trinh, Theophane Weber, Winnie Xu, Zicheng Xu, Shunyu Yao, Lijun Yu, Hao Zhou, Honglei Zhuang, and Song Zuo.
Google Research Blog
https://research.google/blog/gemini-provides-automated-feedback-for-theoretical-computer-scientists-at-stoc-2026/
