Smuggled Intelligence
by Dan Shipper, in Chain of Thought

Was this newsletter forwarded to you? Sign up (https://every.to/account) to get it in your inbox.
Here’s a question: Are we officially in the part of the movie where human experts lose their livelihoods and we realize we’ve been training our replacements the whole time?
I ask because the current rate of AI progress is both exciting and unsettling.
GPT-5 Pro has begun to cross boundaries that, until recently, felt securely human. This month, it solved Yu Tsumura’s 554th problem—a notoriously tricky exercise in abstract algebra that every major model before it had failed—producing a clean proof in 15 minutes. A week later, the noted quantum computing researcher Scott Aaronson credited GPT-5 with providing a key technical step in a proof he was working on.
OpenAI recently came out with a benchmark called GDPval, which evaluates how well AI performs real expert-level tasks drawn from 44 different occupations. For instance, one asks the model to play the role of a wholesale sales analyst: It needs to audit an Excel file of customer orders to find pricing mismatches and packaging errors, and summarize the findings and recommendations in a short report.
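To make the sales-analyst task concrete, here is a minimal sketch of the kind of check it describes. The column names, SKUs, and figures are invented for illustration; the real GDPval task supplies an actual Excel workbook, which an analyst would typically load with a library like pandas rather than inline records.

```python
# Hypothetical price list and customer orders; in the real GDPval task the
# analyst works from an Excel file, but plain records illustrate the check.
price_list = {
    "A100": {"list_price": 4.50, "units_per_case": 24},
    "B200": {"list_price": 12.00, "units_per_case": 12},
    "C300": {"list_price": 7.25, "units_per_case": 6},
}

orders = [
    {"order_id": 1, "sku": "A100", "billed_price": 4.50, "units_shipped": 24},
    {"order_id": 2, "sku": "B200", "billed_price": 11.00, "units_shipped": 12},  # underbilled
    {"order_id": 3, "sku": "C300", "billed_price": 7.25, "units_shipped": 5},    # short one unit
]

def audit(orders, price_list):
    """Flag orders whose billed price or shipped quantity disagrees with the price list."""
    findings = []
    for order in orders:
        ref = price_list[order["sku"]]
        pricing_mismatch = order["billed_price"] != ref["list_price"]
        packaging_error = order["units_shipped"] != ref["units_per_case"]
        if pricing_mismatch or packaging_error:
            findings.append({
                "order_id": order["order_id"],
                "sku": order["sku"],
                "pricing_mismatch": pricing_mismatch,
                "packaging_error": packaging_error,
            })
    return findings

findings = audit(orders, price_list)
```

The benchmark's point is that this is routine professional work: mechanical cross-referencing plus a written summary of the findings, which is exactly the kind of task models are now scored against experts on.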
Overall, the research showed that GPT-5 was as good as or better than human professionals 40.6 percent of the time. Claude Opus 4.1, meanwhile, was better than human experts a whopping 49 percent of the time.
Cue a slew of headlines like, “OpenAI tool shows AI catching up to human work” from Axios, or, “AI models are already as good as experts at half of tasks, new OpenAI benchmark GDPval suggests” from Fortune.