Anthropic’s Claude Source Code Leak Hands Competitors a Blueprint It Spent Billions to Build - PYMNTS.com
Read on Google News: https://news.google.com/rss/articles/CBMi3wFBVV95cUxOUDB0VlMwRlVORmJJOEdGNzZZQld4dWxuMWJ1UmR1Q04tMndhdW5VYjRfMm9lakRmenJVb180S0hqR3BuQ3I1WWQ1SkJCdFMyZzlTU0dyWWJLREd3WkEyQjA1U3dtYUFKcjltYVVfajdpOUpuUEVtU2paR1JLMnFPNUZnUDdlR3dPLUNrM1J1dVlyTzlsWVpleFhSLVZSNDMzWV9yME1yQ2RxZ25RUG9uM19lU1FfNGZpM3hkbGtWZGZ4bWk0UlNySlZEX2xUek5RMng5c1l3b0lPTllocGI4?oc=5 (Source: PYMNTS.com)
The full article text could not be retrieved.

More in Models

AI Safety at the Frontier: Paper Highlights of February & March 2026
tl;dr Paper of the month: A benchmark of 56 model organisms with hidden behaviors finds that auditing-tool rankings depend heavily on how the organism was trained, and that the investigator agent, not the tools, is the bottleneck.
Research highlights:
- Linear “emotion vectors” in Claude causally drive misalignment: “desperate” steering raises blackmail from 22% to 72%; “calm” steering drops it to 0%.
- Emergent misalignment is the optimizer’s preferred solution: more efficient and more stable than staying narrowly misaligned.
- Scheming propensity in realistic settings is near 0%, but can increase dramatically from a single prompt snippet or tool change.
- AI self-monitors are up to 5× more likely to approve an action shown as their own prior turn, driven by implicit cues rather than stated authorship.
- Reasoning models […]