Anthropic to sign deal with Australia on AI safety and economic data tracking - The Star
<a href="https://news.google.com/rss/articles/CBMiygFBVV95cUxQVkE3bjU2NnN0al9FQlhqRUZmZFhlYzZHTWJCU3lYX3dPLVQ5d2VnQ01TWWtzRVNHcEdvNEtpY2toRW0zdnNuaFlKM0Vtc0ZpcFFnUkJ0eEpfTktlM2tkWFhCRXZTcUhmUGdMZ2lSMmM3U3FIdEk5Yl9SUDFWX2RaZ3JjYmQ5MFl1bmZScUczbXNzcW1vWFlXNmVoVDJIYS1jTmowSHZCc3Y3dkNXbDVGdDk0SlNRVjAxSUt0ZXFhN0czUzJWaWtxOS1B?oc=5" target="_blank">Anthropic to sign deal with Australia on AI safety and economic data tracking</a> <font color="#6f6f6f">The Star</font>
Anthropic partners with Australia to advance AI safety and track economic impact - CXO Digitalpulse
<a href="https://news.google.com/rss/articles/CBMiswFBVV95cUxPOVQxZTJzSjlYN1VNT1ZkZjdrY3NlYTdnLU1MQkkyOWxhMm5VNDBTeUlXMzM2U0lkRkhRT05UQ09YQXduNWxDUm1VWE9OcnVLQnBIUzctZ1p5NXdDd0xIVHhLZWUwUG5wUkZGWUtyR2FVeGE0b1ZydlNCcm5IM2wyaEcybzZUU0J5WkVpd0o3RXh1VHdxaXA2RlBOR2VnVlEySzMwWjQxbF91Z3J2YUIyMGlCWQ?oc=5" target="_blank">Anthropic partners with Australia to advance AI safety and track economic impact</a> <font color="#6f6f6f">CXO Digitalpulse</font>
AI giant Anthropic signs safety pact with Australia - Inner East Review
<a href="https://news.google.com/rss/articles/CBMiowFBVV95cUxOd2ZVam0yQXZhYTFsWXdSSjRRaGFncUx1bnBDTFNMZU05VHhXT2JxRzJ0VDZHNEpfQzZoejR0TG90Y1pSaWljdmU4VWF1RUh6QXp1eDdZYmNrRXktTEVQMDVoQ1BlRG9UekxnMGJGdmhLT3Q0TlZpNTVWOUNsU2Y2X014ZmVNdGpYRTdTbFVINExWTkhuTkkxSy1KSWVDNkhoLVNB?oc=5" target="_blank">AI giant Anthropic signs safety pact with Australia</a> <font color="#6f6f6f">Inner East Review</font>
Trojan-Speak: Bypassing Constitutional Classifiers with No Jailbreak Tax via Adversarial Finetuning
arXiv:2603.29038v1. Abstract: Fine-tuning APIs offered by major AI providers create new attack surfaces where adversaries can bypass safety measures through targeted fine-tuning. We introduce Trojan-Speak, an adversarial fine-tuning method that bypasses Anthropic's Constitutional Classifiers. Our approach uses curriculum learning combined with GRPO-based hybrid reinforcement learning to teach models a communication protocol that evades LLM-based content classification. Crucially, while prior adversarial fine-tuning approaches report more than 25% capability degradation on reasoning benchmarks, Trojan-Speak incurs less than 5% degradation while achieving 99+% classifier evasion for models with 14B+ parameters. We demonstrate that fine-tuned models can provide detailed resp
More in Countries
The socio-technical gap: an AI framework for project resilience in UK construction - Frontiers
<a href="https://news.google.com/rss/articles/CBMimwFBVV95cUxQRXlnajROd0RJSnZSUzNGUkRpMTdQUEpraFN3UEpfZ3pnTkpwbVdlWWZmNVczMnNRbjRhbTN5LTRmeFNKOUljVjVCd2RIX0JTRF90eUs0Y2ZPaDJMLS1kRFVWLVBTSzM4MEtEUllUOXhyVm44S2RZR1IwZ285SDRyQXY0OHNXN0VWeEZvRVNwLVM1ZkZEWHlYeDhYTQ?oc=5" target="_blank">The socio-technical gap: an AI framework for project resilience in UK construction</a> <font color="#6f6f6f">Frontiers</font>
OpenAI and Microsoft back UK-led global push to make AI safer - Open Access Government
<a href="https://news.google.com/rss/articles/CBMiqgFBVV95cUxQbXFla1ZrbEhnNm1UVVgwaDdpbzVNVHFpclhIdWVlMkVRTkhlTnlZQmVXZjhFeGN4dEdHZ2RIamFVc25DOHVsSjRSU180U2ZnanJIdU5hVjBWbjd1bXRONjBrRUtLWmROQlN1WXhRZlBLMVJaVk0xTnJueGVGRWp5Z1djeEJsc0hBMDJLRF9CVEYzSVVlTVNpZlA3ZGFaMEpGOUwxTDB4OFVwUQ?oc=5" target="_blank">OpenAI and Microsoft back UK-led global push to make AI safer</a> <font color="#6f6f6f">Open Access Government</font>
Russia at the Forefront of Front-Line AI - russiapost.info
<a href="https://news.google.com/rss/articles/CBMiV0FVX3lxTE9CYjh0QVpKTW90SDdPZV82VTloYVNSLVV2VGEzMjV1RXBvSFFBc3NETnlwci16TWJUUkh5UU9QRkhqTm8yZGFQQkZBNHpaSXNXcFI4XzRjZw?oc=5" target="_blank">Russia at the Forefront of Front-Line AI</a> <font color="#6f6f6f">russiapost.info</font>
United States, China or Russia: Who writes the moral code for artificial intelligence? - Lowy Institute
<a href="https://news.google.com/rss/articles/CBMivgFBVV95cUxNTkVCS3VoYzN4SHZQVi1VaW5CQ0tqYkJYLWd6NzlvWVlqMjdjRFplQ25DS25LYTk1U1lFX2Y5SFBrSzNpWXJyU0JHOTJJWVEtcWNRRE1kYjBDSE5ReUk5RjNTMDVreUxQcXlndXBqcWl5aG1yVFltWHZQWmdydHZCVnZSc1FmaEZRZU1JNm9ac21zY2lkTmE5X3BTYWd6OFdlOGM5bXg5T090QXNVUE83YkNNRURoUE11MjV1Z3l3?oc=5" target="_blank">United States, China or Russia: Who writes the moral code for artificial intelligence?</a> <font color="#6f6f6f">Lowy Institute</font>