Introducing Bloom: an open source tool for automated behavioral evaluations - Anthropic

GNews AI benchmark · December 19, 2025 · 1 min read

Could not retrieve the full article text.

Original source

GNews AI benchmark

https://news.google.com/rss/articles/CBMiUkFVX3lxTFBSWVVGZmZvbWRFYU1oZUFNd1BfWlZuT3FoaVNSUlp5Y0RjSUxfWlZWTmdJcGtsZ0NxblU0MkZTaXo1NVh2MDVGcWRXcHdkMVB3d3c?oc=5
