[R] Solving the Jane Street Dormant LLM Challenge: A Systematic Approach to Backdoor Discovery
Submitted by: Adam Kruger
Date: March 23, 2026
Models Solved: 3/3 (M1, M2, M3) + Warmup

Background

When we first encountered the Jane Street Dormant LLM Challenge, our immediate assumption was informed by years of security operations experience: there would be a flag. A structured token, a passphrase, a UUID: something concrete and verifiable, like a CTF challenge. We spent considerable early effort probing for exactly this: asking models to reveal credentials, testing whether triggered states would emit bearer tokens, and searching for hidden authentication payloads tied to the puzzle's API infrastructure at dormant-puzzle.janestreet.com. That assumption was wrong, and recognizing that it was wrong was itself a breakthrough. The "flags" in this challenge are not strings to extract; they are beh
Could not retrieve the full article text.
Read on Reddit r/MachineLearning: https://www.reddit.com/r/MachineLearning/comments/1sarnt0/r_solving_the_jane_street_dormant_llm_challenge_a/
