
Building a Circuit Breaker in Rust: From Zero to Production

DEV Community · by Dylan Dumont · March 31, 2026 · 7 min read

Your service calls an external API. It goes down. Your threads pile up waiting for timeouts. Your whole app dies. The Circuit Breaker pattern exists to prevent exactly this.

What We're Building

A production-grade Circuit Breaker with three states, configurable thresholds, and zero unsafe code.

             ┌─────────────────────────────────────┐
             │                                     │
    failures >= threshold                    call succeeds
             │                                     │
    ┌────────▼─────────────────┐      ┌────────────┴──────────┐
    │                          │      │                       │
    │  CLOSED                  │      │  HALF-OPEN            │
    │  (requests pass through) │      │  (one probe call)     │
    │                          │      │                       │
    └──────────────────────────┘      └───────────────────────┘
             ▲                                     │
             │                           call fails│
             │                                     ▼
             │                        ┌────────────────────────┐
             │                        │                        │
             │    timeout expires     │  OPEN                  │
             └────────────────────────│  (requests rejected)   │
                                      │                        │
                                      └────────────────────────┘

Three states:

  • Closed — normal operation, requests flow through, failures are counted

  • Open — service is assumed down, requests are rejected immediately (fail fast)

  • Half-Open — after a timeout, one probe request goes through to check recovery

Step 1 — Model the State Machine

Start with the types. In Rust, states map naturally to an enum.

    use std::time::{Duration, Instant};

    #[derive(Debug, Clone, PartialEq)]
    pub enum CircuitState {
        Closed,
        Open { opened_at: Instant },
        HalfOpen,
    }

Open carries an Instant so we know when to transition to Half-Open. No booleans, no stringly-typed states.
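To make the role of `opened_at` concrete, here is a minimal sketch of the timeout check written as a free function. `should_probe` is a hypothetical helper name used only for illustration, and the enum is repeated so the snippet compiles on its own.

```rust
use std::time::{Duration, Instant};

#[derive(Debug, Clone, PartialEq)]
pub enum CircuitState {
    Closed,
    Open { opened_at: Instant },
    HalfOpen,
}

/// Hypothetical helper: true once an open circuit has waited at least
/// `recovery_timeout`, so a probe request may go through.
/// Closed and HalfOpen circuits never need this check.
pub fn should_probe(state: &CircuitState, recovery_timeout: Duration) -> bool {
    match state {
        CircuitState::Open { opened_at } => opened_at.elapsed() >= recovery_timeout,
        CircuitState::Closed | CircuitState::HalfOpen => false,
    }
}
```

Because the timestamp lives inside the `Open` variant, there is no way to read it while the circuit is closed or half-open.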

Step 2 — Define the Configuration

Separate config from runtime state — single responsibility, easy to test.

    #[derive(Debug)]
    pub struct CircuitBreakerConfig {
        /// How many consecutive failures before opening the circuit
        pub failure_threshold: u32,
        /// How long to wait in Open before probing again
        pub recovery_timeout: Duration,
        /// How many consecutive successes in Half-Open to close again
        pub success_threshold: u32,
    }

    impl Default for CircuitBreakerConfig {
        fn default() -> Self {
            Self {
                failure_threshold: 5,
                recovery_timeout: Duration::from_secs(30),
                success_threshold: 2,
            }
        }
    }
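Keeping the config separate makes per-service tuning a one-liner with struct update syntax. The sketch below repeats the config type so it stands alone; `strict_config` is a hypothetical preset, not something the article defines.

```rust
use std::time::Duration;

#[derive(Debug)]
pub struct CircuitBreakerConfig {
    pub failure_threshold: u32,
    pub recovery_timeout: Duration,
    pub success_threshold: u32,
}

impl Default for CircuitBreakerConfig {
    fn default() -> Self {
        Self {
            failure_threshold: 5,
            recovery_timeout: Duration::from_secs(30),
            success_threshold: 2,
        }
    }
}

/// Hypothetical preset for a flaky dependency: trip after two failures,
/// keep every other field at its default.
pub fn strict_config() -> CircuitBreakerConfig {
    CircuitBreakerConfig {
        failure_threshold: 2,
        ..Default::default()
    }
}
```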

Step 3 — Build the Circuit Breaker

    use std::sync::{Arc, Mutex};

    #[derive(Debug)]
    pub struct CircuitBreaker {
        config: CircuitBreakerConfig,
        state: Mutex<CircuitState>,
        failure_count: Mutex<u32>,
        success_count: Mutex<u32>,
    }

    impl CircuitBreaker {
        pub fn new(config: CircuitBreakerConfig) -> Arc<Self> {
            Arc::new(Self {
                config,
                state: Mutex::new(CircuitState::Closed),
                failure_count: Mutex::new(0),
                success_count: Mutex::new(0),
            })
        }
    }

We wrap it in Arc immediately — a Circuit Breaker is always shared across threads.
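As a rough illustration of why the constructor hands back an `Arc`, the sketch below shares one instance across worker threads. `SharedCounter` and `record_from_threads` are stand-ins invented for this example; the real breaker would take the counter's place.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Stand-in for the breaker: a failure counter behind a Mutex.
pub struct SharedCounter {
    failures: Mutex<u32>,
}

impl SharedCounter {
    pub fn new() -> Arc<Self> {
        Arc::new(Self { failures: Mutex::new(0) })
    }

    pub fn record_failure(&self) {
        *self.failures.lock().unwrap() += 1;
    }

    pub fn failures(&self) -> u32 {
        *self.failures.lock().unwrap()
    }
}

/// Spawns `workers` threads that each record one failure on the same instance.
pub fn record_from_threads(counter: &Arc<SharedCounter>, workers: usize) {
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            // Each thread gets its own handle to the same underlying value.
            let counter = Arc::clone(counter);
            thread::spawn(move || counter.record_failure())
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
}
```

Every clone of the `Arc` points at the same counters, which is what lets failures seen on one thread trip the circuit for all of them.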

Step 4 — The Core: call()

This is where the state machine lives.

    #[derive(Debug, thiserror::Error)]
    pub enum CircuitError<E> {
        #[error("Circuit is open: service unavailable")]
        Open,
        #[error("Service call failed: {0}")]
        Inner(E),
    }

    impl CircuitBreaker {
        pub fn call<F, T, E>(&self, f: F) -> Result<T, CircuitError<E>>
        where
            F: FnOnce() -> Result<T, E>,
        {
            // 1. Check if we should allow the request
            self.check_state().map_err(|_| CircuitError::Open)?;

            // 2. Execute the call
            match f() {
                Ok(value) => {
                    self.on_success();
                    Ok(value)
                }
                Err(e) => {
                    self.on_failure();
                    Err(CircuitError::Inner(e))
                }
            }
        }

        // Lock order everywhere below: state first, then counters.
        fn check_state(&self) -> Result<(), CircuitError<()>> {
            let mut state = self.state.lock().unwrap();

            match state.clone() {
                CircuitState::Closed => Ok(()),

                CircuitState::Open { opened_at } => {
                    // Has the recovery timeout elapsed?
                    if opened_at.elapsed() >= self.config.recovery_timeout {
                        *state = CircuitState::HalfOpen;
                        *self.success_count.lock().unwrap() = 0;
                        Ok(()) // allow the probe request
                    } else {
                        Err(CircuitError::Open)
                    }
                }

                // Every call made while half-open is let through as a probe
                CircuitState::HalfOpen => Ok(()),
            }
        }

        fn on_success(&self) {
            let mut state = self.state.lock().unwrap();

            match state.clone() {
                CircuitState::HalfOpen => {
                    let mut successes = self.success_count.lock().unwrap();
                    *successes += 1;

                    if *successes >= self.config.success_threshold {
                        *state = CircuitState::Closed;
                        *self.failure_count.lock().unwrap() = 0;
                        *successes = 0;
                        tracing::info!("Circuit breaker closed: service recovered");
                    }
                }
                CircuitState::Closed => {
                    // Reset failure count on success
                    *self.failure_count.lock().unwrap() = 0;
                }
                CircuitState::Open { .. } => {} // shouldn't happen
            }
        }

        fn on_failure(&self) {
            let mut state = self.state.lock().unwrap();

            match state.clone() {
                CircuitState::Closed => {
                    let mut failures = self.failure_count.lock().unwrap();
                    *failures += 1;

                    if *failures >= self.config.failure_threshold {
                        *state = CircuitState::Open { opened_at: Instant::now() };
                        tracing::warn!(
                            failures = *failures,
                            "Circuit breaker opened: too many failures"
                        );
                    }
                }
                CircuitState::HalfOpen => {
                    // Probe failed, so go back to Open
                    *state = CircuitState::Open { opened_at: Instant::now() };
                    tracing::warn!("Circuit breaker re-opened: probe failed");
                }
                CircuitState::Open { .. } => {}
            }
        }
    }
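One possible variation, not the article's design: keep the state and both counters behind a single `Mutex`, so every transition happens under one lock and lock ordering stops being a concern. The sketch uses only the standard library; `SingleLockBreaker`, `allow`, `record`, and `is_open` are hypothetical names chosen for this example.

```rust
use std::sync::Mutex;
use std::time::{Duration, Instant};

#[derive(Debug, Clone)]
enum State {
    Closed,
    Open { opened_at: Instant },
    HalfOpen,
}

/// State and counters live together so one lock guards every transition.
struct Inner {
    state: State,
    failures: u32,
    successes: u32,
}

pub struct SingleLockBreaker {
    failure_threshold: u32,
    success_threshold: u32,
    recovery_timeout: Duration,
    inner: Mutex<Inner>,
}

impl SingleLockBreaker {
    pub fn new(failure_threshold: u32, success_threshold: u32, recovery_timeout: Duration) -> Self {
        Self {
            failure_threshold,
            success_threshold,
            recovery_timeout,
            inner: Mutex::new(Inner { state: State::Closed, failures: 0, successes: 0 }),
        }
    }

    /// Returns true if a request may proceed; moves Open to HalfOpen once the timeout has passed.
    pub fn allow(&self) -> bool {
        let mut guard = self.inner.lock().unwrap();
        let inner = &mut *guard;
        if let State::Open { opened_at } = inner.state {
            if opened_at.elapsed() < self.recovery_timeout {
                return false;
            }
            inner.state = State::HalfOpen;
            inner.successes = 0;
        }
        true
    }

    /// Records the outcome of a call that `allow` let through.
    pub fn record(&self, success: bool) {
        let mut guard = self.inner.lock().unwrap();
        let inner = &mut *guard;
        match (inner.state.clone(), success) {
            (State::Closed, true) => inner.failures = 0,
            (State::Closed, false) => {
                inner.failures += 1;
                if inner.failures >= self.failure_threshold {
                    inner.state = State::Open { opened_at: Instant::now() };
                }
            }
            (State::HalfOpen, true) => {
                inner.successes += 1;
                if inner.successes >= self.success_threshold {
                    inner.state = State::Closed;
                    inner.failures = 0;
                }
            }
            (State::HalfOpen, false) => {
                inner.state = State::Open { opened_at: Instant::now() };
            }
            // Results that arrive after the circuit tripped are ignored.
            (State::Open { .. }, _) => {}
        }
    }

    pub fn is_open(&self) -> bool {
        matches!(self.inner.lock().unwrap().state, State::Open { .. })
    }
}
```

The trade-off is slightly coarser locking in exchange for transitions that are trivially atomic; for a breaker whose critical sections are a few integer updates, that is usually acceptable.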

Step 5 — Async Support

Real services use async. Add an async variant that reuses the same state checks and bookkeeping:

    use std::future::Future;

    impl CircuitBreaker {
        pub async fn call_async<F, Fut, T, E>(&self, f: F) -> Result<T, CircuitError<E>>
        where
            F: FnOnce() -> Fut,
            Fut: Future<Output = Result<T, E>>,
        {
            // Lock guards are released inside check_state, so none is held across the await
            self.check_state().map_err(|_| CircuitError::Open)?;

            match f().await {
                Ok(value) => {
                    self.on_success();
                    Ok(value)
                }
                Err(e) => {
                    self.on_failure();
                    Err(CircuitError::Inner(e))
                }
            }
        }
    }

Step 6 — Wire It Up

A real example — wrapping an HTTP client call:

    use std::sync::Arc;

    #[derive(Clone)]
    pub struct PaymentGatewayClient {
        http: reqwest::Client,
        breaker: Arc<CircuitBreaker>,
        base_url: String,
    }

    impl PaymentGatewayClient {
        pub fn new(base_url: String) -> Self {
            let config = CircuitBreakerConfig {
                failure_threshold: 3,
                recovery_timeout: Duration::from_secs(60),
                success_threshold: 1,
            };

            Self {
                http: reqwest::Client::new(),
                breaker: CircuitBreaker::new(config),
                base_url,
            }
        }

        // `ChargeResponse` is a placeholder name: the response type was lost from this
        // listing. Any type that implements serde's Deserialize fits here.
        pub async fn charge(&self, amount: u64, token: &str) -> Result<ChargeResponse, AppError> {
            self.breaker
                .call_async(|| async {
                    self.http
                        .post(format!("{}/charge", self.base_url))
                        .json(&serde_json::json!({ "amount": amount, "token": token }))
                        .send()
                        .await?
                        .json::<ChargeResponse>()
                        .await
                })
                .await
                .map_err(|e| match e {
                    CircuitError::Open => AppError::ServiceUnavailable("payment gateway"),
                    CircuitError::Inner(e) => AppError::HttpError(e),
                })
        }
    }

Callers never deal with circuit breaker logic — it's fully encapsulated. ✅
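The encapsulation comes down to one mapping step at the boundary. The sketch below isolates that step with standard-library types only. `to_app_error` is a hypothetical helper, and this `AppError` is a simplified stand-in: its `Upstream(String)` variant replaces the article's `HttpError` so the example does not need `reqwest`.

```rust
use std::fmt;

/// Mirror of the article's error type, without the thiserror derive.
#[derive(Debug)]
pub enum CircuitError<E> {
    Open,
    Inner(E),
}

/// Simplified stand-in for the application's error type (assumption).
#[derive(Debug, PartialEq)]
pub enum AppError {
    ServiceUnavailable(&'static str),
    Upstream(String),
}

/// Hypothetical helper: translate breaker errors into application errors
/// so callers never see `CircuitError` at all.
pub fn to_app_error<E: fmt::Display>(service: &'static str, err: CircuitError<E>) -> AppError {
    match err {
        CircuitError::Open => AppError::ServiceUnavailable(service),
        CircuitError::Inner(e) => AppError::Upstream(format!("{service}: {e}")),
    }
}
```

An HTTP layer could then map `ServiceUnavailable` to a 503 without knowing a circuit breaker exists.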

Step 7 — Observability

A Circuit Breaker is useless if you can't see it. Expose state via metrics:

    impl CircuitBreaker {
        pub fn state_label(&self) -> &'static str {
            match *self.state.lock().unwrap() {
                CircuitState::Closed => "closed",
                CircuitState::Open { .. } => "open",
                CircuitState::HalfOpen => "half_open",
            }
        }
    }

    // In your metrics handler (e.g. Prometheus via the metrics crate)
    fn record_circuit_state(name: &str, breaker: &CircuitBreaker) {
        let state = match breaker.state_label() {
            "closed" => 0.0,
            "half_open" => 0.5,
            "open" => 1.0,
            _ => -1.0,
        };

        // metrics 0.22+ returns a gauge handle; older releases took the value
        // as the second macro argument instead of calling .set()
        metrics::gauge!("circuit_breaker_state", "name" => name.to_string()).set(state);
    }

Alert when circuit_breaker_state == 1.0 for more than 2 minutes. That's your on-call trigger.
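If the string labels are only needed for the gauge, one alternative is to derive the number straight from the enum, which removes the catch-all `-1.0` arm. This is a sketch under that assumption; `gauge_value` is a hypothetical method name, and the enum is repeated so the snippet compiles alone.

```rust
use std::time::Instant;

#[derive(Debug, Clone, PartialEq)]
pub enum CircuitState {
    Closed,
    Open { opened_at: Instant },
    HalfOpen,
}

impl CircuitState {
    /// Same encoding as the gauge above: 0.0 closed, 0.5 half-open, 1.0 open.
    /// The match is exhaustive, so a new state cannot be forgotten silently.
    pub fn gauge_value(&self) -> f64 {
        match self {
            CircuitState::Closed => 0.0,
            CircuitState::HalfOpen => 0.5,
            CircuitState::Open { .. } => 1.0,
        }
    }
}
```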

Testing the State Transitions

    #[cfg(test)]
    mod tests {
        use super::*;

        fn failing_call() -> Result<(), &'static str> {
            Err("timeout")
        }

        fn succeeding_call() -> Result<&'static str, ()> {
            Ok("ok")
        }

        #[test]
        fn opens_after_threshold_failures() {
            let cb = CircuitBreaker::new(CircuitBreakerConfig {
                failure_threshold: 3,
                ..Default::default()
            });

            for _ in 0..3 {
                let _ = cb.call(|| failing_call());
            }

            // Next call should be rejected immediately
            let result = cb.call(|| succeeding_call());
            assert!(matches!(result, Err(CircuitError::Open)));
        }

        #[test]
        fn recovers_after_timeout() {
            let cb = CircuitBreaker::new(CircuitBreakerConfig {
                failure_threshold: 1,
                recovery_timeout: Duration::from_millis(10), // short for tests
                success_threshold: 1,
            });

            let _ = cb.call(|| failing_call());
            assert_eq!(cb.state_label(), "open");

            std::thread::sleep(Duration::from_millis(20));

            // Probe call succeeds → back to Closed
            let _ = cb.call(|| succeeding_call());
            assert_eq!(cb.state_label(), "closed");
        }
    }

The Full Picture

    ┌──────────────────────────────────────┐
    │             Your Service             │
    └──────────────┬───────────────────────┘
                   │ call()
    ┌──────────────▼───────────────────────┐
    │           Circuit Breaker            │
    │  ┌─────────┐ ┌──────┐ ┌──────────┐   │
    │  │ Closed  │→│ Open │→│ Half-Open│   │
    │  └─────────┘ └──────┘ └──────────┘   │
    └──────────────┬───────────────────────┘
                   │ (if Closed or Half-Open)
    ┌──────────────▼───────────────────────┐
    │           External Service           │
    │       (Payment API, DB, etc.)        │
    └──────────────────────────────────────┘

Key Takeaways

  • Fail fast — don't waste threads waiting for a dead service

  • State is an enum — Rust's type system enforces valid transitions

  • Config is separate — easy to tune per-service without changing logic

  • Observability first — a silent Circuit Breaker is a dangerous one

  • Wrap at the boundary — one Circuit Breaker per external dependency, not per call site

What's Next?

  • Add retry with backoff inside the Closed state before counting as a failure

  • Combine with Bulkhead (separate thread pools per dependency)

  • Use tokio::sync::RwLock instead of Mutex for better async throughput

  • Persist state to Redis for multi-instance deployments
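For the first item, a minimal sketch of retry with exponential backoff might look like the following. `retry_with_backoff` is a hypothetical helper, it blocks the thread with `thread::sleep`, and it has no jitter or delay cap; an async service would need a non-blocking sleep instead.

```rust
use std::thread;
use std::time::Duration;

/// Hypothetical helper: runs `op` at least once and at most `max_attempts`
/// times, doubling the delay after each failed attempt.
pub fn retry_with_backoff<T, E>(
    max_attempts: u32,
    initial_delay: Duration,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut delay = initial_delay;
    let mut attempt = 1;

    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error to the caller.
            Err(e) if attempt >= max_attempts => return Err(e),
            Err(_) => {
                thread::sleep(delay);
                delay = delay.saturating_mul(2);
                attempt += 1;
            }
        }
    }
}
```

Passing a closure that wraps the operation in this helper to `call` would make the breaker count one failure per exhausted retry sequence rather than one per attempt.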

Part of the Architecture Patterns series.
