BuildWithAI: Architecting a Serverless DR Toolkit on AWS
Overview
I'd been getting more involved in disaster recovery planning lately and kept running into the same gap — a lot of teams on AWS have backups, but not a real Disaster Recovery (DR) plan. No documented runbooks, no tested failover procedures, no RTO/RPO targets tied to business impact. So that became the motivation for this side project: six AI-powered tools that automate the tedious parts of DR planning, built entirely on AWS.
In part one of this three-part series, we will walk through the architecture — the serverless stack, the central model config, and the 5-layer cost guardrail system that keeps everything under $10/month (of course, you can set your own threshold; that's just what felt right for this side project). The next two parts will cover prompt engineering for each tool and the lessons learned building this side project.
Here is a look at what we're going to build. You can try out the live version at https://dr-toolkit.thecloudspark.com.
While this was implemented with the help of Kiro — AWS's spec-driven AI IDE — this series will focus on the DR toolkit, Amazon Bedrock, and the underlying AWS architecture, rather than Kiro itself.
What the toolkit does
Six tools, same workflow: provide input, Lambda calls Amazon Bedrock, get formatted output.
| # | Tool | Default Model | What it does |
|---|------|---------------|--------------|
| 1 | Runbook Generator | Nova Pro | Paste IaC → get a full DR runbook |
| 2 | RTO/RPO Estimator | Nova Lite | Fill a form → get recovery targets and DR tier |
| 3 | DR Strategy Advisor | Nova Lite | Answer questions → get an AWS DR architecture pattern |
| 4 | Post-Mortem Writer | Nova Lite | Paste incident notes → get a structured post-mortem |
| 5 | DR Checklist Builder | Nova Lite | Pick your AWS services → get a tailored audit checklist |
| 6 | Template DR Reviewer | Nova Pro | Paste IaC → get a gap analysis with fix snippets |
The live demo at DR Toolkit currently runs on Amazon Nova models. But these are just the defaults — the toolkit supports any model in the Bedrock Model Catalog. You can mix and match: Nova Lite for simple tools, Claude Sonnet for complex ones, or go all-in on a single provider. Just update models.config.json and redeploy.
Architecture
Here’s the big picture. I kept the architecture intentionally simple: a straightforward AWS serverless setup. A few Lambda functions, one API Gateway, one DynamoDB table, one SNS topic, and S3 + CloudFront for the frontend.
So when someone opens the toolkit, CloudFront serves the static frontend from a private S3 bucket. When they submit a tool form, the request goes through API Gateway to one of six tool Lambda functions. Each Lambda runs through the guardrail checks against DynamoDB before calling Amazon Bedrock's invoke_model. Separately, if the monthly AWS Budget hits $10, an SNS alert triggers the budget_shutoff Lambda, which flips tools_enabled=False in DynamoDB. Every tool checks that flag before doing anything else.
```
Browser
│
├── GET ──▶ CloudFront (security headers + URL rewrite)
│             └──▶ S3 (private bucket, OAC only)
│
└── POST ──▶ API Gateway (HTTP API, 10 req/s, burst 25)
              │
              ▼
        AWS Lambda (Python 3.14)
        ├── guardrails.py    ← 5-layer cost protection
        ├── model_config.py  ← reads models.config.json
        ├── Amazon Bedrock (cross-region inference profiles)
        └── DynamoDB (daily counters + IP rate limits + kill switch)

AWS Budget $10/mo ──▶ SNS ──▶ Lambda (flips kill switch)
```
| Layer | What | Why |
|-------|------|-----|
| Frontend | Next.js 16 + Tailwind CSS v3 | Static export, zero server cost |
| Frontend hosting | S3 (private, OAC) + CloudFront | Security headers, HTTPS, URL rewrite |
| API | API Gateway HTTP API | Built-in throttling, cheaper than REST API |
| Compute | Lambda (Python 3.14) | One function per tool + shared layer |
| AI | Amazon Bedrock | Cross-region inference profiles |
| Database | DynamoDB (on-demand) | Counters + feature flag + per-IP rate limits |
| Alerts | SNS + AWS Budgets | Auto-shutoff at $10/month |
| IaC | Serverless Framework | Single serverless.yml |
Central config: models.config.json
Every tool's model, token limit, daily cap, and word count are controlled by one JSON file at the repo's root:
```json
{
  "region": "ap-southeast-1",
  "tools": {
    "runbook-generator": {
      "modelId": "apac.amazon.nova-pro-v1:0",
      "displayLabel": "Nova Pro",
      "badgeColor": "blue",
      "toolLimit": 50,
      "maxTokens": 800,
      "maxWords": 600
    },
    "rto-estimator": {
      "modelId": "apac.amazon.nova-lite-v1:0",
      "displayLabel": "Nova Lite",
      "badgeColor": "green",
      "toolLimit": 50,
      "maxTokens": 400,
      "maxWords": 300
    }
  }
}
```
This config is consumed at deploy time by three things:
- Lambda handlers — via a shared model_config.py module
- Frontend — a slim copy with just displayLabel + badgeColor for the UI badges
- serverless-models.js — auto-generates IAM resource ARNs so Bedrock permissions stay scoped to exactly the models in use
The handlers auto-detect the model provider from the modelId and use the correct Bedrock request format — Anthropic's anthropic_version + system string format for Claude, or Amazon's schemaVersion: messages-v1 + system array format for Nova. You can mix providers freely within the same deployment. IAM permissions update automatically on deploy — no manual policy edits needed.
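As a rough sketch of how that detection can work (the function and variable names here are illustrative, not the toolkit's actual code; the request shapes follow Anthropic's Bedrock schema and Amazon's native messages-v1 schema as I understand them):

```python
def build_request(model_id: str, system_prompt: str, user_input: str, max_tokens: int) -> dict:
    """Pick the Bedrock invoke_model body format based on the model provider.

    Illustrative sketch: the real toolkit keeps this logic in a shared Lambda layer.
    """
    if "anthropic" in model_id:
        # Claude on Bedrock: anthropic_version field + system as a plain string
        return {
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": max_tokens,
            "system": system_prompt,
            "messages": [{"role": "user", "content": [{"type": "text", "text": user_input}]}],
        }
    # Amazon Nova: schemaVersion messages-v1 + system as an array of text blocks
    return {
        "schemaVersion": "messages-v1",
        "system": [{"text": system_prompt}],
        "messages": [{"role": "user", "content": [{"text": user_input}]}],
        "inferenceConfig": {"max_new_tokens": max_tokens},
    }
```

The returned dict is what would get serialized into the `body` of a `bedrock_runtime.invoke_model(...)` call, so switching providers in models.config.json changes the wire format without touching the handlers.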
Want to switch from Nova to Claude? Swap the modelId:
```json
"runbook-generator": {
  "modelId": "global.anthropic.claude-sonnet-4-6",
  "displayLabel": "Sonnet 4.6",
  ...
}
```
Redeploy and that's it 🚀. The Model Selection Guide in the repo has copy-paste-ready model IDs for every supported option.
The 5-layer cost guardrail system
Running a free public tool on Bedrock with no authentication means you need cost protection in layers. Five guardrail layers is probably overkill for most projects. But for a free public demo where anyone can hit the endpoint, I'd rather over-protect than wake up to a surprise bill. All five checks run before Bedrock ever gets called.
Layer 1 — API Gateway throttling
Configured in serverless.yml:
```yaml
HttpApiStage:
  Properties:
    DefaultRouteSettings:
      ThrottlingRateLimit: 10
      ThrottlingBurstLimit: 25
```
This is the first line of defense. Abuse gets 429s from API Gateway before Lambda even runs. Zero Bedrock cost.
Layer 2 — Daily usage counters
DynamoDB atomic conditional increments, both global (200/day) and per-tool (50/day for most tools, 30 for DR Reviewer since Nova Pro costs more per call):
```python
table.update_item(
    Key={"pk": f"usage#{today}", "sk": sk},
    UpdateExpression="ADD run_count :inc SET #d = :date",
    ConditionExpression="attribute_not_exists(run_count) OR run_count < :limit",
    ExpressionAttributeNames={"#d": "date"},  # "date" is a DynamoDB reserved word
    ExpressionAttributeValues={":inc": 1, ":limit": limit, ":date": today},
)
```
Layer 3 — Per-IP rate limiting
3 requests per minute per IP, using DynamoDB TTL'd counters:
```python
minute_bucket = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M")
pk = f"ratelimit#{source_ip}#{minute_bucket}"

table.update_item(
    Key={"pk": pk, "sk": "ALL"},
    UpdateExpression="ADD run_count :inc SET expires_at = :exp",
    ConditionExpression="attribute_not_exists(run_count) OR run_count < :limit",
    ExpressionAttributeValues={
        ":inc": 1,
        ":limit": IP_RATE_LIMIT,
        ":exp": int(time.time()) + 120,  # DynamoDB TTL cleans up the counter after the window
    },
)
```
Layer 4 — Bedrock token caps
Hard max_tokens per tool (400–800 depending on the tool). Input is also truncated to 8,000 characters before it reaches Bedrock. Most templates I tested were well under 3,000 characters, so the cap rarely triggers, but it bounds the worst case.
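A minimal sketch of the input side of that bound (the constant and function names are illustrative):

```python
MAX_INPUT_CHARS = 8_000  # hard cap applied before the prompt ever reaches Bedrock

def bound_input(raw: str, max_chars: int = MAX_INPUT_CHARS) -> str:
    """Truncate oversized input so the worst-case token cost stays bounded."""
    if len(raw) <= max_chars:
        return raw
    return raw[:max_chars] + "\n[truncated]"
```

Together with the per-tool maxTokens setting in models.config.json, this bounds both sides of every Bedrock call: input size and output size.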
Layer 5 — Budget auto-shutoff
AWS Budget at $10/month → SNS → Lambda sets tools_enabled = false in DynamoDB:
```python
def handler(event, context):
    table.put_item(Item={
        "pk": "config",
        "sk": "global",
        "tools_enabled": False,
        "disabled_reason": "Monthly budget threshold reached.",
    })
```
Every handler checks this flag first. Worst case: tools temporarily unavailable. But never a surprise bill. (There's up to a ~5 minute lag between the budget alert and shutoff, so in-flight requests at alarm time aren't blocked. But at these volumes, the overshoot is negligible.)
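The flag check itself is a single DynamoDB read. A sketch of what each handler might run first (names are illustrative; the real code lives in the shared layer):

```python
def tools_enabled(table) -> tuple[bool, str]:
    """Read the global kill switch; a missing config item defaults to enabled."""
    item = table.get_item(Key={"pk": "config", "sk": "global"}).get("Item")
    if item and item.get("tools_enabled") is False:
        return False, item.get("disabled_reason", "Tools are temporarily disabled.")
    return True, ""
```

If the first value is False, the handler short-circuits with an error response (say, a 503) carrying the reason string, so a tripped budget never produces another Bedrock call.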
Security hardening
A few key controls worth highlighting:
IAM least privilege. bedrock:InvokeModel is scoped to specific inference profile and foundation model ARNs, auto-generated from models.config.json by serverless-models.js. No wildcards on any IAM policy.
S3 private + OAC. No public access. Only CloudFront can read from the bucket.
CORS. API Gateway allowedOrigins is restricted to the CloudFront domain. The Lambda response headers themselves use Access-Control-Allow-Origin: * because the response helper doesn't know the domain; the API relies on rate limiting and daily caps (not auth tokens) for protection, and the gateway-level restriction is the meaningful one.
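For reference, the gateway-level restriction might look like this in serverless.yml (a sketch following the Serverless Framework's httpApi.cors schema; the exact keys in the real project may differ):

```yaml
provider:
  httpApi:
    cors:
      allowedOrigins:
        - https://dr-toolkit.thecloudspark.com  # CloudFront domain only
      allowedMethods:
        - POST
      allowedHeaders:
        - Content-Type
```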
Prompt injection defense. All handlers use Bedrock's system parameter to separate instructions from user input. More on this in Part 2.
Full details in the Security Assessment doc in the repo.
What's next
That covers the architecture: the serverless stack, the central config, the 5-layer cost guardrails, and the security controls.
In the next part, we'll look at the tools themselves: the prompts behind each one, how to choose the right model per tool, the system prompt pattern for prompt injection defense, and the patterns that are reusable in any Bedrock project.
Try it / Fork it:
Live Demo: https://dr-toolkit.thecloudspark.com
DR Toolkit on AWS
AI-powered disaster recovery planning tool for AWS builders. Plan, document, and audit your DR posture with Amazon Bedrock. Resilience planning, accelerated by generative AI.
Tools
| # | Tool | Endpoint | Model | Daily Limit |
|---|------|----------|-------|-------------|
| 1 | Runbook Generator | POST /runbook | Nova Pro | 50/day |
| 2 | RTO/RPO Estimator | POST /rto-estimator | Nova Lite | 50/day |
| 3 | DR Strategy Advisor | POST /dr-advisor | Nova Lite | 50/day |
| 4 | Post-Mortem Writer | POST /postmortem | Nova Lite | 50/day |
| 5 | DR Checklist Builder | POST /checklist | Nova Lite | 50/day |
| 6 | Template DR Reviewer | POST /dr-reviewer | Nova Pro | 30/day |
Architecture
- Frontend: Next.js 16 (static export) + Tailwind CSS → S3 + CloudFront
- Backend: AWS Lambda (Python 3.14) → API Gateway HTTP API
- AI: Amazon Bedrock — Nova Lite (Tools 2–5), Nova Pro (Tools 1, 6)
- Database: DynamoDB single table dr-toolkit-usage (usage counters + feature flag)
- IaC: Serverless Framework v3 (serverless.yml)
- Region: ap-southeast-1 (Singapore)
Project Structure
```
dr-toolkit/
├── serverless.yml    # Serverless Framework
…
```
References:
- Disaster Recovery of Workloads on AWS — AWS Whitepaper
- Amazon Bedrock Developer Guide
- Amazon Bedrock Model Catalog
- Amazon Bedrock Cross-Region Inference
- Amazon Bedrock — Anthropic Claude Parameters
- CloudFront Origin Access Control
Originally published on DEV Community: https://dev.to/aws-builders/buildwithai-architecting-a-serverless-dr-toolkit-on-aws-123d
