BDD Test Cases from User Stories: 5 Steps and 12 Scenarios
User stories set the destination. Test cases map every path, including the wrong ones.
In this post we walk through a 5-step BDD decomposition framework for turning a single user story into concrete Given/When/Then scenarios, with a complete worked coupon example. BDD (Behavior-Driven Development) structures tests as Given/When/Then scenarios that both stakeholders and tools like Cucumber or SpecFlow can understand.
Most teams write user stories in a format like:
As a checkout user, I want to apply a discount coupon, So that I pay less for my order.
Then they try to extract test cases directly from that sentence. The result is a handful of happy path scenarios and a nagging sense that something important was missed.
That instinct is right. This post explains why, and what to do about it.
Why User Stories Fail as Test Case Sources
User stories describe intent, not behavior. They answer why a feature exists, not what the system should do in every situation.
The coupon story above doesn't say:
- What happens if the coupon is expired
- What if the cart is below the minimum order value
- Whether a coupon can be applied more than once
- What if the coupon code is valid but belongs to a different user
- What if the user removes an item after applying the coupon
- What happens if the coupon service is temporarily unavailable
None of these are edge cases you invent. They're real scenarios that happen in production. They're just not in the user story because user stories aren't meant to capture them.
The gap between "the story" and "what needs testing" is where bugs live.
The 5-Step BDD Decomposition Framework
Don't try to extract test cases from the user story text directly. Use the story as a starting point for structured decomposition.
The framework in summary:
1. Identify the main scenario (the happy path)
2. Map every input and its valid boundaries
3. Add negative scenarios for each invalid input
4. Add boundary-condition scenarios at and around each threshold
5. Add error-handling and state-transition scenarios
Each step is covered in detail below.
Step 1: Identify the Main Scenario
What does the system do in the ideal case? This is your happy path, the scenario where everything works as intended.
```gherkin
Scenario: Valid coupon applies correct discount
  Given I have a cart with items totaling $50
  And I have a valid coupon "SAVE20" for 20% off
  When I apply the coupon
  Then the cart total should be $40
  And I should see "20% off applied"
```
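To make the happy path concrete, here is a minimal Python sketch of the domain logic this scenario exercises. The `Cart` and `Coupon` classes are hypothetical illustrations, not part of any framework; in a real BDD setup, step definitions (e.g., in Cucumber or behave) would drive code like this.

```python
from decimal import Decimal

# Hypothetical domain objects for the happy-path scenario.
class Coupon:
    def __init__(self, code: str, percent_off: int):
        self.code = code
        self.percent_off = percent_off

class Cart:
    def __init__(self, total: str):
        self.total = Decimal(total)  # Decimal avoids float rounding on money
        self.message = ""

    def apply_coupon(self, coupon: Coupon) -> None:
        # Given a valid coupon, reduce the total and record the banner text.
        discount = self.total * coupon.percent_off / Decimal(100)
        self.total -= discount
        self.message = f"{coupon.percent_off}% off applied"

# The scenario's Given/When/Then, line by line:
cart = Cart("50.00")                      # Given a cart totaling $50
cart.apply_coupon(Coupon("SAVE20", 20))   # When I apply the coupon
assert cart.total == Decimal("40.00")     # Then the total should be $40
assert cart.message == "20% off applied"  # And I should see the message
```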
This is what your product owner has in mind when they write the story. Necessary, but not close to enough.
Step 2: Map Every Input and Its Boundaries
Every input the user provides or the system receives has a valid range. List them explicitly.
| Input | Valid Range | Boundary Cases |
| --- | --- | --- |
| Coupon code | Alphanumeric, 4-20 chars | Empty, too short, too long, special chars |
| Expiry date | Today or future | Yesterday, today, tomorrow |
| Minimum cart value | Above minimum threshold (e.g., $30) | $29.99, $30.00, $30.01 |
| Usage limit | 0 - N uses per user | 0, 1, max, max+1 |
| User ownership | Owned by this user or global | Own coupon, other user's coupon, global |
Each boundary is a potential test case. The edge of the valid range is where errors happen.
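The first row of the table can be sketched as a validator, with each boundary case becoming a concrete check. The function name and regex are illustrative, assuming only the "alphanumeric, 4-20 chars" rule from the table:

```python
import re

def validate_coupon_code(code: str) -> bool:
    """Accept only alphanumeric codes between 4 and 20 characters."""
    return bool(re.fullmatch(r"[A-Za-z0-9]{4,20}", code))

# Each entry in the boundary column becomes one test case:
cases = {
    "": False,          # empty
    "ABC": False,       # too short (3 chars)
    "SAVE": True,       # minimum valid length (4)
    "A" * 20: True,     # maximum valid length
    "A" * 21: False,    # one character over the limit
    "SAVE-20": False,   # special character rejected
}
for code, expected in cases.items():
    assert validate_coupon_code(code) == expected
```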
Step 3: Add Negative Scenarios
For each input, ask: what should the system do when this input is invalid, expired, or out of range? Each answer is a test case.
```gherkin
Scenario: Expired coupon is rejected with clear message
  Given I have a cart with items totaling $50
  And I have an expired coupon "OLD20"
  When I apply the coupon
  Then I should see the error "This coupon has expired"
  And the cart total should remain $50
  And no discount should be applied
```
Step 4: Add Boundary Conditions
Boundary conditions are the exact limits of valid input: the values right at, just below, and just above each threshold.
A coupon with a $30 minimum order value needs three scenarios:
- Cart at $29.99 → coupon rejected
- Cart at $30.00 → coupon accepted
- Cart at $30.01 → coupon accepted
```gherkin
Scenario: Coupon rejected when cart is one cent below minimum
  Given I have a cart with items totaling $29.99
  And I have a valid coupon "SAVE20" with a $30 minimum
  When I apply the coupon
  Then I should see "Minimum order of $30 required"
  And the cart total should remain $29.99
```
Boundary conditions expose off-by-one errors, rounding bugs, and comparison operator mistakes (< vs <=). I've seen teams spend weeks debugging production issues that a single boundary test would have caught. The return on time invested is unusually high.
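The < vs <= distinction is easy to see in code. A minimal sketch, assuming the $30 threshold from the scenarios (the function name is hypothetical): the three boundary values together pin the comparison operator down, which neither "$20" nor "$50" alone would do.

```python
from decimal import Decimal

MINIMUM_ORDER = Decimal("30.00")  # threshold from the scenarios above

def coupon_allowed(cart_total: Decimal) -> bool:
    # The scenarios require exactly $30.00 to be accepted, so the
    # comparison must be >=, not >. Only the boundary tests catch
    # the difference.
    return cart_total >= MINIMUM_ORDER

assert not coupon_allowed(Decimal("29.99"))  # just below: rejected
assert coupon_allowed(Decimal("30.00"))      # exactly at: accepted
assert coupon_allowed(Decimal("30.01"))      # just above: accepted
```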
Step 5: Add Error Handling and State Scenarios
What happens if the coupon service is down? What if the user applies a coupon, then removes an item that drops the cart below the minimum?
State transitions (when valid inputs become invalid mid-session) are consistently under-tested. In most codebases I've worked on, they're not tested at all.
```gherkin
Scenario: Coupon auto-removed when cart drops below minimum
  Given I have a cart with items totaling $50
  And I have successfully applied coupon "SAVE20" with a $30 minimum
  When I remove items bringing the cart total to $25
  Then the coupon should be automatically removed
  And I should see "Coupon removed: cart below $30 minimum"
  And the cart total should be $25 with no discount
```
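The key design point in this scenario is that coupon validity must be re-checked on every cart change, not only at apply time. A minimal sketch, with hypothetical class and method names:

```python
from decimal import Decimal

class Cart:
    MINIMUM = Decimal("30.00")  # coupon's minimum order value

    def __init__(self, total: str):
        self.total = Decimal(total)
        self.coupon = None
        self.message = ""

    def apply_coupon(self, code: str) -> None:
        if self.total >= self.MINIMUM:
            self.coupon = code

    def remove_items(self, amount: str) -> None:
        self.total -= Decimal(amount)
        # Re-validate after every cart change: this is the state
        # transition the scenario above pins down.
        if self.coupon and self.total < self.MINIMUM:
            self.coupon = None
            self.message = "Coupon removed: cart below $30 minimum"

cart = Cart("50.00")
cart.apply_coupon("SAVE20")
cart.remove_items("25.00")              # cart drops from $50 to $25
assert cart.coupon is None              # coupon auto-removed
assert cart.total == Decimal("25.00")   # no discount lingers
```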
The Three Coverage Levels
Not every story needs exhaustive coverage immediately. A pragmatic approach uses three levels, chosen based on the risk profile of the feature.
| Coverage Level | Scenarios | Focus | Use When |
| --- | --- | --- | --- |
| Quick | 3 total | Happy path + obvious failures | Low-risk features, rapid prototyping, legacy system initial coverage |
| Standard | 8 total | Adds boundaries & realistic edge cases | Most production features |
| Exhaustive | 12 total | Adds security, concurrency, state transitions | Payment flows, authentication, data integrity, anything hard to roll back |
Choosing the right level is a judgment call about risk, not a technical decision. Features that touch money, health data, or user permissions need Exhaustive. A blog post draft autosave feature probably needs Quick.
Worked Example: Coupon Application
User story: As a checkout user, I want to apply a discount coupon, so that I pay less for my order.
Quick Coverage (3 scenarios)
| # | Scenario | Tests |
| --- | --- | --- |
| 1 | Valid coupon applied | Correct discount shown, cart total updated |
| 2 | Invalid coupon code | Clear error message, cart unchanged |
| 3 | Expired coupon | Expiry-specific error message |
```gherkin
Scenario: Valid coupon applies correct discount
  Given I have a cart with items totaling $50
  And I have a valid coupon "SAVE20" for 20% off
  When I apply the coupon
  Then the cart total should be $40
  And I should see "20% off applied"

Scenario: Invalid coupon code is rejected
  Given I have a cart with items totaling $50
  When I apply coupon "NOTREAL"
  Then I should see "Coupon not found"
  And the cart total should remain $50

Scenario: Expired coupon is rejected
  Given I have a cart with items totaling $50
  And I have an expired coupon "OLD20"
  When I apply the coupon
  Then I should see "This coupon has expired"
  And the cart total should remain $50
```
Standard Coverage Adds (5 more scenarios, 8 total)
| # | Scenario | Tests |
| --- | --- | --- |
| 4 | Cart below minimum order value | Minimum order error with threshold displayed |
| 5 | Single-use coupon applied twice | "Coupon already used" error |
| 6 | Coupon applied, then item removed below minimum | Coupon auto-removed, user notified |
| 7 | Cart at exactly minimum value | Coupon applies successfully |
| 8 | Cart $0.01 below minimum | Coupon not applicable |
```gherkin
Scenario: Coupon rejected when cart is one cent below minimum
  Given I have a cart with items totaling $29.99
  And I have a valid coupon "SAVE20" with a $30 minimum order
  When I apply the coupon
  Then I should see "Minimum order of $30 required"
  And the cart total should remain $29.99
```
Scenarios 7-8 are the boundary pair for the minimum threshold. Testing both sides of the boundary catches < vs <= bugs that unit tests often miss.
Exhaustive Coverage Adds (4 more scenarios, 12 total)
| # | Scenario | Tests |
| --- | --- | --- |
| 9 | Coupon belonging to another user | "Coupon not found" (not "belongs to someone else") |
| 10 | Two simultaneous coupon applications | Only one applied, no double discount |
| 11 | Coupon code with SQL injection attempt | Sanitized and treated as invalid |
| 12 | Coupon applied, page refreshed | State persisted correctly |
These deserve additional commentary.
Scenario 9: User Enumeration Prevention
At first glance, you might write the expected behavior as: "When I apply a coupon that belongs to another user, I see 'This coupon belongs to a different account.'"
That sounds helpful, but it leaks information. It confirms the coupon exists and is assigned to someone. An attacker can use this to enumerate valid coupon codes or confirm which users have coupons.
The correct behavior returns a generic "Coupon not found" message, identical to what you'd see for a coupon that doesn't exist at all.
```gherkin
Scenario: Coupon belonging to another user returns generic not-found
  Given I have a cart with items totaling $50
  And a coupon "PRIVATE10" exists but belongs to another user
  When I apply coupon "PRIVATE10"
  Then I should see "Coupon not found"
  And the cart total should remain $50
```
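An enumeration-safe lookup collapses every failure mode into the same response. A minimal sketch with a toy in-memory store; the names (`lookup_coupon`, `COUPONS`) are hypothetical:

```python
# Toy coupon store: "PRIVATE10" exists but belongs to another user.
COUPONS = {"PRIVATE10": {"owner": "user_b"}}
GENERIC_ERROR = "Coupon not found"

def lookup_coupon(code: str, user_id: str):
    coupon = COUPONS.get(code)
    if coupon is None:
        return None, GENERIC_ERROR  # code does not exist
    if coupon["owner"] not in (user_id, "global"):
        # Exists but isn't yours: SAME message, so an attacker can't
        # distinguish "wrong owner" from "no such code".
        return None, GENERIC_ERROR
    return coupon, ""

# Both failure modes are indistinguishable to the caller:
_, err_missing = lookup_coupon("NOTREAL", "user_a")
_, err_foreign = lookup_coupon("PRIVATE10", "user_a")
assert err_missing == err_foreign == GENERIC_ERROR
```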
Scenario 10: Race Condition / Double Discount
If a user opens two tabs and clicks "Apply Coupon" simultaneously, what happens? Without explicit concurrency handling, you can end up with double discounts or corrupted cart state. I've seen promotions lose thousands of dollars to this exact bug during flash sales.
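One common fix is to make the apply operation idempotent behind a synchronization primitive. A minimal in-process sketch with an illustrative `Cart` class; a real system would use a database constraint or compare-and-set rather than a thread lock, but the shape of the guard is the same:

```python
import threading

class Cart:
    def __init__(self, total: int):
        self.total = total
        self._lock = threading.Lock()
        self._applied = False

    def apply_discount(self, amount: int) -> bool:
        with self._lock:
            if self._applied:
                return False  # second tab: no-op, no double discount
            self._applied = True
            self.total -= amount
            return True

# Simulate two tabs clicking "Apply Coupon" at the same time:
cart = Cart(50)
results = []
threads = [
    threading.Thread(target=lambda: results.append(cart.apply_discount(10)))
    for _ in range(2)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert cart.total == 40           # discounted exactly once
assert results.count(True) == 1   # only one apply succeeded
```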
Scenario 11: Input Sanitization
The coupon code field accepts user input. That input will hit your database. Malicious input should be sanitized and rejected as "invalid code," not cause errors or, worse, execute.
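The standard defense is parameterized queries: the driver binds the input as data, so a payload can never become SQL. A minimal sketch using Python's built-in `sqlite3` with an illustrative schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE coupons (code TEXT PRIMARY KEY, percent INTEGER)")
conn.execute("INSERT INTO coupons VALUES ('SAVE20', 20)")

def find_coupon(code: str):
    # The ? placeholder binds the value as data; an injection payload
    # is just a string that matches no row.
    row = conn.execute(
        "SELECT percent FROM coupons WHERE code = ?", (code,)
    ).fetchone()
    return row[0] if row else None

assert find_coupon("SAVE20") == 20
assert find_coupon("' OR '1'='1") is None            # injection: no match
assert find_coupon("x'; DROP TABLE coupons; --") is None
assert find_coupon("SAVE20") == 20                   # table still intact
```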
Scenario 12: State Persistence
If a user applies a coupon then refreshes the page, is the coupon still applied? This tests session or database persistence. It sounds obvious, but I've seen carts silently drop discounts on refresh because the coupon was stored client-side only.
Common Mistakes
Four mistakes I see repeatedly in BDD scenarios.
Mistake 1: Testing the Implementation, Not the Behavior
BDD scenarios should describe what the system does from a user's perspective, not how it does it internally.
Bad:
```gherkin
Scenario: Click apply button sends API request
  Given I am on the cart page
  When I click the "Apply Coupon" button
  Then a POST request is sent to /api/v1/coupons/apply
```
Good:
```gherkin
Scenario: Valid coupon applies correct discount
  Given I have a cart with items totaling $50
  When I apply coupon "SAVE20"
  Then my cart total should be $40
```
The bad version tests implementation details. The good version tests behavior. Only the second survives a refactor.
Mistake 2: Skipping Boundary Conditions
Teams often test "valid" and "invalid" but miss the exact threshold. If the minimum order is $30, test $29.99, $30.00, and $30.01 — not just "$20" and "$50."
Mistake 3: Coupling to UI Elements
Scenarios that say "When the user clicks the Apply Coupon button" couple your test cases to your UI. Describe actions at the domain level: "When the user applies coupon SAVE20."
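One way to keep that decoupling is to route the domain action through a swappable driver in the step-definition layer. A minimal sketch with hypothetical class names; in Cucumber or behave this lives in the step definitions, never in the feature file:

```python
class UiDriver:
    """Today: drive the browser (stubbed here for illustration)."""
    def apply_coupon(self, code: str) -> str:
        return f"clicked Apply with {code}"

class ApiDriver:
    """Tomorrow: call the API directly. Same domain action."""
    def apply_coupon(self, code: str) -> str:
        return f"POST /coupons with {code}"

def when_the_user_applies_coupon(driver, code: str) -> str:
    # The scenario text stays 'the user applies coupon SAVE20';
    # only the driver changes when the UI does.
    return driver.apply_coupon(code)

assert "SAVE20" in when_the_user_applies_coupon(UiDriver(), "SAVE20")
assert "SAVE20" in when_the_user_applies_coupon(ApiDriver(), "SAVE20")
```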
Mistake 4: One Huge Scenario Instead of Multiple Focused Ones
A scenario with twelve steps and three outcomes is testing three things at once. When it fails, you don't know which of the three things broke. One behavior per scenario.
Where AI Fits: Honestly
AI test case generation is useful but uneven. Worth being specific about where it helps and where it doesn't.
AI handles well:
- Volume. Given a well-structured user story, AI can generate a complete scenario set in seconds rather than the hour it might take manually.
- Consistency. AI doesn't forget null inputs or boundary values — the inputs you'd normally catch on the thirtieth scenario of the day when you're no longer paying attention.
- Format compliance. Gherkin syntax is strict, and AI follows it reliably.
- Coverage breadth. AI applies the decomposition systematically across every input. It won't skip Step 4 because it's Friday afternoon.
AI handles poorly:
- Domain-specific rules. Your business has constraints that aren't in the user story: user enumeration risks, regional compliance requirements, product-specific logic. AI doesn't know these unless you tell it explicitly.
- Organizational context. Which test cases already exist? Which are covered by integration tests at a lower level? AI doesn't have this information and will generate duplication.
- Coverage level judgment. Deciding whether a feature needs Quick, Standard, or Exhaustive coverage is a risk assessment about your system, your users, and your rollback capability. That's a human call.
What actually works: let AI handle Steps 2-5 at the coverage level you choose, then review and add the domain constraints it can't know. You get AI's consistency with your context.
Summary
The gap between user stories and test cases is where bugs live. A structured decomposition approach closes that gap:
1. Identify the main scenario — the happy path
2. Map every input and its boundaries — the raw material for edge cases
3. Add negative scenarios — what happens when inputs are invalid
4. Add boundary conditions — the exact thresholds where errors hide
5. Add error handling and state transitions — the scenarios that survive production
Choose your coverage level based on risk: Quick for low-stakes features, Standard for most production work, Exhaustive for anything involving money, security, or data integrity.
This is part of the Practical Test Design series. I'm building KRINO Dokim, a tool that applies this 5-step framework automatically — you choose the coverage level, it generates the scenarios, and you review with full visibility into why each scenario was created. If you work with user stories and BDD, I'd genuinely appreciate your feedback on the approach.
What coverage level does your team default to? Have you run into the user enumeration trap (Scenario 9)? I'd love to hear in the comments.
Originally published on DEV Community: https://dev.to/krinosystems/bdd-test-cases-from-user-stories-5-steps-and-12-scenarios-g76