
I Asked AI to Do Agile Sprint Planning (GitHub Copilot Test)

DEV Community · by Incomplete Developer · April 1, 2026 · 5 min read


AI tools are getting very good at writing code.

GitHub Copilot can generate entire functions, review pull requests, and even help refactor legacy codebases. But software development isn’t just about writing code.

A big part of the process is planning the work.

So I decided to run a small experiment:

Can AI actually perform Agile sprint planning?

Using GitHub Copilot inside Visual Studio 2026, I asked the AI to review a legacy codebase and generate a Scrum sprint plan for rewriting the application.

The results were… interesting.

Watch Video: https://www.youtube.com/embed/ErwuATHHXw4

The Setup

The experiment was intentionally simple.

I gave Copilot an existing codebase and asked it to:

  • Review the code

  • Analyze the architecture

  • Generate a Scrum sprint plan for rewriting the project

I also added some realistic constraints:

  • Only one developer is working on the rewrite

  • The developer works 5 hours per day

  • Sprints are 2 weeks long

  • Only 7 days per sprint are development days
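As a rough sketch, the constraints above translate into a fixed capacity per sprint. The class and field names below are illustrative assumptions, not something Copilot or the experiment produced.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SprintConstraints:
    """Constraints from the experiment; the names are hypothetical."""
    developers: int = 1
    hours_per_day: float = 5.0
    dev_days_per_sprint: int = 7  # out of a 2-week sprint

    def capacity_hours(self) -> float:
        # Total development hours available in one sprint.
        return self.developers * self.hours_per_day * self.dev_days_per_sprint


constraints = SprintConstraints()
print(constraints.capacity_hours())  # 35.0
```

Any sprint plan generated under these constraints has about 35 development hours to fill, which is the yardstick for the estimates discussed later.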

One important limitation:

The AI was not given any historical sprint velocity or team metrics.

That matters a lot, because in real Agile teams, effort estimates rely heavily on historical data.

But even for humans, sprint estimation is notoriously difficult, so this seemed like a good test.
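To illustrate why the missing history matters, here is a minimal sketch of a common velocity forecast: the average of the last few completed sprints. The function name, the three-sprint window, and the sample numbers are all assumptions for illustration; the experiment had no such data.

```python
from statistics import mean


def forecast_velocity(completed_points: list[int], window: int = 3) -> float:
    """Average story points completed over the most recent sprints."""
    if not completed_points:
        # This is the situation the AI was in: nothing to base a forecast on.
        raise ValueError("no historical sprints, velocity cannot be forecast")
    return float(mean(completed_points[-window:]))


history = [18, 22, 20, 24]  # hypothetical past sprints, in story points
print(forecast_velocity(history))  # 22.0
```

With an empty history the function can only raise an error, which mirrors the position the model was put in.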

Test 1 — ChatGPT 5.1 Codex Mini

The first model I tested was ChatGPT 5.1 Codex Mini.

It produced what it described as a detailed sprint plan, but the result was very high level and vague.

Looking at the structure of the plan, something immediately felt wrong.

The sprint plan looked more like Waterfall development than Agile.

Examples:

  • Sprint 1 focused only on low-level domain entities

  • Nothing usable was produced

  • Tests were scheduled in Sprint 3

  • Documentation and final sign-off appeared in the last sprint

This is basically the opposite of what Scrum tries to achieve.

Agile is about delivering working software incrementally.

Instead, the plan delayed meaningful output until much later.

So for this task:

Codex Mini failed.

It didn’t appear to understand the practical workflow of Agile development.

Test 2 — ChatGPT 5.1 Codex

Next I tested the full ChatGPT 5.1 Codex model.

This time I changed the workflow slightly.

First, I asked Copilot to perform a code review.

The review itself required around three premium requests, but the output was reasonable.

After that, I asked the model to produce a sprint plan for the rewrite.

At first glance, the result looked much better.

The AI used the correct Scrum terminology:

  • Definition of Ready

  • Definition of Done

  • Sprint goals

  • Backlog features

But once again, reading the details told a different story.

When AI Sounds Smart (But Isn't)

The output looked convincing.

But many parts were too vague to be useful.

For example, the Definition of Done included generic statements but no measurable criteria.

The sprint plans also contained unrealistic assumptions.

Example: Sprint 1

Sprint 1 was titled:

Foundation & Structure – Establish .NET 10 Clean Architecture

The tasks themselves were mostly reasonable.

In fact, about 80% of them made sense technically.

But the time estimates were way off.

Given the constraints I provided, the work listed in Sprint 1 would realistically take about 10 hours.

At 5 hours per day, that is roughly two development days out of the seven available, so it could likely be finished by day three of the sprint.
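The arithmetic behind that judgment can be sketched as follows. The 10-hour figure is the rough estimate given above, the other numbers are the stated constraints, and the function name is a hypothetical helper.

```python
import math

HOURS_PER_DAY = 5
DEV_DAYS_PER_SPRINT = 7


def sprint_fill(estimated_hours: float) -> tuple[int, float]:
    """Return (development days needed, fraction of sprint capacity used)."""
    capacity = HOURS_PER_DAY * DEV_DAYS_PER_SPRINT  # 35 hours
    days_needed = math.ceil(estimated_hours / HOURS_PER_DAY)
    return days_needed, estimated_hours / capacity


days, utilization = sprint_fill(10)
print(f"{days} of {DEV_DAYS_PER_SPRINT} days, {utilization:.0%} of capacity")
# prints "2 of 7 days, 29% of capacity"
```

A sprint that uses under a third of the available hours is a sign that the plan was sized without reference to the stated capacity.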

The same issue continued in Sprint 2.

Most of the work involved mechanical migration tasks, like converting entities into a newer format.

But there was still no real domain logic being implemented.

Another Agile Anti-Pattern

Looking at the overall milestone plan revealed another issue.

The AI scheduled Service Layer work and Test Coverage near the end of Sprint 3.

Again, this starts to look more like Waterfall than Agile.

Testing and functional behavior should evolve throughout the development process, not suddenly appear late in the project.
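One way to make this check mechanical is to flag any sprint that ships nothing usable or contains no testing. The sketch below is an assumption-laden illustration: the `Sprint` fields and the sample plan are hypothetical and only loosely modeled on the plans described here.

```python
from dataclasses import dataclass


@dataclass
class Sprint:
    name: str
    has_usable_increment: bool  # does the sprint deliver something a user can run?
    includes_tests: bool        # is testing part of this sprint's work?


def waterfall_smells(plan: list[Sprint]) -> list[str]:
    """List the sprints that defer working software or testing."""
    findings = []
    for sprint in plan:
        if not sprint.has_usable_increment:
            findings.append(f"{sprint.name}: no usable increment")
        if not sprint.includes_tests:
            findings.append(f"{sprint.name}: no tests")
    return findings


# Hypothetical plan where tests and working behavior arrive only in Sprint 3.
plan = [
    Sprint("Sprint 1", has_usable_increment=False, includes_tests=False),
    Sprint("Sprint 2", has_usable_increment=False, includes_tests=False),
    Sprint("Sprint 3", has_usable_increment=True, includes_tests=True),
]
for finding in waterfall_smells(plan):
    print(finding)
```

A plan that follows Scrum's incremental intent would produce an empty list.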

What This Experiment Reveals

Both models produced a lot of output that looked intelligent.

There were structured plans, Agile terminology, and detailed explanations.

But much of it was surface-level reasoning.

The AI struggled with the deeper realities of software development:

  • Understanding the actual complexity of the codebase

  • Identifying where business logic must be redesigned

  • Estimating effort realistically

Rewriting a system usually isn’t just about migrating code.

The core domain logic often needs to be rethought completely.

Even experienced developers usually need to spend several days exploring a codebase before they can estimate the real work.

Was This Experiment Fair?

Not entirely.

Real sprint planning relies on information that the AI did not have access to:

  • Historical sprint velocity

  • Team estimation practices

  • Working knowledge of the codebase, built up over time

  • Developer discussions and consensus

Many teams use planning poker, where multiple developers estimate effort and converge on a shared estimate.

That process relies heavily on human experience with the system.

AI simply doesn’t have that context.
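For readers unfamiliar with planning poker, a single round can be sketched like this. The consensus rule (accept when all cards are at most one step apart, then take the higher card) is one assumed convention; teams differ, and the names and votes are hypothetical.

```python
from typing import Optional

DECK = (1, 2, 3, 5, 8, 13, 21)  # a typical Fibonacci-style deck


def poker_round(votes: dict[str, int], max_gap: int = 1) -> Optional[int]:
    """Return the agreed estimate, or None if the team should discuss and re-vote."""
    if not votes:
        raise ValueError("at least one vote is required")
    for name, card in votes.items():
        if card not in DECK:
            raise ValueError(f"{name} played a card that is not in the deck: {card}")
    positions = sorted(DECK.index(card) for card in votes.values())
    if positions[-1] - positions[0] <= max_gap:
        return DECK[positions[-1]]  # close enough: settle on the higher card
    return None  # estimates diverge, so the outliers explain their reasoning


print(poker_round({"dev_a": 3, "dev_b": 5, "dev_c": 5}))  # 5
print(poker_round({"dev_a": 2, "dev_b": 13}))             # None
```

The `None` branch is where the human discussion happens, and that discussion is the context a model working alone does not have.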

Final Verdict

So can AI perform realistic Agile sprint planning?

Not really.

AI can definitely help with:

  • Code reviews

  • Architecture analysis

  • Backlog documentation

  • Technical recommendations

But sprint planning still requires human judgment, especially when dealing with legacy systems and complex business logic.

AI can assist developers.

But deciding what can realistically fit into a sprint is still something teams need to do themselves.

Watch Spec-Driven Dev Playlist on YouTube

