Engineering & delivery · The planning shift

Your sprint isn’t the bottleneck anymore.

AI agents have compressed the slow part of software — writing the code — from weeks to hours. That’s the good news. The catch: the queue didn’t vanish, it moved — to deciding what to build, writing it down clearly enough to hand off, and reviewing what comes back. If your delivery numbers start behaving strangely this year, this is why, and here’s what to do about it.

Sources · Deloitte State of AI 2026, ThoughtWorks Radar Vol. 33 · 6-min read · For decision-makers

01 · What actually changed

Building got cheap. Deciding, specifying and reviewing didn’t.

For thirty years, writing the code was the expensive, slow step, and everything in software management was built to schedule around it. Agents removed that constraint — and revealed the three steps that were always underneath it.

The old bottleneck

Writing the code

Build time dominated. Plans, estimates and team size all revolved around how long it took people to type the software into existence.

Fix the typing speed, fix the delivery.

The new bottleneck

Deciding · specifying · reviewing

When code is fast, the queue moves upstream and downstream: choosing what’s worth building, describing it clearly enough for an agent to execute, and checking what it produced.

The work that’s left is the human judgement.

Why it matters to you: if your team is still organised entirely around the old bottleneck — measuring output by how much code got written — you’re optimising the step that’s no longer the constraint. The leverage now is in how clearly work is defined and how fast good work gets reviewed and shipped.

02 · The strange numbers

Your velocity will look weird. That’s a recalibration, not a crisis.

“Velocity” is the number most engineering teams use to estimate how much they can deliver in a cycle. It was calibrated on how fast people write code — so when agents write most of it, the number moves sharply, and it stops meaning what it used to.

The trap to avoid

A bigger velocity number is not a better result if the work is piling up unreviewed. Code that’s been drafted but is stuck waiting for a human to check it hasn’t been delivered — and counting it as progress just hides the real queue. When someone shows you a chart with a thrilling new number on it, the question to ask is: how much of that has actually shipped and been verified?

Reset the expectation before the number does it for you. Tell your board and your team, plainly, that the old measure is being recalibrated — that what you’ll now watch is finished, reviewed, shipped work and how quickly it moves, not how much got typed. That’s a performance-measurement problem to fix, not a performance problem to panic about.
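For the engineering lead, a minimal sketch of what "finished, reviewed, shipped" could look like as a number. The `WorkItem` fields and the `delivery_summary` helper are hypothetical names chosen for illustration, not part of any particular tracker; the idea is only that an item counts as delivered once it has both a review date and a ship date.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class WorkItem:
    """One piece of work. Field names are illustrative assumptions."""
    title: str
    drafted_on: date
    reviewed_on: Optional[date] = None
    shipped_on: Optional[date] = None


def delivery_summary(items: List[WorkItem]) -> dict:
    """Separate delivered work from work that is only drafted."""
    # Delivered means reviewed AND shipped; drafted-only work does not count.
    delivered = [i for i in items if i.reviewed_on and i.shipped_on]
    waiting = [i for i in items if i.reviewed_on is None]
    days = [(i.shipped_on - i.drafted_on).days for i in delivered]
    return {
        "drafted": len(items),
        "delivered": len(delivered),
        "waiting_for_review": len(waiting),
        "avg_days_to_ship": sum(days) / len(days) if days else None,
    }


# Made-up example data: four items drafted, only two reviewed and shipped.
items = [
    WorkItem("A", date(2026, 1, 5), date(2026, 1, 6), date(2026, 1, 7)),
    WorkItem("B", date(2026, 1, 5), date(2026, 1, 8), date(2026, 1, 9)),
    WorkItem("C", date(2026, 1, 6)),
    WorkItem("D", date(2026, 1, 6)),
]
summary = delivery_summary(items)
print(summary)
```

In this made-up data the "drafted" count is four, but only two items were delivered, and two are still waiting for review. That gap is the queue a raw velocity number would hide.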

03 · The conversation to have

Four questions to ask your engineering lead.

Are our work items written so an agent could execute them?

Does each task say what “done” looks like, what’s in and out of scope, and how it’ll be checked? That clear brief is now the thing that determines quality — more than the model.

What’s our review capacity?

If agents can produce ten changes a day and we can only carefully review three, seven a day pile up unreviewed, and review is the bottleneck. Can we review faster without lowering the bar?

Who owns priority?

Deciding what matters most is a business judgement, not something to hand the AI. A person must own the ranking — the agent can prepare the list, but it can’t choose what’s worth doing.

Who’s accountable for what ships?

“The model wrote it” is not an answer an auditor accepts. A named human still approves each change — that’s not friction, it’s accountability.
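The review-capacity question above comes down to simple arithmetic, sketched here under the assumption that production and review rates stay constant from day to day (real rates will vary). The function name `review_backlog` is hypothetical; the numbers are the ten-produced, three-reviewed example from the text.

```python
from typing import List


def review_backlog(produced_per_day: int, reviewed_per_day: int, days: int) -> List[int]:
    """Return the size of the unreviewed queue at the end of each day."""
    backlog = 0
    history = []
    for _ in range(days):
        backlog += produced_per_day
        # Reviewers cannot review more than what is waiting.
        backlog -= min(backlog, reviewed_per_day)
        history.append(backlog)
    return history


# Ten changes produced and three reviewed per day, over a five-day week.
week = review_backlog(produced_per_day=10, reviewed_per_day=3, days=5)
print(week)  # [7, 14, 21, 28, 35]
```

At those rates the queue does not level off: it grows by seven changes every day, so a single week leaves 35 changes waiting for a human.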

04 · The governance line

The fix is boring on purpose — and that’s the good news.

Most companies are running ahead of their guardrails: Deloitte’s State of AI 2026 finds only about one in five has a mature way of governing autonomous AI agents. The instinct is to convene a committee. The better answer is more mundane.

What good governance actually looks like

It isn’t an AI ethics board. It’s three boring engineering habits your team may already have: a clear written brief for each piece of work (the spec), every change reviewed and approved before it goes live (the pull request), and automated checks that run on every change (the test pipeline). Together, those three give you a record of who changed what, why, and who signed off. That is exactly what a review under POPIA (South Africa’s Protection of Personal Information Act) or by the FSCA (Financial Sector Conduct Authority) asks for, produced as a by-product of working this way rather than assembled in a panic.
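As a rough illustration for the engineering lead, the three habits can be expressed as one record per change plus a check for what is missing. Everything here is an assumption made for the sketch: the `Spec` and `ChangeRecord` classes, their field names, and the `audit_gaps` helper are hypothetical and do not correspond to any specific tool or regulatory template.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Spec:
    """The written brief: what done looks like, scope, and how it is checked."""
    done_criteria: str
    in_scope: str
    out_of_scope: str
    how_checked: str


@dataclass
class ChangeRecord:
    """One change, whoever or whatever wrote it. Fields are illustrative."""
    change_id: str
    author: str                 # a person or an agent
    spec: Optional[Spec]        # the brief
    approved_by: Optional[str]  # named human who approved the pull request
    checks_passed: bool         # result of the automated test pipeline


def audit_gaps(record: ChangeRecord) -> List[str]:
    """List what is missing from the record; an empty list means no gaps."""
    gaps = []
    if record.spec is None:
        gaps.append("no written brief")
    else:
        for name, value in vars(record.spec).items():
            if not value.strip():
                gaps.append(f"brief is missing: {name}")
    if not record.approved_by:
        gaps.append("no named human approver")
    if not record.checks_passed:
        gaps.append("automated checks did not pass")
    return gaps


# Made-up records: one complete, one written by an agent with no brief or approver.
complete = ChangeRecord(
    "CH-1", "agent",
    Spec("invoice totals match ledger", "billing export", "UI changes", "unit tests"),
    approved_by="N. Dlamini", checks_passed=True,
)
incomplete = ChangeRecord("CH-2", "agent", spec=None, approved_by=None, checks_passed=True)

print(audit_gaps(complete))    # []
print(audit_gaps(incomplete))  # ['no written brief', 'no named human approver']
```

The design choice is that the record is filled in as the work happens, so answering "who signed off on this?" becomes a lookup rather than a reconstruction.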

05 · When it isn’t worth it

The honest limits.

This discipline is not free, and it doesn’t pay off everywhere. Two cases where the machinery costs more than it returns:

Where to hold off

Small teams and low volume. If a couple of people ship a handful of changes a week, writing formal briefs for each one is overhead you’ll feel and won’t recoup. Keep it lightweight.

Genuinely exploratory work. When you’re still figuring out what to build — early product, pre-market-fit — a firm written spec is just a confident guess you’ll rewrite next week. Specify once you know; explore before you do.

The rule of thumb: the more your work is repeatable and accountable, the more this pays; the more it’s small or exploratory, the lighter you keep it.

The take

Agents didn’t remove the work — they moved it. From typing code to deciding what’s worth building, writing it down clearly, and reviewing what comes back. Manage those three, and the speed takes care of itself.