June 18, 2026

Ticket Smells: How to Spot a Bad Slice Before You Pull It

A bad ticket rarely announces itself. It looks fine in refinement, earns a confident estimate, and then detonates at the end of the sprint: twice the work, a hidden dependency, or a description only its author can decode. The fix is not heroics. It is learning to smell the rot early, the way you already smell a code smell. Here are the four ticket smells that cost teams the most, the research on why they hurt, and a checklist your team can run before anything reaches the board.

Sascha Becker

Author

About 19 minutes

Every team has the ticket. It read like a two-pointer. "Add a field to the export." Someone pulled it on Monday morning, confident. By Thursday it had eaten a database migration, a translation pass, and an argument about backfilling old records. The sprint review was awkward.

That ticket did not get unlucky. It was rotten when it entered the board, and the rot was visible to anyone who knew what to sniff for.

You already do this with code. A function with five boolean parameters, a utils.ts that is now four thousand lines, a comment that says "do not touch": you feel the wrongness before you can prove it. That instinct has a name. Kent Beck coined "code smell" and Martin Fowler made it famous: a surface signal of a deeper problem. Tickets have smells too. The difference is that nobody taught you to notice them, so they keep detonating at the end of the sprint instead of getting caught at the start.

This is a field guide to the four ticket smells that cost teams the most: why each one hurts (the research is clearer than you would expect), the tell that gives it away on the board, and the cheapest fix. The goal is not a perfect ticket. It is the instinct, so a bad slice makes you wrinkle your nose before you ever drag it into "In Progress."

Smell one: the Iceberg

It shows you the tip. The mass is underwater.

The Iceberg looks small and is not. "Just add a field." "Simply migrate the report." The surface is one input box or one mechanical move. Underneath sit a migration, an API contract change, an i18n pass, and a backfill nobody mentioned.

Figure: a small yellow sticky note floats at the waterline like the tip of an iceberg, while a huge tangled mass of cables, gears, and database drives hangs hidden beneath the surface. The ticket shows you the sticky note. The migration, the contract change, and the backfill are all underwater.

This is not carelessness. It is wiring. The planning fallacy is the documented human tendency to underestimate our own tasks even when we remember the last one running long. In one study, students estimating how long their own thesis would take predicted an average of 33.9 days; the actual average was 55.5 days, worse than their own worst-case guess.¹ And estimates are widest exactly when you know least. Steve McConnell's Cone of Uncertainty puts the earliest estimate up to four times too high or too low, a sixteenfold spread, and it narrows only as you resolve unknowns, not as time passes.²
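The sixteenfold figure is plain arithmetic, and a few lines of Python make it concrete. This is a sketch, not McConnell's model: the factor of four is the one quoted above, while the function name and the 10-day estimate are made up for illustration.

```python
def early_estimate_range(estimate_days: float, factor: float = 4.0) -> tuple[float, float]:
    """Bounds for an early estimate that may be `factor` times too high or too low."""
    return estimate_days / factor, estimate_days * factor


low, high = early_estimate_range(10)  # hypothetical 10-day estimate
print(f"{low} to {high} days, a {high / low:.0f}x spread")
# prints: 2.5 to 40.0 days, a 16x spread
```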

The person who wrote the ticket is the worst placed to see the iceberg, because of the curse of knowledge: once you know a system, the hard parts compile into reflex and stop looking like work. (More on that one later. It is its own beast.)

The tell

Minimizing verbs: "just," "simply," "only," "quickly." The smaller the word, the bigger the berg.
Vague action verbs with no object: "update," "handle," "support," "integrate," with nothing saying what done looks like.
Thin or missing acceptance criteria. Nothing about edge cases, migration, or existing data.
A wide spread in estimation. When one person says 3 and another says 13, that gap is not noise to average away.
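The first tell is mechanical enough to lint for. A minimal sketch, assuming tickets are available as plain text: the word list is the one named above, and `minimizing_words` is a hypothetical helper, not part of any tracker's API.

```python
import re

MINIMIZERS = ("just", "simply", "only", "quickly")


def minimizing_words(ticket_text: str) -> list[str]:
    """Return the minimizing words that appear in the ticket as whole words."""
    words = set(re.findall(r"[a-z]+", ticket_text.lower()))
    return [word for word in MINIMIZERS if word in words]


print(minimizing_words("Just add a field to the export"))  # ['just']
print(minimizing_words("Adjust the export layout"))        # []
```

A hit is a prompt to ask what is underwater, not proof of an iceberg.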

The fix

When a story is too big to estimate confidently, that is not a reason to guess harder. Bill Wake, who coined the INVEST checklist for good stories, notes that "it would take me more than a month" almost always carries a hidden second half: "because I do not understand what it would entail."⁴ Big size and low understanding travel together.

Two moves. Split it before you commit. Or, if the uncertainty is real, run a spike: a timeboxed investigation whose only job is to buy enough knowledge to estimate or split. Keep spikes rare; Mike Cohn warns that overusing them just extends your time to value.⁵

The wide spread is a gift

In planning poker, the disagreement is the data. Do not split the difference and move on. Ask the 3 and the 13 to explain themselves. One of them is seeing an iceberg the other isn't, and that conversation is the entire point of estimating together.³
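If your team records its planning poker votes, the "stop and talk" rule can be made explicit. A sketch under assumptions: the threshold of 2.0 is an arbitrary starting point, not a researched value, and `needs_discussion` is a hypothetical name.

```python
def needs_discussion(estimates: list[int], max_ratio: float = 2.0) -> bool:
    """True when the highest vote is more than `max_ratio` times the lowest."""
    if not estimates:
        raise ValueError("need at least one estimate")
    lowest, highest = min(estimates), max(estimates)
    if lowest <= 0:
        raise ValueError("estimates must be positive")
    return highest / lowest > max_ratio


print(needs_discussion([3, 13]))    # True: someone sees an iceberg
print(needs_discussion([3, 5, 5]))  # False: close enough to move on
```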

Smell two: the Siamese Twins

Two tickets on the board, one piece of work in reality.

Someone split the feature, so the board shows two cards: "Build the login API" and "Build the login screen." It feels like progress. It is an illusion. The screen has nothing to call. The API has nothing to show. Neither can be demoed alone, and the integration, where the bugs actually live, hides in a third step nobody ticketed.

This is horizontal slicing: cutting work by technical layer (backend, frontend, database) instead of by value. Humanizing Work is blunt about it. Splitting by architectural layer "may satisfy small, but it fails at independent and valuable."⁶ A layer is not a slice. A database table with no screen delivers nothing a stakeholder can see, which is why Wake put Independent first in INVEST and called overlap "the most painful form of dependency."⁴

The cost is flow. Coupled tickets cannot move on their own. They wait on each other, they force handoffs, and they push integration risk to the end of the sprint where there is no time left to absorb it.⁷

The tell

The two cards share a branch or a single pull request. Two IDs, one merge.
Neither can be demoed on its own.
A standing "blocked by" link between them, especially a chain.
They are always estimated together and always assigned to the same person in the same sprint.
One card is named after a layer, not an outcome: "Backend for X," "API for X," "UI for X."

The fix

Slice vertically. Replace "build the API" plus "build the screen" with one thin story that goes top to bottom: "A user can log in with a valid email and password," UI, service, and storage in one demoable increment. If that is too big, keep slicing vertically, not horizontally: happy path first, then wrong password handling, then lockout. Each slice is full stack and shippable.

When two teams genuinely must work in parallel, decouple with a contract first: agree the API shape up front so the frontend builds against a stub while the backend builds the real thing, with no hard "blocked by" between them. Or build a walking skeleton, a tiny end to end path that links the real components, then thicken each slice.⁸
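Contract first can be as small as one shared type. The sketch below assumes a Python codebase on both sides purely for illustration. `LoginApi`, `LoginResult`, `StubLoginApi`, and the canned credentials are all hypothetical names, not anything from a real project.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass(frozen=True)
class LoginResult:
    ok: bool
    error: Optional[str] = None


class LoginApi(Protocol):
    """The contract both sides agree on before either starts building."""

    def log_in(self, email: str, password: str) -> LoginResult: ...


class StubLoginApi:
    """Stand-in for the frontend to build against; answers are canned."""

    def log_in(self, email: str, password: str) -> LoginResult:
        if (email, password) == ("user@example.com", "correct-password"):
            return LoginResult(ok=True)
        return LoginResult(ok=False, error="invalid_credentials")


def login_message(api: LoginApi, email: str, password: str) -> str:
    """Screen logic that depends only on the contract, not on a real backend."""
    result = api.log_in(email, password)
    return "Welcome back" if result.ok else f"Login failed: {result.error}"


print(login_message(StubLoginApi(), "user@example.com", "correct-password"))  # Welcome back
print(login_message(StubLoginApi(), "user@example.com", "wrong"))  # Login failed: invalid_credentials
```

The real backend later supplies its own class with the same `log_in` shape, and `login_message` does not change.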

Smell three: the Tapper

Written by someone who can hear the whole song. Read by someone who only gets the taps.

In 1990 a Stanford researcher named Elizabeth Newton ran an experiment. One group tapped out the rhythm of a well known song on a table. The other group tried to name it. Tappers predicted listeners would get it about half the time. Listeners identified 3 of 120 songs, two and a half percent. The tappers heard the full melody in their heads. The listeners heard disconnected knocks.⁹

That is the curse of knowledge, a bias named in a 1989 economics paper and made famous by Chip and Dan Heath: once you know something, you cannot reconstruct what not knowing felt like.¹⁰ The cursed ticket is pure tapping. It names a solution ("add a Redis TTL to the profile resolver, bust on PROFILE_UPDATED") and deletes the problem. The person who picks it up implements it literally, ships it, and learns in review that the real issue was stale avatars after upload, which a fixed TTL makes worse.

The "so that" is where a ticket carries its reason, and it is the first thing to go missing. Mike Cohn calls the so-that clause often the most important part of a story, because knowing why someone wants something frequently reveals a better way to give it to them.¹¹ A ticket that says what to build but never why hands you a solution with the problem torn off.

Here is the same ticket cursed and then cured.

```text
TITLE: Add Redis TTL to user-profile resolver
Set a 300s TTL on the profile resolver cache key. Use the existing
cacheWrap HOF. Bust on PROFILE_UPDATED.
```

The picker-upper does not know why (performance? cost? a staleness bug?), what "correct" looks like, or whether 300s is a requirement or a guess. Now the cured version:

```text
TITLE: Profile pages load slowly under load; cache the profile read
As a signed-in user, I want my profile to load quickly, so that I am not
staring at a spinner when traffic is high.

Context: profile reads hit the DB on every request and are our slowest p95
endpoint at peak. Data changes rarely, except right after the user edits it.

Acceptance criteria:
- Given a profile was read recently, when it is requested again, then it is
  served from cache and p95 drops below 150 ms under the peak load test.
- Given a user updates their profile, when the next read occurs, then the
  response reflects the update immediately (no stale data).
- Given a cache miss or outage, when a profile is requested, then it still
  loads from the DB (no error).

Note: Redis/TTL is one possible approach. Choose what meets the criteria.
```
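To show how the cured ticket opens up the solution space, here is one possible reading of its criteria: a read-through cache that invalidates on update instead of relying on a fixed TTL. It is a sketch, not the ticket's required design. The `Cache` interface, `InMemoryCache`, and `ProfileReader` are hypothetical names, the in-memory dict stands in for a real cache such as Redis, and nothing here measures p95.

```python
from typing import Callable, Optional, Protocol


class Cache(Protocol):
    """Minimal cache interface; any method may raise during an outage."""

    def get(self, key: str) -> Optional[dict]: ...
    def set(self, key: str, value: dict) -> None: ...
    def delete(self, key: str) -> None: ...


class InMemoryCache:
    """Dict-backed stand-in for a real cache."""

    def __init__(self) -> None:
        self._data: dict[str, dict] = {}

    def get(self, key: str) -> Optional[dict]:
        return self._data.get(key)

    def set(self, key: str, value: dict) -> None:
        self._data[key] = value

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


class ProfileReader:
    def __init__(self, cache: Cache, load_from_db: Callable[[str], dict]) -> None:
        self._cache = cache
        self._load_from_db = load_from_db

    def read(self, user_id: str) -> dict:
        try:
            cached = self._cache.get(user_id)
        except Exception:
            cached = None  # cache outage: fall back to the DB, no error
        if cached is not None:
            return cached  # recently read: served from cache
        profile = self._load_from_db(user_id)
        try:
            self._cache.set(user_id, profile)
        except Exception:
            pass  # a failed cache write must not fail the read
        return profile

    def on_profile_updated(self, user_id: str) -> None:
        """Drop the cached copy so the next read reflects the update."""
        self._cache.delete(user_id)
```

Note what the sketch leaves open: if `delete` fails during an outage, a stale copy can survive, so a real implementation would still need an answer for that case.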

The tell

No "so that," no stated value. The card names a build, never a reason.
Solution dictated, problem absent: "add Redis TTL," "memoize the selector," with no user facing symptom.
No acceptance criteria, or vague ones: "fast," "clean," "user friendly," nothing a tester could check.
Dense in-group jargon: internal codenames, event names, file references, with no expansion.
One author, zero discussion. No questions on the thread, no testing perspective. The mental model never left one head.
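Some of these tells survive a crude text check. A minimal sketch, assuming the description and acceptance criteria are available as strings: the vague-word list is the one quoted above, `tapper_tells` is a hypothetical helper, and treating "contains a number" as "checkable" is a rough heuristic, not a rule.

```python
import re

VAGUE_WORDS = ("fast", "clean", "user friendly")


def tapper_tells(description: str, acceptance_criteria: list[str]) -> list[str]:
    """Return the Tapper tells that simple text matching can spot."""
    tells = []
    if "so that" not in description.lower():
        tells.append('no "so that"')
    if not acceptance_criteria:
        tells.append("no acceptance criteria")
    for criterion in acceptance_criteria:
        text = criterion.lower().replace("-", " ")
        is_vague = any(re.search(rf"\b{word}\b", text) for word in VAGUE_WORDS)
        has_number = re.search(r"\d", text) is not None
        if is_vague and not has_number:
            tells.append(f"vague criterion: {criterion}")
    return tells


cursed = "Set a 300s TTL on the profile resolver cache key. Bust on PROFILE_UPDATED."
print(tapper_tells(cursed, []))
# ['no "so that"', 'no acceptance criteria']
```

Jargon density and missing discussion still need a human nose.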

The fix

Make the implicit explicit before the ticket is ready, and do it as a conversation, not a documentation chore. Two cheap rituals carry most of the weight. Three Amigos: a product person, a developer, and a tester look at the story together, one asking what problem we are solving, one asking how to build it, one asking what could break.¹² Example Mapping, Matt Wynne's 25-minute technique: four colors of card, one each for the story, its rules, concrete examples, and the open questions nobody can answer yet.¹³ The card pattern is its own smell test. A pile of red question cards means too much is still in someone's head, and if you cannot map the story in about 25 minutes, it is too big or too vague to start.

Smell four: the Boulder

Too big to move. It sits on the board for weeks, and the board shows nothing, because there is nothing smaller than the whole thing to finish.

"Build user account management" enters the sprint. Three weeks later it is still "In Progress." Not because no one is working, but because the ticket is one indivisible lump, so it produces no completion, no throughput, no visible movement until it is entirely done. Meanwhile its age climbs quietly past everything the team usually ships.

Big batches are slow batches. Smaller items finish faster, give feedback sooner, and carry less risk; this is the most studied lever in Donald Reinertsen's work on product flow. Little's Law makes the trap exact: average cycle time equals work in progress divided by throughput, so the more you keep open at once, the longer each thing takes.¹⁴
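Little's Law is easy to feel with numbers. A small sketch with made-up figures: the function name is hypothetical, and the law describes long-run averages for a stable system, not a promise about any single ticket.

```python
def average_cycle_time(work_in_progress: float, throughput_per_day: float) -> float:
    """Little's Law: average cycle time = work in progress / throughput."""
    if throughput_per_day <= 0:
        raise ValueError("throughput must be positive")
    return work_in_progress / throughput_per_day


# Same team, same throughput of 2 items per day (hypothetical numbers).
print(average_cycle_time(6, 2))   # 3.0 days per item
print(average_cycle_time(12, 2))  # 6.0 days per item: double the WIP, double the wait
```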

The metric that catches a Boulder early is Work Item Age: how long an unfinished item has been in progress. Cycle time and throughput are lagging; they only describe finished work. Age is leading; it warns you while the item is still stuck.¹⁵ Plot it against your team's history and a Boulder crosses the 85th percentile line long before it is formally late.¹⁶
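The age check can be sketched without any tooling. Assumptions, stated loudly: the ticket IDs and dates are invented, age is counted as whole days since work started (some teams count the start day as day one), and the percentile uses the simple nearest-rank method rather than whatever your tracker computes.

```python
from datetime import date
from math import ceil


def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile: smallest value with at least pct% at or below it."""
    if not values:
        raise ValueError("need at least one value")
    ordered = sorted(values)
    rank = max(1, ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def work_item_age(started: date, today: date) -> int:
    """Days an unfinished item has been in progress, idle time included."""
    return (today - started).days


def aging_items(in_progress: dict[str, date], past_cycle_times: list[float],
                today: date, pct: float = 85) -> list[str]:
    """IDs of open items already older than the team's historical percentile."""
    limit = percentile(past_cycle_times, pct)
    return [key for key, started in in_progress.items()
            if work_item_age(started, today) > limit]


history = [1, 2, 2, 3, 3, 4, 5, 6, 8, 13]  # hypothetical cycle times in days
board = {"ACC-1": date(2026, 6, 1), "ACC-2": date(2026, 6, 15)}
print(percentile(history, 85))                      # 8
print(aging_items(board, history, date(2026, 6, 18)))  # ['ACC-1']
```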

The tell

No status change in several days. The simplest stall signal there is.
The story has rolled over two or more sprints.
An estimate larger than half the sprint.
Many subtasks, all still open. Work is fragmented but nothing finishes.
A permanent "In Progress." Age counts idle and blocked time too, so a ticket nobody is touching is still aging.

The fix

Right-size before you commit. A common rule of thumb: if a story will take more than a few days, split it. Sliced into "view profile," "edit email," "change password," and "delete account," the same work reaches Done every day or two. The board moves, throughput is real, and if one slice blocks (say "delete account" waits on legal), only that small piece ages while the rest ships.

The four smells, one root cause

Step back and the smells rhyme. The Iceberg is a slice that hid its size. The Twins are a slice taken along the wrong axis. The Tapper is a slice missing its reason. The Boulder is a slice that was never cut. Every one is a slicing failure, and slicing well is a learnable skill, not a talent.

Start with INVEST (Bill Wake, 2003): a good story is Independent, Negotiable, Valuable, Estimable, Small, and Testable.⁴ It is the checklist behind every tell above. Independent kills the Twins. Estimable and Small kill the Iceberg and the Boulder. Valuable and Testable kill the Tapper.

The single most important habit is to slice vertically, not horizontally. A story should be a thin slice through the whole cake (UI, logic, data) that delivers something demonstrable, not one layer of it.¹⁷ Wake's original 2003 image still says it best: a story is a cake, and the customer cares about a slice, not the database layer on its own.

Figure: a layer cake shown two ways, with a thin vertical wedge cut cleanly through every layer and a small flag on top, beside a single horizontal layer pulled out on its own and crumbling apart. A vertical slice goes through every layer and stands on its own. A single horizontal layer just crumbles.

When a story resists splitting, two named toolkits help. Mike Cohn's SPIDR gives five axes to cut along: Spike, Paths, Interfaces, Data, Rules.¹⁸ Richard Lawrence's Humanizing Work guide adds a flow and nine patterns, with two tie-breakers worth memorizing: prefer the split that lets you drop or defer the low-value part, and prefer splits that come out roughly equal in size.⁶

A worked example. The before is horizontal and huge:

```text
- Build the payments database schema
- Build the payment service
- Build the checkout UI
```

Nothing ships until all three land, and the first two are pure plumbing with no value on their own. The after is vertical and thin, using SPIDR Paths and Rules:

```text
- A user can pay for one item by credit card, happy path, end to end
- Add PayPal as a second payment path
- Decline payment when the billing country is unsupported
- Validate a discount code at checkout
```

Each line is shippable, demoable, and independently valuable. If the sprint runs short, you drop the discount code and still have a working checkout. That is what good slicing buys you.

Want to build the muscle as a team? Run an Elephant Carpaccio session: take one feature and force it into the thinnest vertical slices you can, each one demonstrable. Alistair Cockburn, who invented it, points out the name is backwards. You are not carving slices off a finished elephant. You are building the elephant one slice at a time.¹⁹

Catching them before the board

Smelling a bad ticket while you read it is good. Catching it before anyone reads it is better. That is what refinement is for, and it does not need to be heavy.

A Definition of Ready is a short shared agreement that a ticket is understood enough to start: it has stated value, has testable acceptance criteria, is small enough, and its dependencies are known.²⁰ Treat it as a guideline, though, not a gate. A rigid Definition of Ready quietly turns into a mini waterfall, a stage gate where "refinement" hands off to "development," which is exactly the large-batch handoff lean flow warns against.²¹ The bar is "we know enough to start and will adapt," not "every question answered in advance."

Make refinement a habit, not an event. Mike Cohn's rule of thumb is to spend about five to ten percent of each sprint on it, aiming only for "sufficiently understood," not perfectly specified.²² In that time, run a Three Amigos pass on anything fuzzy, Example Map anything you cannot explain in a sentence, and split anything bigger than a few days on the spot.

The smell test

Before a ticket reaches the board, run it past these. A "no" is not a veto. It is a prompt to talk, split, or send it back.

Value is explicit. You can say who benefits and why, the "so that."
Acceptance criteria are written and testable. A tester could turn them into checks today.
No open question is blocking a start.
It is small enough to finish comfortably in a few days.
It is independent, or every dependency is named and tracked.
The team agrees on the size, with no wide spread left unexplained.
Three lenses have seen it: product, development, and testing.
You could map it in about 25 minutes. If not, it is too big or too vague.
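If the checklist lives better in a script or a tracker template than on the wall, it can be written down as data. A sketch only: the field names are hypothetical shorthand for the eight questions above, and the answers still come from people in a room.

```python
from dataclasses import dataclass, fields


@dataclass
class SmellTest:
    value_is_explicit: bool
    criteria_are_testable: bool
    no_blocking_questions: bool
    fits_in_a_few_days: bool
    independent_or_dependencies_tracked: bool
    size_agreed: bool
    three_lenses_seen: bool
    mappable_in_25_minutes: bool

    def prompts(self) -> list[str]:
        """Names of the checks answered 'no', in checklist order."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


ticket = SmellTest(True, False, True, True, True, False, True, True)
print(ticket.prompts())  # ['criteria_are_testable', 'size_agreed']
```

An empty list means nothing smells yet. Anything else is the agenda for the conversation, not a rejection.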

Start next sprint

You do not need a process overhaul. Pick a few of these and run them.

Put the smell test on the wall and read tickets against it in refinement.
Make "send it back" a normal, blameless move. A missing "so that" or no testable criterion goes back to the author, not into the sprint.
When estimates spread wide, stop and talk before you average. The gap is the most valuable signal in the room.
Split anything longer than a few days, vertically, on the spot. Keep SPIDR and the Humanizing Work patterns nearby as a cheat sheet.
Watch Work Item Age, not just the burndown. When a ticket ages past your usual, intervene before it is late.

The payoff is not tidier tickets for their own sake. It is a sprint that ends the way it looked like it would on Monday. The scares stop hiding in the bushes once the whole team can smell them coming.
