T-shirt estimation is the idea that user stories or tasks should be estimated at a rough level (T-shirt sizes) rather than in precise hours or even story points. The aim is to save time, reduce frustration and avoid a false sense of knowledge by not trying to be more precise than is actually useful. It often turns out to be more precise than expected, with a good ratio of knowledge gained to time spent.

This post is a practical how-to for T-shirt estimation, one of my favorite simplifications of one of the most time-wasting activities in larger organisations.

First a few things to notice about estimates.

  • You will never get estimates exactly right at the individual user story level. An estimate is a probability distribution around an average cost, NOT a single number. So expect individual estimates to be “wrong” to some degree.
  • Variability pooling means that even if individual estimates are uncertain, you can be predictable overall as long as you are in a stable environment. This is also known as the law of large numbers.
  • What makes you predictable (or unpredictable) has more to do with your environment than estimation/sizing, but you do need to be disciplined about data collection to stand a chance at predictability. A new team, using a new process on a new product with unknown technology will be highly unpredictable no matter the estimation technique, triangulation exercises or the amount of detail provided.
  • Data beats detail every time – if you are not using yesterday’s weather for forecasting, then you are missing out on the easiest way of increasing predictability.
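The variability pooling point can be illustrated with a small simulation. This is only a sketch under stated assumptions: the cost model (each item costs between 1 and 9 days, uniformly at random) and the batch size of 50 are made up for illustration and do not come from real project data.

```python
import random
import statistics


def relative_spread(values):
    """Standard deviation expressed as a fraction of the mean."""
    return statistics.pstdev(values) / statistics.mean(values)


def simulate(items_per_batch, batches=2000, seed=1):
    """Relative spread of the total cost of a batch of work items."""
    rng = random.Random(seed)
    # Hypothetical cost model: every item costs 1 to 9 days, uniformly.
    totals = [
        sum(rng.uniform(1, 9) for _ in range(items_per_batch))
        for _ in range(batches)
    ]
    return relative_spread(totals)


single_item_spread = simulate(items_per_batch=1)
pooled_spread = simulate(items_per_batch=50)
print(f"single item: {single_item_spread:.2f}, 50 items pooled: {pooled_spread:.2f}")
```

The relative spread of the pooled total should come out several times smaller than that of a single item, which is why a whole backlog can be forecast far more reliably than any one story in it.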

Looking at data we can see that most estimate distributions look something like this:

Estimate distribution

And it is not only in theory. Here is a picture of how a month's worth of done features is distributed in terms of cycle time. Even from a rather small sample it is clear that the tasks fall into three distinct groups.

Estimate distribution example

As you can see, there is a tail on the distribution of each T-shirt category, which basically means that some Smalls will take longer than the average Medium, and so on. While it is possible to adjust the distribution, the tail will remain, since it is caused by “what we don’t know we don’t know”, and that simply cannot be uncovered by upfront analysis.
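To get a feel for how such tails overlap, here is a minimal sketch that draws cycle times for Small and Medium items and counts how many Smalls exceed the average Medium. The log-normal shape, the medians (2 and 5 days) and the spread are assumptions chosen for illustration, not measurements from the data shown above.

```python
import math
import random
import statistics


def sample_cycle_times(rng, median_days, sigma, n=5000):
    """Draw n right-skewed cycle times with the given median."""
    # lognormvariate(mu, sigma) takes the mean and standard deviation of the
    # underlying normal distribution, so exp(mu) is the median.
    mu = math.log(median_days)
    return [rng.lognormvariate(mu, sigma) for _ in range(n)]


rng = random.Random(7)
small = sample_cycle_times(rng, median_days=2, sigma=0.6)   # hypothetical
medium = sample_cycle_times(rng, median_days=5, sigma=0.6)  # hypothetical

average_medium = statistics.mean(medium)
share_of_slow_smalls = sum(t > average_medium for t in small) / len(small)
print(f"{share_of_slow_smalls:.1%} of Smalls took longer than the average Medium")
```

Under these assumptions only a small share of Smalls ends up slower than the average Medium, but that share never drops to zero, no matter how carefully the items were sized.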

Why estimate at all?

From a lean perspective estimation is a wasteful activity – no doubt about that. So what are the arguments for investing time and money in an activity that does not provide any direct value to the solution?

  • Better cost-based prioritization. Quite often items in a backlog are moved either drastically up or down (deleted) based on an estimate. The actual cost might not be predictable (will be covered in detail later), but having an indication provides a good basis for prioritizing what is next (and yes, I know there are multiple other factors to consider as well).
  • Predictability. Having a rough clue about the size of our pipeline can tell us the likelihood of meeting deadlines, budgets and business goals.
  • Budgets. To get a budget for a new product or for improving an existing one, sponsors or customers want a target number, and they want to know at least the rough price tag.
  • Risk. If we are triangulating our estimates (using different estimation methods to assess the cost) and results are pointing in the same direction then risk is probably much lower than had they been pointing in wildly different directions.
  • High-level alignment. If one person considers an item a Small and another a Large, they are probably not thinking of the same thing. A quick planning poker session will surface those items where the team is not aligned at all, and where two minutes spent estimating could surface a point early that would otherwise result in frustration later on.

Using story point based T-shirt estimates

Using story point based T-shirt bucket estimates we optimize for the goal of predictability and alignment without the unfounded promise of individual accuracy. Basically we only allow estimates to be placed in one of the following buckets:

T shirt sizes

T-shirt sizes are used because they make it quick and easy for the team to understand the buckets they are allowed to use (there is no such thing as an S/M bucket). A small number of buckets communicates that uncertainty is expected (it is there anyway) and dramatically speeds up estimation sessions. Points are added to the T-shirt buckets because they indicate the exact ratio between the sizes, and they are used for price per point calculations and velocity:

  • 2-3 “S” should fit in an “M”
  • 2-3 “M” should fit in an “L”
  • 2 “L” should fit in an “XL”
  • XXL is really too big to estimate but think of something at least twice the size of XL

This way we keep all estimates comparative. If you are estimating a new project you start by finding a couple of representatives for T-shirt buckets S, M and L and then line the others up accordingly – it really does not take much time and the suggested point ratio between the T-shirt sizes is typically a pretty good fit.

But what if you really need the price?

To turn T-shirt estimates into a budget and time forecast we use the concept of a price per point. This is done simply by looking at historical data and dividing money spent by the number of points finished. How to build a price per point assumption for new projects is covered in the next section.
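The calculation itself is a single division. The sketch below shows it on made-up numbers: the three sprints, their spending and their finished points are hypothetical, and `price_per_point` is just an illustrative helper name.

```python
def price_per_point(money_spent, points_finished):
    """Historical price per point: money spent divided by points finished."""
    if points_finished <= 0:
        raise ValueError("need at least one finished point to compute a price")
    return money_spent / points_finished


# Hypothetical history of three sprints.
history = [
    {"spent": 40_000, "points": 38},
    {"spent": 42_000, "points": 45},
    {"spent": 38_000, "points": 37},
]

total_spent = sum(sprint["spent"] for sprint in history)
total_points = sum(sprint["points"] for sprint in history)
ppp = price_per_point(total_spent, total_points)
print(f"price per point: {ppp:.0f}")
```

Summing over several sprints before dividing, rather than averaging per-sprint prices, keeps a single unusually good or bad sprint from dominating the figure.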

Working with a price per point as the central figure provides numerous benefits:

  • It is simple and should take very little time
  • You can adjust your price per point assumption without having to revisit individual estimates
  • Price per point can easily be translated into both deadline (given a stable capacity) and a budget.

Do however remember that adding more people does increase spending but will also make your price per point rise – at least for the short term.
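Translating a price per point into a budget and a rough deadline can be sketched as follows. The point values use the S = 3, M = 8, L = 20 scale given at the end of this post; the backlog contents, the price per point of 1000 and the capacity of 40 points per sprint are hypothetical, and the sketch assumes the stable capacity mentioned above.

```python
import math

# Point values from the simplified scale at the end of this post.
POINTS = {"S": 3, "M": 8, "L": 20}


def forecast(backlog_sizes, price_per_point, points_per_sprint):
    """Return (total points, budget, number of sprints) for a sized backlog."""
    total_points = sum(POINTS[size] for size in backlog_sizes)
    budget = total_points * price_per_point
    # Round up: a partly used sprint still has to be scheduled.
    sprints = math.ceil(total_points / points_per_sprint)
    return total_points, budget, sprints


# Hypothetical backlog: six Smalls, four Mediums and two Larges.
backlog = ["S"] * 6 + ["M"] * 4 + ["L"] * 2

total_points, budget, sprints = forecast(
    backlog, price_per_point=1000, points_per_sprint=40
)
print(f"{total_points} points, budget {budget}, about {sprints} sprints")
```

If the price per point assumption changes, only the `price_per_point` argument needs to change; the sized backlog stays untouched.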

Greenfield projects, new technology, new teams and new domains

When forecasting the price per point for new teams, new technology or new products, you can use triangulation of estimates. I normally use a spreadsheet with some built-in formulas that help do it quickly and effectively, but it would need instructions and explanation to be used meaningfully. The idea is to combine:

  • Best case, worst case, realistic case of storypoints-to-hours ratio
  • historical data (from similar teams, similar projects)
  • gut feeling

and apply these to the T-shirt-estimated user stories, blended according to how much weight you want to give each estimation type. The point is not to trust a single source for a reasonable ratio, but to spread the reliance across several sources.
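As a rough stand-in for the spreadsheet, the blend can be sketched as a weighted average. Everything numeric here is hypothetical: the hours-per-point figures, the weights and the helper name `blended_ratio` are illustrative only, and a plain weighted average is just one possible way to do the blending.

```python
def blended_ratio(estimates, weights):
    """Weighted average of several hours-per-point estimates."""
    total_weight = sum(weights[source] for source in estimates)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(estimates[source] * weights[source] for source in estimates)
    return weighted_sum / total_weight


# Hypothetical hours-per-story-point figures from each source.
estimates = {
    "best_case": 4.0,
    "realistic_case": 6.0,
    "worst_case": 10.0,
    "historical": 7.0,
    "gut_feeling": 5.0,
}

# Hypothetical weights: here historical data is trusted the most.
weights = {
    "best_case": 1,
    "realistic_case": 2,
    "worst_case": 1,
    "historical": 4,
    "gut_feeling": 2,
}

hours_per_point = blended_ratio(estimates, weights)
print(f"blended ratio: {hours_per_point:.1f} hours per point")
```

Keeping the weights separate from the estimates makes it easy to shift trust toward historical data as it accumulates, without re-entering any of the estimates.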

However, no matter what you do in this situation, your estimates and forecasts will be more uncertain. Not because you are doing something wrong, but because the environment around you is unpredictable.

But then why do it at all, if the environment is so uncertain? Three main reasons:

  • Somebody will demand the number from you anyway – so just make it as quick and easy to come up with something reasonable as possible.
  • When you have a base target to re-plan from, the discussions can be based on data instead of feelings, blame games etc.
  • You will teach the mindset of handling uncertainty instead of eliminating it. The team will have practiced how to do an effective estimation process when the cone of uncertainty gets narrow enough to become usable.

Further simplification of the T-shirt sizes

In most practical situations I actually prefer only to use

  • “S” = 3
  • “M” = 8
  • “L” = 20

It turns out that this is often plenty of precision, and it makes estimation really fast. There is also room for extending with “XS” and “XL” as needed. However, do not let the ease of estimation divert you from actually discussing the complexity of a task: there can be great value in discussing the task so that the team actually understands its purpose and implications.
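As a quick sanity check, the three-bucket scale can be tested against the earlier rule of thumb that 2-3 items of one size should fit in the next size up. This is a small sketch; the helper name `step_ratios_for` is made up for illustration.

```python
# The simplified scale from this post, ordered from smallest to largest.
SCALE = [("S", 3), ("M", 8), ("L", 20)]


def step_ratios_for(scale):
    """Ratio between each bucket and the next smaller one."""
    return {
        f"{bigger}/{smaller}": bigger_points / smaller_points
        for (smaller, smaller_points), (bigger, bigger_points)
        in zip(scale, scale[1:])
    }


step_ratios = step_ratios_for(SCALE)
all_within_rule = all(2 <= ratio <= 3 for ratio in step_ratios.values())

for step, ratio in step_ratios.items():
    print(f"{step}: {ratio:.2f}")
print("every step is between 2 and 3:", all_within_rule)
```

The same check can be rerun after adding “XS” or “XL” to see whether the extended scale still keeps the buckets comparable.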