Main Article: Should We Estimate in Story Points?
I'm a consultant, so you know my answer is "it depends." But in this case, it really does. It comes down to the underlying delivery system and how much of that system is being forecast. Let's break down the factors that make the answer to this question contextual. I know this is a controversial question, so be kind to each other when discussing it. It isn't a question that can ever have a single answer, even inside one company.
I first want to make sure you understand my context. I'm interested in forecasting and planning. I'm interested in what historical distribution of data yields the best predictive results for plans I'm going to make in the future. The discussions teams have when creating estimates in points are often useful to the development team itself. No matter how I answer the predictive question, I'm staying out of whether your teams decide to estimate or not - that is YOUR problem.
Better Predictor: Velocity or Throughput?
One way of thinking about which measure is a better predictor of flow through a system is to think of city traffic. During peak times of the day, the roads are so congested that you often spend time waiting in traffic. Being stuck in traffic doesn't specifically target your car (no matter how you feel); it impacts all cars on that same route.
Statisticians call this an independent factor because the delay isn't specific to (dependent on) a single vehicle. This matters when we forecast because it decides which predictor is better. If delays in a system are "dependent" (related to an individual item), then Velocity will outperform Throughput. If delays in the system are "independent" (impacting all or some things at random), then Throughput will beat Velocity. In software delivery systems, the kind of delay that dominates determines which predictor is better: Throughput or Velocity.
The bicycle has few delays, and distance equals time (use Velocity). Cars and motorcycles have many independent delays, and distance does NOT equal time (use Throughput).
To choose the better predictor, we need to look at the amount of time work spends delayed or sitting idle and categorize these delays as dependent or independent.
Dependent delay examples:
- Something taking longer because it's hard
- Something waiting for more information
Independent delay examples:
- Waiting in backlogs (ours or other teams') to start
- Work sits idle waiting for delivery cadences
Tallying up the delays in your process helps make the decision:
- If our delivery system has a majority of dependent delays, then use Velocity.
- If our delivery system has a majority of independent delays, then use Throughput.
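The tally above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed tool: the delay labels, the sample log, and the rule of reporting a tie as "inconclusive" are all assumptions made for the example. It totals the days spent in each kind of delay and applies the majority rule.

```python
from collections import Counter

# Hypothetical delay labels, mapped to the two categories from the lists above.
DELAY_TYPES = {
    "hard_work": "dependent",
    "waiting_for_information": "dependent",
    "waiting_in_backlog": "independent",
    "waiting_for_delivery_cadence": "independent",
}


def choose_predictor(delay_log):
    """Tally delay days by category and return the suggested predictor.

    delay_log is an iterable of (label, days) pairs. A tie is reported as
    "inconclusive"; that tie rule is an assumption, not from the article.
    """
    totals = Counter(dependent=0, independent=0)
    for label, days in delay_log:
        totals[DELAY_TYPES[label]] += days  # raises KeyError on an unknown label
    if totals["dependent"] > totals["independent"]:
        return "velocity"
    if totals["independent"] > totals["dependent"]:
        return "throughput"
    return "inconclusive"


# Made-up sample log: 3 days of dependent delay, 12 days of independent delay.
sample_log = [
    ("hard_work", 2),
    ("waiting_for_information", 1),
    ("waiting_in_backlog", 7),
    ("waiting_for_delivery_cadence", 5),
]
print(choose_predictor(sample_log))
```

Weighting by days rather than by number of delays follows the idea of looking at the amount of time work spends delayed; counting occurrences instead would be a different, cruder tally.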
When there are independent delays, like traffic signals, a 1-point story can take the same time as a 13-point story. Using Velocity would give the wrong answer; use Throughput instead.
The narrower the system being forecast, the more likely dependent delays dominate. Team-level forecasting fits this category, making Velocity a better predictor at the local level. The more of the system value chain being forecast (two or more teams), the more independent delays compound, making Throughput the better choice.
Scaled software development has a LOT of independent delays (dependencies, approvals, scheduled release cadences). The moment you are planning beyond a single team, I've always found Throughput to be the better predictor. Just saying, I fully expect this to be your result too.
The quick experiment you should run is to forecast using both methods. One will better predict an actual outcome. You can also run this experiment on data from the past. Go back a month or a sprint, use the velocity and throughput data leading up to that point, and try to predict what you know occurred. If one method shows a better result, use that one! The results I see are that teams without external dependencies will score better on Velocity. Teams with external dependencies, or multiple teams, will score better using Throughput. When I do this in real life, Throughput is most often better. See, I told you it depends!
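Here is one way that look-back experiment might be sketched in Python. It is a deliberately simple illustration: it forecasts with a plain average of past sprints, which is an assumption chosen for brevity rather than the method behind any particular tool, and every number in it is made up. The function names are hypothetical.

```python
import math
from statistics import mean


def sprints_needed(backlog_size, per_sprint_history):
    """Forecast sprints to finish a backlog using the average past rate."""
    return math.ceil(backlog_size / mean(per_sprint_history))


def backtest(points_history, items_history, backlog_points, backlog_items,
             actual_sprints):
    """Return the absolute forecast error (in sprints) for each predictor."""
    velocity_forecast = sprints_needed(backlog_points, points_history)
    throughput_forecast = sprints_needed(backlog_items, items_history)
    return {
        "velocity": abs(velocity_forecast - actual_sprints),
        "throughput": abs(throughput_forecast - actual_sprints),
    }


# Made-up data as it stood at the past point being replayed.
errors = backtest(
    points_history=[20, 30, 25, 25],  # story points finished per sprint
    items_history=[5, 5, 6, 4],       # items finished per sprint
    backlog_points=100,               # points remaining at that time
    backlog_items=30,                 # items remaining at that time
    actual_sprints=6,                 # what actually happened afterwards
)
better = min(errors, key=errors.get)
print(errors, better)
```

A single replay proves little; repeating it from several past starting points gives a fairer comparison of the two predictors.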
The Science and Online Calculator
The science of which predictor is better depends on a system measure called process efficiency. You can read and interactively learn about the impact of process efficiency using the online calculator here -
MYTH: Items Must Be Similar Sizes to Forecast Using Throughput
It's a common myth that to forecast with Throughput, all items must be the same size. They don't. It would be like saying all cars and trucks stuck in traffic need to have the same horsepower to arrive at the same time. A Maserati and a VW Beetle next to one another in the same traffic jam (stop-and-go traffic) arrive at the same time. In low-flow systems with independent delays, size (or horsepower) doesn't matter much. There is often a lot of explaining I have to do to make people believe this fact. It isn't the size but the distribution of sizes being stable that matters when forecasting. For example, a similar ratio over time of the number of small, medium, and large items. If, all of a sudden, you started doing all huge things, the throughput forecast will be wrong. You have to think! But think about the ongoing distribution of values changing, NOT item by item.
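One rough way to keep an eye on that distribution is to compare the share of small, medium, and large items between two periods. The sketch below assumes items are already labeled with one of those three sizes; the function names and the sample data are hypothetical, and how big a shift should worry you is left to your judgment.

```python
from collections import Counter

SIZES = ("small", "medium", "large")


def size_mix(items):
    """Return the share of each size label in a list of item sizes."""
    if not items:
        raise ValueError("need at least one item to compute a mix")
    counts = Counter(items)
    return {size: counts[size] / len(items) for size in SIZES}


def mix_shift(earlier, recent):
    """Largest change in share for any size between two periods (0.0 to 1.0)."""
    before, after = size_mix(earlier), size_mix(recent)
    return max(abs(before[size] - after[size]) for size in SIZES)


# Made-up periods: the same ratio of sizes, then a sudden run of huge items.
last_quarter = ["small"] * 5 + ["medium"] * 3 + ["large"] * 2
steady_month = ["small"] * 10 + ["medium"] * 6 + ["large"] * 4
all_huge_month = ["large"] * 8

print(mix_shift(last_quarter, steady_month))    # same ratios, no shift
print(mix_shift(last_quarter, all_huge_month))  # large share jumps sharply
```

A shift near zero suggests the throughput history is still a fair guide; a large shift is a prompt to think before trusting the forecast.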