The reason is actually quite simple: when things go well, they can only go so well, but when they go bad, they can go really bad.
Or, in a more quantitative way: while each step of the project will be equally likely to take longer or shorter than median, the steps that take longer can take much longer, while the steps that go faster only have a limited potential for balancing out the delays.
I don't think this is the correct explanation. I think it's far more likely that (1) people don't know the true distribution of task times, so estimates are just crap guesses based on hubris, what managers want to hear, etc., and (2) scope creep.
That is, if you are doing the project, your company's initial estimate was the one that had the highest chance of being too low.
And that's all companies. You can't really be honest, as all your competitors are similarly bullshitting. (I used to be upset about that, thinking I was always working for the black sheep; then over the years you end up working with your competitors one way or another and you find out it is the same everywhere.)
There is also the deadline game. The client will push for an earlier and earlier release date. The provider will accept, because the provider knows the client will not be able to test the product. I used to be upset about delivering code that would not even compile. Then, over the years, we have had clients who were not ready to test for several years. An extreme case is a client that only picked up a package I developed 5 years after we delivered the working version of it.
That's the biggest problem I have had with Agile. Very often companies are not ready to support the lack of bullshit even internally - no more Schrödinger status, no creative budget allocation - so developers appear to be slower and cost more.
You completely lost me there, what happened exactly?
Seems like they were not in any hurry after all. Some 5 years later, they contacted my company again asking how to install it in their test environment. They were apparently not happy that, after 5 years, the solution was still a bodge using Interix rather than a proper Windows port. I don't know what happened from there; I had moved on to another project right after that delivery, couldn't remember anything, and to be honest only painful memories could come back from that shitty codebase. (The company was making something like 5K a year gross from that application.)
We delivered a project a few weeks ago and heard nothing; I heard only yesterday that the manager went on holiday and will be back the 3rd week of January. And when he comes back his inbox will be full, so I do not expect any testing till the 2nd week of February...
Instead we have to understand the distribution, and the mean as well, and add up how long each step takes on average, rather than just in the most common case.
Further, and perhaps more importantly, by understanding each distribution, you have far greater knowledge about what the expected time will be, and where the delays are most probable; and you can begin to start analyzing where those delays might be coming from in the interactions of all the parts of the system. This is straight-up quality talk right out of W. Edwards Deming, and it's how you start to improve how you produce in general. Good stuff.
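To make that concrete, here's a minimal sketch (mine, not from the article) of why planning from per-step medians goes wrong when step durations are right-skewed; the lognormal shape and its parameters are assumptions chosen purely for illustration:

    import math
    import random
    import statistics

    random.seed(42)
    N_STEPS = 10        # steps in the project
    N_RUNS = 10_000     # simulated projects

    # Each step is lognormal with a median of 5 days (median of a lognormal = e^mu).
    mu, sigma = math.log(5.0), 0.8

    totals = [sum(random.lognormvariate(mu, sigma) for _ in range(N_STEPS))
              for _ in range(N_RUNS)]

    print("sum of step medians:", N_STEPS * 5.0)
    print("median project time:", round(statistics.median(totals), 1))
    print("mean project time:  ", round(statistics.fmean(totals), 1))

The sum of the step medians badly undershoots both the median and the mean of the simulated totals, which is exactly the asymmetry being described.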
Maybe optimistic estimates are a motivating factor for teams. The desire to finish faster, to be more efficient than you were before, a commitment with a challenge.
In my experience, it takes a strong product owner and a mature development team to meet deadlines. Shipping on-time is always a game of tradeoffs. Accuracy in estimates is usually the result of doing things in a known, measurable way. And by keeping the stakeholders close, you can make critical decisions together to keep a project on track.
You could feasibly be on a project that could last until the heat death of the universe.
There are 2 other important issues. First, you need to update your schedule based on the amount of time tasks are actually taking. If you only have 10 tasks (each taking a month), the error bars can be quite significant. But if you have 200 tasks (each taking a day), the mean completion time will have quite small error bars. So if you keep a rolling average (say over the last 30 tasks), you can have a fairly good estimate for completion time (agile developers will notice that this is "velocity").
The second very important issue is to be completely anal about your definition of "done" and about making sure that the completed tasks actually meet that definition. Your mean completion time will only be useful if you are measuring the time to completion accurately (obviously).
This effect is so powerful that I recommend "same sizing" tasks and planning everything to have a completion time of somewhere between 1 and 2 days.
There is actually one last thing you need to do. Requirements discovery never happens completely before you start development. As you write the code, you discover new things to do. It is obviously important to modify your plan to accommodate that new information. If you don't, you will end up building something that nobody needs. However, the amount of new work seems to be predictable. I made graphs of new work added to projects over a couple of years and it appears that the growth of new requirements is very similar to some of the defect discovery models (for example Littlewood). Just making a graph of new stories/tasks added over time will give you a decent idea, but I have found that a rule of thumb of adding 30% or so (over the whole project) for new requirements seems to work well.
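A hedged sketch of the rolling-average idea above, with made-up numbers (the task durations, the window size, and the 30% growth allowance are illustrative, not the commenter's data):

    from statistics import fmean

    # Hypothetical history: actual durations, in days, of recently completed tasks.
    completed_durations = [1.2, 0.8, 2.5, 1.0, 1.4, 3.0, 0.9, 1.1, 2.2, 1.6]

    WINDOW = 30    # rolling window of recent tasks ("velocity")
    GROWTH = 0.30  # rule-of-thumb allowance for requirements discovered later

    def forecast_days(durations, remaining_tasks):
        """Estimate remaining effort from a rolling average of recent task times."""
        avg = fmean(durations[-WINDOW:])            # mean, not median, task duration
        remaining = remaining_tasks * (1 + GROWTH)  # pad for work not yet discovered
        return avg * remaining

    print(round(forecast_days(completed_durations, remaining_tasks=40), 1), "days")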
I think scheduling these issues would benefit more from splitting up risk and estimated duration and estimating both separately. Perhaps by drawing a mini probability distribution for each task.
Breaking down the task only helps if, in some way, analysing the constituent pieces actually helps you better estimate the risk or brings unconsidered requirements to light. Sometimes it doesn't. If you do too much of it it can also end up being a contributing factor to the delays (analysing every last detail).
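One hedged way to sketch that "estimate risk and duration separately" idea: give each task a base estimate plus a probability of hitting a blocker and a blow-up factor if it does, then Monte Carlo the whole plan. All the numbers here are invented for illustration:

    import random
    from statistics import quantiles

    random.seed(1)

    # Hypothetical tasks: (base estimate in days, chance of a blocker, multiplier if it hits).
    tasks = [
        (2.0, 0.10, 4.0),
        (5.0, 0.30, 3.0),
        (1.0, 0.05, 10.0),
    ]

    def sample_project():
        total = 0.0
        for base, p_risk, blowup in tasks:
            total += base * (blowup if random.random() < p_risk else 1.0)
        return total

    runs = [sample_project() for _ in range(20_000)]
    qs = quantiles(runs, n=10)   # deciles: qs[4] ~ median, qs[8] ~ 90th percentile
    print("median:", round(qs[4], 1), "days   90th percentile:", round(qs[8], 1), "days")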
When you have a series of dependencies (A -> B -> C) stacked against each other, once A is late, B and C are almost guaranteed to be late. If B is late too, C suffers even more and has little chance of being on time.
If you can lay out the tasks so that dependencies have slack between them, so that lateness can be absorbed without shifting later tasks, you have a better chance of hitting deadlines and potentially completing the project on time. Or in more formal terms: track the Critical Path.
In practical terms, I would see this all the time commuting home in DC. I walked to the train to the bus to home. If I timed it perfectly, my commute was ~25 minutes. But if I couldn't cross the street in time, I'd catch a later train which made for a later bus. Or a slightly later train would be a much later bus. When those variances stacked up, it could take an hour.
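That commute is easy to simulate. Here's a rough sketch (the walk/train/bus times and headways are guesses, not the actual DC schedule) of how a small slip on one leg cascades when each connection has a fixed departure you can miss:

    import random
    from statistics import fmean

    random.seed(7)

    TRAIN_HEADWAY = 12  # minutes between trains
    BUS_HEADWAY = 20    # minutes between buses

    def commute():
        t = random.uniform(4, 7)           # walk to the station
        train = 6                          # the train the plan assumes you catch
        while t > train:                   # just missed it: wait a full headway
            train += TRAIN_HEADWAY
        t = train + random.uniform(9, 13)  # train ride
        bus = 20                           # the bus that meets the planned train
        while t > bus:
            bus += BUS_HEADWAY
        return bus + random.uniform(5, 7)  # bus ride home

    runs = [commute() for _ in range(10_000)]
    print("best:", round(min(runs)), "min   mean:", round(fmean(runs)),
          "min   worst:", round(max(runs)), "min")

On a perfect run it's the familiar ~25 minutes, but a two-minute slip on the first leg cascades through the connections and the bad runs take nearly twice as long.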
This is partly why in software it's so critical to loosely couple software. If you're working on a big old ball of mud there are so many dependencies that even the smallest task ends up taking forever.
Fortunately SCRUM provides you with a block of time which you can use to decouple software independently of working on features or bugs, so this necessary work always gets done. Ha.
As an example:
Arriving to the airport one hour in advance doesn't allow you to actually board a flight an hour earlier.
But get to the airport one hour late — and you're flying tomorrow (or whenever the next flight is scheduled).
Interesting observation, I'd never thought of project planning that way.
So maybe include some slack.
Still, maybe you want to know how often your schedule will run long. So do some measurements of historical data to find out how your task estimates work out in practice.
Plug those in and do a few simulations, and voilà, you can get a precise number for how you're likely to do. Plus a handy tool to evaluate schedule quality.
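One hedged way to do that simulation: bootstrap from the ratio of actual to estimated time on past tasks and resample against the new plan. The history and the estimates below are invented for illustration:

    import random
    from statistics import quantiles

    random.seed(3)

    # Hypothetical history: actual/estimated ratios for past tasks (1.0 = dead on).
    historical_ratios = [0.8, 0.9, 1.0, 1.0, 1.1, 1.3, 1.5, 2.0, 2.5, 4.0]

    # The new plan: per-task estimates in days.
    estimates = [3, 5, 2, 8, 1, 4]

    def simulate():
        return sum(e * random.choice(historical_ratios) for e in estimates)

    runs = [simulate() for _ in range(20_000)]
    deciles = quantiles(runs, n=10)
    print("50% chance of finishing within", round(deciles[4], 1), "days")
    print("90% chance of finishing within", round(deciles[8], 1), "days")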
Well, they can, but usually the acceptable explanation for a project going really well is it actually was less work than estimated and the consequence is fewer resources next time. So, there is virtually no motivation left to drive a project into that area.
From the article: "But while there’s a lower bound to how “under” the median a step can be – a step can’t take negative time – there’s virtually no upper limit to how much over the median time a project can take."
So when one step takes a little longer than expected most people fall into the 150 mph trap and get frustrated when they don't meet the false expectation.
In the post-industrial world, time isn't the problem but rather project definition and scoping.
In the industrial world the problem was already solved (the machine was built, the market often established, and output depended on a few factors that could be adjusted: need more output, add more of X).
In the post industrial world every project is about problem solving and scoping.
To put it in comparison: if we applied post-industrial reality to the industrial world, it would mean that every time a product needed to be made, the machines - if not the whole factory - would have to be developed first.
It will take many, many years before time estimation dies, but it will happen.
First, the industrial world also had significant issues with project definition and understanding. The problems were not already solved, and the process, inputs, outputs and the whole system was constantly changing. Every project was already about problem solving and scoping. So there's a bit of rose-colored glasses toward the past here.
Second, the idea that time estimation is no longer applicable is probably off. There are two separate problems: problem definition, and problem solving, and estimation is extremely useful in the latter. Problem definition is a different problem that still needs much focus, but it doesn't preclude the need for better understanding of time to coordinate other processes and dependencies. You might be saying that those dependencies aren't as important as we think, and I tend to agree, but that's a different argument.
Generally speaking, the idea that knowledge and skills from the industrial era are no longer applicable is untrue. There is a huge body of knowledge about how products are made and built that has 99% applicability to software and technology in the post-industrial world. This is because the problems are the same: management of people, leadership, understanding interactions within complex systems, understanding statistics (the importance of which this article proves profoundly), and improving the spread of knowledge. This is the way Toyota began operating in the post-WWII era, the way W. Edwards Deming modeled companies, and the way the current Lean movement guides you to improve almost any business. It's highly relevant.
The main point we should take away is that time and estimates are not constraints on a system, but rather outputs that are predictable and follow statistical patterns. We can use those outputs to make better decisions, especially if we understand the whole process of production in a systemic way.
Estimation need not die. It's a tool for good in the hands of a systems thinker.
In reality, most of the factors that influence the outcomes are exactly the same. Same team, same knowledge, same approaches, same psychological biases, same methods, same politics, and so much more.
Those are the things that influence timelines most; not the project itself.
Oh, yes it can. It's called your parent company going out of business.
Realistic proposals fail. An under-resourced, under-priced, short-scheduled proposal is functionally approved by the customer-user, then signed off as the cheapest/shortest option by the customer-payer.
The real negotiation is in the T&Cs for change management, scope creep, responsibilities for delays and the structure of payment according to milestones.
m = 'most likely' time
o = 'optimistic' time
p = 'pessimistic' time
e = ∑_i (o_i + 4m_i + p_i) / 6
The program (or project) evaluation and review technique, commonly abbreviated PERT, is a statistical tool, used in project management, which was designed to analyze and represent the tasks involved in completing a given project. First developed by the United States Navy in the 1950s, it is commonly used in conjunction with the critical path method (CPM).
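That formula is easy to wire up. A minimal sketch (the task numbers are made up, and the (p - o) / 6 spread is the conventional PERT rule of thumb, not something from this thread):

    def pert_estimate(optimistic, most_likely, pessimistic):
        """Classic three-point (PERT) estimate for a single task."""
        expected = (optimistic + 4 * most_likely + pessimistic) / 6
        std_dev = (pessimistic - optimistic) / 6   # conventional PERT spread
        return expected, std_dev

    # Illustrative (o, m, p) triples in days; the project estimate sums the per-task values.
    tasks = [(2, 3, 8), (1, 2, 4), (5, 7, 20)]
    expected_total = sum(pert_estimate(*t)[0] for t in tasks)
    print(round(expected_total, 1), "days")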
If anyone is interested in trying out the beta version, drop me an email at andre /at/ thebroadbaycompany.com
The other quirk is that it breaks down when you map from effort or work to duration, the actual time something takes when you factor in availability, interruptions, communication costs, etc.
One doesn't have to go looking for psychological issues when a major source of indeterminism comes directly from the imperfect tooling we use.
Software is sometimes compared to construction, except you don't get to start with a perfectly detailed environment to build your software on. As long as we have this major source of indeterminism in the tools themselves, pretending to fix the issue through the estimation process alone is a fool's errand.
I believe an AS/400 project can be estimated fairly well now, based on previous experience. But how would you go about estimating a project relying on, say, local browser storage? The only way is to go and build it and identify all the pitfalls yourself, and then the browser landscape changes and you get set back all over again.
Just following Safari's rules on iOS to obtain full-screen mode sets us back a week on almost every release. We started before they began messing with this stuff; now we include a week of fixes in the schedule for every major iOS update, but we had no chance of predicting this when the project started.
For stuff that's been built a thousand times, you can state with a fair amount of confidence how long a project should take and how much it should cost. You can obtain financing and insurance because the dataset is large enough.
For stuff that's unusual or bespoke -- the kind of thing that will appear in an architecture magazine or in a newspaper investigatory report -- then estimates are very likely to be wildly optimistic.
So it's the same, insofar as the further you stray into research, the less certainty there is. The bigger the bet, the fewer such things have been built, the bigger the risk will be that things go awry.
I have a book in my collection -- Industrial Megaprojects -- which makes fascinating reading for enumerating all the ways that chemical process plants, giant mines, gas pipelines, gigantic factories etc can blast through the budgets and schedules.
My personal favourite: a chemical plant built relying on an adjacent river for cooling. To save costs, only one water temperature sample was taken during planning ... in winter. Three billion dollars later, the owners found that the plant was inoperable for about half the year because the river water was too warm.
For every step in a project, there’s about a 50% chance of completion under or on the median step completion time, and about a 50% chance of not. If a project is composed of 2 steps, the probability that both steps come in at or under their median times is 50% * 50%, or 25%. For a 3-step project, it’s 50% * 50% * 50%, or 12.5%, and so on. If a project has 6 steps, the chance that at least one of those steps goes over its median is greater than 98%.
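Spelled out: the chance that all 6 steps come in at or under their medians is 0.5^6 = 1/64 ≈ 1.6%, so the chance that at least one step runs over is 1 − 1/64 ≈ 98.4%.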
Instead I offered broad time ranges that would narrow down to more accurate ones as work progressed. The managers who had to report those upwards didn't always like that at first, but they grew to appreciate how the dynamic worked across the project timeline. In the beginning, they didn't know how long because nobody knew how long. Well, maybe definitely more than a month and definitely no longer than a year, and "it all depends". But every week we knew more and they knew more, and the time margins could be shrunk incrementally as soon as difficult tasks turned out to be not so difficult. So the more progress we made, the better everyone knew how we were doing.
That is one gratifying slide to completion but I do admit it probably doesn't work for every team or company.
Well, that is an estimate according to managers who actually understand what they're doing ;-)
Unfortunately some project managers / product GMs, and sometimes worse, technical salespeople, fail to understand what all is involved in building the software, web functionality they've come to expect to "just work".
That being said, I've been surprised at how understanding people can be when you explain it to them in a thought-out manner.
Broad time ranges can also be risky, because people will hear what they want to hear, which is usually the lower end of the range.
Which is why I only discuss effort (in hours) rather than timeframes. All of our projects are fixed bid, so the conversation usually goes, "this project will take 150 hours of effort. We don't know how long those 150 hours will take to expend." And in rare cases where we absolutely have to mention a timeframe, we make sure it is not in writing anywhere.
Or it's delayed because of more and more edge cases being found and someone not wanting to be consistent about them. If the first, third, ninth, sixty second and hundredth items in a loop have to all act differently for seemingly random reasons, that adds a lot of time to the development.
It unfortunately doesn't matter how good your estimate is if the company/client tosses out the original spec at the first possible opportunity.
But for well managed projects, yeah, it's a good writeup.
<meta name="twitter:description" content="Possibly because your boss is an idiot.">
First, a dev estimate is given of six months.
Plan is made to ship in six months.
Then, stakeholders bicker for four months on whether, how, and when to actually do it.
Then, dev gets started, being told that they already spent four months, so they should be done in two, according to the "initial estimate."
Was it "debugging the development process" that called this out? Or "code complete" perhaps?
Pivotal Tracker then looks at story delivery over the past 3 weeks and gives a simple average: velocity. You can then look forward to see approximately when future stories will be completed. You can also see a volatility measurement, which characterises how much velocity is fluctuating.
Our horizon is deliberately short, because we move very quickly.
Nevertheless, it has a simple advantage: it is based on the true and most recent data of the exact project you are estimating.
Other estimation techniques are useful in other situations, but simply being able to say "those are the actual numbers for this project this month" is enormously powerful. There's no fudging. The numbers are right there in black and white.
It usually takes a little while for people new to this approach to accept that velocity is not a target; it's a measurement only. It's a unitless measurement that is only meaningful within a single project, operating at a floating exchange rate with calendar days.
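I don't know Tracker's exact formulas, but a back-of-the-envelope version of the same idea looks roughly like this (the weekly point counts, the backlog size, and using the coefficient of variation as "volatility" are all my own assumptions):

    from statistics import fmean, pstdev

    # Hypothetical history: story points delivered per week on this project.
    points_per_week = [8, 13, 6, 11, 9, 14, 7]

    window = points_per_week[-3:]           # last three weeks, as described above
    velocity = fmean(window)                # average points delivered per week
    volatility = pstdev(window) / velocity  # how much velocity is fluctuating

    backlog_points = 60
    weeks_remaining = backlog_points / velocity
    print(f"velocity ~ {velocity:.1f} pts/week, volatility ~ {volatility:.0%}, "
          f"roughly {weeks_remaining:.0f} weeks of work in the backlog")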
One last thing that helps, as others have pointed out. Break down your estimation tasks into smaller units. Never accept the small headline that hides a big feature. Continuously look for seams to break big stories into small stories. When the pointing begins, summarise aloud with fellow engineers a rough idea of what will need to be done.
Psychologists call this the "unpacking effect", and I suspect that it's responsible for most of the estimation-increasing power discovered in more fully-dressed estimation techniques like PERT, parametric estimation tools or even good old fashioned checklists.
(I was working on an estimation tool for a while, so this subject is dear to my heart).
Another issue: It is really common for people to "take some well deserved time off!" when they finish some piece of the project earlier than anticipated, thus flushing away time that could have helped out on parts that will take longer than expected.
So only the cockeyed optimists reproduced. That's why we can't estimate schedules today.
"It always takes longer than you expect, even when you take into account Hofstadter's Law."
— Douglas Hofstadter, Gödel, Escher, Bach: An Eternal Golden Braid
Cost estimation is an NP problem. Perhaps it will become practical when we can use a quantum supercomputer to estimate the time needed to build a single page web app.
Expected time = (best + 4 * likely + worst) / 6
I went hunting for the origins of that formula a few years ago. I couldn't find the original source, and certainly none of the sources I found had a justification for it.
Someone who works at RAND could probably pull the original internal work and tell us. But I suspect it was chosen for ease of calculation.
While it's been criticised for being normal-esque instead of pessimistically skewed, it still outperforms single-point estimations. I think it's because of the "unpacking effect" that a full PERT estimate causes you to undergo.
There's a literature of people tinkering with the formula, but I don't think it's anywhere near as important as the unpacking effect is.
PERT was invented by the US Navy, but these days the DoD recommends not using it and just doing normal Critical Path Analysis instead, where you estimate a single time - hard enough on its own without having to do it three times.
But as I said elsewhere in this thread, my suspicion is that the unpacking effect dominates the "improvement" that's observed and that the particular formula is largely secondary.
PERT is taught as a time based planning tool but it also has a earned value element. The tools for that have moved on too.
I suppose the unpacking effect does make more sense though, over CPA.
"How long will it take?"
So what do you put in CPA - 6, 7, 8 ? PERT says 7.
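(Reading 6, 7, 8 as the optimistic, most likely, and pessimistic estimates: (6 + 4×7 + 8) / 6 = 42 / 6 = 7.)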
If a project is a month past its deadline, that means it should have begun a month sooner (in order to meet the deadline).
It's surprising how little attention is paid to this fact in the professional world. So many teams and managers focus on the end date, and ignore the start date. In truth, the start and end dates are precisely equal in importance.
Full disclaimer, I wrote a large chunk of the scheduling engine, so if you're curious about it, just shout.
Heads up though, my son was just born, so it might take me a while to respond ;)
When you accept this fact, sometimes projects get done on schedule.
Improve the system, improve the world [cheap, fast, good] lives in. Still have to balance, but you're now balancing cheaper, faster, and gooder.
All this talk of medians and means assumes that you've done the same thing many times before (otherwise you wouldn't have those statistics). In software, if you've done it before, then it's already done, so the estimate is zero.
Projects always involve doing new things, new requirements, using new technology, techniques, or tools; targeting a new system, and/or using new processes and people. New requirements, constraints, market situations, priorities, and discoveries rise up during the project. Sometimes all of the above. Past performance is not a predictor of future results.
It's more realistic to flip it around. Prioritize the most-important requirements and set an initial deadline. Then you'll get something (the most important stuff, or at least some progress towards it) by the deadline and can decide whether to add the less important stuff afterward.
When doing that, quality and other intangibles need no longer be overlooked but become part of the prioritization. Do you want high quality, or more stuff by the deadline? Do you want the team functioning well for the long haul or is it worth burning them out to rush this, then having downtime afterward?
To get more control, shorten the deadlines to the minimal point where you're still getting deployable chunks of acceptable quality. That acknowledges the inherent uncertainty and gives better control and visibility than just making up a random number (or 3).
Are you saying businesses shouldn't have processes?