The longer it has taken, the longer it will take (2015) (johndcook.com)
159 points by haltingproblem 25 days ago | 46 comments

I'm seeing this happen yet again on a project at work which has been super badly managed. It has been going for ~1 year (which is a pretty long time for what it actually is).

The deadline for project completion is 1 October. The due date is approaching quickly. 4 weeks ago, the other teams were telling me their stuff was mostly done and it was just time for tidy-up and bug fixes (which sounded dubious). Now, however, they're all working extremely long days and weekends just to complete.

So we went from almost being done 4 weeks ago with the finish line just days away, to clearly being under quite a lot of pressure to deliver and obviously still a lot to do.

The other thing is, people seem to be spending time on things that aren't really important and not tackling the actual work.

Maybe a lot of this comes down to stress and fatigue from their expectations not being met which ends up affecting performance.

Some usual questions that come to mind...

Are the individual metrics/incentives aligned with progress on the project? Is anyone panicking? Is anyone trying to appease someone else who's panicking, by giving an inaccurate appearance of progress/work? Is anyone stalling for time, while they try to move elsewhere?

Is there a project plan with work breakdown (and interdependencies, and preferably resource allocations) that shows what you've done, and what you still have to do? Are the incomplete tasks broken down to the resolution of 1-2 days, or to hours? Is the plan complete, or is there substantial work to do that doesn't appear in the plan? Is the completion on the plan accurate? How frequently is slippage being checked, and how does that happen? Does anyone have incentive for the plan to be inaccurate at this point?

Are people working on unimportant things because they're blocked on important things by dependencies on other people, but don't want to say it?

Is everyone still putting in full effort, and committed to the project success? Or have some given up on the project, and are focused on shielding their careers?

(Even if the project effectively isn't working from a plan, has bad morale and panicking, there are conflicts of interest, etc... it might not be too late for a good manager to rally everyone around an achievable new plan, possibly including revisiting the requirements. Given that the project is in trouble, I suspect it would need believable buy-in from upper management and the "customer" for this project, or people will still feel doomed, rather than focused on making it work.)

In Project Management class I learned about Parkinson's law which states "work expands so as to fill the time available for its completion". That about sums up my life.

It's fascinating to watch first hand, management's inability to properly estimate project timelines. I think it's from a lack of really digging in and solving the thousands of tiny problems that eventually add up to solving whatever larger problem it is the general organization was trying to solve originally.

Instead, I've seen mostly attacking a problem head on, which involves a really inordinate amount of planning, more planning, some execution, and then more planning. Planning so much that timelines must be altered to make room for more planning.

That's been my experience though.

I believe the desire to blow through timelines planning instead of executing is born out of fear and what I call the sensation of movement. Fear because management doesn't want to mess up and miss something important. The sensation of movement causes management to incorrectly perceive that work is being done.

At the end of all of this, nothing is done because no one has done any of the work to accomplish it, they've only been planning.

I think a non-trivial part of why deadlines are always wrong is something akin to the Coastline Paradox [1], where the closer you are to something, the more details there are.

At some point you have to just say, "Fuck it. It's good enough" and leave it as the terrible flawed pile of crap it looks like at the detail level you're at.

Often once you step back and look at it from a customer perspective, you had just gone in too deep and they didn't need half of what you were preparing for anyway.

[1] https://en.wikipedia.org/wiki/Coastline_paradox

I’m beginning to doubt the reality of anything ever being done in the first place. Done software is dead software that nobody uses anymore.

It seems more fruitful to talk about gradients and equilibriums of desired changes.

I want to agree. But I use TeX. And really, it does as advertised. And set a line for where it stops.

Note that one property described is not true of all distributions. I've used the exponential distribution to predict completion times before, and it has the simple property that the expected remaining time is always equal to the mean completion time at t = 0, no matter how much time has passed. I have no idea if this is accurate or not, just that I've found the simplicity convenient. In contrast, for the power law distribution used in the blog post the mean remaining time increases as time passes. (In both cases, the expected total time from start to finish increases as time passes.)

I chose the exponential distribution because it's the maximum entropy distribution for a positive number with known mean: https://en.wikipedia.org/wiki/Maximum_entropy_probability_di...
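To make the contrast concrete, here is a minimal sketch of the expected remaining time under each model. It assumes a Pareto distribution with scale `x_min` and shape `alpha > 1` as the power law, which may not match the exact parametrization in the blog post; the function names are made up for illustration.

```python
def exponential_mean_remaining(mean: float, elapsed: float) -> float:
    """Expected remaining time for an exponential completion time.

    The exponential distribution is memoryless, so the answer does not
    depend on how much time has already elapsed.
    """
    if mean <= 0 or elapsed < 0:
        raise ValueError("mean must be positive and elapsed non-negative")
    return mean


def pareto_mean_remaining(alpha: float, elapsed: float, x_min: float = 1.0) -> float:
    """Expected remaining time for a Pareto(x_min, alpha) completion time.

    Assumption: survival function (x_min / t) ** alpha for t >= x_min.
    The mean only exists for alpha > 1.
    """
    if alpha <= 1:
        raise ValueError("alpha must exceed 1 for the mean to exist")
    if elapsed < x_min:
        # Nothing finishes before x_min, so no information has been gained yet:
        # remaining time is just the unconditional mean minus the elapsed time.
        return alpha * x_min / (alpha - 1) - elapsed
    # For elapsed >= x_min, E[T | T > elapsed] = alpha * elapsed / (alpha - 1).
    return elapsed / (alpha - 1)


if __name__ == "__main__":
    for elapsed in (1.0, 2.0, 4.0, 8.0):
        exp_left = exponential_mean_remaining(mean=2.0, elapsed=elapsed)
        par_left = pareto_mean_remaining(alpha=2.0, elapsed=elapsed)
        print(f"elapsed={elapsed}: exponential={exp_left}, pareto={par_left}")
```

With `alpha = 2` the Pareto remaining time equals the elapsed time, so the expected total is double the time spent so far, while the exponential answer never moves.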

So in my limited understanding, the exponential distribution is popular because it's very easy to work with -- you can actually analytically solve some queueing problems, for example, that wouldn't be possible with other distributions.

But power-law distributions show up over and over in things that people here care about: file size, network traffic, process lifetimes, etc etc. In these cases the exponential will drastically underestimate the fat tail.

A power law distribution is roughly as easy to work with here as exponential. The blog post contains the power law results for this case, which are fairly easily obtained through a conditional average (conditioning on t > t_0).
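For reference, the conditional average mentioned above can be sketched as follows. This assumes a Pareto form with scale $x_m$ and shape $\alpha > 1$ (symbols chosen here, not taken from the blog post) and $t_0 \ge x_m$; the exponential case with rate $\lambda$ is shown alongside for comparison.

```latex
\begin{align*}
% Assumed power law: Pareto with scale x_m > 0 and shape \alpha > 1
S(t) &= \Pr(T > t) = \left(\frac{x_m}{t}\right)^{\alpha},
\qquad
f(t) = \frac{\alpha\, x_m^{\alpha}}{t^{\alpha + 1}},
\qquad t \ge x_m \\[6pt]
\mathbb{E}[T \mid T > t_0]
  &= \frac{1}{S(t_0)} \int_{t_0}^{\infty} t\, f(t)\, dt
   = \frac{t_0^{\alpha}}{x_m^{\alpha}} \cdot
     \frac{\alpha\, x_m^{\alpha}\, t_0^{1 - \alpha}}{\alpha - 1}
   = \frac{\alpha}{\alpha - 1}\, t_0 \\[6pt]
\mathbb{E}[T - t_0 \mid T > t_0]
  &= \frac{t_0}{\alpha - 1}
  \qquad \text{(Pareto: grows linearly with } t_0\text{)} \\[6pt]
\mathbb{E}[T - t_0 \mid T > t_0]
  &= \frac{1}{\lambda}
  \qquad \text{(exponential: constant, by memorylessness)}
\end{align*}
```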

The important question you bring up is which is more accurate, which I don't have an answer for. But perhaps a reader has data on this. I will note that I compared exponential against some FOIA request processing data a while back and thought it was okay, though I don't remember anything quantitative; see here: https://news.ycombinator.com/item?id=21032750

I think it's likely that something with more parameters like a log-normal distribution would be better than either, but intuitively I doubt you'd be able to get simple equations for the mean remaining time out of that.
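For what it's worth, the log-normal mean remaining time can be written in terms of the standard normal CDF, though it is less tidy than the exponential or power law results. A minimal sketch, assuming ln(T) is normal with parameters `mu` and `sigma` (hypothetical names, nothing here is fitted to data):

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal CDF, written with erfc to stay accurate in the lower tail."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def lognormal_mean_remaining(mu: float, sigma: float, elapsed: float) -> float:
    """Expected remaining time E[T - elapsed | T > elapsed] for a log-normal T.

    Uses the partial expectation
        E[T; T > t] = exp(mu + sigma^2 / 2) * Phi((mu + sigma^2 - ln t) / sigma)
    divided by the survival probability Phi((mu - ln t) / sigma).
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    full_mean = math.exp(mu + sigma ** 2 / 2.0)
    if elapsed <= 0:
        return full_mean
    log_t = math.log(elapsed)
    partial_mean = full_mean * normal_cdf((mu + sigma ** 2 - log_t) / sigma)
    survival = normal_cdf((mu - log_t) / sigma)
    return partial_mean / survival - elapsed


if __name__ == "__main__":
    for elapsed in (0.1, 1.0, 10.0, 100.0):
        print(elapsed, lognormal_mean_remaining(mu=0.0, sigma=1.0, elapsed=elapsed))
```

The survival probability underflows for extremely large `elapsed`, so this is only meant for exploring the shape of the curve, not as a robust implementation.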

One problem with the power law model is that the expected duration at t = 0 is 0. The exponential model does not have that problem. You could fix this for a power law by not having power law behavior for short times.

I learned about the prevalence of Pareto distributions in computer systems from Harchol-Balter's Performance Modeling and Design of Computer Systems[0] (which, I will admit, I properly understood perhaps 25% of).

She reported the Pareto distribution of process times from first data collected in 1997[1].

[0] https://www.cs.cmu.edu/~harchol/PerformanceModeling/book.htm...

[1] https://www.cs.cmu.edu/~harchol/Papers/TOCS.pdf

A reference to Jaynes. :) Have an upvote!

The thing that makes this non obvious is not just that things take longer than you think, but rather as time goes on you have to adjust your expectation of completion to be longer.

For example, say there is a project that is meant to complete in a week. It has now been two weeks, so one week over budget. Most people would think that finishing the project is right around the corner, but rather the expectation should be that it will take another week or two weeks. If you get to the end of a month, same applies - the expectation should be that it will be another month, not that it is right around the corner.

Worse is when folks take a late project and add people to it. They don't realize this just adds to the work, so it will almost certainly increase the time.

You may get lucky and one of the new additions throws out the current plan. Cuts from the sunk costs. Probably, though, that will be its own trap. You'll think you understood your success for the next time.

That’s Brooks’s law: “Adding manpower to a late software project makes it later.”

That’s Hofstadter’s law: “It always takes longer than you expect, even when you take into account Hofstadter's Law.”

Has anyone checked the underlying assumption (that there’s a power law involved)? My belief was that most long tailed distributions don’t actually follow a power law.

Probably a year or more ago I compared some data on FOIA request processing times reported by US government agencies against an exponential distribution (not power law). I used the given mean and compared the predicted median against the actual median. Was acceptable as far as I was concerned, though I'd have to redo the analysis to actually get quantitative.

With a power law there's no mean or median starting at t = 0, so this sort of comparison isn't possible given the FOIA request data available. I'd be interested in seeing the data on the tails to see if those are power law distributed.

Edit: I'll do a quick quantitative check. Using data from the CIA's 2018 FOIA report (p. 18): https://www.cia.gov/library/readingroom/foia-annual-report

Simple track:

Mean: 32.21 days

Median: 12 days (exponential prediction: 22.33 days, 86% error)

Complex track:

Mean: 368.49 days

Median: 306 days (exponential prediction: 255.42 days, 17% error)

Not the best fit but could be worse.
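The check above can be reproduced in a few lines. The only modelling fact used is that an exponential distribution with a given mean has median mean × ln 2; the figures are the ones quoted above, and the function names are just for illustration.

```python
import math


def exponential_median(mean: float) -> float:
    """Median of an exponential distribution with the given mean."""
    return mean * math.log(2)


def relative_error(predicted: float, actual: float) -> float:
    """Absolute error of the prediction as a fraction of the actual value."""
    return abs(predicted - actual) / actual


# (reported mean, reported median) in days, as quoted in the comment above
tracks = {
    "simple": (32.21, 12.0),
    "complex": (368.49, 306.0),
}

for name, (mean, median) in tracks.items():
    predicted = exponential_median(mean)
    error = relative_error(predicted, median)
    print(f"{name}: predicted median {predicted:.2f} days, error {error:.0%}")
```

The exponential over-predicts the median on the simple track and under-predicts it on the complex track, so the two tracks miss in opposite directions.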

I do this to myself too but I've gotten better at it. The longer I take to do something, the more I feel it has to show to be worthwhile, and that feeds back poorly.

I'm pretty sure orgs also have this problem.

My favorite related (not a power law distribution) example of this: life expectancy grows with age.

For some people that's really intuitive and others can use it as a more practical stepping stone on the way to understanding what happens in projects.

The life example is weaker than the project example.

In the project example, not only does the total project time increase with spent time, but the time left increases with spent time.

That is true for any distribution. The mean (expectation) of the part after X+epsilon has to be larger than after X. There is no possible age distribution where your claim isn't true.

OP here. I recall a similar passage in Thinking, Fast and Slow by Daniel Kahneman. He was part of a committee to rewrite textbooks and everyone estimated it to be part of a normal distribution instead of a power law. He lamented how even trained psychologists were bad at estimating completion times in spite of being aware of the data.

I think this holds for most human endeavors that have intellectual property as the end result - software projects, books, doctoral theses, etc.

Two questions - 1- I would really like to understand why? 2- I have always thought lean is the answer to the above in a startup context but I'm really curious to hear of others.

Not just intellectual endeavours, the planning fallacy shows up everywhere.

I have a book called Industrial Megaprojects by Merrow[0] which gives a lot of examples of financially disastrous multi-billion dollar projects. His conclusion? It's not really the doing of the project that was wrong, it was that the projects shouldn't have been done in the first place. The estimates and preliminary investigations were underdeveloped and this typically leads to overoptimism.

Bent Flyvbjerg has also done a lot of work on megaprojects[1] and his basic conclusion is that well-estimated projects don't get built, because almost no megaproject is ever viable or cost-effective in itself. So there's a survivor bias towards "bad" projects, ones that are poorly planned in the first instance, magnifying the effect.

In terms of software, agile/lean has the advantage of limiting commitment. It's easier to terminate something that's cost very little and not gotten far than to terminate something that's several years and millions of dollars into going nowhere. The former can be dismaying and annoying. But by the time serious time and money have been spent, there's a sunk-cost fallacy and often, personal pride or status of powerful individuals is involved.

[0] https://onlinelibrary.wiley.com/doi/book/10.1002/97811192010...

[1] https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2424835

My guess is it is a statistical artefact. If you have a project and you pick a random time in that project then you expect to pick the midpoint.

So, if the only information we have is that the project has gone on for 2 years we expect that to be the midpoint and that the project will continue for 2 years.

In particular if the project manager has lost control of the project (eg, not scheduled in a contingency, missed requirements, etc) then there is no reason to believe that anyone knows what % of the project is done. So assume 50% because that is the Most Likely Estimate. And be surprised at how often that is the right guess in my cynical and not inconsequential experience :p.

If the project manager is in control (usually evidenced by people opining that the project will finish early) then expect the project to finish exactly on time when something unexpected goes wrong.
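The midpoint argument above can be sanity-checked with a small simulation. This is only a sketch under the stated assumption that the moment you look at the project is uniformly random over its (unknown) total duration; the function name and defaults are made up for illustration.

```python
import random


def share_past_midpoint(trials: int = 100_000, seed: int = 0) -> float:
    """Fraction of random observation times at which remaining <= elapsed.

    The total duration is normalised to 1, so `elapsed` is the fraction of
    the project already done and `remaining` is the fraction still to go.
    """
    rng = random.Random(seed)
    past_midpoint = 0
    for _ in range(trials):
        elapsed = rng.random()      # assumption: uniform over the project's life
        remaining = 1.0 - elapsed
        if remaining <= elapsed:
            past_midpoint += 1
    return past_midpoint / trials


if __name__ == "__main__":
    print(f"share of observations past the midpoint: {share_past_midpoint():.3f}")
```

About half of the observations land past the midpoint, which is why "as long again as it has taken so far" works as a median guess when nothing else is known. It says nothing about projects where you do have real information on progress.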

I wrapped a probability distribution around this very phenomenon a few years back:


The expected duration converges to exactly double your wait time so far. I've found the "unreliable friend" distribution to be very useful in my modeling!

Your "exponential timer" page 404s, unfortunately.

Operating room variation: "Everything takes longer than it takes."

Hofstadter's law: It always takes longer than you think, even after accounting for Hofstadter's law.

It always takes 10% more than the developer's estimate, even if he added 10%.

(I say 10%, but my estimates are usually 100% under, even if I added 100%)

It’s fun, but it’s not true. Some projects finish in time when planned with a large margin of safety.

In defense of Hofstadter's Law: it's about perception more than actual outcomes. And it doesn't predict infinite time. It can be converging to a limit, just that the limit is higher than expected.

The nice part is the recursion in the way this law is formulated.

Taleb might say this is a sibling to the Lindy Effect


He mentions the Lindy effect. I don't think I learned anything new here, except there's a scientific name for something pretty intuitive.

I've heard this paraphrased another way as the time it takes a company to reach its peak is its half life. The faster a company grows, the faster it will also decline.

See also: getting kubernetes into production anywhere.

Often there are powerful incentive effects at work that can explain the "lateness". The party making the estimate may be rewarded for making optimistic estimates (winning a bid, winning votes, staying in power). And that reward may be greater than the penalty for being "late", which can often be blamed on factors that appear unpredictable.

So the challenge for organizations who want predictability is to set up their system to balance the rewards and penalties to get the desired result.

I think that's definitely a factor. I came into a project once where that had happened. The initial estimate was a mistake and the contractor decided to up the estimate incrementally whenever they reviewed status. The client became annoyed because "every time they estimate completion, it is taking longer." (I had worked for both and came in as a third party and due to that, was able to get things back on track.) The initial situation brings me to the other major factor I see in these situations. Anything that was overlooked or comes up unexpectedly results in more work. The situations where unexpected situations result in less work are as rare as hen's teeth. This results in underestimating more often than not.

Sell a lot low, deliver less late

The root cause is usually poor resource estimates (i.e. not spending enough time estimating effort before committing to timelines).

Some estimating techniques: https://www.simplilearn.com/project-estimation-techniques-ar...

For a team’s ongoing everyday tasks, I’ve used parametric estimation with some success by defining team weekly capacity for productive work and then summing up the parametric estimates for all backlog tasks. You can then get a rough idea for the number of weeks to complete a bigger, more complex task.

You can even build a model to automate this estimation.
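A minimal sketch of what such a model might look like, assuming each backlog task is described by a quantity and an hours-per-unit rate (the parametric part) and the team has a fixed number of productive hours per week. The task names and numbers are invented for illustration.

```python
from typing import List, Tuple

# (task name, number of units, estimated hours per unit) -- all hypothetical
BacklogItem = Tuple[str, int, float]


def total_hours(backlog: List[BacklogItem]) -> float:
    """Sum the parametric estimates: units x hours per unit for every task."""
    return sum(units * hours_per_unit for _, units, hours_per_unit in backlog)


def weeks_to_complete(backlog: List[BacklogItem], weekly_capacity_hours: float) -> float:
    """Rough number of weeks needed at the given productive weekly capacity."""
    if weekly_capacity_hours <= 0:
        raise ValueError("weekly capacity must be positive")
    return total_hours(backlog) / weekly_capacity_hours


backlog: List[BacklogItem] = [
    ("API endpoints", 6, 4.0),
    ("report pages", 3, 6.0),
    ("data migrations", 2, 5.0),
]

if __name__ == "__main__":
    capacity = 30.0  # assumed productive hours per week for the whole team
    print(f"total estimate: {total_hours(backlog):.0f} hours")
    print(f"rough duration: {weeks_to_complete(backlog, capacity):.1f} weeks")
```

The capacity figure matters as much as the task estimates: it should count only hours actually available for productive work, not nominal working hours.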

I recall a study that IBM did on S/W defects decades ago. My recollection is that they concluded that the more defects had been found, the more still remained. That sounds like a corollary to "the longer it has taken, the longer it will take."

Isn't this obvious?

This is why features get cut and crunch is a thing. There are only so many ways to prevent the finish line from falling off the horizon.

I think what this points out, which may be counter-intuitive, is that if there is sufficient complexity in the project that has driven it to be late, there is likely hidden complexity that will push it even further.

Complexity grows exponentially (a graph of inter-relationships) and so even cutting a feature isn't enough to curb the ballooning complexity of an over-scoped or poorly estimated project.

I could be projecting from past experience ;)

Definitely applies to doctoral thesis projects, especially the writing up! :D

Are there exceptions? If so, what is special about them?
