
Why software projects take longer than you think – a statistical model - mzl
https://erikbern.com/2019/04/15/why-software-projects-take-longer-than-you-think-a-statistical-model.html
======
afarrell
An important aspect of being a professional software engineer is having the
backbone to sometimes say things like:

\- “I don’t yet know enough about the problem to give you even a rough
estimate. If you’d like, I can take a day to dig into it and then report
back.”

\- “This first part should take 2-3 days. 5 on the outside. But the second
part relies heavily on an API whose documentation and error messages are in
Chinese and Google Translate isn’t good enough. I’d need to insist on
professional translation in order to even estimate the second part.”

\- “The problem is tracking down a bug rather than building something, so I
don’t have a good way of estimating this. However, I can timebox my
investigation and if I’ve not found the cause at the end of the timebox, I can
work on a plan to work around the bug.”

You need to be willing to endure the discomfort of looking someone in the
face, saying “I don’t know”, and then standing your ground when they pressure
you to lie to them. They probably don’t want you to lie, but there is a small
chance that they pressure you to. If you don’t resist this pressure, you can
end up continually giving estimates that are 10x off-target, blowing past them
as you lose credibility, and running your brain ragged with sleep-deprivation
against a problem you haven’t given it the time to break down and understand.

But when you advocate clearly for your needs as a professional, people are
generally reasonable.

~~~
Aeolun
> But when you advocate clearly for your needs as a professional, people are
> generally reasonable.

This has not been my experience. People want ‘estimates’ at all costs, tell
you to not worry about any accuracy, and then a week later tell your manager
you committed to x date.

~~~
nahname
As long as software is a cost center to your company, it will be treated this
way. Try getting a job at a company where software is a core concern.
Ideally with a CEO that is not from a marketing/business background.

~~~
afarrell
For more on this sort of thing, Patrick McKenzie’s writing is good:

\- [https://www.kalzumeus.com/2011/10/28/dont-call-yourself-a-
pr...](https://www.kalzumeus.com/2011/10/28/dont-call-yourself-a-programmer/)

\- [https://www.kalzumeus.com/2014/04/09/what-heartbleed-can-
tea...](https://www.kalzumeus.com/2014/04/09/what-heartbleed-can-teach-the-
oss-community-about-marketing/)

~~~
eikenberry
> In the real world, picking up a new language takes a few weeks of effort and
> after 6 to 12 months nobody will ever notice you haven’t been doing that one
> for your entire career.

Spoken as someone who has never taken the time to fully master a programming
language and, from the sound of it, has never worked with someone who has
either. The difference between someone who has spent 6-12 months with a
language and someone who has spent 6-12 years is night and day. From the
general tone of the article, he obviously focuses more on the business value
than on the technical side, and that is a pretty good approach for making
money. But I'll take Peter Norvig's advice
([http://norvig.com/21-days.html](http://norvig.com/21-days.html)) over this
guy's when it comes to mastering a language.

To be fair, most of the content of those articles is pretty decent; it's just
a pet peeve of mine when people claim that mastery of your medium doesn't
matter.

~~~
Aeolun
In my experience, having mastery over one language translates more or less
directly into being at least journeyman in all others.

If you are working with an apprentice, whatever you do, it will seem like you
are really experienced.

------
bunderbunder
I've been one place that I thought was _really_ good at software estimation.
Their system was:

Everything gets a T-shirt size. Roughly, "small" is no more than a couple
person-days, "medium" is no more than a couple person-weeks, "large" is no
more than a couple person-months.

Anything beyond that, assume the schedule could be unbounded. Figure out how
to carve those into a series of no-larger-than-large projects that have
independent value. If they form a series of iterations, don't make any
assumptions about whether you'll ever even get around to anything but the
first one or two. That just compromises your ability to treat them as
independent projects, and _that_ creates the risk of finding yourself worrying
about sunk costs and writing down effort already expended when it eventually
(and inevitably) turns out that you need to shift your attention to address
some unforeseen business development.

At the start of every quarter, the team would commit to what it would get done
during that quarter. There were some guidelines on how many small, medium or
large projects they could take on, but the overriding principle was that you
should under-promise and over-deliver. _Lots_ of slack (1/3 - 1/2) was left in
everyone's schedule, in order to ensure ample time for all the small urgent
things that inevitably pop up.

There was also a log of technical debt items. If the team finished all their
commitments before the end of the quarter, their reward was time to knock
things off that list. Best reward ever, IMO.

~~~
SketchySeaBeast
> Everything gets a T-shirt size. Roughly, "small" is no more than a couple
> person-days, "medium" is no more than a couple person-weeks, "large" is no
> more than a couple person-months.

That's pretty much exactly what I've ended up using on past projects - a
little more fine-grained (started at a half day, went up to months), but that
was my approach as well, and if I didn't know everything it went up a size.

~~~
karthikb
Did you find much differentiation between a half day and a couple of days?
Especially because some things that might take a couple of days end up taking
30 mins (some efficient package already exists), and some half-day things end
up taking a couple of days, so it comes out in the wash?

~~~
bunderbunder
The answer to that question depends heavily on the duration and nature of your
planning iterations.

If you're doing quarterly planning, the difference between half a day and a
couple of days is meaningless, and there's not really any point in
distinguishing between them.

If you're doing 1-week sprints, the difference between half a day and a couple
of days is enormous, and the product planner might get some value out of
distinguishing between them.

If you're following a more kanban-y approach, the difference is perhaps
meaningful, but not particularly actionable, so I think I (personally) still
wouldn't bother to capture the distinction for planning purposes.

------
teddyh
According to Joel Spolsky¹, programmers are generally bad at estimating, but
they are _consistently_ bad, with the exact factor depending on the
individual. So by measuring each person’s estimate and comparing it to the
actual time taken after the fact, you can determine each person’s estimation
factor, and then when they estimate again, you can get a pretty reliable
figure.

1\. [https://www.joelonsoftware.com/2007/10/26/evidence-based-
sch...](https://www.joelonsoftware.com/2007/10/26/evidence-based-scheduling/)
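
A minimal sketch of that idea (not the actual FogBugz implementation; the
history and task numbers below are invented): keep each developer's past
estimate/actual ratios, then Monte Carlo the new estimates against randomly
drawn historical ratios to get a distribution of completion times rather than
a single number.

    import random

    # Hypothetical history of one developer's past tasks: (estimated, actual) hours.
    history = [(4, 6), (8, 9), (2, 7), (16, 20), (5, 5), (3, 10)]

    # Per-task velocity in Joel's sense: estimate / actual (1.0 = perfect, <1 = optimist).
    velocities = [est / act for est, act in history]

    # New estimates (hours) for the upcoming work.
    new_estimates = [8, 3, 13, 5]

    def simulate_totals(estimates, velocities, rounds=10_000):
        """Monte Carlo: scale each estimate by a randomly drawn historical velocity."""
        totals = []
        for _ in range(rounds):
            totals.append(sum(est / random.choice(velocities) for est in estimates))
        return sorted(totals)

    totals = simulate_totals(new_estimates, velocities)
    for p in (50, 75, 95):
        print(f"P{p}: {totals[int(len(totals) * p / 100)]:.1f} hours")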

~~~
tonyedgecombe
You know that article was written to sell a feature in their bug tracker. I
like Joel's writing but I'd take that piece with a pinch of salt.

~~~
teddyh
If we try to be a bit charitable, we could assume that they implemented the
feature in the bug tracker _because_ of this observed property of estimates.

------
acd
The US Navy developed something similar using the beta distribution. You give
"optimistic", "most likely" and "pessimistic" time estimates for each task in
the project and then fit a beta distribution to them. Some tasks take way
longer than estimated.

Here is the link to the time estimation described above with Beta
distribution. [https://www.isixsigma.com/methodology/project-
management/bet...](https://www.isixsigma.com/methodology/project-
management/better-project-management-through-beta-distribution/)
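
A minimal sketch of the classic PERT formula that article describes (the task
numbers below are invented): the beta-distribution mean is approximated as
(optimistic + 4 × most likely + pessimistic) / 6, so a long pessimistic tail
pulls the expected time well above the most-likely guess.

    def pert_estimate(optimistic, most_likely, pessimistic):
        """PERT approximation of the beta-distribution mean and standard deviation."""
        mean = (optimistic + 4 * most_likely + pessimistic) / 6
        std_dev = (pessimistic - optimistic) / 6
        return mean, std_dev

    # Hypothetical task: 2 days if all goes well, 4 days most likely, 15 if it blows up.
    mean, sd = pert_estimate(2, 4, 15)
    print(f"expected ~ {mean:.1f} days, sigma ~ {sd:.1f} days")  # expected ~ 5.5, sigma ~ 2.2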

~~~
maltalex
I find this approach very interesting, but it hinges on the assumption that
project completion times follow a beta distribution. What's the basis for
that?

~~~
kqr
It may reflect the observations in OP kind of well -- with one difference: it
assumes we're good at estimating the mode, not the median. But other than
that, within the range we're talking about (1 < alpha < beta) it has somewhat
similar shape to the lognormal distribution.

The three things that still bother me about that idea are:

1\. I haven't tried fitting it to the dataset in OP;

2\. It's bounded to the right, which seems unrealistic;

3\. I haven't come up with intuitive interpretations for the alpha and beta
parameters in this context. If the beta distribution means something, then its
parameters must have natural interpretations as well.

------
mbesto
As always, my favorite article on this subject:
[https://www.lesswrong.com/posts/CPm5LTwHrvBJCa9h5/planning-f...](https://www.lesswrong.com/posts/CPm5LTwHrvBJCa9h5/planning-
fallacy)

> _A clue to the underlying problem with the planning algorithm was uncovered
> by Newby-Clark et al., who found that asking subjects for their predictions
> based on realistic “best guess” scenarios, and asking subjects for their
> hoped-for “best case” scenarios . . . produced indistinguishable results._

> _So there is a fairly reliable way to fix the planning fallacy, if you’re
> doing something broadly similar to a reference class of previous projects.
> Just ask how long similar projects have taken in the past, without
> considering any of the special properties of this project. Better yet, ask
> an experienced outsider how long similar projects have taken._

~~~
jfehr
Daniel Kahneman calls this the "inside view" and "outside view", from his
book Thinking, Fast and Slow.

The relevant excerpt (mostly an anecdote that serves as an introduction to a
whole _chapter_ about it) can be found here:
[https://www.mckinsey.com/business-functions/strategy-and-
cor...](https://www.mckinsey.com/business-functions/strategy-and-corporate-
finance/our-insights/daniel-kahneman-beware-the-inside-view)

------
basetop
The Mythical Man-Month (required reading for most CS programs) goes into the
historical and production aspects of why software projects take longer than
you think and expect.

Also, there is a law named after the author, Brooks's law: "adding
human resources to a late software project makes it later."

[https://en.wikipedia.org/wiki/Brooks's_law](https://en.wikipedia.org/wiki/Brooks's_law)

In most industries, if you are running behind schedule, you throw more workers
at the problem to catch up. For example, laying railroad tracks, digging
ditches, delivering packages, harvesting crops, etc. By adding more workers,
you shorten the time it takes to complete the project. But with software
engineering, the reverse tends to happen. If you are falling behind, just
throwing more developers at the problem worsens it. Most likely this is
because you need the new developers to get "caught up" with the
code/project/tools, but if you rush that process, then they won't have a full
understanding of the project/code/tools and will introduce bugs/problems
themselves, which exacerbates the problem.

It's a fun read if you have the time.

~~~
rainhacker
Given that most software project estimates are off, I wonder if a corollary of
Brooks's law could be: don't add resources in the later stages of 'any'
software project.

~~~
bostonvaulter2
Ah, but how do you know what stage of the software project you're in? Are you
3 months into a 4 month project or are you 3 months into a 5 year project?

~~~
rainhacker
Maybe, instead of asking whether I'm x months into a y-month project (assuming
y months is the initial estimate), ask whether the project is x% feature
complete, and use the remaining features to gauge how far the project has
progressed. Though this approach has the problem of scope creep, as
requirements are fluid, especially in a long-running project.

------
hnzix
My rule of thumb: take your estimate, double it, then add 20%. I'm not joking.

~~~
thatoneuser
Eh. I've been successful adding 20% in. I feel like if you have to double
first (meaning your end result is 220% of what you originally estimated) then
you aren't learning from previous mistakes. Maybe 220% is appropriate for the
first time you do the work or work with a certain team, though.

~~~
crazygringo
> _then you aren't learning from previous mistakes_

???

The doubling-it is for when you build the whole component on top of a library,
then discover that the library has a fatal bug you can't work around, and you
have to rebuild the whole component on top of a different library, then
discover that other library has _another_ fatal bug, so now you need to
include _both_ libraries with logic around when to use which one, and then
that seems to work but you add plenty of tests and documentation to make sure
it works that way in prod too and not just on your dev machine.

I don't see what that has to do with learning from previous mistakes. Pretty
much all but the most basic programming turns out to be like that -- dealing
with unforeseeable and undocumented problems.

(The 20% is because you weren't planning for sick days, an unforeseen
emergency bugfix on another project, nobody remembered about the afternoon
retreat next week, etc.)

~~~
lostctown
Going through this now and boy is this true. Nothing like rewriting a 1k+ lib
the day before a feature is supposed to be ready and then praying you didn't
blow up some other lesser documented part of the code.

------
onion2k
This is why I like 3 point estimation[1] - if you have optimistic, expected
and pessimistic estimates for each task you can pull out which points are high
risk. Using a single estimate can't give you that insight.

[1] [https://en.wikipedia.org/wiki/Three-
point_estimation](https://en.wikipedia.org/wiki/Three-point_estimation)
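
One way to act on that, as a minimal sketch (the task names and numbers are
invented): rank tasks by the spread between their pessimistic and optimistic
estimates, which a single-number estimate simply can't express.

    # Hypothetical tasks with (optimistic, expected, pessimistic) estimates in days.
    tasks = {
        "login form":         (1, 2, 3),
        "report export":      (2, 3, 5),
        "payment gateway":    (3, 5, 20),
        "legacy data import": (2, 4, 30),
    }

    # Use the pessimistic-minus-optimistic spread as a crude risk score and rank by it.
    by_risk = sorted(tasks.items(), key=lambda kv: kv[1][2] - kv[1][0], reverse=True)
    for name, (o, m, p) in by_risk:
        print(f"{name:18s} spread = {p - o:>2} days")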

~~~
vbuwivbiu
manager: "thanks for the optimistic estimate!"

~~~
onion2k
Sure, in the same way a bad manager will say "No, that's too high, I'm going
to reduce your estimate" if you use a single number. Bad managers are a thing.
Make sure you get a good one.

~~~
nicoburns
Yep. I still remember my first manager, who doubled every estimate I gave him.
He was great.

~~~
onion2k
Definitely a good manager, but Hofstadter's Law says that's still too low. :)

[https://en.wikipedia.org/wiki/Hofstadter%27s_law](https://en.wikipedia.org/wiki/Hofstadter%27s_law)

------
adrianmonk
One strategy for dealing with risk is to order your tasks so the highly
variable ones come first. That way, as you progress through the project, you
eliminate a lot of unpredictability, and it's smoother sailing toward the end.

However, as the tables in this article show so clearly, one task can dominate
others. This serves as a great illustration of a perception problem the above
strategy can create. Outside observers watch your progress and will probably
evaluate it in terms of number of tasks completed.

You have (say) 10 tasks to complete, and from their point of view, all they
know is a really long time has passed and you haven't even completed 1 of
them! Since they are further removed from the situation than you are, they are
almost guaranteed to not appreciate why the variable task is that way. They're
likely to just think your team is performing badly.

So, maybe it's better to order tasks so that risky stuff is spread more evenly
across the timeline of the project. It could create less confusion. Putting
risk at the beginning is a strategy that requires a whole lot of trust and
buy-in.

------
mikekchar
The interesting thing is that by the central limit theorem, the mean of a mean
is normally distributed. This is extremely helpful. Here's what I suggest you
do:

Same-size your stories to small values. Do 30 stories in a sprint and take the
mean. Do 30 sprints and take the mean across sprints. What you get is the mean
amount of time to do a sprint of 30 stories. What's amazing is that this
estimate will be normally distributed. You can measure the variance to get
error bars.

Of course, that's 900 stories to get good estimates ;-) However, imagine that your
stories averaged 2 days each. Imagine as well that you have a team of 10
people. That means that you will finish a "sprint" of 30 stories in 6 days (on
average). 30 sprints is 180 days -- the better part of a year, but you
probably don't need a 95% confidence interval.

You will find that after a few sprints, you'll be able to predict the sprint
length pretty well (or if you set your sprints to be a certain size, then you
will predict the number of stories that will fit in it, with error bars).

The other cool thing is that by doing this, you will be able to see when
stories are outliers. This is a highly undervalued ability IMHO. Once a story
passes the mean plus the variance, you know you've got a problem. Probably
time to replan. If you have a group of stories that are exceeding that time,
then you may have a systemic estimation problem (often occurs when personnel
change or some kind of pressure is being applied to the team). This kind of
early warning system allows you to start trying to find potential problems.
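
A minimal sketch of that bookkeeping, with made-up story durations (and using
two standard deviations as the outlier threshold rather than the mean plus the
variance described above):

    import statistics

    # Hypothetical durations (days) of recently completed, similar-sized stories.
    past_durations = [1.5, 2.0, 2.5, 1.0, 3.0, 2.0, 1.5, 2.5, 2.0, 4.0,
                      1.5, 2.0, 3.5, 2.0, 1.0, 2.5, 2.0, 1.5, 3.0, 2.0]

    mean = statistics.mean(past_durations)
    sd = statistics.stdev(past_durations)
    n = 30  # stories per "sprint" in the scheme above

    # By the central limit theorem the total of n story durations is roughly normal,
    # so a sprint gets an expected length with approximate 95% error bars.
    sprint_mean = n * mean
    sprint_sd = sd * n ** 0.5
    print(f"sprint: {sprint_mean:.0f} person-days +/- {1.96 * sprint_sd:.0f}")

    # Early-warning check: a story still open past this is probably an outlier.
    print(f"replan any story older than {mean + 2 * sd:.1f} days")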

This is really the secret behind "velocity" or "load factor" in XP. Now, does
it work on a normal team? In my experience, it doesn't because groups of
people are crap at calmly using statistics to help them. I've had teams where
they were awesome at doing it, but that was the minority, unfortunately.

~~~
piccolbo
The central limit theorem is in the limit for the number of variables in the
sum approaching infinity. In the finite world, the article explains how it's
done. The article is saying, the sum of lognormals is not normal. You are
saying: take enough of them and it is normal. The article is still more
accurate than your reasoning for 30 stories. From the wikipedia entry for
Central limit theorem " As an approximation for a finite number of
observations, it provides a reasonable approximation only when close to the
peak of the normal distribution; it requires a very large number of
observations to stretch into the tails". To produce a 95% confidence interval,
you have to upper-bound the tails. All methodologies that are based on summing
subtask estimates are not evidence-based. But we already knew software
methodologies are not evidence-based, didn't we?

~~~
ska

        You are saying: take enough of them and it is normal.
    

This doesn't completely undermine your point, but that isn't what they are
saying, I think. I read it as saying, by the CLT, that the estimate of the
mean of those distributions is normal and centered on [the mean you are
actually interested in]. Tails are perhaps somewhat of a red herring here,
because you don't really care about them unless you are specifically trying to
evaluate worst-case-but-really-unlikely.

~~~
mikekchar
Yes, that is correct. It's been a very long time since I studied statistics,
so I'm not sure if the variance of a mean has the same confidence interval as
the mean. I suspect not. So you would indeed need to have a very large number
of samples to get good error bars. It's a good point which I hadn't really
considered. However it will never really get that far anyway because hopefully
you'll intervene before the long tail hits you.

I think those really long tails are more of a problem when you are working
with "features" that are much longer. If you have 1 day stories and you've
been working on the story for a whole week, you know you have a massive
problem. It's time to back up and see if there is a way to break it up, or to
do it differently.

If you have a feature that is a month, by the time you get to 5 months, you
have so much capital invested in the original plan that it's very hard
(politically) to say, "Nope... this isn't working out. Let's try something
else". Of course, it is very hard to get your organisation to plan to a 1 day
level of granularity.

------
yawz
Even the language is not correct: we all call it an "estimate", but
stakeholders behave like it's a "commitment". The passage from uncertainty to
certainty happens in the language, and all the responsibility is on the
engineering team's shoulders.

~~~
adrianmonk
Even if you choose the right words, people don't necessarily pay close
attention. And if they want a commitment, they may assume that the numbers you
give them are a commitment regardless of how you phrase it.

But even if they did listen closely to what you said, "estimate" is not even a
great word. If your car is in a wreck, a body shop gives you an estimate to
fix it, and people may treat that estimate as a commitment. It's pretty common
practice that people are held to an estimate, or to not going over it by a
certain small percentage. Maybe we need a word like "forecast" or
"prediction".

~~~
Bjartr
> Maybe we need a word like "forecast" or "prediction".

Scrum.org (the org behind that flavor of agile) actually changed the language
used in their guide back in 2011, from developers making a "sprint commitment"
to making a "sprint forecast", for exactly these reasons.

[https://www.scrum.org/resources/commitment-vs-
forecast](https://www.scrum.org/resources/commitment-vs-forecast)

------
wellpast
> Instead, figure out which tasks have the highest uncertainty – those tasks
> are basically going to dominate the mean time to completion.

From the _technical_ side of things, uncertainty can mean a few things here:

(A) I've never done this kind of task (or I don't remember or didn't write
down how long this task took in the past)

(B) I don't know how to leverage my historic experience (e.g., implementing an
XYZWidget in React and implementing the same widget in Vue or Elm for some
reason take different amounts of time)

Considering (A)... _Rarely_ does a seasoned developer in the typical business
situation encounter technical tasks that are fundamentally different from what
has been encountered before. Even your bleeding-edge business idea using
modern JS + GraphQL is still going to be built from the same _fundamental_
pieces as your 1999 CRUD app using SOAP, and the estimates are going to be the
same.

If you disagree with this you are in the (B) camp or you haven't done the work
to track your estimates over time and see how ridiculously accurate estimates
can be for an experienced practitioner. Even "soft tasks" like "design the
widget" are estimable/repeatable.

This whole you-can't-estimate-software-accurately position is entirely a
position of inexperience. And of course all bets are off there. You are
talking about estimating _learning_ in this case, not _doing_. And the bets
are especially off if you aren't modeling that these are two different
activities: learning and doing.

~~~
afarrell
> is entirely a position of inexperience

There are a lot of inexperienced software engineers and very little good
guidance written for them. What is a new CS grad to do when asked for an
estimate? How can a new grad learn to produce accurate estimates within 3
months?

~~~
wellpast
A problem with our industry in this regard is that we don't understand the
difference between _learning_ and _doing_.

A new grad entering industry is going to be doing _a lot_ more learning than
doing -- or rather learning _while_ doing.

I know from experience that explicit _doing_ is highly predictable/estimate-
able for the experienced software practitioner.

I have a suspicion that the _learning_ side would be predictable, too, if our
industry could do a better job of articulating what the _practitioner_ skill
set actually is -> then a pedagogy could develop and it could be said that
learning X-Y-Z takes on average this long for a set of students. Etc.

But we as an industry do not seem to be near this level of clarity -- in large
part because we don't even have the vocabulary to frame it this way... in
terms of _learning_ vs _doing_...

Now, what this means for the new CS grad is not the best story. You'll have to
play the chaotic game a little bit, which includes a mess of things like
doubling estimates and working weekends or what-have-you, depending on the
work culture you find yourself within.

That ^^ in the short term.

In the long term what you should do is practice on your own:

1) ALWAYS privately plan and estimate your tasks to the best of your ability,
on your own; you may not benefit from exposing your "practice" to the
higher-ups

1a) hint: your tasks should be as scoped/small as you can make them and they
will look pretty simple, like this: "design widget X", "design API interface",
"implement API interface", "test API interface", "implement widget Y", "learn
how framework X lifecycle works" (yes! even learning tasks can be estimated!),
and so on. The key is that you try to keep them under a day and ideally even
smaller on the order of hours.

2) RECORD the time it takes and compare to your estimates. --> LEARN from
this.

3) REPEAT

If you do this conscientiously you will find that your estimates improve
dramatically over time and you'll be able to out-estimate your peers to their
own wild confusion.

This skill set will pay off in your ability to deliver AND have weekends and
hours with your family. Because you will be able to "see time" and protect
yourself. You'll have protection and ammo even in the worst pressure-cooker
environments because you will be able to say "No" with confidence. Or rather
you will learn how to say "Yes" and what to say "Yes" to. (Ie. you will get
really good at priority negotiation etc.) And you'll cultivate delivery trust
with superiors and everyone will be happy.

The main reason you get these claims that "business just doesn't understand
software" and "they put too much pressure on us" is that the "business" side
doesn't trust us. Once you get good at estimates and _delivering_ on them,
that trust is restored and everyone can be made happy.

BUT -- and here's the rub -- it takes _time_ and conscientious _effort_ and a
variety of experience to reach this level of confidence. My advice: ignore all
the philistines who say it can't be done... because they'll just try to talk
you out of the effort that they weren't or aren't willing to put in themselves.

------
oli5679
In the UK, bookmakers offer 'accumulator' bets, where a punter can select many
outcomes, getting a big prize if 100% correct.

This takes advantage of punters' failure to accurately multiply probabilities -
10 events with 80% probability each have a joint probability below 11%.

Something similar happens with planning, where people fail to compound many
possible delay sources accurately.
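
The same arithmetic applies to a plan: ten independent steps that each look
80% likely to go smoothly almost never all go smoothly.

    # Ten independent tasks, each with an 80% chance of finishing on time.
    p_all_on_time = 0.8 ** 10
    print(f"chance everything goes to plan: {p_all_on_time:.1%}")  # about 10.7%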

Daniel Kahneman covered this in Thinking, Fast and Slow, also showing that
people overestimate their ability, thinking they will outperform a reference
class of similar projects because they are more competent.

[https://en.m.wikipedia.org/wiki/Planning_fallacy](https://en.m.wikipedia.org/wiki/Planning_fallacy)

~~~
Sahhaese
It doesn't take advantage in the way you say, because you get paid along the
same lines. If you bet an accumulator with 2 selections at evens, you get paid
at 3/1 (4.0 decimal odds), so that is "fair".

It's profitable for bookies because they have a house edge, and that edge is
increased the more subsequent bets you make. The house has more edge with an
accumulator than a single bet.

People like to do accumulators because it's more meaningful to win large
amounts occasionally than to more regularly win less meaningful amounts.

So it's a "trick" to simply increase gambling.

If you had to pick out 8 evens shots in sequence and intended to roll over
your bet each time, it would have the same effect/outcome, but starting with
a pound, by the last bet you're effectively placing a 128-quid bet at even
money.

It's not that the player thinks they have a better chance of winning than
1/256; it's that it effectively forces them to gamble a lot larger amount in
the situations where only 7 out of 8 of their picks come in.

And that's before considering the edge. If we consider that these are probably
events that happen more like only 45% of the time (at best), then instead of a
255/1 shot we're looking at a 600/1 shot.

------
bloorp
One big factor that I don't THINK the article touches on is how we are using
'space' metaphors (speed and distance) for our work that's very 'time' based
(working productivity and duration). And I think when we estimate, we try to
estimate distance/duration but we forget that we're really trying to estimate
our speed/productivity.

Where that gets painful is, say you estimate you can get something done in two
days. In reality, you're twice as fast (there's that speed metaphor), so you
actually get it done in one day. Yay, you saved a day! Now assume you're twice
as slow as your two day estimate. Boo, you spent TWO days longer. So in terms
of duration, getting it incorrect in the painful direction seems like a bigger
mistake.

I don't think this is the same phenomenon as the author's mean vs. median
dilemma. I'll bet both the mean vs. median and the productivity vs. duration
dilemmas are real factors though.

------
SilasX
Pet theory: this is entirely explained by unknown systems not behaving as
expected. As developers, and unlike e.g. carpenters, we are constantly using
new tools with effects we haven't yet experienced. Then we have to yak-shave
to get around their heretofore unknown kinks. Then the time blows up.

If and when you're using known features of a known framework, and that's _all_
you're doing, the estimates are accurate and the work is completed quickly.

~~~
maltalex
I disagree. Estimations tend to be just as wrong even when the tools are well
known.

There's always that one edge case you haven't considered, that one algorithm
that doesn't work as well as you expected, that small change to the
requirements that requires a completely different approach.

~~~
SilasX
"Just as" wrong? I don't know what to point to to resolve disagreement here
since it's just anecdotal, but if you're just using the same feature you've
used hundreds of times before, there is nowhere near the potential for yak-
shaving snags.

~~~
maltalex
Okay, sure. Not "just as" wrong since a certain class of pitfalls are
eliminated by deep knowledge of your tools.

But I'd argue that that's not where the bulk of the uncertainty comes from. I
think that in the software industry as a whole, most of the problems are
solved by people with at least a decent understanding of their tools, and
estimates still suck.

The problem is that the tasks we deal with are always new, unsolved problems.
That's the nature of software. Unsolved problems come with uncertainty. That's
the nature of unsolved problems.

Carpenters on the other hand deal mostly with solved problems. They just need
to execute.

------
pjungwir
Here is a story about the woes of interpreting statistical distributions:

I have two habits when I estimate: First, I like to give a range or, even
better, a triple estimate of "optimistic", "expected", "worst case". (And btw,
you should expect things to skew toward the pessimistic side, because you
always find new requirements/problems, but you rarely discover that a supposed
requirement is not really one.)

Second: I like to break down a project into tasks of just a few hours and then
add everything up. I usually don't share that spreadsheet with the customer,
but it helps me a lot. Pretty much always it gives me a number higher than I'd
like, but which is almost always very accurate. A couple times I've ignored
the result and given a lower estimate because I really wanted some project,
and it has always turned out the spreadsheet was right.

Well, one time I combined these two approaches, so that my very finely-chopped
estimates all had best/expected/worst values, _and_ I shared that with the
customer. Of course they took one look at the worst-case total and said, "How
can a little thing like this possibly take 2 years??" I didn't get the work.
:-)

EDIT: Btw it feels like there is a "coastline paradox" here, where the more
finely you estimate, the higher the max possible, so that you can make your
estimate grow without bound as long as you keep splitting items into greater
detail. It'd be interesting to see the math for that.
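
A minimal Monte Carlo sketch of that effect, assuming the article's lognormal
model and made-up numbers: the sum of per-task worst cases grows roughly in
proportion to how finely you split the work, while the realistic worst case of
the whole project stays far lower, because not everything goes wrong at once.

    import random

    random.seed(0)

    # Hypothetical project: 40 small tasks, each with a median of 0.5 days and a
    # lognormal actual duration (sigma = 1, roughly the article's model).
    n_tasks, median, sigma, rounds = 40, 0.5, 1.0, 20_000

    def draw_task():
        return median * random.lognormvariate(0, sigma)

    # 95th percentile of a single task.
    single = sorted(draw_task() for _ in range(rounds))
    task_p95 = single[int(0.95 * rounds)]

    # 95th percentile of the whole project (the tasks don't all blow up together).
    project = sorted(sum(draw_task() for _ in range(n_tasks)) for _ in range(rounds))
    project_p95 = project[int(0.95 * rounds)]

    print(f"sum of per-task worst cases: {n_tasks * task_p95:.0f} days")
    print(f"worst case of the sum:       {project_p95:.0f} days")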

~~~
pjungwir
EDIT2: In spite of my personal experience I do think the author makes a strong
case for this: "Adding up task estimates is a really misleading picture of how
long something will take." Perhaps I've had better results because I try to
give myself a little padding in every task, just considering that everything
requires not just typing code but getting on some calls, having back-and-forth
in emails and the project management tool, testing (automated or manual),
fixing a thing or two, etc. So my individual estimates are probably a bit
higher than median. When I work with other developers they consistently
estimate lower than me. But my numbers are deliberately not "best case",
because then you _know_ you'll go over on the total.

------
chiefalchemist
The why is typically:

1) The person(s) doing the estimate aren't qualified to do so.

2) There is a disconnect between wishful thinking and reality.

3) There is some arbitrary future milestone (i.e., "We need to ship by ____
because ____ is happening the following week.") that is independent of
software development.

4) Most importantly, when the deadline is missed the estimate / estimators are
not questioned, the software team is.

I've been at this a long time - too long? - and the narrative that IT is
__always__ at fault is a myth that needs to be buried.

------
Chris2048
I'm sorry to contribute to the dogpile effect (the long thread probably
already says what I'm about to say, but I didn't see it...), _but_..

Devs estimate known risks: the ideal path plus predictable delays. The further
reaches of the long tail are the unknown risks.

Known risks are estimated based on knowledge (hence, a question for a dev);
unknown risks are just an adjustment parameter on top of that estimate,
possibly based on historical evidence (there is no reason a dev could estimate
them any better).

It should be management's job to adjust a dev estimate. Let's be real here -
I've never heard a real-life example of management using stats for this kind
of thing, or being enthusiastic about devs doing the same.

Perhaps if management is taken seriously as a science, things will change, but
I doubt it.

<strong_opinion type="enterprise_software_methodology_cynicism">

Bizness is all about koolaid-methodology-guru management right now, very much
the bad old ways - a cutting example of workable analytical management would
be needed for things to change, but this is unlikely as all the stats people
are getting high pay cool ML jobs, and aren't likely to want to rub shoulders
with kool-aider middle managers for middle-management pay..

</strong_opinion>

------
jermaustin1
In my experience software takes longer to build than original estimates
because no one will get out of the way of the development team and let them
work.

This is an extreme example, but one I now live in daily.

My current full-time-ish gig is working on a pretty enterprisy system for law
enforcement. To date there hasn't been a single feature request or bug
fix that took more than 16 hours of development time. So I know that I can
typically finish something within a few hours to a day of receiving the task.
UNLESS my manager wants to discuss ad nauseam what he means when he says
"intersect an array". Or get stuck in 2 day long code reviews where my manager
makes me sit behind him while he goes over every single line of code that
changed, then gets side tracked and starts checking emails, chat messages,
text messages, calling other developers in to check on their statuses, and
even watching youtube... while I'm stuck in his office waiting on my code
review to be done so I can go back to my 5th day of trying to complete a task
that would have taken only a couple of uninterrupted hours. /rant

And this is why I pay $120 a week for therapy.

~~~
pysxul
Sorry but why would a manager even do a code review?

~~~
jermaustin1
He was originally hired as the sole developer 10 years ago, but the project
grew too big, and instead of hiring a manager to oversee the project and hire
more devs, they moved him into a management position, and put him in charge of
hiring new developers.

------
scandox
In the world of small-to-medium projects, the major issue is often that
software engineers give estimates for writing the software, but customers take
that to mean time to actual delivery in production, and a lot of the time they
have no idea how big a task deployment and integration are... or don't even
have a plan for that.

~~~
maxxxxx
In medical devices it usually takes five times as long to really finish the
project vs. finishing development.

------
revskill
To me, software projects take longer than i think because customers don't know
what they actually need until there's a runnable version of what they want.

------
ACow_Adonis
Honestly, after doing this whole data science thing for a while now, I'm going
to be blunt: I can estimate quite a lot of tasks with quite a lot of accuracy,
including software and IT tasks.

What I can't do is make bad management hear what they don't want to hear. Nor
can I stop people from accepting an estimate because its closer to what they
want to be true, because they've already made a promise that conflicts with
the actual estimate, or because a certain process requires or necessitates
inaccurate estimates.

I think the whole "software is hard to estimate" myth stems from 2 fundamental
causes:

\- not controlling for human biases or referencing actual real-world data

\- processes that don't punish/reward people who provide inaccurate/accurate
estimates, respectively.

------
andybak
I once had a really convoluted metaphor for estimation which involved opening
boxes that sometimes contained other boxes which sometimes contained other
boxes... I wonder how that models mathematically.

~~~
andy_ppp
The problem with this analogy is that the boxes do not obey the laws of
physics and fit inside each other...

~~~
andybak
Did I mention that they were magic boxes?

------
nocturnial
Wouldn't it be more sensible to give a range instead of a fixed date? I know
it's not going to happen, but I think it would be more informative and honest.

That way you could communicate better how certain you are. There's a
difference between saying something will be completed in, for example, 6
months ± 2 weeks and saying it will be finished in 6 months ± 1.5 months. The
central estimate is the same, but the range communicates the level of
certainty more clearly.

Or just give a range, for example 4-6 months, if you don't want to use the ±
notation.

~~~
DougWebb
In my experience, developers understand and appreciate range-based estimates.
But when those numbers start moving up the communication chain, some non-
developer is going to either not like or not understand the point of the
range, and will convert it to a single number: either the first one, the last
one, or the average. They might even be honest about it, thinking "I need to
know the earliest possible date, so I'll use the low end". But then the next
person, who doesn't see the range, thinks that low end is THE committed date
and will plan accordingly. Now your deadline is 99% likely to be missed.

------
mekane8
I used to work for a software company that did projects for clients and
charged by the hour (a consultancy), and we frequently had to estimate
projects with incomplete information in order to even get the contract in the
first place, so those estimates often turned into the actual final budget.
Since I was the head of engineering I ended up doing a lot of sales and
estimation for new projects. Besides just doing it a lot and gaining
experience from many projects, there were a few other things that really
helped me:

1) Software Estimation: Demystifying the Black Art by Steve McConnell (already
mentioned by others).

2) His short video on "Targets vs. Estimates" was super helpful - we watched
it with engineering, sales, and project management all together and had a good
discussion afterwards.
[https://www.youtube.com/watch?v=FY9X21HA02w](https://www.youtube.com/watch?v=FY9X21HA02w)

3) Keeping a running list of "Things to Remember". Every time a project went
astray or we encountered something during a project that I had failed to
estimate (but potentially could have), it went on the list. That was useful to
share with others too, when they did estimates.

I really like the discussion of standing up to those who ask for estimates. A
clear understanding of what an estimate is and what it's for is important, as
is a strong sense of professionalism. I would recommend "The Clean Coder" by
Robert Martin for this. It's more about professional behavior than software
practices. Especially his chapters on "Saying No" and "Saying Yes". I read it
and discussed it with my team often. It helped us realize when we had to
refuse to give an estimate because we lacked the necessary information to do
so, rather than just guess or make something up.

------
settsu
I’ve been coding professionally for 20 years and I’m largely no better at
estimating completion than day 1, maybe worse since I’m more likely to feign
confidence in a figure I’ve essentially pulled out of my ass.

At some point, I came to the realization that my fundamentally poor concept of
time was always going to be an insurmountable obstacle to my career
advancement.

------
JabavuAdams
My empirically-confirmed heuristic is that the time to deliver a feature set
that someone would actually want to use is 2.5x-3x of the time I think of when
asked for an off-the-cuff estimate.

Basically, multiply the initial estimate by a number between e and pi -- no
joke! It's a bit of a problem given that PMs think they're being generous with
a 20% pad.

~~~
rainhacker
I can relate to your heuristic. The current project I've been working on is
close to completion and somewhere between 2.5-3x of its initial estimate.

------
maltalex
My theory on why software estimates suck is tied to Rosenberg’s Law:

> Software is easy to make, except when you want it to do something new. The
> corollary is, The only software that’s worth making is software that does
> something new.

It's hard to estimate the difficulty of something you've never done. And the
nature of software development is to always do new things.

------
iraldir
I believe this is the reason why Scrum uses story points instead of time
estimates. By putting uncertainty on the same level as effort, you give it
more weight. And using a Fibonacci sequence rather than a continuous scale,
with the rule that you should round up if unsure, tends to correct those
defects.

~~~
jfoutz
There is an uncharitable response to this thread, which is marked dead. I do
think there is a kernel of truth in that response, even though it is nothing
more than mocking.

Why Fibonacci? I think it's reasonable to say time estimates include an
exponential error. An estimated 1-hour task is very different from an
estimated 1-week task. I see that 1-hour, 1-day and 1-week estimates are
progressively, and likely exponentially, worse.

Is it just the ease of doing the math? (Totally reasonable answer, in my
humble opinion.) Or is there something specific about Fibonacci that's
actually relevant? I think it's the former, not the latter. But if you have
any evidence to the contrary, I'd love to hear it.

~~~
js8
The kernel of truth in that response is that you won't get more certainty
about an estimate by adding more uncertainty, no matter how much pseudo-
scientific cargo-culting mumbo-jumbo you cloak it in. (Unfortunately, HN
readers don't really like sarcastic commentary which requires the reader to
think it through. Time is precious, and we just want to be coddled: give me
your opinion ELI8, straight, because there is no time to think!)

~~~
jfoutz
I think you're half right. Adding an error band to a far-out estimate won't
give you a better estimate, but it might help tell you how much you _don't
know_ about what you're facing.

Cumulative error is _hard_. See weather prediction.

------
robbick
One takeaway is for sprint planning - if there is a reasonable amount of
uncertainty on a task, take it out, either break it down or do some
investigation, and bring it back next time. You don't want a σ=2 task messing
up your sprint!

------
achenatx
The assumption in the article is that the amount of work is constant. Instead
what happens is that a manager has decided how much time it should take
(always too little). If you accept that time, then you will go over.

If you push for more time and get it, then the manager will eventually add
more scope and you will go over.

The vast majority of the time we see projects go over deadlines because scope
was added under the guise of clarifying scope.

Starting with a business value actually works pretty well to enable people to
control scope.

We also use actual velocity on tasks to re-forecast daily. That has been the
most effective way to get a good date. Over a large number of tasks this works
very well.

------
zcanann
I always think of project tasks as flow charts, where every item either takes
1 day, or 1 week. There's no way of really knowing in advance. Complications
happen.

It makes it really hard to calculate the "expected value" of 5-10 tasks.

~~~
dTal
The more tasks you have to do, the _more_ certain about duration you should be
- some of the uncertainty will cancel out and you will get a gaussian
distribution. For your example, I expect 10 tasks of between 1 day and 1 week
each (with flat probability) to take about 6 weeks in total, with a 95% chance
of completion within 7 weeks.
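
A quick sanity check of those numbers (assuming a 7-day "week" and a flat
distribution between 1 and 7 days): by the central limit theorem the total is
roughly normal, and the mean and one-sided 95% bound land close to the 6 and 7
weeks quoted above.

    import math

    # 10 tasks, each uniform between 1 and 7 days.
    n, a, b = 10, 1, 7
    mean = n * (a + b) / 2                 # 40 days, about 6 (7-day) weeks
    sd = math.sqrt(n * (b - a) ** 2 / 12)  # about 5.5 days
    p95 = mean + 1.645 * sd                # about 49 days, i.e. 7 weeks
    print(mean, round(sd, 1), round(p95, 1))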

------
arendtio
Well, I think this is a complex topic. Not because of the math, but because
the key to an accurate estimate is to understand who has done the estimate and
on what basis.

As stated in the summary, the core driver for inaccurate estimates is
_uncertainty_ :

> Tasks with the most uncertainty (rather the biggest size) can often dominate
> the mean time it takes to complete all tasks.

There are different sources of certainty:

\- Experience: If someone has done a task 20 times, he probably knows how much
time he will require. Someone who hasn't done the task yet probably
underestimates the time he requires (e.g. because of the median vs. mean
conflict). But don't be fooled: having 20 years of work experience in the
field but never actually having done a specific task doesn't mean you can
estimate it better than someone who just started the job but has done that
specific task 10 times. However, most of the time, projects are doing
something new. So you have to find out which tasks have been done before by a
something new. So you have to find out which tasks have been done before by a
project member and which are completely new ground. If something is completely
new, remember to plan time for getting familiar with a problem space plus a
handful of complications (together this will be more than the actual task
would take someone who is trained for that specific task).

\- Detail: The smaller the tasks the larger the overall estimate... or so.
Planning on a top-level is rarely going to be accurate. We do it a lot because
it doesn't take much time. But _if_ you want an accurate estimate, you have to
plan on small, specific tasks.

\- Risk management: Every project has risks. Some don't really have an impact
and others blow up the whole project. Know your risks and what you are going
to do if something should go in the wrong direction. It's not that you
wouldn't have time to figure out what to do when the problem occurs; the point
is to understand how it would impact your timing and to take preventive
actions (e.g. include stakeholders).

If you have people who have done the exact same task a few times, made a
detailed plan of every step, and know how to handle the most likely or most
impactful risks, you are in a good position to deliver on time. Most of the
time you won't have that luxury and will have to compensate for the resulting
uncertainty with a prolonged time to project completion, but that should be
just fine as long as it is communicated at the beginning.

With all that said, remember, that some projects don't require an accurate
estimate. _Sometimes_ it is enough to deliver just as soon as possible.

~~~
helloindia
On the experience part, this has happened to me a few times: the project
manager asks me for an estimate, I give it from my perspective and experience,
and then he gives the work to someone else with no experience, and it
obviously takes longer than estimated.

Now, I always give two estimates: an estimate if the work is done by me, and
an estimate if the work is done by someone else.

------
mlthoughts2018
I’d also add that _the reason why the mean will be larger than the median_ is
usually hard to discern for any specific situation.

This foments bikeshedding debates that are a mix of business pressure to favor
uninformed estimates and disagreements about which possible sources of
surprises or hitting a wall are most likely.

As soon as you mix this with a formal estimation system like Agile (yes, even
the platonic ideal ‘agile’ too), it creates a snowball effect of time wasting
because overhead is so frequently required to resolve debates and placate
business pressures for estimates.

------
danpalmer
Something I’ve been trying is estimating the time in which I’m 80% sure I can
finish something. I end up racing ahead of estimates most of the time because
they are too high, but sometimes I’ll find something that is a bit more
complicated than expected and it takes a little longer. Overall this seems to
balance out, but it also gives a lot more predictability: it’s easier to
predict at any given time what I might be working on. This has been pretty
important on my team, where I’ve been doing API work for some iOS developers
to use.

------
edejong
Yes, this is a thesis I've brought up quite a few times in relation to sprint
planning. The odd misjudgment has a strong influence on the total estimate and
biases it upwards, not downwards.

------
dre85
I'm blown away on a regular basis by how long it takes to write software that
in my opinion is super easy.

Usually it just comes down to fuzzy/missing/changing requirements. A lot of
times people know that they want something, but they don't know exactly what.
Or they're absolutely sure they need feature x, but then later they realize
they don't, but they missed out on developing other more fundamental features.

~~~
adrianmonk
I have developed a belief about this: people don't know what they want until
you show them what they said they want. Then it's immediately obvious to them
what they wanted instead.

This suggests that demos and mock-ups might be a valuable tool. The sooner you
can get someone to try something, the sooner they can tell you what direction
they really wanted you to go in instead, and the less time you waste.

~~~
scruple
They want everything but they can't even define a starting point. It's
insanity. I've been going back and forth with a customer about this (through
my product and sales team) since August of last year. At first the priority
was high and it seemed like the scope was reasonable. But no one was making
moves or delivering on my requests / asks / concerns. I suggested a phased
approach, so that we could _be_ agile and get _something_ in front of the
customer for feedback. But before I even got there the requirements changed
again and again and again and again and again...

Thankfully all of this specific work is easy to segregate away from the rest
of my team. It's proven to be very toxic and morale draining. If my entire
team was involved, I've no doubt that we'd have lost people by now.

And the real kicker: It turns out that right now we can't even deliver on the
first pass that I had originally suggested, despite me being basically done
with my team's piece, because some other team in the company was loose with
their language and convinced a bunch of product and sales folks that, yes, of
course they had what the customer was asking for. They didn't. They still
don't. I'm convinced today that they never will. Bunch of fucking sycophants.

I have no idea how these things happen but recently I've become convinced that
this sort of confusion-on-all-fronts is just par for the course in this
industry today. The only work that I've been able to do in the past couple of
years that was well understood and easy to articulate, and as a result capable
of being completed mostly on time and within scope, was born out of a select
few individuals being able to identify a real problem and a real solution and
grinding away at it in a controlled fashion. But that seems to be rare and not
at all "how things are done."

------
bigred100
What I want to know is why the software developer is expected to give
estimates. Estimating how long common tasks take seems like something I would
expect a software manager to be able to do effectively and care about. If
you’re just a monkey who is asking everyone to do the planning and estimation
job for you, I’m frankly unsure why the company allows you to collect a
paycheck.

------
boffinism
I've always thought that typically overoptimistic estimates tend to be more
based on the mode. I.e. 'this is the sort of task that normally takes 1 day,
so I'll estimate 1 day'. The high point on the probability curve is the most
noticeable, but it's also way further to the left than either the mean or the
median.

I have no data to back that up though.

------
Michielvv
I think the most important reason projects take longer is not that the time it
takes to complete a task is uncertain, but that at the start it will always be
unknown which tasks will prove critical. The further you get, the more tasks
will reveal themselves that were not part of the original scope, but are
critical nonetheless.

------
twothamendment
They don't take longer than I think. The problem is nobody listens! True story
- They handed me an RFP, asked for an estimate. I gave it and they cut it in
half and got the job. I didn't take any pleasure in being right, but I was
right. It was an expensive mistake.

------
diiq
Vistimo.com, the tool I built and use when I run estimates for my clients,
uses log-normals and monte-carlo simulation to conquer this exact problem.

Really exciting to see other people beginning to recognize and use the same
statistical tools -- they've served me really well.

------
jrochkind1
1\. Do the tasks with, as far as you can tell, the most uncertainty as soon as
possible in the process.

2\. This is why "agile" tries to avoid estimating more than a few weeks. The
more tasks you have in there, the more you estimate is likely to be really
off.

------
temp269601
How about making the manager estimate the project, that way if the deadline is
not met, the manager receives the blame? It's the manager's job to manage
resources, and if the deadline is not hit, then they can hire/bring on more
resources. If an engineer works as hard as they can for 40 hours a week, why
is it the engineer's fault if the arbitrary deadline is not met? If the
engineer estimates the time for a project, the engineer will always have to work
more than 40 hours a week because some estimates will be too optimistic.

~~~
ben509
The manager does receive the blame, and then stuff rolls downhill.

------
SiempreViernes
_Fits symmetric function to clearly asymmetric distribution_

Author: Decent fit, in my opinion!

This bad fit makes me genuinely sad (；∩；)

~~~
ben509
He could probably tweak a skew normal distribution to make it fit nicely, but
it's _pretty_ close to normal.

------
StreamBright
It would be great to have predictions that use ML to estimate how long
something will take.

~~~
ska
If you are thinking this as opposed to statistical modelling, what is the
benefit you imagine?

------
gilbetron
It takes longer than we think because writing software is solving a math
problem, and you can't know how long solving a math problem will take until
after you solve it.
[http://www.warhound.org/kcsest.pdf](http://www.warhound.org/kcsest.pdf)

------
unityByFreedom
Eh don't worry, self driving cars are coming this year, Elon said so.

------
pikzel
I've been in a company where all estimates were multiplied by pi.

------
juskrey
Because they can't take negative time.

------
usgroup
TLDR anyone?

~~~
markwkw
Thesis: Developers are good at estimating median time to finish tasks. But the
tasks that take longer, in fact take much, much longer than estimated.

E.g. Dev estimates that time to complete each of A, B, C tasks will be 2 days.
In reality, A will take 1 day, B will take 2 days, but C will take 8 days.

Dev was right about the median time to complete each task (2 days) but the
average was much higher. The article goes into how to statistically model the
distribution of actual time to complete tasks.
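
In the article's lognormal model (the numbers here are just illustrative),
that gap between median and mean falls straight out of the distribution:

    import math

    # If the dev's estimate nails the median and the actual/estimated "blowup factor"
    # is lognormal with spread sigma, the mean exceeds the median by exp(sigma^2 / 2).
    estimate_median = 2.0   # days
    sigma = 1.0
    mean = estimate_median * math.exp(sigma ** 2 / 2)
    print(f"median {estimate_median} days, mean {mean:.1f} days")  # about 3.3 days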

~~~
afarrell
That was a really concise but faithful summary. Welcome to HN.

