
What We Do and Don't Know about Software Development Effort Estimation - ExpiredLink
http://www.infoq.com/articles/software-development-effort-estimation
======
x0x0
Yet another place where intuitions derived from the normal distribution about
the behavior of distributions screw people.

An explanation I like, from michaelochurch:

    
    
          Let's say that you have 20 tasks. Each involves rolling a 10-sided die. 
       If it's a 1 through 8, wait that number of minutes. If it's a 9, wait 15 
       minutes. If it's a 10, wait an hour.
          How long is this string of tasks going to take? Summing the median time 
       expectancy, we get a sum of 110 minutes, because the median time for a task is 
       5.5 minutes. The actual expected time to completion is 222 minutes, with 5+ 
       hours not being unreasonable if one rolls a lot of 9's and 10's.
          This is an obvious example where summing the median expected time for the 
       tasks is ridiculous, but it's exactly what people do when they compute time 
       estimates, even though the reality on the field is that the time-cost 
       distribution has a lot more weight on the right. (That is, it's more common 
       for a "6-month" project to take 8 months than 4. In statistics-wonk terms, 
       the distribution is "log-normal".)
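
The arithmetic in the quote can be checked directly. The following is a minimal sketch, not part of the original comment: the names `FACE_MINUTES`, `NUM_TASKS`, and `simulate` are made up for illustration, and the number of simulation runs and the seed are arbitrary assumptions.

```python
import random
from statistics import mean, median

# Minutes cost by each face of the 10-sided die, as described in the quote:
# faces 1-8 cost that many minutes, a 9 costs 15 minutes, a 10 costs 60.
FACE_MINUTES = [1, 2, 3, 4, 5, 6, 7, 8, 15, 60]
NUM_TASKS = 20

# "Sum of medians" estimate: 20 * 5.5 = 110 minutes.
median_sum = NUM_TASKS * median(FACE_MINUTES)

# True expected total: 20 * (111 / 10) = 222 minutes.
expected_total = NUM_TASKS * sum(FACE_MINUTES) / len(FACE_MINUTES)


def simulate(runs=20000, seed=0):
    """Simulate `runs` strings of NUM_TASKS die rolls; return the sorted totals."""
    rng = random.Random(seed)  # seeded only so the sketch is repeatable
    return sorted(
        sum(rng.choice(FACE_MINUTES) for _ in range(NUM_TASKS))
        for _ in range(runs)
    )


totals = simulate()
# Fraction of simulated runs that take longer than the sum-of-medians estimate.
share_over = sum(t > median_sum for t in totals) / len(totals)

print(f"sum of medians:   {median_sum:.0f} min")
print(f"expected total:   {expected_total:.0f} min")
print(f"simulated mean:   {mean(totals):.1f} min")
print(f"runs over median: {share_over:.0%}")
```

The point of the simulation is the last line: the sum of medians is not a "typical" outcome at all, because a single roll of 10 is enough to push the total well past it.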

~~~
Cacti
Exponential feedback is a bitch!

When I put together plans and estimates, I always take a lot of care to
separate out those things which are linear and those things which
exponentially impact other things within the schedule, along with the sort of
inflection points. I may not know where I'm going to roll a 9 or 10, as they
may crop up anywhere, but there are certainly areas where they are more
possible and less possible.

In a sane world, at least. Can't do much with a black swan.

~~~
hcarvalhoalves
In a startup I've worked at, we did estimates based on hours, but also had the
"no idea" tasks. We would then make sure to alternate sprints to work on well
estimated tasks (as you say, "linear" tasks) and others to work on the hard
ones, without a set deadline; or split the team effort in that manner.

I think that's a better strategy than making wild guesses and ultimately
falling behind schedule, and it still maintains cadence, which buys you the
power to sometimes say "hard things are hard, I don't know when it'll be
ready to ship".

------
lethain
I think many, and perhaps most, poor estimates are caused by initial estimates
being viewed as too high for the project. Rather than deciding the project
isn't worth doing at its estimated cost, people decide the estimates must
be wrong in order to align expected project value with expected project cost.

Perhaps in a twisted future where we estimate project cost before deciding
which projects to take on, we might discover our estimates are much better.

A related pathology is trading technical debt for speed, every time, on every
project. The debt will be paid.

~~~
markcmyers
Well, exactly. If I understand the paper, the only condition you need in order
to arrive at accurate estimates is the absence of pressure to underestimate.

------
emjimenez
Software effort estimation methods fail because they ignore the margin of
error. In mathematics, engineering and statistics, a result does not mean
anything if it does not include the margin of error. One month may be one
month if the margin of error is one day, and one month may be one year if the
margin of error is one year. Classic estimation techniques like COCOMO or
Albrecht Function Points ignore this fact. They have no mathematical rigor. If
presented with mathematical rigor they would be absurd, because their margin
of error is between 100% and 600%. Classic software effort estimation
techniques are harmful and dangerous because, by ignoring the margin of error,
they invite decisions that ignore existing risks. No automatic method
can replace human experience and wisdom. Human judgment has no bounded margin
of error either, but at least it does not pretend to hide existing risks.

------
tunesmith
Many customers/clients don't even _want_ accurate estimates. Given the choice
between an accurate estimate of $x, and a competing estimate of $0.75x with
later surprises and deadline stress and renegotiations to pay another $0.35x
for "phase 2" which gets the product up to what they originally wanted,
especially when the business relationship has "bonded" in a way where it's all
rah-rah, go-team, we're-in-this-together... clients will go for the latter
path way more often than they should.

Part of the reason estimates are inaccurate is that there's a business
disincentive to be accurate.

------
crasshopper

       Hofstadter's Law: It always takes longer than you expect,
       even when you take into account Hofstadter's Law.
    

The best advice I ever got on project-time estimation (from a biology postdoc)
was: make your best, most honest effort, and then double it.

When I make projections with a spreadsheet, I have a cell that copies my grand
total of all costs and call that copy "unforeseen costs". I always hate
bidding that high at the start, but the estimate ends up being close to right
surprisingly often.

This article says 30% overruns are common, which is within my former boss'
+100% bounds.

The other nice thing about doubling your cost estimate is it prevents you from
catching the winner's curse and landing an overly-stingy client. Plus if you
really _can_ keep costs within your spec for the project, then you win extra
profits. You'll never win that "game" if you don't leave room for error.

~~~
ams6110
The advice I got was double the number, and increase the time unit. E.g. you
think the project will take 2 weeks, estimate 4 months.
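
As a tongue-in-cheek illustration of that rule, here is a minimal sketch. The `UNITS` ladder and the function name `pessimistic_estimate` are assumptions invented for this example; the rule itself says nothing about what happens past the largest unit, so the sketch simply stays at "years".

```python
# Ordered ladder of time units; this particular ladder is an assumption.
UNITS = ["hours", "days", "weeks", "months", "years"]


def pessimistic_estimate(amount, unit):
    """Double the number and move up to the next larger time unit."""
    idx = UNITS.index(unit)
    # Stay at the largest unit if there is nothing bigger to move to.
    next_unit = UNITS[min(idx + 1, len(UNITS) - 1)]
    return amount * 2, next_unit


print(pessimistic_estimate(2, "weeks"))  # (4, 'months')
```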

~~~
crasshopper
I've heard that as well. I dunno, doubling seems to have worked so far...

------
Joeri
I've been giving estimates for a decade, and still feel like I'm winging it
every time. Planning poker definitely works, provided you understand the
requirements, and by 'understand' I don't mean you have read a spec document,
but that you understand the customer's business problem and have figured out
how the proposed solution aims to solve it. Sadly most large projects don't
have the time in their pre-sales estimation phase for the team that produces
the estimate to build an understanding of the whole problem domain.
Paradoxically this low confidence in the estimate will tempt the sales team to
cut it even further, since they interpret uncertainty as a liberty to seek the
low bound (or even lower).

As others have pointed out, in those large projects it's better not to make
up-front estimates and just build as much value as possible for a fixed cost,
using agile principles. However, that's typically not how large software
projects are sold (or bought). Fixed price almost always means fixed scope.
I'd like to know of any large software project sold to a customer in truly
agile fashion (no fixed scope determined in advance). To me it sounds like a
software development unicorn: you hear about it, but you're never the one
building it.

------
ww520
Often a technically realistic schedule is derived and presented, but the
business side deems the project cost too high and asks for the schedule to be
"optimized." The optimistic scenarios of the schedule are adopted and revised.
Of course reality sets in when the project goes forward and it ends up taking
as much time as originally predicted.

~~~
andruby
That's often what I see. An estimated roadmap is presented, management
expresses that it wants it sooner, the roadmap is "shuffled" and "optimized",
it is approved, yet reality still sets in during development :-)

------
amenod
IMHO it is quite easy to make a reliable estimation of a well-planned project.
However it is _extremely_ difficult to plan the project more than one step
ahead of what is already done... This is why agile development is so popular.

In general, when a project has been underestimated you can deliver it:

1) on time and within planned resources,

2) with all the planned functionalities and

3) without sacrificing quality.

Pick any two.

~~~
jayvanguard
> IMHO it is quite easy to make a reliable estimation of well-planned project.

Evidence suggests otherwise. Sure you can estimate +/- 100-200% early on but
that isn't what anyone is aiming for in a software project. Even detailed
plans of repeatable (non-trivial) software projects do not result in error
bars that anyone really desires.

~~~
HeyLaughingBoy
I don't know that I'd say it's easy, but it is certainly possible. The big
takeaway from the article that I agree with is that historical data
significantly improves estimates. If you know that e.g., the last 5 projects
took an average of x weeks on the authentication layer, then it's likely that
your project will take somewhere around the same time.

The problem is that most companies don't record this data. Start today!
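
A minimal sketch of what using such recorded data could look like. The durations below are made-up placeholder numbers, not data from the article or from any real project, and the function name `estimate_from_history` is hypothetical. Reporting the mean plus or minus one sample standard deviation is just one simple way to attach a range to the average.

```python
from statistics import mean, stdev


def estimate_from_history(past_weeks):
    """Return (average, low, high) from past durations in weeks.

    low/high are the average minus/plus one sample standard deviation,
    with the lower bound clamped at zero.
    """
    avg = mean(past_weeks)
    spread = stdev(past_weeks)
    return avg, max(0.0, avg - spread), avg + spread


# Hypothetical durations (in weeks) of the authentication layer
# on the last five projects.
auth_layer_weeks = [3.0, 4.5, 2.5, 5.0, 4.0]

avg, low, high = estimate_from_history(auth_layer_weeks)
print(f"estimate: {avg:.1f} weeks (roughly {low:.1f} to {high:.1f})")
```

With only five data points the range is crude, but it is still a range grounded in what actually happened rather than a single guessed number.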

------
chvid
The article says:

"A tendency toward underestimation of effort is particularly present in price-
competitive situations, such as bidding rounds. In less price-competitive
contexts, such as inhouse software development, there are no such tendencies -
in fact, you might even see the opposite. This suggests that a main reason for
effort overruns is that clients tend to focus on low price when selecting
software providers - that is, the project proposals that underestimate effort
are more likely to be started. "

Is that really correct? Are there studies that show that inhouse projects (or
not fixed-price projects) do not underestimate systematically as opposed to
fixed-price client projects?

~~~
narag
There are very different inhouse environments. Some are obsessed with metrics
and "improving" results.

------
kickingvegas
An older engineer once told me a valuable tl;dr approach to engineering
estimates: whatever number you arrive at, multiply it by two.

~~~
zwischenzug
"Six to eight weeks" was the default estimate my project managers gave for
anything above trivial. Long enough to make the task seem difficult, not so
long as to scare off the client.

~~~
twic
Do you work at Stack Exchange?

[http://meta.stackexchange.com/questions/19478/the-many-
memes...](http://meta.stackexchange.com/questions/19478/the-many-memes-of-
meta#19514)

~~~
zwischenzug
Ha! This was the norm at my corp over 10 years ago. Heh.

------
snarfy
We have so much technical debt I can no longer estimate with any accuracy.
Something that should take half a day takes a week. Most of the week is
cleaning up all of the crap in the code, and then spend a couple hours writing
the few lines of code that solve the business requirement.

~~~
collyw
Sounds exactly like how my management wants things to be in a couple of years.

------
Flenser
Does anyone know where I can read the referenced studies? I'd be interested if
they looked at Evidence Based Scheduling / Monte Carlo simulation:
[http://www.joelonsoftware.com/items/2007/10/26.html](http://www.joelonsoftware.com/items/2007/10/26.html)

Edit:

One of the authors of the references has several articles available here:

[https://www.simula.no/people/magnej/bibliography](https://www.simula.no/people/magnej/bibliography)

but not the one referenced, although there are several newer articles.
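
For readers unfamiliar with the Evidence Based Scheduling idea mentioned above, here is a rough sketch of the Monte Carlo step as I understand it from the linked post: each past task gives a velocity (estimated time divided by actual time), and each simulated future divides every remaining estimate by a velocity drawn at random from that history. All numbers and names below (`past_velocities`, `estimates_hours`, `simulate_completion`, `percentile`) are hypothetical, and this is a simplification, not the implementation from any real tool.

```python
import random


def simulate_completion(estimates, past_velocities, runs=1000, seed=0):
    """Return sorted simulated totals for the remaining work.

    Each run divides every task estimate by a velocity
    (estimate / actual) sampled at random from the history.
    """
    rng = random.Random(seed)  # seeded only so the sketch is repeatable
    totals = [
        sum(est / rng.choice(past_velocities) for est in estimates)
        for _ in range(runs)
    ]
    totals.sort()
    return totals


def percentile(sorted_totals, p):
    """Value below which roughly p percent of the simulated totals fall."""
    idx = min(len(sorted_totals) - 1, int(p / 100 * len(sorted_totals)))
    return sorted_totals[idx]


# Hypothetical history: a velocity below 1.0 means the task overran.
past_velocities = [1.0, 0.8, 0.5, 1.2, 0.6, 0.9]
# Hypothetical remaining task estimates, in hours.
estimates_hours = [4, 8, 2, 16, 6]

totals = simulate_completion(estimates_hours, past_velocities)
for p in (50, 75, 95):
    print(f"{p}% of simulated runs finish within {percentile(totals, p):.1f} hours")
```

The output is a distribution rather than a single date, which is the part that addresses the "margin of error" complaints elsewhere in this thread.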

~~~
Flenser
Found a PDF for this one thanks to Google Scholar

4\. T. Menzies and M. Shepperd, “Special Issue on Repeatable Results in
Software Engineering Prediction,” Empirical Software Eng., vol. 17, no. 1,
2012, pp. 1–17

[http://menzies.us/pdf/12stability.pdf](http://menzies.us/pdf/12stability.pdf)

EDIT:

Several of the author's articles are available here (click the PDF links):

[https://www.simula.no/people/magnej/bibliography?b_size:int=...](https://www.simula.no/people/magnej/bibliography?b_size:int=9999999&b_start:int=0)

From looking at Google Scholar it looks like there are many newer articles on
Software Estimation that the OP does not reference so may not have read.

Edit 2:

This paper talks about Monte Carlo Simulation:

[https://www.simula.no/research/se/publications/Jorgensen.200...](https://www.simula.no/research/se/publications/Jorgensen.2005.5/simula_pdf_file)

------
mlinksva
That we don't know whether software development is subject to economy or
diseconomy of scale stuck out to me.

The cost of estimation itself (of doing the estimation, not the consequences
of the estimate) is not mentioned. Is estimation itself significantly costly
relative to the thing being estimated?

To the extent open source works relatively well as a development practice, how
much of a role does suppression of estimation play (assuming there is
suppression; harder to even pretend to hold anyone to an estimate without a
contract, so why bother)?

------
cashoil
The problem with estimates is that once there is an estimate, the team really
can stick to it, regardless of quality.

It is feasible for the team to claim that it met the estimate, and it is
feasible to have all indicators green on the day the deadline is met. Simply
do less design, less refactoring, less thinking, fewer tests, less
collaborative work, less engineering...

------
blutoot
Can Machine Learning (or NLP) ever help in estimating effort based on expected
lines of code where the model would be trained upon _similar_
applications/files that already exist? If so, is anyone researching this in
academia or in any research lab?

------
ericHosick
There is mention of tools like group estimates which improve the estimation
effort (according to the article).

There isn't much mention of the estimates you can go for:

1) Accurate but not reliable

2) Reliable but not accurate

------
kaonashi
> An implication of these observations is that clients can avoid effort
> overrun by being less price - and more competence - focused when selecting
> providers.

I absolutely loved this line.

------
alien3d
Hard to estimate. Clients think it is easy. Most think our work is like a TV show.

------
dror
I never understood why software estimates are so bad when other fields do so
well.

When a contractor gives you an estimate of how long and how much it's going to
take to do a remodel, he's invariably on time and on budget, right?

And when Boeing spends billions of dollars on a new plane, they have it ready
on time and on budget, right?

So why can't software people do the same?

Oh wait, complex, badly defined projects tend to run late and over budget.
It's not that complicated. Spend 3-6 months defining all the details of your
new web application, promise not to change anything on the fly, don't ask us
to make it work on IE 7, and by the time we do 3-4 of these, we'll be able to
give you a good estimate.

~~~
zxcvbnmkj
> And when Boeing spends billions of dollars on a new plane, they have on
> ready and on budget, right?

Well, no, they don't.

[http://edition.cnn.com/2011/TRAVEL/08/07/boeing.dreamliner/](http://edition.cnn.com/2011/TRAVEL/08/07/boeing.dreamliner/)

