Why is estimating so hard?

mwilliamson · on April 23, 2012

I remember reading a comment on another story which I thought was a wonderful explanation of why estimates tend to be too short:

http://news.ycombinator.com/item?id=3522910

To quote one part:

Let's say that you have 20 tasks. Each involves rolling a 10-sided die. If it's a 1 through 8, wait that number of minutes. If it's a 9, wait 15 minutes. If it's a 10, wait an hour.

How long is this string of tasks going to take? Summing the median time expectancy, we get a sum of 110 minutes, because the median time for a task is 5.5 minutes. The actual expected time to completion is 222 minutes, with 5+ hours not being unreasonable if one rolls a lot of 9's and 10's.
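The arithmetic in the quote can be checked directly. A minimal sketch in Python, assuming every face of the die is equally likely; the names `WAIT_MINUTES` and `TASKS` are illustrative and not part of the quoted comment:

```python
from statistics import median

# Minutes waited for each face of the 10-sided die, as described in the quote:
# faces 1-8 wait that many minutes, a 9 waits 15, a 10 waits 60.
WAIT_MINUTES = [1, 2, 3, 4, 5, 6, 7, 8, 15, 60]
TASKS = 20

# Estimate built from the "typical" (median) task.
median_based_total = TASKS * median(WAIT_MINUTES)

# Expected total: number of tasks times the mean wait per task.
expected_total = TASKS * sum(WAIT_MINUTES) / len(WAIT_MINUTES)

print(median_based_total, expected_total)
```

The two rare, expensive faces barely move the median but account for most of the mean, which is why the median-based sum comes out at half the expected total.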

tibbon · on April 23, 2012

There are two issues at hand:

1) People are pretty terrible at understanding things exponentially (or overall, just non-linearly). Bumps in the road for programming I've seen can set a time goal back by 10x. Thought it would take a week? Suddenly it's more than two months.

2) As highlighted here, we're often bad at translating all of our assumptions and intuition into code. It's just a website with a login, with a, etc.... Yet, there are a lot of details there that you're forgetting about.

Experience is the best thing here you can hope for, but even that can fail. Unless you're doing exactly what you've done before, you're going to hit bumps.

Also, no client wants to hear an overly long and safe estimate. You might even think it will take 6 months, but the client wants to hear 2. Everyone else is telling them 2. You look incompetent if you say 6, so you say 2... maybe 3.

paulsutter · on April 23, 2012

Evolution.

When we lived in caves, there were men who were perfect estimators and there were men who were hopeless optimists.

When the optimists said "I'm going to get a deer!", the perfect estimators would explain with great precision and accuracy how difficult it was. The optimists ignored them, and the perfect estimators just sat around the cave. The optimists did finally catch the deer, even though it was probably more work than it was worth.

That's why most of our ancestors were hopeless optimists. The perfect estimators just didn't reproduce. So how can we help but be hopeless optimists ourselves when it comes to generating schedules?

The solution: multiply your schedule estimate by pi to get a round number.

Wilduck · on April 23, 2012

No implementations yet? This took me about 12 minutes and 10 lines of python code:

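  def break_13(text):
      # Greedy wrap: keep adding words to the current line while it fits in 13 columns.
      words = text.split()
      lines = [[]]
      for word in words:
          # Characters already on the current line, not counting spaces.
          used = sum(len(w) for w in lines[-1])
          # len(lines[-1]) is the number of spaces needed once word is appended.
          if (used + len(word) + len(lines[-1])) > 13:
              lines.append([word])
          else:
              lines[-1].append(word)
      return '\n'.join(' '.join(line) for line in lines)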

I like the larger point of this article, which basically states that the way humans do tasks is different than how we should instruct computers to do tasks. However, to me this point is undermined by the description of writing a procedure near the end:

> If you are good at abstracting, you’ll likely come up with three different scenarios for breaking a line. 1. you break it at the 10th character of a word if that word is longer than 10 characters. 2. You break it at the 11th character if that character is a space. 3. You look backwards from the 10th character looking for a space and if there is one, you break it there.

I think it is less valuable to break down your procedure into smaller pieces before giving an estimate, and more valuable to have a good knowledge of what sorts of problems are hard and why they're hard. That way, you don't have to resort to glib phrases like "Write down a procedure to tie your shoes".

thangalin · on April 24, 2012

This solution does not meet the maximum 8" length requirement. The code produced the following PDF:

http://www.mediafire.com/view/?47t49w2h4pxa6d1

Even if you shrank the font, decreased the margins, and removed the 1" marker, the text would still exceed the maximum length of 8 inches.

Web page for printing: http://pastebin.com/DLR5cGss

gvb · on April 23, 2012

I did it in seconds, but I cheated.

fmt --width=13 <gettysburg.txt

thangalin · on April 23, 2012

Your solution is quick, but it will not work.

lynx --dump http://morphadorner.northwestern.edu/morphadorner/techtalk/s... | fmt --width=13

This fails due to sub-optimal formatting:

    But, in
    a larger
    sense,

Remember the physical constraint of 1.5" x 8". The fmt output exceeds the bookmark's physical length. Those three lines can be written as two:

    But, in a
    larger sense,

fmt produces a much lengthier script than would fit on the physical bookmark.

gvb · on April 23, 2012

You're right, I had a bug in my code.

   fmt --width=14 <gettysburg.txt

The copy of the Gettysburg address I grabbed off Wikipedia has em dashes in it (Unicode characters). That messes up my wrapping too, because fmt counts bytes, and each em dash is several bytes (three in UTF-8) even though it is one character.

   diff y z
   101,104c101,103
   < us—that
   < from these
   < honored dead
   < we take
   ---
   > us-that from
   > these honored
   > dead we take

(the others were not material).
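The byte-versus-character mismatch is easy to reproduce. A small sketch in Python, assuming the file is UTF-8 encoded (the thread does not say which encoding was used):

```python
# "us" + EM DASH (U+2014) + "that", the word from the diff above.
word = "us\u2014that"

char_count = len(word)                  # counts Unicode code points
byte_count = len(word.encode("utf-8"))  # counts bytes, as a byte-oriented tool would

print(char_count, byte_count)
```

A wrapper that measures width in bytes sees this word as two columns wider than it is on the page, which is enough to push the following word onto the next line.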

thangalin · on April 24, 2012

Ensuring quality of data counts as time required to complete the project. Also, fmt does not produce an optimally short solution for an 8" long bookmark. For example, using your fmt statement on http://pastebin.com/RxWd11bU produces:

    on a great
    battle-field
    of that 
    war. We
    have come to
    dedicate a
    portion of

Versus hand-written:

    on a great
    battle-field
    of that war. 
    We have come 
    to dedicate a
    portion of

You'll have to do better than 128 lines, I think, to fit the length requirement. And this was really the author's point. Had you estimated a solution that'd take a few seconds to code using fmt, your estimate would be blown away by reality. Not only would you have gone down a rabbit hole (which actually happens quite a lot in software development), but you'd still not have a working solution.
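For comparison, a plain greedy fill reproduces the hand-written lines for this fragment. A sketch using Python's standard `textwrap` module, assuming the 13-column limit used elsewhere in this thread; it does not check whether the whole address then fits the 8" length:

```python
import textwrap

# The fragment from the example above, as one run of text.
fragment = ("on a great battle-field of that war. "
            "We have come to dedicate a portion of")

# Greedy fill: put as many words on each line as fit in 13 columns.
# break_on_hyphens=False keeps "battle-field" together as a single word.
lines = textwrap.wrap(fragment, width=13, break_on_hyphens=False)

for line in lines:
    print(line)
```

Unlike fmt, `textwrap.wrap` makes no attempt to balance line lengths, so it packs each line as full as the limit allows.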

redthrowaway · on April 23, 2012

You cheated, yes, but yours is also the only sane solution, and the only one you would consider using in the real world where time spent developing has an opportunity cost.

thangalin · on April 24, 2012

It is not a viable solution; it will not print within 8" because it produces a result that is too long.

ericb · on April 23, 2012

How software estimating feels:

A sealed envelope containing a destination is placed in your hand. Without opening it, estimate how long you will need to drive there. Estimate high and you will be seen as slow or lazy, estimate low and you seem incompetent when you arrive late. Others plan around your estimate--so you must commit to it. Fun game, that.

gvb · on April 23, 2012

More like a postcard of a beautiful paradise island is placed in your hands. Without turning the card over to see where the island is located, estimate how long you will need to sail there.

[edit] And when you turn it over, it turns out it was really Gary, Indiana. http://maps.google.com/maps?q=gary+indiana&hl=en&ll=...

erikb · on April 24, 2012

I think the question is the same as someone asking "Why is riding a bike so hard?" and the answer is simply "Because you didn't learn to do it well enough, yet." For someone doing the same guessing for 20 years it is not hard at all. In fact he probably doesn't even have to think about it. The perfect result will just pop up in his head (or a result so close that the error margin doesn't matter for that exercise).

Is that thought so unusual? I don't really see why we need to discuss what makes a beginner guess wrong. The solution will always be the same: practice, or hand the task to someone who has the experience.

Could someone explain the point of this discussion to me?

K2h · on April 23, 2012

I completely agree how hard it is to 'teach' a computer how to do a task.

I have shifted my focus from writing procedures for humans to writing programs for computers. When I'm done, I have a program that the computer can run and that serves as a procedure explaining to humans just how hard the problem is, and exactly what compromises were made in the implementation.

The resulting program is much more thorough and thought out than the procedure I would have written.

irrationalfab · on April 23, 2012

Estimation is difficult because we compare apples with oranges. It is one thing to perform a task; it is another to freeze the intelligence that allows us to perform the task. It involves foreseeing edge cases which might not be present in a particular application of the logic.

On the other hand, once it's coded, it's way faster (because the logic was abstracted) and it is flawless (it always yields the same result).

tshaddox · on April 23, 2012

On a slightly related note, that naive greedy word wrap algorithm is generally considered aesthetically bad. A better idea is to give spaces at the end of lines a superlinear penalty and then minimize the total penalty. See http://en.wikipedia.org/wiki/Word_wrap#Algorithm.
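A rough sketch of that idea in Python, using dynamic programming over the break points. The squared penalty is just one possible superlinear choice, the last line is left unpenalized, and the function name `wrap_min_raggedness` is made up for this example:

```python
def wrap_min_raggedness(text, width=13):
    """Break text into lines, minimizing the sum of squared trailing spaces."""
    words = text.split()
    n = len(words)
    if any(len(w) > width for w in words):
        raise ValueError("a word is longer than the line width")
    # best[i] is the smallest total penalty for laying out words[i:].
    best = [float("inf")] * n + [0]
    next_break = [n] * (n + 1)
    for i in range(n - 1, -1, -1):
        length = -1  # length of words[i:j] joined by single spaces
        for j in range(i + 1, n + 1):
            length += len(words[j - 1]) + 1
            if length > width:
                break
            # Slack on the final line is free; elsewhere it costs slack squared.
            penalty = 0 if j == n else (width - length) ** 2
            if penalty + best[j] < best[i]:
                best[i] = penalty + best[j]
                next_break[i] = j
    # Follow the recorded break points from the first word onward.
    lines, i = [], 0
    while i < n:
        j = next_break[i]
        lines.append(" ".join(words[i:j]))
        i = j
    return lines


print(wrap_min_raggedness("aaa bb cc ddddd", width=6))
```

On this input a greedy wrap gives "aaa bb" / "cc" / "ddddd", leaving four trailing spaces on one line; the penalty-minimizing version spreads the slack out instead. Note that it never uses fewer lines than greedy, so it helps appearance, not total length.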

IanMechura · on April 23, 2012

Estimation seems hard for two reasons.

1) Very few individual contributors responsible for providing estimates ever invest the time necessary to learn how to perform estimation.

2) Many stakeholders (managers, PM, etc.) do not know the definition of the word estimation.

tonyarkles · on April 23, 2012

One of the key things with this is differentiating between the word "estimation" and the word "committing". I'm not sure if it was Uncle Bob's book (The Clean Coder), or another one, but those two words are frequently misunderstood by everyone involved.

An engineer's "estimate" is a project manager's "commitment", unless there's a serious discussion about the probability of completion associated with the estimate. (The default assumption will be 100%)
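To make "probability of completion" concrete, here is a sketch that reuses the die-roll model quoted at the top of this thread (20 tasks, each taking 1 to 8, 15 or 60 minutes with equal probability) and computes the exact chance of finishing within a given estimate. The function names are made up for this example:

```python
from fractions import Fraction

WAIT_MINUTES = [1, 2, 3, 4, 5, 6, 7, 8, 15, 60]
TASKS = 20


def total_time_distribution(waits, tasks):
    """Exact distribution of the total wait over `tasks` independent rolls."""
    p_face = Fraction(1, len(waits))
    dist = {0: Fraction(1)}
    for _ in range(tasks):
        nxt = {}
        for total, p in dist.items():
            for w in waits:
                nxt[total + w] = nxt.get(total + w, Fraction(0)) + p * p_face
        dist = nxt
    return dist


def probability_done_within(dist, minutes):
    """Probability that the total time is at most `minutes`."""
    return sum(p for total, p in dist.items() if total <= minutes)


dist = total_time_distribution(WAIT_MINUTES, TASKS)
for estimate in (110, 222, 300):
    print(estimate, float(probability_done_within(dist, estimate)))
```

Attaching a number like this to each candidate date is one way to keep an estimate from silently turning into a commitment.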

seele · on April 23, 2012

A few years ago I posted some thoughts about it: http://risklog.blogspot.com/2005/07/software-development-is-...

andrewcooke · on April 23, 2012

has anyone actually tried the task? it's trivial. i can understand how people get estimates wrong for complex problems, but did this guy + kent beck, pair programming, really need more than 30 minutes? do average software engineers need 30-45 minutes?

the message is fine - i don't have a problem with that at all - but the numbers / facts / anecdotes seem way off base to me.

scpike · on April 23, 2012

My guess is that by "the appropriate break point for a line" the author means the break point that makes the result look the best, not just chopping at 13 characters. In order to match what a human would be able to do easily, you'd need to implement something like what TeX does (http://en.wikipedia.org/wiki/Word_wrap#Minimum_raggedness).

aidenn0 · on April 24, 2012

Not true, since they give an algorithm towards the end of the article, and it's just the greedy one.

zaptheimpaler · on April 24, 2012

my implementation:

  i = 0
  for c in string.split():
      i = i + len(c) + 1
      if i <= 13:
          sys.stdout.write(c)
          sys.stdout.write(" ")
      else:
          sys.stdout.write('\n')
          sys.stdout.write(c)
          sys.stdout.write(" ")
          i = len(c)

took me 10 minutes.

thangalin · on April 24, 2012

This does not run in Python (NameError: name 'string' is not defined) -- the import statements are missing. And even when it does run, the output exceeds the maximum 8" bookmark length requirement. See my other reply.

http://news.ycombinator.com/item?id=3882085

trustfundbaby · on April 24, 2012

going off of the post, I find that getting estimates wrong is also because

a. not enough time is given to come up with good estimates

b. Programmers usually don't enjoy the process of estimating and so don't do it properly

c. Previous estimates are usually not shared amongst programmers

To elaborate.

a. There is usually pressure from managers or clients to come up with quick 'ballpark' estimates but only treat them as such when they don't fit their plans. So say they want something done in under 6 days. If you estimate a day ... there might be no questions asked, until you start blowing deadlines. But if you estimate 10 days, then a lot of people will quibble with you, trying to pressure you to give rationale or otherwise reduce the estimate.

As the poster shows, properly estimating something is actually a pretty involved process that means constructing the thing you're trying to build on paper (or in your head). Clients/bosses don't want to pay for that time, so devs are implicitly pressured to underestimate things, to avoid long discussions about unfavorable estimates.

b. Teeing off of a. clearly building the damned thing is way more fun than constructing it in your brain, estimating it, then going back and forth with clients on each line item as described above. There isn't a particular process, and sometimes you might actually spend almost half of the time of the estimate actually figuring out how to do it. Devs simply don't like estimating things, so they make (bad) guesses that seem safe but eventually turn out way wrong.

On a project I had once, I hated the work so much that I simply took a wild guess at how long it would take to do one particularly hairy bit and doubled it. Seemed totally safe at the time ... figured I'd be under by a lot. It turned out to take double even that estimate ... the culprit ... Internet Explorer.

c. If you go to a mechanic shop to get an estimate for work on your car, you might find that the mechanics go into a computer, plug in some details about your car and the problem ... and the computer spits out a number. This is because lots of things with cars are pretty routine ... Software isn't like that ... stuff that's routine is usually pulled out into plugins and frameworks that just work ... so we're constantly tackling problems that seem new, throw in the capabilities and weaknesses of particular frameworks, plugins or programming languages and you can see how these sorts of things can get away from you very quickly.

Programmers don't share their estimates with each other (hint hint), so the only really reliable source of good estimates tends to be personal experience, which is another reason why you should code ... a lot.

The only really good process I've found is to completely wireframe what needs to be done, (either that or build a clickable prototype that works exactly the way they want it to work). Then provide an estimate for that. But that doesn't always work for smaller things like bug fixes or smaller features.