Well, maybe. Probably you can retrace your steps and figure out where you set them down while you were on the phone, or maybe they fell into the couch. There is also a possibility that it is going to take you a long time to find those keys.
A software estimate often depends on environments being available, on other people finishing the pieces you interact with on time and having them work the way you expected, and on tricky defects where you really can't estimate how long each will take to fix, or how many there will be.
A clear specification is only one piece of the puzzle. To switch metaphors, pretend it is 1812 and try to give an estimate of how long it will take to get from Chicago to San Francisco. Yes, you have a map with routes, but there is also weather, broken wagon wheels, dysentery, bandits, you name it (just ask the Donner Party). Let's just estimate it at 90 days, and probably we'll lose some people along the way and pick up a few more.
At best I try to give the suitcase estimate: I can estimate that you will need one suitcase for a weekend trip; you most likely will not need two.
If you give a software developer a very clear specification of what they need to do, and don't change that specification, it's very likely that they'll give an accurate estimate and get it done on time. Probably even faster, because most software developers pad their estimates somewhat.
Also, it is possible to "invent new things" that have very clear specifications that don't change much. It might not be very common, but it does happen. Especially for relatively simple things.
Two things actually cause inaccurate estimates:
1) a lack of clear specifications of what the project is going to be
2) a change in the specifications
In many cases, the expense of producing clear specifications is not justified. This is normally the case when companies develop their own products: they would implement them regardless of how long it took, and they bear the expense themselves.
When software development services are provided for other parties, there is normally a "requirements gathering" phase where the developer tries to get a very comprehensive specification for the project. Normally, this specification and its estimates will be very accurate. However, after realizing their mistaken assumptions in the requirements gathering phase, the client tends to want something different from what they wanted before - it is these changes of requirements that cause initial project estimates to be off.
In the end, no estimate has to be off if we provide clear specifications - we just have to accept that requirements/specifications are very likely to change during the development of any product.
As developers we translate between what the customer appears to want and what the compiler requires. If the customer had specified what they want with perfect clarity we would be able to automatically generate code from those requirements (or even compile them directly). But because they haven't been completely clear we have to use judgement and inference.
I want a program that adds three numbers together and appends the string "asdf1234" to the result.
1) I can provide a very clear specification for this.
2) I invented something new.
3) I can provide an accurate estimate for how long it will take to develop.
It's not very useful. But there are many "mini-projects" in organizations that are of small enough scale that you can provide accurate estimates and specifications for them. A more complicated, but "useful" example, would be to implement a logging server according to a REST API. The specifications are clear, the project is simple, and the estimate will be very accurate unless the software developer is not very experienced.
print(($ARGV[0] + $ARGV[1] + $ARGV[2]) . "asdf1234");
> A more complicated, but "useful" example, would be to implement a logging server according to a REST API.
If the REST API in question is well-defined, it's because there's already an implementation of it. Just install that implementation and use it, and you'll be done in half an hour. Are you not sure if you can use the existing implementation? Well, then your project might take 45 minutes, or it might take two weeks. Suddenly you have two orders of magnitude of uncertainty in your estimate. Maybe you know you can't just use the existing implementation; how much of it can you reuse? Is there another piece of software out there that implements the same API? You may be able to get it to work for you in an hour and a half. Or you may be better off writing stuff from scratch.
Then, either with the off-the-shelf software or the software you wrote from scratch, you may encounter a killer performance bug — which could take you two weeks to resolve. Or you may not.
Maybe you think only the not-very-experienced software developer would consider using off-the-shelf software, or encounter a critical performance bug that could take weeks of work to resolve. If that's what you think, I suspect you're the one who's not very experienced!
This. One of the biggest sources of uncertainty is not knowing how much of the existing code will work well enough for you.
This isn't necessarily true. Two examples:
for x, y in sorted(lst), assert(x < y => index(x, sorted(lst)) < index(y, sorted(lst)) )
assert( sqrt(x) * sqrt(x) == x )
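For what it's worth, here is one way to make those two specs executable (the function names are my own, and I've added the floating-point tolerance that the strict sqrt(x) * sqrt(x) == x version lacks):

```python
import math

def check_sorted_spec(lst):
    # First spec: in sorted(lst), no element precedes a smaller one.
    s = sorted(lst)
    assert all(a <= b for a, b in zip(s, s[1:]))

def check_sqrt_spec(x, ulps=2):
    # Second spec, relaxed: exact equality rarely holds in IEEE doubles,
    # so allow sqrt(x) * sqrt(x) to land within a couple of ulps of x.
    r = math.sqrt(x)
    assert abs(r * r - x) <= ulps * math.ulp(x)
```

Writing the checks is still not doing the development work, of course; they only pin down what "done" means.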
1. It's true that those specifications are clear (although you might also want to specify that multiset(sorted(lst)) == multiset(lst), and that sqrt should be allowed an error of half an ulp and perhaps be required to be nonnegative and restricted in its domain to nonnegative numbers). But they are not necessarily specifications that are easy to estimate, either. (I should have said "your very clear, easy-to-estimate specification", since writing specifications like the ones you have above is clearly not doing the development work.)
2. It is at least theoretically possible to automate the finding of programs that satisfy specifications like your two examples above. Given the knowledge that λx.x*x is strictly monotonic on nonnegative numbers, for example, you can apply a general binary chop routine to compute sqrt(x) in O(numberofbits) time.
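A minimal sketch of that binary chop, assuming IEEE doubles (each iteration halves the bracket, so it gains roughly one bit of precision):

```python
def bisect_sqrt(x, bits=52):
    # Binary chop for sqrt(x), x >= 0, using only the fact that
    # r -> r * r is strictly monotonic on nonnegative r.
    lo, hi = 0.0, max(x, 1.0)  # sqrt(x) always lies in [0, max(x, 1)]
    for _ in range(bits + 12):
        mid = (lo + hi) / 2
        if mid * mid <= x:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

The same search loop works for any strictly monotonic function, which is the point: the searcher only needs the monotonicity fact, not a bespoke algorithm.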
3. For your realistic example, it seems very likely that you could write software to build the scrapers rather than hiring a person. http://lists.canonical.org/pipermail/kragen-hacks/2004-Janua... represents a prototype solution to this problem from 2004, but without the knowledge of AI algorithms I have now, and of course with much less computational power than is available today.
You may say I'm being pedantic but these are exactly the type of issues which cause inaccurate estimates.
Yeah, I'm not a big believer in accurate time estimates either. Routinely accurate time estimates imply that you're doing something wrong. If your work is routine enough that you can routinely accurately estimate it, you're missing an automation opportunity somewhere. Probably a big one.
Remember Parkinson's Law "Work expands so as to fill the time available for its completion." 
Now consider student's syndrome "Student syndrome refers to the phenomenon that many people will start to fully apply themselves to a task just at the last possible moment before a deadline." 
These two phenomena ensure that, in a project with multiple tasks, time saved on a few of them won't be available to the tasks that take longer than expected.
Based on this, Eliyahu Goldratt developed the Critical Chain Project Management methodology.
And when these things are predictable, it's because they aren't novel. And since software is (ideally) non-repetitive, value fundamentally comes from novelty.
In other words, the better you code, the less predictable you will be.
I believe it is both, in fact, probably leaning a bit more toward the question side of the equation.
I heartily recommend reading the studies associated with the Wikipedia page, as they demonstrate that this applies to more fields than just software. I will agree that the data does appear to support the hypothesis that the less repetitive the task, the more likely it is to fall subject to the planning fallacy.
Now, we're talking about a dedicated group of people who aren't going to do the work themselves and who do nothing but estimation, but still....
The fundamental problem is that until you're actually well into the project, you can only see down a few levels; but the amount of effort that is going to be required is a function of the length of the fringe of the tree, which you can't see, at least not very clearly, until you get there.
"Hey inventor, I need a drone that will pick up mice (but not other animals), locate my ex-girlfriend and drop them on her head. Give me a budget and a time estimate."
For a good amount of software (excluding boilerplate, plumbing, e-commerce-type software), this is what it's all about. I had a long argument with an uncle of mine who's getting into the field of software project management. I get the sense that among the project management types there's a belief that software can be constructed by building Gantt charts, costing spreadsheets, and "staff months". They claim computer science degrees "don't teach the practical stuff", and it's as if they are completely unaware that there lurk hard problems.
Oh yeah, managers also believe that changing the counter top from marble to granite in the middle of the project should be free, because, hey, it's software.
From Frequently Forgotten Fundamental Facts about Software Engineering:
RD1. One of the two most common causes of runaway projects is unstable requirements. (For the other, see ES1.)
ES1. One of the two most common causes of runaway projects is optimistic estimation. (For the other, see RD1.)
ES2. Most software estimates are performed at the beginning of the life cycle. This makes sense until we realize that this occurs before the requirements phase and thus before the problem is understood. Estimation therefore usually occurs at the wrong time.
ES3. Most software estimates are made, according to several researchers, by either upper management or marketing, not by the people who will build the software or by their managers. Therefore, the wrong people are doing estimation.
Glass has a book that goes into more depth on these points and has numerous citations (Facts and Fallacies of Software Engineering, Robert L Glass) and covers a wonderful variety of other topics.
Some companies certainly are behaving better, software-wise. I recommend his book; he goes into a longer discussion of the details, has citations, and includes counterpoints. It's also possible that this has shifted since the book was written (2002, I believe), but I suspect it's more that you've been at companies who managed to get this part right.
Every project I've ever worked on in the AAA games industry was deliberately underestimated so the publisher is happy with budget going in, with the added benefit that you can put pressure on the programmers because they are behind schedule from day one.
Are there still people who believe this works? I have been on teams that sometimes went into the super-effective mode where the team seems to work as one unstoppable goal-reaching mind. Yup, but this never happens due to external pressure, especially pressure perceived as bogus from the start.
"So you say we need to work "smarter not harder" because we are on "aggressive" schedule? Yeah, right. Can you please also hang some motivational posters around so we can watch them while we utilize the synergy to maximize shareholder value?"
But I'm not covering how the world should work. I'm covering how it does work. The reason this article resonates is precisely because we all see the same mistakes being made again and again.
I also agree that many software projects are more analogous to Columbus's trip to the New World... a tough trip even if you knew for sure there was a New World and where it was, almost impossible if you didn't.
But realistically most people are working on web sites, enterprise apps, mobile apps, where there is enough prior experience that we should be able to make reasonable estimates. We aren't curing cancer here.
Yes the same mistakes get made again and again...
But the simple version of the problem, in my experience, is related to the 80/20 rule. No matter how many times a developer goes through the process of estimation -> development -> slippage, whether on a big scope or small, we will inevitably estimate the very next project as if only the clean 80% exists and the 20% will magically disappear this time.
Let's back up. In my experience the 80/20 rule (or Pareto Principle) applied to software estimation means that you will spend 80% of your time writing 20% of your features. Usually this has to do with various technical blocks, edge cases, framework incompatibilities -- the list goes on and varies wildly from one feature or application to the next.
You will spend 20% of your time working on 80% of the feature(s). This is the stuff you have done before: basic CRUD, ORM-fu, REST actions -- these are the clean, easily understood software tasks that you see right in front of you as the happy yellow brick road to feature completion.
And no matter how many times we wash-rinse-and-repeat this cycle, the next time a PM or colleague asks us "how long this will take," we will stare at that beautiful, pristine part of the path that we understand and know and have done before, and conveniently elide the fact that our development process has never been that clean and there is a nasty 20% (what? my code won't connect to the staging server because of some weird certificate problem? what do you mean my gems are incompatible? I can't get 'make' to build this gcc lib!! IE is doing what?? Why do our servers treat HTTP basic auth in such a non-standard way?) just waiting to jump out, muss your hair, laugh in your face and dance like a hyperactive monkey all over your keyboard.
Both of my parents are engineers in several different areas (road, civil, and HVAC engineering). Through talking to them I realized that their projects and estimates are always off, and by similar margins as the software projects I've been involved in.
I'm not sure why that is, but people just seem to generally suck at estimating engineering activities. With software I think the problem is more pronounced because the engineering phase so dominates the cost of construction. If designing a bridge takes longer and costs more than what you expected it really isn't such a big deal because actually building it is much more costly and delay-prone. The people financing the project won't have a hard time absorbing the cost associated with incorrect engineering estimates. Quite clearly the situation when it comes to software is different.
If the problem domain is defined by physics or physical function, I think most time estimates are going to be a lot closer. You might still get outliers, but most experienced people in the domain will probably give you decent estimates.
When you get human preferences, design, business processes, and regulations into the mix, estimates can be wildly off. I once thought a defined business process would yield a decent software specification, but that never seemed to work out, since the "flexibility" argument always creeps in. Never mind multi-national companies that have a defined business process that is actually executed totally differently at every location.
1. The languages and tools change - but are often immature for the task, which means a lot of reinvention and re-workarounds
2. Users ask for nice-to-haves that are often difficult to implement because they are one-off operations - federated sign-on in frameworks that do not support it natively, for example
3. The specification problem - the more people you have involved, the more people are required to understand the problem domain in order to translate it into code. The problem is particularly bad when rolling out new payroll systems in government/heavily unionized industries, as there are just a lot of rules, which can be very difficult to implement.
4. It is still ridiculously difficult to test enterprise software across the full stack.
The software tasks that are easy to estimate are the same ones that you already should have automated, maybe with a DSL. However, automating them is hard to estimate.
You can think of this by imagining that you're asked to solve a Rubik's cube. You can look at 5 of the sides, but not touch it. Tell them how many moves it will take to solve. The theoretical maximum is 20, I believe. In this case, and many others in programming, you can't know how long it will take to get something done. The fastest way to find out how long it will take to finish a system, is to do the work and finish it.
I'm in part playing devil's advocate because I don't agree with software patents. But the problem with us (technical people) not communicating consistently is that if we can't get our story straight how do we expect others to understand what we say?
In the software patent arguments we stay firmly away from the word invention, wary of how it will be used against us. Yet here, for the sake of arguing about estimation, we seek to embrace the word invention?
I'm not sure we can have it both ways. We've got to collectively get our story straight.
It's the whole "known unknowns" vs. "unknown unknowns" thing, and I think it's useful to distinguish between the two.
Then I double the result.
It's always closer to the amount of time I really need in the end. For some reason I'm always off by half. And I have heard of others with the same problem using this approach.
However, if you are working for an engineering company, often you have projects that were similar which can provide a guideline for the scope of the software or software modification that you are undertaking.
I would posit that for the majority of software projects, the invention hypothesis cannot be true, because you are not really inventing anything new.
You are more likely replacing something that is done manually, or done via hideous and error-prone Excel spreadsheet machinations.
So my complementary hypothesis is that estimating the budget for a software project is a social issue.
My manager and I come up with some really cool feature that will let us market our product to new people. My manager asks me for a time estimate. He says:
>How long do you think this project would take for you to do?
Basically, it is a simple question that is also completely overloaded with little details, philosophical questions, and other minutia.
I can give an estimate of how long it will take for me to get this project done. But how long a project will take me is a derived measurement: it is units of software engineering work divided by my rate of software completion. What we really want is units of software engineering work. This does not exist. You can't invoice for 10,000 ml of software engineering.
The punchline for this is that my rate of completion will be different from Steve or Charlie's and we don't have the rate or the units, just the aggregate for any given project. And it seems to be that the tendency is to go to your best programmer to get an hours estimate, rather than your mediocre programmer, regardless of who will be working on the particular project (you probably don't know who will be working on it when you are figuring out financials for a project that is 6-10 months off).
There is no standardized software engineering test that gives you a baseline for probable programmer efficiency/ability so that you can adjust the budget of the project accordingly.
There are also questions about 'what is done', interruptions from other projects that you have going on simultaneously, interpretation of various details in your spec.
And there is other bureaucratic stuff. I've had it where I was budgeted for a few months, with a delivery date to match, and two months into the timetable I still hadn't received the input documents I needed to complete the project.
Or the other version, you budget for a few months and some know-nothing pushes back and tells you do not need that much time (or really that his financials will look better if you finish it this quarter rather than next).
There are certain unsolved problems in computer science that you may encounter. When asked to solve them as part of my day job, I prefer to gracefully decline and offer a solution that I know has a chance of working and being implemented in a reasonable amount of time (by doing some research beforehand and figuring out what is reasonable within current technology/knowledge and what is pipe-dream/'interesting research' material).
There may be things that you do not yet fully understand when working on a project. But it is possible to estimate how long it will take you to learn those things. It is very difficult to estimate the complications of many of the social factors. If stupid Steve is working on the project instead of me, it might never get completed. If a jackass up the chain of command cuts my budget halfway through, I can't predict that. If I get pulled off onto an emergency project, who knows what will happen. I think this is the real reason why startups and small workgroups do so much better at software: by reducing the number of people involved, you reduce the chaos factor and the amount of damage control you have to do when someone monkeys with your project when they really shouldn't.
I'd like to hear some thoughts on how to improve those estimates, rather than explaining why it is hard, as Brooks did in MMM.
Not Invented Here would be rewriting bridge4j because you feel like it. A good software engineer does research, and invents what needs to be invented because it doesn't exist. If it exists and it's freely available, you use it and write glue code.
The reason web startups can pop up so quickly is because we all do this. IndexTank was an example, the list of open-source technologies we used is very long. Big companies on the other hand are prone to unnecessarily rewriting stuff, sometimes just to keep developers entertained in between meaningful projects.
Let's say that you have 20 tasks. Each involves rolling a 10-sided die. If it's a 1 through 8, wait that number of minutes. If it's a 9, wait 15 minutes. If it's a 10, wait an hour.
How long is this string of tasks going to take? Summing the median time expectancy, we get a sum of 110 minutes, because the median time for a task is 5.5 minutes. The actual expected time to completion is 222 minutes, with 5+ hours not being unreasonable if one rolls a lot of 9's and 10's.
This is an obvious example where summing the median expected time for the tasks is ridiculous, but it's exactly what people do when they compute time estimates, even though the reality on the field is that the time-cost distribution has a lot more weight on the right. (That is, it's more common for a "6-month" project to take 8 months than 4. In statistics-wonk terms, the distribution is "log-normal".)
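The dice arithmetic above is easy to check with a quick Monte Carlo run (the seed and trial count here are arbitrary choices of mine):

```python
import random

def task_minutes(rng):
    # One task: roll a d10; 1-8 -> wait that many minutes, 9 -> 15, 10 -> 60.
    roll = rng.randint(1, 10)
    return {9: 15, 10: 60}.get(roll, roll)

rng = random.Random(42)
projects = [sum(task_minutes(rng) for _ in range(20)) for _ in range(100_000)]
mean = sum(projects) / len(projects)

# Median-based "estimate": 20 * 5.5 = 110 minutes.
# True expectation: 20 * ((1+2+...+8) + 15 + 60) / 10 = 222 minutes.
print(round(mean))                                      # close to 222
print(sum(t >= 300 for t in projects) / len(projects))  # share of 5+ hour runs
```

The simulated mean lands right around the 222-minute expectation, double the median-based estimate, and a sizable fraction of runs blow past five hours.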
Software estimates are generally computed (implicitly) by summing the good-case (25th to 50th percentile) times-to-completion, assuming perfect parallelism with no communication overhead, and with a tendency for unexpected tasks, undocumented responsibilities, and bugs to be overlooked outright.
That is a beautifully accurate description of modern software development. I hope you get quoted.
Then, we ran the numbers. This was our graph for median time for a swag:
[graph elided; x-axis: swag estimates 1 through 5]
We went to Fibonacci because it felt more natural for the effort involved. The end result is that if a story is estimated at the top of the scale, everyone knows there isn't enough information, it's a black hole, or it should be broken down.
Often the primary issue is fitting a goal that is in flux into a time period that is not.
At my last company, they needed an e-commerce site done. I had the coding finished in roughly 3 months and we were just waiting on the design. We went through 3 artists and a design by committee (the boss rounded up everyone in the company once a week to give their feedback). 2 entire finished designs were also scrapped. In addition to all of this, the boss would change his opinion of it on a daily basis (I think it depended on his mood).
A year into the project, they questioned me as to why the project wasn't finished. This was after I had been telling them for months why we couldn't run a project this way. A year after this, the project was finished.
What infuriates me is that a company like this is still making money. Every aspect of the company was run like the above scenario. Over time, every good person they ever had left in frustration (including me).