
Fact and folklore in software engineering - Morendil
http://morendil.github.com/folklore.html
======
PaulHoule
I think the "10x" myth persists because we all imagine that we're the
programmers who are "10x" better than average.

It's nominally true because many programmers have zero or even negative
productivity, sometimes through no fault of their own. If you work on a
project for three years, and then it gets canceled, you could say your work
had no economic value because it never got in front of customers. If some PHP
hack manages to pound something out in a month that brings in $100,000 of
business, he looks like a hero, no matter how bad the code is.

In my view, superproductivity is about alignment with your environment. If you
work for 12 months a year, but spend 3 of them on side projects that go
nowhere, spend 3 months being a sysadmin, and then waste another 5 months on
reworking a GUI, there goes about 90% of your productivity.

The "Mythical Man-Month" says that about 20% of the time on a software project
is spent coding; the rest is requirements work, testing, and that kind of
stuff. A superprogrammer with supertools might be able to make the coding time
approach zero asymptotically, but unless you can get rid of the other 80%,
project cost and time don't improve much, even in comparison with a coder who
is half as productive as average.

The moral? If you want to look like a genius, find some place where (i)
requirement work takes little time (and wastes little time downstream with
changes) and (ii) testing, deployment and all that is minimal.

~~~
lispm
I have seen programmers that are more than ten times faster than some slow
programmers. There are debugging tasks where some people struggle for days,
while others would find it in one hour.

~~~
ewjordan
Take any non-trivial debugging task (i.e. something that slipped through the
net for several releases in a live system, not a simple bug in fresh code) and
set me on it 100 times, and you'll almost certainly see at least 10x variation
in how long it takes me. That's just the nature of debugging: sometimes you
get lucky and sometimes you don't.

Of course, there's some real variation in the averages that people will settle
down to over the course of hundreds of debugging tasks, but an individual's
productivity on debugging tasks varies so much that I'd be hard pressed to say
much at all about the individuals until I'd seen them do at least a few
trials.

It _may_ be the case that the differences in the average times people take do
end up being on the order of 10x or more, but it would take a lot of
observation to say that for sure (exactly how much depends on what sort of
distributions we see when we measure this stuff).

~~~
Morendil
Kudos for pointing out that it's the shape of the distribution that matters,
not how far apart its endpoints are.

Interestingly there are articles out there which suggest that one distribution
(of competence, rather than productivity, but intuitively you'd expect the two
to be related) is actually bimodal rather than normal, "the camel has two
humps":

<http://www.codinghorror.com/blog/2006/07/separating-programming-sheep-from-non-programming-goats.html>

------
j_baker
It's important to note that (as far as I can tell) the author doesn't actually
debunk the claim that good programmers are 10x more productive, only the claim
that studies show programmers are 10x as productive.

Personally, I think the author is also unintentionally pointing out the
problem with relying on studies of programmer productivity: it's next to
impossible to measure. I seriously doubt that the reason there is no valid
research on this is simply that no one has gotten around to it. Too many
people's businesses rely on understanding programmer productivity inside and
out for that to be the case.

Personally, I agree with whoever it was that argued that we need to view this
as a soft science like psychology or sociology. We need to focus on working in
spite of our imperfect means of measuring productivity rather than dismissing
all of our research based on it. Such is the nature of studying the human
mind.

~~~
beoba
Developer productivity is near impossible to _define_, much less measure.

~~~
mkramlich
Developer productivity is a lot like obscenity. And jazz. It may be hard to
define. But it doesn't matter. Because I know it when I see it. YMMV. But
that's not my problem. I don't have to define precisely what it means to be
hit in the face by a rotting fish. I'm pretty sure I'll know it when it
happens.

Now back to Terminal and vi so I can be productive again...

~~~
j_baker
This is true. But when you're doing an objective study, you need something
more than "I know it when I see it".

~~~
barry-cotter
Well you could use peer ranking, and see how well the rankings correlate with
one another. That's the method they use in expertise studies for subjective
fields.
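As a sketch of what "see how well the rankings correlate" could mean in
practice, here is Spearman's rank correlation between two hypothetical raters
(the names and rankings below are made up for illustration; this is the
standard no-ties formula):

```python
def spearman(rank_a, rank_b):
    """Spearman rank correlation for two tie-free rankings of n items.

    rank_a[i] and rank_b[i] are the ranks (1..n) two raters gave item i.
    Returns 1.0 for identical rankings, -1.0 for exactly reversed ones.
    """
    n = len(rank_a)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Two hypothetical peers ranking the same five programmers:
alice = [1, 2, 3, 4, 5]
bob = [2, 1, 3, 4, 5]
print(spearman(alice, bob))  # 0.9: close agreement
```

If many raters' rankings all correlate strongly with one another, that's
evidence the subjective judgments are tracking something real.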

------
wazoox
Heck, even one given programmer can see his own productivity vary wildly.
There are times I could hack away thousands of lines of working code in a
single week, and some other times I'd tinker for months getting no actual job
done. Overall I managed to ship from time to time, though I frankly admit I
failed at the most ambitious projects because I was probably not up to the
task.

Even programming at my very basic ability level (actually I never considered
myself to be a professional programmer*) I've seen a number of people who
simply never get anything done, and are 10x less productive than I am. The
sort of people who are the heroes of thedailywtf.

They never do any better than tinkering around; with some luck they may
actually implement some simple code if given very precise requirements,
because they lack the basic impetus, or the necessary comprehension, to start
by themselves. Maybe on some huge team of Java drones they would add some
value; I don't know for sure, I've never worked in a team of more than five
programmers anyway.

OTOH I've worked with some guys thumping the keyboard 16 hours a day for
months straight, getting working production code out of the frigging door
every single day. So I dare say I've seen it all: the 10x programmer, the 1x
programmer (myself and most others), and the 0.1x programmer (I don't know
how many).

All of this sure isn't scientific fact, but simply what I witnessed myself in
the past 25 years. It's enough information to guide me through :)

* I'm a professional dilettante :)

------
equark
I don't see how it's really debatable that some programmers can be 10x more
productive in certain domains. This is especially true in technical domains
where it can take years to gain the knowledge required to solve the problem at
all. I can think of real-world problems that would require the average
programmer several years to solve, yet I know people that could solve them in
a day. In fact, almost any specialized task works like that.

~~~
mattmanser
It would take others years to solve what someone else can solve in a day?

I would love to hear an example of that as that seems a pretty extreme claim.

~~~
btilly
I would suggest <http://norvig.com/spell-correct.html> as an example of a very
short program that took an expert under an hour to produce which many
competent programmers could have spent weeks spinning their wheels on. (With a
worse result.)

You will note that the expert is not quicker by virtue of pulling in more
sophisticated tools or large libraries. Rather the expert is faster due to
knowing how to solve the actual problem.
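For a sense of how compact the expert's solution is, here is a stripped-down
sketch of the approach Norvig describes (his version trains on about a million
words of real text and also considers candidates two edits away; the corpus
here is a toy stand-in):

```python
from collections import Counter

# Toy stand-in corpus; Norvig's version trains on ~1M words of real text.
WORDS = Counter("the quick brown fox jumps over the lazy dog the the".split())

def edits1(word):
    """All strings one edit (delete, transpose, replace, insert) away."""
    letters = "abcdefghijklmnopqrstuvwxyz"
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [L + R[1:] for L, R in splits if R]
    transposes = [L + R[1] + R[0] + R[2:] for L, R in splits if len(R) > 1]
    replaces = [L + c + R[1:] for L, R in splits if R for c in letters]
    inserts = [L + c + R for L, R in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)

def known(words):
    """Keep only candidates that appear in the corpus."""
    return {w for w in words if w in WORDS}

def correct(word):
    """Prefer the word itself, then one-edit candidates, ranked by frequency."""
    candidates = known([word]) or known(edits1(word)) or {word}
    return max(candidates, key=lambda w: WORDS[w])

print(correct("teh"))  # -> "the"
```

The hard part was never the code; it was knowing that "most frequent word
within a small edit distance" is a good-enough probabilistic model.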

~~~
staunch
I love that example.

Another one: <http://www.paulgraham.com/spam.html>

For many years hundreds of programmers worked on anti-SPAM systems that
weren't as effective as the very simple method PG proposed.
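The core of PG's method is just per-token spam probabilities combined
naively. A toy sketch (the training data and add-one smoothing here are
illustrative stand-ins; his article uses larger corpora and a more careful
combining rule over the most extreme tokens):

```python
import math
from collections import Counter

# Toy training data; a real filter trains on thousands of messages.
spam_msgs = ["win money now", "free money offer", "click now win"]
ham_msgs = ["meeting notes attached", "lunch tomorrow", "project status notes"]

spam_counts = Counter(w for m in spam_msgs for w in m.split())
ham_counts = Counter(w for m in ham_msgs for w in m.split())

def token_spam_prob(word):
    """P(spam | word) with add-one smoothing."""
    s = spam_counts[word] + 1
    h = ham_counts[word] + 1
    return s / (s + h)

def spam_score(message):
    """Combine per-token probabilities in log-odds space (naive Bayes)."""
    logit = sum(math.log(token_spam_prob(w) / (1 - token_spam_prob(w)))
                for w in message.split())
    return 1 / (1 + math.exp(-logit))

print(spam_score("free money"))        # > 0.5: looks spammy
print(spam_score("meeting tomorrow"))  # < 0.5: looks legitimate
```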

If anyone else can think of more I'd love to know them.

~~~
mattmanser
But you've both just disproved your own examples!

The very fact that there are articles explaining previous solutions in these
problem domains is why I just don't believe anyone would take a year to get up
to speed with something an expert could do in one day.

Just by googling it, I could make, in far less than a year, a great spell
checker or a great anti-spam program of the kind that takes an expert a day to
write.

There are very few programming domains that aren't fairly transparent, where a
domain expert has such a massive advantage that a domain novice couldn't
gather sufficient knowledge in a year to rival the expert's one-day effort,
given the same level of intelligence, etc.

~~~
btilly
When it comes to spelling correction, there is a lot of literature on how to
do it. So that example I grant you. But could you, before Paul Graham's
article, have written an anti-spam program that is that effective? Seriously?

------
pg
When I found myself reading sentences like this

    Citation itself often presents as a modality,
    with an associated degree of confidence.

I started to wonder if this article was a practical joke. Why was someone
writing about programming using the pompous language of literary theory? Was
someone trying to pull a Sokal? Then I saw the endnote.

(I realize this is only DH2, but it's such an extreme case that I can't help
thinking how lucky we are not to have programming ordinarily discussed in this
way.)

~~~
Morendil
Which part of the above did you have trouble with?

"Citation" is the topic being discussed. It's a technical term that refers to
how scientific papers refer to other scientific papers.

"Modality" is the concept I've been introducing for three paragraphs. You're
supposed to know what it means by then, or I've not been clear enough.

"Present as" is an intransitive verb phrase. I could say the same thing,
perhaps more simply, by saying "you can think of a citation as a modality".
Perhaps it's this phrase that has thrown you?

[EDIT: I've changed the article to try and make the sentence clearer. Thanks
for your feedback.]

It seems to me that "degree of confidence" is clear enough, and that it's
clear enough that a modality can have a degree of confidence associated with
it.

So the article introduces two technical terms you're perhaps not familiar
with, but I'd suppose someone who can read a CS paper or write a program can
cope with that much.

~~~
JesseAldridge
It's the word "modality" that throws me. The word by itself is pretty vague --
"somehow related to a mode". The linked Wikipedia article is dense and
confusing. You explain it as a way of modifying a statement. Then you say a
publication is an "extended modality". That requires another mental stretch.
How does the abstract idea of "a publication" connect to modifying a
statement? One expects you to explain that in the following sentences, but you
don't really. Instead you follow with the seemingly unrelated statement, _"A
researcher may wish to state a conclusion: “water boils at 100 degrees
celsius”. Convention dictates that he should initially add various hedges to
his conclusion: ..."_ So when writing a publication a researcher will
customarily add hedges. What does that have to do with a publication being an
"extended modality"? Is it really necessary to use such an obscure term? It
seems like it shouldn't be this hard to understand what you're saying.

~~~
meestaplu
I definitely agree about the linked Wikipedia article on modality. Literary
theory is probably not a good way to talk about modality with programmers, but
the idea of modality isn't just about linguistics and literature, and isn't as
obscure as you think.

A better starting point for programmers and computer scientists would have
been modal logic, which uses the modal operators of necessity and possibility.

For example, classical logic uses propositions. I can say "P" in classical
logic. In modal logic, I can say "P", "Necessarily P", and "Possibly P", where
logical necessity and possibility are modalities. See
<http://en.wikipedia.org/wiki/Modal_logic> -- it's a pretty good overview and
it links the logical and epistemological senses of modality.
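To make the operators concrete: the usual notation writes necessity as a box
and possibility as a diamond, and each is definable from the other (standard
textbook notation, nothing specific to the article):

```latex
% Modal operators: necessity (box) and possibility (diamond)
\Box P \;\; \text{(necessarily } P\text{)}, \qquad
\Diamond P \;\; \text{(possibly } P\text{)}

% They are interdefinable (duality):
\Diamond P \;\equiv\; \lnot \Box \lnot P, \qquad
\Box P \;\equiv\; \lnot \Diamond \lnot P
```

Read epistemically, a hedged claim like "possibly, water boils at 100 degrees"
is exactly a diamond-statement, which is the sense in which a publication's
hedges are modalities.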

------
huherto
My personal opinion is that the big gains in programming productivity on large
projects have to do with how code is organized and structured. The right
abstractions are also key to keeping the inherent complexity under control. It
is not about implementing feature x in time t. It is more about implementing
the feature in a coherent way, and the gain is long term. To measure and
compare that, you would need to track two projects implementing the same
features over a period of time. That experiment is hard to do.

~~~
yxhuvud
Don't forget how the _people_ are organized. Both the programmers and the
people around them.

------
sdepablos
Anyone who says that "net negative productivity programmers" don't exist is a
person who has been really lucky in his professional career.

------
prewett
I suspect the 10X observation is due in large part to the fact that some
programmers are more experienced. I'm much more productive than I used to be,
because I've been doing it for 10 years. I've already had to think about a lot
of problems, which means I already know solutions for a lot of things. It
doesn't mean I'm inherently better than I was, just more experienced.

I expect that this 10X figure is also applicable to things besides software
engineering. I'd guess that an auto mechanic with 30 years of experience
running his own shop is 10X more productive at solving problems than the new
guy at the dealership. (Note: solving problems, i.e. knowing what part to
replace, not just replacing it.) The Ph.D. who's been researching for 30 years
is probably a lot faster at finding solutions than the guy who just passed his
qualifiers.

------
abecedarius
One set of studies: <http://norvig.com/java-lisp.html>

The focus was more on the language factor, and some programmers were students
while others were pros, and in some they were self-selected; but still, >10x
variation in development time among the Java programmers. 30x if you allow all
the above differences. The 10x thing gets quoted way more confidently than the
amount of study justifies, but it's not just made up.

~~~
ewjordan
I've already made this comment elsewhere on this thread, but for completeness'
sake...

The problem with reading off a 30x variation (or even >10x) from Norvig's
results is that it discounts the possibility that the same person might take
different amounts of time for tasks (relative to the others in the group).

For instance, you might have people in the group that have already worked some
variation of that problem. Or you might have some that had colds, were
distracted, or just do badly on that sort of enumeration problem, but are
killer at other problems. Every one of these possible discrepancies cuts away
at the expected magnitude of the actual overall productivity difference
between programmers, and should be accounted for.

Of course, since that difference is not what was being studied, they didn't
set up the experiments to gather the data we'd need to best estimate the real
productivity differences, so it's hard to say what we should conclude...I'm
sure there _are_ people that are 1/30th as productive as the best programmers,
but what we'd really want to know is how typical they are, and what the
overall distribution looks like. That's far less clear.

------
Chris_Newton
The 10x productivity claim is one of the unfortunate ones, because anyone with
a few years of programming experience can relate to the idea, but it's very
difficult to define an objective hypothesis that can be tested quantitatively
in any meaningful way.

First you have to define what things like "productivity" are for programmers,
and to my knowledge no-one has really done a robust job of even that yet.

Then you have to establish who the "best" and "worst" cases are. After all, I
suspect many of us would agree that some programmers make a net negative
contribution to their projects, taking more time away from more efficient
programmers to fix the resulting problems than it would have taken the
stronger programmers to do the work themselves in the first place. This may be
inevitable, given the learning curve involved in a field as complicated and
diverse as programming.

That all said, I will just mention that Glass also covers this subject, with
various other citations, in "Facts and Fallacies of Software Engineering".
Given that his final fact is "Many researchers advocate rather than
investigate", a criticism very much along the lines of this article, it would
be interesting to know whether his sources stack up to more robust criticism
if anyone has access to them.

------
dimitar
Is "software engineering" really engineering at all?

It is more like "social engineering" or "financial engineering": more to do
with ingenuity than with engineering as a practice.

Engineering is the application of scientific (mathematical and physical)
knowledge to create something. If you don't do the math, you probably are not
an engineer. There is little room for subjectivity.

So, I guess most programmers (if not all) are software technicians. They have
practical knowledge, but not the proven theory behind it.

So you will have "Fact and folklore in software engineering" until you can
synthesise "something-driven development" using knowledge of mathematics and
science.

Currently you have some ideologies and studies on them.

In control engineering, by contrast, to solve a problem (for example, a
controller regulating the flow of fuel in an engine) you have the following
procedure:

1. You make a model of the open- and closed-loop system. You know what the
processes look like in terms of derivatives, and you can write transfer
functions and state-space representations.

2. You correct the system according to the specification, adding to the
models.

3. Knowing the physical properties of the different parts of the system, you
are left with some choices of implementation, but they have different trade-
offs (physical and economic). You talk with the client and make the
appropriate choice. :-)

So you have heuristics to solve the problems, all built on both practice and
science, and you don't have the engineers split into schools or ideologies
:-)
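For reference, the models named in step 1 are completely standard; in textbook
notation (generic symbols, not tied to the fuel-flow example):

```latex
% Linear state-space model: state x, input u, output y
\dot{x} = A x + B u, \qquad y = C x + D u

% Closed-loop transfer function of plant G(s) under feedback H(s)
T(s) = \frac{G(s)}{1 + G(s)\,H(s)}
```

Each design step manipulates these objects, so two engineers can check each
other's work against the same mathematics.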

I guess it should be the same with programmers.

~~~
elai
Software engineer is a term that can help with visa applications. Programming
feels a lot like engineering, architecture, and shop work combined. A guy with
a machine shop and an idea can create a machine that becomes a very useful
mass-market product even if he never did any mathematics beyond measurement
and arithmetic, while an engineer with lots of math can create a nearly
identical product. Some things you need an engineer's sophistication with
mathematics and physics to create at all (like a bridge), and similarly in
programming, some problems in graphics and the like require sophistication in
mathematics and physics even to attempt, let alone to advance the field.

------
JimboOmega
As the article points out (and I have before), even if it wasn't a myth,
there's no interviewing technique to determine who is at what level.

Also, it should be kept in mind that there is a chance that the original study
was contaminated by bad programmers. The truly bad - those who don't really
know how to write code at all - are going to vastly underperform those of
average competence.

If you had a test that screens out basic competency (an ability to handle CS
101 tasks like assignment, for instance) - how wide would the range be?

Certainly based on one study from the '60s we can't make a good judgment. It's
not even a coding study! And even if it were, it doesn't measure how well
people work in teams, or how well they adapt to new languages, or the myriad
other things that matter in a programmer's career.

Here's a question that deserves study: what interview techniques produce
productive teams? What productivity difference is there between those who are
good at brain teasers (programming ones or otherwise) and those who are bad at
them? What about those given syntax questions? Those who feel like a good
cultural fit? General IQ?

------
pschlump
I looked back at the notes from my last startup. I was CTO, and I had to spend
time doing fundraising, making sales calls, editing documentation and designs,
and going to customer meetings. I was never full time in development. Even so,
I wrote 69 times as much code as the least productive developer and 12 times
as much as our "Lead Programmer". Based on my experience, 10x is a low
estimate. I would say that some people are 100x as productive as others in
developing software.

------
kgo
The article talks a lot about science, and then provides no actual studies
that refute the original claim. That's not how science works. Devise a
hypothesis. Develop a test to confirm or deny it. Run the test.

And then he tears apart the study with strange assumptions. "Well, they were
debugging, not programming." As if no programmers in the real world spend a
significant amount of their time debugging; I'd consider it a significant part
of a programmer's day. I wish I had Code Complete in front of me, because I
know it did a breakdown of developers' time. (The one that said most
developers spend more time walking than reading software books.)

If you're going to refute McConnell or even Peopleware, you've got to do a
study. You can't just say "well, cubes seem more productive than an office."

~~~
salvadors
He's not refuting "X", he's refuting the claim that "studies have shown X". To
do that you don't need to run your own studies and show different results from
X — in fact "X" might even be true. But if the existing studies don't exist,
or can be shown to be flawed, then it's perfectly valid to simply point out
the holes.

Your debugging criticism is somewhat valid, but the context here is that the
original Sackman paper was mostly testing the difference between "online" and
"offline" programming (i.e. programming whilst sitting at a computer or not;
nothing to do with the internet!): see, for example,
<http://dustyvolumes.com/archives/497> and
<http://dustyvolumes.com/archives/500>

The research was on whether debugging whilst sitting at a computer would make
you more lazy, which is an interesting experiment for its time, but isn't
really that connected to the modern concepts of debugging vs programming.

~~~
kgo
But to me that's like saying no study has proven global warming is real.
Because study X only measured ice cores in Antarctica. And study Y only
measured tree rings in the amazon.

A far better refutation would be: "Well, I ran study Z, and it shows..."

(Not that I think the article's claims are as outlandish as global warming
deniers', or that developer productivity has been studied nearly as much as
global warming.)

------
joblessjunkie
The example chosen for the first half of this blog post was rather annoying:
water boils at 100C _by definition_, and thus there has never been any
scientific debate about this.

And yet, his point is made. :-)

------
gacba
I would love to know how a study could be conducted in such a way as to prove
or disprove this notion. Any ideas? Any existing data sets that could reliably
be mined for info?

~~~
Morendil
I wouldn't start with a data set. I would start by stating clearly enough the
question we are looking for an answer to.

Doing that might involve getting clear on the terms. "Productivity" is too
abstract and ambiguous to be used as-is, so we first need to say "here is what
we are going to measure".

~~~
narag
I believe that what people try to determine is not a very precise measure of
productivity, but the fact that _productivity measurement doesn't matter so
much_.

The central message that I remember from _Peopleware_ is that software teams
are more about people than about numbers.

Performance is going to vary, often wildly (often for the same people at
different times). I loved the anecdote of the woman who wasn't very productive
herself, but was a "catalyst", maybe simply a fancy way of saying she was fun
and nice to work with.

Unless you are Google, with the right process to detect the best and the
resources to attract them, your best bet is to treat people well and apply
common sense... as opposed to applying hard metrics and piling on pressure,
demoralizing your people.

------
mkramlich
This is yet another example of something being treated academically that does
not need to be treated that way, and of something being made much more complex
than needed. In real life, in actual practice, any working software engineer,
or heck, even an open source hobbyist programmer, can tell you that talent and
ability and productivity vary significantly between individuals. And this
isn't just true among programmers; it's true in other areas/fields as well.
This is not controversial, or at least it shouldn't be. And yes, some
individuals are clearly 10x or 20x or whatever better than others -- for
whatever reason, and the reason(s) don't matter too much, honestly. It just
is.

