
100x better approach to software? - ColinWright
http://www.johndcook.com/blog/2012/04/23/100x-better-approach-to-software/
======
mistermann
In my relatively short time (~20 years) in the software industry (in the
"enterprise software" corporate development world on the MS platform), it is
my _personal perception_ that it now takes nearly 10 times the LOC and/or
classes and/or complexity in general to implement the same functionality,
using modern "improved" software architecture and design patterns, as it did
writing code the "old" way we did things when I started out.

I'm deliberately very conscious of keeping an open mind and trying to discover
just what it is that I don't "get" about the way things are done these days. I
look at some of the incredibly complex software implementations of really very
simple functionality these days, and for the life of me I just can't
understand why people write things to be so complex.

Now and then I just can't restrain myself and I'll ask "why didn't you just do
it this way" (often with a proof of concept implementation) and the normal
response is blank stares, and a changing of the subject shortly thereafter.
This is not just with individuals, this happens in groups, and no one (so it
seems) feels anything is out of order. Not surprisingly, implementing new
features takes _far_ longer than things used to, at least according to my
perception.

So, unless I am out of my mind, a 10x improvement could _easily_ be realized
(at least in the world of "enterprise" software), just by unadopting some of
the practices adopted in the last decade or so.

~~~
pavel_lishin
Do the added LOC/complexity/classes make it easier to change the software
later? As I understand it, all of those things aren't there to make it easier
to start; they're there to make software easier to maintain.

~~~
loup-vaillant
I'm pretty certain that more code is more difficult to maintain, period.

Before you modify anything, you have to find the relevant piece of code,
understand it, and understand all the dependencies' interfaces (or even part
of their implementations, if the abstractions leak). I can't recall a specific
experimental result, but it seems, well, obvious, that all of the above will
be more difficult if there is more code to examine.

Now you could argue that you could shrink the size of the code that one has to
examine before making a modification by _increasing_ total code size. I
don't believe it. Increased modularity means more code re-use in the first
place, and therefore a smaller program.

~~~
johndcook
Agreed. I imagine most unnecessary code is justified by saying that it will
make the code easier to maintain, even though it usually has the opposite
effect. It pays to be very skeptical of your ability to predict the future.

------
DennisP
Haven't watched Kay's lecture yet, but I'll mention that he and some
colleagues are about five years into a project to write an entire computer
system, including OS, GUI, networking, and programming tools, in 20K lines of
code, using some very innovative techniques. Compare that to the millions of
LOC in other systems and it's easily a two-orders-of-magnitude improvement.

One example: a TCP stack is about 20K LOC in C. In their system it's 160 LOC:
[http://www.itsoftwarecommunity.com/author.asp?section_id=162...](http://www.itsoftwarecommunity.com/author.asp?section_id=1624&doc_id=240682)

Their language allows easy syntax modification within scopes, so they defined
a syntax that matches the IETF ascii diagrams, and simply pasted in the
diagrams from the standards.
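A rough illustration of the idea (a Python toy, nothing like VPRI's actual
OMeta code; the three rows below are copied from RFC 793): treat the
'|'-delimited rows of the ASCII diagram as the specification itself. Each bit
in the diagram occupies two character columns, so the field widths fall out
of the cell lengths:

    HEADER_DIAGRAM = """
    |          Source Port          |       Destination Port        |
    |                        Sequence Number                        |
    |                     Acknowledgment Number                     |
    """

    def fields_from_diagram(diagram):
        # A cell of n characters between '|' separators is (n + 1) // 2
        # bits wide, because each bit spans two character columns.
        fields = []
        for row in diagram.strip().splitlines():
            for cell in row.strip().strip("|").split("|"):
                fields.append((cell.strip(), (len(cell) + 1) // 2))
        return fields

    def decode(data, fields):
        # Slice the leading bits of `data` into the named fields.
        bits = int.from_bytes(data, "big")
        total, out, offset = len(data) * 8, {}, 0
        for name, width in fields:
            shift = total - offset - width
            out[name] = (bits >> shift) & ((1 << width) - 1)
            offset += width
        return out

    # Made-up 12-byte header: ports 49317 -> 80, seq 1, ack 10.
    header = bytes.fromhex("c0a50050000000010000000a")
    print(decode(header, fields_from_diagram(HEADER_DIAGRAM)))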

You can read their papers here: <http://www.vpri.org/html/writings.php>

~~~
tgflynn
But how many lines of code does it take to implement the language that allows
them to implement TCP in 160 LOC ?

~~~
mbrubeck
That language is OMeta. There are several implementations; the latest
JavaScript compiler is just a few hundred lines.
<http://www.tinlizzie.org/ometa/>

The compiler is self-hosting; much of that JS code is actually compiled from
OMeta source. For another example, the Common Lisp version includes fewer than
200 lines of OMeta source and fewer than 300 lines of Lisp source.

OMeta was a very eye-opening language for me because of the great
expressiveness it gets from its simple but novel approach. (I've written code
in Haskell, Prolog, Scheme, OCaml, and plenty of other languages, but OMeta is
not like any of them.) If you are interested in programming languages and/or
parsing, Warth's thesis is a great read.
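For flavor, here is a toy Python sketch of the parsing-expression style that
Warth's thesis generalizes (emphatically not OMeta itself): parsers are
ordinary values you combine, and ordered choice replaces a separate
lexer/grammar split:

    def lit(c):
        # Match one expected item at position i.
        def p(s, i):
            return (c, i + 1) if i < len(s) and s[i] == c else None
        return p

    def seq(*ps):
        # Match each sub-parser in order, collecting results.
        def p(s, i):
            out = []
            for q in ps:
                r = q(s, i)
                if r is None:
                    return None
                v, i = r
                out.append(v)
            return out, i
        return p

    def alt(*ps):
        # Ordered choice: the first sub-parser that matches wins.
        def p(s, i):
            for q in ps:
                r = q(s, i)
                if r is not None:
                    return r
            return None
        return p

    ab_or_ac = alt(seq(lit("a"), lit("b")), seq(lit("a"), lit("c")))
    print(ab_or_ac("ac", 0))  # (['a', 'c'], 2)

Because `lit` only compares items, the same combinators match lists of
objects as readily as strings, which is one of OMeta's central points.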

~~~
CodeMage
Thanks for this! I'm still reading the dissertation and haven't gotten to the
end yet, but so far it strikes me as something like "programming in ANTLR",
so to speak, except that OMeta looks easier to understand and use.

------
tgflynn
It seems to me that a huge amount of developer effort tends to be expended in
mapping data between different data structures as required by different
algorithms or API's.

If a way could be found to automatically discover a set of feasible and
approximately optimal data transforms to meet the requirements of the
operations that need to be performed it seems like very large productivity
gains might be obtained.
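As a tiny sketch of the direction this points in (the schema and field names
here are hypothetical): today we hand-write glue like the `transform` below
for every pair of structures; "discovering" the transforms would mean
generating the `MAPPING` table itself:

    # Target key -> path into the source structure.
    MAPPING = {
        "customer_id": ("user", "id"),
        "full_name":   ("user", "name"),
        "city":        ("user", "address", "city"),
    }

    def transform(source, mapping):
        # Build the target dict by walking each declared path.
        out = {}
        for target_key, path in mapping.items():
            value = source
            for step in path:
                value = value[step]
            out[target_key] = value
        return out

    api_payload = {"user": {"id": 7, "name": "Ada",
                            "address": {"city": "London"}}}
    print(transform(api_payload, MAPPING))
    # {'customer_id': 7, 'full_name': 'Ada', 'city': 'London'}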

~~~
gruseom
_a huge amount of developer effort tends to be expended in mapping data
between different data structures as required by different algorithms or
API's_

This is indeed one of the biggest sources of complexity. A programmer I used
to work with called it "meat grinding".

~~~
mtraven
I call that "plumbing", since it's basically just getting different sizes and
shapes of pipes to connect. But "meat grinding" is good too.

~~~
gruseom
When I hear people say "plumbing" it seems to mean not data conversion but the
internal workings of a system, especially anything infrastructurey. So "meat
grinding" seems more specific. But I like it because it's appropriately
disdainful. Who wants to grind meat all day?

Dostoevsky said that if you want to drive a man insane all you need to do is
sentence him to hard labor moving a pile of dirt from one side of a prison
yard to another – and then back again.

~~~
tptacek
Fitting data in whatever form it has taken at point X in a program into the
form that API Y expects is often more like meat ungrinding.

------
hxa7241
Kay's assertion seems incredible -- he really has to prove it.

It does not square with at least some research and what Brooks most
perspicaciously noted in 'No Silver Bullet' -- that there is essential
complexity in software, and it is the dominant factor. That really seems very
opposed to anything anywhere near 100-fold general improvements.

Looking at the VPRI Steps report of 2011-04, prima facie, overall, they seem
to be producing a 'second-draft' of current practice, not a revolutionary
general technique.

People regularly say they translated their system from language A to language
B and got 10-100 times code-size reductions. This seems essentially what STEPS
is doing: when you look at something again, you can see how to improve it and
do a much better job. But the improvement comes not from the general aspect --
the language B -- but from understanding better the particular project because
you already had a first-draft.

This can still be valuable work though, since it is about general things we
all use. Second-drafts are valuable, and ones of important things even more
so.

But we should probably not infer that they are coming up with a revolutionary
general way to write new software.

Substantially because large-scale software cannot be _designed_: it must be
_evolved_. We only know what to focus a second-draft around because we muddled
together the first draft and it evolved and survived. You cannot write a
second-draft first.

Having said that, if they find _particular_ new abstractions, those could be
very valuable. If, for example, VPRI came up with a neat way to express
implicit concurrency, that might possibly be comparable to
inventing/developing garbage collection in common programming languages --
certainly useful and valuable.

~~~
gruseom
_Kay's assertion seems incredible -- he really has to prove it._

What assertion, exactly? That the personal computing stack can be implemented
in 20K LOC? They've already achieved the bulk of it, so that claim no longer
qualifies as incredible.

Pooh-poohing this staggering achievement because they didn't reinvent personal
computing while they were at it really takes the cake for moving the bar. If
someone had gotten all the world's recorded music on a thumb drive in 1970,
would you have complained that they didn't write the music themselves?

~~~
randallsquared
_What assertion, exactly? That the personal computing stack can be implemented
in 20K LOC?_

No, the assertion from the article that "99% or even 99.9% of the effort that
goes into creating a large software system is not productive."

While it's very interesting and frankly astonishing that they've been able to
reimplement things that are fairly well understood in 100x less code, this is
not at all the same as proving the assertion that these things could have been
originally done in 100x less code. I hope that it's _true_ that they could
have been -- it's exciting to think that it might be true -- but it doesn't
seem to be a claim that STEPS proves, or even tries to, given its explicit
goal of reusing concepts and designs from other OSes.

~~~
gruseom
First, Kay didn't write the article.

Second, sorry, but I'm not buying this response at all. These guys have done
something that all conventional thinking about software complexity would say
was impossible. They've refuted something fundamental about universal
practice. Seems to me the only open-minded response is to be flabbergasted and
ask what else we need to question.

To say (I'm exaggerating how it sounds) "Oh, well, sure, they did _that_
impossible, but this _other_ impossible over here, that's the one that really
matters" just seems like rationalization to me.

Oh and also, it's at least three orders of magnitude, not two.

~~~
randallsquared
The supposition was attributed to Kay:

"Alan Kay speculates in this talk that 99% or even 99.9% of the effort that
goes into creating a large software system is not productive."

This is the claim.

Without in any way diminishing the astonishing accomplishments of the STEPS
team, let's remember that they've spent (are spending) five years to write 20K
lines of code. This doesn't sound like the _level of effort_ is 1000 (or 100,
or possibly even 10) times less for what they've accomplished than it would
have been had they implemented exactly the same functionality in, say, Java.
Would the Java code have been 1000 times bigger? It seems quite likely that it
would have! But the bulkiness of the end product's source code doesn't
determine the amount of effort that went into building it.

Could it be that they spent four and a half years figuring this new system
out, and a few weeks writing the 20K lines of source? It could; I haven't read
most of their papers. If that was the case, then that's what needs to be
pointed out to refute the assumption that they're spending as much or almost
as much effort on the end product (but with astonishingly little source code
required to specify it).

~~~
gruseom
You do have a good point that one should consider not just the code size but
the effort to get there (though let's not forget the effort to maintain and
extend it over time). But are you sure you're framing it correctly? It seems
unfair to compare the effort it takes to pioneer a fundamentally new approach
to the effort it takes to implement a common one. That's like comparing the
guys who have to cut a tunnel through mountain rock to the ones who get to
drive through it later. We have no idea how easy it might get to build new
systems their way once the relevant techniques were absorbed by industry. The
founding project is no basis for predicting that, other than that historically
these things get far easier as the consequences are worked out over time.

What we do know from general research (at least, I've read this many times -
anyone know the source?) is that programmer effort tends to be about the same
per line of code. That's one reason everyone always cites for higher-level
languages being better. Maybe that result doesn't hold up here, but it seems
like a reasonable baseline.

------
Aqua_Geek
> If the same team of developers could rewrite the system from scratch using
> what they learned from the original system?

We're doing this right now at work - completely rewriting one of our apps from
the ground up. It's been an awesome experience to look at what we did wrong in
the old code base and discuss how to fix that with the new one.

I definitely don't advocate doing this on every project, but unless there's
been due diligence (in refactoring, consolidating duplicated code, etc.), I
think we oftentimes reach a point where a rewrite becomes somewhat inevitable
for progress to be made.

A forest fire now and again is a good thing to clear underbrush and replenish
nutrients and whatnot.

------
mickeyp
The issue, as always, is that you trade your "verbose" (by comparison)
imperative programming language for something that lets you specify or model
the behaviour of a system in very few lines of code. Impressive, indeed, until
you realise that knowing _how_ to specify what it is you want is as much a
black art as it is a science. Tools like that are hard to make understandable
enough to a large enough subset of developers that the investment in learning
them fully will pay off.

Back in university we had to specify a minicomputer (PDP-11) in a
metaprogramming language called "Maude" using its complex, Prolog-like
rewriting functionality. Needless to say, the entire computer (CPU, RAM bank,
store) only took up maybe a hundred or so lines, but boy oh boy did it take me
ages of fiddling to get it just right.
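To give a feel for the style (a Python toy, not Maude, and a
three-instruction stack machine rather than anything like a PDP-11): the
machine state is a term, and each case below is in effect a rewrite rule
mapping a matching state to its successor:

    def step(state):
        # Each branch rewrites a matching state into its successor;
        # HALT has no successor, which stops the rewriting.
        op, arg = state["prog"][state["pc"]]
        if op == "PUSH":
            return {**state, "stack": state["stack"] + [arg],
                    "pc": state["pc"] + 1}
        if op == "ADD":
            *rest, a, b = state["stack"]
            return {**state, "stack": rest + [a + b],
                    "pc": state["pc"] + 1}
        if op == "HALT":
            return None

    state = {"prog": [("PUSH", 2), ("PUSH", 3), ("ADD", None),
                      ("HALT", None)],
             "pc": 0, "stack": []}
    while state is not None:
        final, state = state, step(state)
    print(final["stack"])  # [5]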

Languages like that are just too difficult to work with for a lot of things,
and that's setting aside the inductive and provable nature of something as
simple as building a minicomputer out of logic.

------
InclinedPlane
Most coding tends to add a significant amount of accidental complexity. Take
that into account across every layer of development from the requirements and
design phase onward and you get an exponential explosion of complexity, and
thus an explosion of code size, of bugs, etc. More so when you consider that
the typical method of dealing with poor or leaky abstractions is to add
another layer of abstraction.

------
ced
From Alan Kay's interview:

 _The problem with the Cs, as you probably know if you’ve fooled around in
detail with them, is that they’re not quite kosher as far as their arithmetic
is concerned. They are supposed to be, but they’re not quite up to the IEEE
standards._

Does anyone know what he's referring to?

~~~
wmf
I wouldn't be surprised if he's talking about integer overflow, which has
bitten people in surprising ways for decades.
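A toy illustration of that kind of surprise (assuming integer overflow is
indeed what Kay means; Python's integers are arbitrary-precision, which makes
the contrast with C easy to show):

    def c_int32(x):
        # The value a 32-bit two's-complement int would hold after wrapping.
        x &= 0xFFFFFFFF
        return x - 0x100000000 if x >= 0x80000000 else x

    a = b = 2_000_000_000
    print(a + b)           # 4000000000 (exact)
    print(c_int32(a + b))  # -294967296 (what a wrapped C int reports)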

------
AdrianRossouw
Adding more programmers doesn't make it faster to develop things. It just
means more time in meetings.

~~~
me_again
Some software systems are too large for one person, however talented, to
write. There's a reason not every line in the Linux kernel is written by Mr
Torvalds.

~~~
johnpmayer
But that's maybe not the best example; wasn't it originally just him? I think
GNU+Linux might better demonstrate what you're trying to say.

~~~
me_again
It was originally, but I don't think it's controversial to say it wouldn't be
in its current state as a 1-man project.

~~~
aaronblohowiak
I believe Linus has said that lately, he doesn't write code so much as review
and discuss patches.

------
alexchamberlain
Philosophical question: Is time in meetings wasted?

~~~
scott_w
Short answer: no.

Long answer: it depends.

Some meetings are a complete waste of time, but they're one way of bringing
people's thinking closer together.

Meetings probably don't get everyone on the same page, but if they can get
people reading from the same book, it was probably worthwhile.

As an example, in our organisation, we use meetings as a way of getting
information from development to our sales/support teams and vice-versa (what
features are selling, where are the pain-points).

Whoever is chairing the meeting will make an effort to excuse people who don't
need to be there, e.g. if a meeting moves into technical discussion, we may
ask the sales representatives if they'd like to leave rather than sit through
it bored.

------
its_so_on
Actually, there used to be an approach to software 100x better than the one in
this blog post. Yes, that meant 10,000x better software.

All you needed to do was break down what you were doing into steps, and pipe
one step to the other with this character: |

Unfortunately, one thing led to another and, well, we are 10,000 times slower
now than then.

    whats your email address | confirm your email | get a link | pay with my payment processor | get my service

Hahaha, setting this up today takes 8 hours. Approximately 10,000 times slower
than it should be. Actually, someone who can set up a complete billing
solution in a day is considered a hero.

Amazon is the only company that even comes close to doing for the web what we
had twenty years ago for local sysadmin tasks. Only, one lets you manage loads
of complicated files (kind of cool), while the other lets you provide a
service to hundreds or thousands of people (kind of _awesome_).

Where did we go wrong?

~~~
its_so_on
I wish someone would reply instead of just downvoting, so I can at least see
if I've made myself clear.

It used to be that for most tasks, a superuser, a real ace of an admin, would
NOT write new programs: instead they would stitch together old ones.

I wish I could find it just now, but there was an experiment done among
various programming groups completing the same task. You had people doing it
in various scripting languages, C++, Java, whatever.

The way I recall it (again, I'm having trouble finding this), the team or
person/approach that won handily was the one using Unix from the standard
command-line interface (e.g. bash) - no scripting or programming at all!
Instead of writing a program to do it all, the person or group simply used
standard Unix tools, piping them together etc., until the problem was solved.
This approach was by far the fastest.

I'm saying that these days on the web we don't really have the same thing when
you develop a new application. We don't have a "Unix of the web" - though,
again, AWS and Amazon's ambitions on e.g. payment, database, etc., seem to be
vaguely in that direction - and that approach is far more productive than
writing a piece of software.

No matter how productive -- HOW PRODUCTIVE! -- you are at writing a script to
bill a user ten dollars, you can never -- NEVER! -- be as fast as typing "|
bill 10.00" where "bill" takes an email address on standard input. That Unix
program does not exist.

The way the web is developing, it does not look like it will exist. This was
my point. I guarantee you that typing "| bill 10.00" is nearly ten thousand
times as fast as writing any program in any programming language that does
that.
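For what it's worth, the hypothetical `bill` filter is easy to sketch as a
Python script (the `charge` function is a made-up stand-in for a real payment
API; no such standard tool exists, which is exactly my point):

    #!/usr/bin/env python3
    # bill: read one email address per line on stdin, charge each one,
    # and echo the address to stdout for further pipeline stages.
    # Usage: ... | bill 10.00 | ...
    import sys

    def charge(email, amount):
        raise NotImplementedError("wire up your payment processor here")

    def main():
        amount = float(sys.argv[1])
        for line in sys.stdin:
            email = line.strip()
            if email:
                charge(email, amount)
                print(email)

    if __name__ == "__main__":
        main()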

Unix works because someone took the time to write programs that can be
stitched together at the command prompt (or from a script). The Internet just
doesn't work that way.

The blog post I'm referring to ends with: "Is that just the way things must
be, that geniuses are in short supply and ordinary intelligence doesn’t scale
efficiently? How much better could we do with available talent if we took a
better approach to developing software?"

I say, the problem is that the geniuses are no longer creating the "Unix
programs" of the web. They are writing software, i.e. lines of code, they are
not writing web "utilities".

If all the geniuses got together and took the top thousand things you need an
API for, and instead made them a set of small, modular Unix commands, then I
would be literally ten thousand times more productive than I am now.

I could literally do in 20 seconds of typing what I can do in two days.

~~~
jbooth
You know, I like grep and sed and all that too, but what the hell are you
talking about here?

The bill command just takes an email on stdin? How's it know which account
that email belongs to? From a database? With what credentials? Does it bill to
paypal or visa? To which merchant account number?

The thing with "unix style programming" fetishism is that, yeah, pipes are
great, but now you're writing incredibly complicated options parsers to
configure all your little standalone programs. Isn't there a point at which a
simple method call is easier? We've had method calls for a long time.

The reason you were downvoted (not by me) is probably that people thought this
was obvious and you were being obtuse and ideological.

------
PaulHoule
100x better is a bit much. 10x better, maybe.

~~~
skarayan
In the large projects that I have seen, when I think about how much of the
work being done is the core product vs frameworky stuff and/or integration, I
think 100-1000x is more accurate.

The pure business logic constitutes a very small part in comparison, but the
type of environment/company also matters. Startups tend to have less fluff
than large enterprises.

