We Who Value Simplicity Have Built Incomprehensible Machines (dadgum.com)
220 points by prajjwal on May 19, 2012 | 87 comments

I have spent the last two years creating a desktop GUI application more-or-less from the ground up. I used a bit of existing code but most of the choices were mine.

It was only "evolutionary" in the sense that I don't necessarily know everything I was thinking last month. But that is easily enough to give the system many of the irrational qualities that I once cursed other people's systems for.

I also never had any deadline pressure. The whole system needs to get done but there's no other time line pressure. I can and do stop to rewrite subsystems that seem bad. I can and do write all the tests I would like to.

It's amazing to me how little any of this has helped with the fundamental problems. The code is better, and the parts of the system that need to be optimized are optimized, but the screaming at the code is still there. The "how did things reach this state?" feeling is still there.

So what seems like the point of the article, that massive complexity happens through many small, unintended consequences, really rings true.

> massive complexity happens through many small, unintended consequences

Accidental complexity.

But all the little bits of complexity ... they slowly accreted until it all got out of control, and we got comfortable with systems that were impossible to understand.

I'm convinced that all systems of any significant complexity can only be created in evolutionary steps, and that process always leaves vestigial traces behind. The most complex computing environments always have an almost organic quality.

I strongly recommend Errol Morris' film "Fast, Cheap & Out of Control" for insight into the paradox that order seems to somehow magically emerge from chaos.

Thanks for the film recommendation. In a similar vein, I'd recommend Kevin Kelly's book, Out of Control: The New Biology of Machines, Social Systems, & the Economic World, which deals a lot with complexity of large systems and their parallels to biology and evolution.

I do think it's possible to create a system of "significant complexity" in a clean and not so evolutionary way, but only after creating a version or two that were developed incrementally, which necessarily end up with a lot of cruft.

Related, but not completely on topic:

"All watched over by machines of loving grace" by Adam Curtis.

"Bumblebee Economics" by Bernd Heinrich

There is a great presentation [1] by Rich Hickey about simplicity. He makes a point of keeping the words "simple" and "easy" semantically separate. This post is a wonderful example of why that makes a lot of sense. None of these examples really shows anything that's decidedly simple. They're all about ease of use and backward compatibility (which is a particular case of ease of use).

It seems like "easy" can be further subdivided into several useful and objective categories. Clearly, libpng was only concerned with some aspects of "easy" (portability) and not others.

Also, I absolutely hate the kind of fatalism you often see in SE articles. "Oh, gosh, nothing is really simple, nothing really is bug-free, nothing is really good, so you shouldn't even try."

[1] http://www.infoq.com/presentations/Simple-Made-Easy

I don't think the average geek values simplicity all that much. We value power, flexibility, versatility. We value openness, interoperability, upgradeability. Simplicity? It's a distant 7th or 8th on the scale. That's why computers are the way they are, and we needed a company that really did value simplicity to create a new paradigm where users don't have to deal with libpng, winsock, antiviruses and other artifacts of the complexity that we, geeks, have wrought upon the world.

Edit: I'd go as far as to suggest that this desire for simplicity is a relatively new phenomenon that is closely linked to the rise of the New Apple... I personally didn't give a damn about simplicity (or didn't know that I did) before I got myself a Mac in 2007 or so. Since then, I've progressively raised simplicity as a top priority. Some friends of mine who are hardcore geeks too and never moved to Mac still don't care about simplicity. They also rarely use the slew of web 2.0 "simple" SaaS apps that have evolved over the last few years... They're happy with complexity, it seems (as I used to be).

I think you're confusing two kinds of 'simplicity'. 'Apple' simplicity is, in my eyes, about cutting away what is not needed and unifying everything else. What the author is talking about is the simplicity that lies at the roots of the UNIX philosophy: having many simple tools that can be composed into something more complex.

Perhaps my view of the world is twisted by the kind of programmers I know (mostly CS majors), but almost everyone I know working on something bigger than 100 lines of code has simplicity high on their list of priorities. It makes everything easier to maintain, read, and write.

Another problem is that simplicity is inherently subjective. Almost no one thinks their own code is overly complicated; that is an attribute to be applied to other people's code. People pay lip service to the idea, and will sometimes claim their code is overly complex in an almost self-deprecating way, but not much serious effort is put forth on fixing or avoiding it. Comments disparaging others' failure to "get" a design, or suggestions to RTFM -- or worse, in open source, RTFC -- abound, however.

This is because that's what happens in engineering. Even simple things like shovels have evolved: the blade has been shaped to cut into the dirt, and it's got a squiggly bend so it's balanced when holding dirt. Stepping up a bit further, we have things like clocks. Sundials don't work in the dark, so we made mechanical clocks, which of course have bits of complexity to make them easier to read and more accurate.

Now consider computers: they're blindingly fast. We really can't begin to comprehend how much the state of a CPU changes in the blink of an eye, and the same goes for memory. If we had to track that many items in real life, we'd fail hopelessly. To compensate for this power, we created management systems. Of course, the management systems are themselves fast and large, so we have management systems for those too. The very power of computers makes them complex. All we've been doing is desperately trying to simplify them. Complaining that those simplifications are complicated ignores just how complicated things would be if we didn't have them.

I think the point is that when designing clocks we didn't just attach some gears, a timing mechanism, and some hands to a sundial; we threw the sundial out. Computers are complex because we hardly ever remove anything, and we still bear the cost of things aeons after they're relevant.

Yeah but clocks generally don't interface with other technology. Trailer hitches and screw drivers do. There is a whole slew of different hitch adapters. And, most of us have or at least have seen screw drivers with a set of attachable bits.

Gall's Law is relevant here. http://en.wikipedia.org/wiki/Galls_law

  A complex system that works is invariably found to have
  evolved from a simple system that worked. The inverse
  proposition also appears to be true: A complex system 
  designed from scratch never works and cannot be made to 
  work. You have to start over, beginning with a working 
  simple system.

The book that comes from (originally Systemantics, now The Systems Bible), is well worth reading, a blend of wise and hilarious. It's one of those books you keep lending to people and then wonder where the hell your copy went.

What the author is describing is the "idealist" camp's perspective, as Prophet Joel wrote thus eons ago:


x86 has always been a demonstration of how pragmatism can beat idealism. If they'd been sidetracked by idealistic "make it pretty" thinking long ago, maybe someone else would have won the PC processor war.

There is a place for both types of thinking. Rather than fighting blindly for one camp, we should know when both are appropriate. Sometimes you're changing major versions and can afford to break compatibility. Sometimes you need to swallow it and leave the ugliness there.

Unfortunately, worse is better. Sigh. Here's to VPRI/STEPS changing things sometime in the next 30 years, as much of a longshot that is.

Well, that's the problem, isn't it? It's up to Alan Kay, who initially brought us the insights to help us out of this mess. The brightest minds are working at Color and other ways to make cat photo sharing easier. And that's what's getting funded. It's embarrassing. And we have only ourselves and our greed to blame.

I like this essay.

In the move from '286 to '386 through '486 there were many messy kludges (addressable memory; 16 bit to 32 bit instruction set; etc etc). I felt that 'we' should just stop the relentless advance of this weird architecture and start with something more sane.

But refinement through iteration is incredibly powerful. (See Kaizen for business processes and quality control) Sometimes you get left with weird bodges; the digital equivalent of the human appendix dangling with no purpose but only occasionally poisoning the host. But mostly the benefits of forward motion outweigh the disadvantages.

There are projects where people stop and think about "how to do this right" -- almost always they are overtaken by people who build fast and break things. (See, I guess, various microkernels.)

Let's not forget that Intel did attempt to pull an Apple with the Itanium as the 64 bit successor to x86. However, AMD stepped in with a backwards-compatible 64 bit architecture, and the rest is history. I suppose this would have happened one way or another; the beast that is the installed base does rather like to stay alive.

Linux kernel devs openly spoke of their dislike for Itanium. There are many recorded instances of this.

How is breaking backward compatibility simpler? If you don't support legacy commands and options you are forcing countless programs to be rewritten. Many aren't even actively maintained any more and businesses will avoid upgrading to your new 'simple' platform due to the huge cost, complexity, and risk of the rewrite project not succeeding, like many software projects. That's why OpenGL has so many old methods and commands, they are used by old, expensive CAD programs.

Speaking of OpenGL, the more efficient memory routines he claims are worthless have probably saved me months of optimization work programming real-time games. If you don't have a sufficient frame rate for a good experience, you end up doing enormous amounts of optimization.

The Hitchhiker's Guide to the Galaxy, in a moment of reasoned lucidity which is almost unique among its current tally of five million, nine hundred and seventy-three thousand, five hundred and nine pages, says of the Sirius Cybernetics Corporation products that "it is very easy to be blinded to the essential uselessness of them by the sense of achievement you get from getting them to work at all.

"In other words --- and this is the rock-solid principle on which the whole of the Corporation's Galaxywide success is founded --- their fundamental design flaws are completely hidden by their superficial design flaws."

(Douglas Adams, So Long, and Thanks for All the Fish)

What insight does the author have to offer with this essay? It's easy to complain about complexity, harder to offer a simpler but equally capable alternative.

The author makes it sound like simple is easy. As if it's just a matter of saying no to complexity, like saying no to memcpy() whenever we have a memmove() that's good enough.

This is not the case. Simple is not easy. On the contrary, simple is hard. So you think "ls" has too many command-line flags; how are you going to cut the number in half? Which ones do we keep and which ones get axed? How are you going to provide for all the same use cases that those 35 flags were added for?

I'm not saying it's impossible and in fact I think it's a very worthwhile and noble cause. But it's long, tedious, hard work, and I say this as someone who has a nearly obsessive devotion to writing small and simple software.

Let me tell you what it's like to design small and simple software that is practical enough to actually use. The most important part by far is to leave out anything you possibly can. If there is a feature or abstraction that could be implemented on top of your software with equal efficiency as what you could do by implementing it internally, LEAVE IT OUT. The feature that seems so cool to you now is going to get in someone's way later on. Your attempts to be "helpful" are going to be the next generation's bloat.

This is where I think the famous Worse Is Better essay (http://www.jwz.org/doc/worse-is-better.html) goes wrong. To the MIT guy the "right thing" is for the system to implement complex recovery logic to hide from the user that a system routine was interrupted. And why? So the user doesn't have to check and retry themselves? The code to do this is trivial and can trivially be implemented in a library that everyone can then share.
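That check-and-retry library routine is easy to sketch. A minimal version for the classic interrupted-system-call case (POSIX read() returning -1 with errno set to EINTR); the helper name is my own:

```cpp
#include <cerrno>
#include <unistd.h>

// Retry a read() that was interrupted by signal delivery (EINTR),
// so no caller ever has to write the check-and-retry loop again.
ssize_t read_retry(int fd, void* buf, size_t count) {
    ssize_t n;
    do {
        n = read(fd, buf, count);
    } while (n == -1 && errno == EINTR);
    return n;
}
```

Put this in a shared library and the "right thing" lives one layer up, without complicating the kernel interface.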

The key to long-term simplicity is LAYERING. The lowest layers should implement a very general and reusable abstraction, and be as absolutely non-opinionated as possible. Layers on top can tend more towards policy. If a higher layer is too opinionated, a user can always drop down to a lower layer.

I haven't actually used libpng, but from a glance at the sample code that the author is complaining about, I'm inclined to say that the author's complaint is misdirected. It looks like libpng is a well-designed, though very low-level, API. Using low-level APIs often requires a lot of steps; this is because the overall task is broken down into its most primitive steps. Unless some of these steps are redundant, this does not mean that the software is too complex. Rather, it means that the author would prefer a high-level API instead of the low-level one. But don't demand that the lower-level API be hidden in the name of "simplicity" -- that will make someone else's life harder when their use case is not what the designers of the high-level API had in mind.
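To make the layering point concrete, here's a minimal sketch (a toy format and invented names, nothing to do with libpng's actual interface): a low-level layer that exposes each step, and an opinionated one-call layer built entirely on top of it.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Low-level layer: explicit steps over a toy "image" format (invented
// for illustration): byte 0 = width, byte 1 = height, then the pixels.
struct RawDecoder {
    std::size_t width = 0, height = 0;
    std::vector<std::uint8_t> pixels;

    bool read_header(const std::vector<std::uint8_t>& data) {
        if (data.size() < 2) return false;
        width  = data[0];
        height = data[1];
        return true;
    }
    bool read_pixels(const std::vector<std::uint8_t>& data) {
        if (data.size() < 2 + width * height) return false;
        pixels.assign(data.begin() + 2,
                      data.begin() + 2 + static_cast<std::ptrdiff_t>(width * height));
        return true;
    }
};

// High-level layer: one opinionated call, built entirely on the layer
// below. Users with unusual needs drop down a layer instead of fighting
// a hidden one.
std::optional<RawDecoder> decode_image(const std::vector<std::uint8_t>& data) {
    RawDecoder d;
    if (!d.read_header(data) || !d.read_pixels(data)) return std::nullopt;
    return d;
}
```

The convenience layer costs a few lines; hiding the step-by-step layer would cost every unusual user their escape hatch.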

I don't disagree, but these points need highlighting:

> It's easy to complain about complexity, harder to offer a simpler but equally capable alternative.

Want real stuff? Check out VPRI's work: http://vpri.org/html/work/ifnct.htm Right now, they're working on a 20KLOC OS (including desktop publishing, messaging, and the whole compilation chain). That's about 4 orders of magnitude smaller than current systems. Here is their last progress report on the STEPS project: http://www.vpri.org/pdf/tr2011004_steps11.pdf

> The most important part by far is to leave out anything you possibly can.

This is also the most overlooked part by far. I deal with something similar at my day job: I noticed that I am relatively poor at dealing with masses of complexity. My co-workers fare better. On the other hand, they don't mind small unjustified complexities. The latest bit I saw was this (in C++):

  class Foo {
  public:
    out_t1 bar(in_t1 in);
    out_t2 baz(in_t2 in);

  private:
    static mem data;
  };

  // example
  Foo foo;
  out = foo.bar(in);
As it turned out, this "class" didn't have any state whatsoever. There was `data` of course, but it was never modified after its first initialisation at program launch. I devised a simpler interface (which happens to be backward compatible):

  class Foo {
  public:
    static out_t1 bar(in_t1 in);
    static out_t2 baz(in_t2 in);
  };

  // example:
  out = Foo::bar(in);
It's not much, so they say it's no big deal, and act as if it does not count. But it adds up quickly, often to the point where I can reduce the line count of a procedure by half, without even understanding the code! (I know how to do correctness preserving modifications.)

I've seen these VPRI links before. I've read through that PDF and various web pages, but I still have a poor understanding of what exactly they are doing.

Is it possible to summarize in a paragraph how they are able to achieve this LOC reduction? Is it simply that systems like Linux et al. have been cobbled together by many hands over many years while VPRI has a single vision? Are there coding techniques I can use today in my own work? The only example I've found was a mention of a networking stack that was able to make use of a parsing engine, rather than implementing a custom networking-specific engine. This sounds like nothing more (not to trivialize the task) than choosing the correct parts to turn into reusable libraries and then reusing them.

> they don't mind small unjustified complexities

This is something I've noticed as well. Complex code of my own creation grates on my nerves until I am able to erode it and smooth it out. I get the impression that not everyone feels this way.

The other comments here already explain how they do it at a high level; here I'll try to explain it at a slightly more technical level.

They use DSLs very heavily. The heart of their system is OMeta, a high-level pattern matching and transformation language. It works both on flat data (in which case it acts as a parser, transforming flat text into structured data) and on structured data, transforming it into different structured data. The compilers for all their DSLs, including the OMeta compiler itself, are written in OMeta.

All these compilers are very short (very roughly 200 lines each?). Because they can define new languages with so little code, they pretty much have a different language for each problem they want to solve. For example for the graphics rendering they define a concise data parallel language, they have a low level intermediate language, and a couple of others (e.g. Maru (Lisp like) and Ants (for implementing WYSIWYG editing) and amusingly the TCP RFC).
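For a rough sense of scale (a toy illustration, not OMeta): a complete evaluator for a tiny arithmetic language fits in about 25 lines of C++. A pattern language that makes grammars this cheap to write is what lets each of their DSL compilers stay that small.

```cpp
#include <cstddef>
#include <string>

// Toy recursive-descent evaluator for single-digit arithmetic with
// '+', '*' and parentheses -- a complete little "language" in ~25 lines.
struct Tiny {
    std::string src;
    std::size_t pos;

    int factor() {                 // factor = '(' expr ')' | digit
        if (src[pos] == '(') {
            ++pos;                 // consume '('
            int v = expr();
            ++pos;                 // consume ')'
            return v;
        }
        return src[pos++] - '0';
    }
    int term() {                   // term = factor ('*' factor)*
        int v = factor();
        while (pos < src.size() && src[pos] == '*') { ++pos; v *= factor(); }
        return v;
    }
    int expr() {                   // expr = term ('+' term)*
        int v = term();
        while (pos < src.size() && src[pos] == '+') { ++pos; v += term(); }
        return v;
    }
};

int eval(const std::string& s) {
    Tiny t{s, 0};
    return t.expr();
}
```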

Of course DSLs are not enough. If you wanted to make an exact copy of Ubuntu or Windows then no matter how good your DSLs, you're not going to make it in 20k lines. So they simplify the personal computing stack a lot. For example, they don't have separate programs for document editing (MS Word), presentations (PowerPoint), internet browsing (NOT the web -- their own format instead of HTML -- you're never going to be able to implement the traditional web stack in 20k lines), email, spreadsheets, etc. Instead they unify all of this in their universal documents. So they don't try to copy; they try to get something that's functionally equivalent. The result is pretty powerful and potentially more useful: you can put a spreadsheet table in a document, in an email, on an internet page, etc. When they extend the universal document with a new feature (for example graphs, or math typesetting) then all these individual uses benefit, instead of each feature being done N times as we do now.

Links to source code repos can be found at http://www.vpri.org/vp_wiki/index.php/Main_Page

I count 3 "miracles". In ascending order of awesomeness, these would be:

(1) No feature creep. They provide essential functionality, little more. I say that's a good thing, because if you miss functionality, the system lets you build it relatively easily. My guess is, it explains about 1 order of magnitude (it divides code volume by 10).

(2) Factor everything. Again not very impressive, but it goes a long way. Just good engineering principles applied systematically across the system. For instance, they have one graphic stack, which draws everything in every program, including the window manager. They also have one document type. A side effect of this approach is a greatly increased integration. My guess is, it explains about 1 order of magnitude.

(Warning: the divide between (1) and (2) is somewhat arbitrary. In some ways, the STEPS system provides more functionality than current systems. I think the trick is to stop thinking in terms of features and start thinking in terms of capabilities. For instance, UNIX doesn't have a feature to sort Apache log files by IP address. But it has `sort` and `awk`, and therefore it has the capability.)

(3) Build the right tools. The tools being mostly domain-specific programming languages. And that is the impressive part. Their languages are so expressive and simple that they can reduce code volume by two orders of magnitude without breaking a sweat. I don't know exactly how they came up with such languages, yet they did. The three core languages that may interest you here are OMeta http://www.tinlizzie.org/ometa/, Maru http://piumarta.com/software/maru/, and Nile. OMeta is like Lex+Yacc, only much simpler and more general (it lets you transform trees, and flatten them, which makes it suitable for all compilation stages). Maru is a not-so-inefficient self-implementing language, based on 2 cores (lambdas and objects) which can implement each other.

So, 1+1+2 = 4 orders of magnitude. Or 10 × 10 × 100 = 10,000. I'm oversimplifying of course, but I think this gives an idea.

> Is it possible to summarize in a paragraph how they are able to achieve this LOC reduction?

I think it hangs on two principles:

1) more expressive tools at the price of performance.

2) use existing data as much as possible, but only complying with standards where really necessary.

1) is using e.g. OMeta & friends, which make it possible to (e.g.) implement a reasonable JS engine on top of any reasonable dynamic runtime in 100-200 lines or so.

2) is using e.g. the RFCs that define IP, TCP and UDP as input to a processor -- thereby getting a guaranteed-to-match-specification implementation, without having to repeat any of the structs/constants/details. The RFCs do not count among their 20K lines -- they are probably 3K lines themselves. The parser that parses an RFC into data structures and some executable code IS counted in those 20K -- but, last I looked, it was less than 100 lines.

1) But as it turned out, the cost wasn't so high after all. (They didn't expect their graphic stack to run fast enough in a laptop, yet it does.)

2) What, they do not count the RFC they parse? I'd count that as cheating: if the RFC is part of the meaning of the system, it should be taken into account.

> What, they do not count the RFC they parse?

No, they don't.

> I'd count that as cheating: if the RFC is part of the meaning of the system, it should be taken into account.

I don't think it is cheating. The RFC text is essentially arbitrary, and "god-given" as far they are concerned -- and must only be adhered to for compliance with other systems. (If they designed their own network protocols, I would have agreed with you). By parsing it and generating code, they guarantee that the code conforms with the spec (which is more than you can say for any other network stack implementation!).

And it's all down to what they are trying to achieve: They are NOT trying to minimize LOC -- if they wanted to do that, they'd build everything in obfuscated APL or J.

They want to get to a state in which a person can understand / audit the entire software stack of a system they are running. 20k was selected as what a person can completely follow and understand in a reasonable amount of time (weeks, not years or centuries).

You cannot (average human "you") understand/audit a TCP/IP stack without reading the RFCs; therefore, they don't think they need to double-count it, and neither do I. Furthermore, if tomorrow the world switches to curveCP, they wouldn't include the related RFCs either -- only the code they had to write to make it work.

Aside from the things described by the other commenters, a big thing they're doing is accepting major performance hits that are no longer relevant. xterm on this netbook renders text onto the screen by copying prerendered glyph rectangles loaded by the X server from PCF files into the framebuffer, because that's how you had to do it in 1984, because you were on a barely-1-MIPS workstation, and if you tried to do something fancier at the time (including masked copies, so you don't need to have a character in your font for á, but can overstrike a ́ and an a) xterm would have displayed text noticeably slowly. But this level of optimization is kind of a waste on this 1000-MIPS netbook, let alone a 30,000-MIPS desktop machine. And it costs a lot of code complexity.

By contrast, STEPS treats each letter as a polygon or group of polygons, and rasterizes your polygons when it's time to draw them. So, among other things, you can rotate, scale, and color your text just like any other object, and the amount of code devoted to text rendering is tiny. The performance cost may be one to three orders of magnitude in that case, but that's acceptable for an interactive windowing system.

> Want real stuff? Check out VPRI's work

I didn't mean to make it sound like simplicity is not possible. Quite the contrary, I believe it is possible and extremely worthwhile, and it's something I have dedicated an enormous amount of time to in my own work.

All I'm saying is that the essay was not worthwhile to me, because it offered no insight that helps us get there.

> It's not much, so they say it's no big deal, and act as if it does not count.

I agree that your version is far superior, and that it is a big deal. If I saw code that constructed a class but the class didn't have any state, it would interrupt my flow while I stopped to figure out wtf is going on.

I didn't mean to imply that you meant what you didn't mean… That was poor phrasing on my part. Actually I agree with you. Saying that something sucks is worthwhile only when few know it. When everyone does, we ought to take the next step, and propose solutions (or at least analyse the problem more deeply than it has been). I was just saying "there! There! A solution!!".

The failure isn't in that we're adding too much complexity (we have to be able to create complex structures) or that we're not layering it enough (leaky abstractions build up). It's a general, and much more far-reaching, problem: our abstraction mechanisms are failing us. We need to stop complaining and put more effort into research. There are four or five projects working on this and, literally, hundreds of photo-sharing startups.

"our abstraction mechanisms are failing us... There are four or five projects working on this and, literally, hundreds of photo-sharing startups."

But it's the same thing! It's the same problem.

And IMHO in the technical places it's getting worse every day.

The reason some photo sharing apps are bought for a billion dollars is the same as why lots of people use PHP even though "it sucks": user experience and ease of use.

You don't need to limit yourself to a high-level abstraction (like GNOME) or expose all the "under the hood" controls like you're flying an aircraft.

Complexity has to be managed and split into stages. C'mon, even VIM has a smooth learning curve (and not a learning abyss, or being shut out)

And stop the very prevalent mentality in technical circles that "if you're technical you have to read the manual". User experience has a place even when editing .conf files or using a command-line tool, but some people go out of their way to make things harder, just to keep them accessible only "to the true believers".

Yes, everything is a mixture of complexity. When an artist paints, she has to decide how simple or complex to make the work. But what we're talking about here is that once you pick up a PHP brush, you're screwed. That's why it sucks. You can't change brushes; you have a big fat crayon. And while that works for filling in the sky, you can't get any detail on the faces, for example. (Choose a fine brush and you have a different problem: spending all your time on the sky.)

What I'm talking about when I say "our abstraction mechanisms are failing us" is the need to use any size brush at any time.

And, as an aside, photo sharing apps are bought for billions for the same reason that tulips used to be worth a fortune. Uploading pictures of your cat is about as valuable as a 'Carnaval de Nice'.

Any links to such projects?

VPRI's STEPS project is one such project: http://www.vpri.org/pdf/tr2011004_steps11.pdf

Btw, Instagram is an example of an opinionated simple interface. You might be underestimating the complexity of creating such an interface.

In my reading I certainly saw insight; I think you're not pulling your 'world view' back far enough when reading it, and looking too closely at the examples.

I think his point is not that "ls" needs less flags, it's that it needs no flags.

Or with libpng, it's not a question of low-level versus high-level; it's more that there shouldn't even be a concept of low-level.

And the other thing he seems to be saying is that backward compatibility should be abandoned to allow libraries to actually mature instead of just bloat.

It was certainly food for thought to me. And you even seem to agree in the paragraph of Let me tell you what it's like to design small and simple software...

"I think his point is not that "ls" needs less flags, it's that it needs no flags."

Then don't use the flags. The default bare 'ls <wildcard>' is sane 99% of the time for me, and I consider myself a junior power user.

same here. in fact i'd call it a power law--90% `ls`, 9% `ls -l`, .9% `ls -ltr`, .09% `ls -lhSr`, .009% `ls -d` .0009% `ls -i`, etc.

which suggests the default behavior, and arrangement of the flags, is very well designed.

(if i could change one thing, it'd be to make all the sorts ascending by default....)

To add to your point: physics shows that it is possible to describe complex phenomena using simple models. But discovering such models is hard.

I entirely agree with you - but I think the hardest thing is SPLITTING THINGS UP into coherent units that compose the lower-level API you write about. The problem is that we can divide things in many possible ways - what makes one division better than another? This is hard; layering is the easier part (but still far from obvious: http://perlalchemy.blogspot.com/2012/04/breaking-problems-do...).

Dude, goddamnit. This is a startup forum. People complain about shit being hard, and it's our job to make it simpler. Even if it's the temple itself.

I believe it's possible, and worth doing. Moreover, I have spent several years of my life trying to do just that in the space of network parsers and serializers!

All I was saying is that it's not easy or straightforward to achieve, and that the author's essay doesn't offer anything that helps us get there.

Yeah sure you can keep layering till you get 47 different layers (some in parallel).... aka

"X Windows"

Layering does come at a cost: speed, and the various hacks that have to reach below the layer. Partly because of X Windows, Linux still has one of the slower and more fragmented graphics shells.

>The author makes it sound like simple is easy. As if it's just a matter of saying no to complexity, like saying no to memcpy() whenever we have a memmove() that's good enough. This is not the case. Simple is not easy. On the contrary, simple is hard.

Simple by design might be hard, but "simplify in retrospect" is easy. It just needs you to be willing to sacrifice meaningless backwards compatibility and inflict some temporary pain (rewriting stuff dependent on it) for the benefit of tomorrow.

>So you think "ls" has too many command-line flags; how are you going to cut the number in half? Which ones do we keep and which ones get axed? How are you going to provide for all the same use cases that those 35 flags were added for?

You don't. Fuck ALL those use cases. Keep only those that are used a large percentage of the time by the large majority. For those that need the extra juice, let them use the old, convoluted version. You don't have to have "ls" be a be-all-end-all. Have both ls AND a power-version if anyone cares to maintain the second.

And sure as hell, don't keep competing or duplicated flags just for backwards compatibility.

> You don't. Fuck ALL those use cases. Keep only those that are used a large percentage of the time by the large majority.

Disagree completely. Because if you do that, the moment someone comes along and needs something that you're unwilling to provide and can't be layered on top of your work, they're going to make their own "ls" that overlaps a lot but satisfies their need. If we have 15 different "ls" implementations, we're back to more overall complexity.

Basically if you want to write code that lives at the bottom of the stack, you have to think of the most crazy hard-core shit that anyone might want to do with your code and how you can accommodate that.

Two fantastic examples of software that achieve this in spades: Lua and zlib. Both let you plug in your own memory allocator instead of malloc() if you really need to. Both let you perform your own I/O if you really need to.

Heartily agree on Lua. Embedding Lua was one of the best library experiences I've ever had. The implementation and language are small and simple, and I think many people undervalue that. Compiling it was as simple as dragging the source files into Xcode (not an ideal long-term solution, but quick and easy to compile for any target, and at 10k lines of code, compile times are reasonable). As a comparison, I spent 3 days trying to compile SpiderMonkey for ARM (with eventual success attributed mostly to finding a useful patch on GitHub).

"crazy hard-core shit"

This is the problem. When basic understanding of a system (e.g. I/O) is viewed as "crazy hard-core shit" we are in trouble.

You do not even need ls in today's UNIX.

Globbing, printf, echo and stat can do it all.

And if necessary it's easy for the user to write her own custom utilities for displaying file or directory information from the stat struct as long as she is provided with good documentation for the standard C functions.

If you don't know how to do these things then you need to learn. And I say that as a dumb end user who can do things in UNIX that apparently most programmers of today cannot or will not because it's "too hard".

To me, learning all these high level scripting languages is crazy hard core shit. It would take me forever. They are large and complex.

Meanwhile I can do things the short, simple way. And learn about "crazy hard-core shit" like basic I/O at the same time.

Users need to be able to get the functions they need. But the Unix way provides a mechanism for getting ls plus the functionality you want: have a shell macro run the minimalist ls and pipe it through a formatter.

Unfortunately, although unix scripting was good by the standard of its era, it's still not particularly good. Unix as we know it hints at great ideals that are achievable but lives up to them poorly. As a result, early packagers shoveled hacks into existing layers and we live with the result.

Evidence: arguments in commands like cat and ls; the crudeness of ps to represent the process table; the inability of scripts to access Unix system calls; kitchen-sink shell interpreters.

Why do we look to shell interpreters for command-line history? That should be done in a separate layer closer to the terminal. Had it been, we'd have two small unbloaty layers. Instead we have tcsh, bash and zsh.

You could have your ls formatting macros live in your personal terminal emulator and you'd never need to copy .bashrc hacks to each system. The formatter code might not even live on the remote system, it might be part of the terminal.

I suspect you could lay [racket scheme] + [a transform that converts python style whitespace code into parens code] over a unix that has /proc, and use that to create the layers you needed. With that in place, you could then start taking noise out of the unix tools and work towards a smaller distribution.

Plan9 is talked of as "unix done right" but I'm not convinced. Some things it does well, but it has lots of baggage. For example, you have to buy into their desktop to access the OS. How is forcing me to use a mouse and a particular tiling window manager a necessary part of a better unix experience?

I'd like to play with a plan9 fork where the only user interface assumption was a terminal, and where you could get a secure connection into that terminal. (Too inept to do it myself so far, but keep playing with idea of running it on linux, ssh'ing to linux as the jump box, and then telnetting into the plan9 instance. Still - what a hassle)

People might use only 10% of the features your program provides, but the problem is that different people use a different 10% (this, or something like it, was said about MS Word bloat).

Yes, by Joel Spolsky. Only it's not 100% accurate.

Some people DO use a different 10%, but the grand majority uses the same 10% (and occasionally has a need for something outside of that).

Don Norman wrote a similar essay titled "The Trouble with UNIX" in 1981 [1], where he praised UNIX as an "elegant system", but criticised its commands for their "incoherent, inconsiderate design".

  The greatest surprise actually is that Unix still lives on, 
  warts and all. The frustrations I experienced in the paper
  were real and still exist today [1987].
In a similar vein, but also quite entertaining, is the UNIX-HATERS Handbook [2].

[1] http://books.google.co.nz/books?id=UqhjMVwYKJsC&lpg=PA25... [2] http://www.simson.net/ref/ugh.pdf

That's why we need to use a clean slate occasionally.

Every iteration creates a better picture of what to do the next time you throw it all away.

C, UNIX and libpng have taught us a lot, but it's probably time to take them in the back yard and shoot them rather than whinge about complexity.

I had a similar reaction to libpng and attempted to write the simplest orthogonal decoder: https://github.com/martine/sfpng

In a similar vein, Sean Barrett's stb_image.c is a good example - it's a reader for JPEG, PSD, PNG, TGA, etc. - http://nothings.org/stb_image.c

There is also a small writer - http://nothings.org/stb/stb_image_write.h - PNG, BMP, TGA.

He also has a small TTF reader on his page.

Nothing wrong with that. You can easily drive a car with no idea how to put the alternator in place. Not everybody that uses Google understands algorithms. I like all the weird switches. Correct, you have to look at the man pages, but you know one thing for sure: if you need it, somebody already needed it and it's THERE. And show some love for the complex but powerful. Finally, computers are much more transparent than most of the machines we use daily (cars, microwaves, etc.).

For a person who knows how to use a man page, ls is a simple and highly effective tool for a simple purpose. Part of becoming a unix guru is learning to deal with the idiosyncrasies of the system. I see it as much like learning a natural language; they all have strange rules that are unintuitive to us at first, but we overcome these with practice and patience.

That's not to say however that redesign in certain areas wouldn't be a great thing.

IMHO things are already getting simpler (or I should say, the growth of external API complexity is slowing down).

There are a couple of reasons why I think APIs are getting simpler:

* Services, Services, Services (RESTful services)

* Greater Automated Test Coverage (Unit Testing)

* Better dependency management

* Continuous Integration / Deployment

* There is a new Apple-esque "simple is elegant" mentality now.

In short developers are not as afraid as they used to be to make drastic API changes.

It's time to consider a new iteration, to create what comes after unix...


plan9 is more unix than unix. The Go language has channels which can be thought of as type-safe generalized Unix pipes. How cool is that?

The problem is that people think "simplicity" is stopping and designing things from a clean slate. That hardly works. Real simplicity is a balance between all the forces, and that's only achieved with enough iterations to allow that balance to happen. If you turn around to nature you'll see this everywhere. Often the best solution is simply the only solution left.

What you design should be very, very simple to someone when she starts to play with it. But as she plays more and more, she should gradually (even exponentially) discover the great underlying complexity required to achieve the simplicity on the surface.

All the great codebases, machines, mathematical equations, laws of nature, even a bacterial cell seem to have this property.

Simplicity is such an overloaded concept. It can mean: reducing the complexity of interface, reducing the complexity of operations, reducing the understanding required for use, reducing work to do the unanticipated, keeping common metaphors (as opposed to introducing new ones), and on and on. Ugh.

For example, I deal with a lot of network code. I regularly have to make various softwares work with different protocols, sometimes in async systems, sometimes in threaded systems, sometimes in simple blocking systems. Further, networking is an inherently layered space (e.g. the JSON app-specific protocol sits on HTTP, which sits on TCP, which sits on IP, which sits on Ethernet; I'd like to be able to say "is encapsulated in" rather than "sits on", but that breaks at various layers :/). So anyway, I get the order to implement support for FooProto in a few systems. Great, I've heard about FooProto and wanted to play a bit.

A bit of research turns up that there are a few major and complete implementations. One is the reference implementation and it is of course big, bloated and slow. It does everything but there is no cohesion to the API or whatever. That one is probably out.

So the others have mixed reviews but a few are very popular and "simple".

Option 1: exposes 4 methods, provides my code with a well designed data structure, and looks fantastic. Problem is, it does blocking I/O and isn't thread-safe. It will work for some situations, but not all the ones I have. Worse, the transport is pretty heavily tied to the socket code and various system calls. But it is very simple.

Option 2: optimized for non-blocking. Has lots of callback hooks. Still tied in with the socket/syscall layers, but not in a way that makes extraction impossible. But it would take a few hundred lines of wrapper to make it look synchronous. And forget thread safety.

Option 3: really thread-safe, decent API, layers split out, but built around a weird metaphor: it would either require some odd wrapper code to translate, or reworking of the hosting code to deal with the metaphor.

So which of these is simpler? Arguably each of them. One solution of course would be to use each as needed. Then of course I have to track multiple APIs, and there are integration and bug-compatibility concerns. Not really simple.

Odds are I'll end up choosing the reference implementation or making my own, just because while in the small it is more complex, in the large it makes my life simpler - one code base to handle multiple situations. Also a simpler choice, albeit on a different axis.

So my point is, simplicity is not just hard, but sometime incompatible with itself. Simple on one axis may complicate things on another, and in irreconcilable ways.

Aside: To anyone implementing protocols -- _Please_ keep your parsers, state machines and transports as loosely coupled as possible. I, and many people I know, will love you for it. Transports change, encapsulation is common, and networks seem to be mostly edge cases. Feel free to add a layer tying them all together in a nice common-use API (this is a good thing too), but keep in mind there are always other use cases. Keep in mind that you are always just feeding your parser a stream of bytes, and feeding your state machine a series of tokens. Some protocols have a few layers of this built in. Due to the nature of the beast, though, it can always be modeled that way.

> Aside: To anyone implementing protocols -- _Please_ keep your parsers, state machines and transports as loosely coupled as possible. I, and many people I know, will love you for it.

Ok, I have to curb my excitement here a bit, because what I'm about to tell you is that I'm solving exactly your problem with an obsessive focus on achieving exactly what you desire.

Though in saying so I'll have to disagree with you slightly when you say:

> Simple on one axis may complicate things on another, and in irreconcilable ways.

Basically I'm striving for (and believe I am on the verge of achieving) a design that achieves the best possible simplicity for network parsers and serializers in all dimensions. If you're wondering what the catch is, the catch is that it's taken me 2.5 years and counting to refine the design to the point where it can achieve this.

My library performs no I/O (it's buffers in, buffers out), is completely resumable (so you can suspend the state machine at any point and resume it later), is completely composable (so you can create pipelines that feed the output of one parser to the input of another), loosely coupled, small (<5k loc in the core library), easy to bind to high-level languages, minimal enough that it can run inside the Linux kernel, and FAST (can parse protocol buffers at almost 1.5GB/second).

And it's open-source. Don't be misled by the fact that it's advertised as a Protocol Buffer library; that's how it started, but I'm working on generalizing it to be capable of arbitrary parsing and serializing. Though Protocol Buffers schemas are used to specify the structure of any structured data that a parser wants to produce or a serializer wants to consume.


I did that when writing a DNS resolution library (https://github.com/spc476/SPCDNS). The encoding and decoding form one library, leaving the network layer primarily up to you (I provide a very simple network interface that's probably okay for very simple uses). It's an approach I now use for any protocol I work on.

Hey, you that are defending libpng in this thread, go write a C example to load a PNG file with varying formats and color depth using this lib.

This lib API is completely broken. I keep cut&pasting the code I wrote one day reading the doc with much pain...

It's our nature to accumulate junk. I mean, we are at least better than blind evolution, 98% of DNA is noncoding junk.

Is BCD really a bad idea? It's 2012, and we generally still have to live with binary-only floats and without decently fast decimal arithmetic.

BCD is not a bad idea... It's not great, you can pack arbitrary precision values more tightly, and you rarely need to display them (even when you do, the screen is slower than the maths), but it's useful for a bunch of things and it's nice to think about.

But does it really need an instruction on every modern computer's processor to make it that tiny bit faster?

When was the last time you used a program that used that instruction? I've never run in to it (and I spend a decent amount of time looking at x86 assembly). Nonetheless, it shows up at the top of every instruction reference - you need to implement it if you want to simulate x86 correctly.

But at least I can run some old 8086 code on my computer... Right?

It was a bad idea in the 6502. It was a "mode" which meant that you had to clear the decimal flag to make sure it was turned off.


and this is why new software paradigms come out every 10 years - to clean up the shit from the last one

Reminds me of my usual answer when asked by people - "Why does 'x' computing device not work properly?"

Because a hell of a lot of the code running on it was written a month or so late, in a panic, at 4am, by someone who was hungover and wired on coffee.

The thing is that simplicity could be right under a programmer's nose, already done, and they will pass right over it. They will start building something new when a simple solution already exists for what they are trying to do.

Some people just want to write code. No matter if it needs to be written or not. It's like nervous energy.

The important question that we fail to address in arguments about simplicity is "What is it you are trying to do?"

Most times, it's a simple thing. Or it's a number of different things, each of them simple. But the programmer cannot see it that way. He envisions something else. He envisions a large project. He fires up his "IDE". We're doomed, heading for complexity from the start.

What if the simple programs are boring? What if they are short? What then? The programmer is itching to code something, anything. He's going to do it. He will produce something complex and unleash it upon the world.

He will have to use "features" to sell his work. (Chances are it will be very similar to other existing programs.) As such, his program can never just do one thing. It must do many things. It must have features. Because that is the only way he can convince people to try it.

Some users will bite. Success! Or is it? This is how we end up with an overly complex monstrosity.

What if programming is really a boring task of writing simple, efficient, reliable programs? Then what?

We'd be nowhere without UNIX. And it's a simple boring (well, not to me) system. There is no GUI. But it works. Take some time to process that.

You do not need to create simplicity. (All due respect to STEPS.) You simply need to wake up and discover it. It's running your system. It's right underneath your nose.

Discover, not create.

> Some people just want to write code. No matter if it needs to be written or not. It's like nervous energy.

Maybe it's an attempt to play catch-up with the ``My Github is my resume!'' folk.

"Github as resume" is about transparency. If everything you've written is useless, showing it off is not going to help you.

> If everything you've written is useless, showing it off is not going to help you.

I agree, but how else does one apply to a job that __requires__ a Github-like hyperlink?

It's because some people really do like complexity.

They like writing 200 pages of documentation. And some like reading it. They want complexity. Keep adding stuff.

I remember reading one of the Windows programmers' heroes writing about some massive document of several hundred pages he wrote while on the Excel team at Microsoft, and being overjoyed when he learned Bill Gates had actually looked at it.

I remember reading a post by some programmer on Sitepoint boasting about his program that was so complex it would make your eyes burn, or something like that. He was bragging about this.

I recall all the people in mailing lists and forums who get annoyed when anyone talks about conserving memory or disk space. The reason? Because these resources are so plentiful we can afford to waste them. That is a truly great reason. Brilliant.

I once had a colleague who said software is like a gas. It expands to fill space.

There is a great majority of programmers who are not only OK with this state of affairs but seek to preserve it. They get defensive when confronted with spending effort to simplify something.

There's a lot of discussion of simplicity that is just lip service. The truth is, simplicity is not easy. It is not a matter of adding more stuff. And we know programmers are lazy. Simplicity, real simplicity - removing stuff, would make many programmers uncomfortable. It would remove them from their comfort zone.

Simplicity is not burying things under five layers of abstraction so that it fits someone's preferred way to model a solution.

Simplicity is taking away things that are not essential until nothing further can be taken out.

It is cutting things down to size.

Achieving simplicity means reducing someone's 200 pages of documentation or their dizzyingly complex program whose source "will make your eyes burn". It may include using compact symbols instead of someridiculouslylongfunctionname. It means some stuff that someone spent time producing must get cut out.

The creators of complexity are not going to be happy with this. Because they like complexity. They like verbosity. It's comforting. They detest what appears to be cryptic.

That is the price of simplicity. To achieve it in today's programming environment involves offending some people. As such, we avoid it. We discuss it, but we really just dance around it much of the time. Let the complexity continue. We can pretend more abstraction equals more simplicity. Be happy, be productive and have fun.

Meanwhile the connoisseurs of simplicity are marginalised, often left to write occasional blog posts like this one and to work on their simple projects in relative isolation. Embedded programmers know the feeling. Those who appreciate simplicity, and really do trim things down to size, are not the majority.

Keep posting those articles on Forth, hoping someday people might catch on.

What you are arguing for is efficiency, not simplicity. Cryptic and highly compact routines, symbols instead of telling names and memory / disk space conservation beyond the point of diminishing returns, all increase complexity, rather than reduce it.

> I remember reading one of the Windows programmers' hero's writing about some massive document of several hundred pages he wrote while on the Excel team at Microsoft and being overjoyed when he learned Bill Gates had actually looked at it.

When you have a spreadsheet programme you start with something relatively simple. You then add functionality. You add pivot tables and charting and macros and so on. At what point do you stop and say "We need to re-write the entire code from scratch to make sure this stuff is all tightly integrated and bug-free" or "We need to split some of this functionality off into separate but integratable products to protect the core product and provide split pricing for power users"?

Never re-write code from scratch:- (http://www.joelonsoftware.com/articles/fog0000000069.html)

It's ironic that you're linking to one of Joel's articles in your response, given that he's actually the "Windows programmers' hero" that the comment was referring to: http://www.joelonsoftware.com/items/2006/06/16.html.

Why is each sentence on its own line?

For simplicity, of course. ;)
