This is always going to be a problem with dynamic linking. With dynamically linked libraries, in order to use some function you have to bundle and depend on a whole library. That library may have functionality you don't need, which may make it reliant on other libraries, and so on. And if two of the libraries you depend on in turn depend on different versions of some other library, you get dependency hell.
But libraries are just one level of fractal self-similarity. Packages bundle libraries and other supporting tools, so one package depends on others, containing tools and functionality you never use. And packages suffer the same dependency hell.
Then we invent even higher abstractions to make the dependency hell go away. Maybe Docker containers. But then what happens when you need assemblies of those? Pods of Containers?
This is a problem with the dynamic linkage model of code reuse. Not that static linkage solves the problem totally, but it doesn't have the same dependency bloat, and dependency failures occur in the development phase, rather than during deployment (or worse, many months later).
So yeah, Cathedrals are more elegant, and better code.
But, so what? Cathedrals might be nice to go sightseeing at in a new city. But they took years to complete, were funded by government and rich benefactors, and are mostly empty. I'm sure God appreciates them; but they are largely irrelevant to most mortals.
I've been doing this for more than 20 years, and my old self cringes when I download a 4 megabyte package for anything, given what we used to achieve with 4Kb. But I can't deny that I can get a product out much quicker now than I used to. It might not be as elegant, or perhaps as maintainable, but the market doesn't care: I can worry about that when I get revenue. If I spend time building a cathedral, a bunch of others are going to come build a bazaar outside the door and clean up, well before I've put down the foundations.
I agree with everything he writes, but I don't see any practical alternative.
(The site appears to be down, so apologies if I've got the wrong idea here.)
> But I can't deny that I can get a product out much quicker now than I used to.
Ironically, I'm not sure even that is true for some of the projects I work on. The amount of time we spend dealing with dependency, build and deployment issues caused by some variation of package and repository management is crazy.
It's often true that you can get a quick and dirty prototype out quicker by throwing together the right combination of libraries, I agree. And it's certainly true that a lot of places are selling products and services where the code is of that quality. It's also true that the people using those products and services are all too often getting what they're paying for as a result, both in how well the software works and the kind of customer service they get when it breaks.
I continue to hope that at some point the pendulum will swing back the other way, probably when we reach a point where significant markets are so fed up with short-lived, poorly supported, half-broken junkware that they understand the benefits of paying for something better. For now, sadly, the "everything should be free, or at least no more than $2 on my platform's app store" mentality is so prevalent that outside of serious professional industries it's hard to make money selling software that is actually good any more.
Back on a more technical subject, dynamic libraries do have one major advantage that is sometimes overlooked in the static vs. dynamic debate: you can access code in them from different programming languages at run time. In contrast, tools for compiling and statically linking code from different languages are rare, assuming you're even talking about languages that compile and link statically in the first place. Often the best you've got is hoping that each language can generate object files using the same probably-C-based ABI, making it possible to statically link them assuming that the various run-time systems are then set up correctly by the code itself.
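To make the run-time route concrete, here's a minimal sketch on a POSIX system. The names "libdemo.so" and "demo_add" are made up for illustration; the library could have been produced by any compiler that can export a C-compatible symbol.

```c
/* Minimal sketch: calling a C-ABI function out of a shared library at
 * run time via dlopen(3)/dlsym(3). "libdemo.so" and "demo_add" are
 * hypothetical names; the library could have been written in C, Rust,
 * Fortran... anything that exports a C-compatible symbol.
 * On glibc-based Linux, build with: cc demo_main.c -ldl */
#include <dlfcn.h>
#include <stdio.h>

int main(void)
{
    void *handle = dlopen("libdemo.so", RTLD_NOW);
    if (handle == NULL) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return 1;
    }

    /* Look the symbol up and cast it to the expected signature. */
    int (*demo_add)(int, int) =
        (int (*)(int, int))dlsym(handle, "demo_add");
    if (demo_add == NULL) {
        fprintf(stderr, "dlsym: %s\n", dlerror());
        dlclose(handle);
        return 1;
    }

    printf("2 + 3 = %d\n", demo_add(2, 3));
    dlclose(handle);
    return 0;
}
```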
I must admit to being a bit puzzled as to what his point is.
Yeah, there are a lot of dependencies if you want to compile an entire linux distro from source. But it's just exposed because everything is right there. If you actually tried to figure out how to compile all the software on your Windows machine, would it be any better?
And to say that libtool, bloated and complex as it is, came about because someone was incompetent seems quite a bit insulting. It seems that confronted with a compatibility problem between different Unix-like systems, one has two choices:
(1) Coordinate changes to all Unix-like systems so the compatibility problem is removed, or
(2) Work around the problem with some sort of wrapper.
Now, in a perfect world, (1) is obviously the better alternative (but even that wouldn't help existing installations.) But the world is not perfect, and my chances of accomplishing (1) are practically zero, even if it's "just a single flag to the ld(1) command". Hence, anyone who actually wants to get something working on all those systems would have to do whatever it takes.
His point is someone has to stand up and say "This is how it will be done." Without that you just get tons of wasted effort.
Why do so many developers use OS X instead of Linux? One benefit of OS X is that there is one entity responsible for it, not 250. There is a person (or a few) making the decisions, providing stability and removing things that don't work. It's not perfect (witness the recent transitions in OS X as leadership shuffled), but the different parts of OS X usually feel like part of the same whole.
As much as people 'like' the bazaar, I see the same dynamics in open source. Current init systems give you tons of choice, and people keep screaming because that choice is being removed. But a lot of people seem to like having it organized under one umbrella so that it can all coordinate properly. It doesn't matter whether you like the choices being made (many people don't like Apple's choices); the point is that there is someone in charge.
There are other examples. Ubuntu seemed to get popular saying "This is what it will work like", as opposed to what used to be common: "Here are 6 GUIs, choose one!". Ruby (with Matz), Python (with GvR), Clojure (with Rich Hickey), and the Linux kernel all seem to be doing pretty well with someone nominally in charge to make hard decisions.
I think his example of build systems was perfect. No one would set out to design a system to generate a file to probe a system for 26 programs that you don't even need, in order to choose build options for a program that only runs on a handful of nearly identical systems.
Many would argue that Haskell ends up more consistent and elegant than those, despite having been designed by committee.
> the Linux kernel
Ends up with messier code and design than that of FreeBSD; perhaps an inevitable side effect of popularity, but perhaps exacerbated by the person-oriented command hierarchy.
> No one would set out to design a system to generate a file to probe a system for 26 programs that you don't even need, in order to choose build options for a program that only runs on a handful of nearly identical systems.
Actually almost all of libtool's code was written by one person, Gary Vaughan.
Bazaars can produce elegant code, and cathedrals can produce monstrosities; a handful of anecdotal examples prove very little.
Along with other outliers like Smalltalk and Scheme, neatly grouped together under "experimental languages". Cf. SQL, a language that from the beginning had its industry in mind.
Common Lisp would probably be the best counter-example to "language design by committee doesn't work." Despite being developed by a committee over an almost 15 year period and the standard document being over 1000 pages long and full of functions that are there just for backwards compatibility, the language manages to be both more consistent and feature rich than Ruby or Python.
The language most people know as C has been developed by a committee since 1983 (try reading pre-ANSI C). C++ has been developed by an ISO committee since 1993(?) and has expanded hugely in scope (standard library, namespaces, RTTI).
So successful programming language design by committee seems to be the rule and not the exception.
Common Lisp, C and C++ did not contribute to the field in the enormous way that Scheme, Haskell and Smalltalk did. They're sheeple languages (made with the worker in mind) to control employees, and not languages to empower and liberate the programmer (like Forth, another great example of an experimental language by Chuck Moore).
If you want to refute that, you will have to do better than just showing up; you could start by explaining why we have so many shops using Java and .Net and now Rails (which makes Ruby programmers fungible in managers' eyes) and so few using Scheme and Smalltalk etc.
Ask your manager why you can't switch to a better language. The answer is it's harder to find talent for it. Bingo.
> Ask your manager why you can't switch to a better language. The answer is it's harder to find talent for it.
That's a really good point.
IDK if I would say that about Scheme though. There are a lot more people around that know Scheme than know Common Lisp. I learned to program Lisp starting with Scheme; going back to Scheme from Common Lisp feels really disempowering.
"Ruby (with Matz), Python (with GvR), Clojure (Rich Hickey), the Linux kernel all seem to be doing pretty well with someone nominally in charge to make hard decisions."
So many programming languages. So much confusing choice and wasted effort.
What we need is for someone to be put in charge and consolidate all those programming languages in to one language.
I think you are getting downvoted because you voiced a very controversial opinion without much explanation to back it up. Still, I think there is some merit to it (although I wouldn't go as far as saying we're ready for the "one language to rule them all" yet).
The paradox of choice and duplication of effort are very real problems. Let's take web app development for example. We have PHP, Python, Ruby, Java, C#, etc. Each language has its own set of libraries/frameworks, and it takes a lot of time and effort for a programmer to learn all of them. I've been developing web apps for 10 years and yet I couldn't contribute to a web app written in Java. What if, instead of having X libraries in Y languages, we had X*Y libraries for a single language, or X libraries of Y times the quality for a single language?
Standardisation has lots of benefits.
The web standardised on HTTP/HTML/CSS/Javascript and it enabled the creation of a massive ecosystem of tools, libraries and pool of developers. Front-end developers can easily switch jobs and contribute to other projects. I personally dislike languages that compile to HTML/CSS/Javascript because I think standardisation trumps their benefits. C is a de facto standard in the embedded world.
In my opinion, sometimes the hammers are so similar (e.g. Python/Ruby) that we should just collectively agree to only use one.
I know a very good developer who does both Unix and Windows development. He knows both very well.
He points out that with Linux you officially have the ability to customize everything perfectly, but spend a lot of effort just deciding what your stack will be. And if you integrate 4 or 5 things together, you get a lot of redundant crap because they did things differently.
In the C# world, by contrast, Microsoft declares technology winners and everyone adopts them. Those winners may be in principle not ideal, but the fact that everyone actually uses them saves a lot of wasted duplication of effort.
That's a very idealised view of Microsoft's platform. In particular, they have a habit of periodically declaring new technology winners that aren't compatible with the old ones, which then become legacy. Remember Windows Forms and Windows Presentation Foundation, both of which were Microsoft's officially-sanctioned ways of developing GUI apps with C# and both of which are now legacy APIs in maintenance-only mode?
However the cost is no worse than the cost in the open source world of jumping on technology decisions that then wind up fading away and dying. And the fragmentation in the open source world makes that more likely.
Yes, in principle you can support open source software yourself. But you'd be amazed at the long term costs of doing so when you are locked in to a platform with diminishing market share.
Perhaps a better example of the Microsoft strategy would be Windows itself. For many years and several major versions, they went to extraordinary lengths to ensure backward compatibility and stability, so software that ran on Windows from the mid-1990s would still have a reasonable chance of running acceptably on Windows in the late-2000s.
We had almost reached the point where, notwithstanding any commercial licensing barriers, you could either still run application software natively, or it was so old that you could host a virtual machine containing the old OS within the new one. That's a great situation to be in if you rely on the software, and it's also a very nice situation if you write software, because it basically means that once you've got software that works it works forever. A firm foundation keeps everyone safe.
It's regrettable that under Ballmer and now Nadella they seem to have been pushing hard in IMHO the wrong direction with increasingly rapid updates and more reliance on inherently unstable on-line services, instead of playing to those traditional strengths. This makes Windows just as unlikely a prospect for long term stability as anything from Apple or the Linux world these days.
Personally, all I want is a system that I can buy today to do useful work and reasonably expect to still be doing useful work more than a year or two later. It's disturbingly difficult to find software that meets that apparently simple requirement any more.
If you want backwards compatibility, you really should look at IBM.
Their mainframe family still runs most serious financial platforms around the world, and it is routine to run code written decades ago on the latest generation of hardware.
No, it is not cool. It is not sexy. It is not cheap.
There is a difference between being able to choose between 15 tools and being able to choose between 15 hammers. The point of the article is that we just don't need 15 hammers -- especially as 10 of those are obviously inferior -- and the existence of 15 hammers is incredibly unhelpful.
But the point of libtool or autoconf is not to make it easier to build various programs on Ubuntu, it is to make it easier to build the same program not just on Ubuntu and RedHat, but also on FreeBSD, OpenBSD, Darwin, Solaris, SunOS, Tru64, HP-UX, IRIX, AIX, and god knows what other systems. Even if Linux turned into a perfectly polished and neatly organized cathedral, you'd still need to get it to work on all those other systems. And most of those systems predate "the cathedral and the bazaar" anyway, don't they?
How much easier does autoconf make things, really?
I think the key phenomenon here is that software is non-portable by default, and portable through effort. If your problem is that the Linux code should use epoll() and the BSD code should use kqueue(), but OS X has quirks in its kqueue() implementation, and you need to fall back to poll() otherwise, autoconf will help... slightly. Then you need Sys V init scripts, or rc scripts, or launchd scripts, or... who even knows, unless you actually convince someone to port your project to that system.
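For a picture of what those checks actually feed into, here's a sketch. The HAVE_* macros are assumed to come from some configure-style probe; they are not autoconf's literal output.

```c
/* Pick an event mechanism at compile time, falling back to POSIX
 * poll(2). The HAVE_* macros are assumed to be defined by a
 * configure-style check. */
#if defined(HAVE_EPOLL)
#  include <sys/epoll.h>
static int make_event_queue(void) { return epoll_create1(0); }  /* Linux */
#elif defined(HAVE_KQUEUE)
#  include <sys/types.h>
#  include <sys/event.h>
#  include <sys/time.h>
static int make_event_queue(void) { return kqueue(); }  /* BSDs, OS X */
#else
#  include <poll.h>
/* No kernel event queue; callers loop over poll(2) instead. */
static int make_event_queue(void) { return -1; }
#endif
```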
So, you already have people testing and fixing bugs on other platforms, but now you tell them that they need to also learn M4 so they can write an Autoconf macro to test for some... thing... that you need which doesn't have a standard test. I'm not convinced that this makes my life any better.
Whether or not you like Autoconf, however, you have to agree that libtool has serious problems.
"Build systems", like the GNU one (autoconf/automake/libtool/etc), cmake, or others, theoretically "solve" the need of your project build in a newer/different platforms. However, in practice, in a newer OS "version" or newer platform, you have to "fix" things, again.
In my opinion, when possible, it is better to stick to standards, and use just a Makefile for POSIX targets (it allows context detection) and custom scripts or Makefiles for exotic platforms.
In the end, the reason for e.g. Autoconf is the mess of different non-standard crazy stuff. Why not agree on a standard build base, at least for the C language and library handling?
"How much easier does autoconf make things, really?"
Speaking as someone who was a system administrator responsible for compiling a large number of packages across a wide variety of platforms (let's see, SunOS, Solaris, AIX, Irix, Linux, and HP-UX, all at different versions at different times) through the '90s and early 2000s, autoconf made things much, much easier.
It doesn't help the developers any; each individual difference has to be identified by the developer. (My biggest complaint isn't that it's written in m4, but that there's no comprehensive list of what I need to do with it. That information makes up the m4 tests, but there was no documentation.)
Also many of these bugs are somewhat transient, and it is better to not support a buggy platform, or insist on manual patching until it is fixed, rather than build complex tests and workarounds in.
libtool and autoconf are messy partial solutions to problems created by writing non-portable code. This is an aspect of the bazaar model, where reasonable standards (e.g. POSIX) are sometimes ignored in favor of novelty or minimal adequacy.
The bazaar model embraces the mess, believing that something spectacular will emerge. The cathedral model argues that coherence produces better results.
Both can point to successes. phk is arguing that the open source community has lost more than it has gained from accepting the bazaar as the one true way.
Even in the last ten years, where has the innovation in operating systems come from?
The CatB essay predates the book, and the idea predates UNIX.
Any non-trivial code is non-portable between Unix systems in practice. No system implements all of POSIX, even when they do they often disagree on how parts of it should be implemented, and some software needs features that simply aren't in POSIX. (For example, high-performance webservers need a better way of polling for events than select/poll, and that's platform-specific.)
As for libtool, anything that deals with shared libraries is inherently non-portable. The file extension they're created with and the compiler and linker flags to create them are entirely platform specific - POSIX won't help you there. (Also, a lot of libtool's hairiness exists because some older proprietary Unix systems don't support shared library dependencies.) If you're only planning on supporting open source OSes you don't actually need libtool though - their native toolchains are either the GNU one or something compatible with it, and include most of its functionality natively.
And this is because there is nobody with authority to force the right solution.
Compare POSIX to the Java spec: without Sun forcing coherence, "write once, run anywhere" would have died very early on, and we'd be saying the same thing about Java as you've just said about POSIX.
> webservers need a better way of polling for events than select/poll, and that's platform-specific.
POSIX was just an example, and it is possible to write portable code using #ifdef. But you have to have a goal of writing portable code.
Re: libtool, most of the reason it's ever necessary is because of the historical rush and the incompatibilities created. Again, a single command line switch could handle that.
Still, plenty of software which has clearly only ever been built for Linux requires libtool. It's just the way things are done, just like autoconf. It works -- poorly, inefficiently, fragilely, and to most, mysteriously.
Appreciation for autoconf and libtool depends on one's position on the defeatism-pragmatism-idealism spectrum.
"POSIX was just an example, and it is possible to write portable code using #ifdef.."
Yes, two ways. One way uses autoconf (and tests each individual option to see which branch of the ifdef to use) and the other uses Imake (and has a big, continuously maintained database of which option to use on each system).
Oracle's SunOS (now Solaris), HP-UX and IBM's AIX, plus the community-supported *BSDs and Darwin, would be the only ones from that list you'd see in the wild today. IRIX and Tru64 are retired or history, according to Wikipedia. ZDNet in April 2014 reported that Unix server share as a whole is 13.6% according to IDC; Oracle has 4.7%, leaving 8.9% to competitors. Since many of these are RISC or Itanium servers, leaving them out means fewer compatibility headaches on multiple fronts. All that said, unless there's a testing service out there for this kind of enterprise hardware, I can't see how anyone writing open source code outside of large enterprises would be expected to support these configurations without patches from the community... which is probably why "standard checks" remain today. Nobody really knows if they're still needed or not, since the original patch might go back to the mid-90s.
So in 98-99% of cases you're not going to be using OS X in production, so why introduce more risk by developing on a different system from the one the end product will run on?
You've just made his point: you're so used to living in the bazaar that it seems an inevitable fact of life.
Through the lens of the Peter Principle, command is itself a competency. If you have sufficient authority, (1) is easy. His point is that unless you have someone with that authority, (2) is inevitable. People have risen to a position where they need the solution libtool provides, without the competency to force the right solution.
But this is what I don't understand. When was there someone with authority to force all the Unices to adopt (1)? Autoconf came out in 1991, so clearly the problem existed at that point. That's coincidentally also the year Linux was first released, so it can't be blamed.
It seems to me that Unix was never a cathedral. Each specific operating system may have been one, but they clearly weren't more compatible back in 1991 than a Protestant and a Catholic cathedral are, since someone felt the need to write autoconf.
Please give me an example of this shining 20-th century "cathedral development", because I don't see one.
I believe that his point is that the world would be a better place without those 22,198 tools rather than with the gigantic mess that having them creates.
I myself am somewhat amused by the paragraph,
"...Later the configure scripts became more ambitious, and as an almost predictable application of the Peter Principle, rather than standardize Unix to eliminate the need for them, somebody wrote a program, autoconf, to write the configure scripts."
The key phrase being "rather than standardize Unix". Anyone want to count how many Unix standards there are?
He goes on about libtool with,
"...yet the 31,085 lines of configure for libtool still check if <sys/stat.h> and <stdlib.h> exist, even though the Unixen, which lacked them, had neither sufficient memory to execute libtool nor disks big enough for its 16-MB source code."
I don't know if AIX had sys/stat.h or stdlib.h (actually, I believe it did), but I do know those machines I worked on certainly had enough resources to run libtool; I did it. Because without libtool, building shared libraries on AIX was an exercise in crazy. (Almost as crazy as m4.)
"Windows-the-operating-system" is, but "Windows-the-distro" is not. He's specifically mentioning Firefox, so his argument is not restricted to the core operating system.
True, but sometimes those "breaks" are good things (Vista to 7 was nothing but improvement). And generally, Microsoft will support their releases for longer than a decade. XP had over 13 years of extended support, and Windows 7 will have 11 years of extended support. Even Vista, one of the worst OS releases from Microsoft, is still in extended support for another two years.[1]
Contrast that to major (LTS) releases from Ubuntu, which can be from 3 to 5 years, and only 9 to 18 months for non-LTS releases.[2] And Ubuntu is one of the kings of bleeding-edge breakage in the Linux-based OS world, surpassed only by Arch in my experience.
All that said, it's not all rosy in the Windows world, and not all thorns in the Ubuntu world. I just wanted to point out some facts to counter your misinformation.
Not really. The decision to require applications to open ports, as opposed to just bind them, broke pretty much all networked software on Windows in the 00s. That's just one example where software may run but doesn't work. It's complicated.
It's not the people who wrote all that bloated software who are incompetent. It's the person who wrote that ridiculous book "The Cathedral and the Bazaar" who is incompetent.
I have always enjoyed that essay; it reminds me of something my Dad once said to me: "Cannibals don't realize they are cannibals." It wasn't strictly true of course, but it tried to capture the essence that if you've grown up in a society where X is the norm, you don't know what living in a society where X is not the norm is like or can be like. Combine that with humans who like what they know more than what they don't know, and you find communities moving to high-entropy / low-energy states and staying there.
That said, there isn't anything that prevents people from having standards. Both FreeBSD and MacOS are pretty coherent UNIX type OSes. But it is important to realize that a lot of really innovative and cool stuff comes out of the amazing bubbling pot that is Linux as well. I sometimes wish it were possible to do a better mashup.
> I sometimes wish it were possible to do a better mashup.
I feel the best of both worlds is the bazaar model combined with curated software/library repositories where rules for inclusion are more rigid and opinionated. For example, most programming languages have a standard library (the cathedral) and an ecosystem of third party libraries (the bazaar). Within that ecosystem of third party libraries, we could theoretically have curated repositories of libraries which share a coherent vision. In the real world though, I haven't really seen that happening. Node.js has registry.npmjs.org (no curation), Ruby has rubygems.org (no curation), etc. In the OS world, there is Ubuntu which is more cathedral like but I feel it should be a lot more opinionated perhaps at the cost of flexibility/portability.
People don't use 'Linux', they use a particular Linux Distro which does tend to provide the standardisation described (more so for some, less for others).
The fact that Linux-in-general is a roiling maelstrom is irrelevant to a user using a particular distro. It is relevant to developers and/or packagers though.
In theory, this isn't a million miles away from what Debian aims at. However there's a social pressure on the various teams to include as much as possible, so what curation there is doesn't end up being very opinionated. I personally have some fairly strong opinions on what Debian/Ruby should look like, but have neither the standing nor the time to expend developing it in order to move the needle there.
The article author points out several problems in the open source world.
The first is that it's too hard to converge things that have diverged. I pointed out an example in a Python library recently - the code for parsing ISO standard date/time stamps exists in at least 11 different versions, most of them with known, but different, bugs. I've had an issue open for two years to get a single usable version into the standard Python library.
Some of this is a tooling problem. Few source control systems allow sharing a file between different projects. (Microsoft SourceSafe is a rare exception.) So code reuse implies a fork. As the author points out, this sort of thing has resulted in a huge number of slightly different copies of standard functions.
Github is helping a little; enough projects now use Github that it's the repository of choice for open source, and Git supports pull requests from outsiders. On some projects, some of the time, they eventually get merged into the master. So at least there's some machinery for convergence. But a library has to be a project of its own for this to work. That's worth working on. A program which scanned Github for common code and proposed code merges would be useful.
Build tools remain a problem. "./configure" is rather dated, of course. The new approach is for each language to have its own packaging/build system. These tend to be kind of mysterious, with opaque caching and dependency systems that almost work. It still seems to be necessary to rebuild everything occasionally, because the dependency system isn't airtight. (It could be, if it used hashes and information about what the compiler/linker/etc actually looked at to track dependencies. Usually, though, user-created makefiles or manifest files are required.) We've thus progressed, in 30 years, only from "make clean; make" to "cargo update; cargo build".
The interest in shared libraries is perhaps misplaced. A shared library saves memory only when 1) there are several different programs on the same machine using the same library, and 2) a significant fraction of the library code is in use. For libraries above the level of "libc", this is unlikely. Two copies of the same program on UNIX/Linux share their code space even for static libraries. Invoking a shared library not only pulls the whole thing in, it may run the initialization code for everything in the library. This is a great way to make your program start slowly. Ask yourself "is there really a win in making this a shared library?"
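To illustrate the start-up cost point: a shared library can carry initializers that run at load time whether or not the program ever calls into that code. A made-up example (the constructor attribute is a GCC/Clang extension; C++ globals with constructors behave the same way):

```c
/* Why merely loading a shared library can be slow: initializers run
 * at load time, called or not. A hypothetical library: */
#include <stdio.h>

__attribute__((constructor))
static void libdemo_init(void)
{
    /* Imagine building tables, parsing config, registering state... */
    fprintf(stderr, "libdemo: initializing\n");
}

/* The one function the program actually wanted. */
int demo_double(int x)
{
    return 2 * x;
}
```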
Shared libraries which are really big shared objects with state are, in the Linux/UNIX world, mostly a workaround for inadequate support for message passing and middleware. Linux/UNIX still sucks at programs calling programs with performance comparable to subroutine calls. (It can be done; see QNX. When done on Linux, there tend to be too many layers involved, with the overhead of inter-machine communication for a local inter-process call.)
That's borrowing, not sharing. The project borrowed from has no idea there's someone else using their file. So they will feel free to change it without informing others.
No Subversion repository has any idea how many checkouts there are. So not even the original repository has any idea by how many parties its files are being used (unless you use explicit locking, maybe). Each checkout, external or not, can change a file and commit the change. So is it possible at all to tell “borrowing” from “sharing”? What SCM would do that?
Besides, “informing others” of changes is, of course, one of the central functionalities of an SCM. You will be informed about changes in files “borrowed” by others. So what's the difference to “sharing” again?
What strikes me the most, while reading through that thread, is how many times I saw the phrase "I didn't read the book that defined the terms 'cathedral' and 'bazaar', but cathedral/bazaar is...", or something equivalent. The author kept asking whether the people commenting had read the book.
The terms cathedral/bazaar do have meaning in the community beyond the specific definition in that book. Worse, without some historical hesitation, using that specific definition now, 14 years later, is to ignore the influence of that book on the larger discourse on the topic. (Compare it to the Bible: interpreting it as-is, without taking into account the 2000+ years of history and cultural evolution, is nonsense [although popular in certain circles].)
Thanks, this was very helpful for getting more of an idea of what he was talking about. The more I read, the more I realized I didn't actually understand what he meant. I should have known; it happens with my own writing all the time.
I always understood this article as being about software design in open source.
Every project (library, application, you name it) needs to be designed, by one or more people thinking deeply about the problem space.
You can't expect software design to just happen; once the code has been written, it's already too late -- see you in v2. You can't expect 100 people to do it: there will either be a lot of conflicting visions, or no vision at all. And you can't expect to do it in your 1 hour of "free coding time" a day, because it doesn't give you enough time for deep thinking about the problem.
If you try to bypass design and solve it from "another angle", you get libtool, a solution that is a hundred times more complex than the problem it solves.
Look at successful open source projects (Rails, Go, even Linux). They were all initially designed by someone, handed down to the community to improve.
They still have strong minded architects leading the effort. Now compare it to those random "Let's clone X!" threads that never produce anything.
So, there's cathedral thinking even in the bazaar. And it's the only thing preventing it from eating us alive.
> If you try to bypass design and solve it from "another angle", you get libtool, a solution that is a hundred times more complex than the problem it solves.
Agreed. One good thing about right now is that there are fewer people in the conversation, so things might actually be possible. If Linux, FreeBSD, OpenBSD and SmartOS decided on a standard way of building a shared library with another flag to ld(1), they could obviate the need for libtool much faster than they could have in the 90s. And if not, that's fine, we'll just build it for Linux.
Good point, but a very odd way to prove / address it, using what is mostly a low-level detail. I believe in the "quality happens only when someone is responsible for it" idea; however, it's not really a matter of automake/autoconf, or at least that is an old thing to look at, and a detail. A more modern result of the idea that central authority is not needed in software projects is the bloatware created by the notion that software can be built like this >>> "Let's merge this Pull Request, tests pass". There are three flaws in this apparent no-brainer action:
1) "Let's merge": adding features without stress in taking software self-contained and small.
2) "This Pull Request": the idea that there is no need for a central design.
3) "If test passes": the idea that you can really provide quality code just with tests and without manual code quality inspection and writing code in a clean and defensive way to start with.
So I believe in the postulate but the proof is rather strange in this post.
I think the core issue the author seems to be getting at is bigger than just the Bazaar (Open Source).
Even when it comes to closed source commercial software development I really miss the days of code ownership. When I first started working as a programmer back in the 90's, it was common for different members of the team to "own" sections of code in a larger system (obviously I don't mean "own" in the copyright sense, just in the sense of having one clear person who knows that bit of code inside and out, and who probably wrote most of it). Of course we'd (preferably) still be beholden to code review and such, and couldn't change things willy-nilly so as not to break the code of consumers of our code, but it was clear to all who to talk to if you needed some new functionality in that module.
The last few places I've worked for have been the exact opposite of this, where everything is some form of "agile", nobody "owns" anything, and stories are assigned primarily based on scheduling availability as opposed to knowledge of a certain system. There is admittedly some management benefit to this -- it's easier to treat developers as cogs that can be moved around, etc. -- but my anecdotal belief is that this sort of setup results in far worse overall code quality for a number of reasons: lots of developer cache-misses when a developer is just bouncing around in a very large code base making changes to various systems day to day; lots of breadth of understanding of the system among all the developers, but very little depth of understanding of any individual component (which makes gnarly bugs really hard to find when they inevitably occur); and what should be strongly defined APIs between systems getting really leaky (if nobody "owns" any bit of the code, it is easier to hack such leaks across the code than to define the APIs well, and when non-technical managers who interpret "agile" in their own worldview force developers to maintain or increase some specific "velocity", shit like this happens often).
Granted, there are some cases in which such defined ownership falls apart (the person who owns some system is a prima donna asshole and then everyone has to work around them in painful ways), but there were/are solutions to such cases: don't hire assholes, and if you do, fire them.
I switched teams from one with a communal codebase to one with code ownership (within the same org).
The code ownership style has far better developer productivity.
But I think the issue is not just about team organization, but also about software architecture.
I think with code ownership the modules need to be small enough that, if the owner exits the organization unexpectedly, someone else can pick up where the previous owner left off.
With communal software that gets built and tested after each change, it's very easy to let the architecture slide into a structureless mud ball that works, but nobody is quite sure how -- especially with changes being diffused throughout the system by concurrent edits from different developers.
We did code ownership, then a very important developer left, and we did pair programming for a few weeks, and then no code ownership (with people assigned tasks from different parts of the code) for a few months. Then we went back to code ownership.
I think it was a reasonable solution - better productivity, but people still know more or less where to search for something, even if John isn't available when the bug hits.
I particularly liked pair programming and would like it to happen every few months for a few days.
Technology is more complicated than even a few years ago. It can do more. It is accessible to more people (and all of their unique needs and abilities). Computers have the ability to make an almost infinite number of interconnections with other computers.
The point is that a single person can't possibly keep track of a sufficient quantity of information to direct a sufficiently complex system anymore. And with the communication and development tools available today we are able to build these complex layered solutions without always having to worry about all of the other details that we can't possibly worry about.
Well, at least until you realize that your image library has a security hole in it, or that your crypto library is so complex that it's not really possible to audit it in its current form.
It certainly is, since the author of said article contributes heavily to freebsd. You know what else is an important part of your diet, kids? Reading and thinking before speaking.
Did I misread your comment? I took it as an insult on the author, but re-reading it I can see where it could have been just a humorous observation. Could you clarify?
>>Technology is more complicated than even a few years ago. It can do more. It is accessible to more people (and all of their unique needs and abilities). Computers have the ability to make an almost infinite number of interconnections with other computers.
Infinite? Hardly. This is hyperbole and bombast. One wonders what your measure of complexity is. Hardware is not more complex than a few years ago (processor instruction pipelining and caching were being done in the 80's, for example). Languages are no more complex, although there will always be a fresh crop every year (there has been since the 60's).
The complexity I will grant is in large applications and interacting processes. But large systems have been complex since forever (50-million-LOC COBOL mainframe apps), and intercommunication today is much simpler than 10 years ago (SOAP vs REST, XML vs JSON).
So the perception of increasing complexity is due to lack of perspective, imo.
It's only got that complex because nobody has been in a place to enforce simplicity. I strongly believe that it's possible to have systems with today's capabilities without today's complexities, but I don't think you can get there from here.
FWIW, the 19000 lines of "configure" for Varnish also check whether stdlib.h exists. Perhaps it's still useful today to do so in order to avoid obscure compilation issues or to catch problems on misconfigured systems early on?
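For the curious, a header check like that boils down to trying to compile a throwaway program along these lines (autoconf names the file conftest.c; this is the gist, not the literal generated code):

```c
/* conftest.c -- the gist of a configure-time header check. If this
 * compiles, config.h gets "#define HAVE_STDLIB_H 1". */
#include <stdlib.h>

int main(void)
{
    return 0;
}
```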
As an old-timer with ~30 years of programming experience, I have similar sentiments as the author about complex projects today, yet I also often feel that too much knowledge, accumulated in sometimes cumbersome form, is being thrown away and reinvented badly. There has to be a compromise somewhere and it's no surprise that projects in an old language like C, running on evolved systems like Unix, de facto standardized on Autoconf to make it a little easier for developers. Do I want to use it myself? Certainly not, I have the luxury of being able to choose a modern language that abstracts most (not all!) platform-specific issues away at compiler installation time, at the cost of having much fewer deployment options for my code.
His observations are sound, but his blaming it on the "bazaar philosophy" doesn't really follow. The problem of unused dependencies that he points out with the Firefox port is a failure in packaging, either due to clumsy work with the port itself, or an inability to properly support soft dependencies.
I barely understand the voodoo magic behind libtool myself, but as PHK says, it "tries to hide the fact that there is no standardized way to build a shared library in Unix". I'd wager dynamic linking inherently poses such quandaries that are easier solved through kludges.
> I barely understand the voodoo magic behind libtool myself, but as PHK says, it "tries to hide the fact that there is no standardized way to build a shared library in Unix". I'd wager dynamic linking inherently poses such quandaries that are easier solved through kludges.
But that's exactly the problem. At every point we got a little more complex because someone just applied a little kludge to fix things. Over a decade or two that turns into a TON of wasted effort and complexity, because no one wants to spend the time to fix the original problem.
In a cathedral like Apple or Microsoft (or anywhere else) they can decide "This sucks, all effort now goes into fixing X" instead of patching it over and over, because they can take the long-term view. In the bazaar the only long-term view is what you yourself will do. You can't be sure anyone else will help, or will pick up where you left off if you get halfway through a big project and need help.
In the bazaar world you get things like the Linux TTY layer. It was a huge mess, and I think it still is to a degree. People put in little patches because no one wanted to take on the huge responsibility of fixing the whole thing (including maintaining compatibility). As I remember it, the maintainer quit his position so he wouldn't have to keep messing with it, and he didn't think he would have the support to get it fixed. But the problem is big and isn't sexy (like playing with STM, adding support for the 8193rd processor, or making a new filesystem), so it gets dragged along and slowly patched until it becomes a huge critical issue.
Dynamic linking isn't actually a problem in the open source world - you build your shared libraries with -fPIC -shared and link other code against them just like you would any other library, because their shared library systems were designed to behave sensibly from the start. Libtool was created for all the existing proprietary OSes with their own, often half-baked, support for shared libraries[1] and uncooperative, opaque corporate developers; it's as much an artifact of the cathedral as the bazaar[2]. To quote the documentation:
"Ideally, libtool would be a standard that would be implemented as series of extensions and modifications to existing library systems to make them work consistently. However, it is not an easy task to convince operating system developers to mend their evil ways, and people want to build shared libraries right now, even on buggy, broken, confused operating systems."
[1] Or worse, no shared library support at all. This is a major headache because linking against static libraries requires you to manually link all the libraries they depend on too; libtool handles this for you on systems with no shared library support or no native dependency tracking.
[2] Especially since it's a GNU tool and their development model has historically been more cathedral than bazaar anyway.
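To make the "just use the native toolchain" point concrete, here's roughly what the libtool-free path looks like on a GNU-ish system. The library name and flags below are the conventional ELF/GNU spelling, not anything libtool-specific:

```c
/* demo.c -- a one-function shared library. On a GNU-ish toolchain the
 * whole libtool dance reduces to roughly (commands shown as comments,
 * since this block is C, not shell):
 *
 *   cc -fPIC -shared -o libdemo.so demo.c   # build the library
 *   cc -o app app.c -L. -ldemo              # link a program against it
 *
 * "libdemo.so" and the -L./-ldemo spelling are the ELF/GNU convention;
 * other platforms name and flag things differently, which is exactly
 * the mess libtool papers over. */
int demo_add(int a, int b)
{
    return a + b;
}
```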
> I barely understand the voodoo magic behind libtool myself, but as PHK says, it "tries to hide the fact that there is no standardized way to build a shared library in Unix". I'd wager dynamic linking inherently poses such quandaries that are easier solved through kludges.
I'm not so sure. I always assumed that the problem with shared libraries was that they hit the mainstream at the height of the "Unix wars", when every vendor was trying to differentiate their own flavor with new features.
I don't think the value in the minor differences between, say, how AIX and Solaris support shared libraries is worth the headache of supporting them both. In an ideal world there would have been a standard in short order.
> I updated my laptop. I have been running the development version of FreeBSD for 18 years straight now, and compiling even my Spartan work environment from source code takes a full day, because it involves trying to make sense and architecture out of Raymond's anarchistic software bazaar.
ahh, you should've tried Plan 9, phk. 130 seconds to compile all software and libraries, 6 seconds to compile a new kernel. no bazaar there...
of course, this didn't appeal to you back then, did it? ;)
plan9 was free software in 2000 when the 3rd release was made available (and before the issues discussed in this article were encountered). this is despite RMS's objections that were parroted by clueless linux fanboys around the net.
as to practicality, plan9 was exactly as practical as phk's supposed "cathedral" unix would've been. at the very least it had 1Friggin'UI instead of many :)
but hey, thanks for the downvote. it's not like we haven't had this discussion in person with phk, our difference in opinions is well known to each other.
There are many related problems to the one pointed out by Kamp. But I'm left asking, does the cathedral scale? Does it handle evolutionary complexity well?
I'm a believer that a much simpler/cleaner set of software tools could be created. But their wide-scale adoption would be more difficult.
There are a few good quality "stalls" in every bazaar, but discovering them, and maintaining some sort of order is difficult. Every package ecosystem suffers from this.
The author is very good at pointing out the problem yet offers no solution of his own. How else do you manage a mostly opensource ecosystem and not end up with different versions of everything? We don't get an answer for that. Maybe I should write an article about articles that offer no solution to problems people know exist.
I find this objection ("he doesn't offer a solution") almost always invalid. There's value in pointing out problems even if you don't know a solution. "A problem clearly stated is a problem half solved".
If you can't convince people a problem exists (and there doesn't seem to be a consensus just browsing comments here), then it doesn't seem very fruitful to start talking about solutions.
Ah, the good ol' days when all we needed was netcat and vi, none of this fancy graphics hoohah and kids these days don't even know how to spell!! Mind rot, I tell ya! Autocomplete!!! Bah! Use a printed dictionary, you lazy brats!! In my day a web browser supported 3 kinds of images: JPG, GIF, and BITMAP, that was all we needed! ... and you had to draw it yourself with colored pencils, we were all artists ... Now it takes 3 damn hours to recompile my whole friggin' OS and this cruft and nonsense and doohickies clogging up my disk drive ... no one knows how this stuff works anymore!!! It's all gonna fall on your head, and then you'll be sorry!
"detour", to reuse the word of the last paragraph, is how things happen. Look at how we'll get our favourite programming languages in the browser: by compiling to javascript, which evolves to become a potent IR (e.g. browsers support asm.js).
Even if you look at the end result (the m4/configure/shell/fortran example) and it is indeed twisted, to honestly say it is abnormal to reach such a state is to disregard any experience developing software. Any project, even brought to life in the cathedral style, will accumulate cruft in the long run that can disappear only with effort.
I'm a bit confused. Is this supposed to apply to software development in general or just package management / software repository systems? The author is describing his ideas at a high level of abstraction but I can't seem to make a concrete connection. For example, how would one design a web app in a cathedralesque way?
I am not the author, but I think it was meant generally. My interpretation of cathedral:
- understand your whole system before you code
- enumerate the kinds of data that need to be processed differently at each point in the control flow
- look for what changes together and what changes independently and adjust the code flow so you don't need to test for the same things twice
- when requirements change - don't do workarounds and quick hacks, change the underlying mechanism
- solve problems at the level where they originated, not at the level where they were detected (for example, a null pointer is better solved by returning an empty collection than by copy-pasting if (x != null) all over the place; see the sketch after this list)
- don't treat anything like a black box when you need to change something - if you never change things inside black boxes you will end up with a lot of nested black boxes doing the same thing over and over, just slightly differently - this is wrong, no matter what OO theory says about encapsulation
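Here's the sketch mentioned above, in C; all the names and types are hypothetical:

```c
/* "Return an empty collection, not NULL": callers can iterate
 * unconditionally, with no if (x != NULL) scattered around. */
#include <stddef.h>

struct int_list {
    size_t len;
    const int *items;
};

static const struct int_list empty_list = { 0, NULL };

/* Always returns a valid list -- possibly empty, never NULL. */
const struct int_list *groups_for_user(int user_id)
{
    if (user_id < 0)
        return &empty_list;      /* unknown user: empty, not NULL */
    /* ... real lookup would go here; fall back to empty for the demo. */
    return &empty_list;
}
```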
If a better build system were as easy as Kamp seems to think, he could have written it by now with 1/3 of the effort he's spent complaining about build systems over the years. Turns out the problems really are hard, and the layers of cruft are there for a reason.
Part of the objection is Firefox has 122 dependencies, and some of those need Perl, some need Python, and some need both.
If developers are writing code (or build scripts) in the wrong languages, you can't fix that with a build system - you have to convince the people doing it to change their ways.
In a cathedral, hypothetically that could be an edict from the architect. In a bazaar, you find a soapbox to stand on and shout as loud as you can.
> Turns out the problems really are hard, and the layers of cruft are there for a reason.
Yes to the first part, no to the second. The layers of cruft are there because different problems were solved. Problems that existed because of attempts to work around the actual problem. If the actual hard problem had been solved, no cruft would have been needed.
The alternative to a bazaar is not a cathedral. It's a few huts in the parking lot of a shopping mall -- a shopping mall that is superficially neater, but becomes abandoned when the economy shifts.
Definitely going to look for "The Design of Design"; Fred Brooks is a master.
Though I am relatively young - one of those clueless hacks who arrived at this industry when the dot com party was on its last legs - I can see the blight. Danced with the devil a couple of times, so to speak. It is really depressing to confirm that there's no greener pasture to be had anywhere.
"A million of replaceable moneys... building an ever collapsing tower... exploring implementation space..." [1] sounds an awful lot like my career so far.
[1] Not the author's words, but a comment in the original post.
So, a really great example of this is over in Node land. I was trying to install some basic boilerplate HAPI demo scaffolding, and I watched with horror as dependencies were pulled in (as third or fourth level deps!): isarray, isstring, three different versions of underscore and lodash, and so on and so forth.
I've never seen developers so lazy or just uneducated about their own language that they blatantly pull in libraries for such trivial operations. On the server even; no excuse about compatibility!
This is by design. The node community actively encourages you to break every module into the smallest possible pieces, then make a parent module which reassembles them. This is seen as good, modular design - you end up with small pieces which can be individually tested, patched, understood, documented and maintained. Some extra effort gets pushed to developers in that when you want to do something simple, you have to trawl through dozens of small libraries which will all accomplish the goal. But those small libraries should each be readable and understandable - unlike the mammoth size of most C/C++ libraries.
I asked isaacs about documentation on the way back from nodeconf this year (isaacs is the CEO of npm and a former maintainer of nodejs). I asked him what he prefers when it comes to documentation of nodejs packages. He said that if your package needs anything larger than a readme, it's probably too big and you should just break it up.
And again, for browser stuff, sure--but this is a controlled runtime on the server. This shouldn't even be a question.
What I've also noticed is the tendency to use lodash and underscore in places where there are completely reasonable standard language features -- _.forEach() where the built-in Array forEach would do, for example.
Again, the Javascript language is a bit weird, and lacking perhaps in a lot of functionality that a default Ruby installation (say) might get you, but this is just silly.
Javascript is weird, and the libraries you're talking about are all about trying to make it sane. The problem is they're libraries, with potential breaking changes between versions. You want to be sure your code runs? You pull in the exact version you wrote it against.
And by and large, this doesn't really get better with more centralization unless the centralization is good: if something doesn't work, then you'll still get 50 different wrappers which fix it in slightly different ways. Because you have to.
What I don't understand is the justification for supporting more than one version of a single library at a time. That just seems crazy to me, and an utter pain for maintenance.