The net result of all this is that we have data centres occupying many hectares, filled with computers that are architecturally identical to a Packard Bell 486 desktop running MS-DOS long enough to boot a crippled and amateurish clone of Unix circa 1987, and an endless array of complex, kludged-up hardware and software components intended to hide all of that. I would ask for a show of hands from those who consider this progress, but too many are happily riding the gravy train this abomination enables.
Woah... that statement stung, mostly because I agree with it and realize that I'm one of the people he refers to in his post.
>data centres occupying many hectares, filled with computers that are architecturally identical to a Packard Bell 486 desktop running MS-DOS long enough to boot a crippled and amateurish clone of Unix circa 1987, and an endless array of complex, kludged-up hardware and software components intended to hide all of that. I would ask for a show of hands from those who consider this progress,
That's definitely huge progress. Successfully building a coherent computer the size of many hectares out of commodity components... And the fact that it doesn't matter whether it's a 486 desktop or something else is among the great achievements of today. Keith and his like are stuck in the past, when the content/architecture of that single box was important, and thus they (and their whole ecosystem) were limited, physically and mentally, by the biggest box one could build (like a SunFire 10K). "The network is the computer," as Keith's and my common former employer used to say. While we may not be there yet, today we already have "AWS is the computer", "FB's private datacenters are the computer", etc...
Not everything Sun made was big iron; even their small soldiers (Netra, SunBlade, etc.) used many of the same technologies (OpenBoot PROM, etc.) that made their big iron great. And their hardware and software documentation was second to none.
A network of good boxes is still better than a network of IBM BIOS-booting junk with weak telemetry, non-ECC RAM, and hastily engineered I/O controllers.
>A network of good boxes is still better than a network of IBM BIOS-booting junk with weak telemetry, non-ECC RAM, and hastily engineered I/O controllers.
In a magic world of unlimited resources - yes. Unfortunately, engineering is a discipline that deals with constraints, and Sun failed to get that. 20 boxes of that "junk" for the same money as 1 "good" box - tough choice, isn't it? :)
And by the way, wrt. "good" vs. "junk": 10 years ago, Sun hardware - the motherboards, layout, soldering, the architecture/buses - was already behind the commodity x86/x64 stuff produced by second-rate Taiwanese manufacturers.
>Not everything Sun made was big iron; even their small soldiers (Netra, SunBlade etc) used many of the same technologies (Open Boot PROM, etc.) that made their big-iron great.
That is kind of the root of the problem - even the smaller stuff was, as a result, priced toward big iron, i.e. it had bad price/performance.
One of his central criticisms is that the software stack is ossifying upward. Yet when a project tried to simplify a layer of the stack, as Go did with its toolchain, he criticized it for being NIH. I guess the problem is that everyone has their own pet concern which they will vehemently defend (one of mine is accessibility in GUI toolkits), and thus things will never get simpler, at least in mainstream systems that are more or less designed by committee.
Regarding UEFI, I wonder if Joyent has enough clout that it could push the relevant suppliers to support coreboot. Edit: I guess the answer is no, since they couldn't interest anyone in Dogpatch.
From a Sun/Illumos aficionado's perspective, has GNU/Linux always been amateur, or is that more recent, e.g. with the intrusion of D-Bus and systemd into servers?
Assuming the last question is earnest: while I think "amateur" is a tad strong, I do think that Linux suffers enormously from NIH; it remains the first system (or first Unix system, anyway) for many of its contributors, who seem to know no other way -- and this shows up all over the system.[1]
In terms of how this has changed over time: if anything, it used to be much, much worse. For example, Linus' attitude on kernel debugging circa 2000[2] is not merely amateurish, and not merely irresponsible, but actively dangerous. While his attitudes have softened over the years, there still remain striking differences of engineering principle between Linux and other systems -- of which systemd is merely (in my opinion) one particularly egregious manifestation.
The question was earnest, and thanks for linking to that informative, entertaining talk. I currently run GNU/Linux on servers (mostly virtual servers) for pragmatic reasons, but I'm always interested in learning about alternatives and different approaches.
In the talk, you said that you and the Illumos community believe that the kernel and the system libraries (e.g. libc) should be inseparable, whereas Linux places a sharp boundary between the two. Why do you believe the Linux approach is suboptimal (except that it makes it easier for you to emulate Linux)? Is there any existing writing on this topic that I can read?
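To make concrete what I mean by the boundary: on Linux, the kernel's syscall ABI is the stable contract and libc is just one replaceable consumer of it. Here's a minimal sketch of my own (not something from the talk) showing the two paths to the same syscall; getpid and syscall(2) are real interfaces, the program is purely illustrative:

    /* My own illustration: on Linux a program can call through the
     * libc wrapper or invoke the kernel's stable syscall ABI directly;
     * on illumos, as I understand it, libc is the stable interface and
     * raw syscall numbers are private to the OS. */
    #define _GNU_SOURCE
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/syscall.h>

    int main(void) {
        pid_t via_libc = getpid();            /* through the libc wrapper */
        long  via_raw  = syscall(SYS_getpid); /* straight to the kernel ABI */
        printf("libc: %ld, raw: %ld\n", (long)via_libc, via_raw);
        return 0;
    }

Both calls print the same pid on Linux; the fact that the second path is supported at all is the sharp boundary I'm asking about.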
That's an excellent question, and no, I don't think that we have historically expanded on it. It merits an explanation though -- and it strikes me that it may be worth explaining our engineering ethos in some detail. Would that be of interest? Perhaps I could use the SMF example as the motivating example of systems thinking...
Oh, I don't blame Torvalds -- but I do think it's coming out of that same NIH culture that seems deliberately unwilling to look beyond their own system.
As for illumos, we already have SMF. I have often said that the only thing worse than SMF is not having SMF -- and at some point I intend to tell the full story there: it wasn't always pretty, and it actually makes for an interesting comparison with the systemd adventure.
Given the other systems systemd has enveloped, like udev, and the insanities they've pushed into kernel space, like kdbus, I'd blame quite a bit of that madness on systemd. Systemd is a symptom of the pervasive lack of engineering hygiene in the Linux world that has pushed me away from Linux and makes me dread having to work with an OS I used to rather enjoy.
That's just one symptom of the disease. Linux is polluted with silly pseudo-filesystems for systems tuning, its documentation is poor, and, after some time in the BSDs, its userland feels inferior. Unfortunately, the baby got killed with the poison a few groups of people put in the bathwater.
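To illustrate the pseudo-filesystem point for anyone who hasn't dealt with it: kernel tunables on Linux are exposed as magic files under /proc/sys, so "tuning" is just file I/O. A sketch of my own (vm/swappiness is a real knob; the program is hypothetical and needs root):

    /* Illustrative sketch: Linux exposes kernel tunables as files
     * under /proc/sys, so tuning is plain open/write.
     * Equivalent to: sysctl vm.swappiness=10 */
    #include <stdio.h>

    int main(void) {
        FILE *f = fopen("/proc/sys/vm/swappiness", "w");
        if (f == NULL) {
            perror("fopen");
            return 1;
        }
        fputs("10\n", f);
        fclose(f);
        return 0;
    }

Contrast with the BSDs, where sysctl(3) is a first-class library interface rather than a filesystem grafted onto the kernel.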
I may be a bit biased -- the Bell Labs guys are major influences of mine -- but I think the criticism of Pike et al. is unfair. I understood Pike's "systems research is irrelevant" presentation to be a tongue-in-cheek observation about the decline of systems research, not a statement that OS research is dead and we shouldn't work on it, but maybe I'm mistaken.
I don't consider Plan 9 a failure, at least in the engineering sense. Many ideas pioneered in Plan 9 influenced developments in the Linux kernel. It actually seems contradictory to me that Keith spends so much time flaming the community for not trying anything new, then berates researchers like Pike for doing exactly that.
Meanwhile, everyone wants to rave about OpenStack, a software particle-board designed by committee, consisting of floor sweepings and a lot of toxic glue. Not surprisingly, since it’s not a proper system, few OpenStack deployments exceed a dozen nodes. And in this, the world falls all over itself to “invest” billions of dollars.
I spent 18 months deploying OpenStack inside a huge, pretty broken company.
OpenStack is a disaster, because so many of the people building it have never used AWS and are instead trying to build "VMware, but for free." Object storage is such an afterthought that there are regular rumblings that Swift shouldn't even be part of core OpenStack. The six-month release cadence keeps delivering broken software. The releases are slushy enough that configuration options change mid-release, so you can't perform minor upgrades without potentially breaking something.
The big companies backing OpenStack are using it as a cloud story they can present to Wall Street, or as a bargaining chip to get better terms from VMware.
Most of the OpenStack startups have been acqui-hired, and life has moved on.
Depends on your perspective. If you think it's a matter of grabbing source or pre-compiled bits and deploying them, like, say, Linux, you are going to be in for a rude awakening. It's not easy; there are lots of architectural and configuration items to consider. For all intents and purposes you are attempting to build AWS or GCE from scratch, and it's going to be complicated. OTOH, if you consider exactly what you are trying to solve for, pick a supported distro and go with that - the big players all focused on different verticals from the outset, and their strengths map to those niches. Many of them are pretty solid from an architecture, implementation, and support perspective. But in no way think that OpenStack is a replacement for VMware... it just isn't.
For those wondering "Who is this guy?" - you can look up his engineering qualifications and contributions elsewhere. My experience of him personally, over two years spent looking across our desks at each other, is that his intelligence, integrity, and empathy are equaled by few in this industry. His retirement is a great loss to tech. As a friend, I wish him well in whatever he wants to do with his life.
If true, it's perhaps a bit sad that he's so willingly gone out - with this blog post, and the one about Go - in a way that paints himself as a total asshole. It's all the contact most of us will ever have had with him.
Thinking about this some more, I suppose the take-away for software engineers who don't want to retire young is this: Kill any sense of aesthetics you may have; always back the platform or technology that appears to be winning, regardless of what you feel its merits are; never hope for the underdog to win; worse is better; either be a mercenary, or look for ways to do some good within the broken system without trying to fix the parts you can't change. Is that too negative a conclusion?
I find screeds from engineers about how broken everything is to be directly at odds with the observable truth of vastly more people with less technical expertise having vastly more access to technology that provides them value. I think the take-away is to get some perspective and stop caring so much about the state of "the system" and just try to create things that are valuable to people.
That's a cynical view of the workplace that you could easily generalise to any institution or marketplace: the good guys never win, it's all about kissing a$$, sell anything that's profitable, milk it until it's dry, etc. etc.
No market is perfect, and the IT market even less so, dominated as it is by a handful of US companies riding an Asian supply chain fundamentally incapable of creative disruption (or unwilling to attempt it). But it's still more dynamic and open to innovative players than the car market or almost any other market you could mention. The parable of Apple alone should be taken as proof that "it can be done"... Just because some segments seem to be more conservative than others, it doesn't mean the whole sector is doomed; it's just a sign of consolidation in an industry that is barely 50 years old and still very much wild compared to any other sector out there.
At home I use a BBC Micro, an Atari ST, Minix, and DragonFly (which is written by a former Amiga guy). There is no pleasure to be had in modern mainstream OSes anymore. Live to work, or work to live?
I'm glad I got the chance to read this article. As a relative newcomer to computing (and I'd like to assert that just about 90% of HN are relative newcomers), I value the input of an old fogey "telling it like it is" (how he thinks it is, more accurately).
I don't do much systems programming, but I mean to do more, and I'm a little disappointed that what I thought was a good approach to a systems language (golang) is met with such derision by him (someone who seems to truly have systems experience). But I'm excited to read some of his more technical articles and find out why that is.
While I take his opinions (as that's what they are) with salt, I think he's definitely got a point in saying that the industry has become cozy/complacent. Many of the interesting things that everyone is re-discovering/clamouring about today have been around since the 50s/60s/70s -- I can never shake the feeling that I could be doing something more ambitious, more revolutionary, on pretty much every project I work on. But maybe that's just me.
An alternate explanation of his frustration/bitterness is that, the way the market forces have worked out, there hasn't been a good business case to create a new "systems company" or do groundbreaking new systems work. About a decade ago, before ZFS was announced, I participated in a company-wide investigation of whether there was a viable business case to invest in new file system features. A very large number of very experienced technical people were involved: distinguished engineers with far more experience at the company than I had, business executives, etc. We explored a large number of file system enhancements, how much the development effort would cost, estimates of how many customers each would cause to switch from their existing legacy Unix systems, etc. The result of this exploration was the decision that there was no valid business case for adding new features at the file system level, as opposed to other layers of the storage stack (e.g., storage arrays, enterprise databases, etc.).
After this company-wide technical study was completed, I stopped doing any ext4 work as part of my day job; it was no longer my primary job responsibility (although to be fair my employer did let me do some ext4 work on the side during business hours, so long as it didn't interfere with my primary work), and it remained that way until I started working at Google to deploy ext4 across all of Google's production servers.
Anyway, going back to my pre-Google days: there was some concern about whether the conclusions of this study were valid after Sun announced ZFS, started using it in their marketing, and started pushing it on social media. This resulted in a behind-the-scenes effort to launch btrfs, but the various companies involved never assigned enough headcount for it to be successful. And despite these concerns, Solaris didn't really seem to gain significant market share versus Linux and the other proprietary Unix systems. In fact, the opposite was true.
Perhaps this means that Sun was brave where other companies feared to tread --- but given that (a) various ex-Sun employees have stated that a lot of Solaris engineering work was done before consulting the marketing or sales "tribes", and (b) ultimately, Sun wasn't able to succeed in the marketplace --- this leads me to propose that pouring all of that engineering effort into ZFS and DTrace (while both are amazing technologies, I will be the first to acknowledge) perhaps wasn't a good product/market fit, and that a VC who looked at things from a very cold-hearted profit-and-loss perspective would never have allowed a lot of the Solaris engineering work to go on as long as it did unless there was a demonstrated way it could somehow be monetized.
These days, when I try to propose new ext4 work that will actually be supported by company resources (as opposed to hobby work done for fun in off-hours), I have to take into account how much headcount (if any) I can get assigned, and there are often some very hard deadlines, which means I have to make compromises in the technical design. Of course, I try to make sure the "minimum viable product" is extensible enough that I can go back later and make it better, but this is a very different way of doing systems engineering compared to some of the histories I've seen in presentations about ZFS and DTrace --- where some of the former Solaris engineers were proud that they didn't ask permission or get authorization from management before committing very large amounts of engineering resources, and proud of the fact that they were striving first for technical perfection.
Some former Sun engineers have in the past made some very disdainful comments about Linux: "By amateurs, for amateurs". This ignores the fact that most Linux engineers are in fact paid to work on Linux; but at the same time, it's also true that many of us who are paid to work on Linux have to respect real-world economic considerations (which sometimes conflict with technical excellence for its own sake) --- and very often the work to make things better than the minimum necessary to hit a feature-freeze deadline is in fact done because we are amateurs in the original sense of the word --- lovers of the art.
But if all of that means that "amateurs" have been able to make an OS like Linux widely used --- an OS which at this point has at least an order of magnitude more paid engineers working on it than Solaris/OpenSolaris, which ex-Sun engineers consider a "professional" OS --- then it is perhaps not that surprising that someone such as the OP feels the bitterness and frustration that was in that blog entry.
Just about everything said here about "Solaris engineering" is false. For example:
> Perhaps this means that Sun was brave where other companies feared to tread --- but given that (a) various ex-Sun employees have stated that a lot of Solaris engineering work was done before consulting the marketing or sales "tribes", and (b) ultimately, Sun wasn't able to succeed in the marketplace --- this leads me to propose that pouring all of that engineering effort into ZFS and DTrace (while both are amazing technologies, I will be the first to acknowledge) perhaps wasn't a good product/market fit, and that a VC who looked at things from a very cold-hearted profit-and-loss perspective would never have allowed a lot of the Solaris engineering work to go on as long as it did unless there was a demonstrated way it could somehow be monetized.
Ignoring the terrible reasoning (correlation does not imply causation) and the child-like view of venture capital, this is wrong on the facts: the ZFS and (especially) DTrace teams were tiny; the work that we did was with management's blessing; the work that we did was a direct result of our experiences with customers; and (most importantly, from my perspective), the work paid for itself many, many times over. In fact, this work had paid for itself many times over even before 2006 when the core members of the DTrace and ZFS teams started a new effort inside of Sun to build a new line of products based on the two technologies.[1] The resulting product -- which we had initially projected to be a $1 billion business within five years -- paid for its entire effort in the first fifteen days of shipping product in late 2008, and went on to exceed those initial projections (acquisition by Oracle notwithstanding). It should also be mentioned that Keith was a key engineer on this effort -- and its outsized commercial success was actually a triumph of systems thinking, not an example of its defeat. So yes, Sun failed -- but it was because it ultimately heeded systems thinking too little, not too much.[2]
So how many total people were actually involved, over how many years? I probably shouldn't have said "engineering resources" --- what I meant was total headcount: technical writers, people to do the userspace tools, test engineers, engineers to do the performance tuning, etc. I mean the total headcount to get a fully functional, enterprise-ready production file system. What was the total number of person years?
And when you say the new line of products "paid for itself" in 15 days of sales, were you talking about revenue, or profit? And did you subtract out not just the BOM costs, but also the costs of designing the new hardware, the commissions for the sales people / sales channel, the support costs (both the help desk and the sales engineers), etc.? Given that I suspect (from research into how many PYs other companies have needed to get a fully production-ready, enterprise-quality file system) it was at least 100 person-years, and knowing what the typical profit margin on hardware tends to be, even for a high-margin NAS box intended to compete with NetApp, please forgive me if I find it a little hard to believe your numbers.
Based on your questions, you obviously didn't know Sun or its engineering culture very well. The company's best work was done by outrageously small teams -- and its most regrettable work by absurdly large ones.
Speaking for DTrace, everything was done by the three of us on the core team: we wrote 100% of the technical documentation (somewhat infamously, actually), we wrote the test suite, we wrote the test cases, we ran the alpha program, etc. I know it doesn't fit your narrative very well for DTrace to have been done by three people over the course of 23 months[1][2] -- especially when the Linux efforts to mimic it have floundered for so much longer than that -- but that's what it is.
ZFS was larger than DTrace, but not by as much as you might think -- and it was very small (as in, only Jeff and Matt) for a very long time. (The history there is public, and I don't particularly feel like doing your homework for you.)
As for your question about paying for ourselves: I am (of course!) subtracting out COGS. Indeed, the product line was so immediately successful and profitable that it attracted the attention of one of its biggest customers as a potential acquirer -- but that's another story. Again, I know that it doesn't fit your narrative to have had small teams develop technologies that were both innovative and profitable, but that's what it is: ZFS and DTrace and Fishworks were examples of Sun at its best, not its worst -- and I don't think that that's a terribly controversial opinion...
> ... leads me to propose that pouring all of that engineering effort into ZFS and DTrace (while both are amazing technologies, I will be the first to acknowledge) perhaps wasn't a good product/market fit ...
My understanding is that neither ZFS nor DTrace were the products of large engineering teams.
And the market is enthusiastically receptive to these products. If they didn't drive adequate additional hardware sales for Sun, it might be important to remember that Sun didn't charge for them...
Regardless, I think you'd be seriously misunderstanding things to believe that engineering over-commitment contributed in any way to Sun's death.
A big mistake for a newcomer to any field is to put too much stock in people's opinions, particularly those expressed disrespectfully. It's a common technique among the programming crowd to vocally disparage other people to make yourself seem smarter than you are. This article is one of those.
Definitely true, but for someone to be this mournful about the state of the industry is usually indicative of at least a LITTLE problem.
I'm acutely aware that it's his personal opinion, I am aware of the pitfalls of bashing-culture (I've fallen into it many times when looking at legacy code), and I didn't uninstall linux and switch to illumos. In this case, I think he has a point (though he states it pretty extremely).
At the very least, it's refreshing to read an article every once in a while that isn't simply praising the established platforms of today, or pointing out a small shortcoming (usually one that the author fixed and decided to write a blog post about).
I'm currently reading his golang post to see why he thinks this way and whether he makes any solid points that sit well with me. I value the dissent from the popular zeitgeist (not so much how disrespectful he was about it).
Its basic point is that you should be using the "gccgo" implementation of Go, not the "gc" one, which is badly NIH'ed.
He refrains from pointing out any of the obvious issues with the Go language itself since that is not his area of expertise, so if you're coming from a language background then you won't find the post interesting.
I come from the background of having worked on the Go project for five years. He really doesn't see the trade-offs we made when choosing to base the core Go implementation on the Plan 9 toolchain, and seems oblivious to the work already done to alleviate some of his complaints. At best, a poorly researched article with no actual insight; at worst, a thinly veiled personal attack on the Go authors.
> I don't do much systems programming, but mean to do more, and I'm a little disappointed that what I thought was a good approach at a systems language (golang) is met with such derision by him…
His golang article was aimed at the low-level compiler implementation, not at the language itself. In particular, he thought that gccgo was the right way to implement the language since it builds on top of existing infrastructure and doesn't try to reinvent it (poorly, in his opinion).
> I’m not going to do the thing where I try to name everyone
> I’ve enjoyed working with, or even everyone who has helped
> me out over the years.
As someone who's worked with Keith for a few months, I'm going to go ahead and say that he's definitely someone I enjoyed working with. Opinionated? Yes. Very smart? Also yes.
On one hand, he's opposed to the contemporary desktop Linux philosophy, which tries (poorly) to imitate the monolithic stacks of OS X and Windows; but he also brutally lambasts the innovations behind Plan 9, as apparently, for all of his derision towards modern computing, he finds starting from first principles to be anathema for whatever reason.
He doesn't give his proposed solution, but he hints at Solaris/illumos. So... a System V Unix. A well-designed, modern SysV Unix with facilities that other Unices haven't implemented as well, that is certain, but as a result his complaint that our hardware and software are stuck in past decades is ironic.
Who is this guy? I was under the impression this was the author of DTrace, due to the domain name. That seems wrong; he's not listed as a DTrace author on Wikipedia.
It was actually pretty interesting -- terrifying, even.[1][2] Not that I disagree with your assertion about zones, of course -- and note that we are (very) actively working on a Linux compatibility layer for illumos zones.[3]
The first Linux container implementation to my knowledge was linux-vserver, which shipped in 1998. Solaris Zones shipped in 2004. So, yeah, like... forever.
VServer and Jails/Zones are not really the same thing, IIRC, and 1.0 was around 2004 too anyway, while still not offering equivalent functionality.
If I remember incorrectly, I stand corrected on the timeline! I've got no problem with that.
But my point remains: even containers are "old" news when speaking of "exciting OS things".
Phew -- thank you! (And it's "Keith Wesolowski", for whatever it's worth.) Personally, I would title this "A young software engineer retires", but I'm 0-for-1 on HN titles today.[1]