Systems Software Research is Irrelevant (2000) [pdf] (cat-v.org)
64 points by panic 3 months ago | 29 comments

And then everything changed. System software became about managing large numbers of machines. Good new languages were developed. Interprocess communication finally became mainstream, even if too much of it was over HTTP. The whole Javascript ecosystem appeared.

Still too much Linux, though. Things running in containers or in VMs need far less of an OS, and a lot less OS state.

I think most of the observations of that talk still stand.

> System software became about managing large numbers of machines.

Yes, but is academic research on these topics relevant? Most of the progress on infrastructure software has come from internet companies trying to cope with their big-data problems. The people who implemented these systems all received their PhDs from academic CS institutions and used bits and pieces of existing distributed-systems research, but they did their innovative work in the context of commercial companies, not academic CS departments.

> Good new languages were developed.

And what are they? A quick look at https://tiobe.com/tiobe-index/ (for lack of a better resource) corroborates the thesis that the bulk of software is developed in the same old boring languages. Yes, they have evolved, but not much. Golang is, well, not exactly innovative. Rust still has a long way to go before it can be called a mainstream language.

> Things running in containers or in VMs need far less of an OS, and a lot less OS state.

Unikernels exist, but is there a compelling non-niche use case for them? Docker acquired Unikernel Systems, and this is all we've got so far: https://blog.docker.com/2016/05/docker-unikernels-open-sourc...

These are all great examples confirming the thesis that producing good, relevant systems research is hard. On the one hand, we are piling up gross inefficiencies on top of decades-old technology, so improving the current state of affairs should be easy, right? On the other, software systems are inherently very open systems with a lot of stakeholders, so any kind of successful "clean-slate redesign" is almost unthinkable.

> The people who implemented these systems all received their PhDs from academic CS institutions and used bits and pieces of existing distributed-systems research, but they did their innovative work in the context of commercial companies, not academic CS departments.

Three thoughts.

First, the title of the talk is "Systems Software Research is Irrelevant", not "Systems Building in Academic Departments is Irrelevant". For example, the talk discusses Plan 9. The distinction the talk makes, I think, is between commercial products and R&D, not between industry and academia.

In that sense, I think the systems out of Google (MapReduce up through TensorFlow) are good examples of that trend reversing.

Second, industry R&D has almost always led the charge on developing large systems in CS. That's not new.

Third, as you noted, "[many of] the people who implemented [and, more importantly, led the design of] these systems all received their PhDs from academic CS institutions and used bits and pieces of existing distributed-systems research". One role of CS academic research is to build foundational ideas and then crank out competent researchers who are able to build real systems/algorithms/companies on top of those ideas. Just because TensorFlow was developed at Google rather than UW doesn't mean that academic research is now irrelevant.

> And what are they?

There's a lot of OCaml, Haskell, and Scala code in the world. C# and the entire .NET family are a veritable treasure trove of academic PL ideas making it into production languages.

Good comments. They got me thinking: what exactly is "research"? I don't know what distinction Rob makes, but for me research software is software written with the main goal of publishing a paper. In this sense MapReduce is emphatically not research software - it was first deployed into production and only then was there a paper (with the production deployment serving as validation). A good counterexample would be not MapReduce but Stonebraker's C-Store, which was then commercialized as Vertica.

> One role of CS academic research is to build foundational ideas and then crank out competent researchers who are able to build real systems/algorithms/companies on top of those ideas.

The educational role is very important, but do PhD graduates build on top of their research? Instead, it seems, they learn the foundations (which were "research" a few decades ago), do a PhD project on some very specialized and inconsequential thing, and then go off to do real stuff at commercial companies.

To use an example from another field - nothing would have seemed more specialized and inconsequential than clustered regularly interspaced short palindromic repeats in bacterial DNA. But then that turned into CRISPR. Working on obscure, niche stuff is how you actually contribute to science.

Obviously I don't know much about biology, but are regular genome patterns with unclear functions such a common occurrence that they are seen as inconsequential?

> Working on obscure, niche stuff is how you actually contribute to science.

That's what they tell you (my favorite example, BTW, is conic sections - obscure and niche for millennia before it became known that they describe the orbits of planets). But is the deluge of highly specialized, obscure papers really the consequence of the free play of scientists' minds pursuing their own interests? Or is it more a consequence of the sheer number of PhD candidates and postdocs who each need to achieve sufficiently novel results on a reasonable time scale with a fairly certain chance of success (objectives that are obviously in conflict)?

Of course ground-breaking research can grow out of niche results. But for growth to happen someone should build upon and improve upon these results. It is difficult to build upon a research prototype that works barely enough to register a minimal improvement to some metric and is thrown out afterwards.

> someone should build upon and improve upon these results.

I'm also not from a biology background - but I remember from doing philosophy, it was much more useful to find a paper that made some trivial advance in a very specialized area that you were interested in, than ten 'general' papers. I mean, if you're building something, and somebody has written a paper that addresses a part of your domain, it's gold dust - even if it's generally too niche for anybody to bother reading, even if it's substandard work.

You're right that there are some kinds of perverse incentives in the hothouse production of PhD theses, but I think, in general, people should embrace the triviality and irrelevance of scientific work. Alchemy set out to answer the big questions of eternal life and gold from lead, and ended up answering nothing. Science set out to answer questions like: how are colours in flower petals passed through generations? If you look at the history of 'big questions', it's way less illustrious than that of small, boring ones.

Apache Spark and Mesos both came out of academic research, and both are quite popular in industry, so those are two counterexamples to your claim.

Of course there are counterexamples. Yours are good ones (although Google had a system comparable to Mesos much earlier, they just didn't want to publish a paper). Some others that I mentioned in another post are DBMS projects by Stonebraker. BTW, both Spark and Mesos seem to come from the same place (UC Berkeley), maybe they are doing something seriously right there?

Berkeley has a somewhat unique model in that our systems lab partners with industry to discover their pain points. Our research hence often solves an issue faced by industry, making it more relevant for practitioners. The current iteration of this lab is the RISE Lab.

I took a brief look at Redox, which isn't a super serious project, and noticed that at that point it was only set up to run in a virtual machine. No actual hardware supported. I'm sure this was for expediency in their case...

But then I thought: why not? It seems kind of antiquated for a server OS to worry about actual hardware in this day and age. Funny how things turn out.

That's pretty much how every OS is developed at least since QEMU was made. Why would you risk wrecking actual hardware with your toy OS and go through a code->compile->install->reboot cycle to test?

Yeah, it's a lot of work to make a HAL with support for all desirable platforms. Just use Linux with KVM, and the virtualized devices become the HAL, with best-in-class hardware support already included.
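As a sketch of that workflow (the kernel image name and the exact QEMU flags here are illustrative, not any particular project's setup), the whole inner loop is just rebuild-and-reboot a VM:

```shell
# Rebuild the toy kernel, then boot it directly in QEMU instead of on
# real hardware: no bootloader install, no reboot, no risk to the machine.
make kernel.bin

# -kernel loads the image directly; -serial stdio mirrors the kernel's
# serial console to the terminal; -enable-kvm uses hardware
# virtualization when available; -display none keeps it headless.
qemu-system-x86_64 -enable-kvm -m 256 \
    -kernel kernel.bin -serial stdio -display none
```

Crash the kernel and you just re-run the command; the cycle takes seconds instead of a reboot of physical hardware.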

Consider this paper as the context for the creation of Golang, Kubernetes, MapReduce, Docker, Android, Dalvik, Chrome, V8, TraceMonkey, Splunk, paravirtualization (first as RTLinux and User-Mode Linux, then as Xen), AWS, systemd, and IPython clustering, among other things. Systems software research became a lot more relevant.

He's still mostly right about academia - over half the projects in your list are by Google. The best many universities can offer in the area of CS is to be a training school for the places where the real work is done.

I hear lots of anecdotes from colleagues about academic research conferences in certain areas like "big data" where people excitedly talk about "new" things that industry did years ago but didn't publish in an academic journal. And then there are the people claiming they're doing "big data" because there are slightly over a million rows in their MySQL database.

In that paper, Pike wasn't talking about academia (primarily or exclusively) in the first place. I think when he wrote that paper he had spent his entire career at Bell Labs. He was lamenting the decline of the systems-research praxis that he, and his lab, had been such an important part of for so many years. There's an excellent reason he's working at Google now.

Most of the "industry" work on compilers and graphics wouldn't be possible without the actual research that was done by the universities.

Yes, the industry does pump a lot of money into research, but the people doing the real work are mostly doing an MSc or PhD at university XYZ.

So it goes both ways.

Just to cite the two areas that are my main hobbies when doing computing-related stuff outside boring enterprise work.

As for OSes, the universities seem to be the only place left, as all major vendors are very careful about anything that doesn't bring short-term profits, e.g. Midori.

All the state of the art in Big Data is from universities: Spark from UC Berkeley, Flink from TU Berlin, HopsFS from KTH Stockholm.

I think the problem is that current OSes are too complicated. What does a common user really want? I'd say text editing, making presentations, multimedia, and browsing. While these sound like simple things, text editing usually happens in a Word-like format, which - from what I've heard - is ridiculously complex. The same thing holds for browsing (simple browsers don't exist anymore, and many use the same back-end for rendering). For multimedia, USB is very important, and it is a very complicated protocol to use correctly. Then there are tens of codecs and formats for multimedia, each with their own idiosyncrasies and ambiguities in the specs (or deviations from the spec that are so common that they are expected nowadays).

The case is even worse when your OS is supposed to run on different hardware. I think this is the main reason that Apple, in general, has a much better UX: They only have to support a limited set of hardware.

Casey Muratori has a great rant about this: https://www.youtube.com/watch?v=kZRE7HIO3vk

> I think this is the main reason that Apple, in general, has a much better UX: They only have to support a limited set of hardware.

People assume this to be the case without analyzing what hardware actually needs to be supported by operating systems these days. For internal components, most of the stuff in a PC is pretty standardized: AHCI and NVMe for storage, EHCI and XHCI for USB, HD Audio for sound (and it's almost always a Realtek codec). Apple doesn't have any significant advantage in any of those areas. They still have to support two out of three GPU vendors, the top two or three NIC vendors, and maintain a driver infrastructure that supports loading third-party drivers for any of the above when there are exceptions. (Sure, the expandable Mac Pro may be long dead, but Thunderbolt docks and enclosures enable the same variety of components.)

Apple's real advantage seems to be that they don't have to try to work around anybody else's broken motherboard firmware. Microsoft could take the same stance and declare it to be Lenovo's problem if your Thinkpad's power management is broken. Microsoft probably should have changed strategy there when the industry switched to UEFI.

Devices have bugs, corner cases, and incomplete implementations of standards, and the standards themselves can be ambiguous. The advantage isn't that Apple has to implement fewer interfaces; it's that they have to test with fewer devices.

I’m sure on paper Linux works with all my devices. In practice it does not.

That’s the gap Apple close with their hardware.

I agree.

I think the lack of new operating systems since the late '90s can be (partially) attributed to the simple fact that the modern computing world is so fully developed: we have computers for everything and enjoy all the state-of-the-art features and functionality. You already have, well, all the things you have on your personal computer. Desktop apps are getting crappier and crappier as always (which is another story), but some parts of the underlying infrastructure are created and only understood by the best developers in the world (high-performance TCP stacks, 802.11n, multimedia, 3D graphics, TLSv1.3, PCIe, USB 3.0, virtualization, exploit mitigation, web browser engines, C10K-capable web servers). By creating a new system, you lose everything immediately and render the computer less useful, making it less attractive for hackers and academic/industrial researchers alike.

Back in the early days, not too many things were being done on computers, and you had fewer things to lose.

Want to play Space War on PDP-7, but don't even have an OS? Then Unix!

Want to do some real work on 80s home computers? BASIC! Micro-Soft!

When Linus purchased his computer, he found that it only ran DOS and was effectively useless for him. So he ordered a copy of Minix, meanwhile playing a DOS game to kill time. After Minix arrived, he eventually had a usable programming environment, was able to get online to Usenet with a modem, and also had the Minix filesystem, which was suffering from performance issues at that time.

This is everything he had for his computing.

Then he decided to create his own operating system. He wrote the kernel, ported glibc, bash, and gcc, and the system booted; later he added modem support. Now he could do everything he had been able to do on Minix. That's it! Naturally, a few years later, it became a community project and gained huge momentum.

Perhaps the early '90s were the last era when hackers could "just" create an operating system in a tinkerer's way.

On the other hand, new OSes are still constantly being created on platforms which don't have the burdens of workstations/servers; we still see many hobbyist/research/practical systems on microcontrollers every year. As those become more and more powerful, they too end up running Linux, and the story ends there...

I think this is also one reason retrocomputing is getting more and more popular: it brings back the "playground" nature of early computer systems, unlike the "factory" nature of modern ones.

You can still tinker away like in the '90s on Arduino and ESP*-like boards.

The problem is the effort required to build something half as usable as an existing free OS, so most just give up.

Thus I mentioned "microcontrollers".

I heard similar arguments from a co-worker when talking about L4/QNX and OS research in general.

"Nobody uses them / they don't work / where are they?" were the answers, all of which have good replies if you google for 5 minutes.

Hard to discuss with someone who considers what they know to be everything that exists out there. It's a myopic view of the present and the future.

IMHO, this happens when one has more opinion than knowledge. Pike, unfortunately, seems to suffer from a similar problem.

I wonder if Pike would still stand by that paper now, or if it was a product of his personal headspace in the immediate aftermath of Plan 9 not quite catching on.

Inferno was Pike's last OS, not Plan 9.

Some inspiration here:

>Be courageous. Try different things; experiment. Try to give a cool demo.

>Measure success by ideas, not just papers and money. Make the industry want your work.
