Partial Evaluation and Immutable Servers (regehr.org)
59 points by mzl on Nov 19, 2014 | 16 comments

There have been many discussions in this area recently. Certainly, the worrisome trend of using containers as 'lightweight VMs' (with nearly all the same attack surfaces as regular VMs) suggests the current container craze is unlikely to change things: your analysis is correct... there's useless fat on many layers. The question is: how do we get rid of it?

You discuss changing the entire design process and electing to use compilers or languages that are capable of shaving off disused codepaths automatically. The problem with this idea is that declaring what is disused is difficult in a conventionally hacked-together-from-components unix-style system. The option to hack stuff together is precisely what makes it a lovely platform for getting things done, and it's hard to retrain most developers/admins onto a greenfield platform (for an example of this approach see, e.g., the Erlang or Clojure communities).

I have taken a different yet similar approach in my thinking[1], where one assumption is that behavioural profiling of a codebase from a very high (e.g. OS, network) level should be generally applicable to any codebase, should sidestep the need to retrain developers, and should still be able to whittle away much of this fat (though far from all of it). Things like grsec implement learning modes which allow this to be done at the system call level; network profiling is easy, as is filesystem monitoring, etc.
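The learning-mode idea above can be sketched as a two-phase allowlist: record every operation a workload performs during a trusted run, then deny anything outside that set. The `Profiler` class and operation strings below are purely illustrative, not grsec's actual interface (which works at the syscall level inside the kernel).

```python
# Minimal sketch of a learn-then-enforce behavioural profile.
# All names here are hypothetical; real tools (grsec learning mode,
# seccomp profiles) enforce at the kernel/syscall boundary.

class Profiler:
    def __init__(self):
        self.allowed = set()
        self.learning = True

    def observe(self, operation):
        # Learning phase: record every operation the workload performs.
        if self.learning:
            self.allowed.add(operation)

    def enforce(self):
        # Switch from learning to enforcement.
        self.learning = False

    def check(self, operation):
        # Enforcement phase: permit only previously observed operations.
        return self.learning or operation in self.allowed


profile = Profiler()
for op in ["open:/etc/config", "connect:127.0.0.1:5432", "read:/var/data"]:
    profile.observe(op)
profile.enforce()

assert profile.check("read:/var/data")    # seen during learning: allowed
assert not profile.check("exec:/bin/sh")  # never observed: denied
```

The appeal is that this needs no cooperation from the codebase being profiled, which is exactly the "no retraining developers" property claimed above.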

When you step back and think about this junk, the sad fact is that the NSA was publicly releasing the kernel-level implementation for this stuff nearly 20 years ago ... and have likely been doing this for even longer. The academic community tries to provide mechanisms for provable correctness but the world's time-sapped developers just chug along the same old tired path.

What we really need, I feel, is automated help that doesn't require a difficult-to-achieve cognitive paradigm shift for current-generation programmers.

[1] http://stani.sh/walter/pfcts/original/

I find it a bit amusing that we used to build systems like this in the first place. There was no unused code because there was no _shared code_, no libraries, etc. You only used what you needed. Then we worked our way up to the 'common code' idea and generated the enormous number of libraries that systems now come with by default and that make a programmer's life much easier. But now we realize that this is not very efficient after all, so we want to be able to write a program and then strip it of all the unnecessary code that we introduced (historically) in the first place. What is the lesson learned here? Is it that common code is not always a good idea, or is it that we need to write it in a way that is more _modular_ and easier to take apart, so we can remove the blocks we don't need? Would such a change even be possible now?

I think that there's a strong case for starting over, and doing it with your second suggestion in mind.

Much of our fundamental design is a whole generation (20-30 years) old in architecture, and has various modifications tacked on top of that original design. However, we've learned a substantial amount about what we're doing since then and are working with systems that are only poorly represented in those architectures. Many times we find that it's not possible to tack on the latest innovations to these legacy cores in a meaningful way, and that the only way to incorporate them properly would be a substantial rewrite, so we just sort of hack them on top and pretend.

I think it's a fairly normal process in most engineering fields to, every few decades, reimagine some core technologies in light of new production methods and materials.

I honestly think it might be worth experimenting with fundamentally greenfield designs for various core technologies (like operating systems), in the sense of putting a serious development effort behind them (5-10 years of development work to reach parity with current ones) rather than treating them purely as doctoral experiments that clearly aren't production ready.

There are lots of interesting approaches that aren't ever going to make it to market, because no one wants to put in the time before the current approach fails outright rather than just becoming an increasingly large stack of kludges.

Starting over is incredibly expensive, the more so when done at a deep level. Either you reimplement an existing API on top of the new design (quite a bit of work, and it doesn't pass through all the benefits of a redesign), or you also have to port or rewrite all your applications.

Well, the idea here is that with libraries in pure source form, a partial evaluator could be able to specialize and shrink down the program automatically, as you describe. So you'd get the best of both worlds.
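To make "partial evaluator" concrete, here is a toy specializer, assuming nothing beyond the standard library: given the exponent of an integer power function at build time, it executes the loop and the tests on the static input now, emitting straight-line residual code that only mentions the dynamic input.

```python
def power(x, n):
    # General function: integer power by repeated squaring.
    result = 1
    while n:
        if n & 1:
            result *= x
        x *= x
        n >>= 1
    return result

def specialize_power(n):
    # Partial evaluation sketch: n is static, so the loop structure is
    # unrolled at specialization time. The residual function contains
    # no loop and no reference to n.
    lines = ["def power_n(x):", "    result = 1"]
    while n:
        if n & 1:
            lines.append("    result *= x")
        lines.append("    x *= x")
        n >>= 1
    lines.append("    return result")
    namespace = {}
    exec("\n".join(lines), namespace)
    return namespace["power_n"]

pow5 = specialize_power(5)
assert pow5(2) == power(2, 5) == 32
```

A smarter specializer would also drop the final dead `x *= x` in the residual code. Applied globally, to whole libraries kept in source form, this is the mechanism that would let a build shrink a server image down to just the code paths it actually exercises.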

I'm not aware of any systems that actually do this on a global level, but I have a feeling I may be surprised if I start looking around.

Well, link-time optimization does this for C code in principle. It is fairly new in gcc and clang, so it is not clear what it buys you in real-world cases. A lot of code can't be compiled out because it is hard to prove it cannot be called. I plan to experiment with LTO and compiling a kernel and an application together at some point to see how it goes.

MirageOS is one such system that has been making the rounds here recently. (Also mentioned at the end of this Regehr posting)

I think the MLton SML compiler does this.

The increasingly higher-level languages in use now certainly contribute too. Back when Asm and C were the only popular languages, there was little or no unused code, because the effort required was great enough that adding any "superfluous abstractions" brought basically no advantage to either developer or user: someone had to expend effort to write that code, and at runtime it also added overhead. Efficient, minimal code, often the product of simple design, was encouraged more than it is today.

Now, thanks to all these libraries, frameworks, languages, etc., someone can in a single line of code summon dozens or more layers of abstraction - and be quite unaware of the actual amount of resources being used - until something breaks or feels too slow or runs out of memory. We are increasingly creating systems that are so complex and difficult to comprehend as a whole that they have nonobvious failure modes, and when they do fail, it's even more difficult to figure out why.

A more extreme and somewhat philosophical viewpoint on this: http://countercomplex.blogspot.ca/2014/08/the-resource-leak-...

Of course it's possible. Dead code stripping is pretty easy to do provided that you're also doing static linking. Proguard does this for Android apps and GWT does it when outputting JavaScript.
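Dead code stripping of the Proguard/GWT sort boils down to reachability over the call graph: mark everything transitively callable from the entry points, then drop the rest. A sketch (the call graph here is hypothetical data, not any real program):

```python
from collections import deque

# Toy call graph: function -> functions it may call.
call_graph = {
    "main":          ["parse", "render"],
    "parse":         ["tokenize"],
    "render":        [],
    "tokenize":      [],
    "debug_dump":    ["tokenize"],  # only reachable from itself
    "legacy_export": [],            # never called at all
}

def reachable(graph, roots):
    # BFS from the entry points; anything left unmarked is strippable.
    seen, queue = set(roots), deque(roots)
    while queue:
        fn = queue.popleft()
        for callee in graph.get(fn, []):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen

live = reachable(call_graph, ["main"])
dead = set(call_graph) - live
assert dead == {"debug_dump", "legacy_export"}
```

This is also why the static-linking proviso above matters: the analysis is only sound if the whole call graph is visible at build time, with no dynamic loading adding edges later.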

Server-side programmers haven't had to care as much but maybe we should.

This way you get all the advantages of shared code (all the libraries are still available when you're building the server), and most of the advantages of specialized code like lack of bloat.

"Would such a change be even possible now?"

Yes. I know it's possible because this is how I work.

It is unfortunate that avoiding "common code" sometimes involves more effort than just accepting it, but in my experience, once you have invested the time to free yourself from the "common code", it is gone forever. I find this to be a very liberating feeling.

The idea of "common code" that the user or developer "must" accept is, I think, seen on many levels.

From CPUs that give me dozens of instructions no program will ever use, to computers of all sizes pre-loaded with crapware I will never use, to operating systems that come with numerous drivers, libraries and programs I will never use (where the libraries themselves include dozens of functions never called), to IDEs that would give me heaps of code I will never use, to programs themselves, often loaded with features I will never use. I'm sure there is more, but you get the idea.

Are there costs to "common code"?

For example, I have seen gratuitous features increase attack surface and make programs less secure. At the least these gratuitous features make the programs more complex.

One cost, I would argue, is time, which I would also argue is the world's most valuable asset. But then I also have to invest time to avoid "common code".

I also see the "common code" problem in documentation. Manuals running in the hundreds or thousands of pages where the needed information could be conveyed succinctly and concisely in a few paragraphs.

What is behind this decision to always deliver "the kitchen sink"? Or maybe there is nothing behind it? As crypt1d says, it is amusing.

I recall many years ago various attempts at obtaining small command line utilities from Microsoft for working with Windows. In each case the utility was only a few KB. Yet obtaining the program always involved downloading a "kit" of several hundred MB or, in more recent years, several GB. Is this intentional? Mere oversight? What do you think?

Interestingly, the "common code" problem does not appear to have gained a foothold in the context of information retrieval. Even though users today have ample storage space and processing power to handle bulk data, in fact as much data as they will ever access in their lifetime, they are not usually presented with an option to download "the kitchen sink".

For example, I can download the entire Wikipedia, load it into a database using my database software of choice, wherein all my queries become local.

Or I can make each query across the open internet into a third party's database software of choice.

Obviously, the latter approach might be more desirable to the third party, as they can record all the queries. But the former "kitchen sink" approach is more appealing to me, since the querying is faster and more reliable.

Lots of unexplored potential in this area.

Also think about optimizing memory access by applying this to data structures: if you can prove there are at most 42000 foo objects, the compiler could reduce 64-bit foo pointers to 16-bit indices. And reorder object layout so that fields are in access order. Etc. With everything being limited by cache misses, reducing your cache footprint and miss frequency like this can provide big speedups. (also shrink your VM memory requirements).
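The savings are easy to see with Python's `array` module standing in for object layouts: a 64-bit pointer costs 8 bytes per reference, while a 16-bit index into a pre-sized pool costs 2 (42000 < 2^16, so an index can name every foo). The pool/index scheme below is a sketch of the idea, not any particular compiler's or VM's implementation.

```python
from array import array

MAX_FOOS = 42000           # the proven upper bound from the comment above
assert MAX_FOOS < 2 ** 16  # so a 16-bit index can address every foo

foo_pool = ["foo%d" % i for i in range(5)]  # toy stand-in for the object pool

# The same four references, stored as 64-bit "pointers" vs 16-bit indices.
ptr_refs = array('Q', [0, 3, 1, 4])  # 'Q' = unsigned 64-bit, 8 bytes each
idx_refs = array('H', [0, 3, 1, 4])  # 'H' = unsigned 16-bit, 2 bytes each

assert ptr_refs.itemsize == 8
assert idx_refs.itemsize == 2        # 4x smaller cache footprint per reference
assert foo_pool[idx_refs[1]] == "foo3"
```

The dereference becomes base-plus-scaled-index instead of a raw load, which is essentially the trade HotSpot's compressed oops make: a little arithmetic per access in exchange for halved reference size and fewer cache misses.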

Managed languages with the ability to deoptimize when assumptions are violated can do optimizations like these more optimistically, based on profile feedback. In fact, HotSpot already does, in a more limited way (compressed oops).

"if you can prove there are at most 42000 foo objects, the compiler could reduce 64-bit foo pointers to 16-bit indices"

I think the main problem with doing this is that it requires a level of "intelligence" and insight that compilers currently are nowhere near; these types of optimisations are easy to see from the point of view of the (human) programmer, but due to their generality, compilers simply can't seem to "think" in this way. It's not quite at the level of "find an implicit O(n^2) algorithm in the code and replace it with an O(n log n) one", nor is it as simple as the common pattern-matching type of optimisation.

The other aspect is that size-optimisation has traditionally been highly downplayed in academic research despite the techniques being well-known in e.g. the demoscene and some embedded systems development. Overlapping instructions, for example, is one optimisation that I don't think compilers will be able to do any time soon. This and other types of low-level optimisation are often applied by (human) Asm programmers, however.

Like I said, HotSpot already does a version of this. Doing the full version with deopt wouldn't be appreciably harder analysis-wise: https://wikis.oracle.com/display/HotSpotInternals/Compressed...

For static compilation, proving <42000 by reasoning from the code is generally probably too hard, but there's lots of room for cheating. For example, profile feedback generates the observation that there are fewer than 42000, and you put in an assertion to that effect.

There's some attention deficit in academia, but papers about this stuff do exist. E.g. https://dl.acm.org/citation.cfm?id=1739033 and its refs http://scholar.google.com/scholar?q=related:g11WX-h6J6cJ:sch...

It's just not making its way to real-world production compilers/VMs. C/C++ don't really allow most of these optimizations, that's one problem.

I posted that on reddit but it's worth a little spam:

See http://c2.com/cgi/wiki?SynthesisOs for a kernel featuring JIT compilation and partial evaluation to specialize threads of code on the fly. Old but still exciting.
