This works well for transport and storage, but not so well if you need to understand and transform the data. For the problems with ad-hoc extension mechanisms, see:
On Variance and Extensibility <https://acko.net/blog/on-variance-and-extensibility/>
In particular, the section on "The Mirage of Extensibility", which uses ImageMagick as an example.
> One solution is to do both. You declare a standard schema upfront, with common conventions that can be relied on by anyone. But you also provide the ability to extend it with custom data, so that specific pairs of Input/Process/Output can coordinate. HTTP and e-mail headers are X-able.
> The problem is that there is no universal reason why something should be standard or not. Standard is the common set of functionality "we" are aware of today. Non-standard is what's unanticipated. This is entirely haphazard. [...]
> X- fields and headers are the norm and have a habit of becoming defacto standards. When they do, you find it's too late to clean it up into a new standard. People try anyway, like with X-Forwarded-For vs Forwarded. Or -webkit-transform vs transform. New software must continue to accept old input. It must also produce output compatible with old software. This means old software never needs to be updated, which means new software can never ditch its legacy code.
Also see: IPv6. IP packets being so universal makes upgrading very difficult.
(author here) Hm very interesting, I hadn't seen this article!
I would say it's highly related, but doesn't contradict anything I'm saying. His examples are at a higher application level, in the domains of images and animation. I'm more focused on the system level, like the basics of building code, distributing it, dynamically composing it, monitoring it, etc.
There's definitely less need for types and schemas at that level, and more need for computation in his application domains. (I guess in networking terms this is a "smart middlebox" problem, as opposed to passive ones.)
The commonality is that they are trying to avoid combinatorial explosions of code by having a shared representation, e.g. avoiding writing the same transformations for both PDF and JPG.
I'd also say that he is overly negative on a few fronts -- I agree with a lot of it, but I'd frame the evolution as a success, not a failure. I guess my point is that the narrow waist is basically the only way we know how to build (really, evolve) large-scale systems over large time frames.
We can do it well or do it badly, explicitly or implicitly... From the code and systems I've worked with, we could easily do a lot better. But I appreciate this article because identifying these kinds of problems is the first step.
----
Some random responses:
> Putting data inside a free-form key/value map doesn't change things much. It's barely an improvement over having a unknownData byte[] mix-in on each native type. It only pays off if you actually adopt a decomposable model and stick with it. That way the data is not unknown, but always provides a serviceable view on its own. Arguably this is the killer feature of a dynamic language. The benefit of "extensible data" is mainly "fully introspectable without recompilation."
This part feels overly negative ... I would just frame this as a tradeoff between static and dynamic. There could be reasons that dynamic doesn't work in his domain, but that doesn't mean it doesn't work elsewhere. I have a bunch of material in drafts about types and distributed systems, sort of related to this "Maybe Not" discussion: https://lobste.rs/s/zdvg9y/maybe_not_rich_hickey (i.e. schemas/shapes should be decoupled from field presence; one is reusable and global and the other is local)
From what I know Clojure has some good framings of the versioning problem, with Spec and RDF inspired schemas, but I haven't used it. Rather than the brittle versioning model, Hickey frames software evolution as "strengthen a promise" and "relax a guarantee". I don't know if it would be useful in those domains but it's worth addressing.
I would be interested in unpacking the problem of differing PDF and JPG metadata a little more... Offhand I'm not sure why it's so difficult; I feel like it's mostly a problem in certain statically-typed systems.
I guess to put a big stake in the ground, you could say extensibility is inherently dynamic. Simply because you don't know what code is going to operate on your data 10 years from now. It hasn't been written yet, and the people who need it haven't been born yet. It's impossible in reality, not just in the type system :)
I think the difference is that his domains live within the confines of a single machine and a single application, so there is the expectation that you should be able to do better (have more type safety). While nobody expects to be able to reboot the Internet all at once and upgrade it. And even say at Google, nobody wants to recompile all the code in a cluster at once and reboot it.
Yes, I don’t agree with everything in the article, but it points out some sticky problems that can make things complicated.
One example that’s related to programming languages is the evolution of an abstract syntax tree. We often expect new compilers to work with old code when a new kind of AST node is added, but to what extent should existing tools work with new code that has a node type that it doesn’t understand? Also, if your AST datatype is a library, does adding a new node type immediately cause compiler errors in every tool that uses the AST, or do those tools compile fine and break at runtime when given new code?
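As a rough sketch of that second question (made-up node and tool names, Go used just for illustration): in a language where the dispatch isn't exhaustiveness-checked, an old tool keeps compiling against the new AST library and only breaks when it actually sees the new node.

```go
package main

import "fmt"

// Hypothetical AST: each node kind implements a marker interface.
type Node interface{ isNode() }

type IntLit struct{ Value int }
type Add struct{ Left, Right Node }
type Mul struct{ Left, Right Node } // added in a later release of the AST library

func (IntLit) isNode() {}
func (Add) isNode()    {}
func (Mul) isNode()    {}

// An older tool, written before Mul existed. Go type switches aren't
// exhaustiveness-checked, so this still compiles against the new AST
// library -- it only fails (or silently misbehaves) at runtime.
func eval(n Node) int {
	switch n := n.(type) {
	case IntLit:
		return n.Value
	case Add:
		return eval(n.Left) + eval(n.Right)
	default:
		panic(fmt.Sprintf("unknown node type %T", n))
	}
}

func main() {
	fmt.Println(eval(Add{IntLit{1}, IntLit{2}})) // 3
	fmt.Println(eval(Mul{IntLit{2}, IntLit{3}})) // compiles, panics at runtime
}
```

With an exhaustiveness-checked sum type (e.g. a Rust enum, or OCaml/Haskell with the relevant warnings treated as errors), the same change instead surfaces at compile time in every tool, which is the other horn of the tradeoff being asked about.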
Sometimes as a workaround, “metadata” gets put in specially formatted comments, as a way of saying that most language tools can ignore it. Alternately there can be general-purpose annotations (as Java has).
Also, intermediate representations often get used as a “narrow waist”, but they are tricky to design because “lowering” code will often discard high-level constructs.
What I'd say is that we already have a narrow waist for every language -- text. That is the one that's stable! Having an additional stable AST format seems to be an insurmountable problem, especially in most statically typed languages.
The basic reason is that it's easier to evolve text in a backward compatible way than types. This is probably obvious to some people, but controversial to others... It has a lot to do with the "ignoring new data" problem discussed in the blog post. It would be nice to unpack this a bit and find some references.
----
Related: I recently circulated this language implementation, which has a front end in OCaml and a back end in C++, with the IR transmitted via protobuf.
Protobuf has the "runtime field presence" feature, which would be useful for making a backward-compatible AST/IR. That isn't done here, but it would be something to explore.
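Very roughly, the idea (this is not real protobuf-generated code, just a sketch of the semantics) is that a reader can distinguish "field not set" from "field set to its zero value", and fields it doesn't recognize are carried along as unknown bytes rather than rejected:

```go
package main

import "fmt"

type IRNode struct {
	Kind     string
	Operands []*IRNode
	// Optional field: a pointer, so nil means "not present" (protobuf
	// exposes this as a presence check rather than comparing to "").
	DebugName *string
	// Protobuf also preserves fields this reader doesn't know about
	// ("unknown fields"); an old tool can pass them through untouched.
	Unknown map[int32][]byte
}

func describe(n *IRNode) string {
	if n.DebugName != nil { // presence, distinct from an empty string
		return *n.DebugName
	}
	return n.Kind
}

func main() {
	name := "loop_counter"
	fmt.Println(describe(&IRNode{Kind: "Var", DebugName: &name})) // loop_counter
	fmt.Println(describe(&IRNode{Kind: "Const"}))                 // Const
}
```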
-----
Also, thinking about the blog post a bit more, I'd say a huge number of decisions hinge on whether you can recompile the entire program or not.
If you can, you should. But if you can't, a lot of programmers seem to have a hard time with runtime composition. Sometimes they even deny that it exists or should exist!
In contrast, admins/operators/SREs deal with runtime composition almost exclusively, whereas programmers are more familiar with a static view and build-time composition. If I wanted to be ambitious, what I am getting at with this series is a theory of and guidelines for runtime composition, and versionless evolution.
e.g. even in Windows, you never recompile your COM components together, even though they are written in statically typed languages. Global builds and atomic upgrades don't scale, even on a single computer.
Yes, this gets into the static versus dynamic library debate.
There's one school of thought that says that a technically competent organization really should be able to recompile any of the code it runs, on demand. If you can't do that, you are missing source code or you don't know how to build it, and that's bad. It's not really your code, is it?
That's Google's primary approach, internally, though there are many exceptions. It fits in well with an open source approach. The Go language's toolchain uses static linking as a result of this philosophy.
But most people don't belong to an organization with that level of self-determination. Few people run Gentoo. We mostly run code we didn't compile ourselves, and if we're lucky there is a good source of security updates.
But there's still a question of whether you really need dynamic linking, or whether having standard data formats between different processes is enough. When you do upgrades, do you really need to replace DLLs, or is replacing entire statically linked binaries good enough?
If replacing entire binaries is your unit of granularity for upgrades, this leads to using something like protobufs to support incremental evolution. Instead of calling a DLL, you start a helper process and communicate with it using protobufs. The helper binary can be replaced, so that part of the system can be upgraded independently.
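A minimal sketch of that pattern, assuming a hypothetical ./helper binary and a trivial length-prefixed framing (a real system would put protobuf-encoded messages in the payload):

```go
package main

import (
	"encoding/binary"
	"fmt"
	"io"
	"os/exec"
)

// call sends one length-prefixed request and reads one length-prefixed reply.
func call(stdin io.Writer, stdout io.Reader, req []byte) ([]byte, error) {
	if err := binary.Write(stdin, binary.LittleEndian, uint32(len(req))); err != nil {
		return nil, err
	}
	if _, err := stdin.Write(req); err != nil {
		return nil, err
	}
	var n uint32
	if err := binary.Read(stdout, binary.LittleEndian, &n); err != nil {
		return nil, err
	}
	resp := make([]byte, n)
	_, err := io.ReadFull(stdout, resp)
	return resp, err
}

func main() {
	// The helper binary can be rebuilt and replaced independently of
	// this one -- that's the upgrade granularity instead of a DLL.
	cmd := exec.Command("./helper")
	stdin, _ := cmd.StdinPipe()
	stdout, _ := cmd.StdoutPipe()
	if err := cmd.Start(); err != nil {
		panic(err)
	}
	resp, err := call(stdin, stdout, []byte("request"))
	fmt.Println(string(resp), err)
	stdin.Close()
	cmd.Wait()
}
```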
None of this really solves the AST problem. I don't think there are many development tools that would still work if the parser and AST were in a DLL and you replaced them? I guess it would help with minor fixes and cosmetic changes. The approach you tried with protobufs seems interesting but how awkward was it?
Hm I do see that point of view, but to me the trends seem to be in the opposite direction.
I'd question if that view even makes sense at Google, because like I said nobody ever recompiles all the code in a cluster at once and reboots it. This is the "lack of atomic upgrade problem", which tilts you toward a dynamic view. Some more links here, I will probably put it on the blog:
It's even more true if you're Airbnb or Lyft -- are you going to recompile Google Maps or Stripe? No, you use them dynamically through an API. Pretty much all organizations are like this now, not just software companies. They're a big pile of dynamically composed services, and so the static view has less and less power.
I was thinking today about a slogan: "Poorly Factored Software is Eating the World". My first job was working on console games where you shipped a DVD and never updated it, and it was incapable of talking to a network. But basically no software is like that anymore, including games and embedded systems.
What I see is that the world is becoming a single big distributed system, and that actually influences what you'd consider technical details, like type systems! And static vs. dynamic libraries. There is pressure from applications on languages and operating systems.
-----
The DLL vs IPC question has a few different dimensions... I'd say Linux distros do it very poorly. The whole model of "named versions" and "constraint solving" is extremely brittle:
1. It reduces something that's multi-dimensional to one or two dimensions (a version number)
I wrote this a while back, and it's related to the Clojure and protobuf style of "versionless" software evolution:
2. It means that you're often running a combination of software that nobody has ever tested! In Linux distros, testing is basically done by users.
Google does a little better -- you have a test cluster, and you deploy to that first, so ideally you will have tested the exact combination of versions before it reaches end users. And you have canarying and feedback too, so even when you have bugs their impact is more limited.
So while I favor IPC, I'd say you can probably do dynamic linking well, but there are some "incidental" reasons why it's done poorly right now, and has a bad reputation. It leads to breakage and is hard to deploy, but that's not fundamental.
----
I guess the overall frame I'm trying to fight is "static vs. dynamic". What I'd say is that static is local and domain-specific, and you should take advantage of it while you can. But dynamic is inevitable at a large scale and we need better mechanisms to deal with it, better ways of thinking about it, better terms and definitions, etc.
Shell is of course all about dynamic composition -- it's basically unheard of to recompile all the programs a shell script invokes :) The language is such that you can't even determine all the programs it invokes statically (but Oil is going in that direction for deployability / containers)
----
Oh and the language project wasn't mine -- I just circulated it. I did two related experiments called "OHeap" that were protobuf/capnproto-like, which I described on the blog, but I'm not using them, and they were only mildly interesting. In Oil there's not a hard line between the front end and back end, so serializing the AST didn't end up being a major requirement.
Yes, there are no atomic upgrades. Even when a static binary is replaced, there are many instances of it running and the processes aren't restarted atomically.
However, in a protobuf-based system within a single organization, you can at least say that every deployed binary expects an API that was defined by some version of the same protobuf file. At Google there is (or was - it's been a while) a linear history of this file in source control somewhere. That limits the possible API variations in existence.
By contrast, in the acko.net article, he describes the ad-hoc variation that happens when many organizations make their own extensions to the same protocol, with little coordination with each other. (And yes, the web is like this too.)
Adding one more thing: I'd also say there's a big asymmetry in DLLs vs. static libraries. DLLs that can be independently upgraded pretty much have to use the C ABI, which is more like RPC/IPC. It doesn't really make sense to have a pure Rust or C++ DLL -- you lose all your types and type safety; you have to write glue.
So actually I'd say IPC and DLLs are closer together, and static libraries are on the other end of the spectrum. IPC and DLLs are both dynamic, and that's the important distinction. That's what a lot of the decisions in the acko.net article hinge on.
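To make the C ABI point concrete, here's a rough sketch of exporting even a trivial Go function as a shared library: everything gets flattened to C types at the boundary.

```go
// Sketch: a Go "DLL" (c-shared library) exposing one function.
// Build with: go build -buildmode=c-shared -o libadd.so
package main

import "C"

// Rich Go types (interfaces, slices, maps, channels) can't cross this
// boundary directly; callers in other languages only see the C ABI,
// so in practice you write glue that flattens everything to C types.

//export Add
func Add(a, b C.int) C.int {
	return a + b
}

// main is required for -buildmode=c-shared but never called by library users.
func main() {}
```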
As someone who teaches basic programming-type things and enjoys hacking away at things solo, I really like this reframing and reconsideration of that good ol' "Unix Way" type of thinking. It's part of why I'm a huge fan of Bash for everything, and it makes me wonder why it isn't more "text everywhere." In the back of my head I don't think I'll ever be convinced that those reasons aren't mostly arbitrary (i.e. language/tool developers for other developers subconsciously fomenting lock-in).
It's interesting to apply this at various levels of abstraction.
Almost all raster image editors work on a grid of RGB(A) pixels. Whether it's PNG, JPG, WebP or whatever - they're all decoded to a pixel grid.
Wireless technology strikes me as the opposite of the narrow waist. Sure, it is all just variants of basic radio tech. But Bluetooth audio and WiFi are fundamentally incompatible - despite using more-or-less the same spectrum. 2G phones can't send or receive 5G.
Actually, wireless is the classic example of the narrow waist, as suggested by the sibling comments -- but it's not because of GNU Radio, it's because of the I/Q signal.
All wireless communication is based on EM waves and can be reduced to an I/Q or quadrature signal [1]. But rather than being designed by humans, like the IP or text examples the OP mentioned, it is found in nature, similar to our chromosome pairs. Using this wonderful I/Q signal concept you can digitally create and control not only digital modulations but also all the conventional analog modulations, for example AM, FM and SSB. If you want, you can even control the EM signal impedance itself using I/Q signals; in the microwave measurement world this modern technique is known as digital load-pull or source-pull.
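For reference, the quadrature representation being described is just the standard upconversion formula:

```latex
s(t) = I(t)\cos(2\pi f_c t) - Q(t)\sin(2\pi f_c t)
```

AM amounts to varying I(t) with Q(t) = 0, while FM/PM set I(t) = A cos φ(t) and Q(t) = A sin φ(t); every modulation scheme, analog or digital, is just a particular trajectory of the (I, Q) pair.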
The interoperability problems you've observed are due to Bluetooth and WiFi using different modulation techniques. Bluetooth uses Frequency Hopping Spread Spectrum (FHSS) and WiFi uses OFDM. The former is good for low power and the latter is good for bandwidth. Initially the WiFi standard also supported FHSS, but due to its bandwidth limitations the later WiFi standards (and also 4G/5G) are based entirely on OFDM.
Most wireless transmission protocols use generic modulation techniques that in theory can be put together using a software package like GNU Radio.
In theory GNU Radio could be the narrow waist for all wireless transmission protocols. It isn't practical yet, but it's a possibility. Instead of rewriting a new wireless stack each time, we'd just use GNU Radio and specify our specific signal blocks.
The analogy is similar to IR being the compiler’s narrow waist.
C APIs like ioctl, setsockopt, etc. are perhaps also an example of narrow waists which aren't mentioned in the article. I've been looking for a term for these kinds of functions, which seem to be quite successful long-term in terms of API design.
Yes, file descriptors and whole file systems are other narrow waists that Unix has, not just files themselves! I want to make diagrams for those too.
The terminology I'd use:
File descriptors are just a way to do polymorphism in the type system of C. Basically you fall back to an "untyped" integer (similar to void*).
Go's interfaces solve this problem pretty directly, much better than C++, which can't really express it in the type system. That is, with the "-er" interface pattern like Reader and Writer.
(I think type classes do better too, but I haven't used them much. I'd like to see an article that unpacks this better in the context of file descriptor APIs)
This style relates to "versionless evolution". The key issue is that Go doesn't require you to declare that a type supports an interface. It's implicit and the compiler figures it out. That was controversial, but it is the key to enabling evolution.
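A tiny sketch of what that buys you, using the standard io.Reader (the zeros type below is made up, standing in for code written years later in another package):

```go
package main

import (
	"fmt"
	"io"
	"os"
	"strings"
)

// count works on anything that satisfies io.Reader -- files, pipes,
// sockets, in-memory buffers. This is the file-descriptor-style
// "narrow waist", but expressed in the type system.
func count(r io.Reader) (int64, error) {
	return io.Copy(io.Discard, r)
}

// A type with no knowledge of count() and no explicit "implements"
// declaration: it satisfies io.Reader just by having a Read method.
type zeros struct{ left int }

func (z *zeros) Read(p []byte) (int, error) {
	if z.left == 0 {
		return 0, io.EOF
	}
	n := len(p)
	if n > z.left {
		n = z.left
	}
	for i := range p[:n] {
		p[i] = 0
	}
	z.left -= n
	return n, nil
}

func main() {
	f, _ := os.Open("/dev/null")
	defer f.Close()
	fmt.Println(count(strings.NewReader("hello"))) // 5 <nil>
	fmt.Println(count(f))                          // 0 <nil>
	fmt.Println(count(&zeros{left: 10}))           // 10 <nil>
}
```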
Most programmers are viewing things through the lens of "strictness" and "catching mistakes", rather than evolution. ioctl() is messy as heck but has allowed a lot of evolution.
It's highly related to this discussion, which I've linked on the blog before:
Ha! Just a few months ago, I came across Cloudflare Research's Address Agility blog post (since published in SIGCOMM [0]) that squeezed the Internet's waist down to a single IP (dubbed Ao1). It reasoned that perhaps, with the popularity of TLS, a naming scheme (like the SNI field) is all that's needed, because, given the pigeon-hole principle, there will always be more names (variable-length punycodes) than IP addresses (fixed-length hex/octets).
The telecom industry perhaps deployed something similar in Multiprotocol Label Switching (MPLS). Other address translation schemes, such as the HTTP Host header and Google Maglev-esque Virtual IP load balancers (in use at BigCloud providers), are further attempts at addressing hosts using something other than public IPs (which enabled domain fronting, once employed by Signal before it was outlawed [1]), squeezing the Internet's narrow waist further.
I am mostly interested in Ao1 from an anti-censorship point of view (IP fronting), as that's the last frontier as far as stateless firewalls go, while names can be hidden away today: DNS with DoH/DoT and SNI with ECH (Encrypted Client Hello).
That said, I like the term Simplifying Assumption better, as described in a series of posts on it by apenwarr [2][3], while the classic The Rise of Worse is Better pits C against Lisp, arguing that sacrificing correctness, consistency, and completeness for the sake of simplicity of implementation is a worthwhile trade-off [4].
Yes but only when you're programming in Lisp (which can obviously be extremely useful and productive).
I want to get to this on the blog, but the narrow waist idea is actually hierarchical, with bytes at "level 0", and things like s-exprs and JSON at "level 1".
Building up this claim, I'd say ALL good languages are defined by their core data structures. See my comment (oilshell) here:
e.g. Lisp is defined by s-exprs, C by pointers and offsets, shell and Tcl by strings, etc.
The lowest common denominator among all these "meta-models" is BYTES. S-expressions are expressed as bytes. In Python and JavaScript, you serialize to JSON, etc.
----
To support this point, I have a slogan that "XML was a failed narrow waist". It was designed for documents, but people tried to express everything in it -- records, tables, and even programs.
But that was a bad idea! JSON took over some of that space since it's simply more appropriate. But JSON isn't good enough for documents, so we use HTML. Which means the lowest common denominator is again text.
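As a toy illustration (a made-up record type, just to compare the encodings), the same struct marshals to both, but the JSON output maps directly onto the data structure while the XML version already forces document-ish choices (elements vs. attributes, what the root element is):

```go
package main

import (
	"encoding/json"
	"encoding/xml"
	"fmt"
)

// A throwaway record type for comparison.
type Person struct {
	Name string `json:"name" xml:"name"`
	Age  int    `json:"age" xml:"age"`
}

func main() {
	p := Person{Name: "Alice", Age: 42}
	j, _ := json.Marshal(p)
	x, _ := xml.Marshal(p)
	fmt.Println(string(j)) // {"name":"Alice","age":42}
	fmt.Println(string(x)) // <Person><name>Alice</name><age>42</age></Person>
}
```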
(author here) How does that relate to the article? It's definitely an important principle of network design, but the narrow waist principle is distinct.
The waist isn't the middle of the network as far as connectivity; it's the architectural middle with applications on top and transports on bottom. I think these are mostly separate issues, but I'm interested in any elaboration.
The narrow waist falls out of the end to end principle because when you push all the intelligence out to the ends, there is often only one reasonable intermediate/substrate; and all narrow waists in your article, whether TCP/IP or text formats, are driven by end-to-end concerns. The narrow waist is the special-case of the more important principle. Anyone who thinks about these issues will be thinking about end to end, so at the very least, you should discuss why you think narrow waist is more fundamental and important than end to end.
BTW, all NAP publications are available online (it's a federal government body); all you had to do was google the title, no need to buy it: https://www.nap.edu/read/4755/chapter/1
Thanks for the PDF link. I downloaded it and the funny thing is that it ended up at 4755(1).pdf on my computer. I had downloaded it in July, but neglected to read it, or remember that I downloaded it!
BTW I found an earlier citation for the hourglass metaphor after publishing the post:
(John Aschenbrenner in 1979, with reference to a 1984 article by Jean Bartik)
-----
I see your point, but my main focus is on writing O(M + N) code, not O(M * N) code, and the narrow waist captures that. I'm making an analogy from networking to software.
The linked video uses "coding the area" vs. "coding the perimeter", but even that doesn't capture the idea that there is a narrow waist that somebody actually designed! The waist is a real thing and it makes the interoperability and code reduction possible.
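A toy sketch of the O(M + N) shape (all names made up): M decoders target one shared intermediate and N operations consume it, so you write M + N pieces of code instead of M * N format-specific transformations.

```go
package main

import "fmt"

// The narrow waist: one shared intermediate representation.
type Image struct{ Pixels [][]uint8 }

// M decoders target the waist... (stub bodies, just to show the shape)
func decodePNG(b []byte) Image  { return Image{Pixels: [][]uint8{{0}}} }
func decodeJPEG(b []byte) Image { return Image{Pixels: [][]uint8{{0}}} }

// ...and N operations are written once, against the waist.
func invert(img Image) Image {
	for _, row := range img.Pixels {
		for i := range row {
			row[i] = 255 - row[i]
		}
	}
	return img
}

func main() {
	// Any decoder composes with any operation: M + N, not M * N.
	fmt.Println(invert(decodePNG(nil)))
	fmt.Println(invert(decodeJPEG(nil)))
}
```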
I think the key difference is incentives:
- In networking, there is a big incentive to interoperate (Metcalfe's law). And arguably the Internet "completed" this task decades ago.
- In software, there is an incentive to build your own walled gardens, and not interoperate (Microsoft and Win32, Apple and iTunes, Google and the Cloud, etc.).
But fundamentally there is the O(M x N) issue in both cases. Basically I'm tired of writing software for this reason :)
I'm not sure I agree that the narrow waist "falls out" of the end-to-end principle even in the networking domain. See all citations about the moving narrow waist. It's far from clear what the design should be. The narrow waist isn't pure or mathematical -- it's a deep engineering compromise and can be done well or poorly.
Even if it does fall out in networking, it definitely doesn't in software. Like the Unix syscall interface is small but far from trivial. It's been haphazardly extended. People argue about its design all the time. Ditto for ISAs -- the designs are heavily debated compromises.
There are constant arguments about narrow waists on Hacker News. The bias I tend to see is favoring local convenience over system integrity and economy.
It sounds very related to your Holy Wars article -- Thanks, I will read it and the other ones :) The Spolsky article is great and I quoted a few others here: http://www.oilshell.org/blog/2021/07/spolsky.html
-----
There are more dimensions to this that I would like to cover, e.g.
> See all citations about the moving narrow waist. It's far from clear what the design should be. The narrow waist isn't pure or mathematical -- it's a deep engineering compromise and can be done well or poorly.
And what motivates the moving, or the cases where the waist isn't so narrow?
> - In software, there is an incentive to build your own walled gardens, and not interoperate (Microsoft and Win32, Apple and iTunes, Google and the Cloud, etc.).
> Even if it does fall out in networking, it definitely doesn't in software. Like the Unix syscall interface is small but far from trivial. It's been haphazardly extended. People argue about its design all the time. Ditto for ISAs -- the designs are heavily debated compromises.
End-to-end doesn't argue that you can't have rent-seeking or local optima; just that if you build complexity and intelligence into the intermediate, you will probably limit performance of the total system.
And this is super true in all of the examples you give; they all sacrifice performance that more end-to-end systems would be able to achieve. Nobody thinks that ISAs are close to optimal, and RISC won over CISC because it delivered more performance (or consider Itanium). Unix syscalls give up huge amounts of performance all the time (just look at memory management alone, where a decent chunk of syscall complexity is there to let databases and other applications opt out of it), as also demonstrated by exokernels. And nobody involved with iTunes etc. is laboring under the delusion that its centralized paywalling and Apple-centric design in any way maximizes worldwide music consumption or production efficiency (it doesn't even try to support countless ways of interacting with music aside from a few blessed ones like albums and podcasting); they're just laboring under the true belief that it's very good for Apple's pocketbook.
Hm I don't follow ... The narrow waist claims I'm making are only indirectly related to performance. I think you would have to define "intermediate" and "end" in Unix, ISAs, etc. It's not clear what analogy you're making.
The claim you're making appears to be very strong: the narrow waist idea is a special case of the end-to-end principle? Is this something you are asserting now, or has something been written about it by you or others?
As mentioned I see them as distinct in networking ... and when extended to software, I barely see the relation. I am purposely extending the "meta-idea" to software; this comment shows where it's going:
People understand what I'm saying, they are able to apply it, and they see the consequences...
I will mention related concepts like Metcalfe's law, and maybe the end-to-end principle, but that one seems the least relevant.
I'd definitely be interested in reading an article that claims otherwise though!
---
To answer the first question, the moving waists are motivated by real world usage, i.e. engineering requirements. The waist is a big engineering compromise. IP doesn't have auth or content caching; HTTP has that. People who want that functionality start building on that layer, etc.
Scaling and the increase in the number of nodes are another source of pressure on the waist, e.g. the need for NAT traversal.
I think the author is going for more of https://en.wikipedia.org/wiki/Star_topology, just with the special case added that you're simplifying a bipartite graph rather than a general graph.