I don't have a transcript link at hand, but as far as videos go, "Functional Core, Imperative Shell" / "Boundaries" by Gary Bernhardt is also a must-see (or must-read, hopefully).
I’ve been programming for a long time, watched this presentation several times, done a bunch of other research, and still don’t know if I understand what this presentation is about. I fear that I’ve tried to apply these simple-vs-complex principles and only made my code harder to understand. My understanding now is that complexity for every application has to live somewhere, that all the simple problems are already solved in some library (or should be), and that customers invariably request solutions to problems that require complexity by joining simple systems.
> still don’t know if I understand what this presentation is about
1. The simplicity of a system or product is not the same as the ease with which it is built.
2. Most developers, most of the time, default to optimizing for ease when building a product, even when it conflicts with simplicity.
3. Simplicity is a good proxy for reliability, maintainability, and modifiability, so if you value those a lot then you should seek simplicity over programmer convenience (in the cases where they are at odds).
If you agree with his hypothesis, it basically means that a clean design tends to feel like much more work early on. And he goes on to suggest that early on, it's best to focus on ease, and extract a simpler design later, when you have a clearer grasp of the problem domain.
Personally, if I disagree, it's because I think his axes are wrong. It's not functionality vs. time, it's cumulative effort vs. functionality. Where that distinction matters is that his graph subtly implies that you'll keep working on the software at a more-or-less steady pace, indefinitely. This suggests that there will always be a point where it's time to stop and work out a simple design. If it's effort vs. functionality, on the other hand, that leaves open the possibility that the project will be abandoned or put into maintenance mode long before you hit that design payoff threshold.
(This would also imply that, as the maintainer of a programming language ecosystem and a database product that are meant to be used over and over again, Rich Hickey is looking at a different cost/benefit equation from those of us who are working on a bunch of smaller, limited-domain tools. My own hand-coded data structures are nowhere near as thoroughly engineered as Clojure's collections API, nor should they be.)
> I fear that I’ve tried to apply these simple-vs-complex principles and only made my code harder to understand. My understanding now is that complexity for every application has to live somewhere, that all the simple problems are already solved in some library (or should be), and that customers invariably request solutions to problems that require complexity by joining simple systems.
Simplicity exists at every level in your program. It is in every choice that you make. Here's a quick example (in Rust):
    fn f(i: i32) -> i32 { i }          // function
    let f = |i: i32| -> i32 { i };     // closure
The closure is more complex than the function because it adds in the concept of environmental capture, even though it doesn't take advantage of it.
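To make the capture concrete, here's a small sketch (names invented for illustration): the closure's result depends on state that isn't visible in its signature, which is exactly the extra machinery the plain function never needs.

    fn add_one(i: i32) -> i32 {
        i + 1 // depends only on its argument
    }

    fn main() {
        let offset = 1;
        // The closure captures `offset` from the enclosing scope, so its
        // behaviour now depends on the environment, not just its argument.
        let add_offset = |i: i32| -> i32 { i + offset };

        assert_eq!(add_one(41), 42);
        assert_eq!(add_offset(41), 42);
    }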
This isn't to say you should never pick the more complex option - sometimes there is a real benefit. But it should never be your default.
You are correct in your assessment that customers typically request solutions to complex problems. This is called "inherent complexity" - the world is a complex place and we need to find a way to live in it.
The ideal, however, is to avoid adding even more complexity - incidental complexity - on top of what is truly necessary to solve the problem.
I think the shift in programmers' perspective on where complexity should live is very much related to the idea of "the two styles in mathematics" described in this essay on the way Grothendieck preferred to deal with complexity in his work: http://www.landsburg.com/grothendieck/mclarty1.pdf.
Rich belongs to the small class of industry speakers who are both insightful and nondull. Do yourself a favour if you haven't and indulge in the full presentation.
I still can't believe that I was actually there during that exact presentation but at the time it didn't have the impact on me that it seems to have had on HN as a whole. Maybe I should review it again, or maybe I'm just not smart enough / don't have the right mindset, IDK.
Rich Hickey seems to be a bit of a Necker cube. Some people I know and respect think he is a deep and powerful thinker. But to me his talks always seem like 90% stating the obvious, 10% unsupported assertions.
That is the key: stating the obvious is actually hard, and I think Rich does a beautiful job of translating the thoughts and feelings most programmers have into words. It gives you a way to discuss and think about things (especially design and architecture) with others. I learned that there is no such thing as "common ground" or common knowledge magically and intuitively shared by all programmers. So if this already reflects your thoughts - even better.
Yeah, I think it depends on whether you're thinking about things from a SYSTEMS perspective or a CODE perspective.
Hickey clearly thinks about things from a systems perspective, which takes a number of years to play out.
You need to live with your own decisions, over large codebases, for many years to get what he's talking about. On the other hand, in many programming jobs, you're incentivized to ship it, and throw it over the wall, let the ops people paper over your bad decisions, etc. (whether you actually do that is a different story of course)
Junior programmers also work with smaller pieces of code, where the issues relating to code are more relevant than issues related to systems.
By systems, I mean:
- Code composed of heterogeneous parts, most of which you don't control, and which are written at different times.
- Code written in different languages, and code that uses a major component you can't change, like a database (there's a funny anecdote regarding researchers and databases in the paper below)
- Code that evolves over long periods of time
As an example of the difference between code and systems, a lot of people objected to his "Maybe Not" talk. That's because they're thinking of it from the CODE perspective (which is valid, but not the whole picture).
What he says is true from a SYSTEMS perspective, and it's something that Google learned over a long period of time, maintaining large and heterogeneous systems.
tl;dr Although protobufs are statically typed (as opposed to JSON), the presence of fields is checked AT RUNTIME, and this is the right choice. You can't atomically upgrade distributed systems. You can't extend your type system over the network, because the network is dynamic. Don't conflate shape and optional/required. Shape is global while optional/required is local.
If you don't get that then you probably haven't worked on nontrivial distributed systems. (I see a lot of toy distributed computing languages/frameworks which assume atomic upgrade).
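A rough sketch of that idea in Rust (the struct and field names are made up, not from any protobuf library): presence is not part of the schema's type, and each consumer checks the fields it needs at runtime, so producers and consumers can be upgraded independently.

    // Hypothetical decoded message: the wire format says nothing about
    // which fields must be present, so every field is optional at the
    // type level.
    struct UserRecord {
        id: Option<u64>,
        display_name: Option<String>, // added in a later schema revision
    }

    // Each consumer decides locally, at runtime, which fields it requires,
    // so old producers and new consumers can coexist during a rollout.
    fn handle(record: &UserRecord) -> Result<(), String> {
        let id = record.id.ok_or("this consumer requires `id`")?;
        let name = record.display_name.as_deref().unwrap_or("anonymous");
        println!("processing {} ({})", id, name);
        Ok(())
    }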
I read a bunch of the other ones. Bjarne's is very good as usual. But Hickey is probably the most lucid writer, and the ideas are important (even though I've never even used Clojure, because I don't use the JVM, which is central to the design).
I think that the thing about that talk that struck a chord is that he took a bunch of things that people had been talking about quite a bit - functional vs oop, mutability, data storage, various clean code-type debates, etc. - and extracted a clear mental framework for thinking about all of them.
At the time I found it I was working somewhere the key challenge was not so much technical as ensuring that simple technology, spread across a large breadth of functionality, adhered to a consistent vision of the business domain being implemented. The program became an implementation of the mental model I developed of the business domain itself: its purpose, its uses, and its allowances and prohibitions. The Naur paper hit exactly not only on what I had been implementing in code, but on what other developers would have to know in order to maintain that code over time... which was exactly the kind of knowledge that had been lost over the life of the application by the time I came to be involved... and part of why my project existed.
The examples near the start reminded me of another piece shared here before, "How to Build Good Software". Most notably the part near the end, titled "Software Is about Developing Knowledge More than Writing Code".
So glad to see the Law of Leaky Abstractions in there - that's had a very long-running impact on how I think about programming. It's still super-relevant today, nearly 18 years after it was published.
It's nonsense, written because Joel had never used a language with a decent type system. Any Haskell programmer uses half a dozen non-leaking abstractions before breakfast. Even the examples in the post itself don't hold up - using UDP instead of TCP doesn't actually mean your program will work any better when someone unplugs the network cable.
I think we Haskellers have to be realistic and say that although it feels that many of our abstractions are non-leaking, they're only non-leaking in the sense that a modern, triply-glazed, thoroughly insulated house is non-leaking of heat compared to a draughty, cold house built 200 years ago. There are indeed leaks, but they are small and generally ignorable.
I don't think that's true. A lot of these abstractions are provably correct and so simply cannot leak (and in slightly more advanced languages you might even enforce those proofs - consider Idris' VerifiedMonad and friends).
Of course if you put garbage in at the lower levels (e.g. define a monoid instance whose operation doesn't actually satisfy the laws, say one that isn't associative) then you will get garbage out at the higher levels (e.g. the sum of the concatenation of two lists may no longer equal the two lists' sums added together), but that's not the abstraction leaking, that's just an error in your code.
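As a toy illustration in Rust (the Monoid trait and types here are invented for the example, not from any particular crate): with a lawful instance the fold-of-concatenation equation holds, and with an unlawful one it silently fails - an error in the instance, not a leak in the abstraction.

    trait Monoid {
        fn empty() -> Self;
        fn combine(self, other: Self) -> Self;
    }

    #[derive(Clone, Copy, Debug, PartialEq)]
    struct Sum(i32); // lawful: addition is associative, 0 is the identity

    #[derive(Clone, Copy, Debug, PartialEq)]
    struct BogusDiff(i32); // unlawful: subtraction is not associative

    impl Monoid for Sum {
        fn empty() -> Self { Sum(0) }
        fn combine(self, other: Self) -> Self { Sum(self.0 + other.0) }
    }

    impl Monoid for BogusDiff {
        fn empty() -> Self { BogusDiff(0) }
        fn combine(self, other: Self) -> Self { BogusDiff(self.0 - other.0) }
    }

    fn fold<M: Monoid + Copy>(items: &[M]) -> M {
        items.iter().copied().fold(M::empty(), M::combine)
    }

    fn main() {
        let (xs, ys) = ([Sum(1), Sum(2)], [Sum(3)]);
        // Lawful instance: folding the concatenation equals combining the folds.
        assert_eq!(fold(&[xs.as_slice(), ys.as_slice()].concat()),
                   fold(&xs).combine(fold(&ys)));

        let (xs, ys) = ([BogusDiff(1), BogusDiff(2)], [BogusDiff(3)]);
        // Unlawful instance: the same equation fails - garbage in, garbage out.
        assert_ne!(fold(&[xs.as_slice(), ys.as_slice()].concat()),
                   fold(&xs).combine(fold(&ys)));
    }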
You are abstracting over a CPU and memory. Your abstraction leaks in that memory layout actually matters for performance, for example. Or if you have a bad RAM chip.
> Your abstraction leaks in that memory layout actually matters for performance, for example.
There are cache-aware abstractions if your situation warrants them. Of course if you abstract over a detail then you lose control over that detail. But that's not the same as a leak, and it's the very essence of programming; if the program needed to behave differently every time it ran, creating a useful program would be impossible.
> Or if you have a bad RAM chip.
That's another example of what I said about garbage in, garbage out. The fault isn't in the abstraction, the fault is the bad RAM chip. If you were manually managing all your memory addresses then a bad RAM chip would still present the same problem.
That's not exactly false, but at that point you might as well say that anything that breaks is an abstraction leak. If my car won't start in the morning, is that an "abstraction leak"? I don't think it is (or at least I don't think it's a useful perspective to see it as one), because the problem wasn't that I was thinking of the abstract notion of a car rather than the details of a bunch of different components connected together in particular ways; the problem is that one or more of those components is broken (or maybe that some of the components are put together wrong).
> You are abstracting over a CPU and memory. Your abstraction leaks in that memory layout actually matters for performance, for example.
I find the idea that an abstraction is leaky if different implementations of it perform differently to be fairly useless. I don't think it's a useful concept unless the abstraction captures the expected performance. If the abstraction doesn't give any performance guarantees, then the caller shouldn't have any performance expectations.
The same goes for abstractions around accessing a file on disk, which might fail. The abstraction should account for potential failures. If it doesn't account for failures, but the implementation does fail, then it's meaningful to call it leaky.
Performance matters sometimes. If you're in a situation where performance matters, and the abstraction doesn't capture the expected performance, but you're subject to the abstraction's actual performance anyway, then the abstraction leaked in a way that matters to you.
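For the file case, here's a minimal sketch of what "the abstraction accounts for potential failures" can look like (the trait and names are invented for illustration): failure is part of the contract, so a failing disk is a handled case rather than a leak.

    use std::fs::File;
    use std::io::{self, Read};

    // A hypothetical storage abstraction: failure is part of the signature,
    // so callers must handle the error case up front.
    trait BlobStore {
        fn read_blob(&self, key: &str) -> io::Result<Vec<u8>>;
    }

    struct DiskStore {
        root: std::path::PathBuf,
    }

    impl BlobStore for DiskStore {
        fn read_blob(&self, key: &str) -> io::Result<Vec<u8>> {
            let mut buf = Vec::new();
            File::open(self.root.join(key))?.read_to_end(&mut buf)?;
            Ok(buf)
        }
    }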
You've found that the abstraction isn't useful for doing your task. That's not leaking. That's like complaining that your ice cream maker can't cook rice. That's not what it's for.
In fact, this is a common manner for abstractions to become leaky. You find you are in need of some guarantee not present in the abstraction. You choose to add whether or not that guarantee is satisfied to the shared interface. Congratulations! You've added a leak to the abstraction.
But that's not the only option available. If you need a guarantee not provided by an abstraction, you could ignore the abstraction and use something that actually provides the guarantees you need.
Abstractions are equivalences, not equalities. You shouldn't expect an abstraction to make a linked list the same thing as a vector - they aren't, and they never will be - but they are equivalent for certain purposes, and a good abstraction can capture that equivalence. The performance of those two different collections is not the same, but that's not a leak unless the abstraction tried to claim that it somehow would be the same.
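A tiny Rust sketch of that kind of equivalence (just for illustration): the abstraction promises "a sequence of items you can traverse in order" and says nothing about layout or performance, so the two collections performing differently isn't a leak.

    use std::collections::LinkedList;

    // The abstraction: "something you can traverse once, in order".
    // It captures the equivalence and promises nothing about memory
    // layout or performance.
    fn total<I: IntoIterator<Item = i32>>(items: I) -> i32 {
        items.into_iter().sum()
    }

    fn main() {
        let v: Vec<i32> = vec![1, 2, 3];
        let l: LinkedList<i32> = [1, 2, 3].into_iter().collect();
        assert_eq!(total(v), total(l)); // equivalent for this purpose
    }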
> You shouldn't expect an abstraction to make a linked list the same thing as a vector - they aren't, and they never will be
I would even argue that's the point of an abstraction. Hide the details that don't matter to the caller. If performance is a detail that matters, and the abstraction doesn't capture it, then you're using the wrong abstraction.
Yeah, and that is still a completely pointless observation, because literally nothing I do will change because of it. Even if abstractions leak in these edge cases, we are still better off trying to come up with those same abstractions than not.
The theories still have to relate to the actual programming environment at some point. Any proof about a Haskell program's correctness relies on a leaky abstraction (an axiomatization) of what will actually happen when you run GHC on the source file.
WTF? The set of integers isn't finite. There are non-leaky ways to represent integers or computable reals in a computer (of course one cannot compute uncomputable reals, by definition). And plenty of finite subsets of either are well-behaved and non-leaky. If you treat a finite subset of the integers as being the set of all integers then of course you will make mistakes, but that's not a problem of abstraction.
Is it possible that you are misinterpreting what Spolsky meant? I think he means that in the real world we interact with implementations of abstractions, and that the implementation always shines through and can bite you in the ass. This is what makes side-channel attacks possible, and (in Spolsky's view) unavoidable.
> I think he means that in the real world we interact with implementations of abstractions, and that the implementation always shines through and can bite you in the ass.
I understood fine. He asserts that "always" on the basis of a handful of examples, only one of which even attempts to show anything more than a performance difference. It's nonsense.
I think you're giving Haskell too much credit... In my experience most abstractions need to be replaced because of performance requirements - achieving lower latency, higher throughput, etc. That's the reason to go with UDP instead of TCP. Not sure if this sort of leakiness falls under what Joel had in mind though.
IMHO the switch to UDP is happening because the work TCP does to ensure reliability is now done by the network itself, and thus having TCP do it is redundant. TCP assumed a very simple and dumb network, which is no longer the case.
More or less reliability in the datagram layers affects performance - for example, WiFi does its own retransmissions whereas Ethernet does not, because WiFi uses a less reliable physical layer, and because you don't want your packets to have to go from London to New York and back before you discover one of them was lost.
But reliability at the WiFi layer cannot give your application the semantics of an ordered data stream, so it is not a substitute for TCP. You can replace TCP with a different transport protocol if you want different behaviour, e.g. SCTP or DTLS or QUIC, but in all cases they are providing a higher-level abstraction than raw datagrams, not just (and not necessarily) more reliability.
1. Every type has a bottom in every mainstream language, most of them are just less explicit about it.
2. Bottoms do not make abstractions leaky in some generalised sense. The "fast and loose reasoning is morally correct" result applies: any abstraction that would be valid in a language without bottoms is still valid wherever it evaluates to a non-bottom value.
I agree. Dijkstra et al. always pushed (as early as in the 1960s) that resources used at abstraction level n should be effectively invisible at level n + 1. Anything else is an improperly designed abstraction.
Of course there's always the thermodynamic argument that "any subprogram has the permanent and externally-detectable side effect of increasing entropy in the universe by converting electricity to heat" but that is to me a bit of a Turing tar-pit of an argument.
Even that's just an effect that you can represent in your language. The evaluation of 2 + 2 is not exactly the same thing as the value 4, but you could track the overhead of evaluation (e.g. in the type) and have your language polymorphically propagate that information.
My takeaway from it is that we need to distinguish what you might call essential from incidental duplication. Essential duplication is when two bits of code are the same because they fundamentally have to be, and always will be, whereas incidental duplication is when they happen to be the same at the moment, but there's no reason for them to stay that way.
For example, calculating the total price for a shopping basket has to be the same whether it's done on the cart page or the order confirmation page [1], so don't duplicate that logic. Whereas applying a sales tax and applying a discount might work the same way now, but won't once you start selling zero-rated items, offering bulk discounts to which coupon discounts don't apply, etc.
[1] Although I once built a system where this was not the case! In theory, the difference would always be that some additional discounts might be applied on the confirmation page. In theory ...
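A toy illustration of that distinction in Rust (made-up names, integer prices for brevity): one shared definition where the sameness is essential, two separate ones where it's only incidental.

    // Essential sameness: the cart page and the confirmation page must
    // agree, so there is exactly one definition of the basket total.
    fn basket_total(prices: &[u32]) -> u32 {
        prices.iter().sum()
    }

    // Incidental sameness kept separate: both happen to be "scale by a
    // percentage" today, but they are expected to diverge (zero-rated
    // items, bulk discounts that coupons don't stack with, and so on).
    fn apply_sales_tax(subtotal: u32, rate_percent: u32) -> u32 {
        subtotal + subtotal * rate_percent / 100
    }

    fn apply_discount(subtotal: u32, percent_off: u32) -> u32 {
        subtotal - subtotal * percent_off / 100
    }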
I don’t understand why people interpret that article as recommending you avoid prematurely removing duplication and comparing it to the rule of 3. The point of her essay is that you should resist the sunk cost fallacy and refactor your duplication-removing abstractions when requirements (or your understanding of them) change.
The problem isn't DRY. The problem is most programmers' inability to tear down abstractions that aren't correct for your new requirements when they evolve.
Yeah that sucks. Especially when you're designing service endpoints and your colleague insists upon reusing an existing endpoint instead of opening up a new one, because the two use cases looked the same when he squinted hard enough.
Now instead of /credit-card and /debit-card, which are independently testable, debuggable and changeable, you just have /card. Can't change the debit logic in /card because it will break credit. Can't change the credit logic in /card because it will break debit.
Well... early in my career, a colleague and I developed a bit of a mantra: "Did you fix it everywhere?" Later in my career, I learned the value of there only being one place to have to fix.
At the same time, too much DRY can over-complicate (and even obfuscate) your code. That's not the answer either.
Taste. Taste, experience, and wisdom. But I don't know how to give them to someone who doesn't have them. Maybe by pointing out the problems of specific things they're trying to do, in a way that they (hopefully) can understand and see why it's going to be a problem. Maybe...
By everything, do you mean outside of programming also? Or just in programming? I guess I find DRY pretty important for me, as a tool to force me to abstract and to help me understand the system I'm working on.
> All programmers are forcing their brains to do things brains were never meant to do in a situation they can never make better, ten to fifteen hours a day, five to seven days a week, and every one of them is slowly going mad.
Yeah, this resonates. I've got in trouble with my wife more than once for being in "code mode" when I'm working and something happens, it seems to turn off my basic empathy for some reason. Programming changes people.
Lately I've been into what I would call "the classics": Knuth, Peter Norvig, Minsky, Dijkstra... I realised most of what nowadays is called modern software/techniques basically consists of re-framing old essays from them.
Some of my references: the famous Norvig view on design patterns [1] and also his view of clean code [2]. Knuth on programming [3] is also really enlightening.
Definitely recommend. I recently bought a book on Amazon on distributed systems[0] which talked about fault tolerance without even a cursory mention of Joe Armstrong's work. I've returned the book, but I wish I could have browsed the bibliography before buying.
I've never understood why people like this paper. Over the years I've evolved a bite-sized rebuttal, if anybody wants to fight me:
The authors' "ideal world" is one where computation has no cost, but social structures remain unchanged, with "users" having "requirements". But the users are all mathematical enough to want formal requirements. The authors don’t seem to notice that the arrow in "Informal requirements -> Formal requirements" (pg 23) may indicate that formal requirements are themselves accidental complexity. All this seems to illuminate the biases of the authors more than the problem.
> But the users are all mathematical enough to want formal requirements.
This seems like a strange interpretation. I understand that the term "formal requirements" has a technical meaning in some disciplines, but I also think it's pretty clear that the author isn't using the term in that way.
It is much more likely that the author meant that users have requirements, but those requirements don't typically map cleanly to actions taken by the computer. This step of translation is necessary in the construction of a program, even if it is typically done piecemeal and iteratively.
It's a strange interpretation only if you focus on just the vocal minority that talks about formal requirements. Here are two quotes by Hillel Wayne (https://www.hillelwayne.com/post/why-dont-people-use-formal-...), whom I've found to have the most balanced take:
"The problem with finding the right spec is more fundamental: we often don’t know what we want the spec to be. We think of our requirements in human terms, not mathematical terms. If I say “this should distinguish parks from birds”, what am I saying? I could explain to a human by giving a bunch of pictures of parks and birds, but that’s just specific examples, not capturing the idea of distinguishing parks from birds. To actually translate that to a formal spec requires us to be able to formalize human concepts, and that is a serious challenge."
"It’s too expensive doing full verification in day-to-day programming. Instead of proving that my sort function always sorts, I can at least prove it doesn’t loop forever and never writes out of bounds. You can still get a lot of benefit out of this."
The post makes a strong case, IMO. Formal methods can be valuable, but they don't have the track record yet for anyone to believe they're the #1 essential thing about programming.
Fourteen years later, I still think about this. I have three half-written drafts of blog posts about things this essay has inspired me to do.
I think part of the mystique of it is that the authors kind of faded into the background after publishing it. I've never been able to find a follow-up paper from them.
As a less experienced web developer, I found "How I write backends"[1] very enlightening and I recommend it to all my peers when we discuss useful resources.
I just read it but I think it focuses too much on specific tools (Redis, Node,...) and their configurations which might not be the best for most use cases. Especially if it's for a beginner who maybe doesn't have to start out with a load balancer and Redis.
"If you choose to write your website in NodeJS, you just spent one of your innovation tokens. If you choose to use MongoDB, you just spent one of your innovation tokens. If you choose to use service discovery tech that’s existed for a year or less, you just spent one of your innovation tokens."
I think the author is stuck in 2010.
___
1. https://www.infoq.com/presentations/Simple-Made-Easy/