Simple Systems Have Less Downtime (gkogan.co)
685 points by gk1 on March 3, 2020 | 263 comments


Instagram was what, 12 employees when they got sold for a gazillion dollars? They all could fit into a van. Because they kept their system simple. It was (and still is) a monolith. Now imagine that they decided to go the microservices way. Multiply that team size by 10 at least.

Don't solve problems you don't have.


WhatsApp, similarly, had 30+ employees when they got acquired [0]. They built the fastest IM on the market, with 450M+ users sending 1B+ messages every day, and at one point surpassed Facebook in the number of images uploaded.

The engineers they had were world-class, so really, I think saying that microservices (or the latest fad) get in the way is disingenuous, since you also require world-class talent to begin with (if you're going to keep the team size small and still manage crazy scale) and a competitor willing to pay through the nose for the acquisition.

[0] https://www.sequoiacap.com/article/four-numbers-that-explain...


They chose Erlang, a language built for communication and managing wire protocols at scale, which describes WhatsApp itself. That was probably the single most impactful technical decision WhatsApp made.


I am not sure if WhatsApp's engineers chose ejabberd because it was written in Erlang or because ejabberd was the de facto implementation of XMPP. They stumbled upon and fixed bugs in BEAM/OTP at their scale [0][1][2]. They also ran FreeBSD (for its superior networking?) on bare-metal hosts running customised system images [3] and employed networking experts at some point.

[0] https://www.youtube-nocookie.com/embed/c12cYAUTXXs

[1] https://www.youtube-nocookie.com/embed/wDk6l3tPBuw

[2] https://www.youtube-nocookie.com/embed/93MA0VUWP9w

[3] https://www.youtube-nocookie.com/embed/TneLO5TdW_M


They ran FreeBSD because Jan came from Yahoo, and that's what Yahoo ran.


'Customized systems (kernel), FreeBSD, bare metal': which is what most of us still do if we care about performance for a single application. Try running OpenMP Fortran code for both host and accelerator on oversubscribed VMware or vanilla KVM/QEMU for a good laugh and a lot of pain. Simple-systems approaches are best even when what you are doing is complicated.


I could be wrong, but I think FreeBSD allowed multiplexing connections per port before Linux? Please correct me if I am mistaken about this.


Erlang is like Oogie Boogie from The Nightmare Before Christmas: it is literally made out of microservices at every level.

Microservices are a design philosophy that people confuse as a deployment strategy.


> Microservices are a design philosophy that people confuse as a deployment strategy.

I'm going to print this and show everyone, great words, thank you.


Not necessarily; you can design Erlang systems with more coupling than you'd expect in traditional microservices. Microservices mean that your codebase is segmented along business-concern lines.


I would rather argue that microservices are best implemented when each service owns a bounded context in the lingo of domain-driven design. Business concerns might align with that, or they may not, depending on how technically complex (vs product complexity) the product is.


Yes, that is a better definition, but nonetheless you can design Erlang systems with a single domain, or design Erlang systems as a monolith, even if they have a billion actors running around underneath.


Kinda, yeah, except Erlang is microservices made easier. I've loved how the language makes building distributed systems so comparatively easy since I ran across it back in 2005.


That was probably the single most impactful technical decision WhatsApp made.

As a rule of thumb based on my own experiences and the opinions of more experienced engineers I've had the good fortune to work with, language choice is far less important than the quality of the team using it.

While I have no doubt that trying to build WhatsApp in a language that would be the wrong tool for the job (say... PHP) would have been fatal, I have many more doubts that choosing Erlang was the key enabling decision.


Language choice I can perhaps agree with, but from what I've read, the biggest factor with Erlang is much more its runtime environment. BEAM + OTP is a very impressive piece of kit.


I don't know. Erlang is a "funny" language that can somewhat easily do certain things that would be much more time-consuming to accomplish in a more mainstream programming language. The difference between building WhatsApp in Erlang vs. any other programming language is probably larger than the difference between building it in PHP vs. any other mainstream language.


Converse: can you imagine a parallel-universe WhatsApp serving its users with Node?


Yes :)

Success has more to do with user experience.

If you hit gold and scale you can almost always optimize the implementation later.


Iirc Slack's backend is PHP.


They've moved over everything to Hacklang, which is a derivative of PHP.


It's amazing how much effort it takes to do something with the wrong tools.

Sunk Cost Fallacy tends to fight any broad-stroke improvements. Until a competitor starts eating your lunch.


The thinking is backwards: if you are only 10 people, there is no need to shape the system in a microservice fashion. You just extract the pieces that need scaling. If you are 60 devs, you need to split the system so that everyone can work on it without stepping on each other.


Also you can have so few devs when your value is in your network. Most other businesses' value is in their features. We have customers constantly begging for features, so we have to have more engineers to produce more value to our customers.


I believe that the instinct to over-engineer is based in part on bad prior experiences with trying to separate concerns after it's 'too late'. Either your own personal experiences, or those of your mentors.

Lacking any better skills to identify and avoid those problems when they begin, they try to stop it from happening in the first place. Fences get erected everywhere in case they might be needed, and they frequently turn out to be in not quite the right spot or shape. The code becomes coupled to the bad interface instead of to other code, and the fixes are just as bad.

YAGNI in theory is about trying to develop those other skills, but gets twisted into an excuse for bad tech debt loads.


As someone said, microservices is a technical solution to a people problem. Devs don't want to talk to each other so they wall off behind their own API. Boom, no need to talk to each other. Ever. Or is there?


Requiring everyone to talk to everyone so everyone has global context isn't just "devs don't want to talk to each other"; it is actually an information dissemination and coordination problem which scales non-linearly (at least n^2), and needs some kind of modularity to be tractable to normal humans.

Microservices are like modules but for SaaS rather than shrinkwrap, and are where you end up when you follow SLAs, encapsulation of resource consumption, etc. to their logical conclusion.


"Microservices" don't guarantee that everyone doesn't have to talk to everyone. Good thoughtful design is necessary regardless of how you're building/organizing/deploying/operating the code.



Microservices are not the only way to split a system.

Far from it. They are the most onerous, least tractable way to get what you are going for.


I'm now working with a system where we add microservices. In my experience, they allow you to split a team that then owns its whole deployment cycle. This allows for easier hot-fixing and dealing with database schema changes.

There are now almost fifty developers and we already have a third (micro-)service. ;-)

I agree completely that starting with a monolith is a win for feature development, performance and operations. But if your team grows, your release cycle will slow down unless you decentralize it.


That has nothing to do with the release cycle. A large monolith can still do continuous delivery and release multiple times per day. What tends to slow down is the feature delivery cycle.


Would you care to explain your reasoning? This topic genuinely interests me.

My experience is quite the opposite. To deliver a feature, you still need to integrate changes in multiple services. Doing it in a monolith is IMHO easier: you have a single artifact you can test, and the probability that you have the right automation is higher. Also, the devs will run a bigger part of the system in their development environment.

What I was talking about was the latency of smaller changes. Think of bug fixes. Somebody has to notice it, triage, fix, (wait for automated tests) and deploy. Often, a bug fix affects just one service. With smaller services, this chain is simpler.


There are no hard rules. In general a microservices architecture tends to enforce segmentation which makes it faster to iterate on features that only impact a single service. But coordinating larger changes across multiple services owned by separate teams can certainly be slower.

Monolith architectures often degenerate into the proverbial "big ball of mud" over time. But if the team has the discipline to maintain proper design through continuous refactoring then they can retain the ability to deliver features with short cycle times.


Sure, but the value of Instagram came from the fact that it was a good idea, well executed, at the right moment in time. I'm not sure they managed to solve a ton of complexity with a small team.


Has to be true. They were dealing with major scale pre-purchase. Also... talented set of engineers.


In my experience you do end up having external dependencies and more than one service. You do end up breaking out some code into special instance types, (high ram for video processing or what have you). These are problems you do have and do have to solve so you might as well come up with a plan. Deploying microservices really isn't that hard once you make it routine, imo.

But what do I know? I would not have expected Instagram to run things like user login on the same instance as photo upload and processing.


Microservice vs monolithic is a shade of grey. What if all the code is in one codebase, but distributed to varying instance types that use some config flag to say which role each plays in that context? Is that monolithic or microservice? I'd say the code is monolithic and the architecture is microservices, so it's some hybrid version.
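(For what it's worth, a tiny sketch of that config-flag approach, with made-up role names and stubbed-out workers: one artifact everywhere, and an environment variable decides what actually runs on a given instance type.)

    import os

    def run_web_server():       # hypothetical: whatever serves user requests
        ...

    def run_image_workers():    # hypothetical: the RAM-hungry processing role
        ...

    # Same codebase and artifact on every instance type; only the flag differs.
    ROLE = os.environ.get("APP_ROLE", "web")

    if ROLE == "web":
        run_web_server()
    elif ROLE == "image-worker":
        run_image_workers()
    else:
        raise SystemExit(f"unknown APP_ROLE: {ROLE}")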

What you're talking about (isolation of responsibilities) can be done and still be considered monolithic. You can also use the exact same instance type for everything and still be considered microservices.

I think what we're really talking about is containers vs machine images. And personally I think containers right now are suffering from the same abuse/hype that datastores like Redis/Mongo/Couch suffered. Sure, they have applications and solve problems, but they're being overused to the point of causing technical debt.


It doesn't mean that they do. Their system is partitioned to a point, and something like image processing is very easy to offload and expose via HTTP, for sure.


I would really love to read their codebase.



typically deploying to production around a hundred times per day

That is insane!

We make a release once a week at work and things still go wrong sometimes. I am in awe how they are able to pull this off, especially at their scale.


The more frequently you release, the fewer (or less serious) bugs there typically are, and the earlier you catch them.

The faster development cycle also helps people invest in testing infrastructure more effectively.


I'm trying to get a sense of magnitude, here.

Let's say 10-20 commits per day per dev. Over 5~10 hours that's what, on the order of 1 commit-test-release cycle every 15 to 60 minutes? (subjectively for each dev)

What do we actually write in that timeframe on average (thus including the ~90% of time we don't type code but think or read or test)? What's the "unit commit" here?

So I'm thinking... let's take an example: today I'll refactor a few functions to update our model handling; I wish to reflect our latest custom types in the code. So it's a lot of in-place changes, e.g. from some list to tuple or dict; and the syntax that goes with it. No external logic change, but new methods mean slight variations in the details of implementation.

- refactor one function: commit every testable change, like list to tuple? At least, I'm sure I'm not breaking other stuff by running the whole test suite every time it "works for me" in my isolated bubble. So I might commit every 5-10 minutes in that case.

- Now I'm touching the API so I can't break promises to clients: I actually need to test more rigorously anyway. I'm probably taking closer to 20-40 minutes per commit, it's more tedious. Assuming I commit every update of the model, even insignificant, I get immediate feedback (e.g. performance dump), so I know when to stop, backtrack, try again? And it's always just one "tiny" step?

- Later I review some code and have to go through all these changes. I assume it's easier to spot elementary mistakes; but what of the big picture? Sure I can show a diff over the whole process — I assume you'd learn to play with git with such an "extreme" approach.

Am I on the right track, here? I totally get your comment but I'm trying to get a feel for how it works. I typically commit-test-release prod 3-4 times a day at most (on simple projects), and typically more like once every 2-3 days, 2-3 times a week. Which is "agile" enough I reckon... So I'm genuinely interested here. I feel there's untapped power in the method I'm just beginning to grasp.


Are you assuming the tests are all 100% automated? If QA needs to take a look, how is it possible to have a commit-test-release cycle every 15-60 mins? I mean, it would take a human a few minutes just to read and understand what they need to test, wouldn't it?

The article talks about static analysis, I wonder if they do human code reviews at all?

Any way we slice this, it is incredible! Sure, Instagram is not a healthcare, transport or banking application - nobody is going to die if the website goes down - but it is still an awesome achievement.


Indeed, and not only are their tests automated, they also rely on production traffic to expose failure cases, since some problems only become apparent at scale. They use canary deployments to slowly ramp up traffic to the new version, rolling back if they detect anomalies.
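(Not Instagram's actual tooling - just a rough sketch of the canary idea, with made-up helpers standing in for the load balancer, the metrics query, and the rollback step.)

    import random
    import time

    def set_traffic_split(version, percent):   # hypothetical load-balancer call
        print(f"routing {percent}% of traffic to {version}")

    def error_rate(version):                   # hypothetical metrics query
        return random.uniform(0, 0.002)        # stand-in for real monitoring

    def rollback(version):                     # hypothetical
        print(f"rolling back {version}")

    BASELINE = 0.001            # error rate of the current version
    THRESHOLD = 3 * BASELINE    # what counts as an anomaly, for this sketch

    def deploy_with_canary(version, ramp=(1, 5, 25, 50, 100), soak=300):
        for percent in ramp:
            set_traffic_split(version, percent)
            time.sleep(soak)                   # let metrics accumulate at this step
            if error_rate(version) > THRESHOLD:
                rollback(version)              # any anomalous step aborts the ramp
                return False
        return True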

Maybe you'll find this video interesting: https://youtu.be/2mevf60qm60


I think you've got it. Now put 10 people on the project, and have them all working at that pace.


Ah, awesome, thanks for the feedback.


What’s their stack?


Primarily, I think, it's a Django project https://www.youtube.com/watch?v=lx5WQjXLlq8&t=10s


I am straining to hide my lack of being impressed


But microservices are an example of simpler systems. Each microservice does far less than the whole monolith does. You can read all the code in ~15 minutes.

I've worked at companies that have monoliths that are 50x more difficult to work on because of the size. Some of them millions of lines of code. Nobody really knows how they work anymore.


> But microservices are an example of simpler systems. Each microservice does far less than the whole monolith does. You can read all the code in ~15 minutes.

Microservice usually means distributed system. A distributed system is more complex than a non-distributed system since it has to do everything the non-distributed system has to do and, additionally, handle all the distributed problems. Microservices just hide the complexity in places where people don't see them if they take a cursory look over the code, e.g. what is a function call in a monolith can be a call to a completely different machine in a microservice architecture. Often they look the same on the outside, but behave very differently.
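(A small illustration of that point, with made-up names and a hypothetical users.internal service: the two functions below have the same shape, but the second can fail in ways the first never will - timeouts, DNS errors, the other service being mid-deploy.)

    import json
    import urllib.request

    def get_user_local(user_id, users):
        # Monolith version: an in-process lookup; the only failure mode is bad input.
        return users[user_id]

    def get_user_remote(user_id, base_url="http://users.internal"):  # hypothetical service
        # Microservice version: same shape, but there is now a network in the middle.
        with urllib.request.urlopen(f"{base_url}/users/{user_id}", timeout=2) as resp:
            return json.loads(resp.read())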

The hierarchy of simplicity is: Monolith > multithreaded[1] monolith > distributed system. If you can get away with a simpler one it will save you from many headaches.

> I've worked at companies that have monoliths that are 50x more difficult to work on because of the size. Some of them millions of lines of code. Nobody really knows how they work anymore.

That is a bad architecture, not something inherent to a "monolith". There's probably also a wording problem here. A monolith can be built out of many components. Libraries were a thing long before microservices reared their ugly heads. What you describe sounds more like a spaghetti architecture where all millions of lines are in one big repository and every part can call every other part. Unfortunately, microservices are not immune from this problem.

[1] or whatever you want to call "uses more than one core/cpu"


>Microservice usually means distributed system. A distributed system is more complex than a non-distributed system since it has to do everything the non-distributed system has to do and, additionally, handle all the distributed problems. Microservices just hide the complexity in places where people don't see them if they take a cursory look over the code, e.g. what is a function call in a monolith can be a call to a completely different machine in a microservice architecture. Often they look the same on the outside, but behave very differently.

This is a false assumption. Some problems are distributed. Sometimes you'll have an external data store or you'll need to deal with distribution across instances of the monolith. You really run into pain when you build a distributed system in your single monolithic code base and your monolithic abstractions start falling apart.

In my experience you end up solving these problems eventually, monolith or not. You might as well embrace the idea that you're deploying multiple services and some form of cross service communication. You don't need to go crazy with it though.


What assumption are you talking about?

Anyway, if your problem requires a distributed system, congratulations, you'll have to go to the top of that complexity hierarchy, and will have to solve all the problems that come with it.

That doesn't change anything about there being more problems. You just don't have any other option.


Simple example where this is not quite true: moving a slow, fallible operation out of band to a durable queue with a sensible retry policy will tend to make the system simpler and less brittle, even though it becomes distributed.
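(A minimal sketch of what that can look like, assuming a hypothetical send_welcome_email task and using SQLite as the durable store; a real system would likely use a proper job queue, but the shape is the same: the request path only enqueues, and a worker retries with backoff up to a limit.)

    import sqlite3
    import time

    db = sqlite3.connect("jobs.db")
    db.execute("""CREATE TABLE IF NOT EXISTS jobs (
        id INTEGER PRIMARY KEY, payload TEXT, attempts INTEGER DEFAULT 0)""")

    MAX_ATTEMPTS = 5

    def enqueue(payload):
        # Called from the request path: cheap and durable, never blocks on the slow work.
        db.execute("INSERT INTO jobs (payload) VALUES (?)", (payload,))
        db.commit()

    def send_welcome_email(payload):
        ...  # hypothetical slow, fallible operation

    def worker_loop():
        while True:
            row = db.execute("SELECT id, payload, attempts FROM jobs "
                             "WHERE attempts < ? LIMIT 1", (MAX_ATTEMPTS,)).fetchone()
            if row is None:
                time.sleep(1)
                continue
            job_id, payload, attempts = row
            try:
                send_welcome_email(payload)
                db.execute("DELETE FROM jobs WHERE id = ?", (job_id,))
            except Exception:
                # Leave the job in place and back off before the next attempt.
                db.execute("UPDATE jobs SET attempts = attempts + 1 WHERE id = ?", (job_id,))
                time.sleep(2 ** attempts)
            db.commit()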


Microservices, often, is just another word for "distributed monolith". Sure you can read the code of a single service quickly, but often in practice it's not possible to just make changes to one service. There are usually both explicit and implicit dependencies that span many service layers. What you gain in readability I think you often lose more in maintenance overhead.


I worked on a microservices practice (with over 400 microservices by the time I left) where it definitely was not a "distributed monolith". I could change all kinds of individual services that did not require changes to other services.


I think monoliths can be written the same way such that you can change one module without changing the others. Microservices enforce that best practice.


"Enforce" gets thrown around a lot, but that's approaching silver bullet expectation levels, in my opinion.

Core skill problems on a team that would prevent building a maintainable monolith do not go away because you've added more things for them to manage.


The complexity is still somewhere. You can structure your monolith in a way that you can read each module's code in ~15 minutes. Each module also can be developed and tested in isolation.

However, if the organization is not capable of structuring the monolith, why should it be successful with microservices?

Such organization may lead to sharing data stores and custom libraries between microservices and that's when the real fun begins. Maybe even trying to deploy all of the services atomically to not worry about API compatibility.


Underrated point, although incomplete from my perspective. 2 simple systems > 1 complex system.

But 1 semi-complex system > 10 simple systems, especially when you consider that the number of integration points between those systems grows geometrically with the number of systems.


Microservices are simple components of what could be a simple or a complex system. If things are overly broken down then unnecessary complexity could easily be added.


Microservices make you architect your whole service differently. The communication is essentially asynchronous message passing which may or may not make things more difficult.


Excellently said


I gave a talk on this subject at CU last year and have, one way or another, spent my entire professional life thinking about this topic.

I agree wholeheartedly that simple systems have less downtime.

I would like to add a line from the talk that I give:

Simple systems fail in boring ways. Complex systems fail in fascinating, unexpected ways.

rsync.net storage arrays typically have multi-hundred day uptimes. But across our entire network, our actual aggregate uptime is unimpressive - perhaps 99.99% ? Our SLA dictates 99.95.

But when they do fail they fail in very, very boring ways. There are usually zero decisions to make in response to a failure.


"simple systems have less downtime"

That is not much of an insight. It is as insightful as saying "water is kinda wet." Well... sure it is.

What we need to deal with is not "make the simplest system".

Rather, we need to deal with: "build a system that does A, B, C, ... and so on". Now, if you can do all of the above and make it simple... awesome. But if you cannot do all of the above, but the system is simple... that is useless.


"Rather, we need to deal with: "build a system that does A, B, C, ... and so on". Now, if you can do all of the above and make it simple... awesome."

I agree with this. It is a good point.

Some systems and protocols have very difficult requirements to meet and different constituencies driving those requirements - it's not always possible to implement simple and elegant solutions.


Beware: for most, it is not that obvious. For most developers, "modern", "extensible", and "layered" come before "simple".


rsync.net is built on ZFS which is anything but simple. Having worked with multi-petabyte ZFS systems, you can run into some seriously hard to track down issues.


"rsync.net is built on ZFS which is anything but simple. Having worked with multi-petabyte ZFS systems, you can run into some seriously hard to track down issues."

Obviously I have a few things to say about this ...

First, ZFS was considered production-worthy on FreeBSD and put into fairly widespread use as early as... 2008? 2009? We could have made very good use of it, as we were running into all kinds of limits and corner cases with UFS2 and very large filesystems with hundreds of millions of inodes.

But we waited until late 2012 to do our first deployment and it took us about six years to finally deprecate the last UFS2 systems. We did this out of an abundance of caution and a desire to see things shake themselves out over several major releases of FreeBSD.

As for the complexity of ZFS, our previous architecture had 3ware RAID cards and all of their firmware and complexity sitting between the drives and the OS. In a way there was a beautiful elegance (in my opinion) in giving the OS a single drive:

    newfs /dev/da0
.... and that single drive just happened to be 40 TB in size and the OS has no idea what's going on underneath ... but there is a lot of complexity inside a full-blown RAID card and a lot of ways for drives to interact weirdly with it - especially when the drive is X years newer than the latest firmware for the card.

I find that on the very deepest, hardware level, handing over raw disks to ZFS and letting it manage them is simpler. We remove a fairly complex piece of hardware and accompanying firmware (since our HBAs are "degraded" to dumb "IT" mode).

Further, all of the "bolt-ons" of UFS2 that we absolutely relied on, such as quotas and snapshots, are elegantly built into the filesystem from the very lowest levels.

We ran rsync.net (and its predecessor) on UFS2 for 11-12 years and I can tell you from the perspective of both day to day management and middle of night firefighting, ZFS has made our lives much simpler.


Is ZFS more or less complicated than mdraid+LVM+ext4?


I'd say it's more complicated even for simple ZFS pools without any redundancy or fancy features. My favorite illustration on this topic would be figure 2 from "Reliability Analysis of ZFS" (PDF): http://pages.cs.wisc.edu/~kadav/zfs/zfsrel.pdf

I still miss some of the features that are hard (if not impossible) to replicate in layered storage systems, like Merkle-tree checksumming, but the near-nightmare that low-level debugging of ZFS turned out to be actually turned me away from that filesystem.


Sorry for the off-topic question.

There's a broken link that I was interested in reading more about. It's on https://www.rsync.net/resources/howto/rsync.html - Ctrl+F "rsync snapshots are detailed here".

I think, given the topic, someone is expected to figure it out on their own (remember the Linux cake?), but I feel a bit lazy after a work day... sorry


Sorry - that is on purpose ...

The definitive page for rsync snapshots has been, and always will be, here:

http://www.mikerubel.org/computers/rsync_snapshots/

... we actually don't want people to do rsync snapshots anymore because ZFS snapshots are much more efficient and use up less of their rsync.net account.

If you change one bit of a file, the rsync snapshot method will cost the entire size of that file, since it's a hard link and is either broken or not broken.

But if you just do a "dumb" sync to us and change one bit in the file, your ZFS snapshot will take up just one more bit.

So not only do you get a much simpler backup script - you can really just do a dumb sync to us and let us rotate your snapshots - but you get more efficient space usage for changed files.


Thanks for the link and explanations, appreciate it!

I read that page (the one I linked) inattentively, re-read the bold text next to the broken link three times, and still asked that question... :facepalm:


The only language that I have worked with that realizes that in the real world "simple" is _always_ a lie is Common Lisp. It is the only language that has actually embraced the fact that its designers/committee were not geniuses, and provided the tools for dealing with the complexity of the system.

When Unix tools fail, hope that you are on a system where it is possible to get the symbols and/or the source code, and even then good luck fixing things in situ. Most existing systems do not empower the consumer of code to do anything if it breaks. CL remains almost completely alone in this, because the debugger is part of the standard.

Most of my code is written in Python, and I can tell you for a fact that when Python code really fails, e.g. in a context with threading, you might as well burn the whole thing to the ground; it will take days to resolve the issue, roll back the code and start over. The fact that people accept this as a matter of course is pure insanity, or a sign that most programmers are in an abusive relationship with their runtime environment.


I disagree about Python. First, you can code a "manhole" into your program which lets you evaluate an arbitrary string as Python code at runtime, basically a shell available via a socket.
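(A very rough sketch of that "manhole" idea, with made-up names; in practice you'd reach for an existing library, and you'd never expose something like this beyond localhost.)

    import socket
    import threading
    import traceback

    def start_manhole(namespace, host="127.0.0.1", port=7777):
        """Serve a very crude eval/exec loop over a local TCP socket."""
        def handle(conn):
            with conn, conn.makefile("rw") as f:
                f.write(">>> "); f.flush()
                for line in f:
                    try:
                        try:
                            result = eval(line, namespace)   # expressions
                            if result is not None:
                                f.write(repr(result) + "\n")
                        except SyntaxError:
                            exec(line, namespace)            # statements
                    except Exception:
                        f.write(traceback.format_exc())
                    f.write(">>> "); f.flush()

        def serve():
            srv = socket.create_server((host, port))
            while True:
                conn, _ = srv.accept()
                threading.Thread(target=handle, args=(conn,), daemon=True).start()

        threading.Thread(target=serve, daemon=True).start()

    # e.g. call start_manhole(globals()) at startup, then connect with `nc 127.0.0.1 7777`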

Second, you don't even need that. Gdb with some tooling (see pyrasite) lets you attach to an arbitrary Python process and evaluate Python code in its context.


There's also PDB.


Forgetting Smalltalk here?

Mesa/Cedar, Oberon System were also like that by the way.

Java and .NET also share a bit of it, after all Java is half-way to Lisp (as per Guy Steele), and .NET follows up on it.


Sort of. I have a bad habit of forgetting Smalltalk in these kinds of conversations, but that is possibly because Smalltalk images exist in their own happy little worlds. I love working in Smalltalk environments, it is always a mind blowing experience, the system is completely homogeneous, accessible, introspectable, modifiable, etc. -- until you hit the hard boundary between the image and the host system, be it hardware or software. That boundary leaves quite a gap that smalltalks tend to be unable to fill by themselves (mostly due to a relative lack of resources). I can imagine a smalltalk system that could go all the way down to modern assembly, but unfortunately no such system exists today (that I'm aware of). If you can live inside the image then yes, Smalltalk is possibly even better.


> most programmers are in an abusive relationship with their runtime environment.

Just pulling this out because it's a good sentence.


> when python code really fails, e.g. in a context with threading, you might as well burn the whole thing to the ground

This sounds weird. Why burn the whole thing to the ground? You've got frames from all threads available. Why would it take days to resolve the issue? Why do you think it's easier to resolve it in common lisp?


I think I need to unpack what I mean by 'really fails' to capture what I was trying to convey. I deal with Python programs running in a number of different environments, and there are some where literally all you have on hand are the libs you brought with you. Maybe that is an oversight on my part, but the reality is that in many cases this means that I am just going to restart the daemon and hope the problem goes away, I don't have the time to manually instrument the system to see what was going on. I shudder to imagine having to debug a failure from some insane pip freeze running on a Windows system with the runtime packaged along with it.

Worst case for CL means that at the very least I don't have to wonder if gdb is installed on the system. It provides a level of assurance and certainty that vastly simplifies the decision making around what to do when something goes wrong.

To be entirely fair, the introduction of breakpoint in 3.7 has simplified my life immensely -- unless I run into a system still on 3.6. Oops! I use pudb with that, and the number of uncovered, insane, and broken edge cases when using it on random systems running in different contexts is one of the reasons I am starting no new projects in Python. When I want to debug a problem that occurred in a subprocess (because the GIL actually is a good thing) there is a certain perverse absurdity in watching your keyboard inputs go to a random stdin so that you can't even C-d out of your situation. Should I ever be in this situation? Well, the analogy is trying to use a hammer to pound in nailgun nails and discovering that doing such a thing opens a portal to the realm of eternal screaming -- a + b = pick your favorite extremely nonlinear unexpected process that is most definitely not addition. You can do lots of amazing things in Python, but you do them at your own peril. (Disclosure: see some of my old posts for similar rants.)
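(For concreteness, a tiny made-up example of the breakpoint() hook mentioned above: it's Python 3.7+ / PEP 553, it just calls sys.breakpointhook(), and it can be redirected or disabled via the PYTHONBREAKPOINT environment variable without touching the code.)

    def parse_record(line):              # made-up function, just to show the hook
        fields = line.split(",")
        if len(fields) != 3:
            # breakpoint() calls sys.breakpointhook(), which defaults to pdb.set_trace()
            # and can be redirected without editing the code, e.g.:
            #   PYTHONBREAKPOINT=pudb.set_trace python app.py
            #   PYTHONBREAKPOINT=0 python app.py    # turn every breakpoint() into a no-op
            breakpoint()
        return fields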


> The only language that I have worked with that realizes that in the real world simple is _always_ a lie

This feels like an excuse to me. I’ve worked on a lot of rather simple web apps that all more or less do the same stuff. A few of them have managed to have delightfully simple codebases, most of them haven’t. There’s no reason that couldn’t be true for all of them. You usually end up having at least some complexity. But small complexity trade offs don’t necessarily require you to undermine the simplicity of the entire system.


Sure lisp is a wonderful language, but let's not pretend there's no difference between incidental and accidental complexity.


Fred Brooks called it "essential complexity" and "accidental complexity"[1].

Essential complexity is inherent to problem being solved and nothing can remove it.

Accidental complexity is introduced by programmers as they build solutions to the problem.

Lisp is nice because it eliminates a lot of the accidental complexity through minimal syntax and lists as a near-universal data structure.

1: http://worrydream.com/refs/Brooks-NoSilverBullet.pdf


One of the traditional criticisms of Lisp, though, is that it lets programmers re-introduce a whole lot of accidental complexity in their Lisp code, and, worse, everyone introduces a completely different set of accidental complexities into their code.


That problem is not limited to Lisp.

As a programming language, Common Lisp is large enough and multi-paradigm enough to allow for elegant solutions to problems. It does require some experience with the language and some wisdom and discipline to know what pieces to select and how to best use them.

However, like all large, multi-paradigm programming languages that have been around for a while (I'm looking at you C++), programmers tend to carve out their own subsets of the language which are not always as well understood by those who come after them, particularly as the language continues to evolve and grow.

There is also the problem where programmers try to be too clever and push the language to its limits or use too many language features when a simpler solution would do. All too often we are the creators of our own problems by over-thinking, over-designing, or misusing the tools at hand.


> programmers tend to carve out their own subsets of the language which are not always as well understood by those who come after them

If the language is not powerful enough to allow for that, people will inevitably add preprocessors, code generators, etc. to do the things they want.


>> if the language is not powerful enough people will inevitably add preprocessors, code generators, etc... to do the things they want.

This is definitely true and it adds to the accidental complexity of the system, usually to save programmer time or implement layers of abstraction for convenience.

Common Lisp and C++ have both incorporated preprocessors and code generators through Common Lisp macros and C++ template metaprogramming and C++ preprocessor / macros. These features give the programmer metaprogramming powers, enable domain-specific language creation, implement sophisticated generics, etc.

They are powerful language facilities that need to be used wisely and judiciously or they can add exponential complexity and make the system much more difficult to understand, troubleshoot, and maintain.


No language can prevent someone from exercising bad taste, but a language can prevent someone from exercising good taste.

Some languages attempt to discourage bad taste by limiting the power given to users. Common Lisp embraces the expression of good taste by giving users considerable power.


This is a valid criticism, but it's not lisp specific.

We know somewhat how to control accidental complexity in systems that are highly constrained and don't let you deal well with essential complexity. And we know somewhat how to give you powerful tools to deal with essential complexity.

We haven't figured out in general how to make systems powerful enough to deal with the range of essential complexity, but constrained enough that this sort of accidental complexity isn't common.

Of course a lot of real world systems don't do a great job of either.


Clojure is simpler than Common Lisp, but I still feel that it has too many features that are too complicated, so I continue to simplify, avoid some complex features, and insist on writing systems with a pure pipeline structure.

As a result, you get the simplest system, but the design process is a systematic undertaking. It is difficult to design a complex system into a simple and smooth pipeline system.


Do you mean inherent vs accidental?

I think "incidental" (non-essential, secondary, happenstance), and "accidental" (non-intentional, happenstance) are more or less synonyms here, not contrasts. I think there, uh, is basically no significant difference between "incidental complexity" and "accidental complexity".


Fred Brooks could have identified a third problem: accidental complexity is fun; inherent complexity is boring.


I'm referring to this part when I talk about the incidental-accidental complexity dichotomy:

[...] in the real world simple is _always_ a lie [...]


Care to elaborate on what corresponds to incidental/accidental complexity in CL? I tried to understand these concepts, but it feels very subjective. Whether or not something is accidental seems to be in the eye of the beholder.


Most proponents of Common Lisp swear by Emacs/SLIME as the perfect IDE for it, but that may not be to everyone's taste. Are there alternatives that are just as good?


There was a better alternative: Lisp machines. But in today's world, there's nothing as good as Emacs/SLIME. You can get almost there by connecting another editor to Lisp with an extension that speaks SLIME's SWANK protocol (like SLIMV for vim).


>Lisp

All those complicated lists? Forth is even further down the path to ultimate minimalism. Really only one data structure, a stack with binary values on it...


The point wasn't about minimalism.


Common Lisp was never about minimalism. That was more the territory of Lisp dialects like Scheme.


The fact that I can program in the highest-level language and still have realistic, working tools to control the machine code that is being generated, or choose the level or type of optimizations, and change all of that case by case during the lifetime of the application, kind of blows my mind (FYI, I am using SBCL).


See Gall's Law:

> A complex system that works is invariably found to have evolved from a simple system that worked. A complex system designed from scratch never works and cannot be patched up to make it work. You have to start over with a working simple system.[9]

* https://en.wikipedia.org/wiki/John_Gall_(author)#Gall's_law


I find it mildly annoying when people state such nuggets of knowledge as “laws”. Sure, it’s generally true that you can’t go for the most sophisticated and powerful system at the first go, but sometimes the minimal working system is complex no? Anyway, it’s certainly not a “law” of the universe.


Klysm's Law:

Nuggets of knowledge are not laws. Referring to them as such annoys Klysm.


An aphorism being called a law is called a klysmism.


"Law" can also be used for pithy statements that don't have to be particularly rigorous. C.f. Murphy's Law.


This statement generalizes quite well all around the world to the other kind of law, that's voted in parliament/congress... haha! ;-)

More seriously, I think the word OP is looking for is theory. Nuggets aren't theory, they're really empirical by essence — observations.


It’s a colloquialism, like Moore’s law.


>For example, an analytics dashboard built with a no-code analytics tool like Looker is likely to have more qualified people to fix it than one built with a patchwork of custom scripts and APIs. Nobody should have to pull data scientists or product developers away from their work to fix a bar chart.

I don't have experience with Looker, but the unseen complexity of no-code tools often leads to very complex systems with interactions that are hard to understand. A dashboard may not have that many interactions with other tools, but the black box nature of these kinds of tools usually leads to significant downtimes when there's a small detail not working as expected, or where you need a small customization that wasn't foreseen by the no-code API creators.


As somebody who has built a lot in Looker: it is not no-code, for this very reason. That's why your Looker data model is stored in hand-editable code and kept under version control!


Looker isn't no-code. A lot of people consider Looker a slightly more robust Tableau, but the visualization aspect is almost inconsequential. It's the data modeling part, the "this is exactly what structure my data takes and we define that in code", that matters. Otherwise what you get is "Here's my Tableau workbook with an unholy pile of hand-written untested SQL as custom data sources"

Incidentally I'm all ears if there's a good open-source LookML style project out there.


That's true for drag-and-drop tools that provide a user interface as an alternative, but Looker and similar tools introduce their own language that limits your capabilities by sandboxing the environment while still letting you programmatically define your models. See LookML for Looker (https://docs.looker.com/data-modeling/learning-lookml/what-i...) or Jsonnet for Rakam (https://docs.rakam.io/docs/compose).

P.S: I'm affiliated with the company Rakam.


I think that’s missing the point he was making. No matter what front-end representation of a system you get, the implementation backing a no-code system will have edge cases around interactions both internally and with third-party systems. The fact that you’re hiding a system behind a veneer of “no code” doesn’t hide the fact that code is behind the actions that are taking place, it just means you can’t introspect it (even if this “no code” is in fact code).

I think a better point to make would be that that's not necessarily a bad thing; sometimes the abstractions provided in "no code" systems can greatly simplify a solution. But even the most well-designed solution will fail due to some essential complexity of the root problem that the original authors didn't understand. Without debugging tools (or privileged access to the backend of the implementing system) it's difficult or impossible to understand what went wrong.


Spot on. Looker is a great tool, but it has its own language, LookML, and a set of abstractions. From my limited experience it's not easy to pick up. It's likely that an analyst is already familiar with it. But as with any BI tool, if there is a problem that can't be solved with the existing abstractions and tools, you either go back to writing the pipeline step to solve it, or you come up with a very ugly solution.


I think the point, actually we don't have to think, it was stated explicitly, is that if Looker breaks, you can easily switch to an alternative.

This was presented as the rationale quite plainly, so I don't understand why people are missing it.


I like the ship analogy from the article. The fact it is so robust is not because the system as a whole is simple, but because its components have well defined, narrow responsibilities and there are abstract interfaces between them, which hide a lot of complexity. You don't need to understand the internals of the pump to explain what it does, even though the actual internal technology may be quite complex and clever. Also you don't need to understand the internals of the pump to design a rudder. This is a property that many IT systems lack. Insufficient abstraction leads to a network of components where you can't reason about anything without understanding all of it. And such systems often break in weird ways.


A startup is not a container ship going between port A and port B where everything is known.

A startup is a new destroyer that has been floated, has not had sea trials, and went to war with a crew that might have a couple of people who used to drive a container ship but is mostly staffed with kids who thought it would be cool to play with a destroyer. Oh, and 3/4 of the systems have at best been drawn on a napkin, and most of the rest came from a salvage yard. Oh, and if you complete your next mission you may get money you can spend on some of the systems, but if you do get that money you will be expected to do a lot more runs, quicker.


A modern startup is a destroyer that had all its core operations systems fedexed to the yard, as they were rented from the cheapest third-party service providers, and nobody on the ship has ever bothered to read the contracts, much less know what a "SLA" is. Inevitably, at some point in the middle of the ocean, one of these systems will get remotely bricked because the company that provided it went under or got acquired.


I would say a startup is a ship that has been floated. But it‘s unclear if it needs to be a destroyer or a cargo ship, or a cruise ship. Later on it may turn out to be a spaceship.


Your startup is probably just a variation of CRUD.


The reason why we consider shipping to be a simple and solved problem is that in 1956 McLean had a brilliant idea that to ship efficiently and well everything imaginable would be put into identical containers with no variations.

That's why it is possible to have a container ship with a crew of 5-6 people pilot the load. Containers aren't sort of identical. They are completely identical and completely standard ( several standard sizes ). They have the weight distributed in a specific way and they are stacked on a ship in a specific way depending on the weight of every container.

The startup equivalent of a container ship in simplicity is a startup sorting a pile containing A4 paper, standard business envelopes and hang folders at a daily rate of 500 items.


But it's a B2B SAAS and it's in the cloud with edge processing and probably AI.


The author, probably, picked the worst example. Here is a pic: https://iro.nl/app/uploads/2018/12/P-67-onboard-the-BOKA-Van...

No, the container ship is not a simple system. It's a very sophisticated one. It takes massive engineering effort (literally historical effort and knowledge) to build, massive resources for materials and outsourcing, very complex infrastructure (like the pic) in case of repair, satellites to guide the ship and check the weather, operators from all around the world to track the containers, massive ports to load and unload the shipments, etc...

The operation of guiding the container ship through the sea is a simple one, but only once you have everything mentioned above in order.

If you are building software (a full SaaS solution), you are not in the business of guiding the containership. You are in the business of building either the containership alone; or the whole related infrastructure (or parts of it). That's a huge undertaking that requires massive engineering efforts. It's not a simple task.

If your business is a simple team that guides the container ship, you might make money for a while, until everyone else figures out that you have no edge there and that they can do your job for much less.

The Instagram/Tinder examples are bad ones. They ignore the fact that to find the right (addictive/highly viral) app, you'll need (again) massive psychological knowledge and expertise in how primate brains work. We don't have that, so engineers are brute-forcing it by making many apps until one hits the jackpot.


Hey, author here.

My point is that simple-to-understand systems--not simple as in primitive--have less downtime, not that we shouldn't ever have complex systems. A container ship like the one in my article can be drydocked ("in the shop") for repairs and back on the water in less than two weeks. A nuclear-powered aircraft carrier cannot. So be mindful when you're building an aircraft carrier when a container ship would've sufficed.

By the way, most of the "complex" stuff in the photo--which isn't a container ship--is scaffolding and piping.

Source: Got a naval architecture degree and a marine engineering license, operated a steamship for six months at sea, and designed ships for the US Navy for three years before abandoning ship to work with startups.


An interesting thing to contrast that with is the Boeing disaster. Where they decided to make the steering actually more complex, then hide it from the pilot with software to make it look like the old, simple thing.

The result was a system that worked fine until you encountered bugs in the software, at which point there was no helping that several hundred people were about to die.


Sure, nuclear carriers cannot be dry-docked and serviced like a container ship, but they can be deployed for long periods of time without refueling and other routine services.

My point here is that the way you measure reliability is subjective and there are likely trade-offs associated with "simplicity".

For example, having a global single point of failure, like keeping all of your API servers in one cloud region, or having a single database with no replicas, is much simpler and easier to understand than a more distributed alternative. However, global resources make your system more likely to fail catastrophically when there are outages.

There are trade-offs here, and _oftentimes_ complexity in distributed systems arises because of reliability issues, not the other way around.


To be fair though the autopilot does more than move the rudder.


Did you actually read the article? He makes it quite clear why the lesson of simplicity applies to a container ship. Analogy is not homology. Just because container ships are part of a complex system does not mean that the steering system isn't valuably simple compared with other alternatives. And it definitely doesn't mean we can't learn a lesson from that.

Also, I think this is straight up wrong:

> If you are building software (a full SaaS solution), [...] That's a huge undertaking that requires massive engineering efforts. It's not a simple task.

If you believe it will require "massive" engineering efforts, you will certainly prove yourself right. But quite a lot of people do it differently. Look at Amazon's rule about two-pizza teams, for example. Or just this week I visited a ~30-person company that's been going 8 years with a successful SaaS business. They have a team of 4 developers, and it works just fine because they work to keep things simple.


One of the first lines of the article states that he is a former naval architect. That is to say, he literally built ships. I would imagine he has some idea of the complexities involved, but still found it to be a useful analogy.


Postulating by non-experts about how things should be has become a new e-sport. Then they try to apply it to software or startups as if it's an apples-to-apples comparison.


In general, arguments by analogy tend to get hung up on the aptness of the analogy. I find it helps short-circuit a lot of long, unproductive discussions at work by just straightforwardly stating my thoughts—“we should do x because y,” rather than “system A is like system B (which does x because y), so we should do x.”


Sometimes the length of the discussion is the entire goal, especially on social media: https://www.owensoft.net/v4/item/2456/


As the author points out in the sibling comment, that's not a containership. It's an FPSO on top of a heavy-lift ship. An FPSO is basically a floating oil refinery, which is what the maze of piping and equipment is for. AFAIA, the vessel itself has no propulsion (which is why it's on top of the heavy-lift ship) and is just a large floating tub—You can't get much simpler than that.


I was curious about what was going on in that image, turns out that it's a "semi-submersible heavy lift ship" wrapped around another ship (an FPSO?)

https://en.wikipedia.org/wiki/BOKA_Vanguard

I think this story is about the same liftee:

https://www.projectcargojournal.com/shipping/2019/11/20/new-...


The author was not talking about software engineers but about the users of software (sales & marketing). Building HubSpot (the example in his article) is a complex task. But implementing it in your business process should be a redundant and straightforward one.


The Warehouse/Workshop Model is simpler; the large industrial assembly line is the mainstream production technology in the world.

But simplicity does not mean easy. It is actually systematic engineering. It is difficult to design a complex system into a simple and smooth Warehouse(database, pool)/Workshop(pipeline) Model system.

https://github.com/linpengcheng/PurefunctionPipelineDataflow


Is there any way I could convince you to stop clogging up the internet pipes with comments that seem to solely focus on 'pipes' and 'warehouses'? Surely there are other things that you care about and could contribute to HN?

At this point it's literally spam.

EDIT: Literally figuratively speaking, of course.


In fact, when you comment, my GitHub gains 5 stars; this shows that my information is valuable. And all of my comments have been posted on related topics.

I think you should improve your ability to appreciate technology. Those with weak technical skills should learn more instead of criticizing others.


I love when people talk like they know what they're talking about and then get sonned. Hackernews everyone


The author is not arguing for simplicity per se, but about being able to insert a human into a system that is normally automated. The example about the ship's steering system is perfect, actually. The system is not "simple" (if I were on a ship and the steering failed, I would be clueless), but it provides plenty of interjection points where a knowledgeable human can step in and either debug or fix the issue. It's those points that should be relatively simple - the whole system itself can be as complex as needed.

Your system should be testable, debuggable and it should allow a human to step in and take control if needed. An open system is generally better than a closed one.


Humans in the loop are likely to be a source of trouble, and you will need to measure to figure out if they provide any overall benefit other than giving you somebody to blame when that trouble happens. You need to ensure that the "human in the loop" scenario is obviously worse than automation, or else humans, always over-confident of their abilities, will put themselves in the loop when it'd be far safer not to.

A good example of how to do this would be London's Docklands Light Railway. A DLR train is capable of autonomously going from one station to another. Trained operators are aboard every DLR train, and they have a key to a panel in the front of the train which reveals simple controls for manual operation. But very deliberately the maximum speed of the train with an operator at the controls is significantly lower than maximum speed under automation (there are two modes in fact, driving with the train still enforcing rules about where it can safely go which is a bit slower, and driving entirely without automation helping which is much slower). This underscores that manually driving the train is the way to sidestep a temporary problem, not a good idea in itself.

The DLR has had several incidents in which trains collided, it will come as no surprise that they involved manually controlled trains. Humans are very flexible but mostly worse as part of your safety system, don't use humans if you can help it.


I think what he means by humans and users is "programmers" and the layers of abstraction in your code: the ability to change things and go deeper when you need to.

But your example works nicely as well; you should avoid having to work at lower abstractions, and limit the work there too. "Changing the core to adapt to new features rather than adding to the core."


Agreed, the first bit of the article isn't about simplicity as much as it is about visibility and the ability to directly control each aspect. This is very much in line with how modern SCADA systems are designed (and indeed the ship example is one of these). Each component can run automatically based on sensors and the state of its peers, or it can be directly controlled by the operator if required.

The latter part of the article talks about simplicity but doesn't seem tightly tied to the earlier part.


I considered using other labels besides "simple," like "redundant" or "straightforward," but decided that no matter what word I used, some ambiguity would remain. So I stuck with "simple" and hoped the context would help carry the point across. It seems that it did, and I'm glad.


What would a steering system that is simpler than that look like?

I think a key element is that this system is composed of parts that have one input. That allows a human to replace a component that failed and provide that single input to the next component in line.


I like to “embrace complexity.”

Any line of code, especially one that introduces a new concept, like a class or a method, is in “addition” to the root problem we’re trying to solve.

The closer the code/data/org-processes/whatever matches the fundamental problems we’re trying to solve, the less complexity we introduce.

However, you cannot “remove” complexity! A common mistake I see in software companies is using a single Jira ticket for one or more reports, one or more conclusions, and one or more action items.

For example, if we get reports A, B and C, they may all go into a single Jira. Oh, but C is a totally different thing! It’s not a “duplicate”! Ok now someone needs to extract the info from a comment into a new Jira (which no one will... and it’d be a mess if they tried).

Now a bunch of discussions are happening in the comments of this mega-Jira. 3 conclusions come out of these discussions — perhaps what the cause is and what should be done about it.

Then let’s say there are two action items, one a temp fix and one a long-term fix. Unfortunately, I usually see people use the same Jira for both tasks (“Assign this back to me when the temp fix is in so I can do the real fix”).

But splitting Jiras up is frustrating. The actual splitting is frustrating to set up, and it’s a pain to browse them.

So most Jira workflows remove fundamental complexity (reports vs discussions/conclusions vs tasks) and introduce extraneous complexity (tonnes of fields and workflows that don’t necessarily apply).

One should “embrace” the fundamental complexity of the problem(s) they’re dealing with.


My favorite variants of this phenomenon are "let's just add a column instead of creating a new table" and "let's add a boolean argument instead of creating a new function."
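
To make that concrete, here's a minimal Go sketch (all names invented, not from any particular codebase) of the boolean-argument flavour of the shortcut next to the split-function alternative:

    package notify

    // Hypothetical sketch of the boolean-argument shortcut: one function quietly
    // grows a second behavior behind a flag instead of being split in two.
    func SendNotification(userID, email string, isDigest bool) error {
        if isDigest {
            // entirely different templating, batching, and scheduling path
            return sendDigest(userID, email)
        }
        return sendImmediate(userID, email)
    }

    // The alternative keeps each call site self-describing and each path
    // independently testable.
    func SendImmediateNotification(userID, email string) error { return sendImmediate(userID, email) }
    func SendDigestNotification(userID, email string) error    { return sendDigest(userID, email) }

    func sendImmediate(userID, email string) error { return nil } // stub for illustration
    func sendDigest(userID, email string) error    { return nil } // stub for illustration

The flag version looks smaller at first, but every caller now has to know what "true" means, and every change to one path risks the other.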


Or worse: "we built a table for arbitrary key-value pairings, but now the value needs to be an array or object, so we'll store it as a JSON string"


I usually do the inverse: who knows what additional data we'll want to store for each row, so I put a jsonb column there, store anything not clearly critical in it, and add columns or tables as those jsons get populated and used.
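
For illustration, a rough Go/Postgres sketch of that pattern; the table and field names are hypothetical, and the promotion step at the end is just one way to do it:

    package storage

    // Sketch of the approach above: anything we aren't sure we'll query yet goes
    // into a jsonb column; once a key proves important, it gets promoted to a
    // real column or table.

    import (
        "database/sql"
        "encoding/json"
    )

    type Event struct {
        Kind  string
        Extra map[string]any // serialized into the jsonb column
    }

    func InsertEvent(db *sql.DB, e Event) error {
        extra, err := json.Marshal(e.Extra)
        if err != nil {
            return err
        }
        // Assumes a Postgres table like: CREATE TABLE events (id bigserial, kind text, extra jsonb);
        _, err = db.Exec(`INSERT INTO events (kind, extra) VALUES ($1, $2)`, e.Kind, extra)
        return err
    }

    // Promoting a field later is an ALTER TABLE plus a backfill, e.g.:
    //   ALTER TABLE events ADD COLUMN utm_source text;
    //   UPDATE events SET utm_source = extra->>'utm_source';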


Agreed. It should be almost as simple to create a sub-task (of any task) as it is to comment, and to convert a comment into a sub-task. Browsing is another problem...


And there a fundamental impedance mismatch between these tools and actual work rears its ugly head. Work is structured as a dependency graph - a DAG. To complete C, you need to do A and B, but a part of B depends on a part of A to be done, etc. Yet these tools insist on a list, or a very flat tree (a task and maybe a subtask, but no sub-subtasks).


Usually, being easy is confused with being simple. Easy-to-use systems tend to be very complex, but they only hide it from the average user. Once you go one step further and do something outside the quick start, you find yourself helpless. In that sense, the article is on point: not only is the management of the ship simple, the machinery behind the scenes is simple as well.

This applies to programming languages, too. Python is easy, but when you do advanced stuff, often without realizing it, you have to grok many complex details to make it work, or to make it work in a performant way. On the other hand, Go (and Rust supposedly, but I don't have experience with it) is simple. It doesn't let you do many things, which results in fewer ways of doing a specific task, but when you do it you feel safer, because in the end it does look and feel simple, even though you spent an hour or more longer than you would have coding it in Python (this assumes you're doing some advanced stuff, not the quick start).


Go is actually a pretty complex language, more so than Java for example, in my opinion. It has all sorts of primitives that you need to get used to, special rules for how the built-in types work, all sorts of rules about what creates a copy of a struct and what doesn't, pointers with at least three completely different use-cases (optionality, mutability, and avoiding copies), several rules for what can and can't be addressed, etc. And this is all before looking at the standard library.
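
A small runnable sketch of those three pointer roles, with illustrative names only:

    package main

    import "fmt"

    type Config struct {
        Retries int
        Timeout *int // 1. optionality: nil means "not set", distinct from 0
    }

    // 2. mutability: a pointer receiver lets the method change the caller's value.
    func (c *Config) SetRetries(n int) { c.Retries = n }

    // 3. avoiding copies: large values are often passed by pointer purely for cost.
    type BigBuffer struct{ data [1 << 20]byte }

    func checksum(b *BigBuffer) (sum byte) {
        for _, v := range b.data {
            sum += v
        }
        return
    }

    func main() {
        c := Config{}
        c.SetRetries(3)
        fmt.Println(c.Retries, c.Timeout == nil) // prints: 3 true

        var buf BigBuffer
        fmt.Println(checksum(&buf)) // prints: 0
    }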


I think the concrete things you listed, like the cases for pointers, are not much different from Java. While the reason you use a pointer changes, it's simply a pointer in the end, just like in Java.

I'd say Go is able to put so much meaning into those fundamentals because 1. it has to, since there aren't many other mechanisms, and 2. it can, because it's still early in the evolution of the language. After some adoption, and wildly different use-cases that users want addressed, those patterns start to disappear and people end up with lowest common denominators, pointers being just pointers in this case.


I do wonder what the trade-off would've been in Go if something like a := b was guaranteed, under all expressible circumstances, to create a completely new copy, regardless of the content of b, unless b was explicitly a pointer.

It does frequently feel like this could've been solved by making basic slice assignment always go as a := b[:], and adding some type of syntax and checker where you needed to confirm that you were doing a not-completely immutable copy.
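
For reference, this is roughly what today's semantics do: assigning a struct really is an independent copy, while assigning a slice copies only the header, so the backing array stays shared unless you copy explicitly:

    package main

    import "fmt"

    func main() {
        // Structs and arrays: a := b is an independent copy.
        type point struct{ x, y int }
        p1 := point{1, 2}
        p2 := p1
        p2.x = 99
        fmt.Println(p1.x) // prints 1; unchanged

        // Slices: a := b copies only the small header; both still point at the
        // same backing array, so writes are visible through either.
        s1 := []int{1, 2, 3}
        s2 := s1
        s2[0] = 99
        fmt.Println(s1[0]) // prints 99; shared storage

        // To get a genuinely independent copy you have to ask for it.
        s3 := append([]int(nil), s1...)
        s3[0] = 7
        fmt.Println(s1[0]) // still 99
    }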


In real life, there are monsters lurking behind Go's "simplicity." An example: errors as values. It forces you to deal with errors as they come up, right? Not exactly... People just start ignoring errors. The extremely opinionated linter doesn't care if you just assign an error to _, or even just don't handle the return values at all. And it's not something you can easily spot in a pull request either. Then you end up with nil pointers or spaghetti errors that only show up at runtime and can be quite difficult to trace, where a Python exception would make debugging trivial.
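
A minimal sketch of that failure mode, using an unreachable test address purely for illustration:

    package main

    import (
        "fmt"
        "net/http"
        "os"
    )

    // careless drops the error; when the request fails, resp is nil and the
    // deferred resp.Body.Close() is a nil pointer dereference at runtime.
    func careless() {
        resp, _ := http.Get("http://192.0.2.1:1/") // TEST-NET address, will fail
        defer resp.Body.Close()
        fmt.Println(resp.Status)
    }

    // careful is only a few lines longer, but nothing in the default toolchain
    // forces you to write it this way.
    func careful() {
        resp, err := http.Get("http://192.0.2.1:1/")
        if err != nil {
            fmt.Fprintln(os.Stderr, "request failed:", err)
            return
        }
        defer resp.Body.Close()
        fmt.Println(resp.Status)
    }

    func main() {
        careful()
        // careless() would panic here instead of reporting the error.
    }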

This isn't necessarily a knock on Go. I've enjoyed working in it full time for several years now. I don't think there's such a thing as a simple programming language. Go just hides the complexity from initial inspection.


I think there are multiple levels to this phenomenon. With Go in particular, after having used it in some high-performance scenarios (10k requests per second for images that are rendered on the fly by GPUs, on systems holding 200GB of binary radar imagery in memory that also need to ingest 50MB per second of new radar data), I've seen that the simplicity of the language is not a hindrance in these areas.

Edit: if you find yourself spending most of your time trying to come up with the perfect abstraction, Go may really piss you off; I won't deny that, it's not a strength. You have to be satisfied with 'good enough' and move on, solving edge cases if/when they arise. Go often encourages moving toward the concrete, and you can sometimes solve generic problems in really basic ways. As an example, I was writing a DAG server last year and coming up with ways to move data between vertices in the graph in a general way. Rather than getting out my abstractions, I just pass around []byte and leave the interpretation of those bytes up to each vertex (often just a type cast). I personally find this refreshing, and while there are costs to doing it this way, with a few basic helper funcs you can get 95% of what you want from a generic server like this without doing a lot of modeling.
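
A very rough sketch of that idea, with invented names and a linear chain standing in for the full DAG:

    package main

    import (
        "encoding/binary"
        "fmt"
    )

    // Vertices exchange plain []byte; each vertex decides how to interpret it.
    type Vertex interface {
        Process(in []byte) (out []byte, err error)
    }

    // sumInts interprets its input as little-endian uint32s and emits their sum.
    type sumInts struct{}

    func (sumInts) Process(in []byte) ([]byte, error) {
        var sum uint32
        for i := 0; i+4 <= len(in); i += 4 {
            sum += binary.LittleEndian.Uint32(in[i:])
        }
        out := make([]byte, 4)
        binary.LittleEndian.PutUint32(out, sum)
        return out, nil
    }

    // run pushes bytes through a linear chain of vertices; a real DAG would
    // fan out and in, but the data contract between vertices stays the same.
    func run(input []byte, chain ...Vertex) ([]byte, error) {
        data := input
        for _, v := range chain {
            var err error
            if data, err = v.Process(data); err != nil {
                return nil, err
            }
        }
        return data, nil
    }

    func main() {
        in := make([]byte, 8)
        binary.LittleEndian.PutUint32(in[0:], 2)
        binary.LittleEndian.PutUint32(in[4:], 40)
        out, _ := run(in, sumInts{})
        fmt.Println(binary.LittleEndian.Uint32(out)) // prints 42
    }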


> Go just hides the complexity from initial inspection.

If you (parent, other readers) haven't seen it, Rob Pike's "Simplicity is complicated" is a good discussion of just this point.

video: https://youtu.be/rFejpH_tAHM

slides: https://talks.golang.org/2015/simplicity-is-complicated.slid...


> [...] easy is confused with being simple [...]

The first half of Rich Hickey's "Simple Made Easy" presentation does a great job of defining easy/hard and simple/complex axes and distinguishing them.

video: https://www.infoq.com/presentations/Simple-Made-Easy/

It has been discussed before on Hacker News:

https://news.ycombinator.com/item?id=4173854


No comment on Go. One pattern I've seen is tools that make easy things easier, but make hard things impossible.

Which is fine until the built system needs to grow. Then the pain hits.


>On the other hand, Go (and Rust supposedly, but I don't have experience) is simple.

I was reading most of this yesterday, which seems to disagree: https://fasterthanli.me/blog/2020/i-want-off-mr-golangs-wild...


I really enjoyed this article. Thanks for linking it!

It’s interesting to see the relative naïveté in some of the implementations of what seem to be relatively key builtin libraries like file pathing. I also have never had to deal with cross platform support in Go, so I’d never seen the magic compilation comments/suffixes.



Every new programming style epiphany is an ad hoc, intuitively-specified, hazy restatement of the Unix Way.


If Unix is your yardstick for simplicity, that just shows how far we've come…


Linux is incredibly simple compared to Windows once you actually treat your computer as a computer instead of as a really poorly made gaming console.


Try talking Grandma through sharing her photo directory with Aunt Jeanine over the phone, on the same PC. No command line allowed, and you know she can only focus on that type of task for 4, maybe 5 minutes tops.


That wouldn't work reliably under Windows, either, as nobody knows anything about files and directories generically anymore.

(That's the reason why the Windows Explorer was first considered merely sufficient and then almost completely disregarded.)

A public folder on the same PC is way harder to explain than just telling your photo app to share it with your resident spying company/cloud provider, despite how wasteful the round trip is.

Not that that has anything to do with a discussion about the "unix way" as a special case or even superset of IDEs.


I think you're confusing easy and simple. Windows is easier (for most people), but linux is certainly simpler.


I can't imagine any system where you'd have a good chance of succeeding in that scenario without some kind of remote administration tool. Remote controlling a user over the phone is difficult, unreliable and time-consuming.


> No command line allowed

It's easier to tell someone what keys to press than to find and manipulate UI elements.

> on the same PC

On a PC, everyone already has access by default, so "sharing" doesn't involve anything additional.


My mother can't reliably remember how to operate the remote control but uses linux just fine (browsing, editing text documents, etc).


.. install remote admin tool? Gotomypc etc?

Or just tell them to copy it to a USB stick. Yep, same machine, it's still the best answer.


Share? Google Drive. Or email. Both of which are equally good/bad to explain to grandma on Windows and Linux.


Bullshit. You're talking about an operating system where the common advice for someone who wants to install up to date software is to fucking compile it from source because the whole community never got their collective shit together enough to allow developers to directly distribute binaries without a gigantic fucking headache.

Christ, it's such a fucking mess that one of the most compatible ways to distribute software is to write it for Windows and rely on WINE.


Hardly. I've maybe had to do that 5 times in the last...10 years?

And I split my time between Arch, CentOS & OpenBSD. The vast majority of things these days are packaged in a useful way. There's also flatpak and similar now.


> I've maybe had to do that 5 times in the last...10 years?

Good for you. Some of us have to install up to date software more often than every 5 or 10 years.


I'm running Arch on my office workstation. I install up to date software literally every day.


Only after a third party makes an AUR package for it. I'll stick to an OS where the developer can publish directly to the users without 15 different packaging formats, thanks.


Again, flatpak, appimage, docker ... you can use one of these, don't have to use 15.


No, the developer can choose one of these. As a user, I have to deal with all of them because I cannot choose how the developer distributes it and there is no standard. If I could choose, I'd use AppImage for everything (since it is the only one that is portable) but I can't do that.


> compile it from source

Way simpler to do on Linux than M$ Windows.

> [no way] to directly distribute binaries without a gigantic fucking headache.

I mean, flatpak, appimage, docker, etc ...


> Way simpler to do on Linux than M$ Windows.

Completely unnecessary on Windows since it is an operating system, not a kludge of random source code from the internet.

> I mean, flatpak, appimage, docker, etc ...

Which of those is ubiquitous enough to ensure availability of what you're looking for in that format? In my experience: none of them. AppImage is easily the best, but the community seems to hate it because it makes things too flexible and simple or something.


> Completely unnecessary on Windows since it is an operating system, not a kludge of random source code from the internet.

Where, pray tell, do people who write the software for this "operating system" store their source code if not somewhere that is connected to the internet? Do they use punch cards?

> Which of those is ubiquitous enough to ensure availability of what you're looking for in that format?

Have not met someone who hates AppImage. flatpak is also rather ubiquitous. I use both every day.


> Have not met someone who hates AppImage

Talk to Drew DeVault, to name one.


UNIX is very simple, it just needs a genius to understand its simplicity.

And it is 2020; I guess nowadays everybody understands how unix-like systems work and how simple they are.


And thus, a new law was born.


There's way too much simplification and generalisation here, IMHO.

"Choose tools that are simple to operate over those that promise the most features."

Doing this very much depends on your requirements, doesn't it? What good is a simple-to-operate tool if it doesn't do what you need it to do? Sure, maybe you can simplify your requirements; then again, maybe not.

The example of troubleshooting a whitepaper form very much depends on the design of the system. Maybe there's a reason for having multiple forms. If they all share a common architecture and the problem lies in what they share, it's not necessarily more difficult to troubleshoot many as opposed to just one.

Moving from Marketo to HubSpot is good and well - for now. What about years from now? How did they end up with such complexity with the Marketo solution? Both solutions and requirements evolve. Down the line, you could end up with the same difficulties with HubSpot. I think it depends in part on how the organisation handles change.

Lastly, I agree with the sentiment of Mateusz Górski, who commented on the page: is there any hard proof beyond anecdote to back up the article? If you expand the article's concept of a system to beyond just software, all you need is a hardware fault in a third-party hosting company with lousy support for your simple piece of software to be down for weeks.


> all you need is a hardware fault in a third-party hosting company with lousy support for your simple piece of software to be down for weeks.

I suspect the author would reply that you can just replace the hosting provider. Since the interface and division of responsibilities is clear, one host is completely replaceable with another.


As a sub-case of this, almost every HA (high availability) system I've ever seen has been less reliable than the original system was (without HA). It sounds like a good idea, but the extra complexity kills it.

One non-software system was a rather expensive UPS/generator. It was meant to trip on a power loss and provide X minutes of stable power. In reality, it was incredibly sensitive to minute power fluctuations and would trip and then immediately fail, dropping all supplied power in an instant. The system was dramatically more reliable with this unit simply disabled.


At least for the relatively small businesses that I deal with, I basically always recommend to save the effort that would go into building a 'highly available' system, and instead spend it on a scenario for keeping the business running when tech fails—which it will to some extent at some point, no matter what. I suppose that line of thinking would do wonders for much larger businesses as well. When a system becomes too important, rather than trying to make it even more reliable, try making it less important first.


Indeed. It shocks me when I see boo-coo bucks being spent on all sorts of bells and whistles, and then ask about their backup strategy. Oh, all of our drives are RAID...


> almost every HA (high availability) system I've ever seen has been less reliable than the original system was (without HA).

I've seen similar things with rabbitmq, in situations where the load & number of events to process is tiny but HA is some checkbox item to deliver, without a serious QA process to measure whether the setup actually works. Number of times in production we genuinely needed multiple rabbitmq nodes in the cluster to cope with the failure of one or more nodes: 0. Number of times rabbitmq got so excited processing large messages that it delayed heartbeating & subsequently decided there was a network partition when there was in fact no such thing, leading the prod support team to step in and manually recover the cluster: > 0.


I said the same thing about UPS causing more trouble than preventing it. Then I moved to Africa and quickly learned that a local UPS can be VERY helpful.


This has absolutely not been my experience, at least for well designed cloud infrastructure. High availability doesn't have to be complex or hard to understand. In most cases, automatic failovers and restarts on well designed architecture fix problems with zero end-user impact. It happens so seamlessly that my standard procedure these days is to fail over infrastructure if I want to restart it.


What is this "well-designed infrastructure" of which you speak? ;-)


Every? That seems a stretch. Plenty have very good and simple HA architectures: Cassandra and Kafka come to mind. Others, of course, don't.


Well, "almost every" and by "seen" I mean in environments that I've had to support, in corporate and academic land.

Perhaps I've just had bad luck, but my impression is that getting this right, both internally and in the field, is a lot harder than it looks. And when it breaks, it can be a real s___show compared to the simple, non-HA version.


I wrote my first deployed side project 5 years ago. It has been with the customer for the whole time and worked flawlessly while doing some fairly complex things, managing multiple processes for the customer. When I wrote it I had never done any web development and had not coded for 4+ years. C++ before that.

Due to my lack of web-development knowledge and a massive time crunch, I built it with no framework, zero best practices, and the most basic code I could write. The code is atrocious and would not scale at all, but it does what it is supposed to do, and it works because I did everything in the most basic way; there are no gotchas. Straight HTML, basic JavaScript, and PHP on the backend. It interacts with Google Docs as well. It's almost too dumb to break.

Sometimes we really do complicate things with all of our new fancy frameworks, microservices, states, etc.

With that said, it would be impossible for anyone else to maintain, and if anyone posted the code on the web I would probably be unemployed. But it works; it's very difficult to insert new features, though.


This tactic is also used to criticize/de-legitimize existing, working solutions when a new manager wants his old, familiar environment. "What Bob's team built is far too complex. If we replace it with the system I used at my last company, it will be much simpler." If I had a dollar for every time I've seen this type of thing, I'd be rich. So be careful you are not just playing corporate politics and changing solutions every time a new CXO joins.


Totally. Funny enough, marketing platforms like those mentioned in the essay are usually the first things to get replaced when a new VP or CMO joins. Too often, unfortunately, that decision is made not because it's the best tool for the company but because it's the most familiar one.


Using the company the author helped migrate to HubSpot as an example:

I'm not sure the simpler solution, HubSpot, even existed when the company started hacking solutions together.

Also, the move to HubSpot was relatively easy because the business processes and workflows were already well known and defined.

Starting from scratch, fighting the fires as they appear, a bit of patchwork seems unavoidable.

Everything is obvious in hindsight.

I guess the real lesson here:

Simplify once in a while. Like Facebook just did with messenger.


> I'm not sure if the simpler solution - hubspot existed when the company started hacking solutions together.

The author's value proposition is "I will help you outsource your complicated marketing tools". If you reject the gospel that your marketing tools are complicated, the author's service would not be needed.


Not the intention at all. My job is to parachute into a startup facing some problem, fix the problem quickly, and get out. I have no business being at a startup that's running perfectly, or creating solutions that will require me to stick around.


I built a static jekyll site using wordpress as a backend for the content editors (jekyll build reads directly from the database) hosted on a handful of AWS VMs running NGINX and loadbalanced by HAProxy. The build system is automated in CircleCI. WordPress is hosted privately on a hosted WordPress provider.

There's an ELK stack + plugins for logs and alerting and saltstack to trigger some basic automated sysadmin tasks.

I left 3 years ago and the entire infrastructure has run on autopilot with zero downtime. Content editors and web developers still add to the content constantly. There is some dynamic content on the site handled by JavaScript and a few light APIs that I built to support the site. There's over 5000 pages of content. This simple infrastructure powers a website with a low-5 digit alexa rank for a business bringing in about a billion dollars a year.

The entire infrastructure bill comes in under $100/mo and they no longer need to employ any backend engineers. We negotiated a contract agreement for me to support the site if they need it. They haven't.


In terms of the little guy having a saas, this is true for me... my saas basically relies on s3 for both file storage and a json file per account listing the users allowed in the system. The only time it goes down is when s3 goes down, or during the occasional container restart.


I'd be interested in hearing more details here.

Are you saying that you use s3 as your only data store?


Yes... think of it as layers, or manifests that point to another set of manifests: global (list of users) -> accounts (which have lists of users and their roles) -> content (a list of json files). I use a combination of json logic or simply listing out keys in a specific bucket. The speed of access doesn't seem to bother anyone. It started out as an experiment in "less is more," and because the app is not heavy on db searching etc., only access specific to an account. I use oauth2 instead of relying on my own infrastructure and let people send out invites, where they would use an email only, but it must support one of the oauth2 providers I already have in place. Basically, it doesn't go down because it relies heavily on other things that have teams of people with gigantic resources working on them. Don't get me wrong... I'm not saying it will scale to a million users; my clientele is mainly corporate with a good amount per account sale, but if it needs scaling, at least I didn't waste time and money on it in the interim and can always back it with a db or dynamo later.
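
As a rough illustration only (all names invented, and the object store stubbed out so it runs without credentials), manifest layering like that might look something like this in Go:

    package main

    import (
        "encoding/json"
        "fmt"
    )

    // ObjectStore abstracts S3-style get-by-key; in production this would be
    // backed by the AWS SDK. Names and shapes here are hypothetical.
    type ObjectStore interface {
        Get(key string) ([]byte, error)
    }

    // GlobalManifest points at per-account manifests.
    type GlobalManifest struct {
        Accounts map[string]string `json:"accounts"` // account ID -> manifest key
    }

    // AccountManifest lists members and their roles plus content keys.
    type AccountManifest struct {
        Users   map[string]string `json:"users"` // email -> role
        Content []string          `json:"content"`
    }

    func roleFor(store ObjectStore, accountID, email string) (string, error) {
        raw, err := store.Get("global.json")
        if err != nil {
            return "", err
        }
        var g GlobalManifest
        if err := json.Unmarshal(raw, &g); err != nil {
            return "", err
        }
        raw, err = store.Get(g.Accounts[accountID])
        if err != nil {
            return "", err
        }
        var a AccountManifest
        if err := json.Unmarshal(raw, &a); err != nil {
            return "", err
        }
        return a.Users[email], nil
    }

    // memStore stands in for S3 so the sketch runs locally.
    type memStore map[string][]byte

    func (m memStore) Get(key string) ([]byte, error) { return m[key], nil }

    func main() {
        store := memStore{
            "global.json":        []byte(`{"accounts":{"acme":"accounts/acme.json"}}`),
            "accounts/acme.json": []byte(`{"users":{"jane@example.com":"admin"},"content":["docs/plan.json"]}`),
        }
        role, _ := roleFor(store, "acme", "jane@example.com")
        fmt.Println(role) // prints: admin
    }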


> Simple systems may have less downtime

* Yeah, but often they do less / are less effective during uptime.

* "simple" is an inspecific concept. Is the implementation simple? The API? The hardware? The requirements?

* "simple" is a vague concept. Is a 1000-line C program more complex than a one-liner script in a fancy higher-level scripting language? Yes, if you ignore the 1,000,000 line abstract virtual machine, JIT compiler, interpreter, large standard library and what-not. Otherwise - maybe so, maybe no.

> Modifications before additions

I actually agree, but:

* Managers don't like it.

* Sometimes this downtime happens before there's ever been any uptime...

* Tomorrow you need something else, and the changes for that require re-engineering the parts you already re-engineered to best accommodate the previous addition.


This holds up to a point. The systems that require really little downtime get progressively more complex.

Need to handle machine failures? Push a bugfix without causing downtime? Maintain a single state across the redundant machines? Handle load spikes? Ones created on purpose to DOS you?

All of this requires additional mechanisms, which can themselves cause failures.


This feels like it is stating the obvious. However it is easily forgotten.

A lot of Software Engineers / Developers in my experience tend to over-engineer projects, or they are mandated to do so by their dev leads. It is easy to get caught up in abstracting almost everything away for its own sake while it provides no real benefit.


Don't forget the old classic Resume Driven Development.


Simple systems also benefit from Amdahl's Law, but with humans instead of CPUs. This is an observation I've been trying to sell others on for years and having only modest luck. When I can show them, they get it, but you have to push and push to get people there if words fail.

When the shit hits the fan, there's an 80% chance that your most senior members will find the problem. They may find it quicker in the simple system, even, and others might get there first. But that last 20% is a very long tail, and even longer in a complex system.

In a complex system, many of your members cannot participate in the triage process, because they don't know enough to know what's relevant. They slow down the people trying to work the problem trying to learn new things (good) or offering low-probability scenarios (bad).

There is no 'All Hands on Deck' scenario for the complex system. To be responsive, you have to kick some or even a bunch of people out of the room, and once they leave they can't really participate.

The first phase of a triage is getting a tight repro case. Many avenues are blocked until that happens. And some people have a knack for repro cases but aren't so great at debugging. With more people you get a higher quality repro, which cuts a lot of time off the rest of the process.

Part of debugging is the cost of the verification versus the likelihood. Simple Systems afford the opportunity for people to test out unlikely but plausible scenarios that are in the long tail, without distracting from the more 'boring' checks already being done.

And when bugs are identified prior to deployment, a simple system means you can hand the repro case to the responsible party and expect/demand that they get their fix on the first try.


Think this article is a different variation on the same theme.

https://mcfunley.com/choose-boring-technology


> Modifications before additions.

> When new requirements come up, the tendency is to add layers on top of the existing system—by way of additional steps or integrations. Instead, see if the system’s core can be modified to meet the new requirements.

I like the mention of this as it is - at least to me - really counter-intuitive.


"[The] more simple any thing is, the less liable it is to be disordered, and the easier repaired when disordered..."

- Thomas Paine, Common Sense, 1776


Perfect. Added it to the essay. Thank you!


Took a few years, but glad that Government degree finally came in useful! (=


The article is self-refuting in a way, although the message makes sense. The author describes simplicity as "fewer levels of abstraction, and each of them well understood," like a ship rudder (which is a good definition). But no-code tools etc. are anything but well understood, and usually introduce hundreds of levels of abstraction (due to their generality). The same goes for frameworks that introduce tons of abstractions in order to cover cases that are irrelevant to the problem at hand. Ideally, the chain of scripts that each do one thing and can be read in 10 minutes is actually simpler and a much better way to fix the ship.


Declarative coding tools are to blame here.

This is a well written article, but the metaphor is a little stretched.

Marketo and Salesforce are the "ship builders". Marketers are the crew operating the ship.

The declarative tools in enterprise apps provide Marketers with conditional branch and execution capabilities without code structure, so the average Marketer just keeps adding layers of complexity. Whereas a Developer would continually refactor and make old code obsolete.

The crew operating actual ships do not have anywhere near this level of access to the ship's configuration, hence more reliability in the system.


This article misses one crucial point in suggesting Looker over "a patchwork of custom scripts and APIs". The complexity doesn't go away because you offload it to Looker. When Looker goes down, your application is still in trouble. It's still your responsibility when your system goes down, even if you can point the finger at some third party service. Your system is still down! That being said, don't use Looker in the first place. Do not spy on people.


Several people pointed out the fault in using Looker as an example. I very much welcome suggestions for a different product to plug there!


I don’t know; most systems like the container ship he/she described just seem simple but are actually pretty intricate inside. The logic behind them might be easy to understand, but the devil is in the details. A combustion engine is also simple conceptually: make stuff explode in a closed chamber to move a rod, convert that movement to the rotation of a shaft. Super simple, yet I doubt I could debug many problems my car engine has with such basic knowledge.


This seems like an oddly appropriate metaphor though - engines aren't super-complex once you know what everything does, but the biggest problem in car mechanics is just getting access to the specific part you need to measure or look at.


Reminds me a bit of home automation (which I enjoy). You manage a plethora of battery-powered devices that need a constant connection to finicky wifi, with continuous access to your Linux server hub, so that the push and pull of data remains intact and an AI can actuate a function through a "remote," all to replace a light switch. Sometimes I wonder where this will end up. It might need 50 years before reliability and simplicity are up to par.


The problem is that 'simple' has different meanings for different people.


> features don't justify complexity

Such a good line. I've seen companies killed by hard-to-follow arguments advocating for technical complexity which serves feature goals which serve business goals.

'Good strategy is simple' from some strategy book is sometimes right; arguments that hold business goals hostage to technical complexity are trouble.


His website has an annoying message about subscribing for updates with no obvious way of getting rid of it except by clicking the accept button. I guess that's simpler than having an additional "no thanks" button. The article was a formulaic retelling of well-known wisdom, too.


Hey, there is an X in the corner to close it. The subscription slider is made by Mailchimp so I don't have much control over the size of that X, unfortunately.


Sometimes the complexity is there for a reason. But if you can just start over, those reasons must not have been that important!? So don't hang on to functionality just because it would be nice to have; you will have to pay the price for those features over and over.


Sometimes, but a lot of devs fetishise complex stuff. I disagree with a lot of my tech lead's decisions because he always wants to do things the "most correct" way, whereas I want to keep things as simple as possible with the aim of keeping our code maintainable. For example, he wants to add a graph database to the tech stack, while I realise that it's one more thing to manage and potentially break, and mongodb (which we already have) will probably be good enough.


Things only get complex if you mix them, like cords in a box. So you might be able to keep it simple as long as you do not have code that depends on both databases, and the two databases don't depend on each other.


Everything seems logical and simple. But the examples lack details and realism. Take for example this quote:

> In the end, the system I put in place had 97% fewer processes (from 629 to 20) while providing all the same capabilities. A bug that was found a few days later got resolved in four minutes.

It's called refactoring. It's a constant process that no one likes to allocate time to, because it does not move metrics immediately. Especially in marketing, with a very fast-changing landscape, you go from 20 to 600+ processes very fast.


I share the idea of the author, but I miss actual talk about how we can make things better. Because his arguments are also true when it comes to security.

What are alternatives to commonly used complex systems, like Docker/Kubernetes, Active Directory, Remote Access or just shared folder structures? How to keep the functionality of centralized management while making it simple? Would be happy to hear your thoughts on this.


It's similar to optimization in that it follows a U-curve. Up to a point, simpler things have more performance, and simpler things have less downtime.

But after a certain point, it's no longer enough to make it simpler. You have to get a lot more involved again, and this time the complexity has to be intelligently focused around the end goal (be it performance or uptime).


It's important to find a compromise. The simplest system at my work would be something like a php/mysql backend hosted on an old-fashioned shared server using ftp.

Good luck attracting good developers with that. We use something over engineered but we have fun. The customers don't care but we do.


There are a lot of companies where Senior PHP devs are paid more than Senior Java devs, because it is so hard to find a brilliant PHP dev


Since PHP has become more decent with its last few versions, if you are starting a new project that may actually not even be too terrible.


That's all good if your company is making money.


Plug for my all-time favorite paper on the subject: https://web.mit.edu/2.75/resources/random/How%20Complex%20Sy...

5 pages of pure awesomeness


A bunch of paragraphs to type out KISS.


Do they also have fewer people using them?


I guess the main question to be asked should be:

Is the complexity appropriate for the goal? (I.e., "Everything should be made as simple as possible, but no simpler.")

Hence, often a required step to fix complexity is to change the goal, in such a way that it can be implemented with less complexity.


> In the end, the system I put in place had 97% fewer processes (from 629 to 20) while providing all the same capabilities.

I am suspicious of this statement. Is the author now supporting the company and defacto replacing the guy that left?


I totally agree when designing man-made, synthetic systems. But it is interesting to consider that the most reliable, self-healing systems that exist -- animals! -- are incredibly complicated...


There is a difference between complex and complicated. Also consider that systems where complexity serves a role are not complicated. Also: simple things that are aggregated into complex things through a hierarchy that is itself simple can be reasoned about, with failure handled at the subsystem level (and easily mitigated; think supervision trees in Erlang).


Surprised nobody has yet mentioned Einstein’s quote.


My microservice that returns “Hello World” via HTTP has been running solid for 5+ years now with zero downtime.


`Uptime` of various *x systems comes to mind. Quality of code also matters.


Very good article. Please also preach this article to those who praise Kubernetes left and right (remember, Kubernetes is the new blockchain hype; and just like blockchain, you don't really need it unless you're a cryptocurrency, which 99% of the companies who use Kubernetes aren't).


I love simple things, thank you for sharing!


I love simplifying things.


I'm not sure I follow, could you break that down for me?


KISS


Keep It Simple Stupid



duh


This is a very important idea. I like the shipping container metaphor. I have often used the idea of a 2x4 (as in lumber) in building houses. The humble 2x4 is such a simple product, but when put together with nails and screws becomes the cornerstone of a complex structure.

It is simple and it just works for its intended purpose.

I am working hard to apply this to the design of Webase [1] which is a #nocode platform that is inherently complex. But I really believe that if we get the design right it will support a large number of sophisticated use-cases.

Thank you for sharing!

[1] https://www.webase.com


You're overdoing these links. That's no doubt why you're getting downvoted and flagged.

It's fine on HN to post your own work (1) in places where it's relevant, as long as (2) you only do it occasionally and (3) are also participating in the community in the intended ways, submitting interesting articles and having curious conversation. But when users break these rules of thumb (and you've been breaking all three of them!) it crosses into spamming. The community is very aware of that and really doesn't like it, so it's not in your interest.

Doing this by hijacking top comments and threads about other people's projects is particularly not ok.


Thanks for the feedback. So far I have been getting positive feedback which is why I have continued to do this.

I have been trying to add to the overall value of HN with insight in general and only including a link to my site if it is relevant.

But I hear you and will adjust accordingly.

I do sincerely appreciate the feedback, as I have been steadily gaining karma and then out of nowhere it went backwards today.


It's a clear indication from the community that you're overdoing the promotional links. The fact that you haven't really been participating in any other way accentuates that impression. Readers check these things.



