Never Update Anything (kronis.dev)
185 points by generatorman 4 months ago | 119 comments



"In my eyes it could be pretty nice to have a framework version that's supported for 10-20 years and is so stable that it can be used with little to no changes for the entire expected lifetime of a system."

This is what applications used to be like, before the web and internet hit and regular or even push updating became easy.

It was simply so difficult and expensive to provide updates once the software was in the customer's hands that it was done intentionally and infrequently. For the most part, you bought software, installed it, and used it. That was it. It never changed, unless you bought it again in a newer version.


Frequent updates, in the old days, meant that a vendor had poor QA. I think that's probably still the case most of the time today, too.


> Frequent updates, in the old days, meant that a vendor had poor QA. I think that's probably still the case most of the time today, too.

Back in the 80s and 90s if you had bad QA you'd ship very buggy software and customers hated you because they had to live with it for months and months until the company managed to do another release. And then it was costly to ship off those floppies to every customer. So there was a very real price to pay, in money and reputation.

Then it became possible to do updates online. Initially it was a nice way to deliver an emergency fix if necessary, but you mostly continued to do nice QA'd releases every now and then.

But as with everything, when something becomes too easy it gets abused. So companies realized: why do much QA (or any QA, in extreme cases)? Just push updates all the time and hope for the best; if customers (who are now the QA) scream, push another update tomorrow. Break quick, fix quick.

It's mostly unsatisfactory if one values stable quality software.


> So there was a very real price to pay, in money and reputation.

Microsoft Corp. would like to have a word with you. /s


> Frequent updates, in the old days, meant that a vendor had poor QA. I think that's probably still the case most of the time today, too.

The internet has normalized poor QA. The bosses don't give a shit about QA anymore because it's so cheap to just push out a patch.

I mean just look at old video game magazines that talked about the software development process: the developers would test the hell out of a game, then test the hell out of it again, because once it was burned onto a $100 cart (in 2024 dollars) it wasn't ever going to change.

Now games can remain buggy and unstable for months or even years after "release."


I never worked on games, but I did do a streaming video app for the PS3 in 2010, during the time period when it was arguably the best media center box available. Working with Sony (SCEE) on this was eye-opening in terms of how their QA process was set up. Essentially, you formally submitted the compiled app (C++ and Flash!) to them, and then tickets would come back and you'd have to address them. QA was a separate org, so you never really met them in person; all issue discussion would happen in-band in the ticketing system. QA had all the power to decide when and if this thing would ship, much more so than Product or Engineering. I can't say the process made a ton of sense for a downloadable app powered by a multi-platform online service, but it was illuminating as to how the high quality bar of 90s/00s console games was achieved.


> then tickets would come back

Luxury! With Nintendo you'd often get only one ticket. Any further bugs would cost you further submissions, and months of slippage.


Thanks for jogging my memory! I believe we did two rounds of QA: the first was unlimited but shallow and high-latency; the final one, I believe, was called Format QA and had a quota. If you had more than 6 issues or something, you were rejected and had to go back into the queue (for months).


aka acceptance testing

aka The Correct Answer™

For the products I managed in the 90s, I put QA/Test in charge of releases. Very unusual. The results were awesome.


Windows XP. High-profile zero-day cases and Windows Update during the 2000s created a "security updates are like dietary supplements" mindset.


The difference is it no longer means the vendor's QA is poorer /than average/.


See Microsoft and Crowdstrike for details. /s


I remember even games, or especially games, were like this. Interplay would rarely ship a post-launch patch or make it past version 1.01 of a whole game. Then in the late 90s or 2000-ish, Tribes 2 came out and basically didn't even work for over a year until patches finished the game. I think once the Internet hit critical mass, things shifted forever and haven't gone back.


Also, the bugs became part of the game and made the whole thing more interesting. Speedrunning as a whole relies on these flaws.


> For the most part, you bought software, installed it, and used it. That was it. It never changed, unless you bought it again in a newer version

And it was much better than the current situation, if you ask me.


That software did a lot less, however, and you just had to live with bugs. I knew people who just learned that they couldn't use a feature without crashing, or had to disconnect from the internet to print, or had to write their own math formula because a built-in function was wrong, etc., and just worked around it for years.

The first company I worked for in the 90s had a C codebase which seemed like it was half #ifdef or runtime version checks because they had to support customers who rarely updated except when they bought new servers, and that meant that if some version of SunOS, DOS/Windows/etc. had a bug you had to detect and either use a work around or disable a feature for years after the patch had shipped.

I do agree that stability, especially on the UI side, has serious value but my nostalgia is tempered by remembering how many people spent time recovering lost work or working around gaps in software which had been fixed years ago. I think the automatic update world is better on the whole but we need some corrective pressure like liability for vendors to keep people from pulling a Crowdstrike in their testing and reliability engineering.


We still have to find workarounds for crappy software. But with a continuous update model, we waste more time fighting with bugs because the bugs are constantly changing. Every week you have to waste time figuring out what broke now. No thanks.


> I think the automatic update world is better on the whole

I think that we went to another extreme. Because it's so easy to update, we just ship bad software saying "we'll fix it later". And we don't.


Hence the rest of the quoted sentence. I think removing barriers to fixing bugs is good but companies need to feel more pressure to do so.


Companies need to feel more pressure to ship software that doesn't have the bugs to begin with. That's the pressure that was eliminated.

These days, it's just "ship it and we'll fix it later" instead, which is a big part of why (in my opinion) software quality has been declining for years.


Oh hey, I was wondering why the VPS suddenly had a load average of over 100; I restarted Docker since the containers were struggling, and now I know why (should be back up for a bit). That won't necessarily fix it; I might need to migrate the blog over to something else, with a proper cache, alongside actually writing better articles in the future.

I don't think the article itself holds up that well; it's just that updates are often a massive pain, one that you have to deal with somehow regardless. Realistically, LTS versions of OS distros and technologies that don't change often will lessen the pain, but not eliminate it entirely.

And even then, you'll still need to deal with breaking changes when you will be forced to upgrade across major releases (e.g. JDK 8 to something newer after EOL) or migrate once a technology dies altogether (e.g. AngularJS).

It's not like people will backport fixes for anything indefinitely either.


>might need to migrate over to something else for the blog, with a proper cache

Never Update _Anything_ :)


I am very much tempted not to, because it works under lower loads and I could just put it on a faster server, but how could I pass up the chance to write my own CMS (well, a better one than the previous ones I've done)? That's like a rite of passage. But yes, the irony isn't lost on me; I just had to go for that title.


If you have to write your own CMS, make it compile to static files. I did that with Django, used Django-distill, and it's hands down the best static site generator I've ever used. My site never needs updates and never goes down under any amount of load.
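To make that concrete, here's a rough sketch of the "render everything at build time" idea in Python (generic Jinja2 rather than django-distill's actual API; the template names and the posts() helper are made up for illustration):

  # Minimal sketch of a static build step: render every page once, then serve ./public
  # with any dumb web server. Template names and the posts() helper are placeholders.
  from pathlib import Path
  from jinja2 import Environment, FileSystemLoader

  env = Environment(loader=FileSystemLoader("templates"))
  out = Path("public")

  def posts():
      # Stand-in for whatever backs the blog (markdown files, a database dump, ...).
      return [{"slug": "never-update-anything", "title": "Never Update Anything", "body": "..."}]

  def build():
      out.mkdir(exist_ok=True)
      (out / "index.html").write_text(env.get_template("index.html").render(posts=posts()))
      for post in posts():
          page = out / post["slug"] / "index.html"
          page.parent.mkdir(parents=True, exist_ok=True)
          page.write_text(env.get_template("post.html").render(post=post))

  if __name__ == "__main__":
      build()

Point the web server at ./public after each build and there's nothing dynamic left to fall over.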


“Static files” are nothing more than a no-TTL caching strategy with manual eviction.


You’re not wrong… but “static files” ultimately are infinitely less complex than any dynamic CMS, and require no effort or brain power to migrate between (even bottom-of-the-barrel) providers


I sort of don't understand why we are not using static files more often now.

In the old days we needed a CMS mostly because generating links and updating certain pages was expensive. Hard disks were slow and memory was expensive. Now we have SSDs that eliminate 99.9999% of the problem.


OK.


If you are using open source you can always support your own old versions ~joking but not really~

Of course security updates are very hard, but if an old version has a good community, you have the option of forking or upstreaming the updates yourself.


> ~joking but not really~

For some languages and applications it can be trivial to backport the changes, compared to trying to keep up with the new features. If it's tested and stable, it will likely be more stable than a new version. I do this for some smaller programs, and I'm not even a real programmer but more of a hobbyist.


For contrast, I recently had a No. 1 HN hit and my Pi 4 never had a core go beyond 20%.

Yes, it’s a static website. It’s amazing how little performance you actually need to survive a HN avalanche


It's amazing how powerful hardware has become in the past decade, and how masterfully crafted and efficient some software is (Linux, Nginx, etc.)... while other software is so profoundly inefficient that we forget just how fast our machines really are.


Yeah, I've had a few HN frontpages with sites running Django and recently Phoenix, with essentially no caching or optimization, running on a 256MB fly.io free-tier instance with barely a noticeable increase in load (just big spikes in the network/traffic graphs).


Back in 2018, I helped a startup update their AngularJS-wrapped jQuery prototype to 1.6 and tried hard to separate business logic from component code so they could migrate over to Vue, React or Angular 2 after I left.

I regularly snowboard with someone still at the company. They’re still on AngularJS.

AngularJS never dies.

AngularJS is forever.


Alpine Linux was designed for web services, as it includes the bare minimum of resources necessary for deployment.

https://wiki.alpinelinux.org/wiki/Nginx

Also, you may want to consider a flat HTML site if you don't have time to maintain a framework/ecosystem. =3


Alpine is pretty nice!

I did end up opting for Ubuntu LTS (and maybe the odd Debian-based image here or there) for most of my containers because it essentially has no surprises and is what I run locally, so I can reuse a few snippets to install certain tools. It also has a pretty long EOL, at the expense of larger images.

Oddly enough, I also ended up settling on Apache over Nginx and even something like Caddy (both of which are also really nice), because it's similarly a proven technology that's good enough, especially with something like mod_md: https://httpd.apache.org/docs/2.4/mod/mod_md.html Another reason is that Nginx in particular had some unpleasant behavior when DNS records weren't available because some containers in the cluster weren't up: https://stackoverflow.com/questions/50248522/nginx-will-not-...

I might go for a static site generator sometime!


Apache is stable for wrapping mixed services, but needs a few firewall rules to keep it functional (slow loris + mod_qos etc.) =)

Ubuntu LTS kernels are actually pretty stable, but containers are still recommended. ;)


That's fair! Honestly, it's kind of cool to see how many different kinds of packages are available for Apache.

A bit off topic, but I rather enjoyed the idea behind mod_auth_openidc, which ships an OpenID Connect Relying Party implementation, so some of the auth can be offloaded to Apache in combination with something like Keycloak, and things in the protected services can be kept a bit simpler (e.g. just reading the headers provided by the module): https://github.com/OpenIDC/mod_auth_openidc Now, whether that's a good idea is debatable, but there are also plenty of other Relying Party implementations out there as well: https://openid.net/developers/certified-openid-connect-imple...
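For illustration, the service behind the module can then be as dumb as reading request headers. A minimal sketch (the header names are assumptions, the real ones depend on how the module is configured, and the proxy has to strip them from client requests):

  # Hedged sketch: a backend that trusts identity headers injected by the reverse proxy.
  # Header names are assumptions for illustration; never accept them straight from clients.
  from http.server import BaseHTTPRequestHandler, HTTPServer

  class Handler(BaseHTTPRequestHandler):
      def do_GET(self):
          user = self.headers.get("X-Remote-User")    # assumed header set by the proxy
          email = self.headers.get("X-Remote-Email")  # assumed claim passed through
          if not user:
              self.send_response(401)
              self.end_headers()
              return
          body = f"hello {user} <{email}>".encode()
          self.send_response(200)
          self.send_header("Content-Type", "text/plain")
          self.end_headers()
          self.wfile.write(body)

  if __name__ == "__main__":
      HTTPServer(("127.0.0.1", 8080), Handler).serve_forever()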

I am also on the fence about using mod_security with Apache, because I know for a fact that Cloudflare would be a better option for that, but at the same time self-hosting is nice and I don't have anything precious enough on those servers that a sub-optimal WAF would cause me that many headaches. I guess it's cool that I can, even down to decent rulesets: https://owasp.org/www-project-modsecurity-core-rule-set/ though the OWASP Coraza project also seems nice: https://coraza.io/


I prefer X.509 client GUID certs, and AMQP+SSL with null-delimited BSON messaging.

Gets rid of 99.999% of problem traffic on APIs.

It is the most boring thing I ever integrated, and RabbitMQ has required about 3 hours of my time in 5 years. I like that kind of boring... ;)


What exactly do you do to protect Apache from slow loris? It's my main reason for not using Apache.


There are several different ways, but the easiest is mod_reqtimeout/mod_qos/mod_security. Check your install with "sudo apache2ctl -M", and there should be several legacy tutorials available (I'd ignore deprecated mod_antiloris.)

Rate-limiting token-bucket firewall settings are a personal choice every team must decide upon (what traffic is a priority when choking bandwidth), and often requires tuning to get it right (must you allow mtu fragging for corporate users or have a more robust service etc.) These settings will also influence which events trip your fail2ban rule sets.

Have a great day, =)


Because of this, in the JDK we've adopted a model we call "tip & tail". The idea is that there are multiple release trains, but they're done in such a way that 1/ different release trains target different audiences and 2/ the model is cheap to maintain -- cheaper, in fact, than many others, especially that of a single release train.

The idea is to realise that there are two different classes of consumers who want different things, and rather than try to find a compromise that would not fully satisfy either group (and turns out to be more expensive to boot), we offer multiple release trains for different people.

One release train, called the tip, contains new features and performance enhancements in addition to bug fixes and security patches. Applications that are still evolving can benefit from new features and enhancements and have the resources to adopt them (by definition, or else they wouldn't be able to use the new features).

Then there are multiple "tail" release trains aimed at applications that are not interested in new features because they don't evolve much anymore (they're "legacy"). These applications value stability over everything else, which is why only security patches and fixes to the most severe bugs are backported to them. This also makes maintaining them cheap, because security patches and major bugs are not common. We fork off a new tail release train from the tip every once in a while (currently, every 2 years).

Some tail users may want to benefit from performance improvements and are willing to take the stability risk involved in having them backported, but they can obviously live without them because they have so far. If their absence were painful enough to justify increasing their resources, they could invest in migrating to a newer tail once. Nevertheless, we do offer a "tail with performance enhancements" release train in special circumstances (if there's sufficient demand) -- for pay.

The challenge is getting people to understand this. Many want a particular enhancement they personally need backported, because they think that a "patch" with a significant enhancement is safer than a new feature release. They've yet to internalise that what matters isn't what a version is called (we don't use semantic versioning because we think it is unhelpful and necessarily misleading), but that there's an inherent tension between enhancements and stability. You can get more of one or the other, but not both.
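If it helps to picture the policy, here's a toy sketch of my own (not anything from the actual JDK process): the tip accepts everything, the tails accept only security patches and the most severe bug fixes:

  # Toy illustration of the "tip & tail" backport policy described above; my own sketch,
  # not how the JDK actually encodes it. The train names are placeholders.
  from dataclasses import dataclass

  @dataclass
  class Change:
      kind: str       # "feature" | "enhancement" | "bugfix" | "security"
      severity: str   # "low" | "major" | "critical"

  def accepted(change: Change, train: str) -> bool:
      if train == "tip":
          return True  # features, enhancements, bug fixes and security patches all land here
      # tails: only security patches and the most severe bug fixes get backported
      return change.kind == "security" or (
          change.kind == "bugfix" and change.severity == "critical"
      )

  print(accepted(Change("feature", "low"), "tail-21"))     # False: stays on the tip
  print(accepted(Change("security", "major"), "tail-21"))  # True: backported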


Most LTS strategies are like this: enterprises run the LTS version on the server while consumers run the latest version. In a way it is beta testing, but the consumer isn't really mad about it, since they get new features or performance boosts. LTS users usually update once every 3-6 months or when a serious CVE comes out, while normal users update daily or weekly. To be honest, I know of servers running whatever the latest version of Node.js is, instead of LTS, mostly because their operators don't know that Node has an LTS policy.


Is this any different than the LTS approach Canonical and others take?


I think Canonical do something similar, but as to other LTSs -- some do tip & tail and some don't. The key is that tails get only security patches and fixes to major bugs and rarely anything else. This is what offers stability and keeps maintaining multiple tails cheap (which means you can have more of them).

Even some JDK vendors can't resist offering those who want the comforting illusion of stability (while actually taking on more real risk) "tail patches" that include enhancements.


What the article points to is that most updates are bad updates. We teach people that they should accept all updates for security reasons, but really they should only accept security updates.

But they can't, because this is not a possibility that is given to them. All updates are put together, and we as an industry suck at even knowing if our change is backward compatible or not (which is actually some kind of incompetence).

And of course it's hard, because users are not competent enough to distinguish good software from bad software, so they follow what the marketing tells them. Meaning that even if you made good software with fewer shiny features but actual stability, users would go for the worse software of the competitor, because it has the latest damn AI buzzword.

Sometimes I feel like software is pretty much doomed: it won't get any better. But one thing I try to teach people is this: do not add software to things that work, EVER. You don't want a connected fridge, a connected light bulb or a connected vacuum-cleaner-camera-robot. You don't need it; it's completely superfluous.

Also for things that actually matter, many times you don't want them either. Electronic voting is an example I have in my mind: it's much easier to hack a computer from anywhere in the world than to hack millions of pieces of paper.


And you can't have _just_ security updates, because the combinatorial complexity of security fixes across feature versions is insane, not to mention the fact that the interaction of the security and feature changes can itself introduce bugs. There are groups that try (e.g. distro maintainers), but it's ultimately a losing battle. I'm convinced that patching is only a bandaid, but it's also impossible to have 100% bug-free code, so there needs to be some sort of systematic solution on top of whatever particular code is running. Behavior analysis, egress network analysis, immutable-by-default images with strictly defined writable volumes and strictly defined data that's going to be written there, etc. There's no silver bullet, but I think patching and trying to keep up with updates is, like, a gallium bullet at best.


> And of course it's hard, because users are not competent enough to distinguish good software from bad software,

There was a time when Windows had a description for updates. Now the only distinction is between KB3587690 and KB67457770.


For a long while now it's been worse than that. Security updates are an excuse to push out new unwanted features (remote flags) or remove existing features in the process of being monetized.


I also noticed that there's another kind of update not mentioned:

* Updates that add a theoretically independent feature, but which other software will dynamically detect and change its behavior for, so that it's not actually independent.


"Never Update Anything"

Author proceeds to add two updates to the article. Epic troll.


These days it makes sense to life-cycle entire container images rather than maintain applications with their dependencies.

The current BSOD epidemic demonstrated the folly of mass concurrent versioning.

*nix admins are used to playing upgrade Chicken with their uptime scores. lol =)


I pretty much agree: most systems don't need updating. I've seen and set up OpenBSD servers that ran for a decade without issues, never getting updates. I currently run some production web services on Debian where I do updates every 3 years or so, with no issues.

Leaving something alone that works well is a good strategy. Most of the cars on the road are controlled by ECUs that have never had, and never will have, any type of update, and that is a good thing. Vehicles that can get remote updates, like Teslas, are going to be much less reliable than ones not connected to anything, which have a single, extensively tested final version.

An OS that is fundamentally secure by design, and then locked down to not do anything non-essential, doesn't really need updates unless, e.g. it is a public facing web server, and the open public facing service/port has a known remote vulnerability, which is pretty rare.


> Vehicles that can get remote updates like Teslas are going to be much less reliable than one not connected to anything that has a single extensively tested final version.

I don’t think it’s necessarily this; it’s more that being able to update at any time is a great source of pressure to release untested software at any cost.



Well, if only he had updated his server stack to something more scalable...


Maybe they should have updated the server capacity?


Currently it's running on a VPS that has 1 CPU core and 4 GB of RAM, resources which are shared with a few other processes. I'm thinking that I might move over from multiple smaller VPSes (better separation of resources, smaller fallout from various issues) to a few bigger ones in the future (also cheaper), in which case the containers would struggle less under load spikes.


When I was the sole IT guy for a small consulting company, once I got everything working, I never updated it.

We used Microsoft Office 2000 for 12 years. Never had to retrain people, deal with the weird ribbon toolbar, etc.

It's only the deranged use of OSs with ambient authority that gums up what would otherwise be stable systems.


There's an alternative solution: update everything, but limit your dependencies.

Example: for my (personal) projects, I only use whatever is available in the Debian repositories. If it's not in there, it's not on my dependency list.

Then enable unattended upgrades, and forget about all that mess.


To jump on a related article since it's linked and comments are now closed: https://blog.kronis.dev/articles/stable-software-release-sys...

The 2021/2022/2023/2024 version-numbering schemes are for applications, not libraries, because applications are essentially not ever semver-stable.

That's perfectly reasonable for them. They don't need semver. People don't build against jetbrains-2024.1, they just update their stuff when JetBrains breaks something they use (which can happen at literally any time, just ask plugin devs)... because they're fundamentally unstable products and they don't care about actual stability, they just do an okay job and call it Done™ and developers on their APIs are forced to deal with it. Users don't care 99%+ of the time because the UI doesn't change and that is honestly good enough in nearly all cases.

That isn't following semver, which is why they don't follow semver. Which is fine (because they control their ecosystem with an iron fist). It's a completely different relationship with people looking at that number.

For applications, I totally agree. Year-number your releases, it's much more useful for your customers (end-users) who care about if their habits are going to be interrupted and possibly how old it is. But don't do it with libraries, it has next to nothing to do with library customers (developers) who are looking for mechanical stability.


I was working at a place that delivered on-prem software. One customer asked us, "We like the features of version N but we're running N-1. Can you backport them so we don't have to upgrade?" I replied that we'd already done that; it was called version N.


I mean... there could be a case for "does this feature really require breaking changes?" that would distinguish those.


We're being paid to migrate our hardware boxes programmatically to Windows 10 IoT LTSC so that new boxes ship with 10+ years of security. We're still supporting some XP devices (not connected to the internet.) So to anyone depending on us: You're welcome.

But let me tell you something: Long-Term Support software mostly doesn't pay well, and it's not fun either. Meanwhile some Google clown is being paid 200k to fuck up Fitbit or rewrite Wallet for the 5th time in the newest language.

So yeah. I'd love to have stable, reliable dependencies while I'm mucking around with the newest language du jour. But you see how that doesn't work, right?


The fucking up of Fitbit and the rewriting of Wallet are not the engineers' fault. These kinds of projects are mostly decided and planned by PMs: clueless and incompetent PMs. For payments in particular it was not even just an incompetent PM, but an incompetent director who saw the success of the NBU Paisa payment in India and thought the U.S. would be the same.

The engineers are at most just complicit. Those who aren't are laid off or they quit of their own accord.


Engineers are mostly complicit because they get that $200k salary when they chase the next shiny thing.

No one is paying such salaries for a mundane clerical job.


Lead engineers and architects chasing resume-filler projects are absolutely part of the problem


You're saying complicity is not a fault?? :)


Even as a developer not focused on web dev, this sounds pretty bad, unless everyone in your dependency tree (from OS to language to libraries) decides to make the switch; and even then, you'll be stuck with outdated ways of doing things.

Who wants to continue maintaining C++03 code bases without all the C++11/14/17/20 features? Who wants to continue using .NET Framework, when all the advances are made in .NET? Who wants to be stuck with libraries full of vulnerabilities and who accepts the risk?

Not really addressed is the issue of developers switching jobs/projects every few years. Nobody is sticking around long enough to amass the knowledge needed to ensure maintenance of any larger code base.

Which is either caused by, or is the cause of, companies also not committing themselves for any longer period of time. If the company expects people to leave within two years and doesn't put in the monetary and non-monetary effort to retain people, why should devs consider anything longer than the current sprint?


> Who wants to continue maintaining C++03 code bases without all the C++11/14/17/20 features? Who wants to continue using .NET Framework, when all the advances are made in .NET? Who wants to be stuck with libraries full of vulnerabilities and who accepts the risk?

With the exception that in this hypothetical world we'd get backported security updates (addressing that particular point), the ones who'd want something like this would be the teams working on large codebases that:

  - need to keep working in the future and still need to be maintained
  - are too big or too time consuming to migrate to a newer tech stack (with breaking changes in the middle) with the available resources
  - are complex in and of themselves, where adding new features could be a detriment (e.g. different code styles, more things to think about, etc.)
Realistically, that world probably doesn't exist and you'll be dragged kicking and screaming into the future once your Spring version hits EOL (or, worse yet, you'll keep working with unsupported old versions and watch the count of CVEs increase; hopefully very few will find themselves in that set of circumstances). Alternatively, you'll just go work somewhere else and it'll be someone else's problem, since there are plenty of places where you'll always try to keep things up to date as much as possible, so that the delta between any two versions of your dependencies will be manageable, as opposed to needing to do "the big rewrite" at some point.

That said, enterprises already often opt for long EOL Linux distros like RHEL and there is a lot of software out there that is stuck on JDK 8 (just a very visible example) with no clear path of what to do once it reaches EOL, so it's not like issues around updates don't exist. Then again, not a lot of people out there need to think about these things, because the total lifetime of any given product, project, their tenure in the org or even the company itself might not be long enough for those issues to become that apparent.


> And lastly, choose boring technology. Only use software that you're sure will be supported in 10 years. Only use software, which has very few "unknown unknowns". Only use software, where the development pace has slowed down to a reasonable degree.

Perl has been stable for a couple of decades.


Perl 6 is incompatible in all sorts of ways. It's so incompatible that people didn't even upgrade! So incompatible that they renamed the language! Even more incompatible than Python 3!

Even Perl 5 is rapidly evolving. They just added try/catch. Added a new isa operator. Added a new __CLASS__ keyword. Added defer blocks.


I’ve supported enterprise software for various big companies and I can tell you that most decision makers for DCs agree with this sentiment.

EMC had a system called Target Code, which was typically the last patch in the second-last family. But only after it had been in use for some months and/or by some percentage of the customer install base. It was common sense and customers loved it. You don’t want your storage to go down for unexpected changes.

Dell tried to change that to “latest is target” and customers weren’t convinced. Account managers sheepishly carried on an imitation of the old better system. Somehow from a PR point of view, it’s easier to cause new problems than let the known ones occur.


> My experience shows that oftentimes these new and supposedly backwards compatible features still break old functionality.

Well that's the first issue: downright malpractice. Developers should learn how to know (and test) whether it is a major change or not.

The current situation is that developers mostly go "YOLO" with semantic versioning and then complain that it doesn't work. Of course it doesn't work if we do it wrong.


I find it pretty funny that on my very first click on this article I was greeted with an internal server error.


That was me scrambling to allocate more resources to the container and redeploy it, after my alerting tipped me off about issues and I figured out what was going on. While the container itself was down, the reverse proxy returned an error.


Avoid software that will need constant updates, because that is a signal it is defective to begin with, or expected to be broken soon.

For example, I avoid graphical commercial OSes and large graphical web browsers. Especially mobile OSes and "apps".

Avoidance does not have to be 100% to be useful. If it defeats reliance on such software then it pays for itself, so to speak. IMHO.

The notion of allowing RCE/OTA for "updates" might allegedly be motivated by the best of intentions.

But these companies are not known for their honesty. Nor for benevolence.

And let's be honest, allowing remote access to some company will not be utilised 100% for the computer owner's benefit. For the companies remotely installing and automatically running code on other people's computers, surveillance has commercial value. Allowing remote access makes surveillance easier. A cakewalk.


Exception is mobile apps


A feature I've wanted for ages, for every OS package manager (Windows, apt, yum, apk, etc.), every language's package manager (npm, PyPI, etc.), and so on, is the ability to update but filter out anything less than one day, one week, or one month old. And it applies here, too.

Now, some software effectively does this risk mitigation for you. Windows, macOS and browsers all do this very effectively. Maybe only the most cautious enterprises delay these updates by a day.

But even billion-dollar corporations don't do a great job of rolling out updates incrementally. This especially applies as tools exist to automatically scan for dependency updates (the list of these is too long to name): don't tell me about an update that's only a day old, that's too risky for my taste.

So for the OS and libraries for my production software? I'm OK sitting a week or a month behind; let the hobbyists and the rest of the world test that for me. Just give me that option, please.
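For what it's worth, you can approximate this for a single ecosystem yourself. A rough sketch against PyPI's public JSON API that ignores any release younger than a cutoff (the 30-day threshold and the package names are just examples):

  # Rough sketch of "never install anything younger than N days", using PyPI's JSON API
  # (https://pypi.org/pypi/<name>/json). Cutoff and package list are arbitrary examples.
  from datetime import datetime, timedelta, timezone
  import json
  import urllib.request

  MIN_AGE = timedelta(days=30)

  def latest_mature_version(package):
      with urllib.request.urlopen(f"https://pypi.org/pypi/{package}/json") as resp:
          releases = json.load(resp)["releases"]
      cutoff = datetime.now(timezone.utc) - MIN_AGE
      mature = []
      for version, files in releases.items():
          if not files:
              continue  # skip versions with no uploaded files
          uploaded = max(
              datetime.fromisoformat(f["upload_time_iso_8601"].replace("Z", "+00:00"))
              for f in files
          )
          if uploaded <= cutoff:
              mature.append((uploaded, version))
      return max(mature)[1] if mature else None  # newest version that is old enough

  for pkg in ["requests", "flask"]:
      print(pkg, "->", latest_mature_version(pkg))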


Debian has three stages of software deployment that I know of: Unstable, Testing and Stable. By the time something comes to Stable it has been quite extensively tested. The only exceptions are security updates, which you may want to get very quickly anyway. I really recommend Debian (in particular with unattended security upgrades) for servers.

Other distros have this as well (Tumbleweed, Void, etc.), and I really think most people should not be using recently deployed software. A small community using it, however, helps with testing so the rest of us can have more stability. Which is why I don't recommend using Arch (or Debian Unstable) for general users, unless you specifically want to help with testing and accept the risk.

Also, randomizing update schedules by at least a few hours does seem very wise (I don't think even the most urgent updates would make or break anything in, say, 6 hours of randomization?).
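The jitter part can be as simple as something like this (a sketch; the 6-hour window and the apt commands are just examples):

  # Dumb sketch of randomized update scheduling: sleep a random amount before pulling
  # updates so a whole fleet doesn't take a bad release at the same minute.
  import random
  import subprocess
  import time

  time.sleep(random.uniform(0, 6 * 3600))          # spread the fleet over up to 6 hours
  subprocess.run(["apt-get", "update"], check=True)
  subprocess.run(["apt-get", "-y", "upgrade"], check=True)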


al2023 uses "releasever" which is basically a dated snapshot of the packages. You can choose to install last-1 instead of the latest.


Isn't that basically non-rolling-release distros?


Yesn't. Many still release updates frequently (usually, for servers etc.) as long as they are compatible. Mostly, though, only minor feature updates.

This is required for some components, like, e.g., glibc or openssh, to stay secure-ish.


There is also another type of update: security updates that don't actually matter in the environment that the software is used in. The question of whether the "new features" are for or against the user is another point to ponder.


Extreme viewpoint, but I agree strongly. A big reason why working in Common Lisp brings a smile to my face: it’s a standard, Quicklisp works, FFI works, etc. I can run code and follow instructions written DECADES ago; it just damn works.


Our industry could use a risk-assessment scanner for updates, similar to "npm audit", that measures the delta between versions and gives a risk indicator based on a number of parameters.

The issue with changelogs is that they are an honor system, and they don't objectively assess the risk of the update.

Comparing changes in the symbol table and binary size could give a reasonable red/yellow/green indicator of the risk of an update. Over time, you could train a classifier to give even more granularity and confidence.
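A crude version of that comparison is already doable with standard tools. A sketch using GNU nm (the thresholds and file names are arbitrary placeholders, and a real classifier would obviously need far more signals):

  # Rough sketch: compare exported symbols and file size between two builds of a library
  # and map the delta to a red/yellow/green risk hint. Thresholds are arbitrary.
  import os
  import subprocess

  def exported_symbols(path):
      out = subprocess.run(
          ["nm", "--defined-only", "-g", path],
          capture_output=True, text=True, check=True,
      ).stdout
      return {line.split()[-1] for line in out.splitlines() if line.strip()}

  def update_risk(old, new):
      old_syms, new_syms = exported_symbols(old), exported_symbols(new)
      removed = len(old_syms - new_syms)   # removed symbols: likely breaking
      added = len(new_syms - old_syms)     # added symbols: new surface area
      growth = abs(os.path.getsize(new) - os.path.getsize(old)) / os.path.getsize(old)
      if removed > 0 or growth > 0.25:
          return "red"
      if added > 10 or growth > 0.05:
          return "yellow"
      return "green"

  print(update_risk("libfoo.so.1.0", "libfoo.so.1.1"))  # placeholder file names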


Previously (November 4, 2021 — 319 points, 281 comments): https://news.ycombinator.com/item?id=29106159


Heh. I've always been running everything one or two versions behind the latest (for my personal laptop, not servers). That mainly means the OS (e.g., macOS), but as long as I can avoid automatic updates, I do so.

I believe the chances of having a bricked laptop because of a bad update are higher than the chances of getting malware because of running one or two versions behind the latest one.


Kinda ironic that the article itself was updated


> Not only that, but put anything and everything you will ever need within the standard library or one or two large additional libraries.

You can definitely do that with Python today: assemble a large group of packages that cover a large fraction of what people need to do, and maintain that as the 1 or 2 big packages. Nobody's stopping you.


You would need to maintain Python itself too. Imagine if you had executed this same plan prior to the Python 3 transition.



I disagree: keep things constantly updated (within reason).

Most companies I've worked for have the attitude of the author; they treat updates as an evil that they're forced to do occasionally (for whatever reason) and, as a result, their updates are more painful than they need to be. It's a self-fulfilling prophecy.


No no no, it’s “never update anything and don’t expose your machine to the internet”. Winning strategy right there.


I know it's supposed to be a statement to take the absurd title of my article a bit further, but in some cases, I can see that being said unironically.

Nothing good would happen if some machine running Windows XP in a hospital that's hooked up to an expensive piece of equipment that doesn't run with anything else suddenly got connected to the Internet. Nor does the idea of any IoT device reaching past the confines of the local network make me feel safe, given how you hear about various exploits that those have.

On one hand, you should get security patches whenever possible. On the other hand, it's not realistic to get just security patches with non-breaking changes only. Other times, pieces of hardware and software will just be abandoned (e.g. old Android phones) and then you're on your own, even if you'd want to keep them up to date.


I used to work with state agencies and they run outdated unpatched Windows computers all over the place.

Nowadays I work in medical software and hospitals are running outdated unpatched Windows computers everywhere.

Nobody cares about updates. Almost nobody. I have never seen Windows 11. Windows 10 is popular, but there are plenty of Vistas. I'm outright declining to support Windows XP, and we lost some customers over this issue.

My development tools are somewhat outdated, because compilers love to drop old Windows versions and 32-bit architectures, so sometimes I just can't update the compiler. For example I'm stuck with Java 8 for the foreseeable future, because Vista users are too numerous and it's not an option to drop them.

Hacker News is like another world. Yes, I update my computer, but everyone else does not. Even my fellow developers often don't care and just use whatever they got.


If only that were possible with some appliances. I can keep my TV offline, but not the Roku. Internet-connected utilities will continually patch themselves into enshittification.


The world is not static and software these days is very interconnected. Dreams of not updating only work in an unchanging world. Sadly, such a world is yet to be found.


Funny you should say that. In the context of the CrowdStrike incident, I've read that FedEx, UPS and Southwest had no problems at all, because many systems run on Windows 95, or even something based on Win 3.1x :-) Seems they are well isolated, but networked nonetheless. Or do you think they use bush drums and smoke signals to conduct their business?


That probably has more to do with not jumping on the latest fad* and installing an unattended unrestricted automatically-globally-updating rootkit on all of your critical machines.

That is generally recognizable as stupidity. And the ones that did so are now paying the price.

* compliance tactics are very prone to fads. just look at cookie banners.


I had started to think I was the only one saying this.


Urgent updates can be necessary every once in a while but should be recognized as technical failures on the part of the developers. Failure can be forgiven, but only so many times. The comments saying "what about X update that had this feature I need?" are missing the point entirely. Instead ask yourself about all of the updates you've made without even looking at the patch notes, because there are just too many updates and not enough time. Instead of blaming the producers for creating a blackbox relationship with the consumers, we blame the consumer and blindly tell them to "just update." That's what needs to change. It's a bit similar to opaque ToS issues.


A reasonable strategy is to wait a week after release before applying an update, unless it's a zero day fix.


The React module bloat example is not a fair one; the recommended way to start a React project isn't to use create-react-app, other methods are more streamlined. But then again, the deprecation of create-react-app perhaps proves the point that updates create problems.


It's not the recommended way anymore, and last I checked it's not really being maintained as much as the alternatives, but for quite a while it was the recommended way.


That is what I'm saying ;-)


Java over Golang, lol, when Golang has literally been version-stable for over a decade now.


I think the article is interesting. But you are right, Golang is the one language I would recommend to someone who does not like updates.


Or use NixOS/Guix Systems instead of living in the stone age of containers...


I ran into critical issues with some NixOS packages I needed, the maintainer doesn't seem to care about my use case, and I can't for the life of me understand the Nix language well enough to fix it myself. I've stopped using NixOS because of that.

If the Nix language were replaced with something sensible, I'd jump back in excitedly.


The point is not so much the project per se but the concept: Infrastructure as Code, the OS as code. So far NixOS is a relatively small ecosystem, in a complex community situation, with its main sponsor, a mil-tech company, having received a wave of backlash for debatable reasons; Guix System, on the other side, is mostly a French INRIA project focused more on HPC than anything else. So both have their corners, but they show a modern way to develop and deploy OSes, just as OpenSolaris (back before the IllumOS fork) showed the first integration of storage (ZFS), package system (IPS) and bootloader (with the boot environment concept), somewhat badly copied later by FreeBSD and partially on GNU/Linux with a set of scripts.

ALL OTHER OSes/distros are still stuck in the '80s in that sense, and most people seem unable even to understand this. On storage alone, the famous "rampant layering violation" and the absurdity of btrfs and Stratis "against" ZFS are really good examples of blind, reactionary tech behavior by highly skilled people, and their outcomes are a showcase of why we damn well need to innovate instead of shooting ourselves in the feet, switching from something obsolete to something even worse (like the now-almost-finished full-stack virtualization on x86 and, thereafter, the paravirtualization/container mania still current), layering crap on crap with more and more unmanageable infrastructure with enormous attack surfaces.

Another small and relatively well-known example: the Home Assistant project. Design aside, they chose to distribute a Python application as an entire GNU/Linux distro because such a move seems to be commercially sound, and many others chose to follow them instead of simply pip-installing HA in a local venv. Wasting an immense amount of resources on their systems for what?

This kind of tech evolution must end, or we will soon collapse, digitally speaking.


Or, alternatively, realize both are valid options and have different use-cases.

Or big-brain it and use Nix to build your containers and get the best of both worlds.


> In my eyes it could be pretty nice to have a framework version that's supported for 10-20 years and is so stable that it can be used with little to no changes for the entire expected lifetime of a system.

Yeah, me too. I also would like a few million bucks in the bank.

It's naive to think that every project wouldn't want to set this goal, simply because it's so unrealistic.


The 'Skip this Update [pro]' button example (Docker Desktop) just made me facepalm and helped me internalize that I'm not a luddite about technology, I'm a luddite about the collectives of people (not the individual people...(!)) who feel compelled to craft these dark business patterns.


There is humor in the fact that this blog itself has 2 updates.


Kinda weird to see Java over Go, when the former is basically an entirely new language from what it was 10 years ago and the latter has made it an explicit goal to never break older versions and (almost) never change the core language.


Writing backends in Go, I do get that warm fuzzy feeling knowing that it will compile and work in ten years. The syntax is easy to read; if I'm not too lazy to add extensive tests, I can simply read those as documentation to re-familiarise myself later. It's now my go-to tool for everything server-side.


Now do it for Lisp, where your libraries were last updated 5 years ago.


I know this is a rather long tangent and not the main point of the article, but regarding "Docker Swarm over Kubernetes", I've had a ton of bad experiences at my employer running a production Swarm cluster. Among them:

- Docker Swarm and Docker Compose use different parsers for `docker-compose.yaml` files, which may lead to the same file working with Compose but not with Swarm ([1]).

- A Docker network only supports up to 128 joined containers (at least when using Swarm). This is due to the default address space for a Docker network being a /24 network (which the documentation only mentions in passing). But Docker Swarm may not always show an error message indicating that it's a network problem. Sometimes services would just stay in the "New" state forever without any indication of what's wrong (see e.g. [2]).

- When looking up a service name, Docker Swarm will use the IP from the first network (sorted lexically) where the service name exists. In a multi-tenant setup, where a lot of services are connected to an ingress network (e.g. Traefik), this may lead to a service connecting to a container from a different network than expected. The only solution is to always append the network name to the service name (e.g. service.customer-network; see [3]).

- For some reason I still haven't been able to figure out, the cluster will sometimes just break. The leader loses its connection to the other manager nodes, which in turn do NOT elect a new leader. The only solution is to force-recreate the whole cluster and then redeploy all workloads (see [4]).

Sure, our use case is somewhat special (running a cluster used by a lot of tenants), and we were able to find workarounds (some dirtier than others) for most of our issues with Docker Swarm. But what annoys me is that for almost all of the issues we had, there was a GitHub ticket that didn't get any official response for years. And in many cases, the reporters just give up waiting and migrate to K8s out of despair or frustration. Just a few quotes from the linked issues:

> We, too, started out with Docker Swarm and quickly saw all our production clusters crashing every few days because of this bug. […] This was well over two years (!) ago. This was when I made the hard decision to migrate to K3s. We never looked back.

> We recently entirely gave up on Docker Swarm. Our new cluster runs on Kubernetes, and we've written scripts and templates for ourselves to reduce the network-stack management complexities to a manageable level for us. […] In our opinion, Docker Swarm is not a production-ready containerization environment and never will be. […] Years of waiting and hoping have proved fruitless, and we finally had to go to something reliable (albeit harder to deal with).

> IMO, Docker Swarm is just not ready for prime-time as an enterprise-grade cluster/container approach. The fact that it is possible to trivially (through no apparent fault of your own) have your management cluster suddenly go brainless is an outrage. And "fixing" the problem by recreating your management cluster is NOT a FIX! It's a forced recreation of your entire enterprise almost from scratch. This should never need to happen. But if you run Docker Swarm long enough, it WILL happen to you. And you WILL plunge into a Hell the scope of which is precisely defined by the size and scope of your containerization empire. In our case, this was half a night in Hell. […] This event was the last straw for us. Moving to Kubernetes. Good luck to you hardy souls staying on Docker Swarm!

Sorry if this seems like Docker Swarm bashing. K8s has its own issues, for sure! But at least there is a big community to turn to for help if things go sideways.

[1]: https://github.com/docker/cli/issues/2527
[2]: https://github.com/moby/moby/issues/37338
[3]: https://github.com/docker/compose/issues/8561#issuecomment-1...
[4]: https://github.com/moby/moby/issues/34384



