My previous team was also a "no swap in prod" shop, and this behavior bit us more than I care to admit. The devs were occasionally on the side of "swap for safety", ops was religiously no-swap, and ugh. It can take 10+, 30+ minutes for systems encountering this to resolve to some meaningful conclusion, and half the time I'm desperately trying to ssh in so as to kill -9 the errant task anyway, but ssh is paged out, and I wish the OOM killer would just do it for me instead of Linux trying to page everything through what feels like a single 4KiB page. I need to play around with sysctls more on some sort of test rig.
On AWS instances with EBS disks (most instances), disk is basically network.
I once suggested "cgroup'ing" (loosely speaking) the entire system into two rough buckets: one for SSH, with enough dedicated RAM that ssh will never get swapped, and one for everything else.
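These days systemd can express roughly that bucket split via cgroup v2; a sketch of a drop-in (the unit name is ssh.service on Debian-likes and sshd.service elsewhere, and 64M is a guess, not a recommendation):

```
# /etc/systemd/system/ssh.service.d/memory.conf
# MemoryMin maps to cgroup v2's memory.min: this much of sshd's working
# set is protected from reclaim, so it shouldn't get paged out under pressure
[Service]
MemoryMin=64M
```

Followed by a `systemctl daemon-reload`, of course.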
Also, I feel like the number of devs out there who understand that mmap'd files — including binaries/libraries — are basically mini swap files when memory pressure is high is really low; more than once I've diagnosed a machine as "page thrashing" only to get back "what? But it has no swap, that can't be." Well, pgmajfault and disk I/O metrics don't lie.
I feel like I've been waiting 25 years for someone to implement this by default in every operating system, and yet no one seems to do it. (Arguably iOS does this with Jetsam priorities, btw.) Why is it so hard to ensure that these base components get guaranteed RAM? It was always extra infuriating on Windows: you'd end up in swap hell and need to kill something because the computer was now running insanely slow, so you'd hit ctrl-alt-del and the UI would instantly respond with a menu that worked great... and then you'd click Task Manager, and rather than that functionality being implemented in that menu--or being designed to never swap--you'd get thrown back into the swap hell in the hope that eventually taskman would load. It was insane, and so easily fixed in numerous ways (the simplest probably being to add a trivial task manager, with only memory usage and a kill button, to that OS escape menu).
Damn straight. The xbox builds of Windows do this too (so the console is still controllable when a game is running) and I swear the lack of control in desktop OSs (can't start Task Manager to kill the thing that's stopping you from starting anything) will kill desktop computing if mobile OSs increase their capabilities.
A device that isn't responsive to input is indistinguishable from the device being broken.
This is also one of the reasons that putting swap on a different physical drive than /usr and /bin can help a lot. Once you hit memory pressure, I/O caching is going to disappear at the same time as your I/O is saturated doing swap. Just being able to read /usr/bin/ssh from a different drive than the one swap is thrashing can be night and day.
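fstab-wise that's just the swap entry pointing at the second drive (device names here are illustrative); `pri=` also lets you prefer one swap device over another when you have several:

```
# /etc/fstab -- swap on its own spindle, away from / and /usr
/dev/sdb1  none  swap  sw,pri=10  0  0
```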
You can use memlockd for that: https://manpages.debian.org/buster/memlockd/memlockd.8.en.ht...
When I have had swap, the machine simply hangs rather than OOM saving the machine.
What I don't understand is how 500M of swap can help on a 16G machine. Why would it be any better than an 18G machine?
My guess would be that if it's all RAM, then the kernel will happily use all of it for IO buffers, etc. and make no attempt to reduce its usage until actually necessary. If some of it is swap, then the kernel will (over time) try to reduce the size of its IO buffers until swap is no longer needed, but that de-allocation doesn't need to happen immediately, or block the rest of the system whilst it does happen.
Maybe there is a way to have that "high water mark" without needing swap - reserve 500M to be only used in "emergencies" - I don't know the details of the linux kernel well enough to know if that's a possibility.
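The closest stock knob I'm aware of is `vm.min_free_kbytes`, which sets the watermark the kernel defends by reclaiming early; it's a kernel reserve rather than a userspace emergency pool, and the value below is illustrative, not a recommendation:

```
# /etc/sysctl.d/99-watermark.conf
# keep roughly 512 MiB free as the reclaim watermark (value is in kB);
# setting this too high wastes RAM and can itself provoke OOM kills
vm.min_free_kbytes = 524288
```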
You're looking for zram, I believe.
If your load fits into 14gb and you get 16gb of ram and add 0.5gb of swap, when you see swap start to fill, you know it's time to look for memory leaks and/or get more ram. If you have a bigger memory leak / burst allocation, you have a chance of being able to connect in and shut the thing down cleanly.
There are some issues with kernels swapping out 'the wrong' pages and filling swap early. But, assuming that's tuned, I'm not aware of a better indicator that you need more RAM than "swap is full / swap I/O is high".
Lots of things could/should and would be done differently.
For example, if you don't have swap space.
Without swap, Linux winds up flushing file caches, mmap'd files, etc., and usually re-reading them all over again... the machine is technically not out of memory or "swapping", but effectively it's swapping, and unresponsive.
Sounds like you have a redundancy problem, not a swap problem. You should just be able to kill a machine that gets into a bad way like that and move on. What if it wasn't swap but one of the million other things that could make your server crawl?
In particular, a machine that is swap/page thrashing will generally show as having no available RAM, a high amount (especially relative to baseline) of major (required disk reads) page faults, and often the CPU profile will be spending a lot of time in I/O wait, too, I think, though I usually just use out of RAM + page faulting. Also, the metrics service tends to go dark shortly afterwards — it's having the same issue as everything else on the VM at getting CPU time.
Major page faults, perhaps the key stat for "this machine is page thrashing" since it directly corresponds to it, is found in /proc/vmstat and is called "pgmajfault". Though like I said, we generally had Prometheus and Grafana to turn these into pretty graphs, and to export them out of the VM itself, since when something is page thrashing, getting it to do anything is hard.
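For the quick-and-dirty version without Prometheus, the counter is right there in /proc/vmstat (Linux-only, obviously); something like:

```shell
#!/bin/sh
# rough thrash-o-meter: major page faults in a one-second window
# (pgmajfault in /proc/vmstat is a monotonic counter of faults that
# needed a disk read; a sustained high rate vs. baseline = thrashing)
read_faults() { awk '/^pgmajfault/ {print $2}' /proc/vmstat; }
prev=$(read_faults)
sleep 1
cur=$(read_faults)
echo "major faults in the last second: $((cur - prev))"
```

Getting even this to run on a box that's already thrashing is the hard part, which is why exporting the metric continuously beats sampling it after the fact.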
CPU contention lacks the "out of RAM" part, and won't knock the metrics offline. Network contention can knock out the metrics, but often doesn't, and lacks the other signals: out of RAM/page faults. Disk I/O lacks the out of RAM & doesn't knock the metrics out since they don't require (beyond being paged in) disk I/O. (And those — CPU, RAM, network, and disk-ish — are about the only real resource dimensions on a VM.)
Alternatively, if you let it play out and the VM eventually recovers, it might nonetheless decide to OOM kill a thing or two along the way, and those show up in dmesg / on the console, if you can get to those.
> You should just be able to kill a machine that gets into a bad way like that and move on.
I must admit that the production systems I loved and cared for were not always perfect. Often, yes, I could, but there were a few spots where things weren't so rosy. Even when I could, I generally wanted some understanding of why the VM went under, so as to not have the problem come back again, later, on a different VM. Page thrashing, in particular, is basically always symptomatic of a bug. And in distributed systems, the bugs are also distributed.
The economics of getting devs enough time to develop to that quality vs. management wanting new features has been one of the hardest challenges of my career.
Also, for a while I lacked permission to actually kill a machine, because ops had locked permissions down on the grounds that "devs shouldn't have access to the actual machines"; like she says in one of the linked posts: fine, have my pager, you deal with the pages.
I've wobbled back and forward on swap. In the early days I used to be annoyed by how much disk space it'd take. (an 8 gig disk with 1 gig for swap is too much)
I've run an 8-core machine with only 2 gigs of RAM and tried to compile something with Boost in it. Swap allowed me to kill it and recover the system.
I've run VMs with no swap, some swap and loads.
However, what I've never done is actually benchmarked the same workload on machines with no, some and loads of swap. However, I generally defer to Rachel, because Rachel has been there and been bitten by that before.
On this point: http://rachelbythebay.com/w/2018/04/28/meta/ which everyone should read and digest, this remark jumped out:
>"This is so easy to test for"
If I ever say this and never qualify it, it should read: "haha, yeah, I made that same mistake, that's why I test for it now."
The only reason I am "better" [I hope] than my younger self is that I've made a bucketload of mistakes before. Some of them are technical, but to be honest a load of them are societal. (As in, bleating like Cassandra and not being able to effect change.)
If we, as "engineers" are to grow as a class of people, we have to actually learn from other people's mistakes, not just use them as bias confirmation. This is why I like blog posts where they lay out the problem, outfall, cause, workaround and eventual solution.
As a result, I've encountered this flaw in the Linux kernel design multiple times. And it is a flaw, as you only have to run the same test on Windows to see how an OS should behave when you run out of RAM: a hiccup followed by your offending tab being terminated by the OOM killer. Locking up the system for 10 min. is just not acceptable.
This is where I am now: comfortable making technical mistakes and knowing enough to avoid the most grievous pitfalls, but struggling to expand the sphere of things affected by my changes beyond a few people or a single team/product. Some people get into management for this, but I've seen that management is mostly planning, budgeting, hiring/firing, and non-technical communications.
Influence and leadership are force multipliers. If you can lead people well and influence decisions (be it as a people leader, or a technical leader, or both), your experience can be learnt from by more and more people.
There is no doubt that leadership involves doing 'management' more often than not. But it's just one aspect, it's not the end in itself.
This is what this story is really about. People not instrumenting their systems, no benchmarks, no real clue how things will perform under load.
Swap or not swap: it doesn't matter what you do if you haven't benchmarked the system before you set it up - in this case, it's your customers who will tell you, eventually, how your software behaves...
I have. A well utilized machine is going to absolutely tank once it hits swap. Do you want to engineer your application to be able to cope with two radically different performance regimes, or do you simply want to ensure that your working set stays bounded?
Seems like a good time to just kill it and move on. "Cattle not pets."
Swap is there to relieve memory allocation pressure. Memory allocation pressure is an incredibly dense concept but it basically means "if lots of people are asking for fresh pages, how quickly can I service them"?
There are also other types of memory pressure. One is "if lots of people are reading and writing to pages, how quickly can I service them?"
This depends on the state of the mapping for the virtual page being read or written. If that virtual page has an associated physical page then the answer is "not too slowly". If the physical page is in one of the caches then the answer is "more quickly". If the relevant part of the page is in a register then the answer is "in one clock cycle".
On the other hand, if the read or write is associated with a page that needs to be mapped in, either from a disk file or a swap file, then the answer is "slowly".
These types of pressure need to be balanced with each other. The idea isn't just to keep as much data in physical RAM as possible. The idea is to use the RAM as effectively as possible (perhaps by keeping as much relevant data in RAM as possible) whilst also being able to respond effectively to requests for fresh pages.
In general, under pressure and load, the Linux kernel tries to keep a few MiB of ready-to-allocate pages. When it runs out of these it raids the page cache. It's reasonably quick to get (non-dirty) pages from the page cache because they can just be zero'd and handed out. The most difficult pages to reclaim are those that must be copied to swap before they are zero'd. So the kernel tries to minimise that.
How does it do that? Well, it has some tricks:
When processes load and run they allocate a bunch of pages. Sometimes, and not infrequently, they allocate and write to pages which are never accessed again!
I see this on my laptop all the time. I never use even close to the physical amount of RAM I have in the machine. Right now I'm using 4,959MiB of 15,799MiB and it rarely surpasses 6 or 7 GiB. However, if I leave it running for any length of time (a number of days or weeks) then I start to see a little bit of swap getting allocated. I've currently been up about 98 days and right now I'm using 317MiB of swap.
What's happened here is that the kernel has swapped out pages that it thinks are never going to be used again. That way, if memory allocation pressure suddenly increases it's got those physical pages ready to service those requests.
If there was no swap, those pages would be unnecessarily pinned into physical memory even tho' they would never be used.
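You can see this per process if you're curious; a rough sketch that sums up the VmSwap lines out of /proc (smaps-based tools are more precise):

```shell
#!/bin/sh
# which processes have pages sitting in swap, worst offenders first
# (VmSwap comes from /proc/PID/status; kernel threads have no such line)
for f in /proc/[0-9]*/status; do
  awk '/^Name:/ {n=$2} /^VmSwap:/ {printf "%8d kB  %s\n", $2, n}' "$f" 2>/dev/null
done | sort -rn | head
```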
Another commenter asked a question like "What's the difference between a machine with some memory and some swap and a machine with more memory?" Well, there are a few subtle differences, but this is one of them. The machine with swap will have a higher percentage of physical memory available and will be able to respond faster to larger allocations.
Another difference is price. RAM is still expensive compared to disk. If you have a 16GiB box with a bit of swap then you have a 16GiB box that you can use for data that's actively being read and written. If you don't have the swap then you're paying for some of that RAM that gets written and never used. This doesn't matter so much in the small scale but when you have a few machines you want to be getting the most out of them (and what that really means is the subject of another post!).
Swap space gets you more bang for your buck.
So swap space is really about giving the kernel a mechanism to manage the different types of memory pressure without taking too many compromises on the different trade-offs. Even if you're never using all your RAM, swap will still get used so that the machine can respond optimally to as many types of memory activity as possible.
One thing I'd really like to know is whether the number `free` gives me for "Swap used" is the amount of data that's in swap and nowhere else or whether it's the amount of swap space that's used even if the pages are also still in physical memory.
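My understanding (from proc(5), not the procps source, so take it with salt) is that `free` reports SwapTotal minus SwapFree, and pages living in both swap and RAM at once are broken out as SwapCached in /proc/meminfo:

```shell
#!/bin/sh
# "used" swap as free(1) computes it (SwapTotal - SwapFree), plus
# SwapCached: pages currently in swap AND still in physical RAM
summary=$(awk '/^SwapTotal:/ {t=$2}
               /^SwapFree:/  {f=$2}
               /^SwapCached:/{c=$2}
               END {printf "swap used: %d kB (%d kB of it also cached in RAM)", t-f, c}' /proc/meminfo)
echo "$summary"
```

So the "used" figure does include pages that are still resident; those slots are kept allocated so the page can be dropped cheaply if memory gets tight again.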
A monitoring process, like earlyoom, can do this for you. Also features like cgroups, setrlimit(), prlimit() can help. Last one can even be used to dynamically adjust memory limits per each process and avoid kernel's weird OOM behavior. This is all in accordance with fail fast, which requires monitoring processes that actually deal with all the failings.
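For the setrlimit()/prlimit route, even the shell's ulimit builtin does it; a sketch of RLIMIT_AS capping a process so it fails fast instead of swapping (the 64 MiB figure is just for the demo):

```shell
#!/bin/sh
# cap address space at 64 MiB in a subshell; a 128 MiB allocation then
# fails with ENOMEM instead of nudging the whole box toward swap
result=$(
  ulimit -v 65536   # kB; this sets RLIMIT_AS for the subshell only
  if dd if=/dev/zero of=/dev/null bs=128M count=1 2>/dev/null; then
    echo "allocation succeeded"
  else
    echo "allocation blocked"
  fi
)
echo "$result"
```

util-linux's prlimit can do the same to an already-running PID, which is handy when you didn't think of it at launch time.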
> For everyone else, you'd probably cry too. I sure did.
I remember a colleague (employee) crying because a third party vendor screwed up and tried to blame us which would have sent a multimillion dollar project down the tubes. What saddened me was she felt the need to apologize. It was a pure expression of frustration, anger, sadness, and exhaustion on a project we were all deeply committed to, produced by the brazen unfairness of this contract house.
It's not good to live in a culture that denigrates human expression. I'm glad rachelbythebay was able to express this.
* We were able to apportion blame properly, get a proper result from someone else, and make the regulators happy with no funny business.
Larger scale systems with redundancy? No swap.
Having swap in systems like this still doesn't make sense to me. It treads heavily on the "cattle not pets" philosophy. I shouldn't be ssh-ing into a machine that's swapping to see what's up. It should be killed. One server in the cluster starts swapping and falls out of step with its peers? It should be killed. When a machine starts swapping it falls into a whole different performance regime than the rest of your systems, and now you've got more variance in your response times. Not good when you care about your response times. Unless you have memory-pretending-to-be-disk for swap (in which case, why isn't it just memory?).
I've never seen a machine 'act funny' because it didn't have swap; it's always the other way around. I don't think I've ever encountered a machine that used so much memory that the kernel didn't have buffers, but not so much that it invoked the OOM killer. Unless there was a woefully misconfigured process running on the machine.
If a machine is well utilized CPU wise it is going to get absolutely crushed when it starts swapping.
Time and time again I see swap being an issue. The past year I've been in a large-scale shop which for some ungodly reason has swap (nowhere I've worked in the past 10 years has had swap as a general rule).
Don't even get me started with EBS IOPS exhaustion when you start swapping onto an EBS volume.
Note that the reason this topic is currently in vogue is that this state has become a lot easier to hit recently. If you run the system on a modern low-latency SSD, the current OOM killer algorithm often fails to kill anything before the entire system is on its knees with approximately 0 pages left for I/O and non-anonymous memory, at which point the OOM killer will never run because the machine is so thoroughly locked. The proper way to fix this, of course, is to make the OOM killer hit earlier.
Why not give them swap, set off pagers, and _maybe_ kill them? There could still be something worth investigating there, and having swap will make that easier
You also don't want to have a cascading failure where a massive leak makes all your machines fill their ram, and start killing everything like crazy.
Cascading failures are a very real thing that have knocked whole systems offline.
It sounds like the real solution is a balanced solution involving some engineering: kill them if you aren't killing _everything_. Page if the problem is ongoing, not if a couple of machines have a problem.
Either way, you can add swap _and_ kill them. One does not preclude the other.
I do find the scenario of the kernel evicting mmap'd pages causing performance degradation interesting, and it sounds plausible, but I haven't personally witnessed this behavior.
Where I see swap tend to get especially detrimental is with GCed processes. I've spent significant effort tracking down long GC pauses to getting blocked on swapped pages (although the software was not optimized and responsible as well and this was spinning rust). But in line with the article and your comments, this depends on engineering the system to have headroom.
IIRC processes that use more than a NUMA node worth of memory also run into some issues with the OOM killer with swap disabled, unless set to interleaved on the NUMA policy. So that's another thing to look out for when dropping swap, although I forget exactly why it happens.
I have seen this being done on Android devices and wondered why it is being used so rarely in other areas (Desktops/Servers).
There's a lot of institutional knowledge, and mythology, around swap. Less so around zram/zswap, and that is going to compete with other competing capabilities and lore.
I've been wrangling boxen since the late 1990s, and using Unix since at least the late 1980s. I'd only run across references to zswap/zram a few weeks ago when attempting to compile OpenWRT, and didn't look into it until seeing your comment (one reason for writing copiously footnoted HN comments -- I might accidentally learn something).
zswap might very well be The Answer We've All Been Looking for, but, well, All Of Us Realising that is another stage in the Hierarchy of Failures in Problem Resolution.
If you want to use the systemd solution you can do it like:
pacman -S systemd-swap
vim /etc/systemd/swap.conf # disable zswap, enable zram
systemctl start systemd-swap.service
systemctl enable systemd-swap.service
Institutional knowledge, mindshare, and documentation are all Things.
Debian's documentation (Debian Administrator's Handbook, Debian Installation Guide, Debian FAQ) do not appear to mention either zram or zswap at all. DAH/DIG do mention swap configuration, but only in terms of traditional swap patterns.
There is mention on the Debian Wiki: https://wiki.debian.org/ZRam But not under the Swap topic: https://wiki.debian.org/Swap
"As easy as" really doesn't mean much if the information isn't accessible. It's also harder to to advocate if it's not at least mentioned in standard documentation.
If you're aware of any standard Linux documentation mentioning zram/zswap as options, please let me know.
Again: this is not an argument against the technical merits or advisability of zram or zswap. It's an argument that knowledge of these options is not widely disseminated or assimilated. I'd commented recently on the matter of intergenerational knowledge transfer, both general and specific (https://news.ycombinator.com/item?id=20617656). This would be a case of that.
zram and zswap are also a lifesaver on low memory devices like chromebooks, especially if they have anything slower than nvme storage.
It works just fine; however, you need to keep an appropriate amount of headroom to let the kernel do its thing with caches, as indicated, otherwise things get very weird very quickly.
Containers are very helpful in this regard for helping explicitly divide a machine up between processes without allowing any one to get out of hand.
Maybe consider that not everyone has the same use case as you: some people are running larger chunks of computation per node or even stateful applications. Even when they could be killed and are redundant, running a bit slower for a moment until they recover may be preferable to restarting the node (which also takes time).
Do you mean a cluster of many computers that are collectively seen by the OS as a single machine with petabytes of RAM? Can I do that too?
The cluster had thousands of individual machines, each performing their own discrete tasks with aggregate ram across the cluster in the petabytes.
There was no swap on any of those machines.
I absolutely hate Linux's behavior with swap enabled, as described in a previous thread: https://news.ycombinator.com/item?id=20479622
It makes sense that it can also be broken with swap disabled: paging out too many file-backed pages can also lead to an unresponsive system.
> Earlier this week, a post to the linux-kernel mailing list talking about what happens when Linux is low on memory and doesn't have swap started making the rounds. ... Now, here we are in 2019, and we have a fresh set of people still fighting over it, like it's some kind of brand new dilemma. It's not.
The problem isn't new, but the approach I saw them discussing (use the new PSI stuff to OOM kill early) is new—PSI was only added ~a year ago, iirc. So I think this comment is unnecessarily dismissive.
I've seen the systems behave badly without swap. I don't see the bad swapless behavior as often personally, but I believe it exists. (In particular, I haven't tried the reproduction instructions in the lkml thread.) I don't know how the "tinyswap" approach is supposed to help—I'd love details. Swapless with the PSI-based OOM killing is an approach that actually makes sense to me in theory.
Is there some exact figure on this? Like what percentage of RAM should be allocated as swap space?
My reasoning is as follows (and concludes that swap does nothing):
4GB RAM, no swap.
Firefox allocates all RAM. Linux starts taking the program code and data sections out of disk. Serious thrashing. Hang.
4GB, some swap.
Firefox allocates all RAM.
Some stuff is in swap. Firefox allocates more memory. Swap is full of firefox's stuff. RAM is full of firefox's stuff. Linux starts taking the program code and data sections out of disk. Serious thrashing. Hang.
It's the same thing, just more complicated and thus harder to debug. It will behave better if you never get to fill RAM+swap. But today's programs simply allocate more. And the kernel says yes.
IMO the real solution, given that we overcommit, is for firefox to check that % RAM usage you see in top(1) and to try to not let it go above 95% by freeing cache, and for the kernel to do OOM killing early.
Doesn't it do that already? I remember 3-4 years ago I used to watch in amazement as people complained that Firefox ate over 2GB of memory, whereas I ran it on a laptop with 2GB total, and FF rarely hit 1GB even with dozens of tabs open.
EDIT: Mozillazine seems to confirm my experience: "These numbers will vary because Firefox is configured by default to use more memory on systems that have more memory available and less on systems with less" (http://kb.mozillazine.org/Reducing_memory_usage_-_Firefox)
Not unviable under the stock OEM firmware, but confining when switching to, say, OpenWRT. Not that I'd know about this or anything.
The only time I've ever seen my swap go above 500 megs was when I was running qterminal with a memory leak. It was a long-running terminal, with a long-running process that was extraordinarily chatty.
4 gigs is probably enough for most people under most conditions.
This is workstations, though. On servers, I've seen it happen all the time. Crucially, it gives one time to react.
It would be great if the installer optionally asked how much swap to allocate without having to do the full partitioning manually.
I'd assume that you'd have key metrics plotted in Grafana, so it's a case of following up/alerting on that.
In previous cases it was fairly simple: an alert was fired because the queue size ballooned. Looking at the machine stats, we saw that all the RAM cache had been ejected just as performance started to drop.
Either way, it's pretty trivial to put an alert on a stat in Grafana.
however, your mileage may vary.
Also, if you use tmpfs, you should have enough swap to cover your tmpfs mounts in their entirety, so that a bad application can't eat all your memory.
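i.e. if your fstab caps tmpfs like this (sizes are just an example), budget at least that much swap behind it, since tmpfs pages can only be evicted to swap, never to a filesystem:

```
# /etc/fstab -- a 2G /tmp implies >= 2G of swap budgeted behind it
tmpfs  /tmp  tmpfs  size=2G,mode=1777  0  0
```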
Linode has been configuring 256MB swap partitions by default on their VPS for a long time, even for large plans. Maybe it's bigger for the really large plans, but I haven't tested them. Anyway it feels like a nice default, and seemed to work fine with most kinds of loads in 10+ years of usage. Some of the newer VPS services (DigitalOcean, Lightsail) come with no swap by default, and I don't feel comfortable about it so I add a 256MB swapfile on them. I do turn down the swappiness a bit, though.
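For reference, roughly what I do on those swapless VPSes; this sketch writes to a temp path so it's safe to try unprivileged, and the real commands for root are in the comments:

```shell
#!/bin/sh
# prepare a swapfile; on a real box run as root with SWAPFILE=/swapfile
# and SIZE_MB=256 (everything through mkswap works without root)
SWAPFILE=${SWAPFILE:-$(mktemp)}   # temp path so the sketch is harmless
SIZE_MB=${SIZE_MB:-64}
dd if=/dev/zero of="$SWAPFILE" bs=1M count="$SIZE_MB" 2>/dev/null
chmod 600 "$SWAPFILE"
mkswap "$SWAPFILE"
# then, as root, on the real path:
#   swapon /swapfile
#   echo '/swapfile none swap sw 0 0' >> /etc/fstab
#   sysctl -w vm.swappiness=10
```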
Or more practically, if I’m doing stuff with 8gigs and then I upgrade to 16gigs of ram, then why would I need to keep my 512mb of swap? Surely the extra 8gigs comfortably covers that extra room? Am I missing something?
Linux (and generally most popular OS) really don't like having no swap whatsoever (https://lkml.org/lkml/2019/8/4/15).
Plus if you're running out of RAM (regardless of how much you have, sure put more in if you have it) the system starts visibly degrading but remains recoverable when it starts swapping. If it can't swap, it pretty much just dies.
The point with my example was that if 512 MB of swap would save you, then why wouldn't 8 GB of extra RAM save you?
Yes, if you run out of RAM, you're in trouble, but if you run out of RAM+swap you're in trouble too, so what's the point in adding a small 256 or 512 MB swap?
Ok, so the point is that it's visibly bad with swap before it actually dies, giving you time to recover. How about setting up some kind of alert when you're in your last half a gig or gig of RAM, then? It seems like this would give a much better recovery experience. If it's a server, you need an alert anyway, since you're not going to notice the bad performance until it's too late, most likely (you're not watching it 24/7, I assume).
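A low-tech sketch of that alert, keyed off MemAvailable (the threshold is arbitrary, and a real setup would go through your metrics stack rather than a cron script):

```shell
#!/bin/sh
# cron-job version of "page me in the last half gig"; wire the echo
# into whatever actually alerts you
threshold_kb=$((512 * 1024))
avail=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)
if [ "$avail" -lt "$threshold_kb" ]; then
  echo "LOW MEMORY: only ${avail} kB available"
fi
```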
Of course, from the article:
> I stand by my original position: have some swap. Not a lot. Just a little. Linux boxes just plain act weirdly without it.
That's fair enough, if the reason is to stop linux from acting weirdly, then fine.
Having 8 GB more RAM might avoid the issue, but it won't visibly degrade the system so you will not see that you're in trouble.
And again if you can have both, have both.
> Yes, if you run out of RAM, you're in trouble, but if you run out of RAM+swap you're in trouble too, so what's the point in adding a small 256 or 512 MB swap?
Because if you run out of RAM and have no swap, the system dies. If you run out of RAM and have swap, it starts swapping, which is noticeable.
> How about setting up some kind of alert when you're in your last half a gig or gig of RAM, then? It seems like this would give a much better recovery experience. If it's a server, you need an alert anyway, since you're not going to notice the bad performance until it's too late, most likely (you're not watching it 24/7, I assume).
If it's a server you need to do that anyway because the swapping will probably not be noticeable.
Only if you're actively using the system, or reacting to monitoring fast enough (assuming you can still monitor it while the thing you're monitoring is failing).
I'd rather have a dead system than one that's not working
But in any case shouldn't OOM killer come to the rescue?
Indeed. It's similar to the argument that segfaults are good due to fail-fast. Wouldn't it be better to fail fast due to OOM than to hobble along?
If you're actively using the system, ok, you might get a chance to save your work or whatever first, but, in my personal anecdotal experience, that process is pretty much dead anyway and I have to kill it. Yes, my system doesn't get taken down, but if OOM takes the system down, why isn't the kernel killing the process that's eating all the memory? It sounds to me that swap is just masking the problem.
A swapping system is degraded, not "not working".
> But in any case shouldn't OOM killer come to the rescue?
The OOM killer will heuristically kill random crap. It could be the process with an unbounded memory growth, or it could be your text editor.
Try some numbers yourself.
Anything kilobyte or below gives you effectively zero swap.
Megabytes gives you 90MB of swap on a machine with 8GB ram, that's an unusually small number and if it was supposed to be that small the advice would probably just be "128MB for all".
Gigabytes fit the idea that if you have a smaller amount of ram, you have swap comparable to it, but once you scale up you don't add a lot more for each gig.
Terabytes gives you hilariously large numbers for computers with 64GB of ram or less.
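Assuming the rule being hinted at is swap = sqrt(RAM), with the unit as the free variable (which matches all four cases above), the arithmetic comes out like:

```shell
#!/bin/sh
# swap = sqrt(RAM) for an 8 GiB machine, with RAM expressed in
# different units; which unit you plug in changes the answer wildly
for pair in "8388608 KB" "8192 MB" "8 GB"; do
  set -- $pair
  awk -v r="$1" -v u="$2" \
    'BEGIN { printf "sqrt(%d %s) = %.1f %s of swap\n", r, u, sqrt(r), u }'
done
```

Kilobytes give ~2.9 MB (effectively zero), megabytes give the ~90 MB figure, and gigabytes give the "comparable, but sublinear" shape.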
some background on anonymous memory: https://utcc.utoronto.ca/~cks/space/blog/unix/NoSwapConseque...
At one company where I worked over a decade ago, we ran some Linux-based equipment without swap also. To prevent executables from being evacuated by low memory pressure, I put a hack into the kernel: executables and shared libs were mapped such that they were nailed into memory (MAP_LOCKED).
Of course, real-life often is different than theory. Does your machine have a spinning disk or an SSD? I am much faster to enable swap on an SSD, since it won't be painfully slow, should we ever get into a situation where our RAM is saturated.
What happens in cloud VMs? These things use a network disk storage (transparent to us), and often writes need to be sent over the network more than once (for redundancy). How extensive swapping would behave in such an environment?
As for saying no, it's important to set some rules to avoid chaos, but it's also important to trust our senior people to take decisions. If they need to go against a rule, I would expect a good explanation in their commit (because, infrastructure as code) and documentation. If a junior wants to go against a rule, they can consult a senior. Issuing a no and expecting everyone to follow it blindly is the worst form of micromanagement. :)
Why not just permanently allocate enough RAM for the kernel? If I have 16GB of RAM but the kernel needs 1GB to do its job, then just tell me that I have 15GB to work with.
It's certainly possible to create a small, protected area of memory that contains a kernel-level interrupt handler (which itself allocates no memory) whose sole job is to run a couple of times a second and check for thrashing and OOM. If it sees memory problems, it takes over the computer, determines which processes are using the most memory and kills the ones that are expendable. ("Expendable" is a list configurable by the user and yeah, Chrome would be right at the top for a desktop system.)
Embedded systems designers routinely build such watchdogs into their systems. It could probably be added to Linux as a kernel patch.
In the patch case, he asked about testing, and they realized the ssh/scp versions she tested with weren't the same as the ones the code was using. She promised to follow up with best-practice testing and didn't. (Without knowing the reason, this isn't unusual: people get busy and drop things all the time.) I didn't get the same sense of rejection or hostility she did. And the second developer (who got her patch accepted) credited her in the code review, tested in a middle way (better than she originally did, worse than she promised to do later), requested the review from a different person than she had (why I don't know), and got a review question with a similar tone before it was accepted. None of the parties' behavior looked unusual/red-flag-worthy to me.
I don't fault her for imperfectly describing an interaction that was five years ago when she wrote that post and is twelve years ago now. I'm trying to figure out what the lesson is and who should be learning it. A few unorganized ideas:
* Much of what people are thinking and feeling is left unwritten/unsaid, so two people can have very different ideas of what happened. (A reminder I suppose to listen to both sides before making a judgement on something.)
* I don't want to dismiss her feeling about bad team dynamics, even if I don't see them in this particular interaction. "At the end of the day people won't remember what you said or did, they will remember how you made them feel." - Maya Angelou
* A (imo typical) code review question can seem intimidating or hostile from a senior developer when "you're already not sure you belong there at all". Maybe an in-person follow-up would have helped, either then or later ("hey, did you have a chance to try writing that test? can I help? I want to get your change in"). I've been on both sides of this one. The junior developer often wants some extra help and attention, and the senior developer is often feeling overwhelmed by the volume of questionable-quality things coming in, such that they can go into more of a gatekeeper role than trying to mentor each person thoroughly in each interaction. (I think this is what she's talking about with "Any lazy fool can deny a request and get you to 'no.' It takes actual effort to appreciate and recognize what they're trying to accomplish and try to help them get to a different 'yes'.")
Then again, anything above 80% memory utilisation and we begin looking at adding another box to the cluster, because occasional spikes in usage can easily put us beyond what swap can protect against, and that just causes a shitstorm.
I/O also used to have a smaller speed penalty. There was just one core and it wasn't so dramatically faster than the rest of the machine. Hell, there used to be faster disks for swap than for the filesystem, and tuning was about scheduling jobs and placing inputs and outputs so as to fully utilize the fs disks.
Anyway, isn't this the kind of argument that should be replaced by gathering objective data? Otherwise, the low/no swap space problems really appear to be symptoms of someone irresponsibly experimenting in production.