It makes me feel a little embarrassed for HN actually. This article might be a revelation for the noobs on the Linux subreddit, but I'd expect the HN crowd to find it pretty pedestrian.
Not only that, but in some cases it is flat-out wrong:
"No, disk caching only borrows the ram that applications don't currently want. It will not use swap. If applications want more memory, they just take it back from the disk cache. They will not start swapping."
Try again. You can tune this to some extent with /proc/sys/vm/swappiness, but Linux is loath to abandon buffer cache, and will often choose to swap old pages instead.
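If you want to watch this happen rather than take my word for it, here is a rough Python sketch that reads the swappiness setting and reports how much memory sits in cache versus how much swap is in use. The function names are mine and purely illustrative; the only things assumed from the system are the standard /proc/sys/vm/swappiness and /proc/meminfo files.

```python
from pathlib import Path


def read_swappiness(path="/proc/sys/vm/swappiness"):
    """Return the current vm.swappiness value as an integer."""
    return int(Path(path).read_text().strip())


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> number.

    Most fields are reported in kB; the unit suffix is simply dropped.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def cache_and_swap_kb(meminfo_path="/proc/meminfo"):
    """Summarise cache size and swap usage, both in kB."""
    info = parse_meminfo(Path(meminfo_path).read_text())
    return {
        "cached_kb": info.get("Cached", 0) + info.get("Buffers", 0),
        "swap_used_kb": info.get("SwapTotal", 0) - info.get("SwapFree", 0),
    }
```

Call cache_and_swap_kb() every few seconds during a big sequential read: if swap_used_kb climbs while cached_kb stays pinned near the top, you are seeing the behavior described here.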
I have learned this the hard way. For example, on a database machine (where > 80% of the memory is allocated to the DB's buffer pool) try to take a consistent filesystem snapshot of the db's data directory and then rsync it to another machine. The rsync process will read a ton of data, and Linux will dutifully (and needlessly) try to jam this into the already full buffer cache. Instead of ejecting the current contents of the buffer cache, Linux will madly start swapping out database pages trying to preserve buffer cache.
Some versions of rsync support direct i/o on read to avoid this, but they're neither mainstream nor readily available on Linux. You can also use iflag=direct with dd to get around this problem.
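If you are scripting the copy yourself, a related trick is to tell the kernel you are done with each chunk as you read it. To be clear, this is not direct i/o: the sketch below swaps in posix_fadvise with POSIX_FADV_DONTNEED, which is only a hint that the kernel is free to ignore, and it does nothing about the pages cached for the destination file. The function name and chunk size are my own choices, in Python.

```python
import os


def copy_with_cache_hint(src, dst, chunk_size=1 << 20):
    """Copy src to dst, hinting that src's cached pages can be dropped.

    Returns the number of bytes copied. Falls back to a plain copy on
    platforms where os.posix_fadvise is not available.
    """
    can_advise = hasattr(os, "posix_fadvise")
    copied = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        fd = fin.fileno()
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            fout.write(chunk)
            if can_advise:
                # Advisory only: ask the kernel to evict the range just read.
                os.posix_fadvise(fd, copied, len(chunk), os.POSIX_FADV_DONTNEED)
            copied += len(chunk)
    return copied
```

Whether this actually keeps the database pages out of swap depends on the kernel and the workload, so measure before relying on it.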