Introduce DAMON-based Proactive Reclamation (kernel.org)
32 points by marcodiego 11 days ago | 11 comments

Linux behavior under memory pressure on the desktop has been a problem for a long time[0]. Although not aimed at the desktop, I hope this, combined with zram, cleancache, frontswap, and all the infrastructure required to support systemd-oomd, finally fixes it.

[0] https://lkml.org/lkml/2019/8/4/15

What's described there is disk thrashing, and while there might be different approaches to fixing it, I'd suggest checking [1].

[1] https://gitlab.com/post-factum/pf-kernel/-/commit/1102c90c79...

I wonder how this compares to the state-of-the-art solutions from the author of Nohang (https://github.com/hakavlad/prelockd https://github.com/hakavlad/memavaild https://github.com/hakavlad/le9-patch) in terms of % memory saved, % of hangs/OOMs prevented, or base runtime performance lost.

An interesting possibility: once the backing store is full and a new page needs to go to swap, the LRU page in the backing store could be "forgotten" and freed; if a "forgotten" page were ever touched again, the owning process would receive SIGBUS.

That would keep the system running happily even if processes leak arbitrary amounts of memory.

This seems like a massively overcomplicated solution to a simple problem. If you're running out of memory, it's because:

1. You don't have enough memory for what you're trying to do

2. Your software has a memory leak

Add more memory to the system or fix your software. Why is this fancy feature needed at all? It's simpler and safer just to reboot and have a fresh start if you're out of memory. Set panic_on_oom, job done.
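For context, "panic_on_oom" refers to the vm.panic_on_oom sysctl; a minimal sketch of the reboot-on-OOM setup described here (values shown are illustrative):

```shell
# Panic instead of invoking the OOM killer, then reboot automatically.
sysctl vm.panic_on_oom=2   # 2 = always panic on OOM, even cgroup-constrained
sysctl kernel.panic=10     # reboot 10 seconds after a panic
```

Note that vm.panic_on_oom=1 panics only on system-wide OOM, while 2 also panics when a cpuset, mempolicy, or memcg limit is hit.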

Hmm, while I definitely agree with 2., I'm not sure that I'd dismiss 1. so easily.

To me, that sounds a bit like saying: "Compressing backups? If you can't store them as they are, you don't have enough storage for what you're trying to do." With finite resources, regardless of whether it's storage, RAM or CPU cores, it makes sense to look for options to utilize them more efficiently.

Not everyone can acquire all of the RAM for the problems they'd want to tackle, so solutions like this could provide tangible benefits with sometimes-acceptable drawbacks (the mentioned CPU usage). Personally, I feel that VPSes effectively offering more memory would be a good thing, since most of the software I use is memory-constrained rather than CPU-constrained, though I understand why opinions could differ.

As for memory leaks and rather liberal memory usage - I feel that it'll be inevitable for as long as the industry uses Java, .NET, Ruby, Python, Node and most other technologies with a high abstraction level (and VM runtimes, often coupled with GC). Yet not everyone is keen on writing their business apps in C++ or Rust.

FWIW, there has been a lot of research on making GC (particularly on the JVM) communicate its memory status to the kernel, but the (usually Linux) kernel developers’ attitude was always “no, you don’t get to put your thesis into mainline, unless you can prove people actually use it”. Which is partly fair, but the end result is still a dumb chicken-and-egg situation.

I can’t speak to this particular solution, but the problem is, in fact, complicated for two interrelated reasons:

- Modern UNIXes usually can’t communicate, and modern applications usually can’t handle, out of memory conditions, because of overcommit, necessary due to pervasive use of fork(), which is ingenious and easy to use but lends itself poorly to resource accounting. Compare with Symbian, which was very awkward to program in, but was resilient enough to run the radio stack and user apps on the same processor.

- While the monolithic-kernel ideal is to put all caches and other discretionary memory consumers in the kernel, that is of course impossible to achieve. Everything on your system has buffers, caches, or un-garbage-collected memory it could afford to lose, but the system can't communicate or cooperate with itself to actually make use of that. Coordinating resource usage among untrusted processes which have different ideas of how important things are is a hard and mostly unsolved problem.

> Set panic_on_oom, job done.

So any buggy app (or just a `make -j`) can crash the system?

Yeah, no thanks.

>So any buggy app (or just a `make -j`) can crash the system?

Yes. This isn't the 80s anymore, nobody runs multi-seat mainframes. Any critical system should be designed to be resilient enough that a machine can reboot without downtime.
