
Depends. "free" reports the memory used by programs separately from the memory used for disk buffers, hence the distinct "available" and "free" numbers.

On my servers I want some "available" RAM (roughly, what is left once you take "used" and exclude buffers), because this means I configured my servers correctly and nothing is running away or using more than it should.

On the other hand, you want "free" to be almost zero on a warmed-up server (except in some cases, which hint that heaps of memory have recently been freed), since the rest is always utilized as disk cache.

Similarly, having some data in swap doesn't hurt, as long as it wasn't spilled there because some process ran away and used more memory than it should.

So, RAM usage metrics carry a ton of nuance and can mean totally different things depending on how you use that particular server.
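To make the free/available/cache distinction concrete, here is a minimal sketch that reads text in the Linux /proc/meminfo format and reports each number as a share of total RAM. The function names, the sample figures, and the choice to count Buffers + Cached as "disk cache" are my assumptions for illustration, not how the "free" tool itself computes its columns.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value" pairs
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def summarize(mem):
    """Return free, available, and cache as percentages of total RAM."""
    total = mem["MemTotal"]
    # Assumption: treat Buffers + Cached as the disk cache.
    cache = mem.get("Buffers", 0) + mem.get("Cached", 0)
    return {
        "free_pct": 100.0 * mem["MemFree"] / total,
        "available_pct": 100.0 * mem["MemAvailable"] / total,
        "cache_pct": 100.0 * cache / total,
        "swap_used_kb": mem.get("SwapTotal", 0) - mem.get("SwapFree", 0),
    }


# Made-up numbers for a warmed-up server: little "free", plenty "available".
SAMPLE = """MemTotal:       16000000 kB
MemFree:          400000 kB
MemAvailable:    9000000 kB
Buffers:          600000 kB
Cached:          8000000 kB
SwapTotal:       2000000 kB
SwapFree:        1900000 kB
"""

if __name__ == "__main__":
    for name, value in summarize(parse_meminfo(SAMPLE)).items():
        print(f"{name}: {value}")
```

On a Linux box you could feed it `open("/proc/meminfo").read()` instead of the sample. MemAvailable needs a reasonably recent kernel (3.14 or later).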




One of the older arguments I get to keep having over and over is No, You May Not Put Another Service on These Servers. We are using those disk caches thank you very much.

I do not enjoy showing up to yet another discussion of why our response times just went up “for no reason”. Learn your latency tables people.


Yeah, people tend to think of server utilization as black and white.

Look, we're using just 50% of that RAM. Look, there're two cores that are almost idle.

No & No. The rest of the RAM is your secret for instant responses, and that spare CPU is there for me to do system management without you noticing, or to absorb the odd torrent of requests we get semi-regularly (e.g.: /. hug of death. Remember?).


I need to find a really good intro to queuing theory to send people to. A full queue is a slow queue. You actually want to aim for about 65% utilization.
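For anyone who wants to see "a full queue is a slow queue" rather than take it on faith, a rough simulation sketch follows. It assumes a single server with exponentially distributed service times and arrival gaps (my simplification, real traffic is messier) and uses the Lindley recursion to track how long each request waits before service starts. The function name and parameters are made up for this example.

```python
import random


def mean_queue_wait(utilization, customers=100_000, seed=1):
    """Estimate the mean wait before service in a single-server FIFO queue.

    Assumptions: mean service time is 1 unit, and both service times and
    arrival gaps are exponential. Utilization is arrival rate / service rate.
    """
    if not 0.0 < utilization < 1.0:
        raise ValueError("utilization must be strictly between 0 and 1")
    rng = random.Random(seed)
    wait = 0.0   # waiting time of the current customer
    total = 0.0
    for _ in range(customers):
        total += wait
        service = rng.expovariate(1.0)      # mean service time 1
        gap = rng.expovariate(utilization)  # mean gap 1 / utilization
        # Lindley recursion: the next customer waits for whatever work
        # is still pending when they arrive, never less than zero.
        wait = max(0.0, wait + service - gap)
    return total / customers


if __name__ == "__main__":
    for load in (0.30, 0.65, 0.95):
        print(f"utilization {load:.0%}: mean wait ~ {mean_queue_wait(load):.2f}")
```

The exact figures depend on the seed, but the wait at 95% load comes out many times larger than at 65%, which is the whole argument for leaving headroom.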


This might be too basic, but I found this blog post to be an incredible introduction to queues: https://encore.dev/blog/queueing


Also, there was a formula for determining the optimal cache size. I forget the name all the time. IIRC, in the end, caching the 10 most popular items was enough to answer 95% of your queries without hitting the disk.
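This is not the formula I can't remember, just a quick way to play with the idea. If you assume item popularity follows a Zipf-like curve (that is my assumption, and the skew exponent is a knob you would have to measure for your own workload), the share of requests served by the top k items is easy to compute. The function name and parameters are hypothetical.

```python
def top_k_hit_ratio(k, catalog_size, skew=1.0):
    """Fraction of requests that hit the k most popular items.

    Assumes the item at popularity rank r is requested with weight
    1 / r**skew (a Zipf-like model), out of catalog_size items in total.
    """
    if not 1 <= k <= catalog_size:
        raise ValueError("k must be between 1 and catalog_size")
    weights = [1.0 / rank ** skew for rank in range(1, catalog_size + 1)]
    return sum(weights[:k]) / sum(weights)


if __name__ == "__main__":
    for skew in (1.0, 1.5, 2.0):
        ratio = top_k_hit_ratio(10, 1000, skew)
        print(f"skew {skew}: top 10 of 1000 items cover {ratio:.1%}")
```

How far a tiny cache gets you depends heavily on the skew and the catalog size, so treat any single percentage as workload-specific.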


If the numbers from The Phoenix Project are to be trusted, a loose estimate is that the time spent in the queue is proportional to the ratio of utilized to unutilized resources. For example, 50% used and 50% unused is 50:50 = 1 unit of time; 99% used is 99:1 = 99 units of time.
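That rule of thumb is a one-liner if you want to plug in your own numbers. A small sketch, with a function name I made up:

```python
def relative_wait(utilization):
    """Ratio of busy to idle capacity, e.g. 0.5 -> 1.0 and 0.99 -> ~99.

    The result is in arbitrary "units of time", per the loose estimate
    above; it is not a prediction of actual latency.
    """
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return utilization / (1.0 - utilization)


if __name__ == "__main__":
    for load in (0.50, 0.65, 0.80, 0.90, 0.99):
        print(f"{load:.0%} used -> {relative_wait(load):.1f} units of time")
```

The curve is nearly flat at low utilization and blows up as you approach 100%, which is why the last few percent of "unused" capacity are the expensive ones to give away.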



