I'm always a bit confused about the CPU limit (for the pod), some guides (and to...

jeffbee · on Oct 13, 2021

CPU limits are harmful if they strand resources that could otherwise have been used. I usually skip them for batch deployments and use them for latency-sensitive services. Doesn’t seem like a security topic though.

dilyevsky · on Oct 14, 2021

They are actually even worse for latency-sensitive workloads, because CFS with its default 100ms period will cause crap tail latency (especially for multithreaded processes such as most Go programs).
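To make the 100ms point concrete, here is a rough back-of-the-envelope sketch. It assumes a simplified model of CFS bandwidth control (quota = limit × period, every thread runnable for the whole period, no per-CPU slice accounting); the function name and numbers are made up for illustration.

```python
def throttled_time_ms(limit_cpus, busy_threads, period_ms=100.0):
    """Return (run_ms, throttled_ms) within one CFS period for a cgroup
    whose threads all want CPU for the entire period (simplified model)."""
    quota_ms = limit_cpus * period_ms  # CPU-time budget per period
    # N threads running in parallel burn the budget N times faster than wall-clock.
    run_ms = min(period_ms, quota_ms / busy_threads)
    return run_ms, period_ms - run_ms

# Hypothetical case: limit of 2 CPUs, 8 busy threads (e.g. a Go service on a big node)
run, stalled = throttled_time_ms(limit_cpus=2, busy_threads=8)
print(run, stalled)  # 25.0 75.0
```

Under these assumptions the process runs for 25ms and then sits throttled for the remaining 75ms of the period, so any request that arrives during the stall can pick up tens of milliseconds of extra latency.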

gui77aume · on Oct 14, 2021

Interesting, that's my impression too. As I understand it, a CPU limit artificially throttles the process even when throttling isn't needed, wasting CPU cycles I could use. (Java programs in my case, but I imagine it's comparable to Go ones.)

Do you recommend disabling CPU limits in the general case?

dilyevsky · on Oct 15, 2021

We don’t set them anywhere in prod and generally haven’t had any issues. We always set CPU requests and alert if those are exceeded for prolonged periods, and we always set memory request = limit.
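A minimal sketch of what that policy looks like as a container `resources` stanza, written here as a Python dict. The helper name and the "500m" / "1Gi" values are hypothetical; only the `requests` / `limits` / `cpu` / `memory` keys are the standard Kubernetes field names.

```python
def pod_resources(cpu_request, memory):
    """Build a container `resources` stanza: CPU request only,
    memory request equal to memory limit."""
    return {
        "requests": {"cpu": cpu_request, "memory": memory},
        "limits": {"memory": memory},  # deliberately no "cpu" key
    }

# Hypothetical values, not a sizing recommendation
resources = pod_resources("500m", "1Gi")
print(resources)
```

Serialized to YAML or JSON, this dict would go under `spec.containers[].resources` in the pod spec.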

jeffbee · on Oct 14, 2021

Yes, but I put limits on LS (latency-sensitive) workloads because I expect them to have a capacity plan, stick to it, and not abusively starve out batch workloads.

dilyevsky · on Oct 14, 2021

I think this is backwards. How are you planning on “sticking to it” when you’re serving unpredictable user traffic? If requests are set appropriately everywhere, then it won’t really starve batch, as the kernel will just scale everything to its respective cpu.shares when the CPU is fully saturated. This lets you weather spiky load with minimum latency impact and minimize spend.
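A rough sketch of the cpu.shares argument, assuming cgroup v1, the usual Kubernetes conversion of 1000 millicores to 1024 shares, and a node where every container wants more CPU than it gets. The workload names and sizes are made up.

```python
def cpu_shares(request_millicores):
    # cgroup v1 conversion: 1000m -> 1024 shares, with a floor of 2
    return max(2, request_millicores * 1024 // 1000)

def saturated_allocation(requests_m, node_millicores):
    """Split a fully saturated node's CPU in proportion to cpu.shares."""
    shares = {name: cpu_shares(m) for name, m in requests_m.items()}
    total = sum(shares.values())
    return {name: node_millicores * s / total for name, s in shares.items()}

# Hypothetical 8-core node: a web service requesting 3 CPUs, a batch job requesting 1
alloc = saturated_allocation({"web": 3000, "batch": 1000}, node_millicores=8000)
print(alloc)  # {'web': 6000.0, 'batch': 2000.0}
```

In this model batch still gets twice its request even while web is spiking, and when the node is not saturated either side can use whatever is idle.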

jeffbee · on Oct 15, 2021

It's weird that, according to other discussions we've had, you are apparently a Borg user from Google, yet you question the value of hard-capping for latency-sensitive processes.

dilyevsky · on Oct 15, 2021

Borg SRE even ;) (former), and yes, I do question them. For one, Borg isn’t using a 100ms CFS period, and it wasn’t even standard CFS if I recall, so yes, I do question that outside of the limited Borg use case.