Discovered this sometime last year in my previous role as a platform engineer, managing our on-prem Kubernetes cluster as well as the CI/CD pipeline infrastructure.
Although I saw this mismatch between actual and assigned CPU causing issues, particularly CPU throttling, I struggled to find a scalable solution that would cover all Go deployments on the cluster.
Getting all devs to include the automaxprocs dependency was not exactly an option across hundreds of projects. Alternatively, setting every CPU request/limit to a whole number and then mirroring that value in a GOMAXPROCS environment variable in each k8s manifest was also clunky and infeasible at scale.
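To illustrate what I mean, the manifest route looks roughly like this (service name and image are made up):

```yaml
# Fragment of a Deployment's pod template; name and image are hypothetical.
containers:
  - name: my-go-service
    image: registry.example.com/my-go-service:1.0
    resources:
      requests:
        cpu: "2"
      limits:
        cpu: "2"        # kept to a whole number so it maps cleanly
    env:
      - name: GOMAXPROCS
        value: "2"      # duplicated by hand from limits.cpu
```

Kubernetes' Downward API (a resourceFieldRef on limits.cpu) can at least populate the variable automatically instead of hard-coding it, but it still means touching every manifest.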
I ended up just setting this GOMAXPROCS variable for some of our more heavily multithreaded applications, which yielded some improvements, but I've yet to find a solution that applies to every deployment in a microservices architecture where CPU requirements vary widely from project to project.
There isn't one answer for this. Capping GOMAXPROCS may cause severe latency problems if your process gets a burst of traffic and has naive queueing. It's really best to set GOMAXPROCS to whatever the hardware offers, regardless of your ideas about how much CPU time the process will use on average.