The title is a bit misleading; Kubernetes didn't cause the 10x latency, and latency was actually lower after they fixed their issues.
TL;DR version - They migrated from EC2 to Kubernetes; due to some default settings in Kiam and the AWS Java SDK, application latency increased. After reconfiguration the problem was fixed, and latency on Kubernetes was lower than on EC2.
I'm just saying the title is misleading, and clickbaity.
More complicated than what? Without a baseline for the comparison it's not that useful. In our case we transitioned over the last four years from running hundreds of VMs provisioned with Puppet and Jenkins to running K8S workloads (on a lot fewer nodes than we had VMs) provisioned with helm/kustomize and using GitLab CI/CD pipelines. In my opinion the current platform is much less complex to understand and manage than the old one was. Yes, there are layers of complexity that didn't exist in the previous platform, e.g. the k8s control plane, the whole service/pod indirection, new kinds of code resources to manage, but it's all pretty consistent and works logically, and isn't really any harder to internalize than any other platform-specific complexity we've had to deal with in the past.

And in terms of day-to-day development, deployment and operations, k8s has been completely transformative for us.
It’s sort of like cursing Microsoft every time an application crashes... because it’s running on Windows.
No it doesn't.
This is only true if you don't need the features in the first place, and you haven't gone through the learning curve yet.
Say you deploy a single container. Works fine. You can tell Docker to restart it if it dies. Works fine. If that's all you need, you don't need k8s.
But you may now want to deploy two containers. And let's say they are servers. Now, you need incoming connections from a single port going to them. So maybe you are going to deploy something like haproxy – which now you have to learn.
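For illustration, even the minimal haproxy setup for that two-container case is a whole new config format to learn. A sketch (the frontend/backend names, ports and addresses here are made up):

```
frontend app_in
    bind *:80
    mode http
    default_backend app_servers

backend app_servers
    mode http
    balance roundrobin
    # health-check and route traffic to the two hypothetical containers
    server app1 127.0.0.1:8081 check
    server app2 127.0.0.1:8082 check
```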
Also, you now need a way to tell if your service is down. So you add a health check. But then you want to kill the container if it goes bad. So now you add a cronjob or similar.
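A sketch of the kind of watchdog you end up writing for that DIY health-check-plus-restart step (the endpoint, container name and script name are all hypothetical; this is roughly what a k8s liveness probe gives you for free):

```python
# Hypothetical DIY health-check watchdog: probe an HTTP health endpoint
# and restart the container if it stops answering.
import subprocess
import urllib.request
import urllib.error


def is_healthy(url: str, timeout: float = 5.0) -> bool:
    """Return True if the health endpoint answers 2xx within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        return False


def restart_container(name: str) -> None:
    """Restart a container by name via the docker CLI."""
    subprocess.run(["docker", "restart", name], check=True)


# You'd then wire this up from cron, e.g. once a minute:
#   * * * * * python3 watchdog.py
# with a main body along the lines of:
#   if not is_healthy("http://localhost:8080/healthz"):
#       restart_container("app")
```

And that still only covers one container on one machine.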
At some point, one machine is not enough. So do you replicate the exact workloads across machines? If you do, it is a bit easier - but potentially more expensive. But if you need to spread them across different machine types, it gets more difficult. And now you also have distributed storage to keep track of. If you haven't automated your deployment, now is the time.
Now you have this nice setup. It is time to upgrade. How do you do that? Probably some custom script again.
Instead, you can forget all of the above. You can spin up a k8s cluster with any of the currently available tools (or use one of the cloud providers). Once you have k8s, a lot is taken care of for you. Multiple copies of the service (with the networking taken care of for you)? Check. Want to scale up? kubectl scale deploy <name> --replicas=<n>. Want to upgrade? Apply the new YAML. Health checks and auto-restart? Add a couple of lines to your YAML. Want to run different workloads on different machine types? That's also an easy change.
Want to have storage (that follows your workload around)? Easy. You are on a cloud provider like GCP and want a new load balancer (with automated SSL cert provisioning!)? A couple of lines of YAML. Want granular access controls? It's built in. I can go on.
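As a concrete sketch of the YAML being referred to: the replica count, the health check and the upgrade story all live in one Deployment manifest (the name, image and health path below are hypothetical):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                # hypothetical name
spec:
  replicas: 3                 # "kubectl scale" just changes this number
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: example.com/my-app:1.0  # bump the tag and re-apply to upgrade
          ports:
            - containerPort: 8080
          livenessProbe:      # k8s restarts the container if this fails
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
```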
Of course, there's a learning curve. But the learning curve is also there if you're stitching together a bunch of solutions to replicate the same features. Once you get used to it, it's difficult to go back.
That said, it is a very common deployment strategy on EC2 to run kiam or kube2iam. I wish the kube core teams took over the development of an AWS IAM role service, since issues like bad defaults would get fixed much quicker. Your only other alternative is to use IAM access keys, and nobody likes that (security-wise, and it's a pain to configure).
Again, my point was that rather than trying to do PR damage control it would be better to work to improve things so this doesn't happen. Someone went to the trouble of posting a detailed examination of a real problem, with a fix and some references to upstream improvements. That's a lot more useful than trying to draw a line between Kubernetes and one of the two tools commonly used to meet security requirements when operating it in one of the most popular cloud environments.
So I would disagree that it, specifically, is a necessary component.
Kiam and the AWS Java SDK have a bad timeout overlap. That's the problem; it has nothing to do with Kubernetes, and it looks like it's well documented and now resolved.
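To make that overlap concrete, here's a toy model of the interaction (a sketch only; the function names and numbers are mine, not the SDK's):

```python
# Illustrative sketch of the timeout overlap, not real SDK code. The SDK
# proactively refreshes credentials that will expire within its
# refresh-ahead window; if the TTL the metadata proxy hands out is no
# longer than that window, every request pays for a blocking refresh.

def needs_refresh(remaining_seconds: float, refresh_window_seconds: float) -> bool:
    """The SDK refreshes when the credentials expire within its window."""
    return remaining_seconds <= refresh_window_seconds


def refreshing_requests(ttl: float, window: float, request_times: list[float]) -> int:
    """Count requests that trigger a blocking refresh, assuming each
    refresh re-issues credentials with the same TTL."""
    issued_at = 0.0
    refreshes = 0
    for t in request_times:
        remaining = (issued_at + ttl) - t
        if needs_refresh(remaining, window):
            refreshes += 1
            issued_at = t  # fresh credentials, same (too-short) TTL
    return refreshes


# TTL equal to the refresh window: every single request refreshes.
worst = refreshing_requests(ttl=900, window=900, request_times=[1, 2, 3, 4])
# TTL comfortably longer than the window: none of these requests do.
best = refreshing_requests(ttl=3600, window=900, request_times=[1, 2, 3, 4])
```

So the fix is exactly what you'd expect from the model: make the issued credential lifetime comfortably longer than the SDK's refresh window.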