Not that I'm not impressed by Kafka and its stability, performance and scalability, but I see the same behaviour from our customers.
They specifically want Kafka, with no real reason other than that they need a queue, which Kafka actively states it is not. At that point it gets really tricky to reason with the developers about why they might be better served by something else. Generally speaking it's not much of an issue, because Kafka will deal with the workloads just fine; it's just weird. I have seen one customer use Kafka as a database, which works less well.
We do see the same with Kubernetes. The developers pick Kubernetes and at that point it's too late. They specifically want Kubernetes even if you could more easily solve the problem with Nomad, Docker-Compose, plain old VMs or EC2, depending on the problem.
> How do you teach someone to look at the problems first and then pick the tools?
By understanding what the person cares about. Everyone knows "pick the right tool for the problem". Not everyone uses such a simple calculus because life isn't that simple. People have their own agendas, backgrounds, experiences, career growth desires, personal lives, etc., that are all part of their personal objective function. If you want to convince someone that your tools are better, show that your tools have a higher payoff for their personal objective function. This is way more than a mere product question. In a team setting it's even harder, because you have to balance it across multiple people simultaneously.
"Due to CPU bottlenecks, we were not able to drive a throughput higher than 38K messages/s, and any attempt to measure latency at this rate showed significant degradation in performance clocking a p99 latency of almost two seconds."
I'm about to pick Kubernetes even though a different solution would theoretically be a much better fit for my needs. This is entirely because some other tools I'm looking at play nicely with Kubernetes out of the box. If I picked something else, I'd end up writing my own glue code.
I wonder if something similar is happening with Kafka?
My last gig was Kubernetes, and despite all the hate it gets here ("You're not Facebook, you don't need Facebook scale"), it was a very pleasant experience. So pleasant, in fact, that when I moved on to my next job (Amazon EC2 VMs) it was pretty painful. The VMs were running an old version of Amazon Linux and hadn't been updated in years. Some runtimes were impossible to update because glibc was out of date. Our immediate question was: can we at least get to a Docker solution? ECS/Fargate provided a nice middle ground. But I'll admit, once you start running multiple replicas, it's nice to have the other stuff that Kubernetes affords you.
I pick Kubernetes because I want to manage software and not manage servers. I think it gets a lot of hate because people look at helm charts that are designed to support all possible software configurations and they are quite confusing. If you break it down to the basic pieces of configuration it's not really any more complicated than say a docker-compose file. Just a little more verbose.
Oh definitely. Helm/Kustomize... YAML all over the place. It can be awful. But then how nice it is to have a cluster with load balancing, a nice API Gateway, services spinning up and down gracefully. While that is certainly achievable in other ways, this one has been my favorite. (I used to deploy WARs to JBoss with zero downtime. It was possible, but it was horrible.)
There's also strong pressure to take whatever the "safe" choice seems to be. If everyone's jumping on some technology, and your manager already mentioned it, even, then you will catch no blame even if it's entirely terrible. Advocate for something else, and win, and congrats, now even if it's way better every little hiccup and difficulty is your fault.
To experience it on a smaller scale, get a non-tech-savvy family member who uses Windows/macOS to use Linux for a few days. Everything that is not the same as it was on Windows/macOS will be your fault, even if the Linux way is better/faster/cleaner.
Framing it like this is unproductive IMO. There's merit to picking technologies that help you and your teammates grow as engineers. You can take it too far...we should accept that it's a messy process to find the balance that lets us be productive now while helping us be more productive in the future.
As an engineering leader sometimes that even means knowing that people are making the wrong decision, and letting them do it anyway, and then helping them learn from it.
Hmm, that might be me remembering wrong. At least I can't find it. Sorry, I may be wrong.
They do go to great lengths to avoid calling Kafka a queue. Nowhere do the docs directly state that Kafka is not a queue. They just never talk about Kafka as being a queue.
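For anyone wondering what the queue-vs-log distinction looks like in practice, here's a rough sketch. It's a toy in-memory model, not Kafka's API: the `SimpleQueue` and `SimpleLog` classes and the group names are made up for illustration. The point is only that a queue hands each message to one consumer and forgets it, while a log keeps the records and lets every consumer group track its own offset.

```python
from collections import deque


class SimpleQueue:
    """Toy queue: a message is gone once any consumer takes it."""

    def __init__(self):
        self._items = deque()

    def publish(self, msg):
        self._items.append(msg)

    def consume(self):
        # Removes and returns the oldest message, or None if empty.
        return self._items.popleft() if self._items else None


class SimpleLog:
    """Toy log: records are retained; each group tracks its own offset."""

    def __init__(self):
        self._records = []
        self._offsets = {}  # hypothetical group name -> next offset to read

    def publish(self, msg):
        self._records.append(msg)

    def consume(self, group):
        # Returns the next unread record for this group, or None if caught up.
        offset = self._offsets.get(group, 0)
        if offset >= len(self._records):
            return None
        self._offsets[group] = offset + 1
        return self._records[offset]

    def seek(self, group, offset):
        # Rewind (or skip ahead) so the group can re-read retained records.
        self._offsets[group] = offset


queue = SimpleQueue()
log = SimpleLog()
for msg in ("a", "b"):
    queue.publish(msg)
    log.publish(msg)

first_from_queue = queue.consume()     # one worker takes "a"; nobody else sees it
second_from_queue = queue.consume()    # the next worker gets "b"

billing_first = log.consume("billing")      # each group reads from its own offset
audit_first = log.consume("audit")
log.seek("billing", 0)                      # replay is possible because nothing was deleted
billing_replayed = log.consume("billing")
```

If all you need is the first behaviour (work handed to exactly one worker, then forgotten), that's the case where a plain queue is arguably the simpler fit.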