Edit: to avoid anyone misconstruing this, I'm not trying to imply one thing or another, just that I can't approach this impartially. And in any case, I wish everyone well on both sides of this acquisition. I'm just genuinely curious how they plan to proceed from a technical standpoint, as it's a really interesting challenge.
This appears to be 6 years old. Is it still relevant?
Primarily, the product backend is monolithic PHP (custom framework) + services in various languages + sharded MySQL + Memcached + Gearman. Lots of other technologies are in use too, but I'll defer to current employees if they want to answer.
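For readers unfamiliar with that stack, the sharded-MySQL-plus-Memcached part usually boils down to a cache-aside read path with hash-based shard routing. Here's a toy sketch in Python; the modulo routing, dict-backed "shards", and all names are illustrative stand-ins, not the actual system (real deployments often use a directory service for shard placement rather than plain modulo):

```python
# Toy sketch: cache-aside reads over hash-sharded storage.
# Dicts stand in for MySQL shards and a Memcached client.

NUM_SHARDS = 4
shards = [dict() for _ in range(NUM_SHARDS)]  # pretend MySQL shards
cache = {}                                    # pretend Memcached

def shard_for(user_id: int) -> dict:
    """Route a key to its owning shard by modulo hashing."""
    return shards[user_id % NUM_SHARDS]

def put_user(user_id: int, row: dict) -> None:
    """Write to the owning shard and invalidate the cached copy."""
    shard_for(user_id)[user_id] = row
    cache.pop(f"user:{user_id}", None)

def get_user(user_id: int):
    """Cache-aside read: try the cache, fall back to the shard, backfill."""
    key = f"user:{user_id}"
    if key in cache:
        return cache[key]
    row = shard_for(user_id).get(user_id)
    if row is not None:
        cache[key] = row
    return row

put_user(42, {"name": "ada"})
user = get_user(42)  # miss: hits shard 42 % 4 == 2, then backfills cache
```

Gearman then sits beside this as the async job layer, so slow work never happens on the request path.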
Real-world big data: let's just shard it across MySQL.
My answer above was limited to the product backend, i.e. technologies used in serving user requests in real-time. And even then I missed a bunch of large technologies in use there, especially around search and algorithmic ranking.
I honestly don't see the draw of Kafka. And I do get the pitch, I just don't buy it. Maybe I'm holding it wrong or something.
For high-volume OLTP though MySQL is an excellent choice.
Regarding Kafka: in many situations I agree. Personally I prefer Facebook's approach of just using the MySQL replication stream as the canonical sharded multi-region ordered event stream. But it depends a lot on the situation, i.e. a company's specific use-case, existing infrastructure and ecosystem in general.
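To make the comparison concrete: the "replication stream as event stream" idea means consumers track a position in each shard's ordered log, much like a Kafka consumer tracks a partition offset. A toy sketch of that shape (Python lists stand in for binlogs; a real setup would tail the MySQL binlog via a CDC tool, and all names here are illustrative):

```python
# Toy sketch: per-shard ordered logs consumed by offset, binlog-style.
from collections import defaultdict

shard_logs = defaultdict(list)  # shard_id -> ordered list of row events

def commit(shard_id: int, event: dict) -> None:
    """Each committed write appends one event to that shard's ordered log."""
    shard_logs[shard_id].append(event)

def consume(shard_id: int, offset: int):
    """Return all events past `offset`, plus the new offset to resume from."""
    log = shard_logs[shard_id]
    return log[offset:], len(log)

commit(0, {"op": "insert", "user": 42})
commit(0, {"op": "update", "user": 42})

events, new_offset = consume(0, 0)  # events arrive strictly in commit order
```

The trade-off is that you get ordering and durability "for free" from the database you already run, at the cost of Kafka's ecosystem of connectors and fan-out tooling.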
Kafka is not going to replace MySQL; whether it fits at all depends on the task at hand.
If you can't replace MySQL with Kafka, then why not just stick with whatever queue/jobs/stream infra you had before Kafka? At least those solutions are quite limited in scope and easily replaceable.
At this point Kafka is a solution looking for a problem.
But there are relatively few situations where that's absolutely vital. And you can solve it with good ol' SQL.
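Assuming the use-case in question is something like a work queue, here's a minimal sketch of the "good ol' SQL" approach. It uses SQLite so it runs anywhere; with MySQL 8 or Postgres, concurrent workers would instead claim rows via `SELECT ... FOR UPDATE SKIP LOCKED`. The table and payloads are illustrative:

```python
# Minimal SQL-backed job queue (single-connection SQLite sketch).
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE jobs ("
           "id INTEGER PRIMARY KEY, payload TEXT, claimed INTEGER DEFAULT 0)")
db.execute("INSERT INTO jobs (payload) "
           "VALUES ('resize-image'), ('send-email')")
db.commit()

def claim_next():
    """Claim the oldest unclaimed job, or return None when the queue is empty.

    Safe here because one connection owns the table; multi-worker setups
    need an atomic claim (e.g. SKIP LOCKED) to avoid double-processing.
    """
    row = db.execute("SELECT id, payload FROM jobs "
                     "WHERE claimed = 0 ORDER BY id LIMIT 1").fetchone()
    if row is None:
        return None
    db.execute("UPDATE jobs SET claimed = 1 WHERE id = ?", (row[0],))
    db.commit()
    return row
```

You also get transactions, retries (flip `claimed` back to 0), and ad-hoc introspection with plain `SELECT`s, which is most of what people actually want from a queue.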