
Did that stack include daily traffic analytics hardware?



See https://nickcraver.com/blog/2016/02/17/stack-overflow-the-ar... for an overview of their stack.

They do NOT need a stack of daily traffic analytics hardware. As I said above, RPCs take an order of magnitude more resources than function calls. If you are used to doing analytics with various distributed tools, you are used to requiring an order of magnitude more hardware than a simpler solution would. And if there are inefficiencies in your solution, it is easy for a second order of magnitude to sneak in. Therefore your experience is going to seriously mislead you about the likely limits of a simpler solution.

Making this concrete, if your job requires a Hadoop cluster of fewer than 50 machines, odds are good that it could run just fine on a single machine with a different technology choice. But if you are fully utilizing hundreds of machines in your cluster, then you actually need to be distributed. (Though you could achieve considerable savings if some of your jobs could be run in a more efficient architecture. Amazon in particular makes excellent use of this kind of hybrid architecture in places.)

Google is making excellent decisions for the scale that they operate on. You just are unlikely to operate at a scale that makes their decisions make much sense for you.


What the hell do you need traffic analytics for, beyond parsing access logs? Current ad-tech/analytics is a mixture of outright fraud, fraud sold as "AI", fraud sold as "big data", middlemen claiming to solve said fraud, and privacy invasions on a scale that would make the Stasi drool.

Stack Overflow doesn't need this kind of shit and frankly, no one else does. The GDPR was a first step in the right direction, and I'm seriously hoping coronavirus has at least one positive side and eliminates those fraudsters who survived the GDPR wave.
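
For anyone wondering what "parsing access logs" looks like in practice, here is a rough Python sketch. It assumes the common "combined" log format that nginx and Apache write by default; the regex, the `summarize` helper and the sample lines are made up for illustration and are not anyone's production setup.

```python
import re
from collections import Counter

# Assumed input: "combined" access log lines. Only the leading fields are
# parsed; referrer and user agent are ignored in this sketch.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)


def summarize(lines):
    """Count hits per path and per status code, skipping unparseable lines."""
    paths, statuses = Counter(), Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m is None:
            continue  # not a log line in the assumed format
        paths[m.group("path")] += 1
        statuses[m.group("status")] += 1
    return paths, statuses


# Hypothetical sample data (documentation-range IP addresses).
SAMPLE = [
    '203.0.113.5 - - [10/Oct/2020:13:55:36 +0000] "GET /questions/1 HTTP/1.1" 200 2326 "-" "curl/7.64"',
    '203.0.113.9 - - [10/Oct/2020:13:55:37 +0000] "GET /questions/1 HTTP/1.1" 200 2326 "-" "curl/7.64"',
    '203.0.113.5 - - [10/Oct/2020:13:55:38 +0000] "GET /missing HTTP/1.1" 404 153 "-" "curl/7.64"',
    'garbage line',
]

if __name__ == "__main__":
    paths, statuses = summarize(SAMPLE)
    print(paths.most_common(5))
    print(statuses)
```

Since `summarize` accepts any iterable of lines, you can pass an open file object and stream a large log through it without loading it into memory.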


Stack Overflow logs every request into SQL Server, which uses columnstore tables to handle all their analytics. It's fast, efficient and more than enough.
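
To give a feel for the "log every request into a table, then aggregate with SQL" approach, here is a small Python sketch. It uses the standard library's `sqlite3` purely as a stand-in: SQLite is a row store, not SQL Server with columnstore, and the `request_log` table, its columns and the helper functions are hypothetical, not Stack Overflow's actual schema.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical request log table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE request_log (
            ts          TEXT    NOT NULL,  -- ISO 8601 timestamp
            path        TEXT    NOT NULL,
            status      INTEGER NOT NULL,
            duration_ms INTEGER NOT NULL
        )
        """
    )
    return conn


def log_request(conn, ts, path, status, duration_ms):
    """Append one request to the log table."""
    conn.execute(
        "INSERT INTO request_log VALUES (?, ?, ?, ?)",
        (ts, path, status, duration_ms),
    )


def daily_traffic(conn):
    """Return (day, hits, average duration in ms) per calendar day."""
    return conn.execute(
        """
        SELECT substr(ts, 1, 10), COUNT(*), AVG(duration_ms)
        FROM request_log
        GROUP BY substr(ts, 1, 10)
        ORDER BY substr(ts, 1, 10)
        """
    ).fetchall()


if __name__ == "__main__":
    conn = make_db()
    log_request(conn, "2020-03-01T10:00:00", "/questions/1", 200, 10)
    log_request(conn, "2020-03-01T10:00:05", "/questions/2", 200, 20)
    log_request(conn, "2020-03-02T09:30:00", "/missing", 404, 5)
    for row in daily_traffic(conn):
        print(row)
```

The point is only that daily traffic numbers fall out of one `GROUP BY` over a single table. A columnar engine makes that kind of scan cheap at much larger row counts.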



