By saying it can be done with $12,000 of hardware, you essentially claim it is trivial.
> everything is so perfectly shardable as it is in Twitter.
Shardability, or partitionability, depends on the underlying distribution. It's only "perfectly" partitionable if that distribution is perfectly uniform.
Twitter's traffic patterns follow power laws. Some tags vastly outnumber others. Some users have vastly more followers. And those distributions shift across time and space in surprisingly short order.
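To make that concrete, here's a toy simulation (invented numbers, nothing Twitter-specific): hash-shard traffic whose per-user volume follows a Zipf-like power law, and the busiest shard ends up several times hotter than the quietest one.

```python
# Toy illustration only: naive shard-by-uid under a Zipf-like traffic
# distribution, comparing the hottest shard to the coldest.
import random
from collections import Counter

random.seed(42)
NUM_SHARDS = 16
NUM_EVENTS = 1_000_000

users = list(range(10_000))
weights = [1.0 / (i + 1) for i in users]   # user i gets ~1/(i+1) of the traffic

load = Counter({s: 0 for s in range(NUM_SHARDS)})
for uid in random.choices(users, weights=weights, k=NUM_EVENTS):
    load[uid % NUM_SHARDS] += 1            # naive shard-by-uid

hottest, coldest = max(load.values()), min(load.values())
print(f"hottest shard: {hottest}, coldest shard: {coldest}, "
      f"imbalance: {hottest / coldest:.1f}x")
```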
> Twitter had the resources to do it better from very early on, and they didn't (I think it was 2012 or so before Twitter became reliable).
Twitter wasn't interested in a contest to use the least hardware. They were in a race to keep up with surging load while migrating from Rails to something that didn't exist yet. Along the way, they built infrastructure under the hood that couldn't be bought off the shelf from anyone at the time.
We can't actually come to a conclusion here, because you're arguing a counterfactual. It's always easy to have the ideal solution to a problem you never actually solved yourself, because the visible features of a problem are obvious and the many, many invisible problems are where the bulk of the effort is hiding.
If you want to prove that Twitter doesn't need such large and complex distributed systems, you can probably round up a few hundred thousand in angel funding and sell yourself to Twitter for a nice return.
> By saying it can be done with $12,000 of hardware, you essentially claim it is trivial.
Not at all. What I'm saying is that, with the proper software (which is not trivial to write), you can do it with very little hardware. I know stunt programmers making $500K/year, and they are in some senses infinitely (not just 10x) more productive than bad or even average programmers - because they quickly produce working systems that others just can't.
Whether it makes sense to pay $50K/hardware and $500K/programmer or $2M/hardware and $100K/programmer depends on how you run your business, though - and in many cases, $2M/hardware + $100K/programmers is the more economical choice (because you risk starving for stunt programmers if you choose the first).
> Some users have vastly more followers. Those distributions are unstable across time and space in surprisingly short order.
And yet, as a system designer you actually get to choose what the distribution is of - and a good choice makes it uniform. The power laws may or may not favor sharding on uid specifically, but usually there's a simple way to shard uniformly. And if you can't find a way to shard uniformly programmatically, then shard using a lookup table on the user id - 8 billion user ids require all of 8 GB of RAM if you have 256 shards or fewer (and if you bundle users into groups of 256, 32 MB is suddenly enough). Migrate buckets around to keep things balanced. This has been a solved problem for years - really, even the migration patterns. Look at the "consistent hashing" literature (this isn't the fundamental issue that consistent hashing solves, but the peripheral solutions are well known, common, and apply here: migration, redistribution, redirection).
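A minimal sketch of what I mean by the lookup table (all names and numbers are mine, purely illustrative, not anything Twitter actually runs): bundle users into buckets of 256, keep a bucket-to-shard byte array, and rebalance by repointing whole buckets.

```python
# Illustrative sketch: bucketed lookup-table sharding with cheap rebalancing.
from array import array

BUCKET_SIZE = 256
NUM_BUCKETS = 8_000_000_000 // BUCKET_SIZE   # ~31M buckets for 8B user ids
NUM_SHARDS = 64                              # anything up to 256 fits in one byte

# One byte per bucket: roughly 32 MB for the entire user base.
bucket_to_shard = array('B', (b % NUM_SHARDS for b in range(NUM_BUCKETS)))

def shard_for_user(uid: int) -> int:
    """Route a user id to its shard through the bucket table."""
    return bucket_to_shard[(uid // BUCKET_SIZE) % NUM_BUCKETS]

def migrate_bucket(bucket: int, new_shard: int) -> None:
    """Rebalance by repointing one bucket; copying the bucket's data to the
    new shard before flipping this entry is where the real work lives."""
    bucket_to_shard[bucket] = new_shard
```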
> Twitter wasn't interested in a contest to use the least hardware.
> We can't actually come to a conclusion here, because you're arguing a counterfactual. It's always easy to have the ideal solution to a problem you never actually solved yourself,
First, I basically agree with you, if that wasn't clear. I know not what problems Twitter was facing. I suspect they weren't technical in nature, though -- the technical issues had been solved before them. It might have been management issues, it might have been ego issues. I've seen more projects fail or stumble on those than on technical issues.
I actually did solve those problems myself, on a smaller scale (hence my interest - but with perfect sharding that would have scaled to any size, live migration and redistribution and all). But that startup folded because we sucked at getting traction - which only goes to show you that technical prowess matters not in these issues, or at least not much.
> many invisible problems are where the bulk of the effort are hiding.
Again, to be clear - I totally agree with you. I'm just disagreeing with the aura of engineering excellence that Twitter gets in this thread (and many others). They may have it, or may not - I haven't seen evidence that they do. They have kept Twitter running well for the last 3-4 years, which means they are reasonably competent. That's all I have evidence for.
> If you want to prove that Twitter doesn't need such large and complex distributed systems, you can probably round up a few hundred thousand in angel funding and sell yourself to Twitter for a nice return.
As I have mentioned several times in other replies (and other discussions) - Twitter's problem is not, in fact, engineering, and hasn't been since 2012 at least (it definitely was in 2007-2009). They threw enough money/people at the problem, and solved it.
Right now, they are bringing in $2B/year but spending $2.5B/year. If they are spending more than $200M/year on user-facing hardware at this point, I'd be surprised; I'd even be surprised if they are spending $100M/year. But let's assume I could save them $200M - that's nice, but it won't actually save them; the company needs much more significant changes than that. And that part of the infrastructure only brings in users. I know not what systems they use to actually bring in money - and those are much more important to optimize.
And ... what I'm doing now is more profitable than what I can likely get from such a project (and I don't have to gamble or raise money). So, thanks, but I'll pass.
Let me ask you this, though: look at the healthcare.gov debacle. I assume our discussion would have been essentially the same (including the counterfactuals) up until the point where a team rewrote it in a fraction of the time, with a fraction of the resources, and much, much better - and I guess at that point we would be able to agree on the facts.
Would you consider the discussion futile in either case? I don't care about agreement (factual or counterfactual); I'm trying to learn about the problems that are invisible to me. With Twitter, I've learned none so far, over 8 years of this (more or less twice-annual) discussion.
Again, as I've mentioned several times: whether or not they now run their infrastructure efficiently is very unlikely to matter -- unless they are horribly incompetent, which I assume they are not. I'm just trying to understand the aura of excellence (or, alternatively, the depth of the problems) that Twitter deals with technically.