This is simple enough when described, but it's not a technique I've seen applied much in practice or discussed in the community. I'm wondering whether it gets reinvented by every project that needs it or whether it's secret sauce known only at YouTube. Regardless, I thought it was pretty insightful.
http://www.stdlib.net/~colmmacc/2009/09/14/period-pain/ for the background material).
The reasons primes are optimal should be obvious :)
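For anyone who doesn't find it obvious, here is a rough sketch of the arithmetic. It assumes the context is periodic jobs that all start at tick 0 and fire at multiples of their period; the function name and the example periods are made up for illustration. Two jobs fire together once per least common multiple of their periods, and distinct primes push that as far out as it can go.

```python
from math import gcd

def coincidences(period_a: int, period_b: int, horizon: int) -> int:
    """Count the ticks in [1, horizon] where both periodic jobs fire together."""
    # Both jobs fire together exactly at multiples of lcm(period_a, period_b).
    lcm = period_a * period_b // gcd(period_a, period_b)
    return horizon // lcm

# One day measured in minutes; the periods are hypothetical examples.
DAY = 24 * 60
round_numbers = coincidences(10, 30, DAY)  # 30 is a multiple of 10
primes = coincidences(11, 29, DAY)         # distinct primes share no factor

print(round_numbers, primes)
```

With nearly the same periods, the "round number" pair collides every 30 minutes while the prime pair collides only every 319 minutes.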
The other way to solve the problem though would be to handle that circumstance cleanly. There are ways to resolve a thundering herd without creating a scalability problem.
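One such way, as a minimal sketch rather than anything YouTube is known to run: let only one caller per key recompute a missing value while the rest wait for its result. `SingleFlightCache` and `loader` are hypothetical names, and expiry is left out to keep it short.

```python
import threading

class SingleFlightCache:
    """Cache where concurrent misses on one key trigger only one recompute."""

    def __init__(self, loader):
        self._loader = loader            # hypothetical expensive function: key -> value
        self._values = {}
        self._key_locks = {}
        self._guard = threading.Lock()   # protects the two dicts above

    def get(self, key):
        with self._guard:
            if key in self._values:
                return self._values[key]
            key_lock = self._key_locks.setdefault(key, threading.Lock())
        # Only one thread per key gets past this lock at a time.
        with key_lock:
            with self._guard:
                if key in self._values:  # filled while we were waiting
                    return self._values[key]
            value = self._loader(key)
            with self._guard:
                self._values[key] = value
            return value
```

The herd still shows up, but it queues on a lock in the frontend instead of all hitting the backend at once.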
Search for "random". Cool spec written by Stuart Cheshire of Apple.
> Vitess - a new project released by YouTube, written in Go, it’s a frontend to MySQL. It does a lot of optimization on the fly, it rewrites queries and acts as a proxy. Currently it serves every YouTube database request. It’s RPC based.
It's not insane, though not terribly relevant in the modern world. The only common technology still using it is NFS.
Also, it helps to have money since this approach requires more boxes. But as I said, it's very reliable.
The fastest, easiest, arguably most reliable way to scale is throwing money at it. And apparently it's easier to hire good people who know 20-year-old tech than 3-year-old tech.
As an aside, this fellow is probably one of the best presenters I've seen in this year's PyCon videos. Confident, smooth, not reading from a computer screen or sheet of paper, clearly smart and in firm command of the subject matter.
I'd love to see more talks from him.
Curious to hear more about that one. If true, I hope they open source it, because that could potentially make MongoDB a lot faster for everyone.
EDIT: It's apparently in their vitess code. Relevant code: http://code.google.com/p/vitess/source/browse/#hg%2Fgo%2Fbso...
I knew the view counter wasn't propagated but the likes were, and I was like: "Damn, this is YouTube, kinda disappointing..."
I guess if both were propagated at the same time I wouldn't mind.
This has to go down in history as one of the best pivot decisions ever made.
The more magical the code is, the harder it is to figure out how it works."
A nice formulation of the kind of advice I keep reading here on HN.
Having never worked with code bases larger than ~50 kloc, I have a lot of trouble understanding what 1 million lines of code are needed for, especially considering that Python is such a high-level language.
Does anyone have any idea why there would be this much code?
It's the world's 3rd biggest website, with hundreds of billions of views and tens of millions of users, so maybe that's why.
I think it's a credit to Python that a website that does that and has grown in a fairly haphazard fashion only has about 1000k SLOCs.
Turns out they're just making that shit up.
Since this was just a field in a database, it involved some simple update code.
The results? More people are interested in anything that they think is popular (or they are curious as to why so many people viewed it).
My client got more actual hits overall on these topics.
> The number of videos has gone up 9 orders of magnitude and the number of developers has only gone up two orders of magnitude.
2 orders of magnitude means at the very least, going from 9 to 100 developers, which is a huge increase, but it could mean way more. I wonder how big the team really is, and what the changing team dynamics are like on that scale at that pace.
For me, YouTube is good for watching videos. If I want to discuss it, I post it on FB.
"Cheating - Know How to Fake Data
Awesome technique. The fastest function call is the one that doesn’t happen. When you have a monotonically increasing counter, like movie view counts or profile view counts, you could do a transaction every update. Or you could do a transaction every once in a while and update by a random amount, and as long as it changes from odd to even people would probably believe it’s real. Know how to fake data."
So all those people who buy views are kinda screwed now :-) I suspect this is a bad example. I HOPE this is a bad example, if only for the KONY2012 campaign :P
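For the curious, the trick quoted above could look roughly like this. It is a sketch under my own assumptions, not YouTube's actual code: `SampledCounter`, `sample_rate` and the dict standing in for the database are all hypothetical. Most hits write nothing, and the occasional write adds a random amount that averages out to the number of hits that were skipped.

```python
import random

class SampledCounter:
    """View counter that writes to the backing store only on a sampled subset of hits."""

    def __init__(self, store, sample_rate=0.01, rng=None):
        self.store = store              # hypothetical dict standing in for the database
        self.sample_rate = sample_rate
        self.rng = rng or random.Random()
        self.writes = 0                 # how many real updates were issued

    def hit(self, key):
        # Most hits do nothing at all: the cheapest write is the one that never happens.
        if self.rng.random() >= self.sample_rate:
            return
        # Add a random amount that averages 1 / sample_rate, so the stored value
        # tracks the true count on average without moving in suspiciously even steps.
        mean = 1 / self.sample_rate
        bump = self.rng.randint(1, int(2 * mean) - 1)
        self.store[key] = self.store.get(key, 0) + bump
        self.writes += 1
```

With `sample_rate=0.01` you issue roughly one update per hundred views, and the displayed number stays close to the real one in the long run, though any individual reading is only an estimate.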