
blekko needed 700 servers at launch for our web-scale search engine. We figured an index of a few billion webpages was the minimum needed to give reasonable results.

This is definitely not a one-size-fits-all sort of question!

Hah! Yeah, I imagine this answer is probably outside the scope of what the OP is asking, but I always enjoy hearing about scaling requirements for projects outside of those I've dealt with. Do your requirements (if you don't mind me asking) call for a more disk-, memory-, or CPU-focused infrastructure?

To let our programmers be a bit lazy, we want fat (2-socket) nodes. That allows us to do things like have gigabyte-sized tables in memory on all nodes for instant access.

At that kind of server core count, it turns out that we can keep 8-12 disks busy when crawling and indexing. For serving results, we can keep 2 SSDs busy, and 96 GB of RAM seems to be the sweet spot.

The one thing we don't stress so much is the network. We're very conscious of locality, and we can get away with having 1 gigabit to each node. We use beefy 10 gig switches in the middle to give high bisection bandwidth.
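
For a rough sense of scale, the numbers above can be multiplied out. This is only a back-of-the-envelope sketch: it assumes all 700 nodes carry the 96 GB serving configuration and reads "a few billion" as 3 billion pages, neither of which is stated in the thread. The function and constant names are made up for illustration.

```python
# Back-of-the-envelope cluster totals from the figures quoted in this thread.
# Assumptions (NOT from the thread): every one of the 700 nodes has the
# 96 GB serving configuration, and "a few billion" pages means 3 billion.

NODES = 700                     # servers at launch (from the thread)
RAM_GB_PER_NODE = 96            # quoted sweet spot for serving nodes
LINK_GBIT_PER_NODE = 1          # 1 gigabit link to each node
ASSUMED_PAGES = 3_000_000_000   # hypothetical stand-in for "a few billion"


def cluster_totals(nodes, ram_gb, link_gbit, pages):
    """Multiply per-node figures out to whole-cluster totals."""
    return {
        "total_ram_gb": nodes * ram_gb,        # aggregate memory
        "total_edge_gbit": nodes * link_gbit,  # sum of all node links
        "pages_per_node": pages // nodes,      # index share per node
    }


totals = cluster_totals(NODES, RAM_GB_PER_NODE, LINK_GBIT_PER_NODE, ASSUMED_PAGES)
for name, value in totals.items():
    print(f"{name}: {value:,}")
```

Under those assumptions that works out to roughly 67 TB of RAM cluster-wide, 700 Gbit/s of total edge bandwidth, and a bit over 4 million pages per node, which is consistent with the point that locality matters more than a fat network.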
