At that kind of server core count, it turns out that we can keep 8-12 disks busy, when crawling and indexing. For serving results, it turns out that we can keep 2 ssds busy, and, 96 gigs of ram seems to be the sweet spot.
The one thing we don't stress so much is the network. We're very conscious of locality, and we can get away with having 1 gigabit to all the nodes. We use beefy 10 gig switches in the middle to give high bisection.