How many queries per second could they handle on those nodes, and at what latency? What kind of relevancy calculations were they able to do at query time with 1B documents per node? Were they able to support query-time aggregation of structured fields in their documents? Was the index static, or did they support continuous feeding and indexing of new documents? If the latter, how well did they meet their QPS and latency SLAs while indexing new documents?
I can set up a single search node and fill it up with God knows how many documents any day, but the difference between supporting 10 QPS at ~500ms latency and 3000 QPS with the 99th percentile below 40ms is far more interesting than exactly how many documents I have per node.
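
For anyone who wants to report those numbers for their own setup, here is a rough sketch of turning raw per-query timings into QPS and tail latency. It is not taken from any particular search engine: the `percentile` and `summarize` helpers and the sample numbers are made up for illustration, and it uses the simple nearest-rank definition of a percentile.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def summarize(latencies_ms, wall_clock_s):
    """Summarize one benchmark run.

    latencies_ms: one latency per completed query, in milliseconds.
    wall_clock_s: total duration of the run, in seconds.
    """
    return {
        "qps": len(latencies_ms) / wall_clock_s,
        "p50_ms": percentile(latencies_ms, 50),
        "p99_ms": percentile(latencies_ms, 99),
    }


# Hypothetical run: 100 queries finished in 2 seconds,
# with latencies of 1 ms, 2 ms, ..., 100 ms.
report = summarize(list(range(1, 101)), wall_clock_s=2.0)
```

The point of reporting the 99th percentile next to QPS is that an average hides the slow tail, and the tail is usually what an SLA is written against. To answer the question about feeding, the same summary would have to be collected while documents are being indexed, not just against a quiet index.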