

Can sorting be compared to matrix computation? - zeynel

I have a question about two articles that were recently on HN. In one of them,
a physicist says that

"There is no computer on Earth that could possibly store such a big matrix in
its memory"

In the other, Google says that they sorted "1PB (10 trillion 100-byte records)
on 4,000 computers" in six hours.

I know one is talking about matrix computation and the other about sorting. But
is there a way to compare the two, to check whether the physicist's claim is
wrong and Google could actually store that matrix and do the computation?

Thanks

http://www.newscientist.com/article/dn16095-its-confirmed-matter-is-merely-vacuum-fluctuations.html

Virtual quarks make the calculations much more complicated, involving a matrix
of more than 10,000 trillion numbers, says Stephan Dürr of the John von Neumann
Institute for Computing in Jülich, Germany, who led the team.

"There is no computer on Earth that could possibly store such a big matrix in
its memory," Dürr told New Scientist, "so some trickery goes into evaluating
it."

http://googleblog.blogspot.com/2008/11/sorting-1pb-with-mapreduce.html

It took six hours and two minutes to sort 1PB (10 trillion 100-byte records) on
4,000 computers.
======
tsally
These claims don't contradict each other. At no point would Google have held
more than a small fraction of the records in memory. The scientist is not
saying it's impossible to multiply the matrix; he is just pointing out that
you need special algorithms that don't require the entire matrix to be loaded
into memory at once.
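
To make "not loaded into memory at once" concrete, here is a minimal sketch of
a matrix-vector product done one block of rows at a time. It is only an
illustration of the general idea, not a claim about the "trickery" Dürr's team
actually used. The functions `row_block` and `blocked_matvec` and the rule for
the matrix entries are all made up for this example.

```python
def row_block(start, stop, n):
    """Build rows start..stop-1 of a hypothetical n x n matrix on demand.

    The entry rule A[i][j] = 1 / (1 + |i - j|) is invented for illustration;
    in practice the block might be read from disk or recomputed.
    """
    return [[1.0 / (1 + abs(i - j)) for j in range(n)]
            for i in range(start, stop)]


def blocked_matvec(n, x, block_rows):
    """Compute y = A x while holding at most block_rows rows of A at a time."""
    y = []
    for start in range(0, n, block_rows):
        stop = min(start + block_rows, n)
        block = row_block(start, stop, n)  # only this slice is in memory
        y.extend(sum(a * b for a, b in zip(row, x)) for row in block)
    return y
```

Calling `blocked_matvec(6, [1.0] * 6, block_rows=2)` gives the same vector as
multiplying by the full 6 x 6 matrix, but never builds more than two rows of
it at once. Peak memory depends on the block size rather than on the size of
the whole matrix.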

------
cabalamat
Let's work it out. We have 1e16 numbers. If each is 8 bytes, that's 8e16
bytes.

Dell do blade servers with 192 GB RAM. Call it 200 GB, which is 2e11 bytes.

You'd need 8e16/2e11 = 4e5 = 400,000 of them. At £35k each that's 14 billion
quid (though with that big an order, you could no doubt build or get them
cheaper). That is roughly 4 hours of gross world output. So it's doable, but
not on a budget most organisations could afford.
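
The arithmetic above, plus a comparison with the Google sort from the question,
can be checked with a short script. This is a back-of-envelope sketch: the
8 bytes per number, 200 GB per server and £35k price are the assumptions from
this comment, and the variable names are mine.

```python
# Matrix from the New Scientist quote: "more than 10,000 trillion numbers".
numbers = 10_000 * 10**12            # 1e16
bytes_per_number = 8                 # assumption: 8 bytes per number
matrix_bytes = numbers * bytes_per_number   # 8e16 bytes

# Server assumptions from the comment above.
ram_per_server = 200 * 10**9         # 200 GB = 2e11 bytes
price_per_server_gbp = 35_000

servers_needed = matrix_bytes // ram_per_server
total_cost_gbp = servers_needed * price_per_server_gbp

# Google's sort: 10 trillion records of 100 bytes on 4,000 computers.
sort_bytes = 10 * 10**12 * 100       # 1e15 bytes = 1 PB
sort_bytes_per_machine = sort_bytes // 4_000

# How much bigger the matrix is than the data set Google sorted.
matrix_vs_sort = matrix_bytes // sort_bytes

print(f"servers needed:        {servers_needed:,}")
print(f"total cost (GBP):      {total_cost_gbp:,}")
print(f"matrix / sorted data:  {matrix_vs_sort}x")
print(f"sort bytes per machine: {sort_bytes_per_machine:,}")
```

Under these assumptions the matrix is 80 times the size of the data Google
sorted, and the sort itself works out to 250 GB per machine, which is more
than the 200 GB of RAM assumed per server here.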

