The correct statement is that you can replace 5 with any larger number and the algorithm still works, but 5 is the least number you can pick here. That's the most interesting part of the algorithm to me - that "5" is connected to median-finding in the same way that "2" is connected to divide in conquer searching in a sorted array. But 2 is more obviously interesting than 5 (you can say "dividing something in 2 makes it smaller", but there aren't very many similar things you can say about 5).
I am not saying 3 is a good number in a pragmatic sense, just that eyeballing the informal proof doesn't seem to yield any obvious problems with using groups of 3 for the desired result.
T(n) = T(n/3) + 4n/6 + O(n) __
It seems as if this doesn't workout to O(n) whereas if the middle term is 7n/10 (parition size = 5), it does work out.
I've used these codes on some very large datasets, and they work very well.
Of course, this requires 0 comparisons (but you might want to to a insertion sort for the final round), and a maximum of 4 rounds for 32 bits numbers using 8 bits buckets.
Finding the median strikes me as more of a academic problem than a real life problem. However, partial sort (for which finding the median is a sub problem) is an easy optimization relevant for search engines, where you sometimes only need to extract the hits from position 1000-2000.
The median of medians is not necessarily the median. E.g. take 1 2 5, 3 4 6, 7 8 9: medians are 2, 4, 8. Median of medians is 4 but overall median is 5.
Found an answer: