> but count_ones stays a bottleneck; I'm pretty sure you can't do that in O(1) even on a word RAM machine
It's actually easy to do (log n)-bit operations in O(1) time on the kind of machine you are talking about too. Just make a table with the result for all ((lg n)/2)-bit inputs. Such a table only takes sqrt(n) memory and time to create, and since you seem to accept O(1) table lookups, you now have count_ones in two lookups.

Similarly, it's easy to make small tables that allow you to do arbitrary binary operations on ((lg n)/4)-bit strings.
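Concretely, here's a minimal sketch of the trick (my illustration, assuming n = 2^32, so a word is 32 bits and the table covers all 16-bit halves):

```rust
// Half-word table trick: precompute popcounts for every 16-bit value
// (2^16 = sqrt(n) entries when n = 2^32), then count_ones on a full
// 32-bit word is just two lookups.

/// Build the table in O(sqrt(n)) time using the recurrence
/// popcount(i) = popcount(i >> 1) + (i & 1).
fn build_table() -> Vec<u8> {
    let mut table = vec![0u8; 1 << 16];
    for i in 1..(1usize << 16) {
        table[i] = table[i >> 1] + (i & 1) as u8;
    }
    table
}

/// count_ones on a 32-bit word via two table lookups.
fn count_ones_lookup(x: u32, table: &[u8]) -> u32 {
    (table[(x & 0xFFFF) as usize] + table[(x >> 16) as usize]) as u32
}

fn main() {
    let table = build_table();
    assert_eq!(count_ones_lookup(0b1011_0110, &table), 5);
    assert_eq!(count_ones_lookup(u32::MAX, &table), 32);
}
```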
> at least definitely not what most people have in mind when talking about computational complexity.
I'm pretty sure if you look in CLRS or any standard algorithms textbook, they allow (lg n)-bit operations in constant time. Ever heard people say "Sorting takes O(n log n) time"? They are clearly assuming that comparing two numbers can be done in constant time.
Also look at any lecture notes from actual CS researchers, like this: http://www.cs.cmu.edu/~odonnell/toolkit13/Lecture05.pdf

> Doesn’t that imply that we need w ≥ log n?

> Answer: Yes! And that’s a standard assumption! The first time you see this it may seem totally weird: you’re assuming the computer hardware size depends on the input size?! That seems to make no sense. But once you calm down, it’s actually quite logical and cool. Of course you want a single pointer to fit into a word. You should just think of w as a parameter of the model, and w ≥ log n as an assumption.
> Every heard people saying "Sorting takes O(n logn) time"? They are clearly assuming comparing two numbers can be done in constant time.
Where n is the number of items. Not the number of digits! And we don't normally talk about "time" when discussing sorting algorithms in the first place. We talk about number of comparisons (precisely because we have no clue how big the items are!).
> Just make a table with the result of all (lg n)/2 bit inputs
We're interested in the number of ones in a C-bit integer, where C is the size of the character set (and incidentally the maximum acceptable value of the window size W). Simply counting the ones bit by bit costs O(C) time. Making a lookup table for all (C/2)-bit numbers - as you seem to suggest - costs O(2^(C/2)) time and space. And the algorithm we wanted to improve runs in O(N * C) time.
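For reference, the bit-by-bit count under discussion looks something like this (a sketch under my own assumptions: the C-bit value is stored as a slice of 64-bit words):

```rust
// Naive popcount of a C-bit value held in 64-bit words: O(C) time,
// because every bit is inspected individually.
fn count_ones_bit_by_bit(words: &[u64]) -> u32 {
    let mut ones = 0;
    for &w in words {
        for i in 0..64 {
            ones += ((w >> i) & 1) as u32;
        }
    }
    ones
}

fn main() {
    // A 128-bit value (C = 128) as two words.
    assert_eq!(count_ones_bit_by_bit(&[0b1011, u64::MAX]), 3 + 64);
}
```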
Look, I'm trying to decipher what your point is, but you seem to be confusing the size of the input space with the size of the input. So AFAIC, this is as much attention as this algorithm deserves. It's a nice party trick, but nothing more.
> And we don't normally talk about "time" when discussing sorting algorithms
Sure, you might be a purist and talk about comparisons, but I'm pretty sure if you Google it, most people would talk about time. And n log n is exactly the amount of time it takes to sort n word-length integers in word RAM.
> We're interested in the number of ones in a C-bit integer
You're not really engaging with the point that you can build a table in sqrt(n) time that allows counting the bits in (log n)-bit strings in constant time. That means the final algorithm runs in O(NC / log N) time.
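A sketch of how that lands in practice (my illustration; the mask layout and word size are assumptions): store the C-bit mask as C/64 machine words and count each word's ones in O(1), via the table above or the hardware instruction, so the whole mask costs O(C / log N) rather than O(C) per window update:

```rust
// C-bit mask packed into 64-bit words; each word is counted in O(1)
// (hardware popcount here; in the pure word-RAM model the lookup table
// above serves the same role), so the total is O(C / w) = O(C / log N).
fn count_ones_packed(words: &[u64]) -> u32 {
    words.iter().map(|w| w.count_ones()).sum()
}

fn main() {
    // A 256-bit mask (C = 256) as four words.
    let mask = [u64::MAX, 0b1011, 0, 1 << 63];
    assert_eq!(count_ones_packed(&mask), 64 + 3 + 1);
}
```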
This is definitely not a party trick, as you would know if you'd ever written a program using these methods. Now, with AVX instructions, it's more important than ever.

You also didn't engage with O'Donnell's lecture notes. I can find you many more textbooks explaining why this is the right assumption to make, in case you don't like the reasons I've given above.