Hacker News | williamkuszmaul's comments

Overall it seems like a great book. The hashing chapter is a bit half-baked, though. It claims without reservation that deletions simply cannot be implemented efficiently with linear probing. But there are at least two efficient ways to do this (lazy deletions, or just fixing up the hash table), and both are worthwhile for students to know.
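
As a rough illustration of the "fix up the hash table" option, here is a minimal C++ sketch of backward-shift deletion for a linear-probing table of non-negative int keys (the struct and names are mine, not from the book; it assumes the table always keeps at least one empty slot):

    #include <cstddef>
    #include <vector>

    struct LinearProbeTable {
        std::vector<int> slot;                       // -1 marks an empty slot
        explicit LinearProbeTable(std::size_t cap) : slot(cap, -1) {}

        std::size_t home(int key) const { return std::size_t(key) % slot.size(); }

        void erase(int key) {
            const std::size_t n = slot.size();
            std::size_t i = home(key);
            while (slot[i] != key) {                 // probe for the key
                if (slot[i] == -1) return;           // not present
                i = (i + 1) % n;
            }
            // "Fix up" the run that follows: pull later entries whose probe
            // sequence passes through i back into the hole, so that lookups
            // never stop early at a spurious empty slot.
            std::size_t j = i;
            for (;;) {
                j = (j + 1) % n;
                if (slot[j] == -1) break;            // end of the probe run
                std::size_t h = home(slot[j]);
                // slot[j] may fill the hole at i only if its home position h
                // is NOT cyclically in the interval (i, j].
                bool h_in_range = (i < j) ? (i < h && h <= j) : (i < h || h <= j);
                if (!h_in_range) { slot[i] = slot[j]; i = j; }
            }
            slot[i] = -1;                            // the final hole becomes empty
        }
    };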


1/100 is too large of a cutoff imo. If you have a class of 96 students, there's a decent chance that an innocent student gets flagged for no reason. I hope he lets the student on the bubble off the hook.
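
(Under the simplifying assumption that flags of innocent students are independent, each with probability up to 1/100, the chance that at least one of 96 innocent students gets flagged can be as high as 1 - 0.99^96, which is about 62%.)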


One thing I'm confused about: Did the author try vectorizing the linear search implementation? (Of course, it is possible that even if they did not, the compiler did.) I would imagine that vectorization is behind the advice to use linear searches on small arrays.


> I would imagine that vectorization is behind the advice to use linear searches on small arrays.

I always thought a core part of it is that searching linearly is simply very cache efficient, due to easy prefetching of linear data, and that this is just faster when you don't have to search through too much. Of course, that and vectorisation together is even better.
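
For what it's worth, here is a minimal sketch (mine, not from the article) of the kind of branch-free linear search that compilers can typically auto-vectorize, assuming a sorted array that is already in cache:

    #include <cstddef>

    // Index of the first element >= key in a sorted array of n ints.
    std::size_t lower_bound_linear(const int* a, std::size_t n, int key) {
        std::size_t count = 0;
        for (std::size_t i = 0; i < n; ++i)
            count += (a[i] < key);   // no early exit: compiles to SIMD compares
        return count;
    }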


The benchmarking code is available, and no, I did not vectorize it: https://github.com/orlp/bitwise-binary-search. The compiler also did not appear to have autovectorized it.

That said, I would expect the overhead of vectorization to dominate at smaller sizes, and the fact that linear search is... well... linear to dominate at larger sizes. There might be a sweet spot for a few dozen elements where it would be the best, but I think it'd be rather niche (and strictly limited to floats/integers).

Regarding cache effects of linear search: this entire article is written under the assumption that your data is in cache; if not, your results will vary (wildly).


Related recent paper in Science: https://www.science.org/doi/10.1126/science.aam9744


One important caveat: it is widely believed that randomization does make a big difference for data structure problems. For example, hash tables (which use random hash functions) take O(1) time per operation, but it is conjectured that no deterministic data structure can match this.
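
To make "random hash functions" concrete: the randomness is in the choice of the function's parameters when the table is built, not in its behaviour on each call. A minimal sketch (multiply-shift hashing; the struct here is illustrative, not any particular library's API):

    #include <cstdint>
    #include <random>

    // Deterministic once constructed, but the multiplier is chosen at random,
    // which is what the O(1) analysis relies on. Assumes 1 <= log_buckets <= 63.
    struct MultiplyShiftHash {
        std::uint64_t a;     // random odd multiplier, fixed for the table's lifetime
        unsigned shift;      // 64 - log2(number of buckets)

        MultiplyShiftHash(unsigned log_buckets, std::mt19937_64& rng)
            : a(rng() | 1), shift(64 - log_buckets) {}

        std::uint64_t operator()(std::uint64_t key) const {
            return (a * key) >> shift;   // bucket index in [0, 2^log_buckets)
        }
    };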

There are also problems in online algorithms where randomization provably changes what is possible, for example the "cup game problem." (https://arxiv.org/abs/1904.02861)

Finally, although randomized primality testing (1970s) is often credited as the start of randomized algorithms as a field, it's worth noting that hash tables (1954) and quick sort (1961) were much earlier.


Are hash functions considered random, given that they are deterministic in their inputs and outputs? With randomised primality testing, you could use a true random number (based on e.g. nuclear decay) every time and get the same correctness guarantees. With a hash table, you could never retrieve anything if you always used a true random number as the key every time.


I think it would be fair to say that it's a kind of funny trie-hash-table hybrid. What's neat though is that it manages to achieve better space bounds than either a trie or a hash table would on their own.

I didn't invent the algorithmic idea --- it's basically a simplification of a data structure that was published in ICALP 2003 [1]. As far as I know, the idea of combining hash tables and tries in this kind of way (despite being incredibly simple!) doesn't appear in either the theoretical or experimental literature prior to that.

I'd be interested in any other earlier sources that people might have.

[1] Raman, Rao. Succinct Dynamic Dictionaries and Trees. ICALP 2003. https://link.springer.com/chapter/10.1007/3-540-45061-0_30


Unless I'm misreading, the question as stated in the blog post never says there is only one duplicate (there might be many!), so in that sense I think his answer may be wrong. A more robust solution is to keep an array of n counters and count how many times each item appears.
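
A minimal sketch of that counting approach (assuming all values lie in 1..n):

    #include <vector>

    // Returns every value in 1..n that appears more than once in a.
    std::vector<int> find_duplicates(const std::vector<int>& a, int n) {
        std::vector<int> count(n + 1, 0);    // count[v] for v in 1..n
        for (int v : a) ++count[v];
        std::vector<int> dups;
        for (int v = 1; v <= n; ++v)
            if (count[v] > 1) dups.push_back(v);
        return dups;
    }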


> when given an array of length n + 1 containing integers 1 through n, find the duplicate integer in an array.

I do believe it is only one.


The length is n+1 and contains all ints 1 through n, so there can only be a single duplicate.


But does the question ever say "all"...?


Using 64 bit integers, we can store the square of any 32 bit integer. Not that small...
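
For reference, the simplest arithmetic trick for the single-duplicate version is the plain sum difference; a minimal sketch (not necessarily the blog's answer), where a 64-bit accumulator is exactly what keeps the running sum from overflowing:

    #include <cstdint>
    #include <vector>

    // a has length n+1 and contains all of 1..n plus exactly one duplicate.
    std::int64_t find_single_duplicate(const std::vector<std::int32_t>& a) {
        std::int64_t n = static_cast<std::int64_t>(a.size()) - 1;
        std::int64_t sum = 0;
        for (std::int32_t v : a) sum += v;    // fits in 64 bits even for n near 2^31
        return sum - n * (n + 1) / 2;         // subtract the expected sum of 1..n
    }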


Also, it was 2004 and 64-bit machines weren't common. If I remember correctly, you had only 2 GB available under Windows XP, and if you wanted more (3 GB, to be precise) you had to boot with a special parameter.


32-bit machines generally support 64-bit arithmetic of some sort; for example, "long long" in C is mandated by the standard to be at least 64 bits wide.


Yeah, 64-bit integers were available under gcc on Linux, but the memory issue remained. The array had to fit in 2 GB, unless of course the problem stated otherwise, for example that the numbers were read from a file.


Impressive!

There are already implementations of sample sorting that are much faster than c++ sort (but I don't recall how much faster). I'd be very interested in a comparison to some of those...

Also, since we are sorting integers, I'd be interested to know how fast a modern implementation of radix sort can be made.


:) Are you thinking of ips4o (sample sort)? That's about 2-3x as fast as std::sort. There's also a pending patch to std::sort that replaces it with BlockQuicksort, which would narrow that gap.

We integrate vqsort into ips4o, calling it once subarrays fit in L2. That speeds up ips4o by 2.9x (single thread) and 1.6x (parallel; bandwidth-limited). More details in the paper.
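
A minimal sketch of that overall pattern (illustrative only, not the actual ips4o/vqsort code): recurse with a partition-based sort and hand subarrays that fit in cache to a base sorter, with std::sort standing in for vqsort:

    #include <algorithm>
    #include <cstddef>
    #include <vector>

    constexpr std::size_t kL2Bytes = 1u << 20;  // assumed L2 size; tune per CPU

    void hybrid_sort(std::vector<int>& v, std::size_t lo, std::size_t hi) {
        if (hi - lo <= 1) return;
        if ((hi - lo) * sizeof(int) <= kL2Bytes) {
            std::sort(v.begin() + lo, v.begin() + hi);  // stand-in for vqsort
            return;
        }
        std::size_t m = lo + (hi - lo) / 2;
        // Median-split partition: v[m] ends up in its final position, with
        // smaller elements to its left and larger ones to its right.
        std::nth_element(v.begin() + lo, v.begin() + m, v.begin() + hi);
        hybrid_sort(v, lo, m);
        hybrid_sort(v, m + 1, hi);
    }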

I previously also worked on radix sort: https://arxiv.org/abs/1008.2849 Interestingly, that 12 year old result on dual-socket/8 cores is only about twice as fast as a single AVX-512 core today.


If the coin were unbiased, we could compute the exact probability of getting 10231 or more heads with 20000 flips as:

"sum (20000 choose x)/2^20000 for x from 10231 to 20000",

which Wolfram Alpha evaluates to 0.00056.

The probability of getting a number of heads that differs from 10000 by at least 231 is twice that, so about 0.001.

So, in fact, the probability of this happening by dumb luck is about 1/1000. That's pretty strong evidence.
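
For anyone who wants to check the 0.00056 figure without Wolfram Alpha, here is a small C++ sketch that sums the exact binomial tail in log space:

    #include <cmath>
    #include <cstdio>

    int main() {
        const int n = 20000, k = 10231;
        const double log2n = n * std::log(2.0);   // log of 2^20000
        double tail = 0.0;
        for (int x = k; x <= n; ++x) {
            // log of (n choose x) / 2^n, computed via log-gamma to avoid overflow
            double log_term = std::lgamma(n + 1.0) - std::lgamma(x + 1.0)
                            - std::lgamma(n - x + 1.0) - log2n;
            tail += std::exp(log_term);
        }
        std::printf("P[X >= %d] = %.5f\n", k, tail);   // about 0.00056
        return 0;
    }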

