
Revisiting the Collatz Sequence – Laurent Rosenfeld - lizmat
http://blogs.perl.org/users/laurent_r/2020/04/revisiting-the-collatz-sequence-pwc-54.html
======
eesmith
FWIW, the following Python code has about the same run-time. (The raw numbers
are a bit faster on my machine than the reported numbers, but that means
little given the unknown difference between the machines.)

    
    
        # http://blogs.perl.org/users/laurent_r/2020/04/revisiting-the-collatz-sequence-pwc-54.html
    
        # If the previous term is even, the next term is one half of the
        # previous term. If the previous term is odd, the next term is 3 times
        # the previous term plus 1
    
        # Calculate the sequence length for all starting numbers up to 1000000
        # and output the starting number and sequence length for the longest
        # 20 sequences.
    
        import heapq
    
        def main():
            MAX =  1_000_000 #  0.78 seconds
            #MAX = 10_000_000 #  8.2 seconds
    
            arr = [0] * MAX # cache Collatz length for a given number. 0 = not computed
            arr[0] = -1
            arr[1] = 1
    
            for i in range(2, MAX):
                if arr[i]:   # Already computed. Go to the next one.
                    continue
    
                values = [i] # Keep track of the missing terms
                while True:
                    if i % 2:
                        i = i*3+1
                    else:
                        i //= 2
                    if i < MAX and arr[i]:
                        # Found a hit. Back-fill the missing parts.
                        n = arr[i] + 1
                        values.reverse()
                        for val in values:
                            if val < MAX:
                                arr[val] = n
                            n += 1
                        break
                    values.append(i) # Extend the sequence
    
            # Report the top 20
            for i in heapq.nlargest(20, range(MAX), key=arr.__getitem__):
                print(i, arr[i])
    
        if __name__ == "__main__":
            import time
            t1 = time.time()
            main()
            t2 = time.time()
            print(f"{t2-t1:.2} seconds")

~~~
raiph
Edit: I just realized tio.run is running raku from 2018, which is a lot slower
than current raku, whereas it's running a more recent python. I could well
believe the sequential raku numbers are much closer to the python numbers, and
the parallel version could perhaps be an order of magnitude faster.

Fwiw, I ran the above on tio.run. It took 0.9s.

I also ran the perl code in the OP in the section headed "Using an Array
Instead of a Hash". That took about twice as long.

The changes in the section headed "Further Optimizations" supposedly deliver
about a 30% improvement. So, closer to your Python code -- but still
noticeably slower.

The changes in the section headed "Combining Caching and Parallel Execution"
then sped things up a further 15x on 32 cores -- more than an order of
magnitude faster.

(I presume it would be fairly easy to parallelize the python code too, and it
might be implemented just as efficiently, but the proof, as they say, is in
the pudding...)

In the meantime the sequential raku code is way off, something like 5x as slow
as the perl code I tested on tio.run, so about 10x slower than the python code
you wrote.

Then again Liz made some mods to the raku code that she says doubled its
speed.[1]

And it's typically quite easy to parallelize raku code, so presumably it could
relatively easily be made to outperform your python code on hardware with
enough cores -- at a guess, around 10 or so.

[1]
https://www.reddit.com/r/rakulang/comments/g2au9v/revisiting_the_collatz_sequence_laurent_rosenfeld/

~~~
eesmith
Thanks for the followup!

I last seriously used perl in the 1990s and wasn't really able to follow the
code shown in the link. Do you know whether they use the same algorithm?

It might be that the performance difference is more due to my coming up with a
more efficient implementation.
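For what it's worth, the caching idea in the code above can also be written as a recursive function with functools.lru_cache (my sketch, not code from either post). The back-fill array is typically faster, but this form makes the memoization explicit and is easier to compare against other implementations of the same algorithm.

```python
# Hedged sketch: the same memoization idea via lru_cache instead of the
# back-fill array. Recursion depth is bounded by the longest sequence
# encountered (a few hundred terms for starts under 1,000,000).
import sys
from functools import lru_cache

sys.setrecursionlimit(100_000)  # generous headroom for long sequences

@lru_cache(maxsize=None)
def collatz_len(n):
    # Terms in the sequence, counting n itself and the final 1.
    if n == 1:
        return 1
    return 1 + collatz_len(n * 3 + 1 if n % 2 else n // 2)
```

One caveat: the cache here holds every value ever seen (including intermediate terms above the starting range), so its memory use differs from the fixed-size array in the code above.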

EDIT: Just tried with the PyPy 7.3.1 implementation of Python 3.6.9. It's
about 8x faster than CPython: 0.091 seconds (0.20s if I include startup
costs).

