Hacker News new | past | comments | ask | show | jobs | submit login
Revisiting the Collatz Sequence – Laurent Rosenfeld (perl.org)
3 points by lizmat on April 16, 2020 | hide | past | favorite | 3 comments

FWIW, the following Python code has about the same run-time. (The raw numbers are a bit faster on my machine than the reported numbers, but that means little given the unknown difference between the machines.)

    # http://blogs.perl.org/users/laurent_r/2020/04/revisiting-the-collatz-sequence-pwc-54.html

    # If the previous term is even, the next term is one half of the
    # previous term. If the previous term is odd, the next term is 3 times
    # the previous term plus 1

    # Calculate the sequence length for all starting numbers up to 1000000
    # and output the starting number and sequence length for the longest
    # 20 sequences.

    import heapq

    def main():
        MAX =  1_000_000 #  0.78 seconds
        #MAX = 10_000_000 #  8.2 seconds

        arr = [0] * MAX # cache Collatz length for a given number. 0 = not computed
        arr[0] = -1
        arr[1] = 1

        for i in range(2, MAX):
            if arr[i]:   # Already computed. Go to the next one.

            values = [i] # Keep track of the missing terms
            while 1:
                if i % 2:
                    i = i*3+1
                    i //= 2
                if i < MAX and arr[i]:
                    # Found a hit. Back-fill the missing parts.
                    n = arr[i] + 1
                    for val in values:
                        if val < MAX:
                            arr[val] = n
                        n += 1
                values.append(i) # Extend the sequence

        # Report the top 20
        for i in heapq.nlargest(20, range(MAX), key=arr.__getitem__):
            print(i, arr[i])

    if __name__ == "__main__":
        import time
        t1 = time.time()
        arr = main()
        t2 = time.time()
        print(f"{t2-t1:.2} seconds")

Edit. I just realized tio.run is running raku from 2018 which is a LOT slower than current raku, whereas it's running a more recent python. I could well believe the sequential raku numbers are much closer to the python numbers and the parallel could perhaps be an order of magnitude faster.

Fwiw, I ran the above on tio.run. It took 0.9s.

I also ran the perl code in the OP in the section headed "Using an Array Instead of a Hash". That took about twice as long.

The changes in the section headed "Further Optimizations" supposedly deliver about a 30% improvement. So, closer to your Python code -- but still noticeably slower.

The changes in the section headed "Combining Caching and Parallel Execution" then sped things up a further 15x on 32 cores -- more than an order of magnitude faster.

(I presume it would be fairly easy to parallelize the python code too and it might be equally efficiently implemented, but the proof, as they say, is in the pudding...)

In the meantime the sequential raku code is way off, something like 5X as slow as the perl code I tested on tio.run, so 10x slower than the python code you wrote.

Then again Liz made some mods to the raku code that she says doubled its speed.[1]

And it's typically relatively easy to parallelize raku code, so presumably the raku code could relatively easily be made to outperform your python code on hardware with enough cores -- at a guess around 10 or so.

[1] https://www.reddit.com/r/rakulang/comments/g2au9v/revisiting...

Thanks for the followup!

I last seriously used perl in the 1990s and wasn't really able to follow the code shown in the link. Do you know if they are the same algorithm?

It might be that the performance difference is more due to my coming up with a more efficient implementation.

EDIT: Just tried with PyPy 7.3.1 implementation of Python 3.6.9. It's about 8x faster than CPython - 0.091 seconds. (0.20s if I include startup costs.)

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact