Speeding up non-vectorizable code with Cython (isaacslavitt.com)
37 points by callmekit on Aug 8, 2015 | 3 comments


Before jumping to Cython, remove things like exponentiation and all the calls to `ord`:

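    def column_to_index(col):
        # col is a byte string of uppercase ASCII letters (a plain str on Python 2)
        col_index = 0
        for byte in bytearray(col):
            # Shift the running total one base-26 place, then add this letter (A=1 ... Z=26)
            col_index *= 26
            col_index += byte - 64
        return col_index - 1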
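
A quick sanity check of the loop above, as a sketch assuming Python 3 and byte-string input. `column_to_index_pow` is a hypothetical stand-in for an exponentiation-and-`ord` version, not the article's exact code:

```python
def column_to_index(col):
    # Horner-style accumulation: no exponentiation, no ord() calls.
    col_index = 0
    for byte in bytearray(col):
        col_index *= 26
        col_index += byte - 64
    return col_index - 1


def column_to_index_pow(col):
    # Hypothetical reference version using explicit powers of 26 and ord().
    text = col.decode("ascii")
    total = 0
    for power, letter in enumerate(reversed(text)):
        total += (ord(letter) - 64) * 26 ** power
    return total - 1


# Both formulations should agree on every name.
for name in (b"A", b"Z", b"AA", b"AZ", b"BA", b"ZZ", b"AAA"):
    assert column_to_index(name) == column_to_index_pow(name)
```
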

FWIW, you _can_ vectorize this, although it's neither simple nor fast because of NumPy's fixed overheads. If you convert thousands of strings at once, though, it's probably reasonably speedy.

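    import numpy
    from numpy import arange

    # Powers of 26 from 26**19 down to 26**0; dtype=object keeps exact Python ints
    place_values = 26 ** arange(20, dtype=object)[::-1]
    # overhead[n-1] undoes the ASCII offset (64 per letter) for an n-letter name,
    # plus 1 for zero-based indexing
    overhead = numpy.add.accumulate(64 * place_values[::-1]) + 1

    def column_to_index(col):
        digits = bytearray(col)
        length = len(digits)

        # Weighted sum of the raw byte values, minus the precomputed offset
        summation = place_values[-length:].dot(digits)
        return summation - overhead[length-1]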
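
To make the "thousands of strings at once" case concrete, here is a minimal sketch of a batched variant. It is not the code above: `columns_to_indices` is a hypothetical helper, and it assumes Python 3, byte-string names that all have the same length, and names short enough (13 letters or fewer) to fit in int64.

```python
import numpy as np


def columns_to_indices(cols):
    # Assumes every name in `cols` is an uppercase ASCII byte string
    # and that all names have the same length.
    length = len(cols[0])
    # One row per name, one column per letter, holding raw byte values.
    digits = np.frombuffer(b"".join(cols), dtype=np.uint8).reshape(len(cols), length)
    # Place values 26**(length-1) ... 26**0 as int64.
    place_values = 26 ** np.arange(length - 1, -1, -1, dtype=np.int64)
    # Map A..Z to 1..26, take the weighted sum per row, shift to 0-based.
    return (digits.astype(np.int64) - 64) @ place_values - 1


indices = columns_to_indices([b"AA", b"AZ", b"BA", b"ZZ"])
```

Here the NumPy overhead is paid once per batch rather than once per string, which is where vectorizing would be expected to help.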


Nice article. Shameless plug: readers who would like to use Cython outside the convenient IPython notebook interface should look into runcython [1]. You can run any Python file through Cython with:

    sudo pip install runcython
    mv main.py main.pyx && runcython main.pyx

[1] http://github.com/russell91/runcython


The type limitations example seems a bit contrived; a spreadsheet with 2^64 columns is slightly unrealistic. The point remains, though.
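
For a sense of scale, a rough sketch assuming the limit in question is an unsigned 64-bit index: it finds how long a column name must be before its index no longer fits. `max_index_for_length` is a hypothetical helper for illustration.

```python
def max_index_for_length(length):
    # Zero-based index of the last column with `length` letters ("Z" * length).
    return sum(26 ** k for k in range(1, length + 1)) - 1


UINT64_MAX = 2 ** 64 - 1

# Smallest name length whose last column overflows an unsigned 64-bit integer.
length = 1
while max_index_for_length(length) <= UINT64_MAX:
    length += 1

print(length)
```

So, under that assumption, overflow only starts somewhere among the 14-letter column names.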



