"the relative speed of all 3 technologies is (roughly) the same"
I'll be a bit pedantic. The original essay says "pure Python", which is not a technology but a language. There are several implementations of Python, and I'd argue the implementation is the "technology". The PyPy implementation of the pure Python algorithm is 10x faster than the CPython implementation.
According to the original essay, that's close to the CPython/NumPy performance, and faster than the CPython/Pandas version.
It's tempting to omit PyPy because "nobody uses it in data science". That's a self-fulfilling argument.
PyPy is doing amazing work to support both NumPy and Pandas, but it's limited by funding. Why aren't data science people pumping money into PyPy? It would give a huge performance boost even for naive, throw-it-together algorithms.
If you don't know what PyPy can do for you, how do you learn if/when you should use it?
I have tried PyPy once and it gave a huge speedup. I am not sure how it works with Pandas/NumPy. I have also tried the Numba JIT. It is easy to use, but in my experience it saves only up to 10% of the time.
There is also Cython (https://pandas.pydata.org/pandas-docs/stable/enhancingperf.h...) and it definitely works with Pandas. I have not tried it yet.
If you have a pure Python implementation, then why does it matter how well PyPy works with Pandas and NumPy?
My point is a minor one. "Python" isn't a technology. "CPython" and "PyPy" are technologies. Or you can try MicroPython, IronPython, Jython, and others.
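To make the language/implementation distinction concrete, here is a minimal sketch of a script that reports which implementation is running it and times a small pure-Python loop. The function names `sum_of_squares` and `describe_runtime` and the workload are made up for illustration and are not the algorithm from the essay. The same file should run unchanged under CPython or PyPy, so the two can be compared on your own machine.

```python
import platform
import timeit


def sum_of_squares(n):
    """Hypothetical pure-Python workload: a plain loop with no third-party libraries."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def describe_runtime():
    """Return the implementation name (e.g. CPython or PyPy) and the language version."""
    return f"{platform.python_implementation()} {platform.python_version()}"


if __name__ == "__main__":
    # Time 10 calls of the workload; run this file once per interpreter to compare.
    seconds = timeit.timeit(lambda: sum_of_squares(100_000), number=10)
    print(f"{describe_runtime()}: {seconds:.3f}s for 10 runs")
```

Treat any numbers from a toy loop like this as a rough indication only. A JIT needs warm-up time, so very short runs can understate what it does for longer jobs.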
In any case, you'll see the best idiomatic Pandas code is reported to be 500x faster than the original, while the best idiomatic Python is reported to be 950x faster. The proportions are not the same, so it's not like "dividing by the same denominator".
The rank order is unchanged, though that's a weaker observation than what you wrote.