Hacker News new | past | comments | ask | show | jobs | submit login
Using Python's Bisect Module (johnlekberg.com)
29 points by kaunta on Nov 22, 2020 | hide | past | favorite | 13 comments



The entire bisect module, consisting of 4 public functions, is literally only 32 lines of code.



Interesting to see the module defines a pure Python implementation first, then tries to import a fast C implementation. If the C version is present, it replaces the Python version.

Bisect is a very old module, certainly present in Python 1.5.2 from 1999 that I started with. I expect the C implementations got added a year or two later.


Yes, but now I want to know:

1. Is there a C version? I don't see it in the same directory.

2. If there is a C version, does anyone who submits a PR (like the one in 2019 adding 'key'[0]) have to submit a C version along with it?

[0] https://bugs.python.org/issue4356


The C version is in Modules/_bisectmodule.c, whereas the Python version is in Lib/bisect.py.

Python version: https://github.com/python/cpython/blob/master/Lib/bisect.py

C version: https://github.com/python/cpython/blob/master/Modules/_bisec...


Thanks. So it seems both were updated with 'key' at the same time: https://github.com/python/cpython/commit/871934d4cf00687b3d1...

Now I'm wondering - what is the point of the python version, if it will always be overridden by the C implementation? Are there circumstances (platforms, compile flags, ...) under which the C version would be unavailable when the python runtime is compiled?


It may be that other Python implementations use CPython's standard library, or at least part of it.

I think that PyPy in particular does this, but I'm not 100% sure. I know for certain that it uses pure Python implementation of some modules from somewhere. One program I took great pains to be PyPy compatible ended up being a lot slower in it, and it turned out to be that the built-in sqlite3 module has a C implementation in CPython that's faster than the pure-Python version even when runing in PyPy.


> Interesting to see the module defines a pure Python implementation first, then tries to import a fast C implementation.

Peak Python


Hate to be pedantic, as it is a thing of beauty, but most of it is replaced by a C implementation instead.


Bisect is nice, but it's not the fastest option.

If binning data, discretize it and then use a dict lookup - `grades_to_letters[grade//10]`, for example.

For insort, and indeed anything with sorted collections, just use the http://www.grantjenks.com/docs/sortedcontainers/ module. Inserting an element is worst-case sublinear time, and also faster than C-extensions. It's one of the very few data-structure libraries I use regularly.


Is O(n + log n) really a thing or should he just write O(n)?


Short answer is no because we’re talking about asymptotic behavior and n dominates log n.

Long answer is yes because O notation is commonly written this way to be more illustrative of what’s happening in an algorithm and that were smushing to algorithms together.

If you can be more precise then feel free to write it down. For example O notation hides a constant factor but if you happen to actually know it exactly you might see O(7n) in the wild.


I should just write O(n).

Thanks for catching the mistake, @dmurray. I'll fix that.




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: