Using Python's Bisect Module

snicker7 · on Nov 22, 2020

The entire bisect module, consisting of 4 public functions, is literally only 32 lines of code.

rahimnathwani · on Nov 22, 2020

https://github.com/python/cpython/blob/master/Lib/bisect.py

stevesimmons · on Nov 22, 2020

Interesting to see the module defines a pure Python implementation first, then tries to import a fast C implementation. If the C version is present, it replaces the Python version.

Bisect is a very old module, certainly present in Python 1.5.2 from 1999 that I started with. I expect the C implementations got added a year or two later.

rahimnathwani · on Nov 22, 2020

Yes, but now I want to know:

1. Is there a C version? I don't see it in the same directory.

2. If there is a C version, does anyone who submits a PR (like the one in 2019 adding 'key'[0]) have to submit a C version along with it?

[0] https://bugs.python.org/issue4356

kaunta · on Nov 22, 2020

The C version is in Modules/_bisectmodule.c, whereas the Python version is in Lib/bisect.py.

Python version: https://github.com/python/cpython/blob/master/Lib/bisect.py

C version: https://github.com/python/cpython/blob/master/Modules/_bisec...

rahimnathwani · on Nov 23, 2020

Thanks. So it seems both were updated with 'key' at the same time: https://github.com/python/cpython/commit/871934d4cf00687b3d1...

Now I'm wondering - what is the point of the python version, if it will always be overridden by the C implementation? Are there circumstances (platforms, compile flags, ...) under which the C version would be unavailable when the python runtime is compiled?

quietbritishjim · on Nov 23, 2020

It may be that other Python implementations use CPython's standard library, or at least part of it.

I think that PyPy in particular does this, but I'm not 100% sure. I know for certain that it uses pure Python implementation of some modules from somewhere. One program I took great pains to be PyPy compatible ended up being a lot slower in it, and it turned out to be that the built-in sqlite3 module has a C implementation in CPython that's faster than the pure-Python version even when runing in PyPy.

snicker7 · on Nov 22, 2020

> Interesting to see the module defines a pure Python implementation first, then tries to import a fast C implementation.

Peak Python

alexchamberlain · on Nov 22, 2020

Hate to be pedantic, as it is a thing of beauty, but most of it is replaced by a C implementation instead.

zhd · on Nov 22, 2020

Bisect is nice, but it's not the fastest option.

If binning data, discretize it and then use a dict lookup - `grades_to_letters[grade//10]`, for example.

For insort, and indeed anything with sorted collections, just use the http://www.grantjenks.com/docs/sortedcontainers/ module. Inserting an element is worst-case sublinear time, and also faster than C-extensions. It's one of the very few data-structure libraries I use regularly.

dmurray · on Nov 22, 2020

Is O(n + log n) really a thing or should he just write O(n)?

Spivak · on Nov 22, 2020

Short answer is no because we’re talking about asymptotic behavior and n dominates log n.

Long answer is yes because O notation is commonly written this way to be more illustrative of what’s happening in an algorithm and that were smushing to algorithms together.

If you can be more precise then feel free to write it down. For example O notation hides a constant factor but if you happen to actually know it exactly you might see O(7n) in the wild.

kaunta · on Nov 22, 2020

I should just write O(n).

Thanks for catching the mistake, @dmurray. I'll fix that.