Uncommon Uses of Python in Commonly Used Libraries (eugeneyan.com)
178 points by 7d7n on Aug 20, 2022 | 38 comments



> That said, is there a reason not to use relative imports?

Yes, they make reading imports across your entire project rather difficult: Suddenly there are multiple ways of referring to the same module. If you ever have to do a project-wide search & replace during a refactoring (because your favorite refactoring tool failed you), this will be hell.

Moreover, in each file you'll end up with a weird blend of absolute and relative imports, depending on what was shorter or looked nicer to the author at the time. Not nice to look at at all.
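To make the search & replace concern concrete, here is a minimal sketch. It builds a throwaway package on disk (the names `hn_demo_imports`, `helpers`, and `greet` are made up for illustration) in which one module uses an absolute import and another a relative one. Both resolve to the same object, yet a plain-text search for the absolute module path only finds one of them.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package: hn_demo_imports/{__init__,helpers,a,b}.py
# All names here are hypothetical and exist only for this demonstration.
root = Path(tempfile.mkdtemp())
pkg = root / "hn_demo_imports"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "helpers.py").write_text("def greet():\n    return 'hi'\n")
(pkg / "a.py").write_text("from hn_demo_imports.helpers import greet  # absolute\n")
(pkg / "b.py").write_text("from .helpers import greet  # relative\n")

sys.path.insert(0, str(root))
a = importlib.import_module("hn_demo_imports.a")
b = importlib.import_module("hn_demo_imports.b")

# Both spellings bind the very same function object.
same_object = a.greet is b.greet

# A plain-text search for the absolute path misses the relative spelling.
hits = [
    p.name
    for p in sorted(pkg.glob("*.py"))
    if "hn_demo_imports.helpers" in p.read_text()
]
print(same_object, hits)
```

A refactoring tool that understands imports would catch both files; a grep-style replace only sees `a.py`.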

> This led me to dig into why we might add to __init__.py

…or why we might rather not. Init files are one of the main reasons imports in Python often behave in unexpected ways. As a library user I do not want to study the library's init files first, but unfortunately I often have to in order to understand what is going on. (Case in point: Tensorflow 1/2. To this day, I can't claim I understand how exactly their init magic works and time and again I get bitten by failed imports.)


> Yes, they make reading imports across your entire project rather difficult: Suddenly there are multiple ways of referring to the same module. If you ever have to do a project-wide search & replace during a refactoring (because your favorite refactoring tool failed you), this will be hell.

> Moreover, in each file you'll end up with a weird blend of absolute and relative imports, depending on what was shorter or looked nicer to the author at the time. Not nice to look at at all.

it sounds like your project had inconsistent import styles leading to this issue. IMO imports should always be relative and always be one imported symbol per line, they should be sorted and linted [1] that they follow this exact form. you will have no issues with imports, merges, search and replaces, etc. after that.

[1] I use https://pypi.org/project/flake8-import-order/ along with an in-house import rendering tool.


Imports from 3rd party packages are always absolute. That's why I prefer imports from first party packages to be absolute as well


well yes but when you sort your imports, 3rd party imports are always in a separate stanza entirely. there's no "weird blend of absolute and relative imports, depending on what was shorter or looked nicer to the author at the time.", nothing is blended and also Black should be used for formatting. That way nobody is making any "this looks nicer" kinds of decisions nor does anyone have to waste their time on that.


“There should be one– and preferably only one –obvious way to do it.”

If you’re not maintaining one of the libraries listed in the article and you try to pull any of this “clever” stuff, you’ll be bitten on the ankle by a pythonic snake. No exceptions.


> There should be one– and preferably only one –obvious way to do it.

I'm honestly tired of explaining to people that this line of the Zen of Python does not mean that there shouldn't be more than one way to do something. On a very literal level, it states that there should be one obvious way to do it -- and in no way defines how many non-obvious ways there should be.

What you're calling "clever" stuff is just regular Python functionality, even if much of it is non-obvious. The only thing between something being "simple" and "clever" is how familiar an individual is with it.


Arguably the way to get bitten is _not_ to have that super init call. Seems worth making that a standard practice just to avoid the potential confusion later if the class is ever multiply inherited.


This could be fine, but it does restrict things somewhat and could lead to more unexpected results.

The piece doesn't present the correct way to use multiple inheritance in Python, so this section is a bit of a strawman. Namely, it is the responsibility of the inheritor to call __init__ on all of its base classes with the arguments it wants to pass on. Maybe Python 3's zero-argument super() has muddled this responsibility somewhat.

If we use the solution in the piece then we lose the ability to pass non-identity expressions in the arguments we received on to the base classes, and also the ability to have same named arguments but with different values.

If a class wants to pass non-identity expressions, and the underlying base classes use the piece's methodology, then you get a bug.
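A minimal sketch of the explicit style described above, using made-up classes (`Engine`, `Radio`, `Car`). Both bases take a parameter with the same name, and the inheriting class passes each of them a different, derived value, which a blanket `super().__init__(**kwargs)` forwarding chain cannot express.

```python
class Engine:
    def __init__(self, name):
        self.engine_name = name


class Radio:
    def __init__(self, name):
        self.radio_name = name


class Car(Engine, Radio):
    def __init__(self, name):
        # The inheritor decides what each base receives: both bases have a
        # parameter called `name`, but they get different, derived values.
        Engine.__init__(self, name=f"{name}-engine")
        Radio.__init__(self, name=name.upper())


car = Car("zoe")
print(car.engine_name, car.radio_name)  # zoe-engine ZOE
```

The trade-off is that explicit calls bypass the MRO, so in a diamond-shaped hierarchy a shared ancestor's `__init__` can end up running more than once.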


Nothing in the article is particularly egregious. The only odd one is using Python’s “call your children’s parents” super() for code-reuse and mixins but it’s perfectly normal and pythonic.


Python: if something can be done weird, someone will do it.

I know I'm guilty of doing dynamic library imports and monkey patching things.
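For anyone who hasn't seen these two tricks, a small sketch using only the standard library. The replacement function `shouting_dumps` is made up for illustration; `importlib.import_module` and `json.dumps` are the real APIs.

```python
import importlib
import json

# Dynamic import: the module name is only known at runtime.
module_name = "json"
mod = importlib.import_module(module_name)
print(mod is json)  # True: same module object as the static import

# Monkey patching: swap an attribute on a live module, then restore it.
original_dumps = json.dumps


def shouting_dumps(obj, **kwargs):
    # Hypothetical replacement that upper-cases the serialized output.
    return original_dumps(obj, **kwargs).upper()


json.dumps = shouting_dumps
try:
    patched = json.dumps({"a": "b"})
finally:
    json.dumps = original_dumps  # always undo the patch

print(patched)
print(json.dumps({"a": "b"}))
```

Restoring the original in a `finally` block keeps the patch from leaking into unrelated code, which is usually where the "weird" part starts to hurt.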


This is generally true for all systems; that is, if there’s a significant gap between what is allowed by the system, its non-systematically enforced usage guidelines, and how users want to use it, users will find a way to get what they want, especially if doing so does not impact them.

For example, I shared a blog post I found a few days ago on injecting custom Python syntax/functions into local code; obviously problematic, but something Python doesn’t even attempt to prevent, in part because it is open source:

https://news.ycombinator.com/item?id=32484778


Surely the most common reason for __init__ file content is re-exporting some otherwise deeper objects. The second is probably laziness - just bung it all in __init__!
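A sketch of the re-export pattern, built as a throwaway package on disk so it runs as-is. The package name `shapes_demo` and its layout are made up for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical layout: shapes_demo/geometry/circle.py holds the real class,
# and shapes_demo/__init__.py re-exports it at the top level.
root = Path(tempfile.mkdtemp())
deep = root / "shapes_demo" / "geometry"
deep.mkdir(parents=True)
(deep / "__init__.py").write_text("")
(deep / "circle.py").write_text(
    "class Circle:\n"
    "    def __init__(self, r):\n"
    "        self.r = r\n"
)
(root / "shapes_demo" / "__init__.py").write_text(
    "from .geometry.circle import Circle\n"
    "__all__ = ['Circle']\n"
)

sys.path.insert(0, str(root))
shapes_demo = importlib.import_module("shapes_demo")
circle_mod = importlib.import_module("shapes_demo.geometry.circle")

# The short public name and the deep internal name are the same object.
print(shapes_demo.Circle is circle_mod.Circle)
# The class still reports where it really lives.
print(shapes_demo.Circle.__module__)
```

Users can write `from shapes_demo import Circle` while the internal layout stays free to change, at the cost of one more place to look when tracing where a name comes from.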


As a personal opinion, I always found adding code into a "__init__" file a fudge and an antipattern.


Not at all - it is an excellent way to make modules, well, modular - allowing complex internal namespacing while keeping the surface clean and simple.


> it is an excellent way to make modules, well, modular

Strongly disagree. It is perfectly possible to do that without adding anything into __init__. I prefer explicit imports and any other logic declared directly into the module(s) that are the places where one would expect to find those.

I see code in __init__ as a convenient hack, but a hack nonetheless.


Code in __init__ is one of the core concepts of Python; seeing it as a hack is purely individual, and probably based on your own experience with it. It's exactly the same as calling metaclasses "a hack".


> Code in __init__ is one of the core concepts of Python; seeing it as a hack is purely individual

Yes, in fact I started my first message with a clear "As a personal opinion...".

Apart from that, the comparison with metaclasses does not make sense. Metaclasses are an OOP concept.


I agree beyond imports (it can be convenient to allow e.g. from django.db.models import CharField instead of from django.db.models.fields or whatever it is, for example) - that's why I called it laziness, a fudge done to just get it working quickly, lazily not doing it properly.


Recently was also looking around for quantifiable stats on Python for: downloads, usage, FAQs, etc. — and found these:

https://pypistats.org/top

https://www.programcreek.com/python/index/module/list

https://stackoverflow.com/questions/tagged/python?tab=Votes

Anyone know of any others or large collections of Python source code that are easy to download?



Thanks. Any idea what the “leet” Python version is that's listed at the very top right of the chart linked below (it was in one of the links you provided)? I attempted to Google it and found nothing.

Direct link to the chart, see “133.7” Python version (elite version) in the top right:

https://github.com/hugovk/pypi-tools/blob/main/images/all.pn...

Which is from:

https://github.com/hugovk/pypi-tools


I would add overriding the boolean dunder methods: __and__, __or__, __xor__.

Even more rare is overriding bitwise shift operations: __rshift__, __lshift__ etc. This is unfortunate, as these methods are only natively implemented in integers, so they’re basically freebies.
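As one possible illustration of putting a shift operator to work, here is a sketch of a made-up `Step` wrapper that uses `>>` to chain functions into a pipeline. This is just one way such an override could be used, not something taken from the article.

```python
class Step:
    """Hypothetical wrapper that lets functions be chained with >>."""

    def __init__(self, func):
        self.func = func

    def __rshift__(self, other):
        # self >> other: run self first, then feed its result to other.
        return Step(lambda x: other.func(self.func(x)))

    def __call__(self, x):
        return self.func(x)


double = Step(lambda x: x * 2)
increment = Step(lambda x: x + 1)

pipeline = double >> increment
print(pipeline(5))  # 5 * 2 + 1 = 11
```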


> boolean dunder methods: __and__, __or__

Those are the bitwise operators (a & b, a | b). You can’t override the behavior of the boolean operators (a and b, a or b), you can only define the truthiness of your object with __bool__.
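A short sketch of the distinction, using a made-up `Flag` class: `&` goes through `__and__`, while `and` only consults truthiness via `__bool__` and then returns one of its operands unchanged.

```python
class Flag:
    """Hypothetical value type used only for illustration."""

    def __init__(self, value):
        self.value = value

    def __and__(self, other):
        # Called for `a & b`, never for `a and b`.
        return Flag(self.value and other.value)

    def __bool__(self):
        # `and` / `or` / `if` only ever consult truthiness.
        return bool(self.value)


on, off = Flag(True), Flag(False)

bitwise = on & off    # goes through __and__, returns a new Flag
boolean = on and off  # short-circuit logic: returns one of the operands

print(type(bitwise).__name__, bitwise.value)  # Flag False
print(boolean is off)                         # True
```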


I think multiple inheritance will always scare me. What order do the superclass inits run in? What happens if they do conflicting things? What if some superclasses call super().__init__ and others don't?

No thanks, I'll suffer through reading a few additional lines of:

class SomeBusinessyThing:

  def __init__(self, util, other_util):
    self._util = util
    self._other_util = other_util

  @classmethod
  def create(cls):
    return cls(util_module.Util(), other_module.Other())

  def calculate(self):
    source = self._other_util.get_source()
    return self._util.get_stuff(source)

vs

class SomeBusinessyThing(Utils, Other):

  def __init__(self, **kwargs):
    # What does this do? No one knows
    super().__init__(**kwargs)

  def calculate(self):
    source = self.get_source()
    return self.get_stuff(source)


Meh, that's just the standard composition vs inheritance dichotomy. In reality, those two concepts are orthogonal, and you can use one, the other, and both, as suitable to the situation.

Using multiple inheritance to implement certain common functionality, using mixin classes, is possible in Python; it's another powerful tool in the arsenal, but doesn't mean that you have to use it.

Inheritance works best to denote "is-a" relationships, i.e. for defining subtypes, especially when using type annotations and checks. Sometimes - albeit very rarely - you need a class that belongs to two separate type hierarchies; multiple inheritance comes very handy in those cases.
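A minimal sketch of the mixin idea, with made-up classes: `Dog` is-a `Animal`, and `JsonMixin` only contributes one method without taking part in the type hierarchy in any meaningful way.

```python
import json


class JsonMixin:
    """Hypothetical mixin: adds one capability, holds no state of its own."""

    def to_json(self):
        return json.dumps(vars(self), sort_keys=True)


class Animal:
    def __init__(self, name):
        self.name = name


class Dog(JsonMixin, Animal):
    """Dog is-a Animal; JsonMixin only contributes to_json()."""


rex = Dog("Rex")
print(rex.to_json())
print(isinstance(rex, Animal), isinstance(rex, JsonMixin))
```

Because the mixin defines no `__init__`, construction falls straight through to `Animal.__init__`, so none of the initialization-order questions come up here.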


One case where I (ab)used this in R was to add an abstract class called Timed that measured the time the inner function took.

I guess I'd probably use a decorator in Python but this was R and I was on an S4 buzz back then so I took the approach above.
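For comparison, a sketch of what the decorator version might look like in Python. The names `timed` and `last_elapsed` are made up; storing the duration on the wrapper is just one of several reasonable choices.

```python
import functools
import time


def timed(func):
    """Record how long each call to func takes (hypothetical helper)."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Store the duration on the wrapper so callers can inspect it.
            wrapper.last_elapsed = time.perf_counter() - start

    wrapper.last_elapsed = None
    return wrapper


@timed
def slow_sum(n):
    return sum(range(n))


result = slow_sum(1000)
print(result, slow_sum.last_elapsed >= 0)
```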


I don't think he is unaware of any of that. His point was that multiple inheritance involves enough fiddly surprising behaviour that it's best avoided - you are better off manually delegating to distinct member variables, then it is clear what is happening even if it is a bit more tedious.

(Btw that's the only way to implement inheritance in Rust, even single inheritance.)


> multiple inheritance involves enough fiddly surprising behaviour

In which way? MRO is very well defined.

> manually delegating to distinct member variables

Again, this is composition, which has nothing to do with inheritance. If you need to define a subtype, inheritance is most straightforward.


The Method Resolution Order (MRO) is firm and documented. It's just not something anyone keeps in mind unless you use multiple inheritance a lot.

Conflicts are resolved by the MRO. If some classes don't call super(), the chain stops there: classes further down the MRO won't have their __init__ called and won't be initialized.

The choice isn't: multiple inheritance or a couple of lines. In the right situation, multiple inheritance could save hundreds of lines and condense a complicated mechanism into a simple one. Used flippantly, it can be a nightmare -- but that's true of all programming paradigms.
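A small sketch of the "some classes don't call super()" case, with made-up class names. `Rude` sits before `Polite` in the MRO and never hands control on, so everything after it stays uninitialized.

```python
class Base:
    def __init__(self):
        self.base_ready = True


class Polite(Base):
    def __init__(self):
        super().__init__()  # hands control to the next class in the MRO
        self.polite_ready = True


class Rude(Base):
    def __init__(self):
        # No super().__init__() here, so the chain stops at this class.
        self.rude_ready = True


class Child(Rude, Polite):
    def __init__(self):
        super().__init__()


child = Child()
print([cls.__name__ for cls in Child.__mro__])
print(
    hasattr(child, "rude_ready"),
    hasattr(child, "polite_ready"),
    hasattr(child, "base_ready"),
)  # True False False
```

Swapping the base order to `Child(Polite, Rude)` changes the outcome, since `Polite` then runs first and forwards to `Rude`.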


I won’t argue with “keep things simple” but if you’re ever forced to work with multiple inheritance in Python try inspecting:

  SomeBusinessyThing.__mro__

https://docs.python.org/3/library/stdtypes.html#class.__mro_...
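A quick sketch of what that inspection looks like. `Utils` and `Other` here are empty stand-ins for the bases in the earlier snippet, each given a conflicting `describe()` method purely for illustration.

```python
class Utils:
    def describe(self):
        return "Utils"


class Other:
    def describe(self):
        return "Other"


class SomeBusinessyThing(Utils, Other):
    pass


mro_names = [cls.__name__ for cls in SomeBusinessyThing.__mro__]
print(mro_names)  # ['SomeBusinessyThing', 'Utils', 'Other', 'object']

# Both bases define describe(); the one that appears first in the MRO wins.
print(SomeBusinessyThing().describe())  # Utils
```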


Agreed, a core feature of Python is to be readable and familiar. While I enjoy reading more advanced deep dives into language features, at the point where you’re being crafty to flex, it’s likely a bad idea. In other words, if the next person reading your code will mostly have no clue what it’s doing, it’s likely a bad idea to do.

I wish there was a way to visualize or rate how average a piece of code is, especially for two separate versions covering the same concept; hence my other comments on resources to quantify usage patterns.


This is "explicit is better than implicit".

I would have thought that I would hate the above code, but actually I appreciate it and prefer it. It's great.


Overthinking it, I think this is probably the "right" solution to multiple inheritance where there are conflicting attributes.

I suspect it is hard to check at pre-commit time, but you would want to inherit only from classes with no conflicts; if there are conflicts, it is down to this approach (!)


I thought the article would be about "uncommon uses" as in "I didn't know this library had pieces written in Python". Relatedly, the original BitTorrent client might be one of the first widely-distributed applications written in Python.


Is the BitTorrent Python code you’re referencing Twisted, the Python networking library, or something else?

Also, I'm guessing that Google was likely one of the first widely used systems implemented in part using Python.


BitTorrent is the first ever program that implements the BitTorrent protocol, written by Bram Cohen, who designed the protocol.


Thanks, with the additional information I was able to find the original source code from 2001 and how to run it, for anyone that’s curious:

https://stackoverflow.com/questions/17654103/running-bram-co...


The original BitTorrent code was written in Python.



