How the Python import system works (tenthousandmeters.com)
497 points by zikohh on July 24, 2021 | 203 comments



Fun fact: you can overload the Python import system to work with other languages that you create.

I use this for my Python-based Lisp: https://github.com/shawwn/pymen/blob/ml/importer.py

  import foo
checks for "foo.l", and if found, compiles it to foo.py on the fly and imports that instead.

It's so cursed. I love it.
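
For the curious, a minimal sketch of the kind of sys.meta_path hook that makes this possible (the compile step is hypothetical, not pymen's actual API):

    import importlib.abc
    import importlib.util
    import sys
    from pathlib import Path

    class LispFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
        """Let `import foo` pick up foo.l (top-level modules only, for brevity)."""

        def find_spec(self, name, path=None, target=None):
            for entry in sys.path:
                candidate = Path(entry or ".") / f"{name}.l"
                if candidate.exists():
                    return importlib.util.spec_from_loader(name, self, origin=str(candidate))
            return None  # fall through to the normal import machinery

        def create_module(self, spec):
            return None  # use the default module object

        def exec_module(self, module):
            source = Path(module.__spec__.origin).read_text()
            py_source = compile_lisp_to_python(source)  # hypothetical compiler step
            exec(compile(py_source, module.__spec__.origin, "exec"), module.__dict__)

    sys.meta_path.insert(0, LispFinder())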


See also https://github.com/drathier/stack-overflow-import for stupid Python import tricks.

"from stackoverflow import quick_sort will go through the search results of [python] quick sort looking for the largest code block that doesn’t syntax error in the highest voted answer from the highest voted question and return it as a module. If that answer doesn’t have any valid python code, it checks the next highest voted answer for code blocks."

I once implemented a custom importer as part of a system where the Python interpreter never touched the filesystem.


That stackoverflow library is so gloriously bonkers. Is it useful in practice?


No. Perhaps for a hackathon.


Lies!


Can I just say that the introduction to Lisp you made in your README.md is really good!

I keep trying to get into Lisp (and JavaScript, and TypeScript, etc etc) but I've been a sysadmin my whole professional life and also a chronic pain sufferer. That translates into mostly having energy only for work and that's it, not much motivation to learn after work or on the weekend.

In my DevOps job, I write Terraform, plus read javascript and cloudformation yaml. I do wish I could convert my current stuff to AWS CDK, but I don't want to fragment the multiple projects that are using Terraform. (I haven't looked into tf-cdk much at all yet)


Not mine! That was all Scott Bell. It's forked from Lumen: https://github.com/sctb/lumen

But, I did make an interactive tutorial here: https://docs.ycombinator.lol/

If you have any questions about it, I'd be happy to answer. This stuff is pure fun mixed with a shot of professionalism.

For what it's worth, as someone with narcolepsy, I relate quite a lot to your chronic pain. (https://twitter.com/theshawwn/status/1392213804684038150) For me, it mostly translated into wandering aimlessly from job to job, since I thought no one would have me. I hope that you find your way -- there's nothing wrong at all with taking it slow and spending years on something that takes others a few months. Everyone is different, and it's all about the fun.


Does "-5e4" really evaluate to "-5000" in this language?


Hah! Good catch! That readme typo has been in there since Lumen’s inception.

It evaluates to -500000, as you’d expect.

(Just kidding, it’s -50000. Amusingly, the https://docs.ycombinator.lol version gets it right, since it has to; every expression is actually evaluated in the browser.)


Thanks for sharing this. I didn't expect to read that whole gist but I did and I'm glad I did. Happy for you.


Lots of things you can do with Python but probably shouldn't and people typically don't. That's one reason I prefer it to Ruby, or even Node, where monkey-patching or otherwise exposing bad magical behaviours is common and even encouraged -- the power is all there, but the ecosystem encourages you to use it for good, not evil.

This sounds very much like the good kind of magic, though.


To add to this: bad magic is magic that the user has to be aware of in order to use safely. Good magic is an implementation detail that the user doesn't need to know anything about.


I don't even know what that means.


Wait, what?

> that you create

As in existing language supplied eg. Perl, Java, etc., or literally anything? Like bootstrapping your own home made language from scratch?


Literally anything. 'Tis a homemade homegrown lisp, grown by Scott Bell for several years till I took it all for myself. Nom nom.

It starts with reader.l: https://github.com/shawwn/pymen/blob/ml/reader.l where the raw character stream is turned into a bunch of nested arrays. E.g. (+ 1 2) becomes ["+", 1, 2]

Then it's punted over to compiler.l https://github.com/shawwn/pymen/blob/ml/compiler.l where it's passed through `expand`, which does a `macroexpand` followed by a `lower`. E.g. (do (do (do (print 'hi)))) becomes ["print", "\"hi\""]

Then the final thing is thrown to the compile function, which spits out print("hi") -- the final valid Python code that gets passed into the standard python `exec` function.

Works with all the standard python things, like async functions and `with` contexts. Been screwing around with it for a few years.


That's absolutely disgusting. But in a Web Assembly sort of way... I don't know whether to spit on it or give it a medal.


I have a homemade language that's not a lisp, but is lispy in some ways. I've only got to the point where I expand the code to an abstract syntax tree, and deciding how to go from there is the hard part for me right now. It's never crossed my mind to just compile it to valid python code. Thanks for the inspiration!


If the import system allows your code to run instead of the ‘import’ statement, and to produce the module however you want, then of course you can do whatever: load code from Google or StackOverflow results, if you wish.


> As in existing language supplied eg. Perl, Java, etc., or literally anything? Like bootstrapping your own home made language from scratch?

Should work with anything as long as you can ultimately generate Python bytecode (and provide a module object). The import system is not simple, but it's really open.


NodeJS allows this as well. I think this is pretty much a must-have feature for any serious dynamic language.

Edit: A must have for any prolific dynamic language. But now I’m not sure that’s true, because even though it apparently works in Python, it’s certainly not widely used. In NodeJS this feature is used quite heavily for typescript, coffeescript (etcetera) interop.


> But now I’m not sure that’s true, because even though it apparently works in Python, it’s certainly not widely used.

I mean, it is used—even in the standard library—but often for alternative packaging (e.g., loading python modules in zip files) rather than alternative languages. It may be used less prominently than in Node, but it definitely is used for a variety of things.
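
As a tiny illustration of the zip case: the stdlib's zipimport path hook means a .zip archive on sys.path is importable directly (file and module names here are made up):

    import sys
    import zipfile

    # Build an archive containing a module...
    with zipfile.ZipFile("bundle.zip", "w") as zf:
        zf.writestr("greet.py", "def hello():\n    return 'hi from inside the zip'\n")

    # ...put the archive itself on the search path...
    sys.path.insert(0, "bundle.zip")

    # ...and the import machinery resolves modules from inside it.
    import greet
    print(greet.hello())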


It's pretty much the exact feature that's behind the saying that ‘to parse Perl you must have the Perl interpreter’. Because Perl allows some kind of language/imports handlers, as exhibited by tricks like Lingua::Romana::Perligata.


A master of the dark arts I presume.


You mean that you can override the import mechanism, which means that it allows you to do just about anything, including making it work with other languages.

That doesn't sound cursed to me, just flexible.


What’s flexible for a dynamic scripting language is often cursed from a static perspective. Knowing what imports resolve to statically can be nice.


There are other purposes as well.

In the first .com wave we used a similar mechanism, just with TCL instead of Python, to ship our scripts encrypted.

Only our modified loader, written in C, could handle them.


I don't get it. Where do you define which Lisp-to-Python translator to use? It certainly doesn't seem to know on its own.

    $ touch foo.l
    $ python3
    >>> import foo
    ModuleNotFoundError: No module named 'foo'


Close!

  git clone https://github.com/shawwn/pymen -b ml
  cd pymen
  touch foo.l
  bin/pymen
  (import foo)
It's a bit of a WIP (notice this is on the `ml` branch, not mainline), but it does work. >:)

You need nodejs to be installed too, ha.


I'm guessing something else needs to be imported first.


Hylang also does this. Macropy too.


JPype does this for importing Java classes.


There are two things that are annoying about Python's import system.

Number one is that relative imports are weird. My intuition about imports is good enough that I never bothered to learn all the rules explicitly, but sometimes something simple is just not possible and it bites me. I think the problematic case is importing files relative to a script (when not running with python -m ...).

Number two is, in order to do package management, you have to create a fake python installation and bend PYTHONPATH. Virtualenvs are the canonical way to do it, but to me it feels like a hack - the core language seems to want all packages installed in /usr. So now I have all these virtualenvs lying around and they are detached from the scripts.

Why couldn't the import system resolve versions, too? You could say `import foo >= 1.0` and it would download it to some global cache, and then import the correct version from the cache.


Once I wrapped my head around when you can and can't use relative imports, I've been pretty ok with them. The thing that irks me is that whether they work changes based on where you've invoked Python from. `./bin/my_script.py` behaves differently from `./my_script.py`.

Coming from JS, that was a pretty frustrating realization.


> The thing that irks me is that whether they work changes based on where you've invoked Python from. `./bin/my_script.py` behaves differently from `./my_script.py`.

It does not though, unless you have altered the default sys.path to always contain `.`.

When running a file / script, Python will add that file's directory as the first lookup path, so these two invocations should have the exact same sys.path, and thus the same module resolution throughout[0].

If you have added `.` to sys.path (which is not the default), then the first invocation will also have the "grand-parent" folder on the sys.path, while the second won't.

This also doesn't seem to have anything to do with relative imports, you need to already be inside a package for relative imports to do anything.

[0] that is not the case of `-c` and `-m`, both add CWD to the path, and they differ in what they do: `-c` adds `''` to sys.path, which is the in-process CWD. `-m` stores the actual value of CWD at invocation, so changes to CWD while the program runs don't influence module resolution
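
A quick way to convince yourself of this (layout hypothetical): put the following in bin/my_script.py and run it from anywhere.

    # bin/my_script.py
    import sys

    # sys.path[0] is the directory containing this file, not the shell's CWD,
    # so both of these print the bin/ directory:
    #   $ python bin/my_script.py
    #   $ cd bin && python my_script.py
    print(sys.path[0])

    # So a plain `import helper` would find bin/helper.py in both cases.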


I actually spent time wrestling with exactly this earlier today, and I think I finally have a 2 line solution for it:

https://pastebin.com/5WCXb2Gg

Assuming `foo.bar` is a script inside the package `foo` that you want to both import and run directly (without `python -m ...`), this lets you do so without too much hassle.


> The thing that irks me is that whether they work changes based on where you've invoked Python from. `./bin/my_script.py` behaves differently from `./my_script.py`.

This.

IMO that's also one of the main issues of Bash – you can't modularize a script unless you make sure your working directory is the directory containing the script. (And good luck with finding out the latter! [0])

[0]: https://stackoverflow.com/questions/4774054/reliable-way-for...


Relative imports used to work much more naturally IMHO in python2 but then they broke it in python3 because Guido wanted scripts and modules to always be separate codebases. So, whereas it used to be easy to have a module that could also be run as a script inside a package, this is now very difficult to implement. To the extent that any python2 code that does this, should probably be refactored when being ported to python3.


This decision has bit me in the ass so often.

I want to organize my code logically in directories. As a script grows, I want the ability to spin out parts of that file to separate files.

In order to do that in python, I need separate directories between the script and the spun-out functionality. This ends with a script that says "do function from module" and all code being in the module.

Having code in different directories for no reason except "the import system" sucks. How is this supposed to go?


FWIW, my command-line tools and any support code are all in a subdirectory under my package directory.

I used to include a simple stub Python program (~10 lines) as a "script" in my setup.py that would import the right code and call it.

Then I learned that "entry_points" implemented most of the dispatch behavior I wanted.

I no longer have those scripts, just an entry that basically says "from abc.cli.prog1 import main; main()". The prog1.py, prog2.py, etc. look like normal command-line scripts, assuming the usual:

    if __name__ == "__main__":
        main()
The major downside is I can't suppress SIGPIPE and SIGINT until main() is called, leaving a wider window where something like ^C gives an unwanted Python stack trace.
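
For reference, the entry_points declaration described above looks roughly like this in setup.py (package paths taken from the comment, the rest is a sketch):

    from setuptools import setup, find_packages

    setup(
        name="abc-tools",  # hypothetical distribution name
        packages=find_packages(),
        entry_points={
            "console_scripts": [
                # each entry generates a wrapper script that effectively runs
                # "from abc.cli.prog1 import main; main()"
                "prog1 = abc.cli.prog1:main",
                "prog2 = abc.cli.prog2:main",
            ],
        },
    )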


So as your script grows, don't you eventually rename it to whatever/__main__.py and run it with -m? seems like a fairly trivial transform that then allows you to spin out as many modules as you need.


Are you sure you're not confusing changes to the import system with changes to the default sys.path (especially when executing a file)?

Because unless I'm missing something, the changes to the import system make this more reliable, not less: in Python 2, implicitly relative imports means `import x` might depend on the file's location. In Python 3, it does not, it depends solely on the setup of the sys.path.



Nice, do you know of anything similarly-comprehensive that is updated for Python 3? IIRC, Python 3 simplified things a good deal by removing "implicit" relative imports, but I'm a little foggy on exactly what that means.


Explicit relative imports use a "." to indicate the file/package is from the current directory and not found elsewhere in sys.path:

  from . import foo
  from .foo import bar
Implicit relative imports don't have such an indicator:

  from foo import bar
In py2, that second one could be a relative import or from anywhere in sys.path, while in py3 those implicit relative imports were removed, so it'll only look in sys.path and not the local directory.

https://stackoverflow.com/questions/48716943/what-is-python-...


> IIRC, Python 3 simplified things a good deal by removing "implicit" relative imports, but I'm a little foggy on exactly what that means.

Exactly that: in Python 3, if an import doesn’t start with a . it will always be absolute aka looked up from sys.path.

In python 2, it would try performing a relative import (so importing a sibling module) first.


> the core language seems to wants all packages installed in /usr.

There's also ~/.local/lib/python3/site-packages (or whatever your distribution made of that). Virtualenvs are only necessary if you want to isolate dependencies between environments. That's useful if you have projects with conflicting dependencies, because Python doesn't allow you to install multiple versions of the same package, for better or worse. However, if you've written some simple scripts that don't care much about the exact version of their dependencies, it's perfectly fine to install their dependencies globally.


Quick shout-out to nix-shell shebangs, which allow a script to specify exact versions of all dependencies, Python or otherwise, which will be cached into a temporary sandbox.

https://nixos.org/manual/nix/stable/#use-as-a-interpreter

I wrote a Python script yesterday which calls out to a couple of external commands (`mlr` and `xq`), with a shebang like this:

    #!/usr/bin/env nix-shell
    #!nix-shell -i python3 -p python3 -p miller -p yq


Earnest question: why are you all trying to use relative imports? What problem is that solving for you? I've never even bothered to try it out because it seems potentially problematic in the way all relative references can be, e.g., relative file paths.


I do use them quite often without much issue, not sure why some people are struggling with it.

Say you have a main package "mylib" with a subpackage "mylib.utils". Typically I like to see "mylib.utils" imports as being in one of (roughly) 4 categories:

- standard imports, that I would put first in the file (e.g. "import logging")

- external imports, that I would put second in the file (e.g. "import requests")

- library local imports, bits and pieces from the "mylib" package that you want to reuse in "mylib.utils", but are external to the current package (e.g. "from mylib.email import client")

- and package local imports, which I see as implementation details of the current subpackage, and should be agnostic from the overall architecture of "mylib" (e.g. "from .helpers import help_function")

The last category are modules that only make sense from within "mylib.utils", should relocate with it even if it is renamed or moved elsewhere, and shouldn't require a change whatever the structure of "mylib" becomes, which is why I would use "mylib.utils" relative imports in there.
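
Concretely, the top of a module inside mylib/utils/ following that convention might look like this (file and names invented for illustration):

    # mylib/utils/formatting.py

    # 1. standard library
    import logging

    # 2. external dependencies
    import requests

    # 3. library-local imports: other parts of mylib, addressed absolutely
    from mylib.email import client

    # 4. package-local imports: implementation details of mylib.utils,
    #    addressed relatively so they survive a rename/move of the subpackage
    from .helpers import help_function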


People have a script run with python, and want to use code in other files.

This is not supported in python. For reasons beyond my understanding, you are supposed to put the script (the file you run with python, or via a shebang) in a different directory from the code it imports.

Alternatively, you can always use `python -m` to run your code.


> People have a script run with python, and want to use code in other files. This is not supported in python.

Come again? This is what "import" does, obviously. This is not a clear explanation of your problem.


I missed one detail. The files and the script are in the same folder.

I say script here because the apparent intention is that scripts and modules are separate. That is why it's not easy to import functions from a file in the same directory as a script.

This restriction is very unexpected. And it's made worse by the fact that it is not obvious this distinction exists.


When you run a script, the directory containing it is on the path. You can just write "from somefile import something". What you can't do is "from .somefile import something" because it is not a package.

When you run a module, the root package containing it is on the path by definition and imports work.


I didn't know that.

Does that still work if you call the script from a different directory (so `python bar/foo.py`)?

Imma guess it doesn't work with symlinked scripts. But I don't think it should without doing an install, in which case a module does make sense.


The directory containing the file is added to the path, and the named file is imported as the __main__ module. That's why "if __name__ == '__main__':" works in script files.

I don't think Python will resolve symlinks here, so if that needs to work, you probably need to do it yourself before touching other code / files.


While I rarely need it, I like being able to vendor a package by dropping it into a project subdirectory.

If the package uses absolute paths then I would need to tweak all the imports for the new absolute path.


So much this. There's no good reason to use relative imports. They're less readable, more dangerous and don't solve a single problem.


In JS land everything is a relative import. It's nice because you can move a whole directory of code from one project to another (or around in the same project) and it still works because all the imports pointing to other files inside that directory were relative, you only have to fix imports that go outside the directory.

Also the way that JS imports are just relative paths is very nice because it means that the imports are statically determinable, your editor can understand them and fix them automatically and you can trust that refactoring. Python has Turing-complete imports because there's so much dynamic messing about with sys.path that goes on in Python due to inadequacies of the import system.


> It's nice because you can move a whole directory of code from one project to another (or around in the same project) and it still works

Yes, just like absolute imports...

> relative paths is very nice because it means that the imports are statically determinable, your editor can understand them and fix them automatically and you can trust that refactoring

1) Just like absolute imports?

2) I absolutely contest that relative imports are easier for an IDE to refactor. I've never had VSCode hang while refactoring a Java package; I've had VSCode hang while refactoring "create-react-app" app...


In Python you can do (and people do do) this

    import a.b.c
    sys.path.insert(0, 'some/other/path')
    import x.y.z
Or

    env PYTHONPATH="foo/bar:$PYTHONPATH" python ...
There is no way in general for Python tooling to figure out where an import points. All it can do is guess. It's a very regrettable situation.

Yes in Java it's fine because Java has a build stage. So tooling can figure out where imports point to from your static configuration.

However from the tool's perspective there is nothing better than a relative path. Relative paths require no messing about with configuration files at all to resolve. It's just a path to another file or directory on disk.

When a tool sees

    import c from "./a/b/c"
It can resolve it immediately.

So what is the advantage of absolute imports really? If import lines are mainly written and maintained by tooling then shouldn't we pick the representation that is easiest for the tooling? Then we can have more and better tools and the tools will be more reliable.

And it turns out that, relative paths are easy for humans to understand too. The same configuration-free resolution algorithm also works in your head when you are reading code! At least when the language doesn't overcomplicate them too much (JS is guilty of this to a certain extent, although nowhere as bad as Python)


Given most editors handle absolute imports as well as, if not better than, relative imports, I don't see any real benefit to using the latter.


It stops some circular imports if you use relative imports.


I'd rather suffer the late import with absolutes than wrestle with relative imports


IIRC, it was done for security, to avoid inadvertently picking up a library from an unexpected place. For instance, drop a file named test.py in the same directory as your python 2 script and have "fun" figuring out what went wrong.


> Number two is, in order to do package management, you have to create a fake python installation and bend PYTHONPATH.

Well whatever the language or runtime, you need to tell it how to find its dependencies. Be it with a configuration file, environment variables or parameters.

Your statement is not exactly correct either. You don't need a fake installation _and_ to bend PYTHONPATH.

Virtualenv leverages the fact that "../lib/python<version>/site-packages/" relative to the interpreter is a default import path (default value of PYTHONHOME). It doesn't use PYTHONPATH.

> Why couldn't the import system resolve versions, too?

Not sure that would be really great. I prefer to have my dependencies all grouped together in a setup.py rather than scattered in various files at import time.


> Number two is, in order to do package management, you have to create a fake python installation and bend PYTHONPATH.

If I understand what you mean by package management in this context, I wonder if editable installs will help you.

https://www.python.org/dev/peps/pep-0660/#abstract

https://discuss.python.org/t/pronouncement-on-peps-660-and-6...


> Why couldn't the import system resolve versions, too? You could say `import foo >= 1.0` and it would download it to some global cache, and the import the correct versions from the cache.

What do you do about conflicts? Or say you have `import foo >= 1.0` in a file, and `import foo == 2.4`, but the latest version is 2.5, so the first import grabbed the latest version, and you later realize you need 2.4?

Imagine running a report generator for 5 hours, only to have the formatting module require a conflicting version of something and erroring out at run time...


> Number one is that relative imports are weird. My intuition about imports is good enough that I never bothered to learn all the rules explicitly, but sometimes something simple is just not possible and it bites me. I think the case is importing files relative to a script (and not running with python -m ...).

I think the thing to remember about using relative imports is that they require packages. Using a relative import in a standalone script will fail because the script doesn't belong to a package.

> Number two is, in order to do package management, you have to create a fake python installation and bend PYTHONPATH. Virtualenvs are the canonical way to do it, but to me it feels like a hack - the core language seems to wants all packages installed in /usr. So now I have all these virtualenvs lying around and they are detached from the scripts.

This is a pain. You can abstract this away with pyenv or virtualenvwrapper, though.


I never followed the changes around imports in 3 … but def still run into issues especially trying to move code from dev to deploy


Yeah, saner relative imports might be one of the few things that I envy the JS ecosystem for.


The Python import system is by far the worst one I've dealt with. setup.py and regular or namespace packages, relative imports, complex subpackages and cross-importing, running a script from somewhere inside one of your subpackages, and many more annoyances like these. An import system must be intuitive and easy to use!


Yeah it really tripped me up as a beginner. I think the hardest part to get used to was that the import syntax actually changes based on how, and from where, you are running your code. So depending on how you call your code, your imports might or might not work. This is ESPECIALLY painful when you are building a distribution. There is no syntax that works for all situations, which seems like it would be pretty important for an import system. I had to bookmark this tab, and still refer to it often.

https://stackoverflow.com/questions/14132789/relative-import...


… have you tried Rails’ autoloading? Especially when running rake tasks?


What import system does not have these problems?


This is great: I'm learning so much reading this.

It led me to read the source of the Python "types" standard library module, which really does just create a bunch of different Python objects and then use type() to extract their types: https://github.com/python/cpython/blob/v3.9.6/Lib/types.py

Some examples from that file:

    async def _ag():
        yield
    _ag = _ag()
    AsyncGeneratorType = type(_ag)

    class _C:
        def _m(self): pass
    MethodType = type(_C()._m)

    BuiltinFunctionType = type(len)
    BuiltinMethodType = type([].append)     # Same as BuiltinFunctionType


Python importing quirks can be time consuming. Things beginners will encounter:

- three modules cannot depend on each other in a circular way

- relative imports are fragile ("module not found")

- the __all__ definitions in the __init__ file make modules available under different full names

- how to reload a module in a jupyter notebook if edited

and so on.
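
On the notebook-reload point specifically, the usual escape hatches are importlib.reload or IPython's autoreload extension (a sketch; mymodule is a placeholder):

    # Option 1: reload a single edited module by hand
    import importlib
    import mymodule
    importlib.reload(mymodule)

    # Option 2: IPython magics that reload changed modules before each cell runs
    %load_ext autoreload
    %autoreload 2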


I’m always amazed just how tolerant javascript’s import system is when I have circular imports. I guess maybe because it doesn’t care about modules and just cares about specific elements that are being imported/exported.

When I do have a nasty circular dependency Webpack usually does a bad job telling me what’s wrong.

Though I should still treat circular imports as, at the very least, an organization code smell.


Circular imports in JS are fine and not a code smell, for instance in typescript you may have two classes that reference each other's types - obviously this is not a real import from JS's perspective but the point is that you should not have to care whether it is real or not. That's a world we don't want to live in.

Circular imports are only ever a problem when you have code running when the module loads. Then you run into module load ordering issues. So avoid any side effects on module load and make all setup explicit.


IIRC an explicit design goal was to enable circular deps (hence why imported bindings are considered "live"). It's interesting to see this works in practice though, I've never tried using them myself.


Circular imports in JS matter when the imported code is being called immediately at import time. If a file defines functions that later call functions from another file (and vice-versa), but those symbols will be populated before the function actually gets called, there's no problem


Yeah. It’s so flexible. I get frustrated at Python (Django) serializers that legitimately need to depend on each other. And the answer on the forums is to create a near duplicate class.


JavaScript does not provide namespaces by default which allows this.

Python is built upon namespaces, and cycles in dependencies (whether viewed through graph theory or as kSAT) introduce that fun NP-complete problem of version hell.

Using `need` in Javascript maintains the directed acyclic graph structure, but if you get fancy you will run into the same circular-dependency problems as in Python.

Karp's 21 and/or SAT will catch up with you at some point if you don't respect the constraints that make the problem tractable.

Note I am not saying I prefer or like pythons choices...but that they had to make one.


Sorry what’s ‘need’ ? I can’t seem to find it as a keyword.


Allowing circular imports was the ugliest part of ECMAScript modules.


> __all__

No it doesn't. __all__ just defines which objects are imported when doing a star import.


I usually deal with this by defining a cell with the actual contents of the class/module code so that I can just re-execute that cell any time I make changes to it. Then I simply copy/paste all the code back into the module.py file once I'm done tweaking it and playing with it. Thus for me Jupyter operates as almost an IDE of sorts.


Probably; you may add to the list:

- understanding the difference between the Python in your IDE and the Python in your shell

So many times people do `pip install <>` and are still not able to use the package in the IDE.


> three modules cannot depend on each other in a circular way

What would the purpose of circular modules like this be? You may as well collapse into a single module and the situation would not be any different would it?


> What would the purpose of circular modules like this be? You may as well collapse into a single module and the situation would not be any different would it?

What is the purpose of modules? You may as well collapse into a single script and the situation would not be any different, would it?

I'm not being facetious here. The answer to the second is the answer to the first.

A common example might go like this. You have a module for each kind of thing you have in the database. But now if someone loads a Company, they need to get to Employees. And if someone loads Employee they need to get to Accounting for the salary, payments, etc. And Accounting needs to be able to get to Company.

Those are all large and complicated enough that it makes sense to make them modules. But you just created a circular dependency!

The standard solution is to load a base library that loads all kinds of objects so they all can assume that all the others already exist and don't need a circular dependency. But of course someone won't like wasting all that memory for things you don't need and ...


> But you just created a circular dependency!

Only if those things not only need to be able to “get to” each other, but also need to know, at compile time, about the concrete implementation of the others.

That can be a real need, but it's also something that often happens because of excessive and unnecessary coupling.


The coupling that I described needs to be in the software because it exists in the real world that the software is trying to describe.

However your "compile time" point is important. There is another solution, which is to implement lazy loading of those classes.

So you put your import in the method of each that needs to know the other. This breaks the circular dependency and avoids the up-front memory cost. However it can also become a maintenance issue where a forgotten import in one function is masked by a successful import in another, until something changes the call and previously working code mysteriously goes boom.

It's all tradeoffs.
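
A sketch of that deferred-import approach, with hypothetical module and method names:

    # company.py
    class Company:
        def employees(self):
            # Deferred import: resolved when the method is called, not when
            # company.py is first loaded, so there is no module-level cycle.
            from employee import Employee
            return Employee.for_company(self)


    # employee.py
    class Employee:
        @classmethod
        def for_company(cls, company):
            ...  # look up the employees of `company`

        def employer(self):
            from company import Company  # same trick in the other direction
            ...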


> What is the purpose of modules?

So parts of the system can be managed independently.

> The answer to the second is the answer to the first.

Clearly not - since circular modules cannot be managed independently!


To me the purpose of modules is to help humans manage code. Our brains don't hold much at once, so the more we can forget about in a given circumstance, the easier it is. So I think btilly is correct: the reason I want modules, ignoring things I don't care about, is the same reason I want them to deal reasonably with circular references.


Exactly.

In the example that I gave, the design described will handle complex edge cases such as a part time employee working for multiple companies. And will do so without programmers having to think through the whole system at all points.

Independence of modules has no importance in a codebase that ships as a whole. But modularity does.


In my experience they creep in over time as the system grows. Coupling between parts of the system that was previously unnecessary is added, and the cycles form.


I don't think circular dependencies are usually purposeful, but you can end up with circular imports after refactoring, for instance.

The common approach to solving this is pulling everything that is used by all the modules into leaf libraries, effectively creating a directed acyclic graph, but this is not obvious nor easy to do the first time.


A Customer has a BillingContact, which references a Person, which has primary Customer.

Boom, circular dependency.

Happens in basically all corporate code bases that grow over the years, with varying path lengths.

Throwing all potentially circular types into one big module isn't a great solution.

(In practice, we tend to rely on run-time imports to make it work. Not really great, but better than throwing several 10k or 100k lines of code into a single module).


I guess I don't get why you'd want to use separate modules if you aren't getting the benefits of modularisation?


I still get some benefits of modularisation, even if there are some cross-dependencies.


Suppose you have two classes, A and B. They are sufficiently complex to merit their own modules.

Suppose you have some method of A which does something special if it gets an instance of B, and vice versa. Now you have a circular import problem; glhf


I generally solve this problem by having a module specifically containing the abstract base classes of each of the classes I will be working with that implements either no or bare minimum functionality for these objects. That way, any other module can import this module and have visibility of every other class I will be working with.


Won't this work just fine if you instead of writing:

  from a import A
write

  import a
and in the code (which is presumably not at module level) check against

  isinstance(obj, a.A)

?


> Now you have a circular import problem; glhf

But why not collapse into a single module at this point if you can't avoid the dependency? What are the separate modules adding from this point on?


You can, but sometimes it’s not ideal.

I ran into this when A and B had many derived classes. I wanted to put A and its derived classes in one module, and B and its derived classes in another. It was messy.

I wound up putting A and B in a single module and having a separate one for the derived classes. Not ideal.


It does sound ideal to me, or at least better than the initial proposal.

A and B both need to know about the other's base definition, neither cares about the details about the other's derived classes. Splitting it into three modules shares as little surface area as possible.


> Suppose you have some method of A which does something special if it gets an instance of B.

While that’s in rare circumstances the right thing to do, it's mostly an anti-pattern—you should be taking an object supporting a protocol, with the behavior difference depending on a field or method of (or actually implemented in a method of) that protocol. If you do that, you don't create a dependency on a concrete class that happens to require the special behavior.


IMHO that's code smell. Modules shouldn't depend on each other, because that creates a web of tangled dependency where you have to understand everything before you can understand one of them. Circular dependency is to modules what goto is to control flow.

Besides, if you are in a "well, fuck it, deadline is tomorrow" mode, you can always do something horrible like:

    if 'classB' in type(obj).__name__: ...


I think bad code gives rise to more dependencies in general, and so to circular dependencies.

But the truth is sometimes it has happened to me, and the only solution I found was creating a small module with maybe one or two functions, which is not exactly ideal.


In my experience, typically a codebase that has grown organically in a different direction than the original design, and where the cost of refactoring is not deemed worth it.


The purpose is you want to use code in other modules.

If you keep doing that for a while, a circular dependency will happen.

This is the dumbest thing in Python. All other languages I know have solved it.


I've never had a circular dependency I couldn't resolve. Just organize your modules into a DAG. I have a pretty standard flow:

- protocols and abcs. Depend only on builtins, this sits at the top of the hierarchy and can be used anywhere

- top-level types/models, data structures. Only depends on external libraries

- config and logging. Depends on pydantic, which handles all init config validation, sets up log level. Many things import this

- specific class implementations, pure functions, transforms. imports from all the above

- dependency-injected stuff

- application logic

- routers and endpoints

I've had some close calls since I type check heavily, but protocols (since 3...6? 3.7 for sure) are a fantastic tool for both structuring and appeasing mypy.


> The purpose is you want to use code in other modules.

So put them in the same module? Circular modules don’t give you the benefits of modules, do they? Not an expert in modularity.


Putting all your code in one file does indeed solve all import problems, but creates far bigger ones.

In case you didn't know, each source code file is a module in Python.


If you have A -> B -> C -> A, then you need all three of A, B and C to be defined if you're going to use any one of those modules. When that's the case, the only thing you're gaining by using modules is the organisation of the file.


Sure, but organizing code in files is one of the most important parts of programming!


Probably half of the commenters here know this, but since we're here, this is my go-to boilerplate for starting a python script. (Probably won't work on Windows.)

    #!/bin/sh
    # Run the interpreter with -u so that stdout isn't buffered.
    "exec" "python3" "-u" "$0" "$@"

    import os
    import sys
    curdir = os.path.dirname(os.path.realpath(sys.argv[0]))
    # Add enough .. to point to the top-level project directory.
    sys.path.insert(0, '%s/../..' % curdir)

    # ... your main program starts here ...


> # Add enough .. to point to the top-level project directory.

This suggests that there is more than one entry point to the Python project?

While I'm sure there are good reasons for this, and while I'm not criticising your instance of this specifically, as a general point of advice I've found this sort of thing to be a bit of an anti-pattern.

Having one entry point that handles things like path setup and other global concerns before delegating out to subcommands (or whatever construct works best) makes it much easier to keep the whole codebase aligned in many ways.

Django has a system for this and while it has its flaws, it is nice to have it. Using this, on our main Python codebase of ~400k lines, we have a single manual entry point, plus one server entry point. Coordinating things like configuration, importing, and the application startup process, are therefore essentially non-issues for us for almost all development, even though we have a hundred different operations that a dev can run, each of which could have been a separate tool like this.


I have been a huge fan of this approach, too!

For additional bonus points, have your single entry point expose a CLI that properly documents everything the developer can do, i.e. what features are available, what environment variables and config flags can be set etc. That way, the code essentially documents itself and you no longer have to keep your README file updated (which people always tend to forget).


I use a very similar flow. The highly-opinionated-yet-effective pattern I use involves pydantic, cleo, and entry_points/console_scripts in setup.py.

- everything is structured as a module

- options and args are stored in their own module for easy reuse

- the whole stack has one cleo.Application, with however many subcommands. Usually of the form "mytool domain verb" e.g. "mytool backend start."

- cleo args/options are parsed into pydantic objects for automatic validation (you could do this with argparse and dataclasses to skip the deps but it's more work)

- each subcommand has a `main(args: CliArgsModel)` which takes that parsed structure and does its thing. This makes it super easy to unit test

I install into a venv with `poetry install` or `pip install -e` for editable installs.

It all just works, no fuss, and so damn modular.


There's no well known module to do most of this for you? In perl, the recent canonical way is to use the FindBin module to find the current binary's running dir, and the local::lib module to set the import path (or just use lib for older style library dirs). That always seemed cumbersome to me at 2-3 lines that weren't very clean looking.

Also, say what you will about Perl and esoteric global variables, but it's kinda nice to be able to toggle buffered output on and off on the fly. Is there really no way to do this in python without re-executing the script like that?


Ya... if you're trying to get the path of the script, you can use `__file__` special variable (instead of loading it from bash $0 and grabbing sys.argv[0]).

For adding current directory to front of path, the sys.path.insert() call is a pretty sound way of doing it.


Yeah, I think you're right - didn't know about __file__.

(To clarify, using "$0" with bash is just standard method to invoke the same script with an interpreter - sys.argv[0] will work with or without bash exec part.)


That's crafty; it reminds me of the suggested similar tactic with tclsh since tcl honors backslashes in comment strings(!): https://wiki.tcl-lang.org/page/exec+magic


Wouldn't this part:

    #!/bin/sh
    "exec" "python3" "-u" "$0" "$@"
Be the same as this?

    #!/usr/bin/env python3 -u


One difference is that parameters in the shebang are non-standard and not supported on all OSes. Linux supports it, though.


And even then I think only one parameter is supported. Tripped me up once.


Why not just add a shebang, chmod +x, and then you're done?


    #!/usr/bin/python
This is sometimes what you want, but it will always look at this exact path, and won't play nicely with virtualenv/conda.

    #!/usr/bin/env python
This works - it will use Python found in $PATH. Unfortunately you can't add any more parameters to the interpreter.

The contraption I wrote allows adding arbitrary parameters - I was burnt one too many times by Python silently buffering my debug messages, so I use it to always add "-u".


Reading `man env` in Linux shows that you can use:

  #!/usr/bin/env -S python $ARGS
For example:

  #!/usr/bin/env -S python -i
  
  print("Entering interactive mode...")
I'm not sure if it works in other OS.


I'm just grateful I'm not a core Python dev after reading this thread. I've never seen so much negativity concentrated in one place in quite some time, for a feature of a programming language which is fairly innocuous.


I love Python, but to be fair, the dependency/import system has not aged well, and the various projects trying to fix it are proof of that.


They're not trying to fix the import system.


Python is the language everyone at HN loves to hate. One presumes it has something to do with the fact that it's Y Combinator's recommendation for most startups.


Despite using python for the past 4 years, it still takes me several tries to set up packages and imports correctly when I make them myself. Honestly, I wish that python had an import system similar to JS (where you can just say “I want this file” <insert path to file> and specify the exports yourself). For me, it just feels more intuitive and less “magic”-like when dealing with custom scripts you want to import.


I have seen way too many `ModuleNotFoundError`s. It is moderately infuriating when the two files are in the same directory and python can't find the module.

Honestly that error is misnamed. It should be `ModuleImportRefusedError`.

And the frustration caused by getting PyTest to work in a project is likely responsible for a large percentage of the untested python projects in the world...


I’ve not had either of these issues. Maybe a detail left out?


the first occurs if you haven't created __init__.py.

the second occurs if you want your test files outside of the package directory (e.g., project/src/foo_app/foo.py, project/test/test_foo.py)


Years ago, I thought I'd try learning Python, I'd heard it was supposed to be easy, good for beginners and everything. I read one of those beginner Python type books and followed along with a roguelike tutorial. Everything was going pretty alright, until I started trying to split everything into different files and use imports.

I ended up just giving up. I read programming in lua, and rewrote my entire project in lua and actually finished it.

Some day I'd like to go back and maybe learn Python but I really didn't enjoy my experience with it. I even found C headers easier to figure out than Python imports.


Strange, Python’s import is not very difficult to use.


It's one of those things people who have used Python for a long time forget about: some people try to run individual files and cd around the place. I never do that anymore. I have a test suite and breakpoints and that's it. But before you've learned those tools it feels natural to "run that file there" and then say "oh hey why doesn't it work any more?"


It has some wildly frustratingly unintuitive behaviours in precisely the wrong place for beginners: in between having everything in a single script and building a proper package, especially when you are invoking your script with 'python script.py' as opposed to say 'python -m scripts.script'.


Yeah. Start writing a program in 'myprogram.py'; as things grow, do the right thing and split a function out to its own file and import it. It doesn't work. Suddenly you need to learn a whole bunch about python modules and the import system and scripts vs modules, and some of the questions you have just literally have no good answer.


Import by module:

    ⏵ ls
    module.py  script.py

    ⏵ cat module.py 
    def foo2():
        print('foo2 called.')
        
    ⏵ cat script.py 
    import module

    def foo1():
        print('foo1 called.')

    foo1()
    module.foo2()

    ⏵ python3 script.py
    foo1 called.
    foo2 called.

Import by name:

    ⏵ cat script.py 
    from module import foo2

    def foo1():
        print('foo1 called.')

    foo1()
    foo2()

    ⏵ python3 script.py
    foo1 called.
    foo2 called.


I moved module.py into a modules folder to clean things up and now I get:

"ImportError: attempted relative import with no known parent package"

Looks like I'm back to having to learn a bunch of stuff about scripts and modules again?
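
FWIW, one way out of that particular error, assuming the layout is script.py next to a modules/ folder: keep the import absolute and treat the folder as a package.

    # Layout:
    #   script.py
    #   modules/
    #       __init__.py    (can be empty; optional in Python 3, but explicit is nice)
    #       module.py
    #
    # script.py -- the script's own directory is on sys.path, so an absolute
    # import naming the folder works. The relative form (`from .modules import ...`)
    # only works inside a package, which a directly-run script is not.
    from modules.module import foo2

    foo2()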


>python3 script.py

I was using Python 2 at the time. Python 3 was still relatively new. Not sure how much difference it makes for your example, but the import systems are different between 2 and 3.

https://nerdparadise.com/programming/python/import2vs3


It was probably something I did. The original tutorial I followed had everything in one file and didn't get into anything about imports. I started splitting everything up arbitrarily and started tossing imports into the files that complained about missing dependencies and ended up getting overwhelmed because nothing worked.

I'm sure if I'd taken the time to try and fix it I eventually could have and at this point i've had more experience with a bunch of different languages, so I'm sure it's not as bad as I remember.

I imagine it's one of those cases where, if I were to go back, I'd laugh about how stupid I was, but ya know, those first impressions.


No, your impression was right. Reading this blog post made me realize how little I know about the python import system (and I use python daily), and at the same time how little I want to learn it. It is completely unintuitive and probably one of the worst aspects of otherwise beautiful and useful language. Fortunately, sys.path hack works reliably - one can just add that one line and imports work as expected.
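
For reference, the hack in question is usually some variant of this (a common convention rather than anything official; adjust the number of dirname() calls to point at your project root):

    import os
    import sys

    # Put the parent of this file's directory (the project root, in a typical
    # layout) at the front of the module search path before project imports.
    sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))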


There's only about three things to learn at first.

If you know how a path and relative path work from the shell, there is only one thing: touch __init__.py.

These are on the import doc page and beginners tutorial.

https://docs.python.org/3/tutorial/modules.html#packages


I've been programming in Python professionally for more than five years. I consider myself quite a good programmer. I still don't have good grasp on Python's import system. Does anyone else have similar experience?


When in doubt, put some more dots in front of your import statements. Or remove them? Maybe I need an extra __init__.py somewhere? Oh I'm importing from a subfolder, what do I need to do to get that to work again? I can't remember.


This


I'm a python readability approver at Google and I don't understand how the import system works


Interesting... If someone wants to know more about what a readability approver does: https://www.pullrequest.com/blog/google-code-review-readabil...


To be fair, Google's python avoids ~99% of the complexity of Python's import system by making all imports absolute and doing most things through blaze/bazel.


I’ve only been writing Python for about a year, but I’ve found it much harder to grasp how dependency resolution and imports work than other languages I’ve picked up (JVM, Node, Go, C).


Same. Occasionally I will get into some sort of mess, learn how it works under the hood enough to get myself out, and then promptly forget everything.

And I think that's for the best. I'd much rather have a happy path that I stay on than use some sort of dark magic that nobody who comes after me will understand.


I have >10y python exp. Vanilla import system? I grasp it quite well.

- Entrypoints? Getting there.

- Namespace packages? Ehh. Murky

- site-packages/mypackage.pth - I get it but I don't know why sometimes it appears and other times not

- c extensions? Ehhh.

- .so loading? Kinda magic.

- the confluence of editable installs, namespace packages, foo.pth, PYTHONPATH, sys.path, relative imports, entry points, virtualenvs, LD_LIBRARY_PATH, PATH, shell initialization, setup.py vs pyproject.toml? Um yeah that's some heavy wizardry.

Tbf, you don't need the vast majority of that to be effective in python.


The average python programmer does not really need to deal with Python's import system that much (just be aware of how it does its module loading and that you can conditionally do stuff sometimes with __import__ etc). As someone who has messed around with it a lot (dynamically loading/unloading modules, modifying on the fly etc) I would NOT recommend doing that stuff for anything in production.


I've actually found the Python import documentation [0] to be really approachable. I'd give it a look if this article didn't clear things up.

[0]: https://docs.python.org/3/reference/import.html


I sometimes fall into the trap of using pip to install dependencies and then things break after an os update. That is, my python version has changed from 3.8 to 3.9 and my dependencies are sitting in the wrong directory. I never know if I should use pip and requirements.txt or rely on Ubuntu's packaged versions.


I used to have a rule of, never use pip and only use Ubuntu's/Debian's packaged versions. That works pretty well if you're happy with the packaged versions and you don't need unpackaged libraries.

I now have the rule of, only ever use pip inside a venv. If your venv is more than a little bit complex, write a requirements.txt file so you can generate it. So it's something like

    $ cat > requirements.txt << EOF
    tensorflow
    Django==3.2.5
    cryptography
    EOF
    $ echo venv > .gitignore
    $ python3 -m venv venv
    $ venv/bin/pip install -r requirements.txt
    $ venv/bin/python
or, if you prefer,

    $ . venv/bin/activate
    (venv)$ pip install -r requirements.txt
    (venv)$ python
Then when your Python version changes, or you get confused about what's installed, or whatever, you can just blow away the entire venv and recreate it:

    $ rm -r venv
    $ python3 -m venv venv
    $ venv/bin/pip install -r requirements.txt
and you're in a known-good place.

Either of these rules works fine. The thing that works poorly is using pip install outside of a venv (with or without root).


For me the rule is to always use pipenv locally and pip + requirements.txt (generated by pipenv) for production (in docker container usually). No complaints.


Just pip install again under the new interpreter. Use distro versions if a script will primarily run there and an older version is OK.


same! I've always been shielded from it with Django's conventions. (the ecosystem I mainly work in). I used a lot of '.' and '..' imports but I think something changed in python3 that made that strategy a lot less forgiving... now I _really_ should read the entirety of this article!


Always use absolute imports, do not use relative imports. Solves most problems. Also is recommended by pep8.

Skip the relative imports in the article. No need to read it entirely anymore ;-)


I thought not knowing that made me a noob.


I created this fun hack that taught me a LOT about the import system. Basically it allows you to import anything you want, even specify a version, and it will fetch it from PyPI live. Might be interesting to flesh this out in a way that's deployable.

Basically instead of

    import tornado
and hoping and praying that the user has read your README and pip installed the right version of tornado, you can do

    from magicimport import magicimport
    tornado = magicimport("tornado", version = "4.5")
and you have exactly what you need, as long as there is an internet connection.

https://github.com/dheera/magicimport.py


Cool! But why not just make the code a package with a version dependency on tornado?


Because then you'd still have to install the package.

It's nice to have something that "just works" without having to install it. I like to call it Level 5 autonomous software -- software that can figure out how to run itself with zero complaints.

I actually use this for a lot of personal non-production scripts, I can just clone the script on any system and just run it, it will figure itself out.

Also, packages with version dependencies fuck up other packages with other version dependencies, unless you set up virtualenvs or dockers or condas for them, and those take additional steps.

magicimport.py uses virtualenv behind the scenes, so when it imports tornado 4.5 because some script wants 4.5, it won't mess up the 6.0 that's already on your system. In some sense it just automates the virtualenv and pip install process on-the-fly, to give the effect that the script "just works".


There's a very cool (and succinct) blog post[1] showing how to abuse this in an interesting way where you can put the code of a module into a string and load it that way.

Not something I'd use in production, but it's a very clear way to see how both "finding a module" and "loading a module" works under the covers.

[1] https://cprohm.de/blog/python-packages-in-a-single-file/
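
The core trick (not the post's exact code, just the stdlib pieces it builds on) fits in a few lines:

    import sys
    import types

    source = 'def greet():\n    return "hello from a string"\n'

    # Make a fresh module object and exec the source into its namespace.
    module = types.ModuleType("stringmod")
    exec(compile(source, "<stringmod>", "exec"), module.__dict__)
    print(module.greet())  # -> hello from a string

    # Optionally register it so that `import stringmod` works elsewhere.
    sys.modules["stringmod"] = module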

Edit: As an aside, I much prefer the way Perl does things in this space. It's much easier to define multiple packages either within a single file or across different files, and much clearer about what's happening underneath.


But despite being terrible the Python import system is remarkably easy to get started in. And generally easy for beginners to work with (put all the code in the same folder or "pip install").

There are some lessons here that other languages would do well to learn. Trouble importing 3rd party libraries must be a kiss of death for beginner engagement.


I disagree. I find it hard to get started in Python. There are loads of package managers, so I don't know which one to pick. There are multiple different rules for how imports work with local files. The standard library is full of functions that one should not use any more, and you need to know which is which. Defining a main entrypoint consists of checking a magic __name__ constant.
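
(That is, the familiar boilerplate, sketched here with a made-up main():)

    def main():
        print("running as a script")

    # Runs only when the file is executed directly, not when it is imported.
    if __name__ == "__main__":
        main()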

You have internalised all these quirks, and know how to work with/around them. Beginners haven't.


Counterpoints:

Start with pip, the official package manager. You don't have to start with virtual envs: I didn't.

I haven't run into any function which I "should not use".

You don't need an entry point, you can just write code in a file. Besides that, I don't see how much it differs from other languages with implicit entry points, where you need to match a certain name in your function.

Imports: yes, those are annoying and confusing. I still struggle with those.

Additionally: package managers and virtual envs are a pain in the ass. Every year we're getting a new one which is supposed to solve the problems from the previous one, but doesn't, and the cycle goes on. The language should really solve this at the core instead of requiring community fixing, as it is a core part of any serious development.


Honestly, I have had very few issues with poetry, and the ones I've had were because I'm trying to use plugins, which are alpha right now.

Dependency resolution just works. Editable installs just work. Building just works.

Before that, I only had problems with virtual envs once, and it was due to bad hygiene with system Python libs and deps. Moral of the story: don't. Unless it's necessary to bootstrap virtualenv or compiled libs, don't install packages into the system Python. Good ol' get-pip.py and virtualenv.


I disagree too... you can't even refer to a file in the same folder without using some magic (adding the current folder to the path), which is a huge barrier when starting.


The current folder is on the path by default (more precisely, the directory of the script you're running is; in an interactive session it's the current working directory).
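
A quick way to check what actually ends up there:

    import sys

    # sys.path[0] is the directory of the script being run,
    # or '' (meaning the current working directory) in an interactive session.
    print(sys.path[0])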


I vividly remember the frustration when trying to make some very simple application imports working. There's a couple of gotchas, the biggest one being that you need to write imports based on where you expect to run your applications from. Perhaps obvious for experienced python devs, very much surprising for newcomers.

And then I was quite shocked by the state of package managers in Python. You need to learn pip and venv (with references to "virtualenv"); these are too low level, so you find pipenv, which is unbelievably slow (installing a single dependency can take 10 minutes), so you learn to use it with "--skip-lock", but then you lose locking ...

I've never appreciated node's bundled "npm" so much before which mostly "just works".


Poetry is pretty great comparatively, since it handles dependencies, locking, and virtual environments for you, but it has slow resolution just like pip (since the new update), pyenv, pipenv, etc.


No. Python is a language I have to use infrequently, but half of the time I give up on using a project found on GitHub because of missing dependencies, the need to install a package manager to install another package manager to install dependencies, etc. The other day I spent some time fixing a docker image that was working fine a few months ago but was now failing because some Python package install returned an error.

On the contrary, C projects tend to build with 3 commands and C# projects (often way bigger) with a single command, and without having to do magic things around "virtual environments".


One of the reasons I gradually fell out of love with Python. To get Python right you need to remember more protocol than Queen Victoria's master of tea. And it is truly protocol in the sense that there is always this arbitrariness hanging around it.


What do you use now?


Python, hehe. But I just don't like it that much anymore.


Hi! I'm the author of this article. Thanks for posting it.

I've been programming in Python for quite a while but didn't really understand how the import system works: what modules and packages are exactly; what relative imports are relative to; what's in sys.path and so on. My goal with this post was to answer all these questions.

I welcome your feedback and questions. Thanks!


It's certainly a lot better than Ruby's require which just executes code and alters global virtual machine state. Not too different from C's #include.

My favorite is Javascript's. Modules are objects containing functions and data, require returns such objects. Simple, elegant. Completely reifies the behind-the-scenes complexity of Python's import, making it easily understandable.


Though that's not how JS's new module system works, which I would say is also elegant in its own ways, but juggling two different systems is less so


I have no idea why they added a module system to the Javascript language itself. The old one was so awesome. Not sure what advantages the new system offers.


Lua does the same thing as JS. It's basically "dofile" with some indirection and extra path resolution logic.


Slightly off-topic:

One of the simplest import systems I've seen were in q/kdb (like with most things in that language, everything is as simple as possible)

Imports work by simply calling `\l my_script.q`, which is similar to taking the file `my_script.q` and running it line by line (iirc, it does so eagerly, so it reruns the entire file whenever you do `\l my_script.q`, even if the file has been loaded before, which may affect your state; by contrast, Python `import` statements are no-ops if the module has already been imported).

The main disadvantage is that you risk your imported script overwriting your global variables. This is solved by following the strong (unenforced) convention that scripts should only affect their own namespaces (which works by having the script declare \d .my_namespace at the top).

I never found this system limiting and always appreciated its simplicity - whenever things go wrong, debugging is fairly easy.

What does Python gain by having a more sophisticated import system?


Python avoids having to rerun the import over and over again.

Suppose that you are importing a larger project, where your one import (say, of your standard database interface) pulls in a group of modules (say, one for each table in your database), all of which import a couple of base libraries (to make your database objects provide a common interface) that themselves pull in common Python modules (like psycopg2), which in turn do further imports of standard Python modules.

The combinatorial explosion of ways to get to those standard Python modules would turn the redundant work of parsing them into a significant time commitment without a caching scheme.
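
You can see the cache directly: after the first import, later imports of the same name are just a lookup in sys.modules.

    import sys
    import json

    print("json" in sys.modules)              # True: the module object is cached
    import json as json_again                 # no re-parsing, no re-execution
    print(json_again is sys.modules["json"])  # True: both names refer to the same object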

From the point of view of a small script, this kind of design may seem like overkill. But in a large project, it is both reasonable and common.


Yep. Rerunning things is how you end up with C/C++ style headers and all the (performance and otherwise) problems therein.


Python imports aren't necessarily always importing Python source code -- they can be pyc (bytecode) files, or C API extensions, etc.

These are slightly more complicated than "load the script at this path".
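
You can see the recognized file types directly (the extension suffixes vary by platform and Python build):

    import importlib.machinery as machinery

    print(machinery.SOURCE_SUFFIXES)     # ['.py']
    print(machinery.BYTECODE_SUFFIXES)   # ['.pyc']
    print(machinery.EXTENSION_SUFFIXES)  # e.g. ['.cpython-39-x86_64-linux-gnu.so', ...] on Linux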

There's probably a more detailed answer, in that historically, decisions were made that we're now stuck with. Python packages can, and sometimes intentionally do, have import-time side effects, for example. They must only be run once, without relying on convention, or we break existing code.


Everything is an object, and it “just works” for most purposes. (Not nearly as many as I would like, but still; it has backwards-compatibility to consider, so I'll cut it some slack.)

If you need to start installing your own user packages, you need `pip` and then `venv` and then things get ugly, but for the usual case where the sysadmin deals with all that (or you're on Windows), it works quite well.


Carl Meyer, a Python developer at Instagram, has fairly recently discussed open source contributions to Python that he has worked on.

He specifically describes “strict modules” and efforts to improve the efficiency of Python imports at 16:30 of this episode of Django Chat:

https://django-chat.simplecast.com/episodes/django-instagram...


My least favorite is that import ordering matters in some situations. Like if I just run "organize imports", all of a sudden dependency cycles pop up and everything is broken. Certainly a sign of things being misimported/misorganized, but stuff happens as systems grow fast. And solving these issues is always incredibly time consuming.

Unless anyone knows of magical tools to help solve import issues?


Yes, it is unfortunate. Loading modules can have side effects as the loaded module is allowed to execute arbitrary code at load time. This is also a source of ordering issues.

Maybe some think this is only a theoretical problem and doesn't happen with "well-written" libraries. Well, here is one example which bit me in the past: https://stackoverflow.com/a/4706614/767442
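
A minimal sketch of that class of problem, with made-up module names (the same shape as the matplotlib backend issue linked above):

    # config.py -- module-level state, set up as an import-time side effect
    settings = {"backend": "default"}

    # plugin.py -- reads that state once, at import time
    import config
    BACKEND = config.settings["backend"]

    # main.py -- whether plugin.BACKEND is "agg" depends entirely on import order
    import config
    config.settings["backend"] = "agg"
    import plugin   # sees "agg" only if nothing imported plugin before this line ran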


> Unless anyone knows of magical tools to help solve import issues?

Topological sorting? Always wondered why programming languages can't do what package managers do.


The breakage described can only occur if there are dependency cycles, so topological sorting can't fix it. If there are no cycles, then the order of imports doesn't matter.

edit: Actually I'm not even sure what kind of error we're talking about here. If two modules import each other and they both need access to the other's contents upon initialization, there is no ordering that will work. And if at most one needs access to the other, it will always work, no matter in which order they are imported. So I don't really know what the OP was talking about.


One interesting use case of overwriting `builtins.__import__` I've encountered was the automatic hooking by ClearML [0] (experiment tracking, ...) into all sorts of common libraries like Matplotlib, Tensorflow, Pytorch, and friends.

The implementation is surprisingly straightforward once you've come to terms with the basic idea; see [1] and the rest of the `clearml.binding` package.

[0]: https://clear.ml [1]: https://github.com/allegroai/clearml/blob/master/clearml/bin...
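
The general shape of such a hook, as a rough sketch (not ClearML's actual code; the matplotlib check is just an example target):

    import builtins

    _original_import = builtins.__import__

    def _hooked_import(name, globals=None, locals=None, fromlist=(), level=0):
        module = _original_import(name, globals, locals, fromlist, level)
        # Patch interesting libraries as they get imported
        # (real code would guard against re-patching).
        if name == "matplotlib":
            pass  # e.g. wrap plotting calls to report figures somewhere
        return module

    builtins.__import__ = _hooked_import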



This is so needed. I feel like a lot of people know enough about python imports, but everyone has pains with it.


I am so glad this post exists


Badly


"badly"



