Hacker News

I really hope a solution can be found and Python can get true lazy imports (without doing the manual thing). I work in the ML/AI space, and manual lazy imports are basically mandatory unless you want to wait several seconds just to see the usage text from `footool --help`. Why "pay" for executing code you don't use?

Manual lazy import meaning:

    def uses_foo(x):
        import foo
        return foo.bar(x)
Manual lazy import sucks because:

- it's just ugly, I like imports all at the top so I can see all the deps

- bad for static analysis

- performance hit every time the function is called

Eschewing lazy imports has several problems:

- you always pay the execution cost, even if you don't use it

- also bad for static analysis and testing, since you have to eat the import time even if the code block you want to test doesn't execute the expensive path

- sometimes you need lazy import to avoid circular import errors
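For the circular-import case, a toy sketch with two hypothetical modules `a` and `b`: deferring `b`'s import of `a` into the function body means it runs only after `a` has finished loading, which breaks the cycle:

```python
import os
import sys
import tempfile

# Write two hypothetical modules to a temp dir: a.py imports b at the top,
# while b.py defers its import of a into the function that needs it.
d = tempfile.mkdtemp()
with open(os.path.join(d, "a.py"), "w") as f:
    f.write("import b\n"
            "def ping():\n"
            "    return b.pong()\n"
            "VALUE = 'a'\n")
with open(os.path.join(d, "b.py"), "w") as f:
    f.write("def pong():\n"
            "    import a   # deferred: a is fully loaded by the time this runs\n"
            "    return a.VALUE\n")

sys.path.insert(0, d)
import a
assert a.ping() == "a"
```

If `b.py` instead did `import a` at the top level, importing `a` would hand `b` a half-initialized module and `a.VALUE` wouldn't exist yet.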

It's too bad that the main impediment to this is existing code that relies on side effects. Import-time side effects are an absolute pain in the ass. Avoid them at all costs.
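A toy illustration of why import-time side effects make lazy imports observable, using a hypothetical `noisy` module that mutates global state the moment it's loaded:

```python
import os
import sys
import tempfile

# Hypothetical module whose import has a visible side effect: it sets an
# environment variable during module execution, not inside any function.
mod_dir = tempfile.mkdtemp()
with open(os.path.join(mod_dir, "noisy.py"), "w") as f:
    f.write("import os\nos.environ['NOISY_LOADED'] = '1'\n")

sys.path.insert(0, mod_dir)
assert "NOISY_LOADED" not in os.environ
import noisy                       # the side effect fires at import time
assert os.environ["NOISY_LOADED"] == "1"
```

If an interpreter deferred that import, any code that depends on the side effect having already happened would silently break, which is exactly the compatibility problem.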




> - performance hit every time the function is called

Python modules are always singletons, regardless of where they are imported. "import foo" inside a function will only import the module once (=global effect) but bind the name every time (=local, but cheap, effect).
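A runnable sketch of that, using the stdlib `json` module as a stand-in for a heavy dependency: the function-level import loads the module once and every later call just rebinds the cached object from `sys.modules`:

```python
import sys

def uses_json(x):
    # The import statement only executes the module the first time;
    # afterwards it's a cached sys.modules lookup plus a name binding.
    import json
    return json.dumps(x)

uses_json({"a": 1})
first = sys.modules["json"]
uses_json({"b": 2})
second = sys.modules["json"]
assert first is second   # one module object, no matter how often we call
```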


With one exception -- whatever module is executing is named `__main__`, and if something else imports it by its usual module name you'll get a duplicate copy.


Example:

  $ cat spam.py
  import sys
  print(f'__name__ = {__name__!r}')
  for (name, mod) in sys.modules.items():
      try:
          if 'spam' not in (mod.__file__ or ''):
              continue
      except AttributeError:
          continue
      print(f'sys.modules[{name!r}] = {mod!r} @ 0x{id(mod):x}')
  import spam
  
  $ python3 -m spam
  __name__ = '__main__'
  sys.modules['__main__'] = <module 'spam' from '/home/jwilk/spam.py'> @ 0xf7d86ed8
  __name__ = 'spam'
  sys.modules['__main__'] = <module 'spam' from '/home/jwilk/spam.py'> @ 0xf7d86ed8
  sys.modules['spam'] = <module 'spam' from '/home/jwilk/spam.py'> @ 0xf7caef78
More generally, Python is happy to import the same file multiple times as long as the module name is different.

For example, if there's eggs/bacon/spam.py and you add both "eggs" and "eggs/bacon" to sys.path, you will have two different modules imported after "import bacon.spam, spam".
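A runnable sketch of that layout, built in a temp directory (Python 3 treats `bacon` as a namespace package, so no `__init__.py` files are needed):

```python
import os
import sys
import tempfile

# Create eggs/bacon/spam.py, then put both eggs/ and eggs/bacon/ on sys.path.
root = tempfile.mkdtemp()
bacon_dir = os.path.join(root, "eggs", "bacon")
os.makedirs(bacon_dir)
with open(os.path.join(bacon_dir, "spam.py"), "w") as f:
    f.write("VALUE = 42\n")

sys.path[:0] = [os.path.join(root, "eggs"), bacon_dir]
import bacon.spam   # resolved via eggs/
import spam         # resolved via eggs/bacon/

assert bacon.spam is not spam          # two distinct module objects
assert bacon.spam.VALUE == spam.VALUE  # ...executed from the same file
```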


But you're still paying the perf hit of calling a function and checking whether the module is already loaded.


It’s a fairly simple dict lookup. The same lookup would happen when you use something from that module, so it’s fairly insignificant in the grand scheme.

Besides, it’s Python. It’s not going to be super fast anyway. That extra check is never going to show up on a perf trace.
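A rough way to measure the repeated-import overhead yourself with `timeit` (numbers vary by machine and interpreter, so no claims here about what you'll see):

```python
import timeit

def lazy(x):
    import json          # cached after the first call: lookup + rebind
    return json.dumps(x)

import json

def eager(x):
    return json.dumps(x)

# The per-call difference is the cost of re-executing the import statement.
t_lazy = timeit.timeit(lambda: lazy(1), number=50_000)
t_eager = timeit.timeit(lambda: eager(1), number=50_000)
print(f"lazy:  {t_lazy:.4f}s")
print(f"eager: {t_eager:.4f}s")
```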


Yeah well there's never been a line of Python that's free


Given that modern ML libs can bring hundreds of megabytes of binaries with them, this is a great point. As for static analysis in Python, I am not sure it can get much worse :) The language is essentially built around the idea of making everything absolutely dynamic


Typing in Python is pretty thorough nowadays: you've got overloads, literal types, strict null checking, and unions.

It's a more advanced type system than Go's or Java's.


Typing is very surface level though and doesn’t negate what the person you’re replying to is saying: Python is über dynamic.

It's trivial to have something pass the type-hinting checks but be completely different at time of use. Even if you hold on to a single instance of an object, it's easy for anything to monkey patch it.
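A minimal sketch of that gap: the annotation says `greet` returns `str`, and every call site type-checks, but a runtime monkey patch makes the same call return something else entirely:

```python
class Greeter:
    def greet(self, name: str) -> str:
        return f"hello, {name}"

g = Greeter()
assert g.greet("world") == "hello, world"

# Any code holding a reference to the class can patch it after the fact;
# static analysis of existing call sites is none the wiser.
Greeter.greet = lambda self, name: 12345  # type: ignore[assignment]
assert g.greet("world") == 12345          # same object, new behavior
```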


One approach I've used, to try to reduce the performance hit, is:

    def uses_foo(x):
        # On the first call: do the lazy import, then rebind this
        # function's own global name to the real target, so later
        # calls skip both the wrapper and the import statement.
        global uses_foo
        import foo
        uses_foo = foo.bar
        return foo.bar(x)
While that performance suckage improves, the other suckages get worse.
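For reference, a runnable version of the same self-replacing trick, with `json.dumps` standing in for the hypothetical `foo.bar`:

```python
def dumps(x):
    # First call pays for the import, then rebinds this global name to
    # the real target so subsequent calls bypass the wrapper entirely.
    global dumps
    import json                           # stand-in for a heavy dependency
    dumps = json.dumps
    return json.dumps(x)

wrapper = dumps
assert dumps({"a": 1}) == '{"a": 1}'      # first call: import + rebind
import json
assert dumps is json.dumps                # the wrapper replaced itself
assert dumps is not wrapper
```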



