Python startup time: milliseconds matter (python.org)
662 points by vanni 7 months ago | 378 comments

I've always been disappointed by how large software projects, both FOSS and commercial, lose their "can do" spirit with age. Long-time contributors become very quick with a "no". They dismiss longstanding problems as illegitimate use cases and reject patches with vague and impervious arguments about "maintainability" or "complexity". Maybe in some specific cases these concerns might be justified, but when everything garners this reaction, the overall effect is that progress stalls, crystallized at the moment the last bit of technical boldness flowed away.

You can see this attitude of "no" on this very HN thread. Read the comments! Instead of talking about ways we can make Python startup faster, we're seeing arguments that Python shouldn't be fast, we shouldn't try to make it faster, and that programs (and, by implication, programmers) who want Python startup to be fast are somehow illegitimate. It's a dismal perspective. We should be exercising our creativity as a way to solve problems, not finding creative ways to convince ourselves to accept mediocrity.

This isn't an attitude of "no" - it's an attitude of "yes" to other things. The arguments are that making Python startup fast makes other things worse, and we care about those other things.

Here are some other things we can say "yes" to:

- Rewrite as much of Mercurial in Rust as possible, which will provide performance improvements well beyond what Python can possibly offer. https://www.mercurial-scm.org/wiki/OxidationPlan

- Spend resources on developing PyPy, which (being a JIT) has relatively slow startup but much faster performance in general, for people who want fast performance.

- Write compilers from well-typed Python to native code.

- Keep CPython easy to hack on, so that more people with a "can do" spirit can successfully contribute to CPython instead of it being a mess of special cases in Guido's head.

Will you join me in saying "yes" to these things and not convincing ourselves to accept mediocrity?

I have to note that none of the projects you suggested, all of which are good and useful, will do anything to address the CPython startup latency problem under discussion. Why shouldn't CPython be better?

There's also no reason to believe that startup improvements would make the interpreter incomprehensible; the unstated assumption that improvements in this area must hurt hackability is interesting. IME, optimizations frequently boost both simplicity and performance, usually by unifying disparate code paths and making logic orthogonal.

I think you misunderstood the point. These weren't things that would address the cpython startup problem - these were other priorities that can be worked on, instead of (or in addition to) the latency problems under discussion.

Saying yes to fixing one thing usually means saying no to all the other things you can be doing with your time instead. Unless you're lucky and can "kill 2 birds with 1 stone".

> - Write compilers from well-typed Python to native code.

That is one thing I really want to happen, because I think it begins to open up python to the embedded space. Micropython is nice, but it still needs an interpreter embedded into it.

There have been plenty of attempts to compile Python to faster code (usually by translating to C).

Cython can use type annotations and type inference to unbox numbers for faster numerical code, but uses ordinary Python objects otherwise. http://cython.org

ShedSkin translates a restricted (statically typeable) subset of Python to C++. https://shedskin.github.io

RPython has a multi-stage approach where you can use full dynamic Python for setup, but starting from the specified entry point, it is statically typed. Since it was created for PyPy, it comes with support for writing JIT compilers. https://rpython.readthedocs.io

Pythran translates a subset of Python with additional type annotations to C++. http://pythran.readthedocs.io

In general, the major hurdle for all attempts to compile Python to native code is that Python code is dynamically typed by default.

A hurdle that has been solved for quite some time in Lisp, Scheme, Prolog, Smalltalk, SELF (which was the basis of HotSpot), Dylan, Ruby, and JavaScript.

None of them is less dynamic than Python.

What Python lacks is the funding and willingness to actually push one of those implementations to eventually become the new reference implementation.

Nuitka might be interesting for this use case, too. http://nuitka.net/index.html

It will compile python to C++ and then compile that code to machine instructions and calls to the CPython library.

Thanks for that rundown. It's pretty good.

> Why shouldn't CPython be better?

The point is that "better" is almost never a well-defined direction unless you only consider a single use-case. It's almost always a tradeoff, especially in a widely used project.

A language is a point on a landscape of possible language variants, and "better" is a different direction on that landscape for every user.

> cpython startup latency problem

The problem under discussion is that projects that currently use CPython have slow startup. One potential solution is for those projects not to use CPython. (Certainly it's not the only potential solution, but, a language that tries to be all things to all people isn't going to succeed. Python has so far done an extraordinarily good job of being most things to all people, with "I want native-code performance" being one of the few out-of-scope things.)

>IME, optimizations frequently boost both simplicity and performance, usually by unifying disparate code paths and making logic orthogonal.

I would really like to see some examples where this is the case. Optimizations in my experience have made systems more brittle, less portable and ultimately less maintainable.

Programs are fast when they don't force the computer to do much stuff, so speed overlaps (albeit imperfectly) with simplicity of code. Mostly this explains why programs start fast and then slow down as they cover more use-cases. But some optimisations also amount to simplifying an existing system.

For example: as you get to know your use-cases better you might simplify your code to sacrifice unwanted flexibility. Or you might replace a general-purpose data structure with a special-purpose one that is not just faster, but concretely embodies the semantics you desire.
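To make the data-structure point concrete, here is a minimal sketch. It is a hypothetical example (the function names are made up, not taken from any project in this thread): a list used only to answer "have I seen this before?" is swapped for a set, which states that intent directly and makes each lookup cheaper.

```python
def dedupe_with_list(items):
    """General-purpose version: a list doubles as the 'seen' record."""
    seen = []
    result = []
    for item in items:
        if item not in seen:      # linear scan on every lookup
            seen.append(item)
            result.append(item)
    return result


def dedupe_with_set(items):
    """Special-purpose version: a set means 'membership only' and looks up by hash."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:      # average constant-time lookup
            seen.add(item)
            result.append(item)
    return result


words = ["a", "b", "a", "c", "b"]
print(dedupe_with_list(words))
print(dedupe_with_set(words))
```

The flexibility given up here is that the set version only accepts hashable items. If that was never a real use-case, nothing of value is lost.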

A case that is not quite a simplification is removing code re-use. Instead of using a function in three different ways, you use three separate optimised functions. Now changes to one use case don't cause bugs in the others. That's the kind of thing that quotemstr meant by "making logic orthogonal".

>A case that is not quite a simplification is removing code re-use. Instead of using a function in three different ways, you use three separate optimised functions. Now changes to one use case don't cause bugs in the others. That's the kind of thing that quotemstr meant by "making logic orthogonal".

And which is what I mean by making systems more brittle, less portable and less maintainable.

You find a corner case in the original function that's not covered, now instead of fixing it in one place you need to fix it in three places with all the headaches that causes.

So the next maintainer thinks: "Gee I can fix this by bringing all these functions together".

So if they're using an oo language they make an abstract base class from which the behaviour is inherited, or a function factory otherwise.

So now you're back to a slow function with even more overhead, that's even harder to debug.

So the next maintainer comes around and thinks: "Gee I can speed this up if I break out the two functions that are causing 90% of the bottleneck".

Now you have 4 completely independent functions to keep track of.

Repeat ad nauseam.

I'm so longing for a Python(like) compiler.

MicroPython put together a Python in 250kb. Why the hell can't we make an LLVM frontend for Python that can use type hints for optimization? Sure, you lose some dynamic features as you optimize for speed, but that's the dream. Quickly write a prototype, not caring about types, then optimize later by adding types and removing dynamism.

I'm currently learning Racket and LLVM and I have about 70 more years to live. I'm gonna try make Python fast on slow weekends 'til I die.

Since you're after micro-controllers you might be interested in Nim. At this year's FOSDEM we showed off some micro-controllers running Nim code[1]. The language is definitely less Python-like than Cython, but it might just be similar enough for your use cases (I started using Nim as a Python replacement).

1 - https://twitter.com/nim_lang/status/959736268870639616

For people who don't want to bounce through twitter to get to the home page:


Unless your Python compiler can use cpython modules without a massive performance penalty, it's going to see very limited adoption. The ecosystem matters.

We had a chance with ctypes and now CFFI to move away from the platform calcifying cpython module interface that is overly coupled to the cpython runtime. I am very disappointed in the lack of affordances that cpython gives to alternative pythons to support their work. The stdlib is a crufty mess that is overly coupled to cpython as well. The batteries are corroded and need to be swapped out for a modular pack.

I don't think it would be hard to add type annotations to existing projects, though. Something similar happened in the JS world with the transition of JS projects to TypeScript and it wasn't a big deal, IIRC.

Isn't it what Numba does quite successfully for a subset of Python?

[1]: https://numba.pydata.org/

Not static AOT, not creating tiny binaries that fit on a microcontroller.

There is a lot of stuff out there that goes in this direction. There is Nuitka (again, no small binaries), and there is even a GCC frontend that can compile some minimal examples, but it was abandoned long ago.

Seriously, the time I spent researching this topic - if a proper compiler engineer would spend that on the actual compiler, it'd be done by now.

Seeing as that page states the resulting executables still require numpy, I'd guess it was the static requirement that it misses.

> I'm so longing for a Python(like) compiler.

There is Cython, of course, which is a great python (+) compiler. I assume you mean a just-in-time compiler as opposed to a static compiler.

No. Pypy already does JIT. And Cython has the speed, but not the size.

The target of my fantasy compiler is microcontrollers. That's why the (correct) comment of quotemstr isn't that big of a concern to me.

Just throwing Python code at Cython doesn't really improve performance all that dramatically, because it'll be doing pretty much the same thing as the bytecode interpreter, except that all the saucy special-cases are now unrolled many times across all code.

You might be interested in Matthew Might's course on compilers. In one of his courses he targeted a Python compiler written in his weapon of choice, Racket.

> PyPy, which (being a JIT) has relatively slow startup

    > time pypy -c 'print "Hello World"'
    Hello World
    pypy -c 'print "Hello World"'  0.08s user 0.04s system 96% cpu 0.120 total

    > time luajit -e 'io.write("Hello World!\n")'
    Hello World!
    luajit -e 'io.write("Hello World!\n")'  0.00s user 0.00s system 0% cpu 0.002 total

Sometimes I wonder why we're not all using Lua instead of Python. Lua seems to get a strange amount of hate in some circles, but I've found both Lua and Python to be reasonably pleasant languages to work with.

From my personal account about Lua [1]:

> Three, the language is not a mere combination of syntax and semantics. Any evaluation should also account for user bases and ecosystem, and in my very humble opinion Lua spectacularly fails at both. I'm not going to assume the alternative reality---Lua has a sizable user base and its ecosystem is worse even for that user base.

> [...] The lack of quality library also means that you are even more risky when you are writing a small program (because you have less incentive to write it yourself). I have experienced multiple times that even the existing libraries (including the standard ones) had crucial flaws and no one seems to be bothered to fix that. Also in the embedded setting the use of snippets, rather than proper libraries, are more common as libraries can be harder to integrate, and unfortunately we are left with PHP-esque lua-users.org for Lua...

I think this critique still holds today, and unless a miracle happens (like D), I doubt this is fixable.

[1] https://news.ycombinator.com/item?id=13902023

I had a similar feeling a while back, but when I actually picked up Lua for a project I was shocked by how limited the standard library is. Third party libraries aside (which Python clearly has in spades), even just the standard library is pretty sparse. It makes sense, since Lua is at least partly motivated by being lean and embeddable, but it puts Lua in a completely different space for me.

Other folks complain about the lack of libraries. Personally, the two things which turn me off on Lua (luajit) are 1-based arrays and the conflation of hash-tables and array-lists into a single thing.

I don't like how everybody invents their own class system.

> the conflation of hash-tables and array-lists into a single thing.

Precisely, one of the things that turns me off on Python (coming from Lua), is the unnatural proliferation of different container types.

Oddly, that's one of the things I like most about Python. The collections module has so many very useful things. Plus hash-tables and array-lists shouldn't ever really be the same thing.
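For readers who have not used them, a short sketch of a few containers from the standard library's `collections` module. The sample data is invented for illustration.

```python
from collections import Counter, defaultdict, deque

counts = Counter("mississippi")          # multiset: element -> count
print(counts.most_common(2))

groups = defaultdict(list)               # dict that creates missing values on demand
for word in ["apple", "avocado", "banana"]:
    groups[word[0]].append(word)
print(dict(groups))

recent = deque(maxlen=3)                 # bounded queue; old items fall off the left
for n in range(5):
    recent.append(n)
print(list(recent))
```

Each type encodes one access pattern, which is the opposite of Lua's single table type.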

To each their own :-)

You might like Tcl - associative arrays (hash tables), arrays (lists), strings, numbers, and code are all the one type (at least conceptually).

no batteries

Is ... that faster than CPython? Wow. Maybe I should symlink /usr/bin/python -> pypy on my laptop....

It's much slower. On my machine, "$PYTHON -c ''" (execute nothing) takes 60ms on Python 2.7, 80ms on 3.6 and 250ms on pypy.
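If you want to reproduce this kind of measurement without the shell's `time`, a rough sketch follows. The helper name `startup_seconds` is made up for this example, and the numbers include process-spawn overhead from `subprocess`, so treat them as an upper bound rather than pure interpreter startup.

```python
import subprocess
import sys
import time


def startup_seconds(interpreter, runs=5):
    """Return the best wall-clock time for `interpreter -c ''` over several runs."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run([interpreter, "-c", ""], check=True)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    # sys.executable is the interpreter running this script; pass another
    # path (for example a PyPy binary) to compare.
    print(f"{startup_seconds(sys.executable) * 1000:.1f} ms")
```

Taking the minimum over several runs reduces noise from disk caching and other processes.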

It's not. You can test it on your system. In every case I've seen, Python starts up faster than PyPy. Neither takes a super long time.

PyPy has some great performance characteristics, but startup time isn't one of them.

Startup << warmup.

None of these address, for instance, the issue raised about the Firefox build invoking Python many times. This seems both an accepted use case of CPython and an area where traditionally CPython has a huge edge on the JVM and PyPy. If scripts are not a priority, what is the expected use case of CPython?

I would like to note that CPython's ties to the PyObject C ABI seem to stymie rather than encourage "hacking". CPython seems to have traditionally valued stability over all else... see the issues PyPy has had chasing compatibility with C while retaining speed.

So: normally I'm with you and a language should lean into its strengths, but I've always listed startup time as a primary strength of Python!

In my experience, "script" is usually well-correlated with "a bit of inefficiency is okay." There's a reason that, say, many UNIX commands that could be implemented as scripts (true, false, yes) are actually implemented as binaries. There's a reason that most commercial UNIXes/clones (Solaris, macOS, Ubuntu, RHEL, etc.) switched from a script-based startup mechanism to a C-program-based one.

I certainly write and continue to write Python scripts where even an extra half second won't matter. It's doing some manipulation of data where the cost of what it's doing is dominated by loading the data (e.g., grabbing it from some web service), and even if the script is small and quick, it's not so small and quick that I'll notice 50-100 ms being shaved off of it.

Use cases where CPython continues to make sense to me are non-CGI web applications and things like Ansible, where load time isn't sensitive to milliseconds and runtime performance is pretty good. (Although if you believe the PyPy folks, perhaps everything that's PyPy-compatible should be running on PyPy.)

This hits the nail on the head.

Optimization is very, very rarely completely "free" - and is usually a conscious trade of some property for another trait that's deemed more important in a specific case.

Simplicity for performance. Code size for compilation speed. Startup time for architectural complexity. UX for security.

For a great product, you need to say "no" much more often than not. Do one thing and do it well. Be Redis, not JBoss.

I love how this article gets down to the essence of it: https://blog.intercom.com/product-strategy-means-saying-no/

Agree. I recently found that trying to optimise code can make it a lot more complex.

> Rewrite as much of Mercurial in Rust as possible, which will provide performance improvements well beyond what Python can possibly offer. https://www.mercurial-scm.org/wiki/OxidationPlan

I read that article and I'm still wondering: why Rust?

The last three paragraphs of the section "Why use Rust?" should address that - basically, they have experience with solving this problem by writing parts of the code in C, they are not fans of that experience, and Rust is a compelling better C (and there are specific reasons they don't think C++ is compelling).

Are you asking in comparison to some other language? The most obvious other languages in the "compelling better C" niche I think are Ada, D, and Go; Ada and D (I think, I do not know them well) don't have as good of a standard library or outside development community, and Go is less suited than Rust to replacing portions of a process. Go would be a reasonable choice were one writing a VCS from scratch today.

The C++ rationale is bizarre. That a 2008 compiler doesn't provide modern features is unsurprising. That they've chosen to use Rust is a strange reaction to this limitation, since it would be just as easy, from a toolchain perspective, to just use a modern C++ compiler.

One big advantage of rust is that (while some people won't like this), rust assumes you won't install it from your package manager, but download a script which installs it in your home directory. This script is quick and easy.

Trying to install a C++ compiler from source (and I've done it several times) is a much less pleasant experience, so most people stick with what their package manager provides.

Language features aside, Rust is a lot nicer to work with because it has cargo and using libraries is no longer a pain.

Depends which libraries we are talking about, try to use GUI libraries from Rust.

That isn't fair, all languages besides JS+HTML+CSS have issues with GUI libraries. Either you go Electron/webview or you have to deal with Qt/GTK for cross-platform GUI.

Sure it is fair, Java, C++, C#, VB.NET, Delphi, Objective-C, Swift have quite good GUI libraries available.

And regarding JS+HTML+CSS, they are still in the stone age of RAD tooling.

I think you missed the cross-platform part of my answer.

Some of those languages do have cross-platform GUI offerings.

Even AWT is better than any option currently natively available to Rust.

After all, "using libraries is no longer a pain" is not what I felt when converting an old toy application from Gtkmm to Gtk-rs.

> Go is less suited than Rust to replacing portions of a process

How so? Is it because Go has a GC and Rust doesn't?

That's part of it, but more generally, Rust prioritizes fitting into other programs: it offers direct compatibility with the C ABI for functions and structs (because the C ABI is the effective lingua franca for ~all present-day OSes), it uses regular C-style stacks instead of segmented stacks, threading is implicit, calls between Rust and C in either direction are just regular function calls and involve no trampolines or special handling by the GC/runtime, there is no runtime that requires initialization so you can just call into a random Rust function from a C program without setup, etc.

Go has cgo, and has slowly come to a few of the same decisions (e.g., Go gave up on segmented stacks too), and gccgo exists, so it's certainly possible to use Go for this use case. But it's not as suited as Rust.

There's a nice post about calling Rust from Go which goes to lengths to avoid the cgo overhead, even though Rust can just directly expose C-compatible functions and cgo can call them without any special effort on either side: https://blog.filippo.io/rustgo/

> How so? Is it because Go has a GC and Rust doesn't?

More generally a heavy runtime, which creates issues when you're trying to replace parts of a process which has its own runtime, unless you can make the two cooperate (by actually using the same runtime e.g. jvm or graal or whatever).

Go's FFI is also easy to work with but the source of… other issues which is why the Go community often prefers reimplementing things entirely to using cgo and existing native libraries.

Yes, manually moving objects between GCs is tricky.

Because he works for Mozilla.

That is probably the only reason to use rust over go or cpp.

Really? The only reason?

Only one I can see. Want speed and performance and safety? C++. Want easy multithreading? Go. Rust's only claim to fame is that Mozilla is dogfooding it.

> Want speed and performance and safety? C++.

Speed and performance sure, but safety automatically rules out C++.

> Want easy multithreading? Go

Want speed, performance, safety, and easy multithreading? Rust.

C++ still does multi-threading better than Rust; it's just that Go does it better. Similarly, Go does perf better than Rust; it's just that C++ is even better. So yeah, if you want the worst of all worlds coupled with the pains associated with a brand new language (try compiling Rust for an ARMv5 SoC), Rust all the way!

Can you show a benchmark where Go outperforms Rust?

It's funny when developers themselves think effort is so fungible. Like if you spent 1 hour on A, then you would've also made 1 hour of progress on B, C, or D, and that it would've been worthwhile. To the point of fallacy in your post.

I would think developers have the experience to realize this isn't true but I see it all the time on these forums.

I think I'm making the opposite claim - effort isn't fungible (and availability of effort isn't fungible). You can't necessarily spend 1 hour that would otherwise go into, say, rewriting Mercurial into a compiled language and instead spend it on making CPython faster and get the same results. One of these is more likely to work, and also the two problems are going to attract interest from different people.

And one of the things that affects how productive one hour of work will be - and also whether random volunteers will even show up with one hour of work - is the likelihood of getting a change accepted and shipped to users. This is influenced by both the maintainers' fundamental openness to that sort of change, and any standards (influenced by the maintainers, who are in turn influenced by their users) about how careful a change must be to not make the project worse on other interesting standards of evaluation. It's also influenced by the number of people working on the project (network effects) because a more vibrant project is more likely to review your code promptly, finish a release, and get it into the hands of more users.

So I'm claiming that it's better to spend time on rewriting Mercurial in Rust than to spend time on getting CPython startup faster, because the Mercurial folks are actively interested in such contributions and the CPython folks are actively uninterested, and because there are fewer external constraints in making Mercurial startup faster than in making CPython startup faster. And I'm saying that the more we encourage folks to help with rewriting Mercurial in Rust, the more likely additional folks are to show up and help with the same project, thereby making 1 hour of effort even more productive.

I agree with you. Given limited development resources, saying "no" is difficult but important.

I am slightly afraid to ask, but what is a "well typed python"?

Have you seen MyPy, static type annotations / checking for Python? http://mypy-lang.org/

In context, what I'm really getting at is "a sufficiently non-dynamic subset of Python that it can be compiled statically, but also a sufficiently large one that real Python programs can have a chance of being in the subset." PyPy has a thing called RPython that fits the former but not really the latter (I don't know of any non-PyPy-related codebases that work in RPython). In general, adding complete type annotations to a codebase is pretty correlated with making it static enough to do meta-level things on, like compiling and optimizing it - for instance, if you have a variable that changes types as the program runs, at least now you've enumerated its possible types. It's not the only way of doing so, but it seems to work well in practice and there seems to be a correlation between compiled vs. interpreted languages and static vs. dynamic typing.
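A small sketch of what "enumerating the possible types" can look like with standard `typing` annotations. The function `parse_port` is a made-up example, and nothing here assumes that any particular compiler actually exploits these hints.

```python
from typing import Union


def parse_port(raw: Union[int, str]) -> int:
    """Accept a port as an int or a decimal string; always return an int."""
    if isinstance(raw, str):
        # After this check a type checker knows `raw` is a str here...
        return int(raw.strip())
    # ...and an int here, so each branch has a single concrete type.
    return raw


print(parse_port(8080))
print(parse_port(" 443 "))
```

The annotation does not remove the dynamism, but it bounds it: a tool reading this code only has to consider two cases for `raw` instead of every possible object.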

Correct. Every time you say yes, you're saying no to something else. It's important to realize what you're saying no to before you say yes.

> This isn't an attitude of "no" - it's an attitude of "yes" to other things.

You are literally bringing an attitude of "no" to the question of whether you are bringing an attitude of "no" to the discussion....

If those who complain about no-attitudes are insisting that the only acceptable response to anything is "yes", I doubt they'll get far.

FWIW no one who replied to this email thread said something even close to "no". Victor Stinner points out that startup time is something that comes up a lot and mentions some recent work in the area [1].

Python is a big ship. It may not be as nimble as a young FOSS project, but it is always improving, and investments in things like startup time pay dividends to a large ecosystem.

[1] https://mail.python.org/pipermail/python-dev/2018-May/153300...

I get the impression that backwards-compatibility does weigh pretty heavily on the Python core developers these days. There are so many Python installations out there doing so much that the default answer to a change has to be "no". The fact that macOS and popular Linux distributions ship with copies of Python is great, but once something is effectively a component of operating systems, boldness is not a viable strategy. Arguably, one of the reasons why the transition to Python 3 has been so drawn out is that every time somebody installs macOS or one of many Linux distributions, a new Python 2 system is born. I've seen .NET Core developers explain that having .NET Framework shipped in Windows put them under massive constraints, and this was one of the motivations for a new runtime.

I'm not denying this phenomenon, but part of it is surely that widely used projects get more conservative because any change risks breaking something for someone somewhere. And the maintainers tend to feel a sense of responsibility to help people deal with these breakages.

I'll bring a slightly different perspective, as someone who's been using Python professionally for over a decade: there is no such thing as just saying "yes" or "no". Every "yes" to one group is at least an implicit "no" to some other group, and vice-versa.

The Python 2/3 transition is a great example of this. Python 2 continued an earlier tradition of saying "yes" to almost everything from one particular group of programmers: people working on Unix who wanted a high-level language they could use to write Unix utilities, administrative tools, daemons, etc. In doing that, Python said "no" to people in a lot of other domains.

Python 3 switched to saying "yes" to those other domains much more often. Which came with the inherent cost of saying "no" (or, more often, "not anymore") to the Unix-y crowd Python 2 had catered to. Life got harder for those programmers with Python 3. There's been work since then to mitigate some of the worst of it, but some of the changes that made Python nice to use for other domains are just always going to be messy for people doing the traditional Unix-type stuff.

Personally, I think it was the right choice, and not just because my own problem domain got some big improvements from Python 3. In order to keep growing, and really even to maintain what it already had, Python had to become more than just a language that was good for traditional Unix-y things. Not changing in that respect would have been a guaranteed dead end.

This doesn't mean it has to feel good to be someone from the traditional Unix programming domain who now feels like the language only ever says "no". But it does mean that it's worth having the perspective that this was how a lot of us felt in that golden age when you think Python said "yes" to everything, because really it was Python saying "yes" to you and "no" to me. And it's worth understanding that what feels like "no" doesn't mean the language is against you; it means the language is trying to balance the competing needs of a very large community.

"people working on Unix .... In doing that, Python said "no" to people in a lot of other domains."

Could you elaborate on this?

I thought Python was pretty good about supporting non-Unix OSes from early on. It was originally developed on SGI IRIX and MacOS. From the README for version 0.9:

> There are built-in modules that interface to the operating system and to various window systems: X11, the Mac window system (you need STDWIN for these two), and Silicon Graphics' GL library. It runs on most modern versions of UNIX, on the Mac, and I wouldn't be surprised if it ran on MS-DOS unchanged. I developed it mostly on an SGI IRIS workstation (using IRIX 3.1 and 3.2) and on the Mac, but have tested it also on SunOS (4.1) and BSD 4.3 (tahoe).

though it looks like there wasn't "painless" DOS support until 1994, with the comment "Many portability fixes should make it painless to build Python on several new platforms, e.g. NeXT, SEQUENT, WATCOM, DOS, and Windows."

I also thought that PythonWin had very good Windows support quite early on. The 1.5a3 release notes say:

> - Mark Hammond will release Python 1.5 versions of PythonWin and his other Windows specific code: the win32api extensions, COM/ActiveX support, and the MFC interface.

> - As always, the Macintosh port will be done by Jack Jansen. He will make a separate announcement for the Mac specific source code and the binary distribution(s) when these are ready.

So, take the Python 3 string changes as an example.

Python 2 scripting on Unix was great! Python just adopted the Unix tradition of pretending everything is ASCII up until it isn't, and then breaking horribly. And then the Linux world said "just use UTF-8 everywhere!" and really meant "just keep assuming things are ASCII, or at least one byte per code point, and break horribly when it isn't!"

This was great for people writing command-line scripts and utilities. This was a nightmare for people working in domains like web development.

Python 3 flipped the script: now, the string type is Unicode, and a lot of APIs broke immediately under Python 3 due to the underlying Unix environment being, well, kind of a clusterfuck when it came to locales and character encoding and hidden assumptions about ASCII or one-byte-per-character. Suddenly, all those people who had been using Python 2 -- which mostly worked identically to the way popular Linux distros did -- were using Python 3 and discovering the hell of character encoding that everybody else had been living in, and they complained loudly about it.

But for growing from a Unix-y scripting language into a general-purpose language, this change was absolutely necessary. Programmers should have to think about character encoding at their input/output boundaries, and in a high-level language should not be thinking of text as a sequence of bytes. But this requires some significant changes to how you write things like command-line utilities.
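As a minimal sketch of "think about encoding at the boundaries" in Python 3: the sample string is invented, and UTF-8 is an assumption about what the outside world is sending. In real code that has to come from the protocol, file format, or locale.

```python
# Bytes as they might arrive from a file, pipe, or socket (assumed UTF-8 here).
raw = "na\u00efve caf\u00e9".encode("utf-8")

text = raw.decode("utf-8")        # input boundary: bytes -> str
print(len(raw), len(text))        # byte count vs. code point count differ

shouted = text.upper()            # text operations work on code points, not bytes
out = shouted.encode("utf-8")     # output boundary: str -> bytes
print(out)
```

Inside the program everything is `str`; bytes only appear at the two edges. That is the discipline Python 3 enforces and Python 2 let you skip.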

This is an example of a "yes" to one group being a "no" to another group. Or, at least, of it feeling that way.

Also, saying that Python was a great Unix-y language is not equivalent to "Python only ran on Unix and never supported Windows at all", and you know that, so it was kind of dishonest of you to try to start an argument from the assumption that I said the latter when really I said the former. Don't do that again, please.

You wrote: Python was a great Unix-y language is not equivalent to "Python only ran on Unix and never supported Windows at all",

Let me elaborate further. I recall that Mark Hammond at one of the Python conferences around 2000 said that Python was the language with the best support for Windows outside of the languages developed at Redmond. Hammond did much of the heavy work in making that happen.

I didn't mean to sneak in a dishonest argument.

I am under the genuine impression that Python worked well for Windows, with the narrow Unicode build that matched the UCS2 encoding that Windows used, and was comparable to the experience of developing under Python for Unix.

Similarly, I thought the native Mac support under, say, OS 9, was also well supported, and matched the Mac environment.

I'm not saying that there weren't problems, and I agree that web development is one of the places where those problems came up.

Rather, I'm saying that I think the native Unix, native Windows, and native Mac support were roughly comparable, such that I don't think it's right to say that there was a really strong bias towards Unix.

What I said: Python was a great language for writing traditional Unix-y things like shell scripts, daemons, sysadmin tools, and here's an example of it adopting something that made that much easier.

What you are trying to twist that into saying: Python somehow didn't run on or wasn't used on or was terrible on operating systems not explicitly named "Unix".

I don't see any way to assume good faith on your part given you've repeated that attempt at putting words in my mouth while demonstrating knowledge that indicates you understand perfectly well what it was I really said. I'm going to ignore you now.

What you said was:

> Python 2 continued an earlier tradition of saying "yes" to almost everything from one particular group of programmers: people working on Unix who wanted a high-level language they could use to write Unix utilities, administrative tools, daemons, etc. In doing that, Python said "no" to people in a lot of other domains.

> Python 3 switched to saying "yes" to those other domains much more often.

I would like to know why you singled out Unix when it seems like Python also said "yes" to MS Windows.

Of course Python developers said "no" to other domains. Every language says "no" to some domains. I thought you were trying to make something more meaningful about a specific bias towards Unix.

Eg, as I recall, the Perl implementation was biased towards Unix and was difficult to compile under Windows. The glob syntax, for example, called out to the shell.

Honestly, I was expecting you to point out a difficulty that Python had with non-Unix OSes, specifically with MS Windows, which has since been remedied with Python 3.

I didn't expect this response at all, nor do my attempts to explain myself seem to have made a difference.

I still don't know why you singled out Unix in your earlier comment. And it seems I will never know.

"Unix-y" is a paradigm or design philosophy, not an operating system. You can write unixy things for any OS. That's what the parent is talking about, not an operating system. https://en.wikipedia.org/wiki/Unix_philosophy

I think what would make things clear for me is if there was an example of how Python did something like a "no" for MS Windows support.

That is, outside of those places where MS Windows might (to the exasperation of Dave Cutler) be considered Unix-y.

>"Python only ran on Unix and never supported Windows at all",

I think the misunderstanding stems from no one having said this :P

One might rephrase "english-unix-ascii" from my other comment to "english-command line tooling-fixed width system encoding".

It was really a problem for web and fullish unicode.

Your wording wouldn't have caused me to raise an eyebrow.

But ubernostrum seemed to be making a stronger statement that Python favored "one particular group of programmers: people working on Unix ... to write Unix utilities, administrative tools, daemons, etc."

While I know that I used Python 2.x with the win32 extensions to write daemons for MS Windows, and to write an ActiveX extension for Excel.

That's why I wanted clarification on the basis for ubernostrum's statement, with pointers to why I thought Python was well-supported on other OSes.

The biggest thing is likely the str/unicode change. In py2, if working on a unix system with only ascii, you never had to think about strings. Suddenly with python3, you had to a little bit.

The gain was that for everyone else (read: web, non-English, anywhere where unicode is common), python became much easier to use. But for those specific english-unix-ascii cases, it was a mild inconvenience.

Edit: as ubernostrum pointed out, more than a mild inconvenience if you were porting code. If writing new code, it was not much worse, but porting was absolutely a pain.

And I thought that if working with Python 2 on a MS Windows system, then you also didn't really have to think about strings. That is, Python's narrow Unicode strings matched the native UCS-2 of Windows.

I did some Python 2 programming under Windows and don't recall string issues; certainly fewer issues than I've had in dealing with Python 3 changes.

I agree that what we have now is an improvement. I just don't see why the old way was really Unix-centric.
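For reference, a small sketch of what "narrow" meant in practice. It runs on Python 3 and uses UTF-16 encoding to count 16-bit code units, which is what a narrow Python 2 build counted for `len()` on a unicode string. The sample character is arbitrary.

```python
# A code point outside the Basic Multilingual Plane.
s = "\U0001F600"

# Number of 16-bit code units needed to store it (a surrogate pair).
utf16_units = len(s.encode("utf-16-le")) // 2

print(len(s))         # code points, as Python 3.3+ counts them
print(utf16_units)    # 16-bit units, as a narrow build counted them
```

For text inside the Basic Multilingual Plane the two counts are the same, which may be why this rarely came up in everyday Windows scripting.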

> In py2, if working on a unix system with only ascii, you never had to think about strings.

On any system with only ASCII, you never had to think about strings. Unix is an irrelevant word in this sentence.

That's a nice sounding comment, but... could you be a little more specific about what particular "traditional Unix-y things" did Python 3 say "no" to?

...I can't really think of many, if any at all. Sometimes you just say "no" to "inertia".

I think it's because they've seen exactly where saying "yes" leads them and they don't like that place.

They hate fast code?

Perhaps the known opportunities for dramatic perf increases require compatibility breaks and some expressiveness downgrades.

That Python 3 transition was a fun time, yeah?

If the Python 3 transition had brought dramatic perf increases, I guess people would have been way faster to sell that upgrade to their hierarchies. Not that I blame the Python team for the lack of it. But the fictional dev history would have been different.

No doubt, huge perf improvements for free would have made the transition more compelling.

However, what I'm suggesting is that large perf wins are not free. Breaking compatibility too much more could have doomed Python 3. And now that the devs know how painful a transition is, they're far less likely to break compatibility again for any reason.

I think part of what explains this attitude in people is "lack of imagination". In the sense that sometimes, especially when an existing project or organization or bureaucracy has become huge and daunting, people cannot imagine excellence anymore, so they believe it to be literally impossible.

To be fair, they are frequently saying no to things other people think they should do (rather than saying no to things like contributions of startup improvements).

"We" is a very abstract term. I am sure that if you proposed a patch that addressed the issue without adverse side effects, it would get accepted.

I think your comment is well-intentioned (I upvoted) but I respectfully disagree. I think wanting Python to be a bit faster is similar to wanting Haskell to have a little bit of mutability. Engineering with restrictions is a good thing, we can do great systems in Haskell because it's a very neat language even though it lacks mutability. We also can do great systems in Python because it's a very neat language even though it's a bit slow. Sure, you can always optimize Python's performance, that's a legitimate problem and it takes a few engineers to solve it. But it's more interesting to work around Python's slowness by engineering tricks such as better algorithms etc.

That's not a great analogy. Haskell is a neat language in part because it doesn't have mutability. Python is a neat language despite being slow.

I can't imagine anyone would object if Python could magically be 10x faster. I can't say the same thing for the Haskell thing.

My whole point is that 10x thing cannot just magically happen. The reason Python is slow is not incompetent programming or lack of magic. We know why that's happening. Because every variable in Python interpreter is a hash map and pretty much every operation is a hash map lookup. How do you optimize this? The only way is to remove language features like `setattr` and my whole point is that some people use Python because it's flexible enough to do that so they need their `setattr`.
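A minimal sketch of the flexibility being described, assuming an ordinary class without `__slots__`. The class `Point` is hypothetical. It only shows that instance attributes are stored in a dict and can be added by name at runtime. It says nothing about how fast or slow the lookups are.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)
print(p.__dict__)         # instance attributes live in a plain dict

setattr(p, "z", 3)        # add an attribute by name at runtime
print(p.__dict__)

p.__dict__["x"] = 10      # writing to the dict changes the attribute
print(p.x)
```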

> The reason Python is slow is not incompetent programming or lack of magic. We know why that's happening. Because every variable in Python interpreter is a hash map and pretty much every operation is a hash map lookup.

This statement is easily refuted by PyPy. Here's a simple program which runs 70 times faster in PyPy than Python on my machine - including startup time:


> How do you optimize this?

Semantically, everything in JavaScript and Lua is also hash map lookups. Yet very smart people have made those languages very fast. CPython is not slow for the reasons you stated.

Not all python programs are compatible with PyPy though. Nor do they get much of a performance boost by switching. Pandas and NumPy for example didn't work on PyPy until less than a year ago. And a good chunk of Python codebases are going to use one of those at some point.

> Not all python programs are compatible with PyPy though.

I'm not sure how this is relevant to my reply... I didn't say PyPy was a replacement for Python. I said PyPy, Lua, and JavaScript are existence proofs that dynamic "hash table" languages don't have to be slow. Therefore, CPython must be slow for some other reason.

Pypy can't accelerate the Pandas/Numpy part anyway, the computationally intensive code in Numpy and Pandas is coded in C, C++ or Fortran.

Though there are a lot of `if isinstance(foo, Bar)` checks that could be avoided in a statically typed language.

> Sure, you can always optimize Python's performance, that's a legitimate problem and it takes a few engineers to solve it. But it's more interesting to work around Python's slowness by engineering tricks such as better algorithms etc.

Surely you're not implying that improving Python's performance would preclude finding interesting algorithms, nor that this is a suitable rationale for keeping Python slow? Anyway, algos can only get you so far when they're built on slow primitives (all data scattered haphazardly across the heap, every property access is a hash table lookup, every function call is a dozen C function calls, etc).

> I think wanting Python to be a bit faster is similar to wanting Haskell to have a little bit of mutability

I'm sorry but that makes zero sense. Haskell is defined by immutability. People want to use haskell because of that characteristic. I don't want to use python because it is slow.

Sorry, I disagree. There is definitely a sense in which Haskell is desirable because it is immutable; I myself love immutable data structures. But my point is that it imposes a restriction: you cannot implement algorithms that need mutability, such as hash maps. It is easy to circumvent such problems, but one other way is basically introducing mutability to Haskell, which totally doesn't make sense. I think the same goes for Python: if you want to make it significantly faster, then you need to face certain trade-offs. Maybe the data model should be optimized, or maybe `int` shouldn't be an arbitrary-precision integer, or maybe there should be primitive types like `int` and `double`, as in Java, to increase performance. The truth is that these are not Pythonic solutions, and just as mutability is not Haskell-esque, optimizing Python by making these sacrifices is not Pythonic.

> one other way is basically introducing mutability to Haskell which totally doesn't make sense

It makes perfect sense, because Haskell is not "an immutable language". It's a language in which some things are immutable, and those that are not are explicitly indicated in the type system.

This is why large companies like Google often reinvent the wheel. Open source gives everyone the right to use, but not the power of control. Sure, you can fork, but then your version will diverge from the official, and the pain of maintaining compatibility may be greater than writing your own from scratch.

It's a byproduct of how many people you have to answer to. I was kind of having discussion with a coworker about an app that had a lot of features that made it seem kind of cluttered but useful. I think small projects can make bolder choices and enable more options because they have a smaller userbase that would be impacted by their changes and they want to be able to reach more people so adding a feature is generally a net benefit. But a larger project cannot risk hurting the large userbase they have already established so they have to be more cautious about the changes that they make.

I've always been disappointed at how quickly people make sweeping generalizations from a single anecdote. (I also think Python can do better here, but the generalization isn't justifiable.)

With major infrastructure like Python there's a tendency to over-emphasise compatibility between releases.

Look at this post in the same list thread: https://mail.python.org/pipermail/python-dev/2018-May/153300...

Python 3.6 is trying an enormous number of potential paths that code for imports might be found at. Why is that fixed in stone? Couldn't Python 3.(n+1) change that, if it's slow and historical, cutting out a bunch of slow system calls?
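A rough sketch of where the number of potential paths comes from: every `sys.path` entry is combined with every recognized module suffix, once for a package directory and once for a plain module file. The helper `candidate_paths` and the module name `example_module` are hypothetical, and this is a simplification. The real import machinery also handles zip files and namespace packages, and caches directory listings, so not every candidate necessarily costs a system call.

```python
import importlib.machinery
import os
import sys


def candidate_paths(module_name):
    """List file names the path-based finder could consider for a top-level module."""
    suffixes = importlib.machinery.all_suffixes()
    candidates = []
    for entry in sys.path:
        # An empty sys.path entry means the current directory.
        base = os.path.join(entry or os.getcwd(), module_name)
        # Package form: <dir>/<name>/__init__<suffix>
        candidates.extend(os.path.join(base, "__init__" + s) for s in suffixes)
        # Module form: <dir>/<name><suffix>
        candidates.extend(base + s for s in suffixes)
    return candidates


paths = candidate_paths("example_module")
print(len(sys.path), "sys.path entries")
print(len(paths), "candidate file names")
```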

As someone who makes use of Python to deploy software, it's entirely possible that could cause me a few issues... which I'd fix quite easily. It should be totally reasonable to expect the community using the software to cope with those sorts of changes after a major release; the alternative is ossification.

Django suffered from maintaining too much compatibility, and releasing too slowly, and they fixed it. Three or four years ago everyone was talking about moving away from it; now they release often, deprecate stuff when they need to, and the project is as vibrant as it ever was. Time for CPython to learn the same lesson.

It may also be that they simply don't have an attack on the startup problem.

Competition. Hiphop VM lit a fire under the PHP team.

Everyone is focusing on python, but where is this "can do" spirit from mozilla? There are languages with better startup times (bash, perl, lua, awk, to name a few) that could likely do whatever the python scripts are doing.

> but where is this "can do" spirit from mozilla?

The mail included both

> At some point, we'll likely replace Python code with Rust so the build system is more "pure" and easier to maintain and reason about.


> Since I am disproportionately impacted by this issue, if there's anything I can do to help, let me know.

Python3 has the exact opposite problem: Too many devs willing to say "yes" to features and a small number of devs who try to keep things fast and maintainable.

Remember that Python2 was faster.

That changed with the dict improvements in 3.6.

This is true but python's relative slowness (along with the GIL) is an issue that is regularly blown out of all proportion.

Part of the reason for the language's success is because it made intelligent tradeoffs that often went against the grain of the opinions of the commentariat and focused on its strengths rather than pandering to the kinds of people who write language performance comparison blog posts.

If speed were of primary importance then PyPy would be a lot more popular.

You're conflating two kinds of "performance", startup latency and steady state throughput. We're talking about the former, and you're proposing improvements for the latter. In fact, moving to pypy is exactly what you shouldn't do to improve startup.

It's surprising but frequently true that startup latency has a greater effect on the perception of performance than actual throughput. Nobody likes to type a command and then be kept waiting, even if the started program could in principle demonstrate amazing feats of computation once warmed up.

The GIL is a pretty nasty problem once you try to scale things beyond one core.

Simply try something like unpickling a 10 GB data structure while keeping your GUI in the main thread responsive. You cannot do that because the GIL locks up everything while modifying data structures. Move the data to another process instead of another thread. Great, your GUI is responsive but you can't access the data from the main thread.

You can say that such a humongous data structure is wrong or that a GUI isn't meant to be responsive or programmed in Python or that I'm holding it wrong. Probably right.

I've flailed around with this a few times in the last year or so and have found that posting things up and down a multiprocessing.Pipe is the least painful alternative.
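A minimal sketch of posting an object through a `multiprocessing.Pipe`. Both ends are kept in one process here only to keep the example short. Normally one end would be handed to a `multiprocessing.Process`. The payload is made up and deliberately small so the send does not block.

```python
import multiprocessing

parent_end, child_end = multiprocessing.Pipe()

payload = {"ids": list(range(100)), "label": "sample"}

parent_end.send(payload)        # pickles the object and writes it to the pipe
received = child_end.recv()     # unpickles a fresh copy on the other end

print(received == payload)      # same contents
print(received is payload)      # but not the same object

parent_end.close()
child_end.close()
```

The receiver always gets a copy, so whatever is sent is serialized once and deserialized once per transfer.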

So you're basically building a distributed application just because you can't share memory properly. This can be very efficient if little communication is involved or a total nightmare if you have gigabytes of data where you need lots of random read access to walk the data structures at high speed. If you're not careful you spend most of your time pickling and unpickling the stuff you send over your pipes while requiring duplication of your gigabyte data structures in order to gain at least some parallelism.

I don't see a way around this mess with the current structure of python. You would have to reimplement the data heavy part completely in another language that provides proper threading models.

"You're holding it wrong" is a poor response to a wide audience, like iPhone users. But it's an OK response to a specialist, like someone tackling the task you describe.

I'm a professional Python developer and I run into performance problems a lot. Python makes things really hard for even specialists to "hold right". Contrast that with Go, which (for all the hate it gets) is written very much like well-formed Python in single-threaded applications, and the way you would like to write Python in parallel applications. And all the while being two orders of magnitude faster. If we don't start taking performance seriously in the Python community, Go (or someone else) will eat our lunch sooner or later.

Go offers faster performance with code that is up to 50% longer - with the commensurate added maintenance burden.

And, go is still very slow compared to C, C++ or Rust.

Since performance is usually a power law distribution (99% of the performance gains are made in 1% of the code), it's frequently more effective - in terms of speed and maintenance burden - to code up hot paths in a language like C, C++ or Rust and keep python.

I accept that Go is more verbose than Python, but your maintainability claim doesn’t match my experience at all. I find that Go is more maintainable for a few reasons: magic is discouraged in Go, everyone writes Go in pretty much the same way and with the same style, Python’s type system is still very immature (no recursive types, doesn’t play nicely with magical libs). Further, in my experience with working with large Python and Go codebases, Python becomes less maintainable very quickly as code size increases, especially in the face of more junior developers. Go seems to be more resistant to these forces, probably because of the rails it imposes. Lastly, any maintainability advantages Python might have had are quickly eaten up by the optimizations, which are necessary in a much greater portion of the code base because naive Python is so much slower than naive Go.

Go is ~100X faster than Python and about half as fast as C/C++/Rust, and I find it to be at least as maintainable as Python for most (but not all!) applications.

As for your power law claim, I agree with the premise but not the conclusion: “rewrite the hotpath in C!” is not a panacea. This only works when you’re getting enough perf gain out of the C code to justify the marshaling overhead (and of course the new maintenance burden).

I don’t like bashing on Python, but it doesn’t compete well with Go on these grounds. It needs to improve, and we can’t fix it by making dubious claims about Go. We should push to improve things like Pypy and MyPy, as well as other tooling and improvements.

Have you watched David Beazley's talks about using generators to implement coroutines? That might give you a similar pattern to goroutines. If non-blocking IO isn't the challenge, do you make use of the concurrent.futures module?
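A minimal sketch of the generator-as-coroutine idea, not taken from the talks themselves. The names `worker` and `run_round_robin` are hypothetical. Each generator yields to hand control back to a tiny scheduler, so tasks interleave cooperatively on a single thread.

```python
from collections import deque


def worker(name, steps, log):
    """A task that records progress and yields control after each step."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # hand control back to the scheduler


def run_round_robin(tasks):
    """Run generator-based tasks one step at a time until all are finished."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)          # advance the task to its next yield
        except StopIteration:
            continue            # task finished; drop it
        queue.append(task)      # not finished; schedule it again


log = []
run_round_robin([worker("a", 2, log), worker("b", 3, log)])
print(log)
```

This gives interleaving, not parallelism: only one task runs at any moment.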

While I also encounter efficiency issues, most of them are frustrations with the overhead of serialization in some distributed compute framework or the throughput of someone else's REST API. As much as so many people complain about the GIL, it's never been a blocker for me (pun intended). Perhaps it's because my style in Python is heavily influenced by Clojure.

Now that I think about it, Python's string processing is often my bottleneck.

Coroutines aren’t parallelization, so they’re quite a lot worse than goroutines in terms of performance. If you want parallelism in Python, you’re pretty much constrained to clumsy multiprocessing. Besides parallelism, Python makes it difficult to write efficient single threaded code, since all data is sprinkled around the heap, everything is garbage collected, and you can’t do anything without digging into a hashmap and calling a dozen C functions under the hood. And you can’t do much about these things except write in C, and that can even make things slower if you aren’t careful.

Probably the best thing you can do in Python is async io, and even this is clumsier and slower than in Go. :(

I'm getting confused. Are you trying to do parallel compute or parallel networking?

If parallel networking, the benchmarks I've seen set Python asynchronous IO at about the same speed as Golang. The folks at Magicstack reported that Python's bottleneck was parsing HTTP (https://magic.io/blog/uvloop-blazing-fast-python-networking/). Note their uvloop benchmark was about as fast or faster than the equivalent Golang code.

If parallel compute, then multiprocessing is the way to go and Python's futures module ain't clumsy. It's just ``pool.submit(func)`` or ``pool.map(func, sequence)``. If you're asking for parallel compute via multithreading, you're going against the wisdom of shared-nothing architecture. Besides, pretty soon you'll want to go distributed and won't be able to use threads anyway.
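A short sketch of the `pool.submit` / `pool.map` calls mentioned above. The helpers `square` and `parallel_squares` are hypothetical. `ThreadPoolExecutor` is used here only so the snippet runs anywhere as-is. `ProcessPoolExecutor` has the same interface and can be passed in instead when the code lives in a script with an `if __name__ == "__main__":` guard.

```python
from concurrent.futures import ThreadPoolExecutor


def square(n):
    return n * n


def parallel_squares(executor_cls, values, max_workers=4):
    """Square the values using whichever executor class is passed in."""
    with executor_cls(max_workers=max_workers) as pool:
        first = pool.submit(square, values[0])   # one call, returns a Future
        rest = pool.map(square, values[1:])      # results come back in input order
        return [first.result(), *rest]


print(parallel_squares(ThreadPoolExecutor, [1, 2, 3, 4]))
```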

In contrast to your experience, I find Python makes it easy to write efficient code. Getting rid of the irrelevant details lets me focus on clear and efficient algorithms. When I need heavy compute, I sprinkle in a little NumPy or Numba. My bottleneck is (de)serialization, but Dask using Apache Arrow should solve that problem.

> I'm getting confused. Are you trying to do parallel compute or parallel networking?

Parallelism conventionally means "parallel computation". For async workloads, you're right--there are third party event loops that approach Go's performance, but that's not the subject of my complaint.

Regarding parallelism, I haven't used Python's futures module specifically, but all multiprocessing solutions are bad for data-heavy workloads simply because the time to marshal the data structure across the process boundary poses a severe penalty. There are many other disadvantages to processes as well--they're far less memory friendly than a goroutine (N Python interpreters running, each with the necessary imports loaded), they require extra support to get logging to work as expected (you have to make sure to pipe stderr and stdout), they're subject to the operating system's scheduler, which may kill them on a whim.

> Besides, pretty soon you'll want to go distributed and won't be able to use threads anyway.

Processes have the same problem in addition to being generally less efficient.

> you're going against the wisdom of shared-nothing architecture

I mean, sort of. If you're doing a parallel computation on a large immutable data structure, you don't lose out on maintainability, but you gain quite a lot of performance (no need to copy/marshal that structure across process boundaries). The loss of maintainability is negligible due to immutability. Besides, there are lots of other good reasons to share things across processes, like connection pools, file handles, and other resources.

Also, it's terribly ironic that you're defending CPython and specifically its GIL on the basis of "shared nothing architecture".

> In contrast to your experience, I find Python makes it easy to write efficient code. Getting rid of the irrelevant details lets me focus on clear and efficient algorithms.

Then you'll love Go--Go has far fewer irrelevant details than Python and Python lacks many _relevant_ details, such as control over memory. Your efficient algorithm in Go will almost certainly be at least two orders of magnitude better than the equivalent in CPython without compromising much in terms of readability.

> When I need heavy compute, I sprinkle in a little NumPy or Numba.

I haven't used Numba, but I've seen a lot of Python get _slower_ with NumPy and Pandas (and lots of other C extensions, for that matter). You have to know your problem well or you'll end up with code that is less readable and less performant than the original, and even when it works it's still less readable than the naive Go implementation and not significantly more performant.

> My bottleneck is (de)serialization, but Dask using Apache Arrow should solve that problem

They'll help, but the fact that Python needs these projects when other languages have far simpler solutions is an admission of guilt in my view. That said, I'm excited to see what sorts of things these projects enable in the Python community.

> Processes have the same problem in addition to being generally less efficient.

What I meant was that you should consider a multiprocessing approach that shares essentially no data between processes. As you say, the memory copying overhead is highly inefficient. Once you approach a problem like that, you've already implemented an essentially distributed system and the change is trivial.

I've regretted multithreading enough times to convince me it's almost never the right choice. Mostly because I find I've underestimated the project scale and needed to rewrite as distributed. Maybe those new monstrous instances available on EC2 will change my habits. I've never had such flexible access to a 4TB RAM / 128 core machine before.

> the fact that Python needs these projects

Apache Arrow solves problems for many languages. The ACM article that popped up the other day, "C is not low-level", touched on some of the issues.


Python derives a good chunk of its speed (if not all of it) from carefully tuned libraries written in other languages (or even for other architectures in the case of many machine learning packages). As soon as you try to do a lot of heavy processing in Python itself, even the compiled versions quickly bog down. IMO the best way to use python is to use it to cleverly glue together highly optimized code. That way you spend the minimum amount of effort and you get maximum performance.

Multithreading is the glue I need. How am I supposed to write an optimized native module to spawn threads to do computation in numpy and pandas?

Yeah, that was kind of my point :/

I have to say that my first reaction was: "maybe you shouldn't use python for this, then". If you are using a language in a way that it gets worse in subsequent versions, that's a good sign that they're optimizing for something other than what you care about.

The programming language R does not, as I understand it, optimize for speed, because they are optimizing for ease of exploratory data analysis. R is growing quite rapidly. So is python, actually. It doesn't mean that either one is good at everything, and it's probably the case that both are growing because they don't try to be good at everything. A good toolbox is better than a multi-tool.

(I authored the linked post)

While the "maybe you shouldn't use Python" comment could be construed as trolling by some, there is definite truth to your line of reasoning and I agree with the comment.

I absolutely love Python as a programming language for the space it is in. But as someone who needs to think long term about maintaining large projects with lifetimes measured in potentially decades, Python has a few key weaknesses that make it really difficult for me to continue to justify using it for such projects. Startup time is one. The GIL is the other large one (not being able to achieve linear speedups on CPU-bound code in 2018 with Moore's Law dead is unacceptable). General performance disadvantages can be adequately addressed with PyPy, JITs, Cython, etc. Problems scaling large code bases using a dynamic language can be mitigated with typing and better tools.

Python can be very competitive against typed systems languages. But if it fails to address its shortcomings, I think more and more people will choose Rust, Go, Java, C/C++, etc for large scale, long time horizon projects. This will [further] relegate Python to be viewed as a "toy" language by more serious developers, which is obviously not good for the Python ecosystem. So I think "maybe you shouldn't use Python for this, then" is a very accurate statement/critique.

I would characterize Python's weaknesses differently.

Startup time is a problem for Python. But concurrency is much more complex than you state: threading is not the only or best concurrency model for many applications. And certainly removing the GIL will not just enable Python "to achieve linear speedups on CPU-bound code". Distributed computing is real. One of Python's problems for a long time was not the GIL, it was the sorry state of multi-process concurrency.

The speed issues that JITs solve for other languages may not be solvable in Python due to language design.

I'm totally OK with Python's threading choice of saying only 1 Python thread may execute Python code at any time. This is a totally reasonable choice and avoids a lot of complexity with multithreaded programming. If that's how they want to design the language, fine by me.

But the GIL is more than that: the GIL also spans interpreters (that's why it's called the "global interpreter lock").

It is possible to run multiple Python interpreters in a single process (when using the embedding/C API). However, the GIL must be acquired for each interpreter to run Python code. This means that I can only effectively use a single CPU core from a single process with the GIL held (ignoring C extensions that release the GIL). This effectively forces the concurrency model to be multiple process. That makes IPC (usually serialization/deserialization) the bottleneck for many workloads.

If the GIL didn't exist, it would be possible to run multiple, independent Python interpreters in the same process. Processes would be able to fan out to multiple CPU cores. I imagine some enterprising people would then devise a way to transfer objects between interpreters (probably under very well-defined scenarios). This would allow a Python application to spawn a new Python interpreter from within Python, task it with running some CPU-expensive code, and return a result. This is how Python would likely achieve highly concurrent execution within processes. But the GIL stands in its way.
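A small sketch for observing the single-core limit described above with ordinary threads. The function names and the loop size `N` are arbitrary. It runs the same pure-Python CPU-bound loop twice sequentially and then in two threads. On a standard CPython build with the GIL, the two timings tend to be similar rather than the threaded one being about twice as fast. No specific numbers are claimed here.

```python
import threading
import time

N = 1_000_000  # arbitrary amount of work per call


def count_down(n):
    """Pure-Python CPU-bound loop; holds the GIL while it runs."""
    while n > 0:
        n -= 1


def timed(fn):
    """Return the wall-clock seconds fn() takes."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


def sequential():
    count_down(N)
    count_down(N)


def threaded():
    threads = [threading.Thread(target=count_down, args=(N,)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


print(f"sequential:  {timed(sequential):.3f}s")
print(f"two threads: {timed(threaded):.3f}s")
```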

The GIL is an implementation detail, not poor language design.

It is a tractable amount of work (~40-80 hrs) to convert CPython from a sea of globals to a context-based system where one could then have distinct Python interpreters in the same address space. As it is now, you get one. Lua got this right from the beginning; Lua state doesn't leak across subsystems. There is zero chance I would do this work and then see if it would stick. I would waste 2 weeks of full-time work and then have the CPython folks say, yeah, no, because reasons.

Startup time should be fixed; Python does way too much when it boots. Using blank files:

    $ time lua t.lua

    real    0m0.006s
    user    0m0.002s
    sys     0m0.002s

    $ time python t.py

    real    0m0.052s
    user    0m0.036s
    sys     0m0.008s
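If you want to reproduce this kind of measurement from Python itself, a rough sketch follows. The `startup_seconds` helper is hypothetical, and the number includes process creation, so treat it as an upper bound on interpreter startup rather than a precise figure.

```python
import subprocess
import sys
import time


def startup_seconds(argv, runs=10):
    """Return the best-of-N wall time in seconds to run argv to completion."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(argv, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    # Default startup of the current interpreter running an empty program.
    print("default: %.4f s" % startup_seconds([sys.executable, "-c", "pass"]))
    # -S skips importing the site module, which is part of normal startup work.
    print("no site: %.4f s" % startup_seconds([sys.executable, "-S", "-c", "pass"]))
```

On Python 3.7 and later, `python -X importtime -c pass` additionally prints a per-module breakdown of import cost.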

Lua supports the scenario you describe effortlessly, not to mention that it's actually designed for embedding.

Python can't even be re-initialized in the same process without introducing memory leaks and other non-deterministic gotchas! [1]

[1] https://docs.python.org/3.6/capi/init.html#c.Py_FinalizeEx

> The GIL is an implementation detail, not poor language design.

As I understand it, the GIL simplifies data structures by removing any need to account for concurrent access.

If you remove the GIL you must move your synchronization (mutexes) into the data structures and immediately get a big performance penalty.

If you wanted to avoid this overhead you run into swamplands where the programmer must take care of concurrent access patterns and everything. Also many CPython modules would stop working because they assume the GIL.

It can be done but last time I read about the GILectomy there was no clear way forward.

Yeah, I think this kind of issue is why Ruby, which also has a GIL, seems to be heading for a new concurrency and parallelism model that introduces a new level (Guilds) between threads and processes where the big lock would be held, and where Guilds communicate only by sharing read access to immutable data, and transferring ownership or copies of mutable data.

I agree that this is an implementation detail. If they were to simply use the JS model of "every thread gets its own environment and message passing is how you interact", then you could still use threads safely and achieve some pretty impressive performance improvements in some cases.

Knowing literally nothing about Python other than what I read, I'm kind of confused as to how the current implementation came to be, because it is much easier to design an interpreter that uses the JS model than one that uses a shared environment among multiple threads. I created an Object Pascal interpreter, and it has this design: it can spin up an interpreter instance in any thread pretty quickly because it's greenfield all the way with a new stack, new heap, etc.

Python's slowness can help improve performance by teaching you to use techniques that end up being faster no matter the language.

Python is so slow that it forces you to be fast.

Consider data analysis: on modern machines, you're almost always better off with a columnar approach: if you have a struct foo { int a, b, c; }, you want to store int foo_a[], foo_b[], foo_c[], not struct foo data[]. It's better for the cache, better for IO, and better for SIMD.

numpy makes it much easier to use the former (the columnar layout) than the latter, whereas in C you might be tempted to use the latter (an array of structs) and not even realize how much performance you were leaving on the table. Likewise for GPU compute offloading, reliance on various tuned libraries for computationally intensive tasks, and the use of structured storage.
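To make the two layouts concrete, here is a small sketch using only the standard library's `array` module so it runs without dependencies; in practice you would use one numpy array per column. The field names `foo_a`, `foo_b`, `foo_c` follow the struct above, and the data is invented.

```python
from array import array

# Array-of-structs: one Python tuple (a, b, c) per record.
rows = [(i, i * 2, i * 3) for i in range(1000)]

# Struct-of-arrays (columnar): one contiguous typed array per field.
foo_a = array("i", (r[0] for r in rows))
foo_b = array("i", (r[1] for r in rows))
foo_c = array("i", (r[2] for r in rows))


def sum_b_rows(records):
    """Sum field b by touching every record, including the unused a and c."""
    return sum(r[1] for r in records)


def sum_b_columnar(column):
    """Sum field b by scanning only the contiguous b column."""
    return sum(column)
```

Both functions return the same total; the columnar one simply never looks at the fields it doesn't need, which is where the cache and IO benefits come from.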

Sorry, I didn't mean it to be trolling, I just meant it more or less literally. If Rust (for example) gets used for things like Mercurial and Mozilla, is that bad? I'm not saying Python shouldn't care, if it could improve the startup time without sacrificing other things. But presumably the transition from py2 to py3 was not intending to make things slower, it was intending to solve other problems. There are almost always tradeoffs. Even the mercurial folks quoted in the article said that the things py3 solved were not what they needed. That's a good indicator that Python is not the right language (anymore) for what they're doing.

I am primarily a Python programmer, but if Rust, Go, etc. take over as the language of choice in certain cases, I don't think that's a bad thing. Which doesn't mean one shouldn't write an article to highlight this cost of not having short startup time, just in case this cost wasn't understood by Guido, et al. But my guess (and it's only a guess), is that it was.

> While the "maybe you shouldn't use Python" comment could be construed as trolling to some, there is definite truth to your line of reasoning and I agree with comment.

I wouldn't say I construed it as trolling. More like, "You might be right, but where does that get us?" Not trolling, but also not that constructive, because it's extremely easy to write something like "maybe you shouldn't use Python" but likely hard and time-consuming to make it so.

There are a lot of questions when considering such a move. For example:

- What's the opportunity cost of migrating $lots_of Python to Rust, or some other language?

- Is that really where you can add (or want to add) the most value?

- And what does having to do that do to your roadmap? Maybe it enables it, but surely it's also stealing time from other valuable work you could be doing?

- Longer term, are we sacrificing maintainability for performance? (In your case it sounds like the opposite?)

- How easily can we hire and onboard people using $new_tech? (Again, it sounds like you might reduce complexity.)

Basically I suppose what I'm saying is I find it a little trite when people say, "well, maybe you should do X," without having weighed the costs and benefits of doing so. And in a professional environment, if that's allowed to become a pattern of behaviour, it can contribute to the demotivation of teams. Hence, I found myself a bit irritated by the grandparent post.

Python was always slow to start. Not as slow as the JVM, but maybe around the 300th test case for hg and maybe around the 100th python script invocation in any build system, people should start to wonder about how to get all of that under one Python process.

It's not like Python is so ugly it'd be messy to do. (It was possible with the JVM after all. It even works by simply forking the JVM, with all its GC threads and so on: https://github.com/spray/sbt-revolver )

Make-style DAGs are nice, but eventually the sheer number of syscalls for process setup (and module import and dynamic linking) is going to be a waste of time.

If one needs Rust, C/C++ level of performance I doubt there is much Python can do and one can wonder if Python was ever the right tool for such a project.

It’s a great tool for prototyping.

If you expect to need the performance of a statically typed, compiled language I don't see why you'd prototype in a dynamically typed, interpreted language.

That's why build systems still look like black magic infused with even darker sh, and a bit of perl sprinkled all over, presumably because the previous maintainers were all out of goat blood.

Most layers of a large project that need to be designed and figured out care nothing of those concerns.

I feel bad for even thinking it but ... I bet go's startup times are great.

I think your characterization of the GIL is not accurate. Show me ANY real world program that can achieve linear speedups on multicore or multi-processor systems. Humans have not sufficiently mastered multithreading to be able to make such a claim. I am not aware of any "CPU-bound" use cases that would actually use Python like this instead of, say, C or Fortran. And anyway, I submit that it would benefit (both from a design and an execution standpoint) from being multi-process (in other words, using explicitly coded communication).

Regarding the GIL, I've always wondered about Jython but never gotten around to trying it. What are the drawbacks of running it on a JVM to get true multithreading? Having to properly sync the threads like in other environments without global locks?

Nothing, it's just not maintained. People realized, that yeah, python is nice, but why spend years reimplementing it on the JVM, when there's Kotlin. (And Java itself is quite a breeze to program in nowadays. And of course Scala, if you dare go beyond the Pythonic simplicity.)

Jython doesn't look completely unmaintained: https://hg.python.org/jython

It's also not completely obsoleted by Kotlin, e.g. for the use case of calling a Python library from Java. However, the Python semantics are not a great fit for the JVM, so you should expect it to be slower than plain CPython: https://pybenchmarks.org/u64q/jython.php

The supposed attitude of the python developers about startup time works against the popular niches Python is supposed to be such a great fit for. Little scripts, glue, short run applications.

That’s a problem if that’s an area python wants to compete in.

I might be biased because I'm from the hordes that are moving from Stata and Matlab to Python (but then there are the hordes attracted to data analysis now), but that was never really Python's strong suit, nor its target market.

I mean, I was always into little scripts, but I used Tcl and then Perl.

Back in the 1990s, Python was promoted as a web programming language. This was back in the days when everyone used CGIs. Python came with a cgi module, while in Perl you had to download cgi-lib.pl. I even helped maintain a Python web application that was all CGI-based.

So I can assure you that at one point Python was trying to be in the "short run applications" space. They may have given up since then, but that's a different issue.

As for me, I do write little scripts in Python. I don't like how most of my run time is spent waiting for Python to get ready.

What I really don't like is using NumPy. I tend to re-implement features I want rather than reach for NumPy because that 0.2s import time irks me so much. And it's because the NumPy developers want people to do "import numpy; numpy.reach.into.a.deep.package", so they import most of its submodules.

They used to also eval() some code at import, causing even more overhead. I don't know if that's gone away.

Ah, the days when I knew quite some people doing Zope consulting.

Apparently it is still around.

Ah, Zope. I remember when the IPC Python conference seemed to double in size (I think it was the DC one). 1/2 the people were seemingly there because of Zope.

Both Tcl and Perl are dead languages walking these days, and it's Python that's displaced them. It absolutely competes in that market.

Markets are a funny thing. Both "dead" languages are thread safe and can easily run separate interpreters per thread.



> Markets are a funny thing. Both "dead" languages are thread safe and can easily run separate interpreters per thread.

Perl threading is officially recommended against IIRC? In either case, "threads" in either of them don't share memory (except explicitly and manually), at which point what you have is multiprocessing by a different name.

In Perl 5 threading things are only "shared" semantically. In reality, there is an extra interpreter running where the "shared" variables live, and any value fetches and stores are handled with the `tie` interface (see http://www.perlmonks.org/?node_id=288022 for more information).

In Perl 6 on the other hand, everything is always shared and you can use high level constructs such as atomic increments, supplies, taps, `react` and `whenever` so you don't have to think about any (dead)locking issues as a developer.

Tcl has an "easy" threading mode where each thread runs its own interpreter, and a "hold my beer" mode where you can spawn threads within a single interpreter.

Tcl also has synthetic channels, so you can stay in easy mode but open a bi-directional read/write channel with the two ends in separate threads, with readable/writable events assigned, so you can do automatic event-driven information sharing between threads.

I don't know any other language that gives you options like that.

I do have to wonder if Tcl would have gained significantly greater mindshare if the syntax had been more Algol-like.

Quite a few sysadmins around here have a different point of view regarding Perl.

Dream on.

The linked post is about Python startup being a problem with thousands of invocations. Is Python startup really a problem for the niches you mention, or is it a problem in some extreme edge cases? I would argue this is the latter and perhaps signals that an architecture change for the build or tests would be best.

I have been using Python for small scripts for 20+ years and haven't had this issue. The JVM on the other hand was historically slow to start.

If you need to run thousands of scripts, do you need to (re-)start Python for each script? IMHO what needs to be done for this problem is not faster startup, but a way to avoid startup by implementing a feature where you can keep a single Python "machine" in memory that can make a "soft reset" to execute a fresh script.
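A crude approximation of this already exists in the standard library: `runpy.run_path` executes a script file inside the interpreter that is already running. The sketch below is only that, a sketch. The `run_scripts` helper is hypothetical, and it is not a real "soft reset", since `sys.modules` and any state in imported modules carry over from one script to the next.

```python
import runpy
import sys


def run_scripts(paths):
    """Run each script in this already-started interpreter.

    Returns a dict mapping each path to an exit status (0 on normal completion).
    """
    results = {}
    for path in paths:
        saved_argv = sys.argv[:]
        sys.argv = [path]  # let the script see itself as argv[0]
        try:
            runpy.run_path(path, run_name="__main__")
            results[path] = 0
        except SystemExit as exc:
            # Scripts that call sys.exit() must not take the runner down.
            code = exc.code
            if code is None:
                results[path] = 0
            elif isinstance(code, int):
                results[path] = code
            else:
                results[path] = 1
        finally:
            sys.argv = saved_argv
    return results
```

Each script gets a fresh `__main__` namespace, but interpreter startup and already-imported modules are paid for only once.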

Yep. Even PHP solved this :)

That said PHP startup (parsing, because there's no on disk bytecode cache like .pyc - though there's an in-memory one [OPcache], as somewhat expected from a server thingie) was always quite fast, and it got a bit faster in php7: https://wiki.php.net/rfc/abstract_syntax_tree#impact_on_perf...

Yep. Tried to use a Raspberry Pi as my main system for a while and one of the pain points was slooooow startup of Python. As a Python fan I was embarrassed.

I don't particularly agree about this being what "Python is supposed to be such a great fit for."

I've been to quite a few PyCons and never heard anyone espousing this view, but I'm open to the possibility that I have missed it. Can you link me to a piece of media that you think persuasively makes the case that this is what Python is supposed to be for?

Python is not optimized for small glue code at all. The fact that it is the sanest language for use in that niche speaks much more about the ecosystem than about Python.

Python seems to be mainly optimized for web servers, scientific computing and machine learning tasks. None of those care about startup time.

Python is really only the target for those because someone lied to all of the systems folk and told them that Ruby was too slow. (The previous wave of infrastructure management tools seemed to all be written in Ruby and nowadays it's Python or Go.) That and python is one of the "official" languages at Google and everyone wants to be Google, right?

Meanwhile, Ruby is making great strides in performance and even has JIT coming in 2.6.

Not sure why you're saying Python was used as an alternative to Ruby when Python is older than Ruby.

Its popularity, and specifically its popularity for the mentioned use case, is not older, for those who know their history.

I think this is why Mercurial is switching (largely) to Rust: https://www.mercurial-scm.org/wiki/OxidationPlan

I totally understand that milliseconds matter in the use case described in the article.

For me, personally, I use python to automate tasks - or to quickly parse through loads and loads of data. To me, startup speed is somewhat irrelevant.

I built a micro-framework that is completely unorthodox in nature, but very effective for what I needed - that being a suite of tools available from an 'internet' server, available to me (and my coworkers) over port 80 or 443.

My internet server, which runs python on the backend (and uses apache to actually serve the GET / POST) literally spits out pages in 0.012 seconds. Some of the 'tools' run processes on the system, reach out to other resources, and spit the results out in under 0.03 seconds (much of that being network / internet RTT). To me, that's good enough - adding 30 or even 300 milliseconds to any of that just wouldn't matter.

I totally get that if Python wants to be a big (read bigger?) player then startup time matters more...but for my personal use cases, I'm not concerned with the current startup time one bit.

As expected, language start up time only matters to some people. Often in my case, Python is used to build command line tools (similar to the case of Mercurial).

In such an event, the start-up time of the program might dominate the total run time of the application. And on my laptop or desktop with a fast SSD with good caching and a reasonably fast CPU... that still ends up being 'okay'.

But once I put that on an ARM chip with a mediocre hard drive - some python scripts spend so long initializing that they are practically unusable. Whereas the comparable Perl/BASH script runs almost instantaneously.

Often to make Python even practically usable for such systems I have to implement my own lazily loaded module system. Having some language which allowed me to say...

    import(eager) some_module
    import(lazy) another_module

Which could trigger the import process only when that module becomes necessary (if ever).

Have you tried moving import statements into the functions where they are invoked? My understanding is this is effectively the same as lazy loading the module[1].

[1] https://stackoverflow.com/questions/3095071/in-python-what-h...
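For anyone who hasn't seen the pattern, this is all it amounts to. The function name is invented for the example, and `json` stands in for whatever heavy module you want to defer.

```python
def to_json(data):
    """Serialize data to a JSON string with stable key order."""
    # Deferred import: json is loaded the first time this function runs,
    # not when the enclosing module is imported. Later calls hit sys.modules.
    import json
    return json.dumps(data, sort_keys=True)
```

The cost moves from program startup to the first call, so it only helps when that call is rare or happens late.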

I have, and it actually works well (performance wise). The maintenance burden is a little higher.

A little Python preprocessor that lets you annotate your lazy modules sounds like a fun little toy project, actually. Not something I'd use for real, but it would be fun to build.

3.7 makes it easier to use dynamic imports. https://snarky.ca/lazy-importing-in-python-3-7/

Python is moving to have a lazy loader as part of the standard library. I mean, it's there already, at https://docs.python.org/3/library/importlib.html#importlib.u... , but not clearly easy to use, and with a big warning label against using it.
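For reference, the recipe in the importlib documentation looks roughly like the sketch below. The early return for already-imported modules is my own addition, not part of the documented recipe, and `colorsys` in the usage note is just an arbitrary standard-library module.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module object whose code only executes on first attribute access."""
    if name in sys.modules:
        # Assumption: an already-imported module is returned as-is.
        return sys.modules[name]
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    # With LazyLoader this only sets up the deferral; the real import
    # happens when an attribute of the module is first touched.
    loader.exec_module(module)
    return module
```

Usage would be something like `colorsys = lazy_import("colorsys")` at the top of a file; the module body runs when `colorsys.rgb_to_hsv` is first looked up.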

The issue at https://bugs.python.org/issue32192 says the plan is to start with an easier to use system as a PyPI package.

I've also written a little asynchronous module loading system. Why not load a module we know we're going to need in the background?
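One way such a background loader could look, as a guess at the general shape rather than the system described above, is a thread that runs the import while the main thread does something else. The `preload` name is hypothetical.

```python
import importlib
import threading


def preload(name):
    """Start importing the named module on a background thread; return the thread."""
    worker = threading.Thread(
        target=importlib.import_module, args=(name,), daemon=True
    )
    worker.start()
    return worker
```

You would call `t = preload("decimal")` early, do other setup, then `t.join()` before the first real use. Because of the GIL this mostly helps when the import is waiting on disk rather than executing Python code.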

I think you're telling us about how you're not affected by a problem that does affect other people. I feel like this doesn't add any substantial, interesting points to this discussion.

I have similar use cases. Startup time starts to matter once you either want to build test cases or put scripts in loops. If I have a script that parses one big data file, and I decide to parse 1000, it's often helpful if I can run that script a thousand times rather than refactor it to handle file lists. Or if you want to optimize some parameter.

> To me, startup speed is somewhat irrelevant.

But isn’t that the author’s point? It doesn’t seem like much time but because you’re paying it so often in so many little places it really does add up.

Sort of related story: we needed a scripting language able to run on an x86 RTOS type of architecture compiled with msvc and looked into CPython because, well, Python is after all quite a nice language. After spending a considerable amount of time to get it compiled (sorry, I don't recall all the issues there, but the main one was that the source code assumed msvc == windows, which I know is true for 99% of cases but didn't expect a huge project like CPython to trip over), it would segfault at startup. During step-by-step debugging it was astonishing how much code got executed before even doing some actual interpreting/REPL. Now I get there might not be a way around some initialization, but still it simply looked like too much to me and perhaps not overly clean either. Moreover it included a bunch of registry access (again, because it saw msvc being used) which the RTOS didn't have in full, hence the segfault. Anyway, we looked further and thankfully found MicroPython, which took less time to port than the time spent to get CPython even compiling. While not a complete Python implementation, it does the job for us, and it gets away with startup/init code of just something like 100 LOC (including argument parsing etc). Yes, I know it's not a fair comparison, but still, the difference is big enough to, at least for me, indicate CPython might just be doing too much at startup and/or possibly spends time on features which aren't used by many users and/or possibly drags along some old cruft. Not sure, just guessing.


Mercurial's startup time is the reason why, for fish, I've implemented code to figure out if something might be a hg repo myself.

Just calling `hg root` takes 200ms with hot cache. The equivalent code in fish-script takes about 3. Which enables us to turn on hg integration in the prompt by default.
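The check itself is tiny, which is why doing it without starting hg pays off. Here is a Python sketch of the same idea (fish does this in fish-script; the `find_hg_root` name is made up): walk up from the current directory until a `.hg` directory shows up.

```python
import os


def find_hg_root(start="."):
    """Return the nearest ancestor of start containing a .hg directory, or None."""
    path = os.path.abspath(start)
    while True:
        if os.path.isdir(os.path.join(path, ".hg")):
            return path
        parent = os.path.dirname(path)
        if parent == path:
            # Reached the filesystem root without finding a repository.
            return None
        path = parent
```

That is a handful of stat calls in the worst case, versus starting a full interpreter and loading Mercurial just to answer the same question.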

The equivalent `git rev-parse` call takes about 8ms.

Wow, that's quite a difference.

But 8ms is still too slow for me. :) I implemented the Git recognition code myself in my own prompt using the minimal amount of FS operations [1], and it renders in 5 ms from start to finish, including a "git:branch-name/47d72fe825" display.

[1] https://github.com/majewsky/gofu/blob/master/pkg/prompt/git....

(I work on Git in my copious free time)

One of the reasons git-rev-parse takes slightly longer than your implementation is that you just unconditionally truncate the SHA-1 to 10 hex characters. E.g. run this on linux.git:

    git log --oneline --abbrev=10 --pretty=format:%h |
    grep -E -v '^.{10}$' |
    perl -pe 's/^(.{10}).*/$1/'

You'll get 4 SHA-1s that are ambiguous at 10 characters; this problem will get a lot worse on bigger repositories.
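To illustrate why a fixed truncation length is unsafe, here is a small sketch that checks a set of full hashes for colliding prefixes and finds the shortest length that keeps them all distinct. It is not how Git computes abbreviations internally, and the function names are invented.

```python
from collections import Counter


def ambiguous_prefixes(hashes, length):
    """Return the prefixes of the given length shared by more than one distinct hash."""
    counts = Counter(h[:length] for h in set(hashes))
    return sorted(prefix for prefix, n in counts.items() if n > 1)


def min_unique_length(hashes, start=4):
    """Return the shortest prefix length, at least start, with no ambiguous prefixes."""
    if not hashes:
        return start
    longest = max(len(h) for h in hashes)
    length = start
    while length < longest and ambiguous_prefixes(hashes, length):
        length += 1
    return length
```

Feeding it the full hashes of a repository shows how the safe abbreviation length creeps up as history grows.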

Which is not to say that there isn't a lot of room for improvement. The scope creep of initialization time is one of the things that tends to get worse over time without being noticed, but Git unlike (apparently) Python makes huge use of re-invoking itself as part of its own test suite (tens of thousands of times), so it's naturally kept in check somewhat.

If you have this use-case I'd encourage you to start a thread on the Git mailing list about it.

I put similar code in Emacs's vc-hg to get revision information straight from Mercurial's on-disk data structures instead of firing up an hg subprocess.

You mean actually reading dirstate[0] or just the branch/bookmark files?

We also do the latter, but dirstate format isn't easily readable just with shell builtins (lots of fixed-length fields with NUL-byte padding, also we don't even have a `stat` builtin and the external program isn't a thing on macOS AFAIK), so we still fire up `hg status` for that - but only after we decide that there is a hg repo.


Somewhat tangentially, I noticed that fish performs quite badly in remote-mounted (sshfs) directories that are git repositories. I wonder if it would be possible to detect a remote-mounted filesystem and turn off/tone down some of the round-trip-heavy operations?

I've gone through your problem myself countless times, and concluded that hitting ctrl+c to interrupt the status line every time it tries to render the current repository state is not very productive.

My git status line uses timelimit (https://devel.ringlet.net/sysutils/timelimit/) to automatically stop if any of the git status parts (dirty/staged/new files) take > 0.1 seconds to finish:
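As a rough Python analogue of that approach (an assumption on my part, not the script referred to above), `subprocess.run` accepts a `timeout` and kills the child when it expires. The `run_with_deadline` helper name is hypothetical.

```python
import subprocess


def run_with_deadline(argv, seconds=0.1):
    """Return the command's stdout as text, or None if it fails or runs too long."""
    try:
        done = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            timeout=seconds,  # the child is killed once this expires
            check=True,
        )
    except (subprocess.TimeoutExpired, subprocess.CalledProcessError, OSError):
        return None
    return done.stdout.decode(errors="replace")
```

A prompt could call `run_with_deadline(["git", "status", "--porcelain"])` and simply omit the dirty/staged markers when it gets `None` back.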


I implemented something similar for Xonsh.

Ironically, xonsh itself suffers from a long startup time due to its use of Python. This is my primary (negative) experience with the issue in the linked article, and the reason why I stopped using xonsh.

This is truly a problem. Even more so if you host your application on a network directory. Loading all the small files takes ages. I really wish there would be a good way to compile the whole application with all the modules into one package once you're ready to release. I really wish the creators of Python would have given such use-cases more consideration.

Edit: I'm aware that there are solutions that put everything a program touches into a kind of executable archive. A single file several hundred Megabytes in size. I've tested it. It doesn't really pre-compile the modules. The startup time was exactly the same.

Nuitka (http://nuitka.net/) already does that and much more:

- it compiles your program and makes it standalone so you can distribute just the exe

- it makes it start faster

- it makes it run faster

- it's fully independent of the system Python. Actually your system doesn't even need a Python at all

I don't get why it isn't used more: it's very robust, compatible with 3.6, and on some of my scripts I get about a 4x speedup on startup alone.

This is different from the package that I've tested (PyInstaller or py2exe).

Is Nuitka compatible with numpy, pickle, etc? I remember that numpy was very problematic with compilers like pypy for a long time.

In my experience it's easier and more reliable to use than PyInstaller or py2exe, and cross-platform (but no cross-compilation). It doesn't pack Python files with an executable. It translates the Python code to C, then compiles it.

Nuitka supports numpy officially, you can even see it in change logs: http://nuitka.net/posts/nuitka-release-0521.html

I haven't tried pickle.

>This is different from the package that I've tested (PyInstaller or py2exe).

In an ancestor comment you say:

> A single file several hundred Megabytes in size.

Are both points referring to PyInstaller? Asking because I've tried out PyInstaller with small CLI as well as GUI (wxPython) programs, and the resulting EXEs did not reach near that size, IIRC.

It was around 150..300 MB if I remember right. Admittedly, I do import a lot of modules. Program startup for me is in the seconds, not milliseconds. I maybe could cut it down 50% or even 80% but then that would cost a few weeks or a few months. Having a quick and robust solution to be implemented in a single week to cut down startup times by that amount would be highly preferable.

Interesting. I see what you mean. That tip that someone else mentioned, IIRC, in this thread, to import modules inside functions that use them, could cut down startup time, but only if those functions are not called at startup, only later during the program's run. But it would not change the EXE size, I guess.

I wish Python and all other interpreted languages came with a way to build EXEs from the start. It would be great for deployment.

First time I hear about this, and I've looked for alternatives to cxfreeze and its cousins in the past.

Any time I see something like this, I feel like I'm hearing about some homeopathic cancer cure. If Nuitka actually does what it says it does, it's solving a big recurrent problem for the Python community, so why is nobody talking about it?

Having followed Nuitka since it started, I can offer my perspective:

- Before Nuitka, someone already did a "Python->C" compiler along similar lines: translate the Python source code into the C calls that the interpreter will make, eliminating any interpreter overhead and providing C compiler optimization opportunities. That thing sort-of-worked (with v1.5.2 IIRC), but was cumbersome to use and delivered a meager 5% performance improvement for the cases it did support; it was abandoned.

- Nuitka's plan had the same thing as a starting phase; people told the Nuitka guy that he's wasting his time based on prior experience. When he actually delivered a mostly-robust working version (much more usable than the previous attempts ever were), it indeed delivered only a small performance gain compared to CPython.

- As a result, it seemed like the community believed both that the whole thing is futile, and that the developer is fighting windmills.

- a lot of time passes, Nuitka keeps improving with better analysis, translation, compilation, etc - but the community has already cemented its opinion.

- Nuitka remains a useful magic system known to few.

I would say that the early Nuitka versions (and the prior attempt) gave it a SEP field that has never been lifted, and short of e.g. DropBox or Facebook adopting it, nothing will lift it either.

I think Nuitka is an answer looking for a problem. Usually people move away from Python if they really need more low-level performance, or they implement whatever system they need in Python itself to solve the problem Nuitka might solve for them.

That said, it's a wonderful project, I hope more and more people will find it useful for them.

Quite the opposite, distributing a Python binary is one of the most popular demands in the community, along with better multi-core support and a JIT.

It used to be packaging and V2/V3, but those are fading away now with wheels everywhere and 3.4+ being the new love affair. Python has been in the habit of improving every year, steadily for 28 years, solving the problems the community asked every time.

How is that the opposite? :)

It just means that there are a lot of folks who stay on Python, but want better deployment. That's great, but we don't see those who simply move away from Python (and use Electron with a Rust backend for example, or go full web + maybe native Android/iOS apps).

That's the question I'm asking.

Not only is it a beautiful tool, but the author has been quietly and steadily working on it for 8 years. Compatibility is the number one goal.

The guy has a lot of rigor and humility, so maybe communication suffered?

Let's give it some visibility then. https://news.ycombinator.com/item?id=16980704

hg allows loading modules at runtime. Maybe that's a problem.

I doubt it, since you can manually pass a list of all modules you want Nuitka to embed with --recurse-plugins=MODULE/PACKAGE

But the problem is that you can't know the list you need at embed time for hg, because extensions are arbitrary python files discovered at runtime.

I collect ideas, especially weird and powerful ideas.

I've learned not to try to talk about it because of that question: "If foo is so great, why isn't everybody using it?"

It's one of the single greatest frustrations of my life. I don't know. I've never made any progress on it. The best you can say is, "Well, that seems to be human nature." The world is full of "magic beans" and most people seem interested in banging their heads against the wall.

(Did you know, you can make a 140hp engine that fits in the volume of two stacked pizza boxes and has only one moving part?)

Anyhow, Nuitka is great, it does do all that.

And the creator is a freakin' saint for putting up with the way he's been treated by the Python community, is my opinion.

Also, you should write a blog about those ideas. Or a place where we can share them, but in a non tin foil way / silver bullet way.

Eg: after 7 years of having malaria, one tropical disease doctor explained to me that we have been able to cure malaria for years. Generalist doctors usually don't know it because they don't encounter the disease often enough to keep up to date. It's kind of a hard pill to swallow given that I always thought you had it for life.

I'd love to, but from experience I can tell you, you just get skeptics and crackpots and scammers and suckers crawling out of the woodwork and gumming up the scientific/inventive process. That, combined with the apathy of the general public, means that it's just hard to get a real conversation going in a "non tin foil way / silver bullet way".

It also means that a lot of great ideas go nowhere, or take 50-100 years to get adopted. Your experience with malaria is an example. My condolences btw, that sounds terrible.

In any event, there's Rex Research: http://rexresearch.com/1index.htm (IGNORE HOW IT LOOKS!!!) This fellow has been collecting inventions and other weird stuff for decades, since before the internet. He used to run little ads in the back of Popular Science and others like that. Yes, the site looks a little... creative, and much of the stuff he lists is just crackpottery, but the stuff that isn't is mind-blowing.

Just one example, one of my favorite devices: the Hilsch-Ranque Vortex Tube. It's a "Maxwell's Demon" (although it does not violate Thermodynamics. Of course.)

> The vortex tube, also known as the Ranque-Hilsch vortex tube, is a mechanical device that separates a compressed gas into hot and cold streams. It has no moving parts.


You can actually buy these to go on the end of a compressor and provide "spot cold" for cooling off whatever. I emailed a company that sells them once to ask what would happen if you set it up in a feedback loop so that the cold output was chilling the input line, but they were uninterested.

Oh hey, time marches on and it now has a Wikipedia page! https://en.wikipedia.org/wiki/Vortex_tube

Anyhow, like I said, I'd like to blog about this stuff but most of the people who would be into it would be either haters or credulous fools. Speaking of which, YouTube has lots of videos of people talking about and sometimes even demonstrating things. But again, you have to wade through all kinds of bullshit and scammers and hoaxers and skeptics and credulous fools to actually find the handful of people who take this stuff seriously but can maintain a proper detached scientific attitude to actually investigate it. YMMV

Saint is the word given he has been working really seriously at it for 8 years, alone, never complaining, and giving away everything without any recognition.


It's a function of how many people really need that solution. It's high enough that it exists. It's low enough that it's not a thing you do by default or even talk about much. But if you need it and it's compatible with your code - it's there.

Actually any sysadmin or scripter wannabe could benefit from Nuitka. Twitter even invented pex because of that. Nuitka took some time to become 100% compatible with the latest versions of CPython, but now that it is, let's all enjoy it.

I like Nuitka and what it's doing, but it's just not that comfortable in every single situation. There's value in seeing the source and being able to modify it in place without a compile-and-redeploy step. There's value in dtrace support, which as far as I can tell Nuitka doesn't have. And other small things like existing profilers which work with CPython stack traces specifically.

So if there is a reason to use it - great. But that's not every situation.

Can't you use CPython for development and Nuitka for deployment?

Does nuitka build a static executable or do you still need to supply shared libraries with the executable?

It builds a static executable if you pass it the --standalone option.

No, it doesn't.

  $ echo 'print "Hello world"' > hello.py
  $ nuitka --standalone hello.py 
  $ ldd hello.dist/hello.exe
  	linux-gate.so.1 (0xf7f94000)
  	libdl.so.2 => /lib/i386-linux-gnu/libdl.so.2 (0xf7895000)
  	libpython2.7.so.1.0 => /home/jwilk/hello.dist/libpython2.7.so.1.0 (0xf7508000)
  	libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xf732f000)
  	/lib/ld-linux.so.2 (0xf7f96000)
  	libz.so.1 => /home/jwilk/hello.dist/libz.so.1 (0xf7310000)
  	libpthread.so.0 => /lib/i386-linux-gnu/libpthread.so.0 (0xf72f1000)
  	libutil.so.1 => /home/jwilk/hello.dist/libutil.so.1 (0xf72ed000)
  	libm.so.6 => /lib/i386-linux-gnu/libm.so.6 (0xf71eb000)
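
To make the listing above easier to read, here is a minimal Python sketch that sorts the `ldd` lines into libraries resolved inside the `hello.dist` directory and libraries resolved from the system. The helper name `split_ldd_output` is made up for illustration, and the sample text is copied from the output above; nothing here is part of Nuitka.

```python
def split_ldd_output(ldd_text, dist_dir):
    """Split ldd output into (bundled, system) library names.

    A library counts as bundled when its resolved path lives under dist_dir.
    """
    bundled, system = [], []
    for line in ldd_text.splitlines():
        if "=>" not in line:
            continue  # linux-gate and the dynamic loader have no "=>" mapping
        name, _, rest = line.partition("=>")
        path = rest.split()[0]
        (bundled if path.startswith(dist_dir) else system).append(name.strip())
    return bundled, system


# Sample copied from the ldd output quoted in this thread.
SAMPLE = """\
linux-gate.so.1 (0xf7f94000)
libdl.so.2 => /lib/i386-linux-gnu/libdl.so.2 (0xf7895000)
libpython2.7.so.1.0 => /home/jwilk/hello.dist/libpython2.7.so.1.0 (0xf7508000)
libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xf732f000)
/lib/ld-linux.so.2 (0xf7f96000)
libz.so.1 => /home/jwilk/hello.dist/libz.so.1 (0xf7310000)
libpthread.so.0 => /lib/i386-linux-gnu/libpthread.so.0 (0xf72f1000)
libutil.so.1 => /home/jwilk/hello.dist/libutil.so.1 (0xf72ed000)
libm.so.6 => /lib/i386-linux-gnu/libm.so.6 (0xf71eb000)
"""

bundled, system = split_ldd_output(SAMPLE, "/home/jwilk/hello.dist/")
print("bundled:", bundled)
print("system: ", system)
```

On this sample, the executable is dynamically linked either way; the split only shows which shared objects ship alongside it and which must already exist on the host.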

OK, it builds a static executable against the Python runtime and extensions, including the C ones. It doesn't against libc, libutil, etc. But aren't those almost always installed?

If it's a real problem on your machines, it's PR time!

Nuitka has not been able to compile any Python code I've written myself. It's not used because it's incredibly limited.

When was the last time you tried it?

Is it a perl2exe descendant, packing the interpreter into an executable wrapper?

No, it compiles Python to C, then compiles the C.

I know this isn’t everyone’s favorite but Cython has a way to convert your Python code into an executable with Python embedded, and I believe it also packs your imports

Cython is a complicated beast but I feel like it just needs a more friendly wrapper for this to be more widespread.



Why Cython isn’t in the stdlib (I think it could easily replace ctypes) is beyond me sometimes

I worked on one Python application that had a startup time problem because it was on a network filesystem with slow metadata/stat times. It took several seconds to start Python.

We were able to solve most of the problem by zipping up the Python standard library and our application.

That is, if you look at sys.path you'll see something like:

  >>> sys.path
  ['', '/usr/local/lib/python36.zip', '/usr/local/lib/python3.6', ...]
If you zip up the python3.6 directory into python36.zip then it will use that zip file as the source of the standard library, and use the zip directory structure instead of a bunch of stat calls to find the data.

This should also include getting access to the pre-compiled byte code.
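
A small sketch of the same mechanism at toy scale, assuming Python 3: any zip archive placed on `sys.path` is searched by the built-in zip importer, so a module can be imported straight out of it. The archive name `bundle.zip` and module name `hn_zip_demo` are invented for the example.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def build_zip_with_module(zip_path, module_name, source):
    """Write a one-module zip archive that the zip importer can read."""
    with zipfile.ZipFile(zip_path, "w") as zf:
        zf.writestr(module_name + ".py", source)


def import_from_zip(zip_path, module_name):
    """Put the archive on sys.path and import the module from it."""
    if zip_path not in sys.path:
        sys.path.insert(0, zip_path)
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


tmpdir = tempfile.mkdtemp()
archive = os.path.join(tmpdir, "bundle.zip")  # hypothetical archive name
build_zip_with_module(archive, "hn_zip_demo", "GREETING = 'hello from zip'\n")
demo = import_from_zip(archive, "hn_zip_demo")

# __file__ points inside the archive, showing where the module came from.
print(demo.GREETING, demo.__file__)
```

The lookup happens against the archive's directory listing, which is read once, rather than through per-module filesystem probes; that is the effect described above for slow network filesystems.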

You can also have Python byte-compile all of the .py files in a directory as part of your build/zip process.

  python -m compileall --help

Don't forget

  find . -type f -name "*.py" -delete
right afterwards.
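
For Python 3 there is a catch worth sketching: a `.pyc` under `__pycache__` is ignored once its source is gone, so the compile step has to use the legacy layout (`compileall -b` on the command line, `legacy=True` in the API) before the `.py` files are deleted. The helper `compile_and_strip` and the module name `hn_pyc_demo` below are made up for the demonstration.

```python
import compileall
import importlib
import os
import sys
import tempfile


def compile_and_strip(root):
    """Byte-compile every .py under root beside its source, then delete the sources."""
    # legacy=True writes foo.pyc next to foo.py instead of __pycache__/foo.cpython-XY.pyc;
    # Python 3 only imports sourceless .pyc files from that legacy location.
    ok = compileall.compile_dir(root, quiet=1, legacy=True)
    if not ok:
        raise RuntimeError("byte-compilation failed; sources left in place")
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".py"):
                os.remove(os.path.join(dirpath, name))


root = tempfile.mkdtemp()
with open(os.path.join(root, "hn_pyc_demo.py"), "w") as f:
    f.write("ANSWER = 6 * 7\n")

compile_and_strip(root)
sys.path.insert(0, root)
importlib.invalidate_caches()

import hn_pyc_demo  # resolved from hn_pyc_demo.pyc; the .py no longer exists

print(hn_pyc_demo.ANSWER, os.listdir(root))
```

Checking the return value before deleting anything keeps a failed compile from leaving a tree with neither sources nor bytecode.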

Also note calls to imp.load_source need to change to imp.load_compiled, and any .py files referenced directly in code need to be changed to .pyc (this is with 2.7, not sure about 3.x)
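
On 3.x the `imp` module was deprecated and later removed (it is gone as of Python 3.12), so a rough equivalent has to go through `importlib`. The following is only a sketch of one way to load a `.pyc` by explicit path; the function name `load_compiled` mirrors the old API for readability, and the `plugin` module is invented for the example.

```python
import importlib.machinery
import importlib.util
import os
import py_compile
import tempfile


def load_compiled(name, pyc_path):
    """Load a module from a .pyc file by explicit path (stand-in for imp.load_compiled)."""
    loader = importlib.machinery.SourcelessFileLoader(name, pyc_path)
    spec = importlib.util.spec_from_file_location(name, pyc_path, loader=loader)
    module = importlib.util.module_from_spec(spec)
    loader.exec_module(module)
    return module


tmpdir = tempfile.mkdtemp()
src = os.path.join(tmpdir, "plugin.py")  # hypothetical module
with open(src, "w") as f:
    f.write("def double(x):\n    return 2 * x\n")

# Compile to an explicit .pyc path, then remove the source to prove it is not needed.
pyc = py_compile.compile(src, cfile=os.path.join(tmpdir, "plugin.pyc"))
os.remove(src)

plugin = load_compiled("plugin", pyc)
print(plugin.double(21))
```

Note that the module is not registered in `sys.modules` by this helper; callers that need that behaviour would have to add it themselves.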

Thanks, I will try that!


I think design choices made in Python simply don't allow for comprehensive ahead of time compilation. For what it's worth, they have recently landed snapshots in Dart that do what you want:


It's what Flutter uses on iOS since you can't run JITed code; AOT compile it and load it as just another shared library.

Also, it’s not statically linked, so you need to make sure all of the shared libraries exist on the host, requiring you to install a whole bunch of trash.

