
What an amazing read. Now I know why my pip installs are failing on 3.12, but we have a brighter future ahead.

Also, while I love Python, it’s helpful to understand why Python packaging is a (manageable) mess. It’s because of the non-standardization of build tools for C/C++/Fortran and the immensity of the ecosystem, nothing to do with Python itself. Part of it is irreducible complexity.

It’s a miracle it works at all.



Yes, that's a fundamental reason Python packaging is a mess. Python's success is largely due to the availability of key mixed-language packages. No other mainstream language package manager has to deal with this.

For example, cargo for Rust, which is great, can assume it's packaging mostly Rust-only code. And while Rust is compiled, the language "owns" the compiler, which means building from source as a distribution strategy works. I don't know how/if cargo can deal with e.g. Fortran out of the box, but I doubt cargo on Windows would work well if top cargo packages required Fortran code.

The single biggest improvement for the Python ecosystem was the standardisation of a binary package format, the wheel. It was only then that the whole scientific Python ecosystem started to thrive on Windows. But binary compatibility is a huge PITA, especially across languages and CPUs.


Many Rust crates actually do package code written in other languages. There are plenty of useful C/C++/Fortran libraries that nobody has rewritten in Rust but for which wrappers have been created that call into C. It works in Rust because build.rs lets libraries do whatever they want during the build process, including invoking compilers for other languages.

Various factors still make the Rust and Python story different (Python uses more mixed-language packages, the Rust demographic is more technically advanced, etc.). But a big one is that in Rust, the FFI is defined in Rust. Recompiling just the Rust code gives you an updated FFI compatible with your version of Rust. In Python, the FFI is typically defined in C, so recompiling the Python side won't get you a compatible FFI. If Python did all FFI through something like ctypes, things would be much smoother.
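
To make "the FFI is defined in Rust" concrete, here's a minimal sketch (my own example, not from any particular crate). The foreign signature lives in the Rust source, so recompiling the Rust side always regenerates declarations that match your toolchain:

    // cos comes from the C math library, which std already links on
    // common platforms; the declaration itself is plain Rust.
    extern "C" {
        fn cos(x: f64) -> f64;
    }

    fn main() {
        // Calling across the FFI boundary is unsafe, but defining the
        // interface needed no C compiler at all.
        assert_eq!(unsafe { cos(0.0) }, 1.0);
    }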


> It works in Rust because the build.rs lets libraries do whatever they want during the build process, including invoking compilers for other languages.

It's still its own mess. You'll find plenty of people having problems with openssl, for instance.


I have the pleasure of maintaining a reasonably popular -sys crate, and getting it working on Windows is an absolute nightmare, especially considering the competing toolchains & ABIs (msvc and gnu) and millions of ways the C dependency can be installed (it’s a big one, and users may choose a specific version, so we don’t bundle it), with no pkg-config telling you where to look.

No idea how build.rs being able to run any Rust code is an advantage; it’s not like setup.py can’t run any Python code and shell out. In fact, bespoke code capable of spitting out inscrutable errors during `setup.py install` is the old way that newer tooling tries to avoid. Rust evangelism is puzzling.


Wait a second, I need to understand this better.

If you cargo build, can that run a dependency's build, including trying to compile C and stuff?



I believe build.rs can do pretty much anything: https://doc.rust-lang.org/cargo/reference/build-scripts.html
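
For a taste: a build.rs is just a Rust program that cargo compiles and runs before building your crate, so it can shell out to anything. A hypothetical sketch (assumes a C compiler named cc on PATH and a made-up src/shim.c; real crates usually lean on the cc helper crate mentioned downthread):

    use std::process::Command;

    fn main() {
        // Invoke a C compiler by hand. (A real script would write the
        // object file into the directory given by the OUT_DIR env var.)
        let status = Command::new("cc")
            .args(["-c", "src/shim.c", "-o", "shim.o"])
            .status()
            .expect("failed to spawn the C compiler");
        assert!(status.success());
        // Tell cargo to rerun this script only when the C source changes.
        println!("cargo:rerun-if-changed=src/shim.c");
    }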


That's both scary and sad :-(


Why? If you are using a crate, its code will be running in your application. It's not really any more of a concern if it can run code while building.


Yeah, but with binary packages you can add another layer of defense in depth: signed packages, signature checking, etc. It's not just about the original authors themselves; it can also be about attacks on the public repositories, for example.


When I referred to build.rs, I merely meant that the build script made it possible to build code written in other languages, not that it solved all the problems. It very much doesn't solve all the problems involved.


Though you get partially saved by the "FFI is defined in Rust" part, with many of the hard-to-compile crates offering optional prebuilt binaries for the part that's not Rust.


> It works in Rust because the build.rs lets libraries do whatever they want during the build process, including invoking compilers for other languages.

And also there's a helper library (the "gcc" crate) which does all the work of figuring out how to call the C or C++ compiler for the target platform, so that build.rs can be very small for simple cases. You don't have to do all the work yourself.
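
For simple cases it really is tiny. A minimal sketch (the crate is published under the name cc these days, as the replies below note; the file name is invented), assuming cc is listed under [build-dependencies]:

    // build.rs
    fn main() {
        // cc picks the right C compiler and flags for the target,
        // builds a static library, and prints the cargo: directives
        // needed to link it into the crate.
        cc::Build::new()
            .file("src/shim.c")
            .compile("shim");
    }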


Don't you mean the `cc` crate? `gcc` is its terribly-outdated predecessor.


It's the same crate, it just changed its name at some point in its history. I still refer to it by its former name out of force of habit.


I know it's the same crate, but they changed its name. `gcc` is the predecessor of `cc` in terms of name. If you depend on `gcc` instead of `cc`, you won't have any of the new improvements.


gcc is not terribly outdated overall; this is FUD. Specific information, please.


The `gcc` crate was last published to in 2018. Use `cc`.

https://docs.rs/crate/gcc/latest/builds



As you might expect, compiling Rust crates that use C libraries can lead to the inscrutable-block-of-text linker errors we know and love. I have been having a rough time with CUDA, CMSIS-DSP, TensorFlow, and OpenCV during the past few weeks. One of them requires LLVM v15 to be installed; another requires v16+. A different one requires an old version of rustc to be installed. On a different one, when I posted on GitHub, the maintainers assured me the crate is, in fact, fine, and my system is misconfigured; phew!


Lol. As a maintainer of a reasonably popular and complex -sys crate: our crate is “fine” on Windows, at least in theory, and I’ve heard of successes using it. However, I can’t even port my own app depending on said -sys crate to Windows; there’s always a wall of linker errors. If you report a Windows problem to me, I won’t tell you it’s fine; I just throw my hands in the air.


Out of curiosity, is it a “Windows is objectively difficult” problem, or a “Windows is not Linux and I know Linux best” problem?

I’ve only begun using it, so my expertise is limited, but I think vcpkg aims to help with some of these difficulties by shipping dependencies as source and building them locally, so they are guaranteed ABI-compatible because the same compiler builds everything.


That I don’t know Windows well is certainly a factor, but I think it’s at most 40% of the problem. Several notable issues, not all:

- Competing msvc and gnu toolchains and ABIs, with native and Windows-first dependencies working better or exclusively with msvc, and *ix-first dependencies working better or exclusively with gnu, is a uniquely Windows situation. Which is which for a given build is also not clearly labeled most of the time. (You might mention glibc vs musl, but there’s basically nothing uniquely musl, and when you’re compiling for musl you can almost always get/compile everything for musl from the ground up.)

- Confusing coexistence of x86 and x64 is another thing largely unique to Windows. (amd64 and arm64 are much more clearly separated in Apple land.)

- Package management is a complete mess. Chocolatey, Scoop, winget, NuGet, vcpkg, ad hoc msi, ad hoc exe installer, ad hoc zip, etc. etc. There’s no pkg-config telling you where to look and which compiler flags to use. If you want to pick up a user-installed dep, you special-case everything (e.g. looking in the vcpkg path) and/or ask the user to supply the search path(s) in env var(s); see the sketch at the end of this comment.

Anyway, shit mostly(tm) just works(tm) on *ix if you follow the happy path. More often than not, there’s no happy path on Windows.
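
The env var escape hatch usually boils down to a few lines of build.rs like this (a sketch of the pattern only, with invented names like FOO_LIB_DIR, not our actual script):

    // build.rs: let the user point us at a prebuilt dependency,
    // since there's no pkg-config on Windows to ask.
    fn main() {
        if let Ok(dir) = std::env::var("FOO_LIB_DIR") {
            println!("cargo:rustc-link-search=native={dir}");
        }
        println!("cargo:rustc-link-lib=foo"); // foo.lib / libfoo.a
        println!("cargo:rerun-if-env-changed=FOO_LIB_DIR");
    }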


> The single biggest improvement for python ecosystem was the standardisation of a binary package format, wheel.

I agree. Some people love to complain about Python packaging. But from one perspective, it's arguably been a solved problem since wheels were introduced 10 years ago. The introduction of wheels was a massive step forward. Only depend on wheel archives; don't depend on packages that need to be built from source and drag in all manner of exciting compile-time dependencies.

If there's a package you want to depend on for your target platform, and the maintainers don't produce a prebuilt wheel archive for your platform -- well, set up a build server and build some wheels yourself, and host them somewhere, or pick a different platform.


> Yes, that's a fundamental reason python packaging is a mess. Python success is largely due to the availability of key mixed language packages. No other mainstream language package manager has to deal with this.

Admittedly I'm not a Python expert, but Julia handles this just fine? It doesn't seem like a difficulty inherent to "mixed language packages". It appears to me that Python's approach is just bad somehow.


Then again, Julia seems to take ten seconds to start doing anything non-trivial each time, while modules are being compiled.


THEN AGAIN, if you want to restart repeatedly, for whatever reason (??), maybe you should just compile the modules once...? See e.g. [1]

TLDR: There's a package called PrecompileTools.jl [2] which allows you to list: I want to compile and cache native code for the following methods on the following types. This isn't complicated stuff.

[1] https://julialang.org/blog/2023/04/julia-1.9-highlights/#cac...

[2] https://julialang.github.io/PrecompileTools.jl/stable/


Julia 1.9's native precompilation is definitely helpful in that regard, but loading those native shared libraries (.so files on Linux) into Julia does take some time to verify.

If the main objective is to reduce time to load and time to first task, then PackageCompiler.jl [3] is still the ultimate way to do so.

Because Julia is a dynamic language, some complicated compilation issues arise, such as invalidation and recompilation. Adding new methods or packages may result in already-compiled code no longer statically dispatching correctly, requiring invalidation and recompilation of that code.

It's slightly more complicated than what you stated. It's "I want to compile and cache native code for the following methods on the following types in this particular environment". PackageCompiler.jl can wrap the entire environment into a single native library, the system image.

[3] https://github.com/JuliaLang/PackageCompiler.jl


I don't think this is really true. NodeJS has quite a similar problem.


Doesn't R handle fortran and C++ packages?


It does -- and to a much lesser extent shims exist for other languages as well. R also has imperfect packaging but I think it's handled well albeit with a level of complexity that also goes up a lot unexpectedly at times. For a truly great experience, call python packages within R in a custom conda environment in order to get data out of pandas in a particularly unholy way...


R has always shipped binary packages on Windows and Mac to avoid lots of the pain we see in Python.

Also, all packages must build with the latest version of R, or they are removed from CRAN. This makes the dep problems a lot less severe than we see with Python.


Nothing to do with Python? These FFI bindings exist because Python is slow as dirt.


This is one of the reasons why Python got so popular.

It is too slow to reimplement big pieces of software in it, so people just used bindings to existing code. And productivity rocketed!


These FFI bindings exist because Python was designed as a glue language for FFI bindings.


And the bindings to python exist because those languages are a pain in the ass to work with.

I don't want to work with Fortran, C++, Cobol, etc. And I sure as hell don't want to figure out how to integrate such wildly different languages into my existing and modern ecosystem.


You're probably replying to the original comment from the wrong angle.

Ecosystems like Java, .NET, Golang, Rust, etc do away with this entire problem by virtue of... not calling into C 99.99% of the time, because they're <<fast enough>>.


There's no right or wrong angle here. There's the useful or not useful angle.

Python was designed to call into C. It was always the solution to make Python fast: write the really slow parts in C and it might just turn out that will make the whole thing fast enough. Again: this is by design.

The languages and VMs you list were designed to be fast enough without calling into C. If you need that, great, use them.

People saying 'Python is slow' miss the point. It was never meant to be fast, it was always meant to be fast enough without qualifiers like 'no C'. If it isn't fast enough or otherwise not useful, don't use it, you've got plenty of alternatives.


I don't think Python was designed to call into C. Is there some document from the early days claiming that was a major design goal?

1. The only way to integrate with C or other languages in early Pythons was to write interpreter extensions against the internal API. The cffi module seems to have appeared as late as 2012.

2. The Python interpreter API is not an excellent way to extend it, being as it is just whatever happens to be the internals of CPython specifically. There's now the HPy project, which is trying to define a JNI equivalent for Python, i.e. something vendor-neutral, binary compatible and so on.

A language designed to call into C would have had a much easier-to-use FFI from day one.


My understanding, however, is that calls into Fortran happen because you want some subroutine to be «as fast as possible».


> It’s a miracle it works at all.

I agree. In fact with what seems to be an exponential growth in complexity of software ecosystems, what's keeping it all from eventually getting to a "tower of Babel" catastrophe? Of course, this does not only apply to software, but it is a good example.


Yes, it was very eye-opening. I often see people comparing their favourite package manager to Python's and coming to the conclusion that Python is terrible, but it's not! One thing I don't quite understand is: why don't the Python people just use a C/C++ math library instead of Fortran?


There often simply doesn't exist an equivalent library written in C/C++ (or any other language, for that matter). The example I'm familiar with is SLICOT (Subroutine Library in Systems and Control Theory) [1], exposed in Python through Slycot [2]. It has routines for pole placement, Riccati solvers, various factorizations, MIMO zero computations, and a ton of other stuff. As far as I have been able to find, no C/C++/other-language library comes close to supplying either the breadth or depth of this one. Further, many of the SLICOT subroutines were written by the original inventors of their respective algorithms, which I view as a big bonus.

[1] http://slicot.org/ [2] https://github.com/python-control/Slycot


A lot of these kinds of routines could be translated to other languages but aren't, because they are complex and often unmaintained, and no one is really around who understands them well enough to port them to C or Rust or whatever.

There is also the issue that often they were published before adding a LICENSE file was a thing. I've found myself in the position of having to email professors at some random university to ask them for permission to redistribute such a routine while packaging a library that depended on it. In one case I asked them if it would be possible to update their code with a license (it was just a zip file on netlib), and the answer was "no, but you have my email". So I found myself having to write something like "distributed with permission from the author, private correspondence, pinky swear" in my copyright file. Some of this code is so old that the authors aren't around anymore, and it would get "lost" in terms of being able to get permission to use it. It's a potential crisis, to be honest, if people really cared to check. (Until the copyrights expire, I guess, which is what, 70 years after the author's death or some such?)

Anyway, I wonder if a potential solution is to autotranslate some of these libraries using LLMs? Maybe AI will save us in the end. Of course, you can't trust LLMs, so you still need to understand the code well enough to verify any such translation.


> (Until the copyrights expire I guess, which is what, 70 years after the author's death or some such?)

«To promote the Progress of Science and useful Arts», my ass. Why do we keep tolerating this Disney-caused bullshit?

Though I guess that we didn't, and this is what caused the Free Software movement, so it all works out in the end?


Right, lots of legacy code, plus lack of pointer aliasing in Fortran opens up more opportunities for optimization (or so I have read; this might have changed).


'restrict' has been a common extension in C since long ago and is now a proper keyword.

I guess it is a matter of taste.


Also a matter of 'I'm not going to rewrite LAPACK in C because a platform which was never the intended target doesn't have a free compiler'.


Because some of the best math libraries are written in Fortran. Seriously, lots of heavy-duty scientific code was written in Fortran in the 1970s and is still underpinning applications today. In many cases there is no equivalent alternative.


IMO I would much rather write a math library in Fortran than C or C++. Fortran is quite a joy for doing numerical work. It sucks for most other things, though. Really, the only consideration nowadays is that you'll probably be using GPUs, which makes C++ better for CUDA integration.


Why would C++ be better than FORTRAN for GPUs?


Indeed. The real problem is that Python seems to attract people with no training in software development. It is a mess on top of a mess.


That's also a feature - by design it has to be friendly to new users, and not an arcane art only accessible to the Chosen One, as much as those Chosen Ones would like to be the only programmers.


I thought that was what people said about PHP.


A lot of the mockery PHP got was not because it attracted amateur developers; it was because the language itself was amateurishly implemented, and because of the resulting mess when that leaked into how it behaved. Things like function names in the standard library optimized for a strlen-based hash, a hand-rolled parser that made it impossible to even guess in which contexts which features would work, proactive conversion of strings into numbers ("0hello" == "0world"), ... There were entire communities dedicated to mocking not the people working with PHP but the language itself.


And Basic, Visual Basic, HTML, Javascript, the list is probably endless :-)


It's also to do with Python itself.


How?


The Python packaging world is full of barely compatible tools and no clear vision. Even if you're consuming packages, or packaging pure Python code, it's often an incomprehensible mess.


Well, part of it is really Python's age and legacy. We are talking about going back to 1996. So much of Python's development was duct tape applied through history in response to the changing world and the whims of the contributors.

I'm not saying it's an excuse, but it's just how it got to where it is. Newer languages have a lot of lessons learnt to build upon, letting them be decent from day 0.


Java and Ruby are similar in age to Python, and dependencies are a much better story there.


Ruby never had nearly the FFI/other-language problem that Python has, so it could focus almost entirely on delivering Ruby code.


The same goes for Java, since its ecosystem has an allergy to calling native (non-JVM) code, to the point of rewriting perfectly good libraries in Java. When they do call native code, it's often in horrible ways (like copying a native library from within the JAR file to a temporary directory and loading it from there, the JAR file coming with pre-compiled native libraries for all possible operating system and architecture combinations). So the Java package managers mostly focus on building and delivering Java (and other JVM language) code.


Maven and Gradle support building C and C++ libraries just fine.


While this may be true, they've also had literal decades to improve the situation and have barely got anywhere. In some ways they've gone backwards!


I haven't found it so. I've stuck with pip and adopted venv when it showed up, and haven't needed anything else. I use Docker for "pinned" builds.


venv and Docker are exactly the indicators of how bad Python is.


Are screws bad because you need a screwdriver?


That's a completely nonsensical analogy. Maybe you missed his point, but well-designed programming language infrastructure does not need Docker or venv to work. The fact that you have to resort to the massive hack that is Docker shows how bad the situation is.

I do not have to use Docker or a venv for my Rust, Go or Deno builds.


I'm not a Rustacean, but so far I much prefer Python's packaging to Go's. Italicize all you like. I don't find the emphasis convincing.

When I receive a Python program that I'd like to modify, I can. When I receive a Go program that I'd like to modify, I must beg for the source code.

Do you "vendor" your database into your Go program? If not, you likely still need Docker, or something like it, for your program to work.


No, but you have to use Rust, Go or Deno.


I get to use Rust, Go or Deno (for my own projects). I am forced to use Python for work unfortunately.

It's not a terrible language for sure. It's just the packaging systems and tooling around it that are face-palmingly awful.


The thing is, I've learnt Python packaging basically one time (obviously there's always more to learn about anything), and that's enabled me to write a ton more software than I would have written otherwise. For example, just a little plotting utility for me to visualise my accounting data. It's on GitHub, but nobody uses it but me. Could you imagine me writing this in anything but Python?

If your projects really do benefit from other languages, maybe because you're doing network or systems applications that need to be fast, then you might have easier packaging, but those languages require way more work due to fewer libraries or just being harder (Rust).

As usual with these things it's six of one and half a dozen of the other :)


You're mistaking a hammer for a screwdriver.


This pair of articles is excellent, both for good practical advice about installing Python packages and for their general attitude about how to teach difficult things to large groups of people:

https://www.bitecode.dev/p/relieving-your-python-packaging-p...

https://www.bitecode.dev/p/why-not-tell-people-to-simply-use

Every single decision point or edge case represents permanent failure for hundreds of people and intense frustration for thousands. Of course, none of this is really to do with Python the language. It's more about the wide user base, the large set of packages and use cases, and the overlapping generations of legacy tools. But most of it isn't C/C++/Fortran's fault either.


I'm assuming you're not a Python user, otherwise you'd already know the many answers!

This link might give you a taste:

https://packaging.python.org/en/latest/key_projects/


I am a Python user, but I've never heard of most of the tools in that list. This is probably because everyone and their cousin attempts to write yet another package manager for Python.

The built-in tools venv and pip (together with requirements.txt and constraints.txt) meet 99% of real-life dependency management needs.


The proliferation of requirements.txt is one of the key reasons why Python packaging sucks so much.


Right, what we need is a requirements.yaml (better yet, create an entirely new markup language for this particular project) and another new package manager for it. One day (one day!) I will start a project without Python. One can hope.


> Right what we need is a requirements.yaml

It already exists; it's called pyproject.toml. It already existed for years in the form of setup.py. requirements.txt means that projects can't be automatically installed, which contributes massively to the difficulty of getting packages to work.


pyproject.toml is a step in the right direction AFAICT, but man, is it complicated: "Please note that some of these configurations are deprecated, obsolete or at least discouraged, but they are made available to ensure portability." Core vs. setuptools-specific, etc. See https://setuptools.pypa.io/en/latest/userguide/pyproject_con... and https://packaging.python.org/en/latest/specifications/declar...


Also web frameworks! Many web frameworks.



