I think it's the wrong abstraction. It encourages ridiculous amounts of bloat and duplication: in bandwidth, file systems, virtual environments, and Docker images. And it's not actually all that portable.
It encourages blindly installing unauditable giant binary blobs which run with all kinds of leaky abstractions. But boy howdy are they convenient.
I think "Allow external hosting" is the best case here. The first point of user expectation on PEP 470 is:
> Easily allow external hosting to "just work" when appropriately configured at the system, user or virtual environment level.
That's already out the window once GPGPU is involved. The ecosystem is much better than it used to be, but there is still no plug-and-play GPGPU on Linux without hiccups.
Let PyPI be the namespace, registry, and lookup, and let users pass --allow-external again, or (less optimal imho) do something like Conda channels. Improve the user experience for integrating with external hosts.
Pip is unlike npm, Cargo, go get, etc. because Python is much more intimate with linked libraries. It's unlike Homebrew because it's way more cross-platform. It's much closer to apt-get in this regard, and its challenges are different. Apt-get has sources lists, and I think something similar makes sense for Python as well.
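Pip already understands extra indexes both on the command line and in its config file, which is about as close as Python gets to an apt sources list today. A rough sketch (the internal index URL and package name here are made up):

    # one-off, on the command line
    pip install --extra-index-url https://pypi.internal.example.com/simple mypackage

    # or persistently, e.g. in ~/.config/pip/pip.conf on Linux
    # [global]
    # extra-index-url = https://pypi.internal.example.com/simple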
People use this to tell Spack where the system CUDA libs or system MPI are.
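Roughly like this, as a sketch (versions and prefixes are placeholders, and auto-detection only works for packages Spack knows how to find):

    # let Spack detect externally installed packages where supported
    spack external find cuda openmpi

    # or declare them by hand in ~/.spack/packages.yaml
    # packages:
    #   cuda:
    #     externals:
    #     - spec: cuda@11.4
    #       prefix: /usr/local/cuda
    #     buildable: false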
Wow, gen Z lingo has come into HN faster than I expected. Can someone translate this to traditional English? The definition I find for 'yeet' ("an exclamation of excitement") isn't particularly enlightening...
Update: I see, thanks all!
My definition of "Yeet" is to fling/throw/hurl something, typically with great vigor. Urban Dictionary defines it as "To discard an item at a high velocity." Derives from the Latin iacio, from Proto-Italic jakjō, from Proto-Indo-European (H)yeh₁- ("to throw, let go"). JK I just made that up, but it's actually plausible.
Never heard the "exclamation of excitement" definition.
"The first empty soda can was yote many moons ago"
According to Wiktionary:
"1. simple past tense and past participle of yeet"
Worth noting there is a fair amount of discussion on whether the past tense is "yeeted" or "yote", or even "yaught": https://en.wiktionary.org/wiki/Talk:yeet
See also: https://linguistics.stackexchange.com/questions/28300/what-i...
(I do think it would be pretty funny if the past tense ended up as "yote", though.)
I think you're way overestimating the people who come up with new words :P
We have no in-house Bazel expertise; when we run the build, it seems to miss its cache entirely and then spends 12-13 hours compiling in total, much of which appears to be recompiling a specific version of LLVM.
Every dependency is either vendored or pinned, including some critical things with no ABI guarantees like Eigen, which is literally pinned to some random commit, so that causes chaos when other binaries try to link against the underlying TensorFlow shared objects:
And when you go down a layer into CUDA, there are even more support matrices listing exact known sets of versions of things that work together:
Anyway, I'm mostly just venting here. But the whole thing is an absurd nightmare. I have no idea how a normal distro would even begin to approach the task of unvendoring this stuff and shipping a set of reasonable packages for it all.
The Tensorflow maintainers themselves even kind of admit the futility of it all when they propose that the easiest thing to do is just install your app on top of their pre-cooked Tensorflow GPU docker container:
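In practice that amounts to a Dockerfile along these lines (a sketch; the tag and file names are illustrative, not what they actually recommend):

    FROM tensorflow/tensorflow:2.4.1-gpu
    COPY requirements.txt /app/requirements.txt
    RUN pip install -r /app/requirements.txt
    COPY . /app
    CMD ["python", "/app/train.py"]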
(The irony of paying Google to compile their software is not lost on me.)
starts sweating from GDAL PTSD
I hadn't heard about dh-virtualenv. That looks super convenient, I'll definitely give that a shot.
Honestly I've been kinda leaning into Conda-in-Docker images (so you can isolate while you isolate, dawg), which actually isn't so bad with a custom SHELL directive so you can RUN in a conda env. So many of my customers use it and provide algos developed in conda that it's often just easier to grab their environment.yml and stick it all in a container for deployment.
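Roughly like this (a sketch; the base image is just one common choice, and the env name "algo" is assumed to match whatever their environment.yml declares):

    FROM continuumio/miniconda3
    COPY environment.yml .
    RUN conda env create -f environment.yml
    # make every subsequent RUN execute inside that env
    SHELL ["conda", "run", "-n", "algo", "/bin/bash", "-c"]
    RUN python -c "import sys; print(sys.prefix)"   # sanity check: should print the env prefix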
The pre-cooked TF images are a godsend, but the instant you need two different versions of TF in the same pipeline, you're hosed. I've been playing with Argo Workflows to get around this.
ML software packaging is the worst.
Or if you just need to integrate with anything else whose deployment model is "lol here's a container."
Spack can build with two levels of parallelism (within a build and among builds) on a single node or multiple nodes with a shared filesystem:
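Something like this, as a sketch (specs and job counts are placeholders):

    # parallelism within a single package build (passed through to make -j etc.)
    spack install -j 16 py-tensorflow

    # parallelism among builds: concurrent spack processes coordinate via
    # file locks, so this also works from several nodes sharing the filesystem
    spack install py-tensorflow & spack install py-torch & wait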
You can also generate GitLab pipelines of your build, which will break the jobs out as well:
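For example (a sketch; it assumes a Spack environment with a gitlab-ci/mirror section already configured):

    spack ci generate --output-file .gitlab-ci.yml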
(And I still don't get the advantages of an advanced build system like Bazel, where I first have to build the right version of the build system itself (including all the patches that crop up with new libcs in vendored dependencies), then it takes ages to compile compared to any other similar software... and I still have to fix the libc bugs.)
It is extremely powerful but also extremely tedious, like programming in MPI.
I just want a turnkey way to add multiple hosts/resolvers and use content-addressing. Like bittorrent but with passlists. Is that a thing?
Like, IPFS, but I want specific use-case domains. Like I'd be more than willing to stand up some sort of pip proxy for my company's domain but I'd only want it hosting packages used internally.
You could also have an IPFS gateway with just normal HTTP(S) access control techniques and a normal pip, deferring to the swarm for packages that you haven't pinned. It's unhealthy for the swarm to stop your node from sharing its data with other nodes, as that'd lose the torrent-style swarm effects that offload the initial uploader (PyPI).
In the past I've run this in a Docker container, set up some pip environment variables, and it's off to the races. It transparently caches stuff from PyPI and keeps things local.
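The pip side of that is just a couple of environment variables; a sketch, using devpi as one example of such a proxy (the port and index path are its defaults):

    # assumes the caching proxy is already listening on localhost:3141
    export PIP_INDEX_URL=http://localhost:3141/root/pypi/+simple/
    export PIP_TRUSTED_HOST=localhost
    pip install numpy   # fetched from PyPI once, then served from the local cache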
Requiring IPFS to fetch packages/wheels above a size threshold from PyPI seems reasonable.
Just have PyPI offer collaborative clusters for a few common situations, and maybe work with the IPFS devs to get collaborative clusters to support only pinning a subset by path (like, only pinning the packages you depend on for local CI).
Edit: Not sure about the downvotes. Joking tone aside, that actually seems like an ideal application for it.
When reading the first part of your post, I read it as implying that the person you replied to is trying to push a secret agenda or something with IPFS. That's on me of course, and you seem to have no such intentions, but that's how I read it. Also, sarcasm can be hard to understand for some people, and even harder in text form.
I think including your "Joking tone aside, that actually seems like an ideal application for it." in your initial comment would have helped. It indicates that what came before was in part a joke, and informs people of your real position.
This use case is something I believe they could charge for if they need to cover infra costs. Same as Docker Hub started doing - if someone fails to cache properly in their CI and wants to re-download things from the internet, they should pay for that.
"[Discussions on Python.org] [Packaging] Draft PEP: PyPI cost solutions: CI, mirrors, containers, and caching to scale"
> Continuous Integration automated build and testing services can help reduce the costs of hosting PyPI by running local mirrors and advising clients in regards to how to efficiently re-build software hundreds or thousands of times a month without re-downloading everything from PyPI every time.
> Request from and advisory for CI Services and CI Implementors:
> Dear CI Service,
> - Please consider running local package mirrors and enabling use of local package mirrors by default for clients’ CI builds.
> - Please advise clients regarding more efficient containerized software build and test strategies.
> Running local package mirrors will save PyPI (the Python Package Index, a service maintained by PyPA, a group within the non-profit Python Software Foundation) generously donated resources. (At present (March 2020), PyPI costs ~ $800,000 USD a month to operate; even with generously donated resources).
Looks like the current figure is significantly higher than $800K/mo for science.
How to persist ~/.cache/pip between builds with e.g. Docker in order to minimize unnecessary GPU package re-downloads:
Looks like it might be buildkit-specific?
"build time only -v option"
"Build images with BuildKit" https://docs.docker.com/develop/develop-images/build_enhance...
Then again, it might not be - given the number of CI setups out there that download the entire universe every 5 minutes...
The issue we now have is that scientific users almost always turn to Conda. Scientific authors have also increasingly turned to Conda, because it solves a lot of distribution problems for them - it's a lot easier to tell people 'conda install -c conda-forge XYZ' than it is to say 'install the CUDA toolkit >v10.1, then download dependency X, and then install it from PyPI.' Look, for example, at Software Carpentry, which aims to teach Python to researchers, or at any undergrad course in Python outside of perhaps Comp Sci: these days they almost always tell people to use Conda. In addition to that, it's so much easier from a build perspective to add packages to conda-forge than it is to build the various types of wheels you need for PyPI. The manylinux instructions aren't even particularly up to date.
To me, a lot of this stems from Guido basically saying to the scientific community 'it sounds like your problems are different from those of a lot of the Python ecosystem', without seeing that the whole compiled-dependency thing affects everyone at some point. I'm not sure what the fix is - do we want people to be able to use PyPI to bundle compiled external libraries? It sounds like you'd just end up replicating Conda at the end of the day.
For me personally, this fragmentation causes a lot of issues because I help users install packages on our HPC cluster. I need to be able to install things from source, and PyPI usually offers that, but increasingly users bring packages with a chain of dependencies that have no source package on PyPI and a long list of Conda dependencies. I don't blame package authors at all - they are doing what is easiest for them and most of their users - but I do think it needs a lot of thought and work by the PyPA.
/Users/dalke/local/bin/python3.9 -m venv ~/venvs/py39-2021-4
pip install numpy scipy matplotlib pandas
tar xf Release_2021_03_1.tar.gz
cmake .. -DRDK_INSTALL_INTREE=OFF \
alias deactivate 'test $?_OLD_VIRTUAL_PATH != 0 && setenv PATH "$_OLD_VIRTUAL_PATH" && unset _OLD_VIRTUAL_PATH && setenv DYLD_LIBRARY_PATH "$_OLD_DYLD_LIBRARY_PATH" && unset _OLD_DYLD_LIBRARY_PATH; rehash; test $?_OLD_VIRTUAL_PROMPT != 0 && set prompt="$_OLD_VIRTUAL_PROMPT" && unset _OLD_VIRTUAL_PROMPT; unsetenv VIRTUAL_ENV; test "\!:*" != "nondestructive" && unalias deactivate'
if (`printenv DYLD_LIBRARY_PATH` == '') then
    setenv DYLD_LIBRARY_PATH "$VIRTUAL_ENV/lib"
else
    setenv DYLD_LIBRARY_PATH "$VIRTUAL_ENV/lib:$DYLD_LIBRARY_PATH"
endif
Oh, but am I done? No! See the "/Users/dalke/local/lib/" in the CMAKE_INSTALL_RPATH?
That's because I installed my own Python, and Boost:
curl -O 'https://www.python.org/ftp/python/3.9.1/Python-3.9.1.tar.xz'
tar xf Python-3.9.1.tar.xz
./configure --prefix ~/local --with-openssl=/usr/local/opt/openssl
tar xf boost_1_72_0.tar.gz
./bootstrap.sh --prefix=/Users/dalke/local --with-python=/Users/dalke/local/bin/python3.9 --with-toolset=clang --with-python-version=3.9
./b2 -j 4
Oh, and I have multiple Open Babel installs from scratch, for the same reasons.
Pity me. :)
GitHub Issue: https://github.com/rdkit/rdkit/issues/1812#issuecomment-8088...
Like, what's inherent about GPU functionality that makes these packages so large compared to other things? (Typical Python packages are in the ~100 kilobyte range, so I have no idea how you'd ever get up to several gigabytes.)
On top of that, people want to install their package from PyPI, for example, and go - but they need the entire set of CUDA runtime libraries, which includes cuFFT, cuBLAS, cuDNN, etc. So if every package bundles 7 or 8 arch-specific versions of the compiled code plus the full CUDA RT distribution, it's easy to see how sizes get big.
The interesting thing is that exactly the same problem occurs on CPUs anyway. We have various optimisation levels in compilers for each processor family, but aside from MKL I don't know of any wheels that try to provide optimised versions of the code across CPU generations. This means anyone who wants good performance should always try to build from source rather than install the wheel.
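For what it's worth, pip can already be told to skip wheels entirely - a sketch; building something like SciPy this way assumes a compiler toolchain and BLAS are available locally:

    # build from source instead of installing the prebuilt wheel
    pip install --no-binary :all: scipy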
Also, NVIDIA could consider breaking up their packages into smaller pieces. But then again, they're still doing better than Intel, which ships 15GB+ images for their oneAPI.
Another example of an ecosystem maintaining mappings out to system packages is ROS and rosdep:
Now it's interesting because ROS is primarily concerned with supplying a sane build-from-source story, so much of what's in the rosdep "database" is the xxxx-dev packages; but in the case of wheels, it would be more about binary dependencies, and those are auto-discoverable with ldd, dpkg-shlibdeps, and the like. In Debian (and I assume other distros), the binary .so packages are literally named after the library soname + ABI version, so if you have ldd output, you have the list of exactly what to install.
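Concretely, the mapping looks something like this (a sketch; the extension module path is made up):

    # list the shared libraries a compiled extension links against
    ldd _mymodule.cpython-39-x86_64-linux-gnu.so
    #   libssl.so.1.1 => /usr/lib/x86_64-linux-gnu/libssl.so.1.1 (0x...)

    # ask dpkg which package owns that file
    dpkg -S /usr/lib/x86_64-linux-gnu/libssl.so.1.1
    #   libssl1.1:amd64: /usr/lib/x86_64-linux-gnu/libssl.so.1.1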
Maybe one interesting "social" piece to including this kind of functionality would be what the behaviour of pip would become. Like, would it a) go back to needing to be invoked as root and call through to the system package manager, b) emit the command needed for the user to install the required system packages, or c) download the system packages itself, extract the libraries, and insert them into its non-root-owned virtualenv or whatever other environment?
To provide a better experience for both package authors and users, as well as reducing the maintenance burden, the community has developed and migrated to a unified system called BinaryBuilder (https://binarybuilder.org) over the past 2-3 years. BinaryBuilder allows targeting all supported platforms with a single build script and also "audits" build products for common compatibility and linkage snafus (similar to some of the conda-build tooling and auditwheel). One downside of "cross-compilation everywhere" is that it's not always well-supported by upstream libraries, so the initial effort to build a library may be higher if patches need to be developed, but that effort is arguably higher-leverage than a per-distro packaging approach (eg https://twitter.com/Blosc2/status/1395425736597585920).
All that to say: "make binaries a distro packager's problem" sounds like a simplifying step, but there are some big caveats. It has been tried before, both in other languages and in Python: the fact that conda and manylinux don't use system packages was not borne out of inexperience. One additional issue is that distro packages are only available with sysadmin consent in shared unix-like environments, which can be very limiting for end-users.
I totally believe their CDN partner values their donated services at $1.5M/mo, but curious what that looks like in per-GB pricing. :)
It's not a bad way to do it actually.
That makes no sense to me.
Does anyone know why they are so big? What in the package takes up so much space? Do they embed all the GPU drivers? Or full model assets?
If you only want to support one GPU model/generation, one CPU arch, and one OS, it isn't hard; but god forbid you want anything more than that, and you end up with L×M×N variants and a bunch of compatibility/variant-selector code.
In GitLab we have mirror functionality for Docker container images; I look forward to us extending it to other package types as well. This should reduce load on the central registry, speed up downloads, and give a better overview of which packages are in use across the organization.
As another reply mentioned, the main space consumers on PyPI are compiled extension libs. Sometimes these result in giant 300MB binary files, but more often what happens is you have many versions of a ~20MB file due to the combinations of OS, Python version, and API version.
But yes, this is it. If you have a compiled library based on CUDA that results in a pretty large binary, and you need one for each combination of CUDA API version × Python version × OS, that adds up quickly.
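As a purely illustrative back-of-the-envelope calculation: a 300MB wheel built for 3 CUDA versions × 5 Python versions × 3 operating systems is 45 artifacts, i.e. roughly 13.5GB of storage and bandwidth for a single release.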
> The issue with CUDA 11 in particular is not just that CUDA 11 is huge, but that anyone who wants to depend on it needs to bundle it in, because there is no standalone cuda or cuda11 package. It's also highly unlikely that there will be one in the near to medium future because (leaving aside practicalities like ABI issues and wheel tags), there's a social/ownership issue. The only entities that would be in a position to package CUDA as a separate package are NVIDIA itself, or the PyPA/PSF. Both are quite unlikely to want to do this. For a Debian, Homebrew or conda-forge CUDA package it's clear who would own this - there's a team and a governance model for each. For PyPI it's much less clear. And it took conda-forge 2 years to get permission from NVIDIA to redistribute CUDA, so it's not like some individual can just step in and get this done.