
requirements.txt feels like a dirty hack, but it does work fine. It supports ==, ~=, and >= for version specifiers, as well as environment markers for flagging dependencies for different target OSes, etc. And then you can add a setup.py if you need custom steps. But yes, it feels dirty to maintain requirements.txt, requirements-dev.txt, etc.
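For example, a requirements.txt can mix all three specifier styles with PEP 508 environment markers (package names here are just illustrative):

```
requests==2.31.0    # exact pin
numpy~=1.26         # compatible release: >=1.26, <2.0
click>=8.0          # minimum version only
pywin32>=306; sys_platform == "win32"   # installed only on Windows
```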

Poetry is the most common solution that I've seen in the wild. You spec everything using pyproject.toml, run "poetry install", and it manages a venv of its own. But you still need to tell people to "pip install poetry" as step 0, which is annoying.
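A minimal Poetry pyproject.toml sketch (project name, versions, and author are placeholders):

```toml
[tool.poetry]
name = "myapp"
version = "0.1.0"
description = ""
authors = ["Someone <someone@example.com>"]

[tool.poetry.dependencies]
python = "^3.11"
requests = "^2.31"

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
```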

If you don't care about deploying Python files, but rather just the final product, I'd recommend either Nuitka or PyInstaller. These bundle your project into an executable that doesn't need a Python runtime (with --onefile-style options for single-file output). Neither supports cross-compilation, though.
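As a rough sketch, the single-file flows look like this (the entry-point name app.py is a placeholder, and both tools have to be installed first):

```shell
# PyInstaller: bundle app.py plus its imports into one self-contained binary
pyinstaller --onefile app.py        # output lands in dist/

# Nuitka: compile to C and link into a single executable
python -m nuitka --onefile app.py
```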




What flow do you use with requirements.txt that gives you reproducible builds across a team and environments? Using ==, ~=, and >= will not give you reproducible builds.


We do it like this:

- configure the project with pyproject.toml,

- use pip-compile (from the pip-tools package) to create a lockfile,

- commit the lockfile into git,

- whenever we want to update the dependencies, do it through pip-compile again (if you give it an existing lockfile as output, it will keep what's in there and change only what's required).

Since all our requirements are cross-platform and on PyPI, we can install the same env everywhere.
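The steps above, roughly as commands (assuming pip-tools is installed and the dependencies are declared in pyproject.toml):

```shell
pip install pip-tools                              # one-time setup
pip-compile -o requirements.lock pyproject.toml    # resolve and pin everything
git add requirements.lock                          # commit the lockfile
pip-sync requirements.lock                         # install exactly what's locked
# Later: re-run against the existing lockfile to change only what's required
pip-compile --upgrade-package requests -o requirements.lock pyproject.toml
```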


Hash-pinning with requirements.txt will get you the closest to this, but it's not possible in the general case to have a cross-environment reproducible build with Python. The closest you can hope for is a build that reproduces in the same environment.
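Hash-pinning is conceptually simple: pip records a SHA-256 digest per artifact and refuses to install anything that doesn't match. A minimal sketch of the check (not pip's actual code):

```python
import hashlib

def check_artifact(data: bytes, pinned: str) -> bool:
    """Compare an artifact's digest against a pin like 'sha256:<hex>'."""
    algo, _, expected = pinned.partition(":")
    actual = hashlib.new(algo, data).hexdigest()
    return actual == expected

# The well-known SHA-256 digest of an empty payload:
empty_pin = "sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
```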

This problem is shared by the majority of language packaging ecosystems; the only one I'm aware of that might avoid it is Java.


Rust and Go both have proper lock files... both of which are good enough to satisfy Nix's requirements for reproducibility, such that Nix re-uses the lock file hashes. This "it's hard, no one does it right" feels like a cope.


We get (very) close to cross-environment reproducible builds for Python with https://github.com/pantsbuild/pex (via Pants). For instance, we build Linux x86-64 artifacts that run on AWS Lambda, and can build them natively on ARM macOS. (As pointed out elsewhere, wheels are an important part of this.)

This is not raw requirements.txt, but isn’t too far off: Pants/PEX can consume one to produce a hash-pinned lock file.


How do you get a reproducible build in python for the same os/arch? As in, what concrete steps do you take?

This is very easy in nearly every other popular language. No one ever answers this clearly in threads like this, short of saying “use poetry”, which makes my point. I’ve asked many times.


I explicitly said that you can't. Python's packaging ecosystem wasn't designed with reproducibility in mind, and has never claimed to prioritize reproducibility. The best you can do is get close, and hash-pinning gets you pretty close.

I'm not aware of any other major language or language packaging ecosystem that makes reproducibility straightforward. Certainly not Ruby or NPM, and not even brand new ones like Rust's Crates. Java appears to be the closest[1], but is operating with significant advantages (distributing reproducible bytecode to all users, minimizing system dependencies, etc.).

Edit: In addition to hash-pinning, you can instruct `pip` to only install built distributions, i.e. wheels. If you do both hash-pinning and built-distributions-only, your package installation step _should_ reproduce exactly on machines with the same OS, architecture, and Python version. But again, none of this is guaranteed.
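Concretely, that combination is just two pip flags (sketch; it requires a requirements file where every entry carries --hash lines, or pip will refuse to install):

```shell
python -m pip install \
  --require-hashes \
  --only-binary :all: \
  -r requirements.txt
```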

[1]: https://reproducible-builds.org/docs/jvm/


You are splitting pedantic hairs in order to avoid talking about the obvious. Python's dependency management is much worse than Ruby's and nearly every other popular language's.

  In ruby you add dependencies to a Gemfile then ...
  $ bundle install
  $ git add Gemfile Gemfile.lock
and other members of your team can have the same build as you.

requirements.txt doesn't solve this basic need.

Edit: formatting.


I think you're confusing lockfiles with reproducibility. Lockfiles are good, but they don't guarantee reproducibility: a locked (or pinned, hashed, etc.) dependency might always be the exact same source artifact, but it can install in different ways (e.g. due to local toolchain differences, different versions of dependencies, sensitivity to timestamps, sensitivity to user-controlled environment variables, etc.).

Reproducibility is a much harder problem than dependency locking, and (again) I'm not aware of any language level packaging ecosystem that really supports it out of the box.

Python doesn't have reproducible builds, but it does have lockfiles (via hashed and pinned requirements). They're not particularly good (for all the reasons mentioned upthread), but they do indeed exist. If you use them as I've said, then your environment will be approximately as repeatable as with any other language packaging ecosystem (and arguably more so in some cases, since wheel installs are reproducible where gem installs aren't).
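For reference, a hashed-and-pinned requirements entry looks like this (the digests below are placeholders; real files carry full 64-hex-char digests, typically generated by `pip-compile --generate-hashes`):

```
requests==2.31.0 \
    --hash=sha256:<digest of the wheel> \
    --hash=sha256:<digest of the sdist>
```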


Binary wheels are just archives; I don't see differences in the way they install between different systems.

Source distributions (sdists) that contain C/C++ code are so annoying to install on Windows that we don't use them. But most packages provide binary wheels anyway.


> Binary wheels are just archives, I don't see differences in the way they install between different systems.

The subtlety here is in which binary wheel is selected: a particular (host, arch, libc) tuple may cause `pip` to select a more specific wheel for the same version of the package, or even an entirely different wheel. This makes wheels themselves reproducible between systems, but which wheel gets installed isn't guaranteed to be the same everywhere.
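A toy sketch of that selection logic (not pip's real resolver; the tag lists and filenames are made up): given the platform's supported tags in priority order, the first matching wheel wins, so two machines with different tag lists can get different wheels for the same pinned version.

```python
def pick_wheel(filenames, supported_tags):
    """Return the first wheel whose tag triple matches, scanning the
    platform's tags in priority order (most specific first)."""
    for tag in supported_tags:
        for fn in filenames:
            if fn.endswith(f"-{tag}.whl"):
                return fn
    return None

wheels = [
    "numpy-1.26.0-cp311-cp311-manylinux_2_17_x86_64.whl",
    "numpy-1.26.0-py3-none-any.whl",
]
# A CPython 3.11 Linux host prefers the platform-specific wheel...
linux_tags = ["cp311-cp311-manylinux_2_17_x86_64", "py3-none-any"]
# ...while a host with no matching binary wheel falls back to the universal one.
other_tags = ["cp311-cp311-macosx_11_0_arm64", "py3-none-any"]
```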


What is the most common python flow for dependency locking? Is it in any of the PEPs?


That would be PEP 508.

PEP 508 notably excludes hash-pinning, which is a significant limitation -- hash-pinning is defined only by how `pip` implements it.
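For reference, a PEP 508 dependency specifier covers the name, version constraints, extras, and environment markers, but has no field for a hash (illustrative line):

```
requests[socks] >=2.28, <3 ; python_version >= "3.8" and sys_platform != "win32"
```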


Thanks, is there any kind of tutorial on how you should use this in a python project?

Edit: I think this is one of the biggest problems someone coming to python has. Python advocates say some version of, "you can roughly do that" but there isn't a clear explanation of how to do it.

Edit 2: I see that the official docs have a Pipenv flow outlined. Is Pipenv the way people do this in python these days?


> Thanks, is there any kind of tutorial on how you should use this in a python project?

The PyPUG docs contain examples of using all of the PEPs mentioned above. TL;DR: all you need for 99% of use cases is a pyproject.toml.
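For instance, a minimal PEP 621-style pyproject.toml (the project name and versions are placeholders) declares everything most projects need:

```toml
[build-system]
requires = ["setuptools>=61"]
build-backend = "setuptools.build_meta"

[project]
name = "myapp"
version = "0.1.0"
requires-python = ">=3.9"
dependencies = [
    "requests>=2.28",
]
```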


Those docs say to use Pipenv, or am I looking at the wrong docs? Really not sure why python people can’t articulate a clear flow to follow. It’s all riddles.



