Packaging Python software with pbr (danjou.info)
53 points by pmoriarty 12 days ago | hide | past | web | 37 comments | favorite

I have never understood why people feel that pbr is better than just regular setup.py. All it seems to do is move the exact same metadata into setup.cfg instead. Maybe it's a bit easier syntactically, but pulling in a whole package just for that seems a bit much. Especially since you don't find yourself editing setup.py that often. You still need to learn about all the fields, what they do and what you should set them to.

I'd much more recommend reading through https://python-packaging.readthedocs.io/en/latest/index.html to actually understand how packaging works.

That said... packaging stuff in Python could really do with being a lot simpler. Pbr just doesn't seem to make it simpler, it just moves the problem to a different file.
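For reference, the plain-setuptools equivalent is just a handful of keyword arguments. A minimal sketch (all names and values illustrative):

```python
# A minimal plain setup.py: the same metadata pbr would move into
# setup.cfg lives here as keyword arguments to setup().
metadata = dict(
    name="example-pkg",                  # hypothetical package name
    version="0.1.0",
    description="An example package",
    author="Jane Doe",
    packages=["example_pkg"],
    install_requires=["requests>=2.0"],  # runtime dependencies
)

# In a real setup.py this is then passed along:
#     from setuptools import setup
#     setup(**metadata)
```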

Prefacing with "all it seems to do" makes you sound like you haven't actually used PBR, which could also explain why you don't understand why people prefer PBR to setup.py.

If you think packaging is a matter of configuration, then PBR makes a lot of sense. It gives you a config file that you simply fill out so you don't have to worry yourself about the code.
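Concretely, with pbr the setup.py shrinks to a stub and the metadata moves to setup.cfg. A sketch, with an illustrative setup.cfg held in a string so it can be read back with the stdlib:

```python
import configparser

# With pbr, setup.py reduces to roughly:
#     import setuptools
#     setuptools.setup(setup_requires=['pbr'], pbr=True)
# and the metadata lives in setup.cfg. An illustrative setup.cfg:
SETUP_CFG = """\
[metadata]
name = example-pkg
summary = An example package
author = Jane Doe

[files]
packages =
    example_pkg
"""

# Because it is plain INI, the file can be read without running any code:
cfg = configparser.ConfigParser()
cfg.read_string(SETUP_CFG)
print(cfg["metadata"]["name"])  # example-pkg
```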

I've used both PBR and setup.py quite a bit and I personally prefer PBR since there's less things I need to debug when things go wrong.

I have, in 2 different projects and rejected PRs to introduce it in another three. Because "all it seems to do" is move the problem to a different location.

You fail to give any examples as to why it would be an improvement too, aside from a very vague "less things I need to debug when things go wrong". Which I'd argue against since now I also need to potentially worry about things going wrong in PBR, which is one more thing I need to debug. This seems to be the general theme with pbr, there's no clear reasons for why it's better and when asked about it you get these kind of hand-wavy answers.

If you actually have some concrete examples as to what it improves and why I'm all ears.

That's the problem, there's more things you need to debug, since PBR does not get rid of any of the problems that you encounter with setuptools or pip, but adds its own.

Trying to paper over the (real) problems with the Python packaging toolchain by generating setup.py from a config file strikes me as a vanity project.

At first glance, pbr appears to solve the problem of having to duplicate my dependencies in both setup.py and requirements.txt.

I suppose it does do that, but so does this:

    import codecs

    with codecs.open('requirements.txt', encoding='utf-8') as f:
        requirements = f.read().splitlines()
I don't think it's worth the dependency. But perhaps for others it is.

That works for simple use cases but it is not robust.

In case anyone sees this, at a minimum you need to use

  from pip.req import parse_requirements
to be safe here.

What if the user doesn't have pip installed? This is before we've installed the requirements, so we shouldn't be assuming anything outside the stdlib.
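A stdlib-only alternative is to do the (simple) parsing yourself, skipping blanks, comments, and pip-specific option lines. A sketch that deliberately handles only the easy cases:

```python
def read_requirements(text):
    """Parse a simple requirements.txt: keep requirement lines, skip
    blank lines, comments, and pip options like -e/-r/--index-url."""
    reqs = []
    for raw in text.splitlines():
        line = raw.strip()
        if line and not line.startswith(("#", "-")):
            reqs.append(line)
    return reqs

sample = "requests>=2.0\n# a comment\n\n--index-url https://example.org/simple\nsix\n"
print(read_requirements(sample))  # ['requests>=2.0', 'six']
```

This avoids any dependency on pip's internals, at the cost of ignoring the fancier requirements.txt features.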

If I'm not mistaken, the requirements are parsed and installed before the desired package is installed, so that should be fine.

Isn't that paradoxical? That code is for parsing requirements and those requirements can't be parsed until initial pre-requirements (pip, in this case) are installed.


From my understanding, when you build the package, your computer extracts the relevant required packages and sends that data up to PyPI, which in turn sends it back to users when they install.

At least that's how I think it works.

I think you may be misunderstanding the problem being discussed. The code that uses pip will crash if pip is not already installed. There is no way to parse and satisfy the dependency on pip, because the author chose to import pip as a library.

pip being a dependency is defined at upload time; it is parsed by the creator of the package, not the consumer, AFAIK.

if you used any other package manager, it would need to resolve dependencies from the additional info in the package index, resulting in pip being downloaded and installed before the desired package is installed

This is actually less robust, since instead of relying on the setuptools API, you now also rely on the pip API.

If you control requirements.txt, there is nothing "not robust" about parsing it in setup.py.

You should read this: https://caremad.io/posts/2013/07/setup-vs-requirement/, it specifically references this "feature" of pbr.

Why are you using a requirements.txt if you have a setup.py that lists them? `pip install -e` or `setup.py install` should just work in that case.

You don't always want to install the actual package necessarily (e.g. dev) - using requirements separately lets you install the deps but not the package...

What if you have your own package index? You can use `--index-url` in requirements.txt to specify that.

You can use `--index-url` when calling pip: https://pip.pypa.io/en/stable/reference/pip_wheel/#index-url

You can also use `dependency_links` in your setup.py to specify this, which allows deps on github etc.

The approach I've been using

Define my dependencies

- setup.py

- requirements_dev.in

run piptools pip-compile to generate a locked set of dependencies

- requirements.txt

- requirements_dev.txt

It gives me some of the benefit of Rust's Cargo.toml / Cargo.lock in the python world (and actually respects all package's dependency version declarations unlike other tools like pyup).

I'd suggest pipenv for this kind of thing. It will generate a Pipfile.lock, which is the (still in development, but pretty stable) official way to create a locked set of deps.

Please don't do this. It's just yet another tool that does the same thing as setup.py in yet another way, with no real benefit, and it's not standard. Everyone who has done some Python packaging already knows at least the basics of setup.py; now they all have to learn yet another file syntax for the same problem.

Newer versions of setuptools have builtin support for more fields in setup.cfg: http://setuptools.readthedocs.io/en/latest/setuptools.html#c...

That along with setuptools-scm and pip-tools have for me solved most of the issues that pbr addresses.

> now they all should learn yet another file type syntax for the same problem

Are you really implying that learning INI would be a burden?

Yes. Obviously it doesn't seem like it at first glance, but when you have a product like ours, where there are hundreds of small things like this, it really adds up. I believe everyone should use and improve the same, standard tools so we can have a few, but high-quality, things. When you come up with something new which is just slightly different from the existing one (see the first versions of Python 3...), I think you are doing the wrong thing.

I'm not a fan of this.

First of all, requirements.txt is for development requirements. Runtime ones belong in setup.py.

Also, the "extras" feature is already in setup.py via extras_require.

I see no need to use this nonstandard tool when the standard tooling works.
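For reference, extras_require is just another setup() keyword. A minimal sketch (names illustrative):

```python
# extras_require declares optional dependency groups directly in setup.py;
# users opt in with e.g. `pip install example-pkg[dev]`.
metadata = dict(
    name="example-pkg",
    install_requires=["requests>=2.0"],  # runtime deps, always installed
    extras_require={
        "dev": ["pytest", "flake8"],     # development-only deps
        "docs": ["sphinx"],
    },
)

# Passed through as setup(**metadata) in a real setup.py.
```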

I do this:

  base.txt
  test.txt

Then in setup.py I just read these requirements files. I find it easier to manage all the dependencies from a single point. During development and testing, you'd assume base.txt is what is going to production. It takes some care to keep this file committed, nonetheless.

I could keep a freeze version if I really want to have a full view (for debugging purpose).

Not really. A package can be installed via base and used in tests, and vice versa, so you have no real confidence that removing a package won't break anything. The best you get is explicit duplication.

Perhaps I didn't explain well. test.txt would only consist of packages such as "pytest"; the name denotes a specific purpose. If you have a platform package that provides APIs for test classes as well as APIs for other things, like wrappers around AWS APIs, then my recommendation is to make them distinct packages, even if they reside in the same repository. Just have a separate setup.py for each. A package, in simple terms, is just a folder; a module is a single file.

> automatic generation of AUTHORS and ChangeLog files based on git history

I used to think this was a good idea. Then I found a huge loophole in it: if I copy-paste code from another FLOSS project, then that code's author should still be listed, but won't have any commits.

Also, peer-programming.

> version management based on git tags

setuptools-scm already does this for us.

Finally, a big upside of setup.py is that I can programmatically generate information, whereas pbr's cfg file doesn't seem to allow that.
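For example, one common trick a static config file can't express: computing the version at build time by reading it out of the package source. A sketch (file contents inlined so it is self-contained; the file name would be hypothetical anyway):

```python
import re

# In a real setup.py this string would come from
# open("mypkg/__init__.py").read(); inlined here for the sketch.
init_source = '__version__ = "1.4.2"\n'

# Extract the version without importing the package (which may have
# unmet dependencies at install time).
match = re.search(r'__version__\s*=\s*["\']([^"\']+)["\']', init_source)
version = match.group(1)

# version is then passed to setup(version=version, ...)
```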

Flavors are listed as a big deal, but, much like the article says, setuptools already has this; nothing new here.

>> automatic generation of AUTHORS and ChangeLog files based on git history

>I used to think this was a good idea. Then I found a huge loophole in it: if I copy-paste code from another FLOSS project, then that code's author should still be listed, but won't have any commits.

If you think that the other person deserves all the credit for the change, set the author with:

  git commit --author="Guido van Rossum <guido@python.org>"
If you deserve the responsibility/credit/blame for the commit, give credit to the other person in such a way pbr will pick it up, by adding to your commit message:

  Co-authored-by: Guido van Rossum <guido@python.org>

> Finally, a big upside of setup.py, is that I can programatically generate information, whereas pbr's cfg file doesn't seem to allow that.

The problem with package manifests that are arbitrary scripts (Python, Ruby and probably others) is that you lose even basic introspectability into rudimentary parameters like name, version, and dependencies, unless your threat model allows for executing arbitrary code that people put into setup.py. This can be mitigated by complex static analysis carefully crafted for the task, but that isn't easy to implement, and there is a non-trivial number of cases where the data cannot legitimately be determined statically. I'm just wondering how this is conceptually different from Pipfiles:

https://github.com/pypa/pipfile (Previous discussion https://news.ycombinator.com/item?id=13011932)

Also, executable package scripts with arbitrary code are the reason why 'pip install rickroll' does exactly what it sounds like it'll do.
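To illustrate the introspection problem: when the manifest is a script, a value as basic as the version may only exist after execution. A toy sketch (not real pbr or pip behavior, just the underlying issue):

```python
# A toy setup.py-style manifest: the version is computed at run time,
# so no static reader can recover it without executing the (arbitrary,
# potentially hostile) code.
SETUP_PY = '''
name = "example-pkg"
version = "1.0.%d" % __import__("random").randint(0, 9)
'''

ns = {}
exec(SETUP_PY, ns)   # the only fully general way to learn the metadata
print(ns["name"], ns["version"])
```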

Pipfile isn't a replacement for setup.py, it's meant to replace requirements.txt / how you specify the dependencies of your project. As you've noted, setup.py can do a lot more than that.

To me this doesn't really solve the most painful part of packaging and deploying Python: portability.

You still need to install system libraries for those C extensions. You still can't have multiple packages requiring different versions of the same library. What a mess.

For the first part, OpenStack fairly widely uses bindep [1] (pbr is pretty much an OpenStack-ecosystem thing too). It's not really perfect, but it is a fairly standard way for a CI environment deploying from git to ensure the binary dependencies are present. If a project doesn't have a bindep.txt file, CI falls back to [2], for example.

I guess your second point means something like symbol versioning for Python libraries, which I'm not aware of any solution to, apart from just running things in a virtualenv.

[1] https://docs.openstack.org/infra/bindep/readme.html [2] https://github.com/openstack-infra/project-config/blob/maste...

PBR? People still drink that stuff?
