Current State of Python Packaging (stefanoborini.com)
72 points by robertlagrant on June 7, 2019 | 76 comments



I'm not sure what Poetry tries to simplify on the development side... using venv and pip/requirements.txt is simple enough for me. And you don't even have to do it by hand: PyCharm will do it for you and activate the right venv in the shell tab. Distribution is something else, but I wouldn't expect users to type commands in a shell anyway, so there is only a need for one-click installers and/or self-contained executables, where Poetry does nothing, it seems (Nuitka/PyInstaller and other solutions will help here).


In deployment you'll still need to use the command line. Poetry is IDE and OS agnostic.

It also manages the lock file, and resolves deps better than pip.

Lastly, it lets you put all metadata in one file, including dev deps, then build a wheel.

You can do that with regular setuptools and setup.cfg, and it's compatible with pip, though.


I don't have a problem with the command line, I just find the argument of "tool X and tool Y are boring to use so you'd better use tool Z" (as found in the linked article) a bit weak. I guess that's one of those tools that solve problems I don't have (like the need for a lock file... just specify the version in requirements.txt), so unless it becomes the new standard, I'll stay away from it. But I'm sure it's gonna be helpful for many.


The lock file doesn't just specify the version of the packages, but of all dependencies as well, in a nested manner. This lets poetry detect incompatibilities you can't detect with a flat requirements file.

Poetry also updates the files properly at each install, so you don't have to do it. And it can't install something outside of a venv by mistake.

But yes, a requirement file is ok in many cases. I use them myself often.
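A toy sketch of the kind of incompatibility meant here, assuming two made-up packages (app_a, app_b) that both depend on a made-up "lib". This is not how Poetry's resolver works; it only shows why you need the nested requirements, and not just the top-level names, to spot the clash:

```python
# Hypothetical packages: app_a and app_b are what you install directly.
# Each one declares which major versions of "lib" it accepts,
# written as (minimum inclusive, maximum exclusive).
DECLARED = {
    "app_a": {"lib": (1, 2)},  # needs lib >=1, <2
    "app_b": {"lib": (2, 3)},  # needs lib >=2, <3
}


def find_conflicts(declared):
    """Return shared dependencies whose accepted ranges do not overlap."""
    merged = {}
    for requirements in declared.values():
        for dep, (low, high) in requirements.items():
            cur_low, cur_high = merged.get(dep, (low, high))
            # Intersect the range seen so far with the new one.
            merged[dep] = (max(cur_low, low), min(cur_high, high))
    # An empty intersection means no single version satisfies everyone.
    return sorted(dep for dep, (low, high) in merged.items() if low >= high)


print(find_conflicts(DECLARED))
```

A file that only lists app_a and app_b carries none of this information, which is the point being made above.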


> but of all dependancies as well, in a nested manner.

Thanks for the clarification!


I added a detailed comment on packaging here: https://news.ycombinator.com/item?id=20132303


> using venv and pip/requirements.txt is simple enough for me

Sure - it depends on what you're doing. Requirements.txt won't work if you're writing a library; you'll need to move (or duplicate) your dependencies into setup.py.


Yes, it's still very valid. It's simple. It's baked in.

However, the benefit of moving to setup.cfg (not setup.py anymore, which should be almost empty) is more than making a lib. It makes deployment easier, and dev as well. Hell, you can even pip install from a git repo with it.

I made a summary of what to use for what, when and how in this comment: https://news.ycombinator.com/item?id=20132303


Indeed.. I stopped writing libraries when I switched to Python, because for all my needs there are better ones out there than those I could write. Only write apps now... :)


I get mad at the Haskell tools I use every day, and then I remember that nobody else is really in better shape.


Rust's Cargo is incredibly well thought-out and will make you despise everything else.


From the little development I have done in Rust, I have to agree.

On the other hand, I’ve been working with React Native and TypeScript recently, and it’s a constant mess. Every invocation of yarn leaves at least a couple of warning messages about things that are out of my control. Installing packages is slow, and at least every few days I have to wipe out node_modules and nuke any iOS build folders in order to get it working again. That said, RN is promising technology and hopefully my frustrations are due more to inexperience than to the tools, language or ecosystem. I just wish all of the JavaScript stuff were faster. I’m on a gigabit connection and fast hardware, but still it seems sometimes yarn has to download or unpack 6500 files or something.


NPM claims to have taken the lead from Yarn in terms of install speed in the latest versions. Still, I prefer Yarn because NPM has such a history of suckage.

Meanwhile, the former CTO of NPM released a new (alpha) package management system last week at JSConfEU (the conference where NodeJS was first announced); see https://github.com/entropic-dev/entropic/blob/master/docs/RE...


I saw those claims in the past few days and have tried both, for me yarn still seems faster although I didn't actually time them.


We are making a Cargo-like build toolchain for C/C++, if anyone is interested: https://build2.org


Nope. Absolutely not. Cargo makes the simple things easy but is less thought out than Rust itself.


Web browsers have the best software distribution story. You type a natural language query about what you want, click and it installs and runs the latest version of everything you need, usually in less than a second.


From the click user point of view, yeah.

From the dev user or packager point of view, no.

First, there is no namespace in JS. Second, the import instruction standard is not implemented everywhere, so we need a bundler. Third, the network makes everything harder. And finally, JS has virtually no stdlib.

Add to that that we are stuck in npm land, with acute left-pad-itis, packages breaking the public API every Sunday, and a mad hatter packager like webpack as our lord and savior, and you have a recipe for terrible times.


Python packaging is not really hard. It's just that there is way too much obsolete or incomplete information on the web.

Here is what you need to know: https://news.ycombinator.com/item?id=20132303

Honestly, it's not much. It's enough to live a happy life as a Python dev.


This. I still use pip, venv and setuptools and I've packaged everything from third party DLLs given by a supplier to my Flask applications with HTML and everything.

I keep seeing pipenv and poetry and wonder how all these came about? Is it because people come from other languages and want to do this the way they did it in those?


May I ask: if you package 3rd party DLLs in your Flask app, does it run on *nix platforms?


Racket has by far the best I have ever used.

Here is everything you need to know about it in a 7 minute read.

https://docs.racket-lang.org/pkg/getting-started.html


Yes I'm familiar with Racket.


It good?


Not good, is gooder!


Racket is respectable but I prefer types all the way down.


I know I'm probably the outsider here, but Maven can do everything any other tool can, plus a ton of stuff that no other can, Gradle et al. included: namely, it enables you to do proper dependency & version management with parent projects (POMs). Massively useful if you have gigantic projects with hundreds+ of dependencies.


It's impressive how so many people struggle with this :)


I really hope poetry will win. It's just so good, easy to use and just works.


I like poetry but it comes with 2 problems:

- it uses only pyproject.toml. Despite the current buzz around this format, it's not stable, it's an incomplete standard and it's not well supported by the ecosystem. Setup.cfg is a much better alternative in the meantime. In fact, just the auto-include features make it better.

- there is no nice way to install it. Pip install poetry is the usual one, and it has many gotchas for beginners, who are the ones who would benefit from poetry the most.


Agree; there are multiple issues in Poetry's GitHub associated with installing inside Docker. If that's fixed, I'd be much happier with it.


Also with Travis. And it requires some manual fiddling with tox. Not to mention IDE support.


The point about the beginners is solid, but consider the use case for Poetry: you need it if you want to package and publish a library, which is not an area I would associate with beginners.


To be less blunt, I made a better comment on how to package things in the modern python world:

https://news.ycombinator.com/item?id=20132303

No poetry necessary, but I point to when I use it anyway.


No you don't. It's actually very easy to package and publish a library with setup.cfg.


Thought I would mention pip-tools as an option to manage requirements.txt.


I have a PHP Yii2 web project using composer (something like pip for Python), where the old project maintainer chose not to pin specific versions for dependencies.

I am stunned when I do composer update, because it basically updates every dependency.

When I asked one of the seniors, he said yea, I have to check every single part of the website and make sure it doesn't break.

I don't know anymore.


> When I asked one of the seniors, he said yea, I have to check every single part of the website and make sure it doesn't break.

That's what unit/integration/e2e tests are for.


composer.lock and composer install are very fast and very correct, so I'm suspecting that it's time you start using them or stop calling the others "seniors".


No mention of Python venv; it has been there since Python 3.3.


This is what I usually use since it's baked in.


I actually had a typo and wrote virtualenv instead of venv. But I meant venv (which is standard in py3)


The top question Python learners ask me is:

How do I package my Python program for small scale distribution to non-technical people?

The end target is usually Windows so I suggest PyInstaller but it doesn't feel exactly elegant although it is better than py2exe.

Maybe there is some alternative on the horizon?


If they can get away without packaging the interpreter, there is always an executable zip. There's a pretty handy tool for creating them:

https://github.com/linkedin/shiv
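For a feel of what an executable zip is, here is a minimal sketch using only the standard library's zipapp module. It is not shiv: it bundles no third-party dependencies, which is the part shiv adds on top of the same idea. The directory name, file names and the printed message are made up for the example:

```python
import subprocess
import sys
import tempfile
import zipapp
from pathlib import Path


def build_and_run():
    """Build a tiny .pyz archive in a temp dir, run it, return its output."""
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "app"
        source.mkdir()
        # The archive's entry point must be a file named __main__.py.
        (source / "__main__.py").write_text('print("hello from the archive")\n')

        target = Path(tmp) / "app.pyz"
        zipapp.create_archive(source, target)

        # The recipient only needs a Python interpreter to run the file.
        result = subprocess.run(
            [sys.executable, str(target)],
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()


print(build_and_run())
```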


PyInstaller is quite adequate and works well. The only problem is that all it gives you is an exe. If you want something a bit more complex it's not really much more than that.


Yet another state of python packaging that:

- skips the explanation of the py command on Windows, the version suffixes on Unix, -m, and why you need to install pip on Linux but not on the other OSes.

- doesn't address the various sys path issues of pip and poetry. Because at some point you need to install poetry.

- ignores the existence of the excellent and simple setup.cfg.

- ignores the consequences of using poetry on IDE setup, tox or CI.

Python packaging is not hard anymore. But the information you get out there is incomplete and assumes some kind of basic sysadmin experience.


> Python packaging is not hard anymore

Would you tell us which is "the right way to do it" nowadays? Possibly, in a maintainable, kind-of-officially supported way that doesn't change or disappear in a few months?

Please note: I have used Python professionally since 2005, I've been involved a lot in Python packaging for production apps (including giving some talks on the bad state of Python packaging at EuroPython around 2010) and I followed the then-failed distutils2 effort closely. And I still don't know what the "right and easy way to do it" is.


Not on my phone from the subway, but maybe when I'm back at home I'll take the time.


Here we go :)

Packaging, the easy way

Because I'm not on a blog, I can't go too much into details, and I'm sorry about that. It would be better to take more time on each point, but use them as a starting point. I'll assume you know what virtualenv and pip are. If you don't, check a tutorial on them first, it's important.

But I'm going to go beyond packaging, because it will make your life much easier. If you want to skip context, just go to the short setup.cfg section.

1 - Calling Python

Lots of tutorials tell you to use the "python" command. But in reality, often several versions of Python are installed, or worse, the "python" command is not available.

WINDOWS:

If the python command is not available, uninstall Python, and install it back again (using the official installer), but this time making sure that the "Add Python to PATH" box is ticked. Or add the directory containing "python.exe", and its sibling "Scripts" directory to the OS system PATH manually (check a tutorial on that). Restart the console.

Also, unrelated, but use a better console. cmd.exe sucks. cmder (https://cmder.net/) is a nice alternative.

Then, don't use the Python command on Windows. Use the "py -x.y" command. It will let you choose which version of Python you call. So "py -2.7" calls python 2.7 (if installed) and "py -3.6" calls Python 3.6. Every time you see a tutorial on Python telling you to do "python this", replace it mentally with "py -x.y".

UNIXES (mac, linux, etc):

Python is suffixed. Don't just call "python". Call pythonX.Y. E.G: python2.7 to run python 2.7 and python3.6 to run Python 3.6. Every time you see a tutorial on Python tell you to do "python this", replace it mentally with "pythonX.Y". Not PythonX. Not "python2" or "python3". Insist on being precise: python2.7 or python3.5.

LINUX:

pip and virtualenv are often NOT installed with Python, because of packaging policies. Install them with your package manager for each version of Python. E.G: "yum install python3.6-pip" or "apt install python3.6-venv".

FINALLY, FOR ANY OS:

Use "-m". Don't call "pip", but "python -m pip". Don't call "venv", but "python -m venv". Don't call poetry but "python -m poetry". Which, if you follow the previous advice, will lead to things like "python3.6 -m pip" or "py -3.6 -m pip". Replace it mentally in tutorials, including this one.

This will solve all PATH problems (no .bashrc or windows PATH fiddling :)) and will force you to tell which python version you use it with. It's a good thing.
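The same idea from inside a Python script, as a small sketch: sys.executable is the interpreter currently running, so pairing it with "-m pip" targets that exact Python. The helper name pip_command is made up for illustration, and the example only prints the command instead of running it:

```python
import sys


def pip_command(*args):
    """Build a pip command line bound to the interpreter running this script."""
    # sys.executable is the path of the current interpreter, so "-m pip"
    # runs the pip that belongs to this Python, whatever PATH says.
    return [sys.executable, "-m", "pip", *args]


# Pass the list to subprocess.run() to actually execute it.
print(pip_command("install", "--user", "poetry"))
```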

In any case, __use a virtualenv as soon as you can__. Use virtualenv for everything. One per project. One for testing. One for fun. They are cheap. Abuse them.

In the virtualenv you can discard all the above advice: you can call "python" without any "py -x.y" or suffixes, and you can call "pip" or "poetry" without "-m". Because the PATH is set correctly, and the default version of Python is the one you want.
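If you'd rather script it, the standard library's venv module does what "python -m venv" does. A minimal sketch, assuming a throwaway directory name (demo_env) and skipping pip installation to keep it fast; the function names are made up:

```python
import sys
import tempfile
import venv
from pathlib import Path


def create_env(env_dir):
    """Create a virtualenv at env_dir, roughly like "python -m venv env_dir"."""
    # with_pip=False keeps the sketch quick; the command line tool
    # installs pip into the new environment by default.
    venv.EnvBuilder(with_pip=False).create(env_dir)
    return Path(env_dir)


def in_virtualenv():
    """True when the running interpreter belongs to a venv-style environment."""
    return sys.prefix != sys.base_prefix


with tempfile.TemporaryDirectory() as tmp:
    env = create_env(Path(tmp) / "demo_env")
    # Every environment created this way has a pyvenv.cfg file at its root.
    print((env / "pyvenv.cfg").exists())

print(in_virtualenv())
```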

But there are some tools you will first install outside of venv, such as pew, poetry, etc. For those, use "-m" AND "--user". E.G:

    python -m pip install poetry --user
    python -m poetry init

This solves PATH problems, python version problems, doesn't require admin rights and avoids messing with system packages. Do NOT use "sudo pip" or "sudo easy_install".

2 - Using requirements.txt

You know the "pip install stuff", "pip freeze > requirements.txt", "pip install -r requirements.txt" ?

It's fine. Don't be ashamed of it. It works, it's easy.

I still use it when I want to make a quick prototype, or just a script.
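To demystify what ends up in requirements.txt, here is a rough sketch of a freeze-like listing using importlib.metadata (Python 3.8+). It is only an approximation: the real "pip freeze" also handles editable installs, URLs and other cases this ignores. The function name freeze is just for the example:

```python
from importlib import metadata


def freeze():
    """Return sorted "name==version" lines for the installed distributions."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        # Skip distributions whose metadata is missing or broken.
        if name and version:
            lines.add(f"{name}=={version}")
    return sorted(lines, key=str.lower)


print("\n".join(freeze()))
```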

As a bonus, you can bundle a script and all its dependencies with a tool named "pex" (https://github.com/pantsbuild/pex):

    pex . -r requirements.txt -o resulting_bundle.pex --python pythonX.Y -c your_script.py -f dist --disable-cache
 
It's awesome, and allows you to use as many 3rd party dependencies as you want in a quick script. Pex it, send it, "python resulting_bundle.pex" and it runs :)

3 - Using Setup.cfg

At some point you may want to package your script, and distribute it to the world. Or maybe just make it pip installable from your git repo.

Let's say you have this layout for your project:

    root_project_dir/
    ├── your_package
    ├── README.md

Turn it into:

    root_project_dir/
    ├── your_package
    ├── README.md
    ├── setup.cfg
    ├── setup.py

And you are done. Setup.py needs only one line; it's basically just a way to call setuptools to do the job (it replaces the poetry or pipenv command in a way):

    from setuptools import setup; setup()

Setup.cfg will contain the metadata of your package (like a package.json or a pyproject.toml file):

    [metadata]
    name = your_package
    version = attr: your_package.__version__
    description = What does it do ?
    long_description = file: README.md
    long_description_content_type = text/markdown
    author = You
    author_email = foo@bar.com
    url = https://stuff.com
    # classifiers are not mandatory, but the full list is here: https://pypi.org/pypi?%3Aaction=list_classifiers
    classifiers =
        Intended Audience :: Developers
        License :: OSI Approved :: MIT License
        Programming Language :: Python :: 3.5
        Programming Language :: Python :: 3.6
        Programming Language :: Python :: 3.7
        Topic :: Software Development :: Libraries

    [options]
    packages = your_package
    # or whatever
    install_requires =
        requests>=0.13

    # non python files you want to include
    [options.package_data]
    * = *.txt, *.rst, *.jpg
    hello = *.msg

    [options.extras_require]
    # stuff you use for dev
    dev = pytest; jupyter

You can find all the fields available in the setup.cfg here: https://setuptools.readthedocs.io/en/latest/setuptools.html#...

Setup.cfg has been supported for 2 years now. It's supported by pip, tox, all the legacy infrastructures.
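Since setup.cfg is an INI-style file, you can poke at it with the standard library's configparser to see how the values are read. This is only a sketch on a trimmed copy of the example above, not what setuptools does internally; the splitting rules below are my simplification:

```python
import configparser

# A trimmed copy of the example file, inlined so the sketch is self-contained.
SETUP_CFG = """\
[metadata]
name = your_package

[options]
packages = your_package
install_requires =
    requests>=0.13

[options.extras_require]
dev = pytest; jupyter
"""

parser = configparser.ConfigParser()
parser.read_string(SETUP_CFG)

name = parser["metadata"]["name"]

# A multi-line value comes back as one string: split it into one entry per line.
raw_requires = parser["options"]["install_requires"]
requires = [line.strip() for line in raw_requires.splitlines() if line.strip()]

# The "dev" extra uses semicolon-separated values on a single line.
raw_dev = parser["options.extras_require"]["dev"]
dev_extra = [item.strip() for item in raw_dev.split(";")]

print(name, requires, dev_extra)
```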

Now, during dev you can do: "pip install -e root_project_dir". This will install your package, but the "-e" option will make it work in "dev mode", which allows you to import it and see modifications you made to the code without reinstalling it every time. "setup.py develop" works too.

If you publish it on GitHub, you can now install your package by doing:

    pip install git+https://github.com/path/to/git/repo.git

You can also create a wheel out of it by doing:

    python setup.py bdist_wheel

The wheel will be in the "dist" dir.

Anybody can then "pip install your_package.whl" to install it. Mail it, upload it on an ftp, slack it...
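Note that the real file in "dist" has a longer name than "your_package.whl": wheel filenames encode the version and compatibility tags (the layout comes from PEP 427). A small sketch that splits such a name apart; the version 0.1.0 and the function name are assumptions for the example:

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into the fields laid out in PEP 427."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        # The optional build tag sits between the version and the python tag.
        name, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected wheel filename: {filename!r}")
    return {
        "name": name,
        "version": version,
        "build": build,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


print(parse_wheel_filename("your_package-0.1.0-py3-none-any.whl"))
```

A pure Python package typically gets the "none" and "any" tags, meaning the same file installs on any OS.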

If you want to upload it on PyPI, create an account on the site, then "pip install twine" so you can do:

    twine upload dist/*

Read the twine doc though, it's worth it: https://pypi.org/project/twine/

You could use "python setup.py bdist_wheel upload" instead of twine. It will work, but it's deprecated.

4 - Using pew

Pew (https://github.com/berdario/pew#usage) is an alternative to venv, poetry, virtualenvwrapper and pipenv.

It does very little.

    pew new env_name --python python3.X

Creates the virtualenv.

    pew workon env_name

Activates it. And optionally moves you to a directory of your choice.

That's all. It's just a way to make managing virtualenv easier. Use pip as usual.

You can find out where your virtualenv has been created by looking up the $VIRTUAL_ENV var.

This is especially useful for configuring your IDE, although I tend to just type "which python" on unix, and "where python" on Windows.
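You can get the same information from Python itself, which works the same way on every OS. A small sketch (the function name and dict keys are made up); VIRTUAL_ENV is only set when an environment has been activated in the current shell, so it may be None:

```python
import os
import shutil
import sys


def describe_interpreter():
    """Collect the paths you usually need when pointing an IDE at a Python."""
    return {
        # The interpreter running this script.
        "executable": sys.executable,
        # Root of the environment that interpreter lives in.
        "prefix": sys.prefix,
        # Set by the activation scripts; None outside an activated env.
        "virtual_env": os.environ.get("VIRTUAL_ENV"),
        # What "which python" / "where python" would find first, if anything.
        "python_on_path": shutil.which("python"),
    }


for key, value in describe_interpreter().items():
    print(f"{key}: {value}")
```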

5 - Using poetry

Now, if you need more reliability, poetry enters the game. Poetry will manage the virtualenv for you, will install packages in it automatically, will check all dependencies in a fast and reliable way (better than pip), will create a lock file AND will update your package metadata file.

I'm not going to go into details on how this works; it's a great tool, with a good doc: https://github.com/sdispater/poetry

You don't need to start with poetry. You can always migrate to it later. Most of my projects do the "requirements.txt" => "setup.cfg" migration at some point. Some of them move to poetry if I need the extra professionalism it provides.

The problem with poetry is that it's only compatible with poetry. It uses the pyproject.toml format, which is supposedly standard now, but is unfinished. Because of this, any tool using it, including poetry, actually stores most data in custom proprietary fields in the file :( Also, it's not compatible with setuptools, which many infrastructures and tutorials assume. So you'll have to adapt to it.

That being said, it's a serious and robust tool.


Oh, almost forgot.

6 - If you want to compile Python to an exe, use nuitka (http://nuitka.net/), not py2exe, cx_freeze and co.


P.S. no src/ folder? I feel like that's quite a nice idea, although I come from a Java background, so perhaps it just feels familiar.


I've used the src folder a lot in the past, but realized I only had one dir in it in all my projects. Since it's confusing to beginners, I removed it and never missed it.

Nevertheless, if you want to have one, just add this to setup.cfg:

    [options]
    package_dir =
        =src
    packages = find:

    [options.packages.find]
    where = src


You need to put this all somewhere better than a HN comment. I will then immediately share it on HN (-:


I only have a French blog, and this information is already on it.

I will have to open an English one, I guess.


Hi, I am the author of the post. Thank you for the ideas. The post is meant to be "dynamic" and I will integrate your ideas in it. I just woke up so in an hour or so I will update it.


Kudos


I’ve been coding in Python for about 8 years now and this is the first time I’ve even heard of Poetry.


For 7 of those years Poetry didn't exist. For the 8th year however Poetry and Pipenv have been pretty big news in the Python packaging world.


Is pipenv not recommended any longer?


The author wrote something about it being the official python package manager but that was just his own marketing.


It was NEVER recommended by any Python organization or official person, it was just Kenneth's usual bullshit marketing.


In the link below it says, “Pipenv is recommended for collaborative projects”

https://packaging.python.org/tutorials/managing-dependencies...


Yes, they toned it down, but still to a wording that suggests a formal recommendation. If you find the discussion, it was not meant to be an exclusive recommendation. Bit of a mistake, in my opinion.


So my question is: who is in charge at PyPA, and how much is Kenneth involved in it?


I wouldn't trust PyPA, Kenneth or not. They have been making very dubious decisions in the past just for the sake of it.

The way they forced a half-baked pyproject.toml format to be a standard and refused to hear any complaint, all while a working open format had already existed for 2 years, was not very professional.


My own anecdotal experience was that I couldn't get poetry to install on a fresh instance of Ubuntu. There were some open issues for it on the Poetry issue tracker but the developer was implying it was an Ubuntu issue rather than a problem with Poetry.


I never heard of poetry before and normally use pipenv.

The author states that poetry is more popular than pipenv, is that true?

I did a quick comparison:

https://github.com/sdispater/poetry watchers: 77 stars: 4,690 forks: 318

https://github.com/pypa/pipenv watchers: 348 stars: 17,207 forks: 1,268

I don't know if the author's statement 'Most people seem to prefer Poetry.' is true.


I moved over to Poetry recently. Pipenv traded on Reitz's cachet as the Requests author and got a lot of adopters, perhaps a little before the software was ready for primetime. I only tangentially keep my finger on the pulse of the Python community but it certainly appears that they're souring on him [0], [1].

I wouldn't say more popular, but it's certainly gaining traction in a relatively short time and I prefer it. I haven't used it in anger but it certainly seems faster than Pipenv.

[0] - https://vorpus.org/blog/why-im-not-collaborating-with-kennet...

[1] - https://www.reddit.com/r/Python/comments/8kjv8x/a_letter_to_...


I never met a single developer who loves pipenv, but everyone had a fatal problem with it at some point.


Anecdote, but everyone I personally know in the Python-sphere hates pipenv (and the creator) for various reasons. On another forum community I’m a member of, the recommended tool for Python packaging is poetry.


You are right about the hate for pipenv, but I don't know that the creator hates it too.

edit: sorry I misunderstood, the creator of pipenv doesn't hate pipenv


He meant that the creator is hated, not that creator hates the program.


I'm pretty sure pulsetile meant that people in the Python-sphere hate the creator of pipenv, not that the creator of pipenv hates pipenv.


Never heard of poetry before. Looks very nice. Thanks for sharing.


Me either. Of course, now the last four Python packaging systems are obsolete. ("What about eggs? easy install? .egg-info directories? Forget them. They are legacy.")

The consistent problem with all these systems is that when (not if) an install fails, figuring out what to do is very difficult.


pip, virtualenv, setup.py using setuptools, and twine if you need to publish to PyPI, are all pretty old and well-established, and are all still a fine way to do python development.



