
The solution to this is Poetry. https://poetry.eustace.io

It's good. Projects should use it.

> curl ... | python

Ah goddamnit.

868 lines, including shutil.rmtree calls and stuff.

Also installable via pip, but... "not recommended", and:

    Poetry was not installed with the recommended installer. 
    Cannot update automatically.

Wait, are you seriously complaining about executing code you downloaded from the internet, that installs a package manager - i.e. a piece of software that downloads executable code from the internet?!

Not the OP, but I am very concerned about telling people to pipe anything from curl straight into their shell.

I think what the comment you are replying to is getting at is that installing pip packages from the Internet and importing them in your Python app is not that different from piping code from the Internet into your Python executable. In both cases Python code from the Internet will be executed with your user privileges from within Python. Unless you audit every Python package you consume, you might as well accept a `curl https://example.com | python` installer too.

It was not that long ago that PyPI hosted malicious (typo-squatting) packages: https://news.ycombinator.com/item?id=15256121

Yeah, I hate this trend. Unfortunately, you can't pip install poetry because it needs to manage packages, so I guess a different way was necessary. Still, OS-specific packages would be nice, I guess they just need volunteers.

Even pip is pip-installable. What makes poetry any different?

It’s running over HTTPS from an auditable source. Is that _really_ so much worse than a pip install, and can you explain in detail why you believe that to be true?

I teach my kids to use the right tool for the job, because using the wrong tool for the job can lead to injuries. But I violate this all the time, myself. It's just a good habit to get into.

"curl | bash" is a bad habit to get into. It works under certain circumstances, like making sure it's an SSL connection from a source you trust. But it's just a bad habit for the average person to get into.

Somewhere, I can hear John Siracusa saying, 'curl piped into a shell? No thanks.'

Yes, funny, but seriously, where's the threat model where you've analyzed the risks of installing code from GitHub over HTTPS and found it to be less secure?

To be clear, either of these methods can have problems, it's not unique to curl and your shell of choice. Some of the better open source projects will say up front that if you are concerned about this kind of thing, feel free to read the installer script and decide for yourself if everything's kosher.

Yes, my point was that if you're worried about running someone else's code, the answer is to audit that code rather than the transport layer. There are valid concerns with plain HTTP or in scenarios where something could be targeted at a single user, but neither of those is relevant 99% of the time people raise this complaint.

There's always the risk that the script will fail to download completely and leave your system in a broken state. Script authors can mitigate this by wrapping everything in a function which is called on the last line, but how do you know they've done that without downloading the script and checking first?

(Poetry have done this, for what it's worth)
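To make the "wrap everything in a function" idea concrete, here is a minimal sketch. The installer source and its three steps are hypothetical stand-ins, not Poetry's actual script; the point is only that if the download is cut before the final `main()` line arrives, the function gets defined but never runs.

```python
# Hypothetical installer source (not Poetry's real script). All work happens
# inside main(), and main() is only called on the very last line.
INSTALLER = '''
steps = []

def main():
    steps.append("remove old install")
    steps.append("unpack new version")
    steps.append("write launcher script")

main()
'''


def run(source: str) -> list:
    """Execute installer source and return the list of steps it performed."""
    namespace = {}
    exec(source, namespace)
    return namespace.get("steps", [])


# Complete download: every step runs.
full = run(INSTALLER)

# Simulated dropped connection: the trailing "main()" call never arrives,
# so the function is defined but nothing is executed.
truncated_source = INSTALLER[: INSTALLER.rindex("main()")]
truncated = run(truncated_source)

print(full)
print(truncated)
```

Without the wrapper, a cut at a statement boundary could run "remove old install" and then stop, which is the broken state described above.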

It's pretty easy for a server to detect when its response is being piped into a shell and to include malicious code only then.

Do you believe GitHub has that infrastructure deployed? If not, this is a blind alley to worry about. If so, what other precautions have you taken to avoid compromised tarballs, unauthorized pushes to repos with auto-deployment pipelines, etc.?

The point is that in reality you’re orders of magnitude more likely to be compromised by ads in your browser, an undetected flaw in legitimate code, or a compromised maintainer than by GitHub having deployed custom infrastructure to target you. If you’re being targeted by a government, why would they do this instead of using the same TLS exploit to serve you a dodgy Chrome or OS update, which is harder to detect and will work against 100% of targets?

So because ads can compromise us we should ignore the security of package managers?

How about this for a reason: where are the checksums when I’m curling and piping? How do I validate, in an automated fashion, the validity of the file I’m piping into an interpreter? When installing a package it’s quite easy to have redundant copies of an index with checksums pointing to a repository hosting the actual code. The attack surface is much smaller vs. a curl | python.

This is bad practice; stop promoting it or downplaying its security issues.

Edit: smaller instead of larger

HTTPS has checksums, and note that we’re specifically talking about installing from Github, where every change is tracked.

> This is bad practice, stop promoting it or downplaying its security issues.

I’m trying to get you to do some security analysis focused on threats which are possible in this model but not the real alternatives (download and install, install from a registry like PyPI or NPM, etc.). So far we have “GitHub could choose to destroy their business”, which seems like an acceptable risk and about the same as “NPM could destroy their business”.

HTTPS doesn’t know if the file changed on the server so that doesn’t count here.

I am doing security analysis. If this file changes and I’m using it in built server images, then I have no way of automatically validating that the changes are good without doing the checksumming myself and managing this data. What we have is a server that can be hacked and files that cannot be verified by checksum.
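For what "doing the checksumming myself" could look like, here is a minimal sketch using only the standard library. The function names and the sample installer bytes are made up for illustration; in practice the pinned digest would be recorded when the script was last reviewed and stored alongside the image build, and the bytes would come from the download.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_installer(data: bytes, expected_sha256: str) -> bool:
    """Check downloaded bytes against a digest pinned at review time."""
    return hmac.compare_digest(sha256_hex(data), expected_sha256.lower())


# Hypothetical installer contents, as they were when someone audited them.
reviewed = b"print('installing')\n"
pinned = sha256_hex(reviewed)

# The same URL later serves something different.
tampered = reviewed + b"import os\n"

print(verify_installer(reviewed, pinned))   # unchanged file passes
print(verify_installer(tampered, pinned))   # changed file is rejected
```

The build would only hand the bytes to the interpreter if `verify_installer` returns True, so any change on the server, benign or not, forces a fresh review instead of being executed silently.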

Fair point - I didn't actually click so I didn't see that it was a Github link, I was just reading the comments.

> Also installable via pip, but... "not recommended", and:

If you install it via pip you need to update it via pip, the alternative would be insane. And the reason it's not recommended is that it doesn't let you use multiple Python versions, but if you're only using one version then installing by pip works fine.

They could just as easily add the same code to setup.py, and then pip would run it as soon as you run pip install. There's generally no security difference between curl | python and pip install.

I agree. Most of the issues the parent mentions have been solved with poetry and pipenv.

And if you need "to create a redistributable executable with all your dependencies", you can use either pyinstaller [0] or nuitka [1], both of which are very actively maintained/developed and continually improving.

[0]: https://github.com/pyinstaller/pyinstaller

[1]: https://github.com/Nuitka/Nuitka

Pipenv is plagued with problems and issues. It takes half an hour to install dependencies to our project. The --keep-outdated flag doesn’t (didn’t?) work, so I don’t know if my pipfile is being modified because the constraints require changing versions or because the package manager is errantly updating versions to latest. There are mixed messages about the kind of quality the project aims for. I would not recommend.

Frankly I’ve been burned enough that I won’t use any new packaging technology for Python because everyone thinks they’ve solved it, but once you’re invested you run into issues.

Poetry is definitely an improvement.

Anyone considering it for production usage should note that package installs in the current versions are much slower than pip or Pipenv. This might affect your CI/CD.

Could you give some details as to why it's better than other more commonly used tools (pip, venv, ...)?

Looking at the home page it's not immediately obvious to me. For example, the lock file it creates seems to be the equivalent of writing `pip freeze` to the requirements file. I see a quick mention of isolation at the end, it seems to use virtual environments, does it make it more seamless? What's the advantage over using virtualenv for example?

I'm not an expert on the internals, but virtualenv interactions feel more seamless. When you run poetry, it activates the virtualenv before it runs whatever you wanted.

So `poetry add` (its version of pip install) doesn't require you to have the virtualenv active. It will activate it, run the install, and update your dependency specifications in pyproject.toml. You can also do `poetry run` and it will activate the virtualenv before it runs whatever shell command comes after. Or you can do `poetry shell` to run a shell inside the virtualenv.

I like the seamless integration, personally.

Sounds about the same as pipenv's functionality.

Pipenv still needs setup.py and MANIFEST.in. Poetry replaces both.

Thanks! Didn't realize this existed.


What's wrong with it? I've not used it before but it does look like a good idea.
