
I think the biggest problem is going from zero third-party dependencies to one. Adding that very first one is a huge pain, since there are many ways of doing it, each with different trade-offs. It is also time-consuming and tedious. The various tools you mention are great at adding further dependencies, but they are hurdles for the very first one.



Not true. The more external dependencies you add, the more likely it is that one of them will break. I try to have as few external dependencies as possible, and to pick dependencies that are robust and reliably maintained. There is so much Python code on GitHub that is just broken out of the box. When people try your software and it fails to install because your nth dependency is broken or won't build on their system, you're lucky if they open an issue. Most potential users will just end up looking for an alternative and not even report the problem.


Adding one more third-party dependency when you already have some is as simple as adding one more entry to whatever solution you are already using (e.g. another line in requirements.txt, or running a command).

When you have no third-party dependencies, adding the first one requires picking among trade-offs and doing a lot of work. A subset of the choices: using virtualenv, using pip, using higher-layer tools, copying the code into the project, using a Python distribution that includes the dependency, writing code to avoid needing the first dependency at all ...

* You have to document to humans and to the computer which of the approaches is being used

* Compiled extensions are a pain

* You have to consider multiple platforms and operating systems

* You have to consider Python version compatibility (e.g. the third-party package could support fewer Python versions than the current code base)

* And the version compatibility of the tools used to reference the dependency

* And a way of checking license compatibility

* The dependency may use different test, doc, type-checking, etc. tools, so those may have to be added to the project workflow too

* It makes it harder for collaborators, since there is more complexity than "install Python and you are done"

I stand by my claim that the first paragraph (adding another dependency) is way less work than the rest, which is adding the very first one.
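To make that concrete, here is roughly what the smallest "virtualenv + requirements.txt" route looks like once you commit to it. A minimal, hypothetical sketch (the file name bootstrap.py and the single pinned dependency are illustrative, not a recommendation of one approach over the others):

    # bootstrap.py -- hypothetical sketch: one documented way to go from
    # "install Python and you are done" to "one third-party dependency".
    # Assumes a requirements.txt next to it containing a single pinned line,
    # e.g. "requests==2.31.0" (package and version are illustrative).
    import subprocess
    import sys
    import venv
    from pathlib import Path

    VENV_DIR = Path(".venv")

    if not VENV_DIR.exists():
        # with_pip=True gives the fresh environment its own pip
        venv.EnvBuilder(with_pip=True).create(VENV_DIR)

    pip = VENV_DIR / ("Scripts" if sys.platform == "win32" else "bin") / "pip"
    subprocess.check_call([str(pip), "install", "-r", "requirements.txt"])

Even this smallest variant already drags in most of the bullet points above: it has to be documented, it hard-codes platform layout details, and it quietly assumes a Python version range.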


They're typically broken out of the box because they don't pin their dependencies. pip-tools[1] or pipenv[2], and tox[3] if it's a lib, should be considered bare-minimum necessities - if a project isn't using them, consider abandoning it ASAP, since apparently its maintainers don't know what they're doing and haven't paid attention to the ecosystem for years.

[1] https://github.com/jazzband/pip-tools [2] https://docs.pipenv.org/en/latest/ [3] https://tox.readthedocs.io/en/latest/
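For anyone who hasn't used these: "pinning" just means recording exact == versions of everything (pip-compile and pipenv both write such a lock file for you). As a rough, hypothetical sketch of what that buys you, here is a drift check against a pinned requirements.txt (the file name and the check itself are illustrative, not part of any of these tools):

    # check_pins.py -- hypothetical sketch: warn when the active environment
    # has drifted from the exact versions pinned in requirements.txt
    # (lines of the form "name==version"; comments and loose specs are skipped).
    from importlib.metadata import PackageNotFoundError, version

    with open("requirements.txt") as f:
        for raw in f:
            line = raw.split("#", 1)[0].strip()
            if "==" not in line:
                continue
            name, _, pinned = line.partition("==")
            name = name.split("[")[0].strip()        # drop extras like pkg[extra]
            pinned = pinned.split(";")[0].strip()    # drop environment markers
            try:
                installed = version(name)
            except PackageNotFoundError:
                print(f"{name}: not installed (pinned {pinned})")
                continue
            if installed != pinned:
                print(f"{name}: installed {installed}, pinned {pinned}")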


It's trickier than just pinning dependencies, because some libraries also need to build C code, etc. Once you bring in external build tools, you have that many more potential points of failure. It's great. Also, what happens if your dependencies don't pin their dependencies? Possibly, uploading a package to PyPI should require freezing dependencies, or should do it automatically.


Modern Python tooling like pipenv pins the dependencies of your dependencies as well. This is no longer an issue.


I used a requirements.in file to list all the top-level direct dependencies & then used pip-compile from pip-tools to convert that into a frozen list of versioned dependencies. pip-compile is also nice because it doesn't upgrade anything unless explicitly asked to, which makes collaboration really pleasant. I then used the requirements.txt & various supporting tooling to auto-create & keep updated a virtualenv (so that my peers didn't need to care about Python details & just running the tool was reliable on any machine).

It was super nice, but there's no existing tooling out there that does anything like this & it took about a year or two to get our tooling into a nice place. It's surprisingly hard to create Python scripts that work reliably out of the box in everyone's environment without the user having to do something (which, in my experience, always means something doesn't work right).

C modules were more problematic (needing an Xcode installation on OSX, or precompiled external libraries not available via pip but also not installed by default), but I created additional scripts to help bring a new developer's machine to a "good state" to take manual config out of the equation. That works in a managed environment where "clean" machines all share the same known starting state + configs - I don't know how you'd tackle this problem in the wild.

I do think there's a lot of low-hanging fruit here: Python could bake something in to auto-setup a virtualenv for a script entrypoint, have the developer just list the top-level dependencies, & keep the frozen dependency list version-controlled too (+ if the virtualenv & the frozen, version-controlled dependency list disagree, rebuild the virtualenv).
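That rebuild-on-disagreement logic is small enough to sketch. Something along these lines, with hypothetical file names (requirements.txt as the frozen list, .venv as the environment, my_tool.py as a made-up entry point), gives the general shape:

    # run_tool.py -- hypothetical sketch: launch a script inside a virtualenv that
    # is rebuilt whenever the frozen requirements.txt no longer matches it.
    import hashlib
    import os
    import shutil
    import subprocess
    import sys
    import venv
    from pathlib import Path

    VENV = Path(".venv")
    FROZEN = Path("requirements.txt")       # e.g. the output of pip-compile
    STAMP = VENV / "requirements.sha256"    # which pin set the venv was built from

    def bin_dir() -> Path:
        return VENV / ("Scripts" if sys.platform == "win32" else "bin")

    def ensure_venv() -> None:
        digest = hashlib.sha256(FROZEN.read_bytes()).hexdigest()
        if STAMP.exists() and STAMP.read_text() == digest:
            return                          # venv already matches the pins
        shutil.rmtree(VENV, ignore_errors=True)
        venv.EnvBuilder(with_pip=True).create(VENV)
        subprocess.check_call([str(bin_dir() / "pip"), "install", "-r", str(FROZEN)])
        STAMP.write_text(digest)

    if __name__ == "__main__":
        ensure_venv()
        python = bin_dir() / "python"
        # Re-exec the real entry point (my_tool.py is a placeholder) inside the venv.
        os.execv(str(python), [str(python), "my_tool.py", *sys.argv[1:]])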


I don't know if it'd work the same way, but I've had a lot of success with Twitter's Pex files. They package an entire Python project into an archive with autorun functionality. You distribute a Pex file, users run it just like a Python file, and it builds/installs its dependencies etc. before running the main script in the package.

I used it to distribute dependencies to YARN workers for PySpark applications and it worked flawlessly, even with crazy dependencies like tensorflow. I'm a really big fan of the project; it's well done.

https://github.com/pantsbuild/pex
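To make the "archive with autorun functionality" bit concrete, here is a rough sketch of the build step, assuming a pinned requirements.txt (file names are illustrative, and it's worth checking pex --help for the exact flags of your version):

    # build_deps_pex.py -- hypothetical sketch: bundle the pinned dependencies
    # into one self-contained file that can be shipped to every worker.
    import subprocess

    subprocess.check_call([
        "pex",
        "-r", "requirements.txt",   # everything the job imports, pinned
        "-o", "deps.pex",           # single archive, runnable on its own
    ])
    # Executing ./deps.pex starts a Python with those packages importable,
    # which is what makes it handy to hand to Spark/YARN workers.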


Unless your dependency is a C header file updated by your distro as part of a new version.


Requiring people to "pay attention...for years" is not the way to build long-term robust software.



