The number of extra tools used in this article boggles my mind.
Are you writing a simple library? Create a setup.py. Copy-paste an existing setup.py and modify it to suit your purposes. Now you have a working, pip-installable Python package.
Want to publish to PyPI? Use twine. It's standard and it's simple.
You don't need complicated tooling for simple projects.
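For anyone who hasn't written one before, a bare-bones setup.py really is just a handful of lines. This is only a sketch; the name, version, and metadata are placeholders you'd swap for your own:

    # setup.py -- minimal sketch; all names and metadata here are placeholders
    from setuptools import setup, find_packages

    setup(
        name="mylib",                 # your distribution name on PyPI
        version="0.1.0",
        description="A small example library",
        packages=find_packages(),     # picks up your package directories
        python_requires=">=3.7",
        install_requires=[],          # runtime dependencies go here
    )

From there, `python setup.py sdist bdist_wheel` builds distributions into dist/ (you'll need the wheel package for bdist_wheel), and `twine upload dist/*` pushes them to PyPI.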
Comments on this reply seem to have forgotten two key things about Python.
1) It’s batteries included. The standard library is a core feature, and it is extensive in order to reduce the number of 3rd-party library dependencies.
2) “There should be one— and preferably only one —obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.” The Pythonic way is to use the standard library, setup.py, PyPI, and so on. Python is not Rust, or Node, or Clojure, or anything else. Do it Pythonically and good things will happen.
In the history of Python, it’s only because of the Zen’s guiding principles and the “come join us” attitude of the community that Python accidentally became “enterprise grade”. It succeeded in spite of the lack of Cargo or NPM, or (shudder) Maven[, or CI/CD, or 3D game engines, or easy binary package productization, or everything else Python is obviously bad at]; almost precisely because it’s not those things. Python encourages me to get my task done, and all it asks is that I leave the flame wars at the door and embrace it for what it is.
It saddens me to see so many people jump down zomgling's throat for making the daring suggestion that Pythonistas use the language as designed. Because he is absolutely right about this: “Complex is better than complicated.”
Unpopular opinion: Python is bait and switch. Naive, aspiring programmers are told to learn Python because it’s the perfect beginner language, only to quickly hit a brick wall of complication they are unable to foresee (because they’re completely new to all this) and ill-equipped to navigate.
Like it or not, Python is now “enterprise grade”. The sooner it either grows up and figures out packages, CI/CD, binaries, etc., or loses its reputation as a great beginner and general-purpose programming language to something like Julia or Crystal, the better.
The thing is that using setuptools is neither standard nor Pythonic. It isn't part of the standard library. It's a way of doing things that is broken, and has been specifically called out by the Python developer community as something that people should stop doing.
I actually think one should do the exact opposite of what you're suggesting. I've experienced modern package management through Cargo, and anything below that level now seems like returning to the stone age.
In the Python ecosystem, poorly defined packages are a widespread problem. You never know what you'll get with the next version upgrade.
So my suggestion: burn all the "simple" Python packaging tools in a big fire and move everything to Poetry today. It will be painful at first but will increase the quality of the ecosystem by a huge margin if just 50% of the projects do this.
Bit of a side note, but remember that Cargo learned a lot from generations of package managers, across languages. It essentially represents some of the best that package managers have to offer. You only get that kind of result by starting from scratch every few years (as in, Rust started from scratch when the Node/NPM ecosystem had been around for years to steal ideas from, Haskell had been around for years to steal language design from, etc.).
Rust is incredibly lucky to have been created when it was because it benefits immensely from these things (I think it's the best new-to-middle-aged production-usable systems language out there today).
I agree with the idea but languages like Python and their ecosystems are really hard to move (remember python 2->3? is that even over?) -- it's a herculean and often impossible task.
Cargo is not hugely different from Maven which has been working fine for over a decade. Yes, it takes some polished ideas from other systems, but Python has had more packaging tools come and go than several other ecosystems put together.
+1 for Maven. And Maven was launched in 2004, 17 (!) years ago. It's almost old enough to vote.
A lot of ecosystems put on some solid horse blinders in order to avoid Java at all costs (Javascript/Node/NPM being another example). They've avoided good ideas from Java for more than a decade.
Using setup.py does not mean "not using extra tools". It depends on setuptools, which is an "extra tool" just like flit (used in the article) or any other tool. In fact, using only setuptools, one will need a whole additional set of tools to manage things like:
* virtual environments (granted, venv is now part of the stdlib, but it's still an "extra tool")
* publishing to PyPI or another index (twine)
* dependency management (both for development and actual dependencies)
Plus the tools that are needed anyway to manage common development actions (linting, testing, and so on).
The article is correct in using pyproject.toml, which has become the standard way to specify the build mechanism for your package [0]. Even setuptools supports it in the latest versions [1], meaning that setup.py is becoming obsolete, or at least unnecessary.
Finally, tools like Poetry [2] offer a whole set of functionalities in one place (dependency management, virtual environments, publishing), which means that they need fewer "extra tools" than just setuptools.
You asked for packaging, and that is pretty much it. Of course, setting up a project and its dependencies takes a bit more work; the basic intro for that is here: https://python-poetry.org/docs/basic-usage/
How do you handle version pinning? hash checking? CI? testing on multiple platforms? multiple python versions? deployment? credential management? package data? version bumps?
Sure, experts know how to do all these things because they spent many days learning them, but I'd rather outsource to a tool.
Iteratively. You don't need to solve all those problems at once.
Version pinning can be done in setup.py using the same syntax you would see in a requirements.txt file. You should be very conservative when pinning versions in a library, though.
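For illustration, a pinned dependency block in setup.py looks something like this; the package names and versions are just examples:

    # setup.py fragment -- install_requires uses the same specifier
    # syntax as requirements.txt; packages/versions below are examples
    from setuptools import setup, find_packages

    setup(
        name="mylib",
        version="0.1.0",
        packages=find_packages(),
        install_requires=[
            "requests>=2.20,<3",   # a range: the conservative choice for a library
            "click==7.1.2",        # an exact pin: use sparingly in libraries
        ],
    )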
You can lean on your CI tool (e.g. GitHub Actions) to handle testing, hash checking, credential management, etc. But I recommend all of this start as a bunch of locally runnable scripts.
I typically bump version directly in a version file and move on with my life.
This stuff usually builds up iteratively and, at least for me, has never been the starting point. The starting point should be a library worth sharing. It is not the end of the world if you release the first few versions manually.
TBH, as someone trying to use Python professionally, it is extremely frustrating that basic things with regard to package management are something you have to iterate towards, as opposed to just being obvious and default.
One thing that has become clear to me, from playing around a bit with go, rust, and nim, is that it is astonishingly better when the language has exactly one set of tools, that everyone uses.
Even if that set of tools is kinda crappy (glances at Go), it's just so nice to not have all the bikeshedding and just get on with it.
I’m not familiar enough with Go but at first glance the go mod stuff seemed pretty decent.
E.g. it's more flexible than Cargo in that you could have a large codebase with two different versions of a dependency isolated to their own compilation units, allowing you to gradually adopt a dependency's breaking change in a large codebase. I was kinda bowled over by that feature (the isolation part is key).
For Python, I’m finding Poetry much more ergonomic than Pipenv. It’s not just the speed difference, it’s the convenience of one tool, which aligns with what you’re saying, although the existence of Poetry doesn’t delete the historical aberrations in Python’s long history.
I sympathize. It is unfortunate that the Python community never settled around a tool like Leiningen for Clojure, Cargo for Rust, or npm for Node.
What we saw with npm was the entire community iterating towards a feature set and everyone reaping the benefits automatically with npm updates. package-lock.json is a good example of this.
Worth noting is that cargo and npm weren't "settled around"; they were developed and presented, from the beginning, alongside the relevant compiler and runtime. There was never a question; the batteries were included.
Leiningen is the weird one where people did actually settle fairly well around an unofficial solution in the absence of an official one. I think the norm with languages that forego official tooling is closer to what we've seen in Python.
The Python community has considered an "official" packaging tool in the past, but in those conversations found that the community had too many preferences to find a good compromise. That's the trouble with having a highly diverse set of uses and integrations, and lots of legacy.
If you're curious, the email threads about Conda and defining wheels are interesting.
Maybe it could still happen? It seems like a super high value challenge that the BDFL could take on: build out the official set of tools (setup.py, twine, virtualenv, pip) to support features that make people seek out alternatives (pyproject.toml, poetry, flit, conda, pyenv, pipenv).
I realize this is controversial but from reading the docs I really thought Pipenv was the official solution. Took me a while to realize this wasn't the case.
I went through the same progression, thinking pipenv was the official solution before deciding it wasn’t. Then, just now, I realized that pipenv [1] is currently owned by the Python Packaging Authority (PyPA) who also owns pip [2] and virtualenv [3]. I don’t know the right answer but this illustrates the confusion of not coalescing around an official solution.
What happened was that Kenneth Reitz socially-engineered his way into the PyPA to get his tool blessed. The community lashed out (since the tool had obvious shortcomings and a somewhat dubious development process) and recommendations were softened. Eventually the PyPA had to take over pipenv when Reitz had other issues, and they are now forever burdened with what is a bit of a dud.
There are quite a lot of people working on better packaging (look at the Python discourse forum, for instance), but this is not really a topic that Guido has got involved in, at least in the time I've been paying attention.
But with or without a BDFL, one packaging tool to rule them all is a pretty tall order. The needs of a package like scipy, which incorporates C & Fortran code, are pretty different from something like requests. And different communities using Python have their own entrenched tools and techniques. It takes more than someone saying "Foo is officially blessed" to shift that, even if everyone respects the person saying that.
On the whole, Guido's career as a BDFL was astoundingly effective. Maybe he made the right call. It'd have been a terrible idea to alienate the science community just when data science was taking off as a field.
I'm not disagreeing, but I am curious if you have anything specific you would point to with regards to Guido's role being successful. I'm just ignorant really, it's not intended to be a leading question at all.
Too many things to answer here. The easiest is to point to the popularity of the language. He's made decisions I disagree with, but I can't argue with success.
I don't know how much the original author had to do with Python's success in the last 20 years. The success of Python in data science is because of NumPy/Scipy/Pandas. Things like packaging that needed leadership never got any.
The leadership decision was that the science community would benefit from a science-specific tool, like Conda, and that making one ring to rule them all would be too difficult.
Also, yes, Guido and many other early contributors have been active for 30-ish years.
Rubygems, and then Bundler, followed the same pattern. Neither was batteries-included, both were unofficial community efforts. Bundler directly influenced cargo.
Yarn / npm are still in competition today, I think?
The JS tooling is particularly immature; e.g. fake packages along with typosquatting are rife. Compare with boring old Maven, where that hasn't been an issue in >10 years at this point.
The issue is that every time the community settles on a tool a new tool is made to fix the issues with the old tool rather than just refactoring the old tool.
I know Python much better than I do rust or node, but I think the Python design decision was decent here: you can use different tools, but all of them should put the configuration in pyproject.toml. That file has fields which are universal and others which can depend on the exact tooling used. So build tools, repos and so on can get the info they need and at least potentially, do the right thing for code packages with different tools.
Yea, it is. But it's a sane thing to do. Recommending Poetry for a beginner is a bad idea (nothing against that package).
Python is a mature, old software system: three times older than Go or Rust, way older than Zig. These modern languages have learned from the field as a whole and implemented tools that people take for granted nowadays.
As with any mature software system, people & companies have established their preferred ways of doing things. It's going to be hard to have the language council dictate ways of doing things.
I would recommend going simple and using the standard set of modules until Poetry or whatever makes it into that standard set.
Recommending the standard set of modules is the opposite of "going simple." Poetry removes a lot of the complexity and user-unfriendliness inherent in the previous set of standard modules. For any Python beginner coming from another popular language Poetry is likely going to be very similar to the dependency and package management in the language they're coming from. Python's standard set of modules sticks out like a sore thumb when compared to the tools in other popular languages.
It's also not ready yet, missing critical features like editable installs. Right now, you still need a shim setup.py. Until pyproject.toml can actually replace setup.py, I see little incentive to start using it: it's just one more file to add. The one exception is if the package actually has build requirements, e.g. for Cython modules.
The only reason we use pyproject.toml is because of the stupid black formatter that refuses to support setup.cfg, which every other python tool under the sun supports.
Yeah, it's annoying. But Black has so little configuration that I'm ok with hardcoding command-line parameters in my Makefile/tox.ini/precommit hooks (`--skip-string-normalization --line-length 79`)
I don't recognize this. I use poetry all the time, without setup.py and the local installs are editable. I have published half a dozen packages which don't have a setup.py and they all work fine.
You can't install a poetry package editably into another environment. It's really a missing feature in the pyproject spec. There's an open issue about it under one of the pypa repos. Someone just needs to do the work of implementing it in pip/setuptools.
> recommending poetry for a beginner is a bad idea.
Strongly disagree. There are so many footguns with low-level tools like pip that I can't recommend it to anybody but an expert (but an expert doesn't need my recommendation anyway).
To be fair, while this is a single article, if you only look at step 1, 2 and 3, you get a fully published package with only one tool used (flit) and not much extra.
It's the succeeding sections (A, B, C, D, E) that get more advanced, but they're all optional. You should definitely do A, but the rest, I'd say, is a lot more opinionated and definitely not needed.
Fair point. I wouldn't recommend this article to someone just starting out with Python, though. It's often good to understand generation n-1's way of doing things but not to be married to it.
> Version pinning can be done in setup.py using the same syntax you would see in a requirements.txt file
The problem with this approach is that it doesn't handle transitive dependencies well. Say you depend on version 1.4.6 of a particular library. And then that library depends on version >= 2 of some other library. When you install your package, you know that you'll get version 1.4.6 of the first library but have no idea what version you'll get of the second library. You can of course pin all the transitive dependencies - except that clutters up the setup.py and is a massive pain to keep up to date as you bump individual dependency versions.
Seems like a solid argument for a switch to use go's minimal version selection
The version selected during a build is the lowest one that satisfies all constraints. This means if you have libA that needs dep>=1.1 and libB that needs dep>=1.3, you get dep=1.3 even if dep 1.9 is out. Your build never changes because of a new version release, as long as releases follow proper semantic versioning. If you later include libC that needs dep>=1.8, you'll get that version, but because you changed your immediate dependencies, not due to a surprise down the dependency line.
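To make that concrete, here's a toy Python sketch of the selection rule; the version numbers are made up and this has nothing to do with the real Go implementation:

    # Toy illustration of minimal version selection.
    # Each constraint is a minimum required version; the rule picks the
    # lowest released version that satisfies every constraint.
    released = [(1, 1), (1, 3), (1, 8), (1, 9)]   # hypothetical releases of "dep"

    def minimal_version(released, minimums):
        floor = max(minimums)                      # the highest minimum wins
        candidates = [v for v in sorted(released) if v >= floor]
        return candidates[0] if candidates else None

    print(minimal_version(released, [(1, 1), (1, 3)]))           # (1, 3), even though 1.9 exists
    print(minimal_version(released, [(1, 1), (1, 3), (1, 8)]))   # (1, 8) once libC's >=1.8 is added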
It's at least consistent, which, IMO, is better than just getting a random version. I do, however, think it's a bit unfortunate that it prevents picking up bug/security fixes in transitive dependencies.
Imagine that you depend on library A of a particular version which itself depends on library B. With minimal version selection, as long as you don't bump your dependency on library A (or some other library that depends on library B), you'll continue to get that same version of library B. But then library B releases a critical security fix. With minimal version selection, there isn't a great way to pick up that fix. You can _hope_ that library A releases a new version that requires the fix - but that may or may not happen and could take a while. Or, you could add an explicit dependency on the new version of library B - which is unfortunate, since your main package doesn't depend on library B directly.
Lock files solve this problem. You can depend on whatever version of library A that you need and lock the transitive dependencies. And once library B releases its fix, you can update your lock file without having to bump the version of library A.
Tools like Poetry provide the features to automate this workflow.
I used poetry for multiple projects already and I quite like it because it makes version management really straightforward and you can just go like
poetry new foo
And it will set up all the basic stuff for you right away. Having all the project dependencies and metadata in pyproject.toml makes sense and reduces cognitive overload. Having Poetry manage your venvs automagically is a nice extra.
There is still room for improvement, e.g. I had some trouble with their install script at times (python vs python3)
I usually don't pin, I'd rather deal with upstream BC breaks as they are published instead of accumulating tech debt. I call this "continuously integrating upstream code", because Continuous Integration is a practice, not a tool.
You don't pin anything for a package. I'm not aware of any "standard" CI that a package tool could set up. I guess you mean testing on multiple versions, in which case tox will help. Deployment for a package is handled by twine. Package data? What about it? Version bumps should always be manual but I recommend setuptools-scm.
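If you go the setuptools-scm route, the hook is roughly this, so the version comes from your git tags instead of a hand-edited string (a sketch; check the setuptools-scm docs for the pyproject.toml variant):

    # setup.py -- version derived from git tags via setuptools-scm (sketch)
    from setuptools import setup, find_packages

    setup(
        name="mylib",                          # placeholder name
        use_scm_version=True,                  # e.g. tag v1.2.3 -> version 1.2.3
        setup_requires=["setuptools_scm"],
        packages=find_packages(),
    )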
You seem to be confusing packages with "apps". It's very important to understand the clear distinction between these.
I generally reach for pip-tools when I need to pin versions in a requirements file for a deployable app, like an API.
It's by far the simplest option I've found.
If your project is a library, just use setup.py and express your deps abstractly (min, max or version range). Don't pin to specific versions at all, if you can help it.
This is bad advice. Do not create a setup.py for a new package. (Keeping setup.py for an old package can be okay.)
The author is correct that you want a tool such as flit or poetry which will work with pyproject.toml. Setting up a basic package will be no harder than using setuptools, and it is much more future-proof. You won't have to copy-paste some other crufty setup config either.
It is fair that you don't need all the external tools in this tutorial. In particular, using make is very silly since you can configure different linting and testing workflows directly in pyproject.toml, rather than pull in a whole other system which only works decently on *nix. Poetry also removes the need for tox.
Future-proof... Poetry 1.1 broke compatibility with 1.0. 1.1 lockfiles would crash Poetry 1.0, and 1.0 lockfiles would be thrown away by Poetry 1.1.
It does not correctly verify hashes (if at all) [1]. You can't add packages without updating all your dependencies. Monorepos are not supported. PEP-508-compliant Git dependencies cause the tool to crash with TypeError [2].
I think Poetry is the right direction, I use it for everything, but it's not the silver bullet you're painting it to be (yet). It's definitely not on par with Cargo, or maybe even npm.
For all the hate nodejs gets, it solved the software packaging problem. That is the sole reason the npm ecosystem is so big.
I have to say that it probably isn't a fair comparison, because Python is much older than nodejs. Python package management might very well have been state of the art in 1995.
My impression was that npm does little more than fetch code from the Web and stick it in a 'node_modules' directory. I've even seen npm used for things that aren't even JS, just bunch of files.
This approach ends up with multiple, potentially-incompatible versions of the same package in a project. True, that's less of a problem in JS, since it's interpreted (deferring imports to runtime) and un(i)typed (no need to check if interfaces match up). Yet even that has led to replacements/complements like yarn.
> My impression was that npm does little more than fetch code from the Web and stick it in a 'node_modules' directory.
Yes. There's hardly even a standard directory structure, let alone a standard way to convert source code to published code. Every slightly non-trivial repo basically has an ad hoc build system of its own. Ever tried to fix a bug in a package, and realized that using git://github.com/user/repo#branch doesn't work, because npm only downloads the source code, which bears no resemblance to the built products? I fixed two bugs in two third party packages within the past week, had to deal with this twice. Ran into the Node 12+ and Gulp 3.x incompatibility issue twice in the past month (with totally modern, actively developed packages), too.
npm has more sophisticated dependency resolution and locking than pip, sure. Python packaging is more consistent in basically every other regard.
Maybe the reason the library in the article is so simple is that it's just an example, and the author wants to show how to do it properly, end to end.
You seem to have missed the point. The project in the article was simple because that is best for exposition. The article would hardly have been more helpful with the code for a realistic python package dumped into it, would it?
Huge thumbs up to Poetry. It's drastically simplified package management for me and replaced Pipenv (which I simply dreaded working with due to performance and DX issues).
I no longer start Python projects without Poetry. It's really good.
EDIT: Also, it's being integrated to most PaaS as well. I deploy to Render.com with Poetry now.
That's a 72MB download, and yet another way to fragment the ecosystem. Not something I'd get just to make a package when a default Python setup has recommended tools and everything I need to make and/or install packages.
> a default Python setup has recommended tools and everything I need to make and/or install packages
No it doesn't. Neither setuptools nor pip are part of the standard library. Yes, they are installed by default in many cases, but they are still "extra tools".
72MB? On a development machine? Are you on dialup?
Poetry doesn't fragment the ecosystem. Unlike setuptools it uses pyproject.toml, which can be read by other tools, and is the correct way of storing package configuration.
A package built using Poetry is installable without Poetry in the exact same way as one built using setuptools.
I think it's the right direction, and I look forward to Poetry maturing. But right now it has a lot of gotchas, and I would only recommend it for people who are serious about dependency management/compliance/reproducibility.
That's if you want to pin transitive dependencies, which is the de facto standard in JS world but not always true in Python world, depending on your context.
Which, like any jazzband project, you must not use for a commercial project, as stated by their CoC: the Contributor Covenant with a vague & bizarre modification about ethical use at the end. You are simply not allowed to contribute in any way on any kind of paid time, nor are you allowed to pay someone to contribute. This is of course not advertised when they tell you to give them your OSS projects, for which you have already chosen a license and maybe even a CoC. For this reason, despite the many non-profit projects I maintain, I stay away from jazzband.
`poetry` and `pipenv` are both so much slower than `pip-compile` for me (there are many open issues for both complaining about lock speed), and manage to update locked dependencies half the time despite me asking them not to with e.g. `poetry lock --no-update`.
Pipenv only targets applications; Poetry targets both applications and libraries. Pipenv has quite some drama behind it that I do not want to get into; in contrast, Poetry's development has been quite professional. Pipenv enjoys better tool support, e.g. it is recognized and supported by VS Code; but Poetry does not have the same level of support.
The drama was surrounding false advertising so to speak. Pipenv promised a lot but did not quite deliver, much like the earlier days of MongoDB. But more importantly, it pretended or at least heavily implied it was an official PSF-affiliated project, when it was not. How that claim was substantiated was also subject to drama.
It also had no releases for over a year, even though the master branch was getting frequent updates and the performance of the last release was atrocious (and I'm not sure if it has improved much).
From memory, there was a whole thing where pipenv, created by Kenneth Reitz of requests fame, was inaccurately portrayed as the official successor to pip when that wasn't true.
I just moved a project from pipenv to poetry at work. My biggest issue with pipenv is that you can't selectively upgrade dependencies. Trying to `pipenv update xyz` basically blows away your lockfile and updates everything. There's a command line flag to be more selective but it doesn't work. I found an open GitHub issue about it that's years old.
Poetry by contrast works pretty much like any modern dependency system you'd be familiar with from another language like cargo, npm, or hex.
For poetry, it makes sense to use `poetry self update --preview` — I often come across weird bugs that take too long to get fixed in the current release.
I hadn't heard of flit. It does seem like it's not brand new on the scene; however, it is primarily a single-author project, so expect a tool which is opinionated and whose opinions may not necessarily reflect a broad consensus.
Thomas is well known as one of the maintainers of IPython and Jupyter, and developed flit while working on the PEP for pyproject.toml and the build-backend interface allowing things like `python -m build`.
Though `python -m build` only works _if_ you use something like flit or setup.py as the backend to build the package, which is why you can set flit as a build-backend.
So yes, flit is one of the latest tools, and yes, it is one of the things that pushed for the ability to use pyproject.toml + python3 -m build; you just seem to miss some subtleties of the toolchain.
Poetry does much more than Flit, like resolving dependencies, creating a lock file, and managing an environment where you can run your code. In particular, Poetry is meant to support application development (where you want to have a fixed version of your dependencies) as well as library development.
Flit is more aimed at being the simplest possible thing to put a package on PyPI, if that's all you want to do. It expects you to list dependencies in pyproject.toml manually.
I feel like this is a good place to mention Pip-tools [0] which can generate a lockfile of sorts from a standard requirements.txt (or a setup.py). Specifically, it resolves all dependencies (including hashes), and writes them to a new "requirements" file that you can read with the usual `pip install -r`.
The nice part about Pip-tools versus Flit or Poetry or Pipenv is that Pip-tools lets you keep using Setuptools if you want to, or if you're unable to switch to one of the others for some reason (and valid reasons do exist).
Flit is definitely opinionated, and not suitable for every use case. As Carreau hinted, I think its bigger impact will be from the specifications, especially PEP 517, which it helped to prompt, rather than people using Flit directly. The specifications mean it's practical to make new tools which interoperate nicely, without having to either wrap setuptools or carefully imitate its behaviour.
The author mixes different things like linting and testing into the packaging process, which (IMHO) are not really part of making a package. The process is really much easier than this article makes it seem:
- Write a simple setup.py file.
- Generate a source or binary release by e.g. running "python setup.py sdist"
- You're done!
Adding a setup.py file is already enough to make your library pip-installable, so you could argue that this is a package already. The files generated by "setup.py sdist" can also be pip-installed, so they are also packages. Now you might want to upload your package to a repository, for which there are different tools available, the simplest one being twine. Again, you just install it and run "twine upload dist/*" and your packages get uploaded to PyPI (it will ask for a username and password). So why complicate things?
I don't see how your version is easier than the sequence of commands in the first few steps of the article, which is basically `pip install flit; flit init; flit publish`. Flit is just as easy to install as twine, but you save yourself the hassle of having to write a setup.py.
Maybe I'm too old-fashioned then. But I like that you don't have any dependencies when using distutils/setuptools with a `setup.py` file, so if you don't distribute your code you're already done. I'm also not a fan of tools that are just wrappers around other tools.
Flit isn't (mostly) a wrapper around other tools - it has its own code to create and upload packages. This was one of the motivating cases for the PEPs (517, 518) defining a standard interface for build tools, so it's practical to make tools like this without wrapping setuptools.
Even as a long time Python user, the packaging ecosystem feels fragmented and error-prone at best. Honestly, it sours the experience of writing Python code knowing you might eventually need to make it work on another computer.
Even more fragmentation? As a dev I'm not going to make packages for anything other than 'pip install' and maybe Ubuntu if you're lucky. I would also heavily discourage distros from shipping ancient buggy versions of the package, which is all distros are good for these days.
I think for the vast majority of at least pure-Python projects you could just use poetry and upload your packages to PyPI or a private index. You can go from an empty directory to publishing a package within minutes with poetry (although, of course, you probably shouldn't).
There would be no room for error if we just put the libraries in with the project as files instead of adding all these extra steps. Nobody seems to like this simple, bulletproof method anymore for some reason though.
A package manager is a whole separate program with config files that adds an extra build step and works like a black box. I mean just having the libraries entirely present in the repository. If an update is needed then someone pastes it in and commits it. This also lets you organize how you want.
It's great for beginners and experts. As a long time python veteran it completely changed how I work with python for the better. It is lengthy with lots of optional steps. Just skip the ones you don't find relevant.
The recommendation to set up a Makefile on top of tox is a bit odd to be honest. Tox basically "just works", and you can do things like pass stuff to `pytest` by setting up `{posargs}` in the tox config (see [0])
I do feel like tox gets a bad rap despite having a complete feature set. I think part of it is that the documentation is complete but not organized from the "tox user"'s perspective, so for someone who shows up on a project using it, it's hard to figure out the quickstart (though the "general tips and tricks" page gets you somewhere [1]).
Anyways yeah, would not recommend Make over just leaning into tox more here.
EDIT: also, this article reminded me of how much I really dislike GitHub Actions' configuration syntax. Just balls of mud on top of the Docker "ball of mud" strategy. I will reiterate my belief that a CI system where the configuration system isn't declarative but is just... procedural Lua will be a billion-dollar business. CI is about running commands one after another! If you want declarative DAGs, use Bazel.
I’d argue the counterpoint actually: Writing Makefile targets for common commands significantly improves usability and ergonomics, especially when they follow common idioms (make test, make build, make install, ...).
The recipes for each target describe not only how a project intends to run each tool, but which tools it intends to run. Instead of having to know that this project runs tests under tox, while that one runs only under pytest, we can run ‘make test’ in each, and count on the recipe doing the Right Thing.
That consistency across projects makes it much easier for someone to get started on a new project (or to remember how the pieces fit together on your own project from a few months ago)
For me it’s important for a testing command to be able to receive parameters at runtime (for example `tox test -- --pdb`). Is it possible to do that with make in general? I never knew how.
I generally agree with your sentiment, though. I’m usually limiting myself to Python stuff, so I don’t have much exposure to make; it’s always felt like a less powerful task runner than other stuff.
I prefer to put `.PHONY` before every phony target, not gather them all in one place.
When phony targets are all written out in the beginning of the Makefile, it's easy to forget to alter this list when a phony target is added or removed later.
This rarely causes an error, but still I've seen many Makefiles with outdated .PHONY lists.
I’m always glad to see Make being used. It’s such a powerful and simple tool that usually does the job just as well as more “bespoke” CLI’s for various frameworks and languages
I'd rather just write a bash script. It's the lowest common denominator. Make may not be installed by default in many places, and it has some weird syntax quirks that make it annoying to use, IMO.
I like this style as well; we keep all our scripts in a ./bin directory, e.g. ./bin/lint.sh, ./bin/test.sh, etc. They're just as discoverable as make targets (run `ls ./bin`) and much easier to maintain.
If you really want make, you can also just have each make target call out to the corresponding bash script, e.g. a `test` target whose only recipe line is `./bin/test.sh`.
I would strongly discourage using make on new projects, make syntax is full of footguns and quirks (not being able to pass multiple args to subcommands is an easy example).
Bash, Python, or even Typescript are much easier, safer, and more widely standardized environments to maintain and grow your scripts once you get past a few lines.
It's a programming language that distinguishes between spaces and tabs in a way that changes behavior. It's also the only PL that I know of that's outright incompatible with an expand-all-tabs-to-spaces editing policy, which is what the vast majority of coders use in practice.
I like the idea of Make, but it's far too hacky. Even using it for a static blog site (turning foo.md -> foo.html, which is pretty close to the usual foo.c -> foo.o examples) ended up with recursive invocations, rule-creation macros, double-escaped sigils, eval, etc.
There are a bunch of lightweight alternatives to Make out there (I hear Ninja is pretty good). My personal preference is Nix these days (although that's quite heavyweight).
The biggest problem with make appears to be that people refuse to spend a little time learning how it works, and instead charge off to reimplement it, poorly, instead.
It sets up GitHub Actions for publishing the package to PyPI when you create a release on GitHub, which I find to be a really productive way of working.
cookiecutter is great. I recently moved from click to typer, which I have so far really enjoyed. I probably should make a cc template for that one day...
When you are doing machine learning, conda is widely used. Why? Because you can install non-Python things like cudatoolkit or ffmpeg (you can even install Python itself, so you are sure that everybody is using the same version of Python).
Python is fantastic at gluing specialized tools/libraries, but a lot of these require non-Python dependencies (most are written in more performant languages). IMO, this is a big difference when comparing with Cargo for Rust, because most of the dependencies in Rust are written in Rust.
The state of packaging in Python is kinda meh; the official documentation here [0] suggests creating 3 additional files in order to create a package:
- pyproject.toml
- setup.cfg
- setup.py # optional, needed to make editable pip installs work
If you add conda, you may need 2 additional files:
- meta.yaml # to build your conda package
- environment.yml
With this much boilerplate, I understand why people are creating tools like flit.
I still don't understand why in 2021 `pip`, which is the standard package and dependency manager in Python, cannot:
1) build a package from some spec
2) generate the scaffolding needed to build such package
`gem` from Ruby does that (well, it doesn't do 2, AFAIK, but at least it does 1).
Slightly off topic, but does anyone recommend any great guides for building browser JavaScript / Node.js packages, i.e. listing recommended linters, profilers, testing strategy, documentation template, or a project structure?
Somewhat related: I have an example repository which I've been using to keep track of the tools I use, aimed at people in a research lab who are relatively new to Python. I made it because the existing example/template repositories I found don't gel nicely with the way I like to set up and think about things. Here it is -- hope you find it useful:
It's more the latter, if you read the rest of the thread you'll get a feel for the issue - there's no broad consensus on what Python should do for package management.
Flit is not an unreasonable pick, but it's not a silver bullet.
I would say it is quite up to date. First, it is using pyproject.toml which is now the standard way to define build requirements for Python packages. Second, its collection of additional (linting etc) tools is pretty solid; there are potential alternatives in few cases (e.g. I would personally use Poetry rather than flit, and wouldn't use make for development scripts) but that is pretty much it.
I've never even heard of pyproject.toml and I open github projects behind packages to vet them (to a minimal degree, but still, making sure it's not a typosquat and still maintained and often also reading a bit of source code) on at least a weekly basis. It's not always python of course, but often enough. Maybe it's just some freak coincidence that I somehow never saw it or just don't recall while really it's mostly everywhere, but this broad statement for something I've never even heard of makes me think of JavaScript and the 'standard framework' that changes every six months.
With PEP-518 being 5 years old it's still relatively new, and most tools have implemented the support for it relatively recently. The key introduction article was written exactly one year ago: https://snarky.ca/what-the-heck-is-pyproject-toml/
I work in a very small company where we are building the plane as we learn to fly, and I'm the only person there with any programming experience. I've been working on trying to improve my hobbyist-level (at best) knowledge of the Python ecosystem over the last year or so.
I've gotten to the point where a number of smaller tools I've put together can now be used in larger projects. I learned the hard way that just copying files around makes it hard to know which version is in that project, and upgrading, especially once there is more than one file, becomes a lot tougher. I learned the harder way that trying to link the same file into multiple projects is a great way to really screw things up.
About 4 or 5 months ago I made a real effort to try to learn how to use virtual environments (pipenv) and packaging to make it so that if I update one of the smaller tools, I don't clobber all the downstream projects that rely on it. I wanted to make it so that when I update something to add features or change things, I can go back and fix older projects it's used in at my leisure. I haven't even begun to touch on unit testing, and I have no clue what linting is. Things are kind of working so far, but it feels very hacky and fragile, and I know it can (and SHOULD) be better.
All of this stuff around packaging and being able to install those packages is very daunting, and trying to stumble on the right tutorials is extremely frustrating. The vast majority of them assume I want to share my stuff with the world on PyPI, or that I have servers available to me to create private PyPI indexes, but I don't. Yet I still want my packages to "resolve their own dependencies" when I install or upgrade them.
And when it comes to learning things like testing, the few tutorials I've looked at either use different tools to do it, or their examples are so oversimplified that when I look at my own code, I don't know where to begin.
I say all of this because looking at this tutorial, it's more of the same. I want to make my code better. I want to make it easier to use those smaller projects in larger projects. But then it says things like "Every solid open-source project runs cloud tests after each commit, so we will too," but it doesn't do anything to explain what that is or why it should be done, besides "everyone does it, so you should, too."
I think what makes it even harder is that when something like this gets shared, there are so many conflicting opinions. Some people say to just use setuptools, other say that setuptools is on its way out and to use pyproject.toml (or some other tool) instead. It's all just so... hard!
I'm sorry. This is coming off a bit ranty, and that's not what I intended. I'm just feeling frustrated and I'm not sure of a better way to express that I need help with finding help. There are even a lot of things that I'm sure I need help with, but I just don't know what they are. It makes it really hard to verbalize what I need to another person, let alone to get the right words into a search engine to take me there.
Dynamic version numbers based on git tags are absolutely necessary to me, to ease my continuous integration practice; that's also what openstack/pbr does, among probably other things.