
> I recommend using two tools together: Black and isort.

Black formats things differently depending on the version. So a project with 2 developers, one running Arch and one running Ubuntu, will get formatted back and forth.

isort's completely random… For example, the latest version I tried decided to alphabetically sort all the imports, regardless of whether they are part of the standard library or 3rd party. This is a big change of behaviour from what it was doing before.

All those big changes introduce commits that make git bisect generally slower. Which might be awful if you also have some C code to recompile at every step of bisecting.
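
(For context, a scripted bisect looks roughly like this; the build-and-test script name is hypothetical:)

  git bisect start
  git bisect bad HEAD
  git bisect good v1.2.0                 # last known-good tag, illustrative
  git bisect run ./build_and_test.sh     # rebuilds the C code at every step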




> Black formats things differently depending on the version.

Then add black as part of your environment with a specific version...


Or wait until a more sensible formatting tool comes along.

Reformatting the whole codebase with every new version isn't great. It's also very slow.


Install pre-commit: https://pre-commit.com/

Set black up in the pre-commit config with a specific version. When you make a commit, it will run Black on the files being committed, using that specific version. As it only formats the files in the commit, it's fast. As it's a specific version, it's not going back and forth.
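
A minimal .pre-commit-config.yaml sketch (the rev shown is only an example; pin whatever version your project standardises on):

  repos:
    - repo: https://github.com/psf/black
      rev: 22.3.0
      hooks:
        - id: black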

I hope this solves your issues.


It doesn't… people use a million different distributions. Forcing everyone to use a single version of black means that people will just not bother with your project.

The authors of black just don't understand that it'd be ok to introduce new rules to format new syntax, but it isn't ok to just change how previous things work.


This is mostly nonsense and FUD. We have virtual environments, requirements files, and setup.py with extras_require, all of which can be used to manage versions without relying on the particular packages installed on an OS. Most people contributing to open source would be familiar with at least some of these methods, and if they are not, it's a good opportunity for learning.

And if they are not, then maintainers can pull, run black over the diff, and commit.

CI prevents poorly formatted code from entering main.

The actual changes between Black versions of late have been minor. You're making a mountain out of a molehill.

Having a tool that dictates formatting is a lot less oppressive to new developers than 100 comments nitpicking style choices.


> Having a tool that dictates formatting is a lot less oppressive to new developers than 100 comments nitpicking style choices.

Yes, it would work very well if said tool didn't change its mind every 6 months, generating huge commits at every version bump.

> Most people contributing to open source would be familiar with at least some of these methods, and if they are not, it's a good opportunity for learning.

You seem unfamiliar with the fact that other people aren't necessarily clones of yourself and might not behave like you.

> CI prevents poorly formatted code from entering main.

If you run black on CI… which of course I don't, since every time they release a new version the CI would start failing.

And no, pinning is not a "solution"… it's at best a workaround for badly written software.

> The actual changes between Black versions of late have been minor. You're making a mountain out of a molehill.

If you have 10 lines of code, I guess your diff can't be more than 10 lines. If you have more than 10 lines…


I'm working in a Python code base with multiple millions of lines, and not for the first time. It's not the problem you make it out to be. The changes between black versions have been almost unnoticeable for years.


Since developing in Python should be done in a virtual env to start with, I fail to see how this would be any problem. The version of Black documented in the pre-commit config will be installed in the project's venv; problem solved.


I think you haven't understood what I've told you. Please look into pre-commit and using it.


I understood but I disagree.


That would only make it more likely that two developers would be using two different versions of Black.

The further you get away from the project folder the more likely each developer is to have a different environment.


Just put a versioned Black into your pre-commit YAML, check that into your source tree, and forget about it.


So now we have one more (useless) build requirement for developers?


pre-commit is very useful, in my opinion. When organising code from a lot of Python developers at least, getting the boring stuff like formatting, import ordering, linting, mypy etc. sorted is a time saver.


Do you know how slow all of that is? Do you want to run all of that on every commit? The result would be people making one monolithic commit rather than incremental commits that are easy to review one by one.


I do like pre-commit for enforcing linted and typed code but the speed (or lack thereof) does hurt.


Yes, so fast that it makes no difference at all to me. How slow do you think black is?


I'm sure you will get many contributions to your project if you refuse people with the wrong distribution from contributing.


I think by now it's a reasonable requirement for contributors to use a virtualenv when working on a project.


Two developers on the same Python project should also use the same version... with Poetry it is straightforward to keep track of dev dependencies. reorder-python-imports is an alternative to isort: https://github.com/asottile/reorder_python_imports
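
For example, a pyproject.toml sketch pinning the formatters as dev dependencies (versions are illustrative; newer Poetry releases use a [tool.poetry.group.dev.dependencies] table instead):

  [tool.poetry.dev-dependencies]
  black = "22.3.0"
  isort = "5.10.1"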


> Two developers on the same python project should also use the same version

Why? It is expected for the thing to run on different python versions and different setups… what's the point of forcing developers to a uniformity that will not exist?

It's actually better to NOT have this uniformity, so issues can get fixed before the end users complain about them.


Tooling matters, pretending that it doesn't isn't really going to help you. But you do you...


> So a project with 2 developers, one running Arch and one running Ubuntu, will get formatted back and forth.

Any team of developers who aren't using the exact same environment is going to run into conflicts.

At the very least, there must be a CI job that runs quality gates in a single environment in a PR and refuses to merge until the code is correct. The simplest way is to just fail the build if the job results in modified code, which leaves it to the dev to "get things right". Or you could have the job do the rewriting for simplicity. Just assuming the devs did things the right way before shipping their code is a problem waiting to happen.

To avoid CI being a bottleneck, the devs should be developing in the same environment as the CI quality gates (or just running the gates locally before pushing). The two simple ways to do this are a Docker image or a VM. People who hate that ("kids today and their Docker! get off my lawn!!") could theoretically use pyenv or poetry to install exact versions of all the Python stuff, but different system deps would still lead to problems.
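
A sketch of such a quality gate as a GitHub Actions job (the job name and pinned versions are illustrative):

  format-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
      - run: pip install black==22.3.0   # the same pinned version devs use locally
      - run: black --check .             # fail the build if anything would be reformatted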


> Any team of developers who aren't using the exact same environment is going to run into conflicts.

You've never done any open source development I guess?

Do you think all the kernel developers run the same distribution, the same IDE, the same compiler version? LOL.

Same applies for most open source projects.


Would you please stop breaking the site guidelines? You've been doing it repeatedly, unfortunately. We want thoughtful, curious conversation here—not flamebait, unsubstantive comments, and swipes.

If you wouldn't mind reviewing https://news.ycombinator.com/newsguidelines.html and taking the intended spirit of the site more to heart, we'd be grateful.


A lot of modern open source projects include a lock file or some other mechanism that ensures that all contributors use the same versions of certain key tools. Obviously there are still going to be some differences in the environment, but for things like formatting, linting, etc, it's generally fairly easy to lock down a specific version.

In Python, the easiest way to achieve this is using Poetry, which creates a lock file so that all developers are using a consistent set of versions. In other languages, this is generally the default configuration of the standard package manager.


Using lock files is a good way to make sure your software never ends up in a distribution and in the hands of users.


The popular Rust tool "ripgrep" uses a lock file for development (you can see it in the GitHub repo), and yet is in the official repositories for homebrew, various Windows package managers, Arch, Gentoo, Fedora, some versions of openSUSE, Guix, recent versions of Debian (and therefore Ubuntu), FreeBSD, OpenBSD, NetBSD, and Haiku.

With all due respect, I don't think you're correct.


And how much Rust software is packaged in distributions? Almost none. They haven't figured out the procedures, because distributions really, really don't want pinned stuff around.

Homebrew, Windows, Arch all have very, very relaxed processes to enter. There is no QA; you can just do whatever you want. I mean distributions more like Fedora and Debian.


Bottom line is that the lock file in ripgrep's repo hasn't prevented it from being packaged. And I haven't heard of any distro maintainer complain about any lock file in any Rust program ever. So you're just plain empirically wrong about lock files preventing Rust programs from being packaged.

You've now moved on to talking about something else, which is "how much Rust software is packaged." Well, apparently enough that Debian has Rust packaging policy[1]. I'll give you one guess at what isn't mentioned in that policy. Give up? Lock files!

[1]: https://wiki.debian.org/Teams/RustPackaging/Policy


> So you're just plain empirically wrong about lock files preventing Rust programs from being packaged.

My mistake, it seems Rust packagers gave up on decent packaging. It isn't so for the Python policy, I can assure you :)


I haven't heard anyone complain about how Rust programs are packaged. Take your passive aggressive bullshit somewhere else.


How Linux package managers handle these newer languages with their own package managers (including rust) is an ongoing pain point. Here’s an article from 2017 about it, and I don’t know if things have improved:

https://lwn.net/Articles/712318/


I didn't say it wasn't a pain point or that there weren't challenges. I said I don't hear about people complain about how Rust programs are packaged. Not that the packaging of Rust programs (among others) itself doesn't present interesting challenges depending on the policies of a particular distro. Archlinux, for example, has far fewer problems than Debian because of differences in policy.

The poster I was responding to was literally posting false information. I'm correcting it. This doesn't need to turn into a huge long sprawling discussion about packaging Rust programs. The main point that I was making is that lock files do not prevent Rust programs from being packaged. bombolo then went off on their own little tangents spouting nonsense without bothering to acknowledge their mistake.


I contribute to packaging. But thanks for teaching me about something I know already.

Now try to get something using an obsolete version of some python module into Fedora or Debian and let me know how it goes… It would not be accepted as it is. It'd be patched to work with a current one or just rejected.


I never said a single word about Python. Whether you contribute to packaging or not has nothing to do with whether you're posting false information. If anything, it makes what you've said worse. You should know better.

Just stop spreading misinformation. And the courteous thing to do is to acknowledge an error when it's pointed out instead of doubling down and redirecting as if no error was made.


The article is about python, the thread is about pinning dependencies in python.


> Using lock files is a good way to make sure your software never ends up in a distribution and in the hands of users.

> And how much Rust software is packaged in distributions? Almost none.

> They haven't figured out the procedures

You're clearly talking about Rust in the second two comments. Your original comment was just a general pronouncement about lock files. You could perhaps be given the benefit of the doubt that you were only thinking about Python, but someone else interpreted your comment broadly to apply to any language with lock files. If you really only meant it to be specific to Python, one would reasonably expect you to say, "Oh, sorry, I was only talking about Python. Not Rust. Their situation might be different."

But no. You doubled down and started spouting nonsense. And you continue to do so!

> Where dependency pinning is the norm, there is a culture of breaking API compatibility.

Rust does not have this problem. It is the norm to use lock files for Rust programs, but there is no culture of "breaking API compatibility" without appropriate signaling via semver.

This entire exchange is a classic example of Brandolini's law: https://en.wikipedia.org/wiki/Brandolini%27s_law

It's easy for you to go off and spout bullshit. You've even been corrected by someone else who maintains distro packages. But it's a lot harder to correct it. You wriggle and squirm and rationalize and deflect.


The distros will eventually stop this dangerous practice of mixing and matching versions for all dependencies. It can only work for a small set of system components, which is what every other OS does.


It's more dangerous to let people pin dependencies and have vulnerable libraries in use forever.


Who says the distros are using the lock file? AFAIK, Debian doesn't use ripgrep's lock file, for example. They don't have to, because of semver.


What's the point of the lockfile then?


For people that want to build with the exact set of dependency versions tested by upstream. Just because some distros don't use them doesn't mean there isn't any point.


Distros can keep their own lock file that is based on their own release branch's versions. If it doesn't build, the pkg maintainer will either file a bug report or make a patch, or neither.

Source: I maintain distro packages.


But something building doesn't mean that it will work.

There can be breaking changes other than function signature changes.

Where dependency pinning is the norm, there is a culture of breaking API compatibility. And you might not have a compiler error to inform you that the API has changed. Sometimes all you have is a commit message.


> All those big changes introduce commits that make git bisect generally slower.

Bisection search is log2(n) so doubling the number of commits should only add one more bisection step, yes?

> Which might be awful if you also have some C code to recompile at every step of bisecting.

That reminds me, I've got to try out ccache (https://ccache.dev/ ) for my project. My full compile is one minute, but the three files that take longest to compile rarely change.
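
(On Debian/Ubuntu-style systems, enabling it is usually as simple as the following; paths vary by distro:)

  sudo apt install ccache
  export PATH="/usr/lib/ccache:$PATH"    # ccache now masquerades as cc/gcc/g++
  make                                   # repeated builds hit the cache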


Perhaps the poster meant that the contents of the commits themselves make bisection slower? When commits touch a lot of files unnecessarily, incremental build systems have to do more work than they otherwise would.


Could be? I've not heard of that problem, but I don't know everything.

I'm not used to Python code (which is all that black touches) being notably slow to build, nor am I used to incremental build systems for Python byte compilation.

And I expect that in a project with 2 developers which is big enough for things to be slow, most of the files will be unchanged, semantically speaking, only swapping back and forth between two syntactically re-blackened representations, so wouldn't a caching build system be able to cache both forms?

(NB: I said "caching build system" because an incremental build system, which expects linear time ordering, wouldn't be that helpful in bisection, which jumps back and forth through the commits.)


Both… and a version jump in the formatting tool will basically touch every single file.


Only the files that changed because the rules changed, which shouldn't be frequent. I don't remember Black changing radically since its creation; do you have an example of some widespread syntax change?


Just take a codebase and run black from different ubuntu releases.

The funny thing is that if you run the versions backwards, you will NOT end up with files identical to the ones you started with.


> Bisection search is log2(n) so doubling the number of commits should only add one more bisection step, yes?

And testing 1 extra step could add one more 1-hour build, yes?


It could, certainly. But

1) you don't have one black commit for every non-black commit, do you? Because the general best practice is to do like kuu suggested and have a specific black version as part of the development environment, with a pre-commit hook to ensure no random formatting gets introduced.

2) assuming 500 commits in your bisection, that's, what, about 9 compilations you'll need to do, so it will take you 9 hours to run. So even with a black commit after every human commit, that's, yes, 1 hour more, but it's also only 11% longer.

Even with only 10 non-black commits and 10 black commits, your average bisect time will only increase from 3.6 hours to 4.6 hours, or 30% longer.

I'm curious to know what project you are on with 1 hour build times and the regular need for hours-long bisection search, but where there isn't a common Python dev environment with a specific black version. Are you using ccache and/or distributed builds? If not, why isn't there a clear economic justification for improving your build environment? I mean, I assume developers need to build and test before commit, which means each commit is already taking an hour. Isn't that a waste of expensive developer time?

And, I assume it's not the black formatting changes which result in hour-long builds. If they do, could you explain how?


As I said in other comments, if you try to force contributors to reproduce exactly your local setup, you will be left with no contributors. Which is why you set up a CI to run the tests… because people will most likely not.

As for build times, it was an extreme example. But even an extra step taking 5 extra minutes is very annoying to me…


> if you try to force contributors to reproduce exactly your local setup, you will be left with no contributors. ...

That's not been my experience. To the contrary, having a requirements.txt means your contributors are more likely to have a working environment, as when your package depends on package X but the contributor has a 5-year-old buggy version of X, doesn't realize it, and it causes your program to do the wrong thing.

In any case, your argument only makes sense if no one on the project uses black or another code formatter. Even if you alone use it, odds are good that most of your collaborators' commits will need to be reformatted.

> .. an extra step taking 5 extra minutes ...

How do black reformatting changes cause an extra 5 minutes? What Python code base with only a couple of contributors and no need for a requirements.txt takes 5+ minutes to byte-compile and package the Python code, and why?

Adding 5 minutes to your build means your bisections are taking at least an hour, so it seems like focusing on black changes is the wrong place to look.


> How do black reformatting changes cause an extra 5 minutes?

Did you even read my comments?

Black reformatting causes more steps in bisecting. And it's quite common for a test suite to take 5+ minutes.


None of your comments mention running the full test suite, only build.

When I've used bisection, I've always had a targeted test that I was trying to fail, not the entire test suite. This is because the test suite at the time of that commit wasn't good enough to detect this failure. Otherwise it would have failed with that commit.

Instead, a new failure mode is detected, an automated test developed, and that used to probe the history to identify the commit.

Why are your bisections doing the full suite?

> Black reformatting causes more steps in bisecting

Yes, of course it does. But it's log2(n).

The worst-case analysis I did assumed there was a black commit after every human commit. This is a bad practice. You should be using black as a pre-commit hook, in which case only your new collaborator's commits will cause re-formats. And once they are onboard, you can explain how to use requirements.txt in a virtualenv.

If only 10% of the commits are black reformattings, which is still high!, then a bisection of 100 human commits (plus 10 black commits) goes from about 6.64 tests to 6.78 tests, which with a 5 minute test suite takes an additional 42 seconds.

If it's 3% then your bisection time goes up by 13 seconds.

If you are so worried about 13 seconds per bisection then how much time have you spent reducing the 5 minute test suite time? I presume you run your test suite before every commit, yes? Because if not, and you're letting CI run the test suite, then you're likely introducing more commits to fix breaking tests than you would have added via black, and taking the mental task switch hit of losing then regaining focus.


> This is a bad practice. You should be using black as a pre-commit hook

I would reject such commits in review.

A human might add one or two items to a list and black might decide it's now too long, and make 1 line into 10 lines.

Now I have to manually compare the list item by item to figure out what has changed.

So I normally require formatting to be done in a separate commit, because I don't want to review the larger than necessary diffs that come out doing it within the same commit.


> A human might add one or two items to a list and black might decide it's now too long, and make 1 line into 10 lines.

A human might add one or two items to a list, decide it's now too long, and make 1 line into 10 lines.

Including the same hypothetical first contributor you mentioned earlier, who you think will find using requirements.txt too big a barrier to entry.

Onboarding occurs either way.

I get that you don't like using black - and that's fine! I don't use black on my project either.

But it seems like you're trying to find some other reason to reject black, and constructing hypotheticals that don't make any sense.

Just say you don't like black's choices, and leave it at that.


> A human might add one or two items to a list, decide it's now too long, and make 1 line into 10 lines.

At which point I tell him to split formatting and actual changes into different commits (see https://mtlynch.io/code-review-love/).

> I get that you don't like using black - and that's fine! I don't use black on my project either.

Well according to this comment, it's because we are noobs: "the people that disagree just haven't figured out that they're wrong yet"

> But it seems like you're trying to find some other reason to reject black, and constructing hypotheticals that don't make any sense.

After the n-th time I have to re-setup the entire virtual env on a different branch just to re-run black and isort to backport a fix to a release branch… it does get tiring.

I presume most people here just do websites and don't really have a product that is released to customers who pay to support old versions for years, so black changing syntax is a 1 time event rather than a continuous source of daily pain.

But it seems the commentators here don't have the experience to know there might be a use case they didn't think of.


> and don't really have a product that is released to customers who pay to support old versions for years

My main product is 12 years old, with paying support customers, and with bugfix branches for older releases.

> just to re-run black and isort to backport a fix to a release branch

Great! That's an excellent reason. But it has nothing to do with bisection.


> if you try to force contributors to reproduce exactly your local setup

  python -m venv venv
  pip install -r requirements.txt
Do you consider that imposing? I assumed that was standard. Don't basically all Python projects in existence use something like it?


> isort's completely random… For example, the latest version I tried decided to alphabetically sort all the imports, regardless of whether they are part of the standard library or 3rd party. This is a big change of behaviour from what it was doing before.

This is not isort! isort has never done that. And it has a formatting guarantee across major versions, which it actively tests against projects online that use it, on every single commit to the repository: https://pycqa.github.io/isort/docs/major_releases/release_po...


It did this to me today…


Are you using any custom settings?


No. Seems they changed the default ordering.


Hi! I said this with more certainty than I should have. Software can always have bugs! For reference, I wrote isort, and my response came from the perspective that I have certainly worked very hard to ensure it doesn't have any behavior that is random or non-deterministic.

From your description, it sounds like someone may have turned on force-alphabetical-sort (if this is in a single project). See: https://pycqa.github.io/isort/docs/configuration/options.htm.... You can run `isort . --show-config` to introspect the config options isort finds, and where it finds them, from within a project directory.

The other thing I can think of is that, coming from isort 4 -> 5, I wouldn't expect it to fully ignore import groupings, but maybe it doesn't find something it used to find automagically from the environment for determining a first_party import. If that's the case, this guide may be helpful: https://pycqa.github.io/isort/docs/upgrade_guides/5.0.0.html. If none of this helps, I'd be happy to help you diagnose what you're seeing.
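
For instance, a hypothetical explicit configuration that pins down first-party detection and rules out the alphabetical setting (the package name is a placeholder):

  # setup.cfg
  [isort]
  known_first_party = mypackage      # tell isort what counts as first-party
  force_alphabetical_sort = false    # confirm this wasn't enabled somewhere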


I upgraded the underlying Docker image… so the Python version and all dependencies got bumped. I did not change any configuration or script.

I now use version 5.6.4, from 4.3.4. In the end we passed a flag to keep the old behaviour, but in my mind behaviours shouldn't just change.


> So a project with 2 developers, one running Arch and one running Ubuntu, will get formatted back and forth.

You should never develop using the system Python interpreter. I recommend pyenv [0] to manage the installed interpreters, with a virtual environment for the actual dependencies.

[0] https://github.com/pyenv/pyenv
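
Something along these lines (the Python version is just an example):

  pyenv install 3.10.4
  pyenv local 3.10.4                # writes .python-version for the project
  python -m venv .venv
  . .venv/bin/activate
  pip install -r requirements.txt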


> You should never develop using the system Python interpreter.

Yes yes… never ever make the software run in a realistic scenario! You might end up finding some bugs and that would be bad! (I'm being sarcastic)


> Black formats things differently depending on the version. So a project with 2 developers, one running Arch and one running Ubuntu, will get formatted back and forth.

Use pre-commit (https://pre-commit.com/) so that everyone is on the same version for commits.


What's the alternative? YAPF is even worse - it will flip-flop between styles even on the same version! Its output is much less attractive, and there are even some files we had to exclude because it never finishes formatting them (Black worked fine on the same files).

Not using a formatter at all is clearly worse than either option.


> Not using a formatter at all is clearly worse than either option.

Why?

Do you hate terse diffs in git?


Because some people are really bad at formatting code manually, and constantly nitpicking them about it is both tedious and antagonistic. It's much better for a faceless tool to just remove formatting from the equation entirely.

I think the sane part of the software engineering world has realised that auto-formatting is just the right way to do it, and the people that disagree just haven't figured out that they're wrong yet.

Maybe you meant "why is Black specifically better than no autoformatting, given that it isn't perfectly stable across versions?" in which case the answer is:

a) In practice it is very stable. Minor changes are easily worth the benefits.

b) They have a stability guarantee of one calendar year which seems reasonable: https://black.readthedocs.io/en/stable/the_black_code_style/...

c) You can pin the version!!
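
For example, a single pinned line in requirements.txt (version illustrative):

  black==22.3.0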


> the people that disagree just haven't figured out that they're wrong yet.

This is unnecessarily confrontational. Please read my other comments where I consider the extra effort that automatic formatting causes for code reviews.

> In practice it is very stable.

It has never happened to me that I upgraded Black and it didn't change its opinion about already Black-formatted code.

> Minor changes are easily worth the benefits.

It doesn't matter how minor they are. A 1 bit difference is still going to fail my CI.

> You can pin the version!!

I usually do, but working with old releases that must be maintained means that I can't cherry pick bug fixes from one branch to the other, because black fails my CI.


> I consider the extra effort that automatic formatting causes for code reviews.

Why would it cause extra effort? Not having automatic formatting causes extra effort because you have to tell people to fix their formatting!

> It has never happened to me that I upgraded Black and it didn't change its opinion about already Black-formatted code.

I'm sure small things change but large differences? No way. Even the differences between YAPF and Black aren't that big in most cases.

> It doesn't matter how minor they are. A 1 bit difference is still going to fail my CI.

Right but you have a pre-push hook to format the code using the same version of Black as is used in CI. Then CI won't ever fail.

> I can't cherry pick bug fixes from one branch to the other, because black fails my CI.

Cherry pick, then run Black. Sounds like you have a very awkward workflow to be honest.


> Why would it cause extra effort?

As I said… because a one-word change easily ends up changing multiple lines, and from the diff it's not clear it's just a one-word change. So… extra effort.

> Right but you have a pre-push hook to format the code using the same version of Black as is used in CI. Then CI won't ever fail.

No I don't. My release branch and old-release branch use different versions. Such a thing would need to understand which base branch I'm working on, or recreate the venv continuously.

> Cherry pick, then run Black. Sounds like you have a very awkward workflow to be honest.

Seems to me you don't support older versions, in which case everything is easy.



