Unfortunately, ruff is very inconsistent and differs in many ways from the flake8 plugins it tries to emulate. Many rules get confused by irrelevant context, so ruff misses things that the equivalent flake8 plugin still finds. Its automatic fixing will happily introduce other issues that it doesn't find until the next run. I tried pretty hard to use it and gave up; it's just not remotely ready.
If you notice differences vs. the originating plugins, we'd definitely appreciate it if you could file an issue! We tend to be very responsive, especially when it comes to matters of correctness.
Candidly, for Flake8 plugins, my experience is that Ruff tends to be more consistent, more robust in its inference and, at this point, more extensively tested than the original implementations -- both via our own testing and via the significant number of projects and companies that now use Ruff in production. (As compared to Pylint, though, we catch fewer issues, since Pylint does some type inference across files, which Ruff doesn't support yet.)
Ruff is also designed such that we will iteratively lint-and-fix until there are no more fixable issues, so if you've seen the linter introduce _new_ fixable diagnostics, that would be a bug too.
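To illustrate what "iteratively lint-and-fix until there are no more fixable issues" means, here is a minimal sketch of a fixed-point fix loop. It is not Ruff's actual implementation: the two regex-based rules and the names `fix_not_is`, `fix_eq_none` and `fix_until_stable` are hypothetical, chosen only so that one fix exposes a new fixable issue for the other rule.

```python
import re
from typing import Callable, List, Tuple

# A "rule" here is just a function that returns the source with its fix applied
# (or the source unchanged if it found nothing). Hypothetical toy rules only.
Rule = Callable[[str], str]


def fix_not_is(source: str) -> str:
    # Toy rule: `not x is None` -> `x is not None`
    return re.sub(r"\bnot (\w+) is None\b", r"\1 is not None", source)


def fix_eq_none(source: str) -> str:
    # Toy rule: `== None` -> `is None`
    return re.sub(r"== None\b", "is None", source)


def fix_until_stable(
    source: str, rules: List[Rule], max_passes: int = 10
) -> Tuple[str, int]:
    """Re-run every rule until a full pass changes nothing (or we hit the cap)."""
    passes = 0
    while passes < max_passes:
        updated = source
        for rule in rules:
            updated = rule(updated)
        passes += 1
        if updated == source:
            break  # fixed point reached: no rule has anything left to fix
        source = updated
    return source, passes


# fix_eq_none creates the pattern that fix_not_is targets, so a single pass
# in this rule order is not enough.
fixed, passes = fix_until_stable(
    "if not value == None: pass", [fix_not_is, fix_eq_none]
)
print(fixed)   # if value is not None: pass
print(passes)  # 3 (two passes that change something, one that confirms stability)
```

With a loop like this, a fix that produces a new fixable diagnostic gets picked up in the same run rather than the next one. The `max_passes` cap is an assumption added here to guard against two rules undoing each other forever.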
This isn’t really accurate in my experience. I’ve been using it on several projects large and small for some time now and can only recall one time autofix introduced an issue of any sort and that was many versions ago. It will also handle cascading fixes just fine.
> It's automatic fixing of issues will happily introduce other issues that it doesn't find until the next run.
I think a few of the "bad" autofixes have recently been disabled by default; fixes are now separated into "safe" and "unsafe". (For example, it used to always change `x == True` to `x is True`, which broke common numpy selection patterns.)
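For anyone wondering why that rewrite is unsafe, here is a rough sketch. To keep it dependency-free it uses a hypothetical `BoolColumn` class as a stand-in for a numpy boolean array; the only property it borrows from numpy is that `==` compares elementwise instead of returning a single bool. The `select` helper is also made up for the example.

```python
class BoolColumn:
    """Hypothetical stand-in for a numpy boolean array: `==` is elementwise."""

    def __init__(self, values):
        self.values = list(values)

    def __eq__(self, other):
        # Like numpy, return one result per element rather than a single bool.
        return [v == other for v in self.values]


def select(data, mask):
    # Keep the items whose corresponding mask entry is truthy.
    return [item for item, keep in zip(data, mask) if keep]


data = [10, 20, 30]
flags = BoolColumn([True, False, True])

elementwise = flags == True  # noqa: E712 -- calls __eq__, yields a per-element mask
identity = flags is True     # identity check on the object itself, never overloaded

picked = select(data, elementwise)

print(elementwise)  # [True, False, True]
print(picked)       # [10, 30]
print(identity)     # False
```

`is` cannot be overloaded, so after the rewrite the expression stops producing a mask and collapses to a plain `False`. With real numpy arrays the selection then silently comes back empty instead of raising, which is why treating this fix as "unsafe" makes sense.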
If you have a smaller codebase which has had consistent good practices you won't find many things anyway. I have a legacy codebase where ruff finds over 25,000 issues. Many times when I fix all of the issues for a given code (say UP031 just to pick an example) I can then run flake8 and find more valid examples of that same issue. On my desktop running flake8 takes about 7s, so there's literally no reason for me to bother with ruff if it is going to be so much less reliable, even if it is 1000x faster.
Ruff is OK, I guess, if you barely need it, but I support a medium-sized (about 400 kLOC), quite old, and extremely poorly managed codebase that powers a profitable business. Ruff just can't cope with it, at least as of a few months ago.
Kinda funny, but my experience was quite the opposite. When migrating our codebase from flake8 to ruff, I actually found a number of bugs in flake8 and flake8-bugbear that were resolved by a more correct implementation in ruff.