Show HN: Patchwork – Open-source framework to automate development gruntwork

meiraleal · 2024-07-27T03:05:40.000000Z

PR reviews are the one thing you sure don't want an LLM doing.

Carrok · 2024-07-27T03:50:42.000000Z

Please elaborate.

While an LLM might obviously miss functional problems, it feels extremely well suited for catching “stupid mistakes”.

I don’t think anyone is advocating for LLMs merging and approving PRs on their own, but they can certainly provide value to the human reviewer.

cuu508 · 2024-07-27T06:47:09.000000Z

They can lull the human reviewer into a false sense of security.

"Computer already looked at it so I only need to glance at it"

throwthrowuknow · 2024-07-27T09:55:21.000000Z

I don’t know what your process is but if someone else has reviewed a PR before I take my turn I don’t ignore the code they’ve looked at. In fact I take the time to review both the original code as well as their comments or suggestions. That’s the point of review after all, to verify the thinking behind the code as well as the code itself and that applies equally to thoughts or code added by a reviewer.

spartanatreyu · 2024-07-29T01:08:41.000000Z

> LLM [...] feels extremely well suited for catching “stupid mistakes”.

No.

Linters are extremely well suited for catching stupid mistakes.

LLMs are extremely well suited for the appearance of catching stupid mistakes.

Linters will catch things like this because they can go through checking and evaluating things logically:

> if (

> isValid(primaryValue, "strict") || isValid(secondaryValue, "strict") ||

> isValid(primaryValue, "loose" || isValid(secondaryValue, "loose"))

> //...............................^^^^ Did we forget a closing ')'?

> ) {

> ...

> }

LLMs will only highlight exact problems they've seen before, miss other problems that linters would immediately find, and hallucinate new problems altogether.
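
Worth noting about the quoted condition: its parentheses balance, so it still parses. The misplaced `)` changes what gets evaluated instead of breaking the syntax. A minimal sketch of the effect, assuming a made-up `isValid` (the real one is not shown in the thread):

```javascript
// Hypothetical validator, for illustration only: "strict" requires a
// non-empty string, "loose" accepts any non-null value.
function isValid(value, mode) {
  if (mode === "strict") return typeof value === "string" && value.length > 0;
  return value !== null && value !== undefined;
}

const primaryValue = null;
const secondaryValue = 42; // passes "loose", fails "strict"

// As quoted: the ")" lands after the second call instead of after "loose".
// `"loose" || isValid(...)` always short-circuits to "loose", so the
// loose check on secondaryValue never runs.
const asWritten =
  isValid(primaryValue, "strict") || isValid(secondaryValue, "strict") ||
  isValid(primaryValue, "loose" || isValid(secondaryValue, "loose"));

// Presumably intended: each loose check is its own call.
const asIntended =
  isValid(primaryValue, "strict") || isValid(secondaryValue, "strict") ||
  isValid(primaryValue, "loose") || isValid(secondaryValue, "loose");

console.log(asWritten, asIntended); // false true
```

This is the kind of always-short-circuiting expression that ESLint's `no-constant-binary-expression` rule is designed to flag, which fits the point that a linter can catch it mechanically.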

luckilydiscrete · 2024-07-29T08:02:49.000000Z

While true in a subset of problems, linters will also miss stupid mistakes because not everything is syntactical.

AI for example can catch the fact that `phone.match(/\d{10}/)` might break because of spaces, while a linter has no concept of a correct "regex" as long as it matches the regex syntax.
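
To illustrate the regex point: the pattern is syntactically fine, yet it rejects spaced numbers and accepts longer digit runs. The sample inputs and the `isTenDigitPhone` helper below are assumptions for the sketch, not anything from the thread:

```javascript
const naive = /\d{10}/;
const matches = (phone) => phone.match(naive) !== null;

console.log(matches("5551234567"));        // true
console.log(matches("555 123 4567"));      // false: spaces break the run of digits
console.log(matches("order-12345678901")); // true: unanchored, so 11 digits also match

// One possible fix (an assumption about intent): strip non-digits first,
// then require exactly ten digits.
function isTenDigitPhone(phone) {
  const digits = phone.replace(/\D/g, "");
  return /^\d{10}$/.test(digits);
}

console.log(isTenDigitPhone("555 123 4567"));      // true
console.log(isTenDigitPhone("order-12345678901")); // false
```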

I don't think anyone is arguing that replacing linters with AI is the answer, instead a combination of both is useful.

rohansood15 · 2024-07-29T04:35:28.000000Z

Linters are great at finding syntactical errors like the case you mentioned. But LLMs do a better job at finding logical flaws or enforcing things like non-syntactic naming conventions. The idea is not to replace linters, but to complement them. In fact, one of the flows we're building next is fixing linting issues that linters struggle to fix automatically.

rohansood15 · 2024-07-27T03:36:27.000000Z

I agree and disagree. You definitely need someone competent to take a look before merging in code, but you can do a first pass with an LLM to provide immediate feedback on any obvious issues as defined in your internal engineering standards.

Especially helpful if you're a team where there's a wide variance in competency/experience levels.

aaomidi · 2024-07-27T04:11:50.000000Z

Until that immediate feedback is outright wrong and now you’ve sent them on a wild goose chase.

rohansood15 · 2024-07-27T04:45:30.000000Z

This is where prompting and context are key - you need to keep the scope of the review limited and well-defined. And ideally, you want to validate the review with another LLM before passing it to the dev.

Still won't be perfect, but you'll definitely get to a point where it's a net positive overall - especially with frontier models.
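
A rough sketch of that flow, with one narrowly scoped rule per review call and a second pass that filters findings before they reach the dev. Everything here is hypothetical: `reviewer` and `validator` stand in for LLM calls, and none of the names are Patchwork's or any vendor's API.

```javascript
// Hypothetical two-pass review. `reviewer` and `validator` are injected
// functions standing in for model calls.
function reviewWithValidation(diff, rules, reviewer, validator) {
  const findings = [];
  for (const rule of rules) {
    // Narrow scope: one well-defined rule per reviewer call.
    for (const finding of reviewer(diff, rule)) {
      // Second pass: keep a finding only if the validator confirms it.
      if (validator(diff, finding)) findings.push(finding);
    }
  }
  return findings;
}

// Stubs so the sketch runs without any model.
const stubReviewer = (diff, rule) =>
  diff.includes(rule.pattern)
    ? [{ rule: rule.name, note: `found ${rule.pattern}` }]
    : [];
// This stub validator rejects every "no-todo" finding, simulating a
// second opinion that overrules the first pass.
const stubValidator = (diff, finding) => finding.rule !== "no-todo";

const rules = [
  { name: "no-console-log", pattern: "console.log" },
  { name: "no-todo", pattern: "TODO" },
];
const result = reviewWithValidation(
  "console.log(x); // TODO",
  rules,
  stubReviewer,
  stubValidator
);
console.log(result.map((f) => f.rule)); // only "no-console-log" survives
```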

throwthrowuknow · 2024-07-27T10:00:38.000000Z

That happens with human review too and often serves as an opportunity to clarify your reasoning to both the reviewer and yourself. If the code is easily misunderstood then you should take a second look at it and do something to make it easier to understand. Sometimes that process even turns up a problem that isn’t a bug now but could become one later when the code is modified by someone in the future.

meiraleal · 2024-07-27T12:03:17.000000Z

I stand corrected: LLMs are great for blocking PRs by raising issues. A lack of issues should not be taken as a sign of a good PR tho.

datashaman · 2024-07-27T04:47:22.000000Z

We're trialing ellipsis.dev for exactly this, and it's pretty good most of the time.

SOLAR_FIELDS · 2024-07-26T21:03:59.000000Z

A feature comparison to https://github.com/paul-gauthier/aider would be great.

Is this just a non-interactive version of this kind of agent?

rohansood15 · 2024-07-26T21:24:54.000000Z

Aider is great, but the use case is different:

1. You use Aider to complete a novel task you're actively working on. Patchwork completes repetitive tasks passively without bothering you. For example, updating a function vs. fixing linting errors.

2. Aider is agentic, so it figures out how to do a task itself. This trades accuracy for flexibility. With Patchwork, you control exactly how the task is done by defining a patchflow. This limits the set of tasks to those that you have pre-defined but gives much higher accuracy for those tasks.

While the demo shows CLI use, the ideal use case for Patchwork is as part of your CI or even a serverless deployment triggered via event webhooks. Hope this helps? :)

lifeisstillgood · 2024-07-27T09:12:14.000000Z

Ok the video explains this way better - and it looks awesome.

Do you accept PRs yourself :-)

rohansood15 · 2024-07-27T17:07:23.000000Z

We do. We haven't done a very good job of listing good first issues, but please feel free to create and contribute.

danielhanchen · 2024-07-26T23:56:50.000000Z

Oh this is really cool and great name! Will definitely try this out!

bsima · 2024-07-27T02:06:03.000000Z

Y’all know there’s a popular OSS project called patchwork, right? https://patchwork.readthedocs.io/en/latest/

rohansood15 · 2024-07-27T02:23:46.000000Z

There are a few open source projects by that name and we were aware of https://github.com/ssbc/patchwork which is archived. Didn't know of this one though.

It's a common noun which works really well for patch-based offerings I guess, and we chose it because we built a 'framework to patch code'. But we'll think more about this - thanks for bringing it up.

ylyn · 2024-07-27T06:53:16.000000Z

Patchwork is used by the Linux kernel: https://patchwork.kernel.org/

When I saw your submission title I thought it was that Patchwork.

cuu508 · 2024-07-27T06:44:35.000000Z

There's also https://github.com/fabric/patchwork

dash2 · 2024-07-27T09:06:39.000000Z

And an R library: https://cran.r-project.org/web/packages/patchwork/index.html

lordofgibbons · 2024-07-27T08:43:42.000000Z

The open-source ecosystem is large enough now compared to previous decades that name collisions are very likely to happen, given that projects are always named in English.