Hi HN, hunterbrooks and nbrad here from Ellipsis (https://www.ellipsis.dev). Ellipsis automatically reviews your PRs when opened and on each new commit. If you tag @ellipsis-dev in a comment, it can make changes to the PR (via direct commit or side PR) and answer questions, just like a human.
Demo video: https://www.youtube.com/watch?v=X61NGZpaNQA
So far, we have dozens of open source projects and companies using Ellipsis. We seem to have landed in a sweet spot between the current capabilities of AI tools and the actual needs of software engineers: this doesn’t replace human review, but it saves you time by catching and fixing lots of small, silly stuff.
Here’s an example in the wild: https://github.com/relari-ai/continuous-eval/pull/38. Ellipsis (1) adds a PR summary; (2) finds a bug and adds a review comment; (3) after a (human) user comments, generates a side PR with the fix; and (4) after a (human) user merges the side PR and adds another commit, re-reviews the PR and approves it.
Here’s another example: https://github.com/SciPhi-AI/R2R/pull/350#pullrequestreview-..., where Ellipsis adds several comments with inline suggestions that were directly merged by the developer.
You can configure Ellipsis in natural language to enforce custom rules, style guides, or conventions. For example, here’s how the `jxnl/instructor` repo uses natural language rules to make sure that docs are kept in sync: https://github.com/jxnl/instructor/blob/main/ellipsis.yaml#L..., and here’s an example PR that Ellipsis came up with based on those rules: https://github.com/jxnl/instructor/pull/346.
Installing into your repo takes 2 clicks at https://www.ellipsis.dev. You do have to sign up to try it out because we need you to authorize our GitHub app to read your code. Don’t worry, your code is never stored or used to train models (https://docs.ellipsis.dev/security).
We’d really appreciate your feedback, thoughts, and ideas!
First, I'm pretty unimpressed with the PR descriptions. I'd be frustrated if my company adopted this and I started seeing PRs like the one from continuous-eval that you linked to first. It's classic LLM output: lots of words, not much actual substance. "This PR updates simple.py", "updates certain values". It's the kind of information that can be gleaned in the first 5 seconds of glancing through a PR, and if that creates the illusion that no more description is needed then we'll have lost something.
Second, in the same PR: when writing a collection to a jsonl file, I would expect an empty collection to give me an empty file, not no file. Further, I haven't looked at the rest of the context, but it seems extremely unlikely to me that dataset_generator.generate would somehow produce non-serializable objects, and a human would easily see that. These two suggestions feel at best like a waste of time and at worst wrong, and it's concerning that the habits this tool encourages led to them being adopted uncritically, and that this seemed to you to be a good example of the tool in use.
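For what it's worth, here's roughly the behavior I'd expect from a jsonl writer (a hypothetical sketch, not the project's actual code; write_jsonl and the file name are made up for illustration):

    import json

    def write_jsonl(path, items):
        # One JSON object per line; an empty collection simply yields an empty
        # file, which downstream readers can treat as "zero records".
        with open(path, "w") as f:
            for item in items:
                f.write(json.dumps(item) + "\n")

    write_jsonl("results.jsonl", [])  # file exists but is empty, rather than no file at all

Skipping the write when the collection is empty just pushes a special case onto every consumer of the file.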
The second PR you linked to is, I think, a better example, but I'm still not sold on it. Similar to the PR descriptions, I'm concerned that a tool like this would create an illusion of security against the "simple" problems (leaving reviewers to focus on the high level), whereas I'd hope that the human reviewer would still read every line carefully. And if they're reading every line carefully, then have we really saved them that much time by paying for an LLM reviewer to look over it first? Maybe the time it takes to type out a note about a misnamed environment variable.