*Do not try and bend the spoon, that's impossible. Instead, only try to realize ...

bognition · 2024-02-21T11:56:16 1708516576

What alternative are you suggesting?

exDM69 · 2024-02-21T12:18:31 1708517911

Pijul is a version control system that is patch based (like Darcs) instead of snapshot based (like Git and Mercurial).

Together with a mathematical model of the properties of patches, it can automatically handle some merge situations that need to be resolved manually in Git.

Pijul's website (and Darcs' before that) has some theory that may be helpful if you are interested in the details.

The distinction between patch based and snapshot based is mostly invisible to the end user, except when it isn't.

gugagore · 2024-02-21T12:51:02 1708519862

Couldn't git just recreate patches from snapshots? I mean, this is what happens when you show a commit or a diff.

This is probably what you mean by how it's invisible to the user. When does it actually make a difference?

exDM69 · 2024-02-21T12:55:31 1708520131

Yes, git "recreates" the patches when you view a diff.

But git can't decide that these two patches are independent of each other, and thus that their order does not matter, so it'll give you a merge conflict where pijul doesn't.

You can losslessly convert between the two models, but you can't efficiently apply patch theory (commutation etc) to a bunch of snapshots.
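
As a rough illustration of what "independent patches commute" means, here is a toy sketch. It is not Pijul's or Darcs' actual algorithm: the `Replace` patch type and the `commute` check are hypothetical names made up for this example, and the only kind of patch modelled is "replace one line".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replace:
    """Toy patch (hypothetical): replace the line at `index`, expecting `old` there."""
    index: int
    old: str
    new: str

    def apply(self, lines):
        # A patch only applies if the context it was recorded against is present.
        if lines[self.index] != self.old:
            raise ValueError(f"conflict at line {self.index}")
        out = list(lines)
        out[self.index] = self.new
        return out


def commute(p, q):
    """In this toy model, two patches commute when they touch different lines."""
    return p.index != q.index


base = ["a", "b", "c"]
p = Replace(0, "a", "A")
q = Replace(2, "c", "C")
r = Replace(0, "a", "X")  # touches the same line as p

# Independent patches: either order gives the same file.
assert commute(p, q)
assert q.apply(p.apply(base)) == p.apply(q.apply(base))

# Dependent patches: order matters, and applying r after p is a conflict.
assert not commute(p, r)
```

With patches as first-class objects, the `commute` check is a cheap comparison; with only snapshots, the patches would first have to be recomputed by diffing.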

gugagore · 2024-02-21T13:07:36 1708520856

This doesn't seem to be true. I can use git rebase to change the order of commits, and I don't get conflicts if the corresponding patches are independent.

If you can losslessly convert between the two models, then it seems like you could make git's conflict resolution smarter without changing the underlying data model.

exDM69 · 2024-02-21T13:15:04 1708521304

Yes, of course you can manually do that.

> If you can losslessly convert between the two models, then it seems like you could make git's conflict resolution smarter without changing the underlying data model.

No, this will turn into a performance nightmare.

The keyword is efficiently. A patch based data model is a requirement to efficiently do the kind of deduction on patch commutativity and other properties that Pijul does. Converting back and forth on the fly using snapshots (which are a list of filenames and blob hashes) would not work outside of toy examples.

And this is what eventually "killed" Darcs (an earlier patch based system): its data model had some exponential corner cases that could not be resolved.
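
To make the "list of filenames and blob hashes" description concrete, here is a minimal sketch of a snapshot. It assumes a flat mapping rather than git's nested tree objects, and `snapshot` and `changed_paths` are hypothetical helper names. The blob id follows git's scheme of hashing a `blob <size>` header, a NUL byte, and the content with SHA-1.

```python
import hashlib


def blob_hash(data: bytes) -> str:
    """Git-style blob id: SHA-1 over a 'blob <size>' header, a NUL byte, and the content."""
    return hashlib.sha1(b"blob %d\0" % len(data) + data).hexdigest()


def snapshot(files: dict) -> dict:
    """Toy snapshot (simplified): map each filename to the hash of its content."""
    return {path: blob_hash(data) for path, data in files.items()}


def changed_paths(old: dict, new: dict) -> list:
    """Paths added, removed, or modified between two snapshots."""
    return sorted(p for p in old.keys() | new.keys() if old.get(p) != new.get(p))


s1 = snapshot({"a.txt": b"hello\n", "b.txt": b""})
s2 = snapshot({"a.txt": b"hello world\n", "b.txt": b""})
print(changed_paths(s1, s2))
```

Two snapshots only say which files differ. What changed inside each file, and whether that change is independent of another one, has to be recomputed by diffing the contents every time.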

gugagore · 2024-02-21T14:16:50 1708525010

I can understand that in principle, but not in practice. Converting between snapshot and patch representations is not an exponential effort operation. You could convert a git history to pijul (yes, expensive, but not exponential), do the conflict resolution in pijul land, and then (handwaving here) convert back to snapshots.

pmeunier · 2024-02-21T22:25:30 1708554330

The key insight of Pijul is to be the smallest generalisation of a file that is a CRDT with insertions and deletions of bytes as its two operations, where "smallest" and "file" are meant in a specific sense.

The main thing that makes it all work is the extreme performance of its storage backend, which allows manipulating a graph data structure directly on disk while avoiding as many I/O operations as possible. This works well, as all operations in Pijul (with some caveats) run in time logarithmic in the size of history. And yet, it is slower than Git for some operations.

Therefore, what you're suggesting is linear in the size of history (importing), i.e. exponentially slower than Pijul, for every single merge!
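
A back-of-the-envelope sketch of that gap, ignoring all constants and caveats (the function names are made up for this example): a linear cost n is exponential in a logarithmic cost, since n = 2^(log2 n).

```python
import math


def steps_log(n: int) -> int:
    """Idealised cost of an operation logarithmic in history size n."""
    return math.ceil(math.log2(n))


def steps_linear(n: int) -> int:
    """Idealised cost of re-importing the whole history of size n."""
    return n


for n in (10**3, 10**6, 10**9):
    print(f"history={n:>13,}  log={steps_log(n):>3}  linear={steps_linear(n):>13,}")
```

For a history of a million changes, that is on the order of 20 steps versus a million, paid again on every merge.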

alchemist1e9 · 2024-02-21T12:56:03 1708520163

I was such a heavy Darcs user once, before git existed. Then for a while I was using darcs to git and back. Sounds like I should take a serious look at Pijul.

crotchfire · 2024-02-21T20:19:50 1708546790

It's getting pretty good.

The problem is the primary author demanding CLAs for changes to the "plumbing" part of pijul (but not the "porcelain"). There are a handful of outside contributors who agreed to it for small drive-by fixes, but a lot of others (like myself) who saw the CLA and said "yeah no thanks these always end with somebody getting screwed".

It's a real tragedy. Hopefully the pijul plumbing is close enough to "perfect and finished forever" (like TeX, djbdns, etc) that it can survive for the rest of its life on one serious contributor and low-effort drive-by bugfixes. Otherwise the only plausible outcomes are a no-CLA fork (highly unlikely since the main author wouldn't participate) or the project dying.

I do hope the primary pijul author will rethink their "no limits" CLA. History has shown that these always end with contributors getting screwed, either by the primary author, or some other entity to whom the primary author sold the rights.

pmeunier · 2024-02-21T22:18:27 1708553907

This problem was never reported a single time on our Zulip or by email, but I'm glad you asked, because this choice wasn't obvious:

- The main reason the CLA is there is that earlier versions of Libpijul saw lots of online arguments about its very reasonable GPL2 license, which made me doubt the choice of GPL2. These arguments were often followed by "@me was there"-style contributions to Libpijul, like applying linter fixes without understanding any of the code. I got scared of having to ask all past contributors for permission to release it under (say) a BSD license (which I have even recently discussed with the FreeBSD maintainers during FOSDEM 2024).

- Also, the fact that the Pijul binary is GPL2 (without a CLA) and dependent on Libpijul forces me, in the case of an evolution of the license, to choose something compatible with GPL2. Deal breaker, really?

- Another goal of the CLA was to experiment with something cool and meta, at the intersection of version control and licensing: I wanted to test the idea that the CLA was a single patch (or a sequence of patches, strictly ordered by dependency), and contributors would be required to add a dependency on the CLA patch. That would make them state in their patches which version of the CLA they agreed with when they recorded. And as Pijul does a lot of dogfooding…

- Other than that, the Pijul plumbing is indeed mostly meant to become just a static algorithm after a while; not many features are missing. The goal of Pijul is to have essentially three core functions: create a patch, apply a patch, unapply a patch.
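
The three core functions, together with the "depend on the CLA patch" idea, can be sketched roughly as follows. This is a toy model under stated assumptions, not the Libpijul API: `Patch`, `Repo`, `create`, and the patch names are all hypothetical, and patches carry only a name and a set of dependencies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patch:
    """Toy patch (hypothetical): just a name plus the names it depends on."""
    name: str
    deps: frozenset = frozenset()


def create(name, deps=()):
    """Create a patch that declares its dependencies explicitly."""
    return Patch(name, frozenset(deps))


class Repo:
    def __init__(self):
        self.applied = {}  # patch name -> Patch

    def apply(self, patch):
        # A patch can only be applied once everything it depends on is present.
        missing = patch.deps - set(self.applied)
        if missing:
            raise ValueError(f"missing dependencies: {sorted(missing)}")
        self.applied[patch.name] = patch

    def unapply(self, name):
        # A patch can only be removed if no applied patch still depends on it.
        dependents = sorted(p.name for p in self.applied.values() if name in p.deps)
        if dependents:
            raise ValueError(f"still needed by: {dependents}")
        del self.applied[name]


repo = Repo()
cla = create("CLA-v1")
fix = create("fix-typo", deps=["CLA-v1"])  # contributor states which CLA version applies

repo.apply(cla)
repo.apply(fix)
repo.unapply("fix-typo")
repo.unapply("CLA-v1")
```

In this sketch a contribution cannot be applied to a repository that lacks the CLA patch it names, which is the property the dependency trick relies on.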

crotchfire · 2024-02-22T09:43:05 1708594985

Thank you for replying!

> This problem was never reported a single time on our Zulip or by email

This shouldn't be surprising; the CLA requirement drives away contributors before their first contribution. Like me. I saw no point in a person who hasn't made any contributions petitioning you for a policy change. This phenomenon is one of the insidious effects of policies which ward off contributors before they make their first contribution.

A lot of us fell for CLAs when they were new and shiny. By now we've all seen way too many examples of people getting screwed by them. Fool me once, fool me twice... https://drewdevault.com/2023/07/04/Dont-sign-a-CLA-2.html

Using artificial pijul dependency arcs as CLA acknowledgements is certainly cute. But frankly, for legal matters, something that can't be misinterpreted like a `Developer-Certificate-of-Origin:` header is probably a better idea. By the way, have you noticed that `pijul git` import loses some of the git headers, like `Committer` and `CommitterDate`? Also, importing a large repo has exposed what appears to be some significantly superlogarithmic-time behaviors in `pijul record`.

> in the case of an evolution of the license

I think you should try to let go of the idea of "evolving the license" of an open source project that invites contributions.

If you want to go the sqlite route that is absolutely fine; honestly pijul is one of the few pieces of software that is (after accounting for younger age) on the same level of sophistication as sqlite. Just explain that your project, like sqlite, is open source but not soliciting contributions.

I think that with the CLA requirement you have effectively made this choice -- but in the way which maximizes the number of people annoyed: both the people who get triggered by any mention of the GNU project as well as the people who support its goals! This is quite a neat trick to pull off, although I fear it was unintentional.

> like applying linter fixes without understanding any of the code

Yeah I saw that PR, which was obnoxious for other reasons as well. Bravo to you and felix91gr for being so polite in your responses.

pmeunier · 2024-02-22T12:03:48 1708603428

First, thanks for the patient and kind advice. This is so rare I didn't even know you were allowed to talk like that online.

> I think you should try to let go of the idea of "evolving the license" of an open source project that invites contributions.

Yes, this is becoming clearer now, but it wasn't for a long time. Choosing any license for a complicated tool like this one, which few people fully understand, inevitably sparks discussions about the license: if you've got to say something, the license is easier to discuss than commutativity and pushouts (hard-core theory) or Sanakirja and endianness (hard-core practice).

> This is quite a neat trick to pull off, although I fear it was unintentional.

;-)

> Yeah I saw that PR

There were many: a single one wouldn't have changed my mind. And many more on what projects/topics/software I should be interested in.