Hacker News

This looks like a fine project for its purpose, but git is already open-source and p2p. You don't need to `sh <(curl)` a bunch of binaries; instead, simply connect to another git server and use git commands to directly pull or merge code.

What's missing in git is code issues, wikis, discussions, github pages and most importantly, a developer profile network.

We need a way to embed project metadata into .git itself, so source code commits don't get tangled up with wikis and issues. Perhaps some independent refs, like git notes?

https://git-scm.com/docs/git-notes




While Git is designed in some way for peer-to-peer interactions, there is no deployment of it that works that way. All deployments use the client-server model because Git lacks functionality to be deployed as-is in a peer-to-peer network.

For one, it has no way of verifying that the repository you downloaded after a `git clone` is the one you asked for, which means you need to clone from a trusted source (i.e., a known server). This isn't compatible with p2p in any useful way.

Radicle solves this by assigning stable identities[0] to repositories that can be verified locally, allowing repositories to be served by untrusted parties.

[0]: https://docs.radicle.xyz/guides/protocol#trust-through-self-...


> it has no way of verifying that the repository you downloaded after a `git clone` is the one you asked for

Respectfully disagree here. A repository is one or more chains of commits; if each commit is signed, you know exactly that the clone you got is the one you asked for. You're right that nobody exposes a UI around this feature, but the capability is there if anyone had workflows that require pulling from random repositories instead of well-established/known ones.
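The property being relied on here is that Git's history is a hash chain: each commit's ID covers its parent, so knowing the head commit ID authenticates everything upstream. A minimal sketch of that idea (simplified "commits" hashed as strings, not Git's actual object format):

```python
import hashlib

def commit(parent_hash: str, message: str) -> str:
    """Hash a simplified 'commit': parent hash + message.
    Real Git hashes a structured object (tree, parents, author, ...)."""
    return hashlib.sha1(f"{parent_hash}\n{message}".encode()).hexdigest()

# Build a small history.
c1 = commit("", "initial commit")
c2 = commit(c1, "add feature")
head = commit(c2, "fix bug")

# Anyone who knows `head` can detect tampering anywhere in the chain:
tampered_c1 = commit("", "initial commit with backdoor")
tampered_head = commit(commit(tampered_c1, "add feature"), "fix bug")
assert tampered_head != head  # any upstream change alters the head hash
```

This is why a known commit ID (or a signature over one) suffices to verify a clone from an untrusted mirror; the open question in the thread is how you learn the right head in the first place.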


Here's the problem: how do you know that the commit signers are the current maintainers of the repo?


That problem is social: you can never be sure of that, even with hardware signing of commits. No tech can ever solve it. Just get "pull requests" from contributors you know and pull from maintainers you trust. That is the social model.


That's not quite right, we solved this in Radicle. Each change in ownership (adding/removing maintainers) is signed by the previous set of owners. You can therefore trace the changes in ownership starting from the original set, which is bound to the Repository ID.


Sure, but again, you've added convenience - or what feels to you like convenience - for something that can probably be achieved right now with open-source tools. A "CONTRIBUTORS" file with sign-offs by maintainers is an example of a solution for the same thing.

I don't deny that your improvements can benefit certain teams/developers, but I feel like very few people would actually care about them, and those people aren't making use of the existing alternatives either.


A CONTRIBUTORS file is easy to change by anyone hosting the repository - it's useless for the purpose of verification, unless you have a toolchain to verify each change to said file. "Sign-offs by maintainers" is not useful either unless you already know who the maintainers are, and you are kept up to date (by a trusted source) when the maintainers change. This is what Radicle does, for free, when you clone a repo.


All good points, but now you moved the trust requirement from me having to trust the people working on the code, to me having to trust the tool that hosts the code. I'm not convinced your model is better. :P


Over time, I'd expect the tool itself to become more and more trustworthy, as more and more eyeballs can review it.

Whereas having to trust people, especially as they cycle in and out over time, is inherently stochastic.


I don't know, for me when I get involved with a project, I'm more likely to be aware of the people involved with it than the place where they host it.

I understand that the disruption Radicle wants to bring is to divorce projects from their developers, but that sounds so foreign to me, that I can't wrap my head around it. I can see its use in some cases: abandoned projects, unethical behaviour from maintainers, but not to the extent where a new platform is required.

Maybe that's why I'm being such a Negative Nancy. I hope u/cloudhead didn't consider my replies too aggressive. :)


For me, one of the benefits of FOSS is that I don't have to trust the people. I can look at the code and decide for myself.

Not looking to convince you of that or anything though... :)


Can’t debate that :)


How do I verify the “original set”, or the Repository ID, if not out-of-band communication (like a project’s official website)? And then what advantage does this have over the project maintainer signing commits with their SSH key and publishing the public key out-of-band?

I think there’s room for improvements in distributed or self-hosted git, but I think they exist more in the realm of usability than any technological limitations with the protocol. Most people don’t sign git commits because they don’t know it’s possible—not because it’s insecure.


The repository id can be derived via a hash function from the initial set of maintainers, so all you need to know is that you have the correct repository id.

The advantage of this is that (a) it verifies that the code is properly signed by the maintainer keys, and (b) it allows for the maintainer key(s) to evolve. Otherwise you’d have to constantly check the official website for the current key set (which has its own risks as well)
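The derivation described above can be sketched roughly like this. Note this is a hypothetical encoding for illustration, not Radicle's actual format; the point is just that the ID commits to the initial key set:

```python
import hashlib

def repo_id(initial_delegates: list[str]) -> str:
    """Derive a stable repository ID from the founding maintainer keys.
    Hypothetical encoding; Radicle's actual format differs."""
    canonical = "\n".join(sorted(initial_delegates))
    return "rad:" + hashlib.sha256(canonical.encode()).hexdigest()[:16]

rid = repo_id(["key-alice", "key-bob"])

# The ID stays stable even as maintainers later change: it only commits
# to the *initial* set, and later changes are verified via the signed
# chain of updates rather than by re-deriving the ID.
assert rid == repo_id(["key-bob", "key-alice"])   # order-independent
assert rid != repo_id(["key-mallory"])            # different root of trust
```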


I'm not sure how any of this solves the problem.

If I am on the internet there is no key or keys that I could definitively say came from the _real_ maintainers. I need to trust some source or sources for that.

In your model, committing to the repo requires a private key. This key claims ownership of the repo. If that key is lost or stolen, I have lost ownership of that repo, with no out-of-band method to recover it.

If that key is unknowingly stolen and ownership is switched to a new key, that is a pretty bad scenario.

Basically, I still always need to go to some other out of band source to verify that bad things have not happened.


Radicle developer here :) And yes you're completely right.

The current state of key management leaves A LOT to be desired, because `did:key` has no rotation, so if you lose your key it's game over. We decided to go with something simple first to allow us to develop the collaboration experience as much as possible -- we're a small team, so it's hard to tackle all of the large problems at once while also getting an experience that's polished :D

Key management and a general "profile" is high on our priority list after we have properly launched. A few of us think DIDs (https://www.w3.org/TR/did-core/) are a good way forward. In particular, `did:keri` seems very interesting because its method involves a merkle-chain log, which can be easily encoded in Git. It includes key pre-rotation -- meaning there's a key available to help recover if something goes wrong. It can also delegate to other people, so you can allow the safety of your identity and key to be improved by third parties.

That said, maybe there are other DID methods, or other methods in general, that might suit better. Or maybe we can build something more general that just needs to resolve to a public/private key pair, and we don't care after that.

Would definitely be interested in the community's thoughts here :) Or if someone with expertise in the area wants to chip in, hit us up ;)


It seems to me that for security reasons it might be a good idea to support separate signing keys for normal commits and for commits that change the ownership set. This would allow you to keep the ownership-change keys offline, under the assumption they are rarely used. This is something PoS cryptocurrencies tend to do, by having a withdrawal key for accessing stake that is separate from the signing key used for block proposals, attestations, etc.
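The split being suggested could look something like this toy model, where routine records verify against an online signing key and delegate-set changes only verify against a separate offline key. HMAC stands in for real asymmetric signatures here, and all names are made up:

```python
import hashlib
import hmac

def sign(key: str, msg: str) -> str:
    # HMAC models a signature; a real system would use asymmetric keys.
    return hmac.new(key.encode(), msg.encode(), hashlib.sha256).hexdigest()

OWNERSHIP_KEY = "offline-cold-key"   # kept offline, rarely used
SIGNING_KEY = "online-daily-key"     # used for routine commits

def verify(record_type: str, msg: str, sig: str) -> bool:
    # Ownership changes only verify against the offline key; anything
    # signed with the day-to-day key cannot alter the delegate set.
    key = OWNERSHIP_KEY if record_type == "ownership-change" else SIGNING_KEY
    return hmac.compare_digest(sig, sign(key, msg))

assert verify("commit", "fix bug", sign(SIGNING_KEY, "fix bug"))
assert not verify("ownership-change", "add carol", sign(SIGNING_KEY, "add carol"))
```

The design benefit is blast-radius containment: compromising the online key lets an attacker forge commits, but not quietly rewrite who owns the repository.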


Interesting idea, thanks!


How do you know the repository id is the correct one?

You have just changed the requirement from knowing the maintainer's public key to knowing a different public key. Sounds like pretty much the same problem to me.


The difference is that the repository id is stable, while the maintainer keys can change.


Except repository ids change when the repo is forked.


Yes, but the maintainers can be changed while also keeping the identifier stable.

Updates to the delegate set (read: maintainers) can be made, let's say adding a new delegate. This change is signed by a quorum of the current set of maintainers. This change is kept track of in the Git commit history, so this chain of actions can be walked from the root and verified at each point.
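The verification walk described here (each change must carry signatures from a quorum of the *current* delegates, checked from the root) can be sketched in Python. HMAC stands in for real asymmetric signatures in this toy model, and the data layout is invented for illustration:

```python
import hashlib
import hmac

def sign(key: str, payload: str) -> str:
    # HMAC models a signature for this sketch; real systems use key pairs.
    return hmac.new(key.encode(), payload.encode(), hashlib.sha256).hexdigest()

def verify_chain(root_set: set[str], updates: list[dict]) -> set[str]:
    """Walk ownership updates from the root; each must carry valid
    signatures from a majority of the current delegate set."""
    current = set(root_set)
    for up in updates:
        payload = ",".join(sorted(up["new_set"]))
        valid = {k for k, sig in up["sigs"].items()
                 if k in current and hmac.compare_digest(sig, sign(k, payload))}
        if len(valid) <= len(current) // 2:
            raise ValueError("no quorum from current delegates")
        current = set(up["new_set"])
    return current

root = {"alice", "bob"}
new_set = {"alice", "bob", "carol"}
payload = ",".join(sorted(new_set))
update = {"new_set": new_set,
          "sigs": {"alice": sign("alice", payload), "bob": sign("bob", payload)}}
assert verify_chain(root, [update]) == new_set
```

An update signed only by an outsider fails the walk, which is the property that lets untrusted hosts serve the repository: they can't forge a maintainer change without the current keys.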

Similarly, a delegate can be removed, the project name changed, etc.

Forking is only necessary if there is a disagreement between maintainers that cannot be resolved so one of them goes off to create a new identifier to differentiate between the two. At this point, it's up to you to decide who you trust more to do a better job :)


How do you fork an abandoned repo?


When you fork an abandoned repo, you are essentially giving it a new repository identity, which is a new root of trust, with a new maintainer set. You'll then have to communicate the new repository identifier and explain that this is a fork of the old repo.


The same way I know the commit signers are who they say they are in "regular" usage of GPG: I have verified the key belongs to them, or their keys are signed by people I trust to have verified them, etc. Like a sibling said, the problem is social rather than technical.


By joining the web of trust. Meeting people, verifying each other's identities and getting keys signed.

Debian seems to be quite good at this.

https://wiki.debian.org/Keysigning

https://wiki.debian.org/Keysigning/Coordination

https://wiki.debian.org/Keysigning/Offers


Does that matter if the signatures are valid?


Yeah, because, e.g., I can publish the given repository from my server with an additional signed commit (signed by me) on top of the original history, and that commit could include a backdoor. You have no way of knowing whether this additional commit is "authorized" by the project leads/owners or not.


That is in fact the point, it's decentralized by nature. The entire idea behind git's decentralization is that your version with an additional backdoor is no lesser of a version than any other. You handle that at the pointer or address level i.e. deciding to trust your server.


Perhaps, but none of that commit history is related to the invocation of git clone. To acquire and verify, you need both a URL and a hash for each branch head you want to verify.


The problem I'd like to see solved is source of truth. It'd be nice if there were a way to sign a repo with an ENS name or domain without knowing the hash.

Another thing is knowing if the commit history has been tampered with without knowing the hash.

The reason for needing to not know the hash is for cases like tornado cash. The site and repo was taken down. There's a bunch of people sharing a codebase with differing hashes, you have no idea which is real or altered.

This is also important for cases where the domain is hacked.


> The reason for needing to not know the hash is for cases like tornado cash. The site and repo was taken down. There's a bunch of people sharing a codebase with differing hashes, you have no idea which is real or altered.

> This is also important for cases where the domain is hacked.

I think at some point you need to know some sort of root-of-trust to kick off the trusting process. I believe in this case, you would trust a certain DID or set of DIDs (i.e. a Tornado Cash developer's public key). You can clone their version of the project and the history of the project MUST be signed by their private key for it to be legitimate.

To clarify, in Radicle, a peer's set of references are always signed by their key and this data is advertised so that you can always verify, using their public key, that this data is indeed what this peer has/had in their Git history. If this ever diverges then any fetching from that peer is rejected.


Right now the devs are in jail, so you wouldn't be able to seed from them (computers off); you'd need to trust another reference.


This is just another form of the cryptographic key-distribution problem. It doesn't matter where the git repository comes from: you can be sure it hasn't been tampered with if the signatures are valid.

Domains with DNSSEC are an interesting solution. PGP public keys are distributable via DNS records.

https://www.pgp.guide/pgp_dns/

https://weberblog.net/pgp-key-distribution-via-dnssec-openpg...


How do you handle the SHA1 breaks in an untrusted p2p setting?


If you mean collision attacks, this shouldn't be a problem with Git, since it uses Hardened SHA-1. Eventually, when Git fully migrates to SHA-2, we will offer that option as well.

> Is Hardened SHA-1 vulnerable?

> No, SHA-1 hardened with counter-cryptanalysis (see ‘how do I detect the attack’) will detect cryptanalytic collision attacks. In that case it adjusts the SHA-1 computation to result in a safe hash. This means that it will compute the regular SHA-1 hash for files without a collision attack, but produce a special hash for files with a collision attack, where both files will have a different unpredictable hash.

From https://shattered.io/


So you use hardened sha1 in radicle? It would be great to see this in the docs.


Everything that is replicated on the network is stored as a Git object, using the libgit2[0] library. This library uses hardened SHA-1 internally, which is called sha1dc (for "detect collision"). Will add to the docs, good idea!

[0]: https://github.com/libgit2/libgit2/blob/ac0f2245510f6c75db1b...


The entire Linux kernel development team would beg to differ…


> What's missing in git is code issues, wikis, discussions, github pages and most importantly, a developer profile network.

Radicle adds issue tracking and pull requests. Probably some of those other features as well.

On mobile, there are buttons at the bottom of the screen in the OP link; tap those and you get to the issue-tracking and pull-request tabs, etc.


But that’s not what parent meant. Those things should be embedded in the git repository itself, in some kind of structure below the .git/ directory. That would indeed make the entire OSS ecosystem more resilient. We don’t need a myriad of incompatible git web GUIs, but a standard way of storing project management metadata alongside version control data. GitHub, Gitea, Gitlab, and this project could all store their data in there instead of proprietary databases, making it easy to migrate projects.


Yes, this is how radicle stores this data. ; )

https://app.radicle.xyz/nodes/seed.radicle.xyz/rad:z3trNYnLW...


https://docs.radicle.xyz/guides/protocol is probably a better resource (but this guide is still Work In Progress)


> Radicle’s predefined COB types are stored under the refs/cobs hierarchy. These are associated with unique namespaces, such as xyz.radicle.issue and xyz.radicle.patch, to prevent naming collisions.

This looks like an interesting approach. I have a question: to avoid copying a large .git project, we have partial clones and clone depth. If `cobs` grows too large, how can we partially clone it? E.g., select issues by time range?


The COB types are located in the Stored Copy; you would still be able to partially clone the working-copy repo without the issues and patches, using your current git commands. There is a better explainer here: https://docs.radicle.xyz/guides/protocol#storage-layout


> a standard way of storing project management metadata alongside version control data

Emphasis mine. This doesn't seem to be that, seeing as it's yet another home-grown issue storage.


Yeah, exactly. Radicle doing it this way, Fossil another - see here why that is a problem: https://xkcd.com/927/


And Fossil is an entirely different VCS.

What’s the alternative? That at least N projects cooperate and agree on a common design before they do the implementation? (Then maybe someone can complain about design-by-committee.)


I use Artemis, which was originally written for Mercurial but also supports Git. It stores issues inside a repo, so it doesn't care about where it's hosted and works completely offline without needing a Web browser. Issues are stored in Maildir format, which is a widely supported standard that can be processed using off-the-shelf tools. For example, I write and comment on Artemis issues using Emacs message-mode (which was designed for Email), and I render issues to HTML using MHonArc (which was designed for mailing list archives).

I'm not claiming these tools are the best, or anything. Just that existing standards can work very well, and are a good foundation to build up UIs or whatever; rather than attempting to design something perfect from scratch.

My fork of Artemis, patched for Python3: http://www.chriswarbo.net/git/artemis

Emacs config to trigger message-mode when emacsclient is invoked as EDITOR on an Artemis issue: http://www.chriswarbo.net/git/warbo-emacs-d/git/branches/mas...

An example of HTML rendered from Artemis issues: http://www.chriswarbo.net/git/nix-config/issues/threads.html


> What’s the alternative? That at least N projects cooperate and agree on a common design before they do the implementation?

That would be ideal, yes. You should solicit comments from the greater community before setting the format in stone. But the very minimum would be to build on existing attempts at issues-in-git like [0] instead of reinventing the wheel unless you have a very very very good reason.

[0] https://github.com/MichaelMure/git-bug


Yes! That's exactly what I would like to see - come together as a working group, create a PR on git itself, and implement standard support for issues, PRs, discussions, projects, votings, project websites, what-have-you. The community will take it from there.

The alternative to that would be the git project itself coming up with an implementation. They have reasonable experience working with the Kernel, and the creation of git itself seems to have worked reasonably well -- although I'm not sure I would want to use something Linus considers ergonomic :)


Ok. That could work if you found a group of people who are interested in such an all-in-one package. Gitlab is apparently working on a sort of cross-forge protocol (probably motivated by connecting Gitlab instances) and it seems that some Gitlab employees are working full time on the Git project. So those are candidates. You probably need a group which both have recognition within the project and are active enough to drive such a project forward without it fizzling out.

Without such a group you might just send a few emails to mailing list, get shrugs about how that plan “looks good” with little real interest, and then have to discuss this big plan in every future patch series that incrementally builds such a thing into Git itself. Mostly why such code should be incorporated and how it will pay off when it’s all ready.

The Git tool itself—and the project by extension—is as of now very unopinionated about whole-project solutions like this. The workflow that they themselves use is very loosely coupled and pretty much needs bespoke scripting by individual contributors, which I guess is telling in itself.


I would have liked to have seen what kind of issue-resolution labels old Linus would have come up with. Resolved: YOU GIT has a nice ring to it.


I'd like to note here that Radicle has defined our own version of issue and patch management, but they're not necessarily required to be used as part of the protocol. The protocol only defines that any and all COBs will be stored and replicated under the `refs/cobs` hierarchy.

If someone wanted to come along and define a way to embed Fossil wikis/issues as a COB then they could be replicated on the Radicle network and it's then up to application developers to load and interpret that data.

I think this is cool because it essentially allows developers to extend the Radicle ecosystem easily and define new workflows! However, that does not avoid our XKCD problem stated above ;P But hey, sometimes that's the beauty of these things -- we're given the power to define our own workflows and not locked into something everyone complains about coughGitHub branches PR flowcough


Nothing should be under .git/ except things owned by git (or at least allowed for in an explicit way like the hook scripts).

You would have to either add the features to git itself, or at least add to git the knowledge of and allowance for extra features like that.

But not just toss non-git data in .git/ simply because the dir happens to exist.


What is "non-git data" though?

Git is just a mechanism for storing data as a series of commits. The underlying data are just blobs of bytes. So all data is Git data, and Radicle takes full advantage of that.

The "special" data in Git would be `refs/heads`, `refs/remotes`, `refs/tags`, and the lesser known `refs/notes`. Radicle doesn't touch those directly, we still allow the use of Git tooling for working with that data.

It then extends on top of these by using `refs/rad` and `refs/cobs` for storing Radicle-associated data, using all of the mechanisms that Git provides to do so.


> You would have to either add the features to git itself,

That is exactly what I'm suggesting.


I don't think I would agree. Well it depends exactly what you imagine.

There are probably an infinite and always growing and changing number of repo-adjacent things similar to the handful of added features that github currently tacks on to git.

I think it doesn't make sense for git, already complicated enough, to try to do all of that other related stuff, however related.

Maybe in fact git is already doing too much meaningful metadata work directly itself, and should instead try to switch its current metadata to some kind of generic interface that other software could hook into better?

So rather than git managing, say, issues, git just manages data. Not merely a db, it would facilitate associating high level feature data like issues and ci and conversations with low level commit data. An issue tracker would just be one of many clients using the interface.

git itself should probably not know very much about the data other than how it is associated with some commit. I.e., the actual meaning of the data is either only in the consumer, or perhaps is some agreed standard that multiple consumers adhere to and understand, or perhaps comes with its own schema definition like a DTD or protobuf. Or all of the above: a paid product could have data only it understands, and other software could use standardized data, all stored in git at the same time. Multiple different consumers could use the same generic interface to their own data.

Because these add-on related functions are probably not universal or done being invented. Tomorrow there will be 5 new things you want to attach to a repo besides what github provides today, and so without even knowing what they are, that's how I know it's wrong for git to provide the features itself.

But managing the data without caring what the data is, that could totally be a new built in git function, where you give git itself one new function and interface, and an open-ended number of new feature-providing consumers all just use the same underlying git feature in whatever way they want.

Other than that the only other new thing I think that might be right to add to git itself is a proper way to manage large binary assets.
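The generic interface proposed above — git stores opaque blobs keyed by (commit, namespace) and consumers interpret them — could be sketched like this. The API is entirely hypothetical, not part of real Git:

```python
import hashlib

class MetadataStore:
    """Toy model of a generic commit-metadata interface: opaque blobs
    are content-addressed and keyed by (commit, namespace)."""

    def __init__(self):
        self._blobs = {}   # blob hash -> bytes
        self._refs = {}    # (commit, namespace) -> [blob hashes]

    def attach(self, commit: str, namespace: str, data: bytes) -> str:
        h = hashlib.sha1(data).hexdigest()
        self._blobs[h] = data
        self._refs.setdefault((commit, namespace), []).append(h)
        return h

    def read(self, commit: str, namespace: str) -> list[bytes]:
        return [self._blobs[h] for h in self._refs.get((commit, namespace), [])]

store = MetadataStore()
store.attach("abc123", "issues", b'{"title": "crash on empty input"}')
store.attach("abc123", "ci", b'{"status": "passed"}')

# An issue tracker and a CI system share the store without git needing
# to understand either payload:
assert store.read("abc123", "issues") == [b'{"title": "crash on empty input"}']
```

Radicle's `refs/cobs` hierarchy and plain `git notes` namespaces are both existing instances of roughly this shape: namespaced refs pointing at opaque, consumer-interpreted objects.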


Radicle does store such data in git - issues, patches (PRs) etc. Also, the entire project (protocol, cli, web ui etc) is fully open source.


> We need a way to embed project metadata into .git itself, so source code commits don't mess up with wikis and issues.

Fossil (https://fossil-scm.org) embeds issues, wiki etc. into project repository.


Also Radicle, evidently


> and most importantly, a developer profile network

What has the world come to where that is the most important part?

--

I think gerrit used to store code reviews in git.


Repositories and code-sharing are inherently about trust. Even if you personally audit every line of code, you still need to trust that the owner isn't trying to slip one past you. Identity is a key component of trust.


What you say makes sense. But that trust needs to extend to the hosting platform itself, because the platform can manipulate all non-signed data. I don't see how a GitHub profile by itself is trustworthy. You need some additional, external and independent verification that that GitHub profile is really authentic and doesn't contain compromised code.

There is nothing stopping me from creating the accounts IggleSniggle or Iggle5n1ggle on github.


I mean... yeah, you obviously have to trust someone to vouch for the authenticity of an identity. In the case of Github, that's the platform owner. In the case of a digital signature, that's the root certificate authority.

With that being said, your example feels pretty far off the mark. You might be able to phish using a similar looking identity, but that's completely unrelated to the trustworthiness of the platform. It's not as though you'll manage to somehow phish Github into showing someone else's trustworthy work history on a spoofed identity.


> It's not as though you'll manage to somehow phish Github into showing someone else's trustworthy work history on a spoofed identity.

You don't need to trick github, that's just how it works by design. Anyone can upload any repo to github. There is nobody checking that the repo isn't stolen or fake. Github does not claim to vouch for anyone. At most they will delete malware and obvious scams if it happens to come to their attention.


Welcome to the era of self-promoters and narcissists.


Classic git does not evade censorship, such as the extremely recent news concerning Nintendo. An idea like this has been rolling around in my head, and I'm overjoyed that someone has done the hard work.


Git evades censorship just fine, since it is properly decentralized and doesn't care about where you got the repository from. Plain HTTP transport however does not and most Git repositories are referred to by HTTP URL.

If you simply host Git on IPFS you have it properly decentralized without the limits of HTTP. IPNS (DNS of IPFS), which you need to point people to the latest version of your repository, however wasn't working all that reliably last time I tried.


But with Git you still need to locate an up-to-date source for the repo. If the author is signing commits or you know a desired commit ID then you can verify once you have found a source, but finding the source is the hard part.

IIUC with Radicle you can just request the repository by signature and get the latest released version from the network without needing to track down a source yourself. A trusted publisher (probably the original author/maintainer) can continue to publish commits without a centralization point that can be shut down (like the recent Yuzu case).


You're basically describing a name service: a way to associate a stable name (like "Yuzu" or "PGP:ABCDEFG..." or whatever) with a changing, mutable identifier (like a Git commit ID, an immutable IPFS URL, a Bittorrent magnet link, etc.).

The most obvious example is DNS, which is technically capable of this, but is mostly not set up in the ways we'd want. It's pretty centralised: whilst anyone can host their own DNS servers, serving any data they like, it's unlikely the "main" DNS network will connect to them, and it will consider you malicious if your data disagrees with the centralised records. Things like DNSLink can be useful for associating a DNS name with names in other systems, but that's still vulnerable to hijacking/poisoning without an out-of-band way to verify it.

The GNU Name System seems better suited, since it can use public keys for stable names, which therefore can't be hijacked/poisoned. Associating these to pet names is also less centralised, feeling more like /etc/hosts than "the" DNS; with recursive resolving, so e.g. I can look up "Yuzu" based on what you think it is, etc. That seems to provide a nice balance between decentralised control, versus relying on side-channels to find public keys. I'm currently experimenting with this.

There's also IPNS, but in my experience its lookups are incredibly slow and unreliable. It could get better, and I know various studies are being performed and experiments with different approaches and parameters are being tried, so I'm hopeful something will come of it.

As far as I remember, the main reason Radicle diverged from these approaches (and IPFS/IPNS in particular) was to allow peers to negotiate delta transfers, like (non-dumb) Git HTTP servers do. That's more efficient than using something like IPFS/IPNS, where the root address keeps changing, and we need to fetch a load of the Merkle tree to find out which blocks we already have.


IPFS does not intentionally replicate (it merely caches). Bringing down the authoritative server can result in lost data. Anonymity is also out of scope for the project.


Yeah that’s been my experience with IPFS. Very cool idea, practically doesn’t work very well. Haven’t tried recently though, maybe it’s improved.


You're missing the discovery part. You want to get the repository X from user Y cloned - how do you find it? Especially if you don't know Y and their computer is off?

Also radicle does want to tackle the issues / prs and other elements you mentioned as well.


How do you find a website?

And presumably the person hosting it will make sure that the computer hosting it is often on; for instance, ISP routers and TV boxes are a good way to popularize it, since they often come with NAS capabilities:

https://en.wikipedia.org/wiki/Freebox

(Notably, it also supports torrents and creating timed links to share files via FTP.)


Depends on what you mean by finding:

- finding what the domain name is?

- resolving the DNS to an IP address?

Radicle solves both problems in theory, but more the latter than the former right now:

- there is some basic functionality to search for projects hosted on Radicle, to find the right repo id (I expect this area will see a lot more activity and improvements in the near future),

- given a repo id, actually getting the code onto your laptop. This is where the p2p network comes in, so that the person hosting it doesn't always need to keep their computer/router/TV box on, etc.


I think this already exists for issues. git-bug [1] uses git internal files to store the issues. It is distributed and it even comes with a web ui in addition to the usual cli.

[1]: https://github.com/MichaelMure/git-bug


A friend of mine wrote a similar tool. https://github.com/nolash/piknik.


do you know of any projects using [anything like] git-bug?

i know i've encountered something like this once in a notable repo. thought it was graphics related, like mesa or something, but looks like they're using GitLab.


Most CI runners use git notes, which is similar to what git-bug uses, IIRC.


Fossil has a few of these.



