Peer-to-Peer Databases for the Decentralized Web (orbitdb.org)
261 points by fossislife on March 1, 2021 | 79 comments



I like this, but I feel it's important to point out that this is effectively unusable by the very people who would benefit from it most: non-web developers.

If this were a drop-in C/C++ library, I'd use it immediately in my native apps and adopt it as a widespread platform technology. I want to be able to do this, and if it were truly a working technology, I'd put it in every single app I use.

But as a Node/JS library, it is off limits.

Please, decentralization guys, consider native languages first, and toy languages next. These technologies are never going to be embraced unless they treat native platforms as first-class systems for their manifestation.


Amen to this! I too got really excited until I saw it was JavaScript. For me to use it I'd have to link against some sort of JavaScript interpreter (ew) and probably incur performance penalties shoveling the data to and fro.

Can't we just have nice, normal libraries like we used to? Tools that require me to install a huge new build system and package manager (cough golang cough) just to run quickly lose their appeal.


The answer is the same as it always was: you'll have "normal" libraries when somebody writes them (it could even be you). Most new application development relies on web servers, where JavaScript is very relevant and normal.


There is a Go implementation available: https://github.com/berty/go-orbit-db


What is the definition of a toy language? That’s a new term for me.

I (thankfully) haven’t written C++ in over 20 years, but if I had to drop back down to a low level language, it’d be C or a compile-to-C language. C is the language these things should target, in my opinion.


Systems developer here. I made an account to answer this, because the OP is 100% right about this.

A toy language is one which you do not allow to escape the play-room/sandbox, you play with it (because hey, you can, so why not..), but you would not actually use it.

The reason this is important for decentralization efforts is that the OS itself should be providing this functionality.

Sure, paint the pretty UI, but expose the entire thing to common OS-level abstractions, please, and then host the scripts, kthxbai ..

(A non-toy language is one you are very comfortable shipping. Javascript is that for a lot of people. Not for systems programmers: to us it is an application execution environment, not a place for things which must operate, as part of the system, without much user hassle ..)


> What is the definition of a toy language?

A language that doesn't require as much hair-pulling to learn, so you can't feel as smug for knowing it.


So... not Javascript then?


> What is the definition of a toy language? That’s a new term for me.

I would imagine most non-compiled, scripting languages. They are ok when building the business logic of an application on top of faster building blocks but not ok for application building blocks such as a database.


Would something like https://dqlite.io/ fit your needs? It's distributed thanks to Raft, with all the consequences that entails. It's not accurate to call it P2P, as it lacks node discovery et cetera. Yet, it's distributed.


I would play with some Python.


Me too! But the thing is, a vanilla C library can be used by anything. It's the lowest common denominator. I can easily call into a C library from Python, Perl,... MATLAB... and I don't know JavaScript but I bet it's not hard from there either. That can't be said about any other language.

Performance is oft cited as why systems software is written in C (or C++ with C bindings), but I think a much stronger reason is the universality. C runs on everything from big iron to the smallest microcontroller.
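
For what it's worth, the JavaScript case really isn't hard either. A minimal sketch, assuming the third-party ffi-napi package (my choice of FFI library, not something from this thread):

    // Sketch: call libm's ceil() from Node.js via ffi-napi.
    const ffi = require('ffi-napi');

    // Declare the C signature: double ceil(double)
    const libm = ffi.Library('libm', {
      ceil: ['double', ['double']],
    });

    console.log(libm.ceil(1.5)); // prints 2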


orbitdb runs on browsers, via WebRTC. It's much easier to embrace something that runs on a web page than something that you have to download and run.
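
For the curious, getting a database going is only a few lines. A sketch of the v0.2x-era API as I understand it (the repo path and database name are placeholders):

    // Sketch: an OrbitDB key-value store on top of js-ipfs.
    const IPFS = require('ipfs');
    const OrbitDB = require('orbit-db');

    async function main() {
      const ipfs = await IPFS.create({ repo: './ipfs-repo' });
      const orbitdb = await OrbitDB.createInstance(ipfs);

      const db = await orbitdb.keyvalue('example');
      await db.put('hello', 'world');
      console.log(db.get('hello'));       // 'world'
      console.log(db.address.toString()); // share this address with peers
    }

    main().catch(console.error);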


I don't see how people consider this and IPFS decentralized. You still need peers to subscribe and pin, and services like Pinata aren't cheap and are still centralized. Is there an example of websites that would continue to run indefinitely if there weren't a centralization of services with IPFS? I just don't think it is likely unless you start distributing your app bundled with IPFS, so that your users end up hosting it for you. However, if someone new tries it out and only one of your users has your app open, you'd better hope it is being cached by Cloudflare.


I'm looking at integrating IPFS with sandstorm (sandstorm.io). The sandstorm (and other homelab/homeserver) communities have an interest in self-hosting. IPFS allows people that are interested in seeing something stored to pin it themselves. But yes: bundling IPFS with the app is pretty much what would need to happen.

As they are now, neither the IPFS ecosystem nor sandstorm is really decentralized -- or maybe I should be more precise: not decentralized in a resilient way. Yet, I think both are great technologies that enable some interesting things.

I'm not as interested in the blockchain notion of decentralized, though. I'm looking for resiliency to severe netsplits. As a thought experiment, if Earth crosses a coronal mass ejection event and our power and communication grids go down, would we be able to stitch things back together again when our grid fractures?


I've been doing some experimentation along the same lines ... I've got a small proof-of-concept working where I can sync Earthstar databases between multiple Sandstorm instances - each "grain" acts as a pub, and you can use "webkeys" as pubs. I'm using powerbox-http-proxy to request capabilities for connections, so it works even with fully sandboxed client-side networking (ALLOW_LEGACY_RELAXED_CSP=false). It's maddeningly complex to do simple networking, but it's amazing to have the security layer there. I'll put together a blog post / video soonish.

https://github.com/jimpick/sandstorm-earthstar-foyer

Sandstorm is very interesting for privacy-focused peer-to-peer because it has the tools in place to sandbox networking. The networking security model is designed to prevent things from talking to each other without user approvals - so it's extra tricky to do peer-to-peer things with it.

I'm excited to try to stick some of my old IPFS/IPLD/libp2p/Dat/CRDT things into Sandstorm - there's a tonne of interesting things that could be done with it.


I was wondering about that. I suppose things that the user wants to share publicly can be shared (via powerbox), but everything else would not be.

Are you using the grain model for each individual file to be shared?

(I looked up your email and I would like to continue chatting about this).


Everything is in a grain, but I think the Earthstar applications could be decomposed to just handle a single document at a time.

I made a quick demo recording and updated the README with a link.

Video is here: https://bafybeih6kkv4rgmfada4kfc5thmtzheiurq7ck5uz4rchqkrxw7...


I was also looking at sandstorm. I even managed to install it on my homeserver. But it is sadly not really up to date. As a replacement I found HomelabOS, which is a bit more complicated (I think).

And I was also thinking about integrating IPFS into my homeserver stack. But instead of plain IPFS I was considering Sia coin. They offer cloud storage. So I was thinking about renting out 4 TB of space, and using whatever money I make with it to buy backup from Sia.

So in a sense I provide backups to others and in return I get free backup.

Sadly, work has been too intensive over the last month, so I haven't found much inspiration to complete the stuff above.


Curious about the homelab/homeserver community. Are there any sites with further information?


If you reddit, r/selfhosted and r/homelab are great places to start.

I really like the sandstorm.io platform, though it is not currently supported on an RPi. It has potential as a platform that “grandma” can use.

There is a project called HomelabOS. It reminds me a bit of Homebrew.


Cheers. I remember hearing SandStorm (the company) went under a couple years ago, and was curious but hadn't explored it in depth. I've been paying attention to CollapseOS as well, though it's a little more apocalyptic in nature.

The eventual aim is to be able to provision a set of servers from a thumb drive without Internet (if that's even possible), but I haven't had much time to get far.


I'll check out CollapseOS!

The sandstorm community is still going. The app developers are interested, but the platform itself has not received much love. My own goals are (1) getting ARM support (which includes customizing seccomp for ARM, patching the sandstorm app package format, and getting a good developer story), and (2) enabling IPFS.

A kind of practical user-facing goal for integrating IPFS would be to decentralize the sandstorm app store.

What I am finding is that the design goals for sandstorm look like it was originally intended to be deployed in the cloud, whereas I'd like to see individuals and communities empowered to run sandstorm for their local community.


Honestly, the biggest pain point around adding ARM support is that it'll be a pain to create ARM-compatible packages for all the apps. I definitely think the largest unanswered question in ARM support for Sandstorm.io is "how do we make it convenient for app developers to release packages for multiple architectures?"

That being said, I'd be super excited at the possible expansion of the community ARM support would bring.

By the way, regarding seccomp and the platform receiving love, there's actually a new experimental seccomp filter which operates as a whitelist instead of a blacklist: https://github.com/sandstorm-io/sandstorm/pull/3502


I'm not a security engineer, so the seccomp stuff isn't something I know a lot about. Given that limited experience, a whitelist sounds more sensible than a blacklist, and I presume that it would be more portable across different architectures.

I have a lot more confidence when it comes to tooling, packaging, testing, developer environments, CI/CD systems etc. since that is the kind of thing I do at work.

Within that old Github thread:

- A suggestion that there is an existing metadata field that could be repurposed for architecture

- There was a suggestion for using QEMU with the vagrant tool

I'm a bit weirded out by a packaging tool that tries to automatically look for all the files related to it -- though I think there might be a good reason for it. You do lose the ability to have reproducible builds and any kind of continuous integration or delivery. I kinda wonder if just switching to something like Nix would work better.

Reproducible builds would make it easier for the app ecosystem to be rebuilt on a different arch. ARM and RPi are great, but I expect RISC-V to gain traction over the next couple of years. As for my personal goal of helping develop resilient infrastructure, being able to quickly adopt new architectures would help a lot.

I have not floated this in the sandstorm-dev mailing list yet, but I'll get some conversation started over there.


Yeah, my understanding is that when ARM support is added, we'll add a field to the pkgdef file in which you can specify the architecture. Any apps missing that field can be assumed to be x64.

spk and vagrant-spk definitely take the "grab what the app needs" approach, but Ian is in agreement with you in wanting something more reproducible and well-defined. (vagrant-spk is mostly reproducible because VMs made with it should be roughly the same, spk is not remotely reproducible).

docker-spk is a tool he wrote that creates Sandstorm app packages from specially-crafted Docker images (you still have to be writing a Sandstorm package here, it won't convert a traditional Dockerfile to work on Sandstorm, for instance). He has also repeatedly expressed interest in building a nix-spk tool that uses Nix packages. ;)


There are multiple HN stories on CollapseOS; they were worth reading for the comments.


Very cool idea!


What do you think decentralized means? It means that hosting it isn't the responsibility of a single party.

If I put something online and people think it's interesting, they pin it. Even if I go down, the pinned content remains up.

Decentralized doesn't magically solve hosting availability. It just removes responsibility from a single party.
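
Concretely, taking on a share of that responsibility is one call against your own node. A sketch, assuming the ipfs-http-client package and a local daemon on the default API port:

    // Sketch: keep someone else's content alive by pinning it locally.
    const { create } = require('ipfs-http-client');

    async function pinIt(cid) {
      const ipfs = create({ url: 'http://127.0.0.1:5001' });
      await ipfs.pin.add(cid); // your node now stores and serves a full copy

      // List everything this node is keeping alive
      for await (const pin of ipfs.pin.ls({ type: 'recursive' })) {
        console.log(pin.cid.toString());
      }
    }

    pinIt(process.argv[2]).catch(console.error); // pass a CID as the argument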


I would say there are very few IPFS sites where hosting doesn't still end up being the primary responsibility of a single party. That was the point I was trying to make. If you have to pay someone else to pin it and keep giving them funds then it is still your responsibility. If you have a website that you want to scale and load fast for all your users on IPFS then it will still end up being your responsibility. There may be a few exceptions.


You're speaking from actual experience, aren't you? I tried out IPFS and came to the same conclusion myself. Content has to be kept alive or it disappears forever, and keeping it alive requires some kind of investment.

I like IPFS, I want it to succeed, but HN has these romantic ideas about IPFS that don't quite match reality, and you sometimes get downvoted for being honest. Not acknowledging the limits of IPFS is part of what holds it back...


Yes. I really enjoy IPFS and fully intend to use it in my next project, but there is still quite a bit to be desired. I think the IPFS cluster project may end up being what I'm looking for, but it currently has a few issues that need to be worked out:

https://github.com/ipfs/ipfs-cluster/issues/1226

Ideally, I could provide instructions for app users to follow my cluster, set the amount of content they want to share, track the size of content they've pinned and for how long, and provide rewards/incentives through the app for doing so.. similar to how private trackers work.


Of course. Just using IPFS doesn't make other people interested in your content. But if you have interesting content it does mean that people will be likely to save it.

You can imagine that if YouTube hosted its videos on IPFS then popular channels would be widely pinned, so that even if YouTube shut down and the channel author left, most of the videos would still be accessible. And much unlike current HTTP mirrors, you would be able to find out who is pinning it without having to check multiple archive sites. (Of course you need to know the hash, but remembering hashes is much less costly than mirroring whole sites yourself, so I think we would see lots of these.)

I think that in most cases it is simplest to think of IPFS as torrents. I think the biggest improvement that IPFS made was a single namespace for content instead of each torrent basically being its own swarm. This means that you can find content no matter how it is packaged and you can share blocks between "torrents".
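
The "find out who is pinning it" part maps roughly onto DHT provider records. A sketch, again assuming ipfs-http-client (providers are nodes currently offering the blocks, not necessarily permanent pinners):

    // Sketch: walk the DHT for providers of a given CID.
    const { create } = require('ipfs-http-client');

    async function whoHas(cid) {
      const ipfs = create();
      for await (const prov of ipfs.dht.findProvs(cid)) {
        console.log(prov.id.toString()); // peers advertising this content
      }
    }

    whoHas(process.argv[2]).catch(console.error);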


Would you call torrents decentralised?


To an extent, although many times you still rely on a centralized tracker and dedicated seeds. Unless you belong to a private tracker where users are forced to maintain certain ratios to continue using the service, in most cases a torrent will end up like an IPFS site without any pins.

If IPFS provided the ability to track pins from users of a site, like private trackers do, and provided rewards or some incentive, then I think IPFS would gain much wider adoption.


Most torrent clients also "pin" content a user downloads by default, with configuration options to stop seeding once certain conditions or upload ratios are met.

Does IPFS support something similar? e.g. pin this content until I have a 5:1 upload to download ratio
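
As far as I can tell there's nothing built in; the closest I can sketch is polling node-wide bandwidth totals and unpinning at a threshold. Assumes ipfs-http-client, and note that stats.bw covers the whole node, not one pin, so this is only a crude approximation:

    // Sketch: crude "seed to 5:1, then stop" policy.
    const { create } = require('ipfs-http-client');

    async function seedToRatio(cid, ratio = 5n) {
      const ipfs = create();
      await ipfs.pin.add(cid);
      for await (const bw of ipfs.stats.bw({ poll: true, interval: '10s' })) {
        if (bw.totalOut >= bw.totalIn * ratio) { // BigInt totals
          await ipfs.pin.rm(cid); // stop guaranteeing availability
          break;
        }
      }
    }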

---

I 100% agree on the ability to track pins. I want to see organizations be able to easily peer/pin content from other orgs they endorse; and for end users to be able to follow and help peer for organizations they support.

I want something like YouTube where I choose who appears in the related videos on my channel and can be a peer hosting them; and where my users can help to peer my videos (and I can help peer others').

As an end user, my client (or local server/cache) would prefetch the content I follow both to be a peer and to have it ready for me to access.

This would also help with locality of data; if 4 people in my household watch the same video or channel they would be able to get the content locally. If I'm an organization with a physical location, users connected to my network can fetch the content on a local link.


you're describing an architecture similar to a project I'm working on using orbitdb/ipfslog - it's not quite ready for sharing/using but there's more information here: https://www.reddit.com/r/musichoarder/comments/lrqx7m/record...


Thanks. Napster 3.0 here we come ;)


Some torrents. There are many torrents basically held up by a single peer, and those are not in practice decentralized. They could become so if more peers start seeding the torrent, but just because the architecture provides for decentralization does not make all torrents decentralized.


There is no technology possible that removes said responsibility. This is how peer to peer systems work. Your criticism is something that no peer to peer system claims to solve.


Actually, I recommend you check out Freenet [0], a project that's 20 years old. In fact, the Wikipedia page explicitly states:

> Information flow in Freenet is different from networks like eMule or BitTorrent; in Freenet:

> 1. A user wishing to share a file or update a freesite "inserts" the file "to the network"

> 2. After "insertion" is finished, the publishing node is free to shut down, because the file is stored in the network. It will remain available for other users whether or not the original publishing node is online. No single node is responsible for the content; instead, it is replicated to many different nodes.

[0] https://en.wikipedia.org/wiki/Freenet


Aren’t collaborative systems like that susceptible to abuse? E.g., an attacker could upload 1TB of noise and then shut down, consuming much more than 1TB of the network's capacity once it is replicated.


Of course, in Freenet, if a file is not requested often it will quickly fall off the network due to user churn.


He said that IPFS hosting is the "primary responsibility of a single party". The whole idea behind peer to peer is to avoid precisely that.


Not really: if you want something to be perpetually available in any p2p network, you need to ensure that you're still hosting it. That goes for torrents, ed2k, freenet, and ipfs too.


This isn't true of most blockchain p2p networks, e.g., Bitcoin. Anything uploaded to these lasts as long as the entire network does.

The reason this works is that it costs something to initially write the data to the network, which limits that kind of abuse.


And because there is money involved in making it last as long as the entire network.

Once the monetary incentive drops, so does the data.


It's possible for sure, but with Bitcoin and the overall crypto space steadily growing in value into the trillions over the course of over a decade, it seems unlikely.


I'm not claiming the reality of P2P always matches the ideas behind it.

And surely you're not saying that centralized hosting as a strict requirement was what people had in mind for P2P?


I'm saying perpetual availability wasn't what people had in mind for P2P.


Except for nearly all blockchain-based p2p networks.

Eg: https://cryptograffiti.info


I share your viewpoint and frustration with "decentralized" apps. But as you said it is indeed possible by bundling your app with IPFS and using something like orbitdb/ipfslog for structuring data. I linked this in my other comment, but take a look at this proof of concept: https://www.reddit.com/r/musichoarder/comments/lrqx7m/record...

At this point the major challenges are with UX, especially for users who are not accustomed to self-hosting. I think we have the technologies and ideas to overcome the rest.


I think that's the big problem with the "decentralized" WWW movement. There are about a billion conflicting definitions of decentralized providing very different properties, and it's never clear which definition people mean.


It is decentralized in that anyone can step up and host the content on IPFS and it is considered equivalent and secure. With a normal website, if the canonical host goes down, someone else can host it, but it isn't the same website.


IPFS, torrents and many old-school p2p-networks are decentralized to an extent, but they're also non-sybil-resistant and voluntarist, which makes them unsuitable for many use cases. They're not really a part of what is meant by decentralized web.

You should take a look at SkyDB to get an idea what's possible with true decentralization: https://blog.sia.tech/skydb-a-mutable-database-for-the-decen...


I feel like you don't understand the definition of "peer to peer."

The fact that you require nodes to ultimately do the hosting of data does not make IPFS centralized.


Swarm persistence is hard, if not close to impossible. Better that multiple parties pin the data than just one though. It's a step in the right direction.


Agreed. I started looking into setting up cluster followers, but currently you have to have enough space to pin everything and can't have it pin only a percentage.


This project may be dead.

There has been no activity in over a year, and the sponsoring company has had no presence in several years.


It's not dead, just in a bit of a lull due to funding, which is hopefully coming soon. We have a very active Gitter and OpenCollective.

Source: I'm one of the core maintainers


Is there any interest in supporting other languages, like how libp2p is being gradually ported to Rust, Go, and Java? I feel like OrbitDB could become the new data model of the web if it went multi-language. Or maybe it could be compiled to WebAssembly and embedded using Wasmer.


There's definitely interest. Berty has a Go implementation and the idea of a Rust implementation gets tossed around every season.

Can Webassembly read IndexedDB yet?


It seems so: https://github.com/nwestfall/BlazorDB

Although this might be using JavaScript interop instead of using WebAssembly directly.

Edit: check this out: https://www.ditto.live/

It’s a multi language CRDT implementation that seems to use WebAssembly to be available in multiple languages.


It's def not ded, it's been like this for as long as I can remember, but it just keeps slowly chugging along. I continually find myself coming back to it, especially as IPFS matures more.


Gun DB is a good replacement.

https://gun.eco/
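
For a taste of the API, a minimal sketch (the relay URL is a placeholder; gun.get/put/on is the core graph API as I understand it):

    // Sketch: Gun's graph API. Writes replicate to connected peers;
    // reads resolve locally first.
    const Gun = require('gun');
    const gun = Gun({ peers: ['https://your-relay.example/gun'] });

    gun.get('profile').put({ name: 'alice' });
    gun.get('profile').on(data => console.log(data.name)); // 'alice'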


I wonder if it would be possible to build a decentralized search engine with a decentralized crawl on top of it.


I've been wondering the same thing! Maybe if there is a way to share indexes, or some way to put out a call for searches through the DHT, the requestor could then aggregate the results (a rough sketch follows the examples below).

A couple examples:

Web crawler -- seeded through a curated or personal bookmark list.

Knowledge base -- the permaculture community have amassed a knowledge base on growing plants: USDA hardiness zones, heat zones, humidity, soil requirements, companion planting relationships, etc. That's something I can see people adding on to, particularly with non-standardized homegrown cultivars and how well they do in their local area.
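
One naive way to "put out a call for searches" could be IPFS pubsub: broadcast the query, let anyone holding a relevant index answer, and aggregate the replies. A sketch, assuming ipfs-http-client; the topic names and payload format are invented for illustration:

    // Sketch: broadcast a search query over pubsub and print replies.
    const { create } = require('ipfs-http-client');

    async function search(term) {
      const ipfs = create(); // assumes a local daemon with pubsub enabled

      await ipfs.pubsub.subscribe('decentralized-search-v0/results', (msg) => {
        console.log(msg.from, '->', new TextDecoder().decode(msg.data));
      });

      await ipfs.pubsub.publish(
        'decentralized-search-v0',
        new TextEncoder().encode(term)
      );
      // A real design still needs ranking, spam resistance, and
      // deduplication of results across responders.
    }

    search('permaculture hardiness zones').catch(console.error);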


Limewire had an interesting protocol for exactly this back in the day, over its Gnutella network. I don't know if it scales to search-engine sizes (most things don't, especially distributed ones), but there's at least some prior production-grade work in this space.

Edit: having just double-checked, it looks like Gnutella predates Limewire; that's just the avenue through which I came across it.


>> Maybe if there is a way to share indexes...

What does a search-engine index look like? Any designs in the open?


There is one: https://yacy.net/. However, it's slow and the last release is from 2016.


An interesting project is https://github.com/dappkit/aviondb, which is a MongoDB-like database on top of OrbitDB.


I wish there were something like this but with WebTorrent - I find torrents easier to grok for whatever reason.


I hope I can share something more along these lines here pretty soon. It's not just databases and blobs over torrent but a whole web-based application platform, based on Chrome, for native applications. The SDK will start with Swift, but the C interface can be used by other languages.


I think https://gun.eco/docs/ does a much cooler job.


Why do I always feel that things like this and IPFS come back to blockchain? I think the blockchain is a very interesting tool in a limited capacity. OrbitDB and IPFS seem to capitalize on their relationship to it.


OrbitDB doesn't rely on a blockchain at all


IPFS + (Arweave or Filecoin) behind it seems like the best decentralized web stack I've seen yet. I'm not sure how Orbit holds up.


Arweave has production-ready releases?


Yes. Check it out.



