That's kind of stupid-ambitious for 2019 when another 2019 goal is "a production-ready implementation" and IPFS has been around for 3 years already.
This isn't a roadmap, it's a wishlist. And I'm someone who wants to see IPFS succeed.
1. Define a P2P-ready data model and protocol (which they have, Merkle forest and everything)
2. Run a single server/cluster, make a very lightweight client library for that
3. Expand the server to a makeshift CDN (start simple, e.g. rsync-like mirrors)
4. Federate the CDN (still in a hierarchical fashion, so you always know whom you are talking to). Also, look for peers on the local subnet (broadcast is simple).
5. Expand that to a friend-of-a-friend network, use PEX-like peer finding (PEX is shockingly simple: https://en.m.wikipedia.org/wiki/Peer_exchange - a toy sketch follows this list)
6. Go full P2P, talk to strangers on the Net, use a DHT or anything.
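For anyone unfamiliar, here is a toy sketch of the PEX idea (the names and structure are mine, not from any BEP): every peer periodically gossips a sample of the peers it already knows to the peers it is connected to, and each recipient merges that sample into its own view of the swarm.

  import random

  class Peer:
      def __init__(self, addr):
          self.addr = addr
          self.known = set()      # addresses of peers we have heard about
          self.connected = set()  # Peer objects we currently talk to

      def pex_round(self, sample_size=10):
          # Gossip a random sample of known addresses to every connected peer.
          for other in self.connected:
              if self.known:
                  sample = random.sample(sorted(self.known),
                                         min(sample_size, len(self.known)))
                  other.receive_pex(sample)

      def receive_pex(self, addrs):
          # Merge whatever we were told about, except our own address.
          self.known.update(a for a in addrs if a != self.addr)

No trackers or DHT needed: as long as you know one peer, you eventually learn about the rest of the friend-of-a-friend neighborhood.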
The only trick is to start with a data model that can go all the way to (6), so backward-compatibility/installed-base issues don't stop you at the earlier stages. Think globally, act locally.
The dangerous thing is deploying an untested, complex codebase to serve the most hardcore use case for live customers on Day 1. That may not work, because even all these shockingly simple steps will turn out quite long and tedious in practice. There will be "issues". Advancing one step a year is quite impressive if done under full load.
May not be ultralight (running a DHT node isn't, but it does remove a lot of cruft).
I have seen IPNS take 5-10 minutes to resolve a single address, and go-ipfs with a few pinned files take more than 5 GB of memory after running for a day.
Yeah, I don't understand the hype around IPFS when it performs so badly. After reporting the issue and having a conversation with the devs, I got the impression that they are just dilettantes.
> Early on in the language history we used IPFS to distribute the Dhall Prelude, but due to reliability issues we’ve switched to using GitHub for hosting Dhall code.
I wish IPFS the best, because at least in theory this seems like a perfect use case
it seems like people working on package management/distribution related things LOVE the idea of IPFS
Nix + IPFS has been tried before, but what was missing was the Nix-side hierarchical content addressing of large data (not just plans). With the "intensional store" proposal, this should finally happen. Please push it along.
For maximum interop, data shouldn't be hashed like Nix's NARs or IPFS's UnixFS. Instead, please go with git's model, for all its problems.
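For reference, a minimal sketch of how git addresses content (this is git's actual blob hashing; trees and commits are layered on top of it in the same style):

  import hashlib

  def git_blob_hash(data: bytes) -> str:
      # git hashes "blob <size>\0" + content with SHA-1, so identical file
      # contents always get identical object IDs, independent of chunk size
      # or archive/container format.
      header = b"blob %d\x00" % len(data)
      return hashlib.sha1(header + data).hexdigest()

  # git_blob_hash(open("foo", "rb").read()) matches what `git hash-object foo` prints.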
Thanks, hope someone can take me up on this because I'm up to my neck with other open source stuff already.
However, I would love to see a reference implementation that works at a minimum and doesn't just drain your computer of every last resource it has. If this is how close the reference implementations are to "production-ready" status, then I think that goal will never be achieved.
How does an IPFS powered website do dynamic content? User sessions? Is all the client's session data encoded in the IPFS address itself?
Even if there's no user sessions, but the page content updates, how do you continuously point clients to fetch the right updated page (e.g. how would you implement a Hacker News style aggregator that updates every minute)?
IPFS does static content just fine - content-addressed stores are wonderful for that - but websites are much more than static content.
I would love to see all these problems solved and IPFS working very well in the next few years, but I'm afraid the IPFS people are very good at making press releases and presentations, and not so good at delivering really good software, as they claim to.
Anyway, they have no obligation to deliver anything to anyone -- except maybe the people who entered the Filecoin ICO.
Now, obviously, they're under no obligation to deliver anything. But I'm trying to understand what you mean when you say:
> The vision of an IPFS-powered web working is beautiful
Actually, I think the fact that the IPFS developers are trying to replace HTTP is one of the reasons they fail so badly at producing a good IPFS even for static content. They try to integrate much more stuff into the protocol than is actually needed.
AIUI, that's the problem IPNS is designed to solve (https://github.com/ipfs/specs/blob/master/naming/README.md#i...). HN controls its private key, so it is the only party that can update the record published under the hash of its public key, and those IPNS records have nanosecond-precision expiry timestamps and TTLs, meaning HN can update at whatever frequency it chooses.
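Roughly, the go-ipfs workflow would look like this (a sketch; exact flags and defaults may differ between versions):

  ipfs add -Q frontpage.html                      # -> /ipfs/<new content hash>
  ipfs name publish --lifetime=24h /ipfs/<hash>   # signed record under your peer ID
  ipfs name resolve /ipns/<peer-id>               # clients follow the mutable pointer

Every update republishes the IPNS record to point at the latest immutable snapshot, so /ipns/<peer-id> stays stable while the content behind it changes.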
I agree with the sibling comments that (at least as the IPFS is currently specified) having a user session would be problematic. It's theoretically possible that _your_ HN front page would have an IPNS record of IPFS://mdaniel.news.ycombinator.com and then we're back to the aforementioned expiry semantics. Upvotes would have to travel out of the IPFS network, but in some sense, I think that's expected since one wouldn't want an upvote to be archived, but rather the resulting content to be
There are a ton of weird perspective changes when thinking about the content addressable web, but it might not be the 2050-esque far away that it seems
For instance, see P2P Reddit ( http://notabug.io ) which:
- Is running in production with GUN.
- Handles about 42,000 monthly visitors. ( https://www.similarweb.com/website/notabug.io )
- Has done ~1 TB of decentralized traffic in a single day.
You can then configure GUN to save to IPFS as the blob/storage engine (or we have filesystem, localStorage, IndexedDB, S3, etc. as options).
It is basically like having a P2P Firebase :) with IPFS plugin.
I popped on notabug.io just now, and the chat is full of swastikas, racial slurs, and the most up-voted posts are primarily anti-gay misinformation.
I see you wrote GUN, which I'm sure was no small feat, and it looks like an impressive piece of technology. How do you feel about your tech being primarily used in this way?
The guy actually built a P2P moderation tool (everyone has their own "glasses" that filter based on their own policy).
notabug.io is supposed to be running a curated homepage (nab.cx is the uncurated one), but I think he said it broke a week ago when he was modularizing his code.
NAB usually isn't this bad; a lot of the people on it are anti-alt-right, but alt-righters certainly can drown them out.
But it does look like NAB has become more toxic over time. :(
Previously, I was pretty neutral "do what you want with it".
Now, I shifted to be more opinionated about what I want to see built on top.
Primarily, apps that spread Open Source economics through art (music, etc.), to draw a crowd of lovers/creators not haters/destroyers.
But it's hardly surprising:
"The trouble with fighting for human freedom is that one spends most of one's time defending scoundrels. For it is against scoundrels that oppressive laws are first aimed, and oppression must be stopped at the beginning if it is to be stopped at all."
In this case, it means that you'll see people who are predominantly censored from other places already use this tech to avoid further censorship. In US right now, at least, that tends to be alt-right, white nationalists etc. For a more detailed take:
Warning: image contains racial slurs and swastikas: https://i.imgur.com/0kdxzVR.png
A little later in the chat, the developer chimes in and talks about some stuff. Then some people angrily accost them for censoring things. If the Nazis and the racists aren't censored, I don't want to think about the odious content that is.
Edit: Here's a thread from... yesterday, where a bunch of users are mad about the "censorship" of the developer hiding a swastika post from the front-page (not even deleting it or removing it from whatever their equivalent of a sub-reddit is): https://notabug.io/t/whatever/comments/509b9189ece85515671d3...
You can call this "decentralized reddit" a bad place, as it really is, but you can't say it's because it's "full of right-wing people". These adolescents are not "right-wing people".
At some point... is there a difference? If you find yourself in a group "ironically" screaming you all support X for long enough, soon you'll find that some of you actually support X. And that you enabled those people.
I wonder what's changed for early adopters of new tech compared with the original internet in the early 90's? Early adoption was still demographically skewed towards certain groups, but I don't recall this brand of right-wing thought being so prominent.
Somehow I forgot about it.
As you say:
> I think all censorship should be deplored. My position is that bits are not a bug.
> — Aaron Swartz (1986 - 2013)
Edit: But I like https://nab.cx/ better. Or at least, as an alternative.
SSB = Social-like P2P data.
GUN = Firebase-like P2P data.
SEA = End-to-end encryption. ( https://gun.eco/docs/SEA )
DAT = GIT-like P2P data.
IPFS = Images/assets.
WebTorrent = Video-like P2P data.
Only saw your comment now, will reply on Twitter too - long time no see since #hashtheplanet !
Notabug basically does everything Reddit does + more. Would that qualify as a larger app?
What in particular do you think would be difficult?
Perhaps a worldwide swarm of drones creating a mesh network.
Satellites are too big of a target, and not very transparent; e.g. we wouldn't know if someone went up there and installed some snooping hardware. The same can be said about drones, but with proper swarms the chances of connecting to a compromised drone would be lower.
e.g. see https://github.com/ipfs/notes/issues/269
So you can't simply interop between the two without some sort of lookup to convert hash functions. It is hugely frustrating, as basically different content-based distribution mechanisms can't work together.
In theory docker image registries support pluggable hash functions, although it is not clear to me that the ipfs function is even very well defined outside its own code. We could start to add a second hash calculation to every registry operation, but it would be a performance hit which some users would not like.
(Tree-based content hashes that allow parallelisation are nice, but the IPFS one is very IPFS-specific and more complex than it needs to be, I think.)
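To make the mismatch concrete, here is a sketch (the framing is mine): a registry digest is just the sha256 of the blob bytes, while an IPFS identifier is a multihash over a chunked Merkle DAG, so you can't derive one from the other without re-reading and re-hashing the content.

  import hashlib

  def registry_digest(blob: bytes) -> str:
      # Docker/OCI registries address a layer by the plain sha256 of its bytes.
      return "sha256:" + hashlib.sha256(blob).hexdigest()

  # The IPFS hash of the same bytes is NOT this value: IPFS splits the data
  # into blocks, wraps them in UnixFS/DAG nodes, and hashes the resulting root
  # node, so interop needs either a lookup table between the two identifiers
  # or a second hash computed at registry push time.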
Would someone mind enlightening me regarding what sets IPFS apart from torrents?
For example, unlike torrents, you can seed a collection like "My Web Show (All Seasons)" and add new files as new episodes become available. With torrents, you have to repackage them as new torrent files. IPFS also then encourages file canonicalization instead of everyone seeding their own copy of a file.
I think what makes IPFS interesting is that all files are like torrents and all folders are like torrents of torrents.
And since each "torrent" is a hash of the file underneath it, if 100 people individually add files or folders that contain identical chunks, then without explicitly doing anything they are all helping each other share those files.
Update: there is apparently a "BitTorrent v2" protocol, which allows this kind of file-level sharing. It is still not implemented in major clients like libtorrent.
The data was immutable, so we didn't have that use-case. The tracker software we were using (one of the often used open source C++ ones) seemed to handle a couple hundred torrents just fine, but couldn't handle tens of thousands. Even if only a few were active. I'm not sure if it was excessive RAM or high CPU, but they built a wrapper tool to expire and re-add torrents as needed. I think technically it was limiting the number of seeds (from the central server) for different torrents.
There was also a lot of time/overhead in initiating a new download. This was exacerbated by the kludge mentioned above. Client would add the torrent, you would trigger a re-seed, then the client would wait awhile before checking again and finding the seed. Often this dance took much longer than the download itself.
(Add a file to mydir/)
ipfs add -r mydir/
That's it: two different hashes for different contents, but intelligently deduplicated, so if you already have the files in the former you only need to download the diff.
What is the benefit here?
Say you distribute "Julie's Webcast Complete Series" and somebody else distributes "Julie's Webcast - Episode 3, with Russian subtitles," peers and seeders from both distributions can share data for the shared content. Similarly, updating a dataset only requires downloading the new data.
This is done automatically, with both per-file hashing and (optionally; I'm not sure of the current state) in-file block hashing.
> peers and seeders from both distributions can share data for the shared content
So does IPFS have "plugins" for different archive/container formats so it can "see" that the underlying video/audio streams are identical between "Julie's Webcast - Episode 3.mp4" and "Julie's Webcast - Episode 3, with Russian subtitles.mkv"?
Otherwise container stream interleaving will play holy hell with any sort of "dumb" block hashing :(
> go-ipfs-chunker provides the Splitter interface. IPFS splitters read data from a reader an create "chunks". These chunks are used to build the ipfs DAGs (Merkle Tree) and are the base unit to obtain the sums that ipfs uses to address content.
> The package provides a SizeSplitter which creates chunks of equal size and it is used by default in most cases, and a rabin fingerprint chunker. This chunker will attempt to split data in a way that the resulting blocks are the same when the data has repetitive patterns, thus optimizing the resulting DAGs.
I think they should use the rolling-hash-based chunking by default.
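For readers unfamiliar with the idea, here's a toy sketch of content-defined (rolling-hash) chunking - not the Rabin implementation go-ipfs actually ships, just an illustration of why boundaries survive insertions, unlike fixed-size splitting:

  def chunk(data: bytes, window=48, avg_bits=13, max_chunk=256 * 1024):
      # Cut a chunk whenever the low `avg_bits` bits of a rolling hash over the
      # last `window` bytes are zero (chunks of ~2**avg_bits bytes on average).
      # Because a boundary depends only on nearby bytes, inserting data near the
      # start of a file only changes the chunks around the edit; later chunks
      # (and therefore their hashes) stay identical and deduplicate.
      B, M = 257, (1 << 61) - 1        # base and modulus of the rolling hash
      Bw = pow(B, window, M)           # B**window, used to roll the oldest byte out
      chunks, start, h = [], 0, 0
      for i, b in enumerate(data):
          h = (h * B + b) % M
          if i - start >= window:
              h = (h - data[i - window] * Bw) % M
          boundary = i - start + 1 >= window and (h & ((1 << avg_bits) - 1)) == 0
          if boundary or i - start + 1 >= max_chunk:
              chunks.append(data[start:i + 1])
              start, h = i + 1, 0
      if start < len(data):
          chunks.append(data[start:])
      return chunks

With fixed-size chunks, one inserted byte shifts every block boundary after it and nothing downstream deduplicates.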
Which raises the question: why fork the DHT in the first place? There are BEP drafts that cover all of the features that IPFS (and DAT, for that matter) bring to the table.
My guess: there isn't a lot of money in making yet another BitTorrent client.
There may be a variety of reasons to delete things,
- Old packages that you simply don't want to version (think npm or pip)
- Content that is pirated or proprietary or offensive that needs to be removed from the system
But in its current avatar, there isn't an easy way for you to delete data from other people's IPFS hosts if they choose to host your data. You can delete it from your own. There are solutions proposed with IPNS, pinning, etc., but they didn't really seem feasible to me last I looked around.
This list, as @fwip said, is great as a wishlist, but I would love to see this roadmap also address some of the things needed to make this a much more usable system.
If you put it on IPFS, it's not "your data" any longer. If that doesn't work for you, then don't use IPFS.
Edit: I do get why people are concerned about persistence of bad stuff. But it's not at all unique to IPFS. And even IPFS forgets stuff that nobody is serving. I mean, try to find these files that I uploaded a few years ago: https://ipfs.io/ipfs/QmUDV2KHrAgs84oUc7z9zQmZ3whx1NB6YDPv8ZR... and https://ipfs.io/ipfs/QmSp8p6d3Gxxq1mCVG85jFHMax8pSBzdAyBL2jZ.... As far as I can tell, they're just gone.
It’s important to think of IPFS as a way to share using content hashes - essentially file fingerprints - as URLs. Every bit of information added is inherently and permanently versioned.
This is a tremendous asset in many ways, for example de-duplication is free. But once a file has been added and copied to another host, any person with the fingerprint can find it again.
While IPFS systematically exacerbates the meaningful problems around deletion that you describe, they are not unique. Once information is put out in the world, it’s hard to hide it.
That's not at all unique to IPFS though - in fact, this is what the ni:// (Named Information) URI scheme is supposed to be used for: https://tools.ietf.org/html/rfc6920
(Depending on whether the hashes being used are properly filed with the NI IANA Registry, some IPFS paths might already be interconvertible with proper ni:// format, though with some caveats. sha256 hashes are definitely supported in both, though ni:// does not use the custom BASE58BTC encoding found in ipfs paths. Moreover, ni:// does not standardize support for file-level paths as found in ipfs, but does support Content-Type, which ipfs seems to leave unspecified. Files larger than 256k in IPFS are a whole other can of worms however, as you apparently lose the ability to lookup by sha256 hash of the whole content, and thus to properly interoperate with other mechanisms.)
Also, nitpicking, but a content hash defines a URI, not merely a URL, since its use is not restricted to locating resources over a network.
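For the curious, a tiny sketch of constructing an RFC 6920 name, assuming you have the plain sha-256 of the whole content (which is exactly the part that breaks down for large, chunked IPFS objects):

  import base64, hashlib

  def ni_uri(data: bytes) -> str:
      # RFC 6920 format: ni:///<alg>;<base64url digest, padding stripped>
      digest = hashlib.sha256(data).digest()
      b64 = base64.urlsafe_b64encode(digest).rstrip(b"=").decode()
      return "ni:///sha-256;" + b64

  # An IPFS path for the same bytes instead base58-encodes a multihash of the
  # object's DAG node(s), which is why the two are only sometimes convertible.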
A way to mark something as deleted, in the form of another IPFS object, could serve as a soft delete, and also as permission to actually delete chunks of the object marked as deleted.
This, of course, is not secure deletion, and should not be.
As far as I'm aware, any node is free to delete its own data. In fact, isn't it only storing data at the user's explicit request in the first place? It just can't do anything about what other nodes choose to keep or delete. If you're referring to a particular node wanting to do a best-effort deletion of data on other nodes, it's not clear to me why node A cares about reducing storage on nodes B, C and D (if you own those nodes, delete it yourself; if you don't... then I don't know what you're up to).
On the other hand, if you can decide which information stays, you essentially own the system.
So, given the project's mission, I guess it is a requirement that information stays online as long as someone somewhere is willing to keep it.
I definitely agree from a technical level, you can't ever guarantee the deletion of files on somebody else's machine.
So the question is, how do we build systems that enable users to protect their communities, without them becoming yet another tool for abuse?
If I put goatse in this comment, others can see it. Then mods will remove it.
If it were on IPFS, the same would happen except the old version may still be accessible if people are still distributing it.
You'd only come across it if you were deliberately looking at an older version.
If the ipfs-based application doesn't have some means (by a group of administrators, or by community consensus) to orphan (de-link) user-contributed content that's abusive, then that app needs to be improved. It doesn't seem like an IPFS-protocol-layer problem.
You could add another layer on top of IPFS, something like IPFS-O (ownership) or IPFS-C (censorship), which used either a separate DHT or a centralized service to allow people to register never-before-seen block hashes by generating a keypair and uploading a signature of the block hash. If the content later needed to be removed, the signature could include contact information, and the signer could be appealed to (or sent a court order) to sign a removal message. It wouldn't really work, though. Nobody would run ipfs nodes paying attention to such a meta-service. And if the service were centralized, there would be grave concern over censorship by whatever entity ran it. And people could abuse the service by registering data blocks that haven't shown up in IPFS yet, claiming ownership when they don't really own it. You'd have to go to the courts to resolve that, the courts could issue an order to a centralized operator, if it's centralized, but again, nobody would run IPFS nodes respecting such a centralized censorship-enabling service, so the whole thing would be futile. And if the censorship service were decentralized, it could be sabotaged by enough libertarian nodes refusing to store or pass along removal messages.
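To make that hypothetical a bit more concrete, here's a tiny sketch of the registration/removal records such a meta-service might store (entirely illustrative - the service doesn't exist; the library used is PyNaCl and all the names are mine):

  from nacl.signing import SigningKey

  block_hash = b"QmSomeExampleBlockHash"   # placeholder, not a real IPFS hash

  # Claim ownership of a never-before-seen block hash by publishing a signed
  # "register" record along with the public key and contact info.
  owner = SigningKey.generate()
  register = owner.sign(b"register:" + block_hash)

  # A takedown is just another record signed by the same key; nodes that opted
  # into the meta-service would verify it against the public key from the
  # original registration and then stop serving the block.
  removal = owner.sign(b"remove:" + block_hash)
  owner.verify_key.verify(removal)          # raises BadSignatureError if forged

Which also makes the failure modes obvious: nothing forces nodes to consult the service, and nothing stops someone from "registering" blocks they don't actually own.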
So maybe don't build a Hacker News clone on top of IPFS? It's not meant to be a solution to every problem.
This page would be an IPNS address under the admin's control which would point to some IPFS hash representing the current goatse-containing state of the page. The admin would then create a new page, which would get a new IPFS hash (since it's new content) and point the IPNS address to it.
As it turns out, you cannot ever really delete information, you can simply change where your "well known" pointers point to.
Or you could just not look at it.
"One of the biggest challenges with [HTTP|ZFS|TLS|USB|ATX|VHS|USPS] in my mind is the lack of a story around how to delete content."
If you delete your copy of some data, someone else may still have theirs, but then it's them who controls whether to delete it. It's not a challenge for Serial ATA that it doesn't have a function to delete certain data from every hard drive in the world at the same time. Most systems don't work that way, not least because it's inherently dangerous.
Do you want yet another way of serving content that is subject to censorship and ridiculous content takedown policies?
Start by accepting that, and everything starts to make sense.
But the problem is that if you can delete that stuff, someone else can delete stuff that you don't want to be deleted.
The OP was suggesting a way to delete things from other peoples machines.
Once you accept that, you can focus on reducing the output of compromising information, rather than trying to erase it after the fact. Prevention over cure. This will inevitably lead to a society where people do more of what society wants, and less of what society doesn't. This is a good thing.
Also, I feel like we don't collect as much information as we should. Analytics is a lot less comprehensive than it should be. Other than CCTVs, real-world analytics is basically non-existent. This greatly inhibits progress in AI/ML.
However, whenever this topic comes up here at HN, we get a bunch of people who say they tried to use it but it was basically unworkable, like too much RAM usage and various sorts of failures. And rarely does anyone respond by saying that it is working just fine for them.
So my question to the IPFS people is: when is it going to get really usable? I am asking for something reasonably specific, like 2 or 3 years, or what? And I suppose that would mean a different promise/prediction for each main use case. So how about some answers, not just "We are aware of those problems and are working on them".
It seems to do the same thing and works already but hardly gets any press. IPFS and the Protocol Lab's Filecoin sale seemed to generate a lot of marketing despite it becoming clearer later that Filecoin is for an unrelated incentivized network.
It is hard to understand the pros and cons of choosing IPFS over Swarm, or where they are in their comparative development cycles.
I know many decentralized applications that opt for IPFS for their storage component, and know of the libraries to help different software stacks with that. But I can't tell if it is right for me, versus the state of Swarm.
Swarm is not at all "working already" - the incentivisation layer for nodes to store data for other users is not implemented and currently mostly theoretical and work-in-progress.
IPFS is more mature in comparison to Swarm, but the underlying architecture is rather different.
I see things being stored on Swarm without incentives, like plain text
I believe the chapters about PSS, Swarm Feeds, ENS, Architecture, among others, are mostly up-to-date.
You can read about the incentivisation layer at https://swarm-gateways.net/bzz:/theswarm.eth/ethersphere/ora...
Currently incentivisation is not integrated or implemented in Swarm, so a user has no guarantees about what happens with their uploaded content. If the node hosting it disconnects from the network, it will be gone. The plans to address this are through the sw^3 protocols suite and/or erasure coding.
Regarding plain text - it doesn't really matter what bytes you store in Swarm - encryption is implemented and you can store non-encrypted or encrypted bytes, this has nothing to do with incentives for persistent storage.
We try to do outreach and answer community questions when possible, but the team is not big and this is currently done on a best-effort basis. We could definitely improve on that front, I agree.
For example, if in the future Debian is moved to IPFS, then many organizations are likely to run local IPFS servers with the Debian repos pinned. But if Debian is moved to Swarm, I do not think many organizations will be incentivized - the money is insignificant in their total spending, while the engineering effort and organizational (finance) overhead are likely to be very big.
"If Debian is moved to Swarm, then many organizations are likely to run local Swarm servers with Debian repos pinned"
I’m a giant advocate for decentralized architectures, but so far I’ve never found a use for them that doesn’t rely on a centralized way to find out about new data.
> Inter-Planetary Name System (IPNS) is a system for creating and updating mutable links to IPFS content. Since objects in IPFS are content-addressed, their address changes every time their content does. That’s useful for a variety of things, but it makes it hard to get the latest version of something.
If you know the hash of the next version and put that in the file, then that hash will affect the hash/address of the file itself.
But the hash of the next version depends on the hash of the _next_ next version, and so on out to infinity... and you almost certainly don't know all those hashes, so you can't compute the hash of the next version, so you can't compute the hash of the current version, if it must contain the hash of the next version.
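Put differently, a trivial sketch just to show the circularity (the function is hypothetical, not anything IPFS actually does):

  import hashlib

  def address(content: bytes, next_version_hash: bytes) -> bytes:
      # If the pointer to the next version lives inside the content,
      # the current address depends on that pointer...
      return hashlib.sha256(content + next_version_hash).digest()

  # ...but next_version_hash = address(next_content, next_next_hash), which in
  # turn needs address(next_next_content, ...), and so on forever. You can never
  # bottom out, which is why mutable pointers have to live outside the
  # content-addressed data (IPNS, DNS TXT records, etc.).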
Also, if IPFS's idea of working as a local server were sound, BitTorrent DNA (a browser plugin, streaming video over BitTorrent) should have worked.
It seems to me they suffered from NIH syndrome. They tried to reinvent the wheel. P2P file transfer over IP has already been covered by BitTorrent. What we need is a nice front end that uses the BitTorrent protocol as the back end and offers the illusion of a website.
This is not a criticism. That's describing a feature. Yes, they do have that. You could implement your own name resolution in a different way if you need that.
It's amazing how many things that are popular now are simply re-discovery of Freenet features.
- but this is decentralized
- it is not a bug / lack of implementation but a feature