Hacker News

One of the biggest challenges with IPFS in my mind is the lack of a story around how to delete content.

There may be a variety of reasons to delete things,

- Old packages that you simply don't want to version (think npm or pip)

- Content that is pirated or proprietary or offensive that needs to be removed from the system

But in its current avatar, there isn't an easy way for you to delete data from other people's IPFS hosts in case they choose to host your data. You can delete it from your own. There are solutions proposed with IPNS and pinning etc - but they don't really seem feasible to me last I looked around.

As @fwip said, this list is great as a wishlist - but I would also love to see this roadmap address some of the things needed to make this a much more usable system.




> But in its current avatar, there isn't an easy way for you to delete data from other people's IPFS hosts in case they choose to host your data.

If you put it on IPFS, it's not "your data" any longer. If that doesn't work for you, then don't use IPFS.

Edit: I do get why people are concerned about persistence of bad stuff. But it's not at all unique to IPFS. And even IPFS forgets stuff that nobody is serving. I mean, try to find these files that I uploaded a few years ago: https://ipfs.io/ipfs/QmUDV2KHrAgs84oUc7z9zQmZ3whx1NB6YDPv8ZR... and https://ipfs.io/ipfs/QmSp8p6d3Gxxq1mCVG85jFHMax8pSBzdAyBL2jZ.... As far as I can tell, they're just gone.


If IPFS becomes famous to the point that governments have to look at it, and it allows people to bypass laws, they will try to forbid running nodes by law, the same way some countries have with Tor.


Try and succeed are different things. The larger the use case set, the tighter the integration with everything else, the stronger the reliance on it, the harder it will be to outlaw it. Often the laws drift ever-so-slightly to accommodate the new reality.


> Old packages that you simply don’t want to version

It’s important to think of IPFS as a way to share using content hashes - essentially file fingerprints - as URLs. Every bit of information added is inherently and permanently versioned.

This is a tremendous asset in many ways, for example de-duplication is free. But once a file has been added and copied to another host, any person with the fingerprint can find it again.
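To make the "de-duplication is free" point concrete, here is a toy Python sketch of a content-addressed store (a plain dict stands in for the network; real IPFS uses multihashes and chunked DAGs, not bare sha256 hex digests):

```python
import hashlib

# Toy content-addressed store: the key is a hash of the content itself,
# so adding the same bytes twice stores only one copy -- de-duplication
# falls out of the addressing scheme for free.
store = {}

def add(content: bytes) -> str:
    key = hashlib.sha256(content).hexdigest()
    store[key] = content  # same content -> same key -> no duplicate entry
    return key

first = add(b"hello ipfs")
second = add(b"hello ipfs")  # re-adding the same bytes changes nothing
print(first == second, len(store))  # True 1
```

It also shows the flip side the parent describes: anyone holding `first` can fetch that content again for as long as any node keeps it.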

While IPFS systematically exacerbates the meaningful problems around deletion that you describe, they are not unique. Once information is put out in the world, it’s hard to hide it.


> It’s important to think of IPFS as a way to share using content hashes - essentially file fingerprints - as URLs.

That's not at all unique to IPFS though - in fact, this is what the ni:// (Named Information) scheme is supposed to be used for: https://tools.ietf.org/html/rfc6920

(Depending on whether the hashes being used are properly filed with the NI IANA Registry, some IPFS paths might already be interconvertible with proper ni:// format, though with some caveats. sha256 hashes are definitely supported in both, though ni:// does not use the custom BASE58BTC encoding found in ipfs paths. Moreover, ni:// does not standardize support for file-level paths as found in ipfs, but does support Content-Type, which ipfs seems to leave unspecified. Files larger than 256k in IPFS are a whole other can of worms however, as you apparently lose the ability to look up content by the sha256 hash of the whole file, and thus to properly interoperate with other mechanisms.)
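For illustration, a rough sketch of what such a conversion could look like for the simple CIDv0 case (the function names are made up; CIDv1, error handling, and the large-file caveat above are ignored). A CIDv0 like "Qm..." is base58btc of the multihash bytes `0x12 0x20 || sha256-digest`, and RFC 6920 carries the same digest as unpadded base64url:

```python
import base64

# Hypothetical CIDv0 -> ni:// converter (illustrative only, not a real tool).
B58 = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def b58decode(s: str) -> bytes:
    n = 0
    for ch in s:
        n = n * 58 + B58.index(ch)
    raw = n.to_bytes((n.bit_length() + 7) // 8, "big")
    # each leading '1' in base58btc encodes a leading zero byte
    return b"\x00" * (len(s) - len(s.lstrip("1"))) + raw

def cidv0_to_ni(cid: str) -> str:
    mh = b58decode(cid)
    # CIDv0 multihash prefix: 0x12 = sha2-256, 0x20 = 32-byte digest
    if mh[:2] != b"\x12\x20" or len(mh) != 34:
        raise ValueError("not a sha2-256 CIDv0 multihash")
    digest_b64 = base64.urlsafe_b64encode(mh[2:]).rstrip(b"=").decode()
    return f"ni:///sha-256;{digest_b64}"

print(cidv0_to_ni("QmYwAPJzv5CZsnA625s3Xf2nemtYgPpHdWEz79ojWnPbdG"))
```

Note the caveat from above still applies: for chunked files the digest inside the CID covers a DAG node, not the raw file bytes, so the resulting ni:// URI names the DAG node rather than the file.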

Also, nitpicking but a content hash defines a URI not merely a URL, since its use is not restricted to looking up resources over a network.


Universal data removal is not solved (or even wanted) at internet scale. So it sounds weird to demand it from a technology that is trying to solve a completely different problem.


A best-effort deletion could be beneficial for any node. It reduces storage requirements a bit.

A way to mark something as deleted, in the form of another IPFS object, could serve as a soft delete, and also as permission to actually delete chunks of the object marked as deleted.

This, of course, is not secure deletion, and should not be.
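As a sketch of how such a soft-delete marker might look (a toy model: the `tombstone_for` JSON payload is invented for illustration, and real IPFS has no such convention):

```python
import hashlib
import json

# A "tombstone" is just another content-addressed object whose payload marks
# a target hash as deleted. Cooperating nodes MAY reclaim the target's
# storage; nothing forces them to, so this is not secure deletion.
store = {}

def put(data: bytes) -> str:
    h = hashlib.sha256(data).hexdigest()
    store[h] = data
    return h

target = put(b"old package tarball")
tombstone = put(json.dumps({"tombstone_for": target}).encode())

# A node that honours tombstones garbage-collects the marked objects:
marked = {json.loads(v)["tombstone_for"]
          for v in list(store.values()) if v.startswith(b'{"tombstone_for"')}
for h in marked:
    store.pop(h, None)

print(target in store, tombstone in store)  # False True
```

The tombstone itself stays around (it is content like anything else), which is exactly the soft-delete semantics described above.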


>A best-effort deletion could be beneficial for any node. It reduces storage requirements a bit.

As far as I'm aware, any node is free to delete its own data. In fact, isn't it only storing data at the user's explicit request in the first place? It just can't do anything about what other nodes choose to keep or delete. If you're referring to a particular node wanting to do a best-effort deletion of data on other nodes, it's not clear to me why node A cares about reducing storage on nodes B, C and D (if you own those nodes, delete it yourself; if you don't... then I don't know what you're up to)


The same argument applies to self-driving cars, crashing into objects on the road is not a problem because they are trying to solve an entirely different problem.


What you actually want is to break the universe. It is physically impossible to revoke information unless it happens by a strange coincidence. 24x7, you emit information that races away at the speed of light. You can't chase it down. Physically.

On the other hand, if you can decide which information stays, you essentially own the system.

So, given the project's mission, I guess it is a requirement that information stays online as long as someone somewhere is willing to keep it.


I doubt there will ever be a way to delete content, as every legitimate method of deleting will be commandeered for censorship. Even if they did add something, you can never really know the other nodes actually deleted it.


Sometimes, censorship is good. For a silly example, if somebody somehow filled this comment section with images of goatse, it would be nice if we could take that down.

I definitely agree from a technical level, you can't ever guarantee the deletion of files on somebody else's machine.

So the question is, how do we build systems that enable users to protect their communities, without them becoming yet another tool for abuse?


I don't see how that would be a much different problem to now.

If I put goatse in this comment, others can see it. Then mods will remove it.

If it were on IPFS, the same would happen except the old version may still be accessible if people are still distributing it.

You'd only come across it if you were deliberately looking at an older version.


Why shouldn't that be handled at the application level? Just like with git, if there's content you don't want anymore, you orphan the block hashes in whatever structure the application uses to store and display content; IPFS nodes could still store the abusive comments or media, but nobody would find it unless they were looking for it or randomly fetching blocks.

If the ipfs-based application doesn't have some means (by a group of administrators, or by community consensus) to orphan (de-link) user-contributed content that's abusive, then that app needs to be improved. It doesn't seem like an IPFS-protocol-layer problem.
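The orphaning idea can be sketched with a toy model (the names `post_comment`/`orphan` are invented, and a dict stands in for the immutable block store):

```python
import hashlib

# Application-level moderation over an immutable block store: blocks are
# never deleted, but the app's index decides what is reachable.
blocks = {}       # hash -> content (append-only, stands in for IPFS)
page_index = []   # the comment hashes this page actually links to

def post_comment(text: str) -> str:
    h = hashlib.sha256(text.encode()).hexdigest()
    blocks[h] = text
    page_index.append(h)
    return h

def orphan(h: str) -> None:
    # Moderation just de-links the hash; the bytes may survive on other
    # nodes, but nothing in the app points to them any more.
    page_index.remove(h)

good = post_comment("interesting point about content addressing")
bad = post_comment("goatse.jpg")
orphan(bad)
print(bad in blocks, bad in page_index)  # True False
```

This mirrors the git analogy: the abusive block still exists for anyone who fetches it by hash, but it is unreachable from the application's view.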

You could add another layer on top of IPFS, something like IPFS-O (ownership) or IPFS-C (censorship), which used either a separate DHT or a centralized service to allow people to register never-before-seen block hashes by generating a keypair and uploading a signature of the block hash. If the content later needed to be removed, the signature could include contact information, and the signer could be appealed to (or sent a court order) to sign a removal message.

It wouldn't really work, though. Nobody would run ipfs nodes paying attention to such a meta-service. And if the service were centralized, there would be grave concern over censorship by whatever entity ran it. And people could abuse the service by registering data blocks that haven't shown up in IPFS yet, claiming ownership of content they don't really own. You'd have to go to the courts to resolve that; the courts could issue an order to a centralized operator, but again, nobody would run IPFS nodes respecting such a centralized censorship-enabling service, so the whole thing would be futile. And if the censorship service were decentralized, it could be sabotaged by enough libertarian nodes refusing to store or pass along removal messages.


> For a silly example, if somebody somehow filled this comment section with images of goatse, it would be nice if we could take that down.

So maybe don't build a Hacker News clone on top of IPFS? It's not meant to be a solution to every problem.


If a soft delete is OK, there's no problem. But whether it should be part of the protocol or the application layer is another question.


Removing comments from a comment page is not deleting content, it's changing it.

This page would be an IPNS address under the admin's control which would point to some IPFS hash representing the current goatse-containing state of the page. The admin would then create a new page, which would get a new IPFS hash (since it's new content) and point the IPNS address to it.

As it turns out, you cannot ever really delete information, you can simply change where your "well known" pointers point to.
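A minimal sketch of that repointing pattern (a dict stands in for the IPNS record store, and "/ipns/admin-key" is a made-up name):

```python
import hashlib

# Content hashes are immutable; "deleting" a comment means publishing a
# new page and repointing the one mutable, well-known name at it.
def content_hash(page: str) -> str:
    return hashlib.sha256(page.encode()).hexdigest()

ipns = {}  # name -> current content hash (the only mutable state)

old_page = "comments: [useful-reply, goatse]"
new_page = "comments: [useful-reply]"

ipns["/ipns/admin-key"] = content_hash(old_page)  # goatse-containing state
ipns["/ipns/admin-key"] = content_hash(new_page)  # admin repoints the name

print(ipns["/ipns/admin-key"] == content_hash(new_page))  # True
```

The old page's hash still resolves for anyone who kept the bytes; only the well-known pointer has moved, which is exactly the point above.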


> For a silly example, if somebody somehow filled this comment section with images of goatse, it would be nice if we could take that down.

Or you could just not look at it.


> One of the biggest challenges with IPFS in my mind is the lack of a story around how to delete content.

"One of the biggest challenges with [HTTP|ZFS|TLS|USB|ATX|VHS|USPS] in my mind is the lack of a story around how to delete content."

If you delete your copy of some data, someone else may still have theirs, but then it's them who controls whether to delete it. It's not a challenge for Serial ATA that it doesn't have a function to delete certain data from every hard drive in the world at the same time. Most systems don't work that way, not least because it's inherently dangerous.


The inability to delete things is a feature of IPFS.

Do you want yet another way of serving content that is subject to censorship and ridiculous content takedown policies?


That's not an issue, that's a feature!


Just watch how fast warez people will start to use IPFS to host all seasons and all episodes of Friends.


Why not encrypt it? That way it's junk data to anyone who can't decrypt it.


What happens when that encryption method becomes obsolete and broken? Now this private encrypted data is accessible all over the world.


Data should never be deleted. Ever.

Start by accepting that, and everything starts to make sense.


Child porn? Video of you in the bathroom that someone took from a hidden camera? Your stolen financial information?


Sure, of course.

But the problem is that if you can delete that stuff, someone else can delete stuff that you don't want to be deleted.


If you don't want something deleted, you can always keep a copy of it yourself.


Exactly. Which means you can't delete your bathroom video from any networked system. See the Streisand effect.


That is _literally_ what we already have. If you delete (unpin) content, others are free to keep it (pinned).

The OP was suggesting a way to delete things from other peoples machines.


Yes, but only if you manage that before it's gone.


No. It's unreasonable to expect they could be deleted.


That’s just a dogmatic statement, you need to motivate it.


It's unreasonable to expect to be able to delete data that's been released publicly. Any attempt to delete arbitrary data will either fail or involve extreme authoritarian measures.

Once you accept that, you can focus on reducing the output of compromising information, rather than trying to erase it after the fact. Prevention over cure. This will inevitably lead to a society where people do more of what society wants, and less of what society doesn't. This is a good thing.

Also, I feel like we don't collect as much information as we should. Analytics is a lot less comprehensive than it should be. Other than CCTVs, real-world analytics is basically non-existent. This greatly inhibits progress in AI/ML.


How does “should never be” follow from “difficult in practice”?



