Design and Evaluation of IPFS: A Storage Layer for the Decentralized Web (micahlerner.com)
24 points by ingve on Nov 1, 2022 | 5 comments



  > Content addressing: IPFS uniquely identifies immutable content stored
  > on the network. This approach simplifies storage and reference to the
  > underlying data, as a node can use a unique key to unambiguously fetch
  > content from its peers.
This is unfortunately not true: IPFS uses content addressing for metadata, but file content isn't content-addressed. The result is a network similar to BitTorrent's DHT, where knowing a torrent's infohash (the hash of its metadata) lets you download the content, but having a hash of the content itself doesn't get you anything.

An example I posted previously (https://news.ycombinator.com/item?id=32426051): debian-10.7.0-amd64-netinst.iso has SHA256 checksum b317d87b0a3d5b568f48a92dcabfc4bc51fe58d9f67ca13b013f1b8329d1306d, but a very large number of CIDs could be used to identify that file. The following two CIDs are both valid identifiers for it:

https://cid.ipfs.tech/#bafybeihjy54iyvheotna2aeqmzhqnro6yot4...

https://cid.ipfs.tech/#bafybeihfqpypuhmtyzazrj3g4b4f4nqk2ziy...

You'll only be able to download that file from IPFS if you know which CID to use, and if there are peers hosting it with the same CID.
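
For anyone wondering how one file can end up with many CIDs, here is a rough Python sketch. It is a simplified model, not IPFS's actual UnixFS/dag-pb encoding: `root_hash` and the sample data are made up for illustration. It hashes fixed-size chunks and then hashes the concatenated chunk digests, which is enough to show that the root identifier depends on how the file was chunked, not just on its bytes.

```python
import hashlib


def root_hash(data: bytes, chunk_size: int) -> str:
    """Hash each chunk, then hash the concatenated chunk digests.

    Hypothetical stand-in for a Merkle-DAG root; real IPFS nodes also
    encode links, sizes, and codec information.
    """
    leaves = [
        hashlib.sha256(data[i:i + chunk_size]).digest()
        for i in range(0, len(data), chunk_size)
    ]
    return hashlib.sha256(b"".join(leaves)).hexdigest()


# 1 MiB of sample data standing in for a real file.
data = bytes(range(256)) * 4096

file_digest = hashlib.sha256(data).hexdigest()  # what `shasum -a 256` would print
root_a = root_hash(data, 256 * 1024)            # four 256 KiB chunks
root_b = root_hash(data, 1024 * 1024)           # one 1 MiB chunk

print(file_digest)
print(root_a)
print(root_b)
```

Both roots describe the same bytes, yet neither matches the other or the plain SHA-256 of the file, so someone holding only the file checksum has no way to derive either one.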


Author here - thank you for pointing that out! The context in the comment you linked to is also helpful.

In your opinion, would it be correct to remove "uniquely" from the first sentence? Maybe that would help clarify the issue you pointed out. As far as I understand, the CIDs are still unique when referring to _metadata_, and can be used to "unambiguously fetch content from its peers".


IMO the closest equivalent to an IPFS CID is a BitTorrent magnet link. Both can be used to obtain an intermediate metadata value, which can in turn be used to fetch file content.

  > CIDs are still unique when referring to _metadata_, and can be used to
  > "unambiguously fetch content from its peers".
A CID, mathematically, cannot uniquely identify a single file. Consider the CID bafybcfbaqydfnrof5stdvkzhwx6kglb7p6owxja:

  [desktop]$ ipfs get bafybcfbaqydfnrof5stdvkzhwx6kglb7p6owxja -o temp.pdf
  [desktop]$ shasum -a 256 temp.pdf
  2bb787a73e37352f92383abe7e2902936d1059ad9f1ba6daaa9c1e58ee6970d0  temp.pdf

  [laptop]$ ipfs get bafybcfbaqydfnrof5stdvkzhwx6kglb7p6owxja -o temp.pdf
  [laptop]$ shasum -a 256 temp.pdf
  d4488775d29bdef7993367d541064dbdda50d383f89f0aa13a6ff2e0894ba5ff  temp.pdf
If you were to try fetching that CID from the peer network, which file you'd get would depend on which peer you're fetching it from.
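
To see what a CID commits to, you can decode its header bytes. A minimal sketch using only Python's standard library, assuming a CIDv1 in base32 (multibase prefix `b`) whose header fields each fit in one byte; `decode_cid_header` is a hypothetical helper, not part of any IPFS tooling.

```python
import base64


def decode_cid_header(cid: str):
    """Return (version, codec, hash_code, digest_len, digest) for a base32 CIDv1.

    Assumes every header varint is a single byte (value < 0x80), which
    holds for common codecs and hash functions but not in general.
    """
    if not cid.startswith("b"):
        raise ValueError("only base32 CIDs (multibase prefix 'b') are handled")
    body = cid[1:].upper()
    body += "=" * (-len(body) % 8)  # restore the padding that multibase omits
    raw = base64.b32decode(body)
    version, codec, hash_code, digest_len = raw[0], raw[1], raw[2], raw[3]
    return version, codec, hash_code, digest_len, raw[4:]


version, codec, hash_code, digest_len, digest = decode_cid_header(
    "bafybcfbaqydfnrof5stdvkzhwx6kglb7p6owxja"
)
print(version, hex(codec), hex(hash_code), digest_len)
```

For the CID above this gives version 1, codec 0x70 (dag-pb), and a multihash with code 0x11 and a 20-byte digest. In the multicodec table 0x11 is SHA-1, so the root of this identifier is a SHA-1 digest, not SHA-256.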


Why is the SHA-256 checksum different?




