This patch makes curl utilize a gateway for ipfs:// addresses. It prefers a local one but can also use public gateways. It makes no effort to verify that the final product's cryptographic hash corresponds to the one in the address.
This lack of verification is expected with HTTP, but not with IPFS. curl should verify that the resultant output conforms to the IPFS address, or else users might as well just input the gateway/ipfs HTTP address as they always could.
curl can operate in a pipe mode, which adds additional complexity with respect to verification.
In theory curl could block the pipe until it can confirm that a piece that arrived is verified, or abort a tampered-with file early. This would take quite a bit of work to implement, however, since it seems there is no maintained IPFS implementation in C: <https://docs.ipfs.tech/concepts/ipfs-implementations>
Question: is the in-URL hash some form of Merkle tree root hash? Or is another method used to avoid having to download all the data before the hash can be verified?
I continue to regret that the window for the major browsers to incubate and support a content-addressable URL scheme based on then-current distributed hash table algorithms closed ~20 years ago and has shown no signs of being reopened.
Because 99% of the time people don't want immutable addresses. They want something that points to the latest and most up to date version. And when you really do want to freeze something in time you just use one of the page archive services.
P2P also just flat-out doesn't work for mobile devices, which are pretty much the entire internet userbase now.
Had content-based URLs been a thing 20 years ago, presumably technology would have evolved for mobile ISPs to host border caches that end users could trust 100% without worrying about nefarious manipulation.
Does "content" here mean assets and/or web pages?
How would it work for a webpage that loads a bunch of JavaScript libraries from some CDN? The page can change significantly over time as the JavaScript libraries change.
Content addressable means the URI contains some information uniquely describing the content (typically a cryptographic hash). So if you load https://cdn/js/e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b9... you’ll always know it will contain exactly the same contents, regardless of where it comes from (and that can be verified by the client as well).
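As a sketch of the principle (assuming, for simplicity, that the last path segment is a plain hex SHA-256 of the body; real schemes like IPFS CIDs encode the hash differently, so this only illustrates the idea):

    import hashlib
    import urllib.request

    def fetch_verified(url: str) -> bytes:
        # Assumes the last path segment is a hex-encoded SHA-256 of the body.
        expected = url.rstrip("/").rsplit("/", 1)[-1].lower()
        body = urllib.request.urlopen(url).read()
        actual = hashlib.sha256(body).hexdigest()
        if actual != expected:
            raise ValueError(f"content hash mismatch: got {actual}")
        return body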
You can derive an encryption key from both the plaintext content and a shared secret, and store content-addressed encrypted blobs on the open internet, so that only the parties that know the shared secret and a hash of the plaintext can decrypt the blobs.
Of course, shared secrets scale poorly beyond individual users and groups of friends. But there are fundamental limitations there; you need to do access control via some secret knowledge.
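A rough sketch of that convergent-encryption idea in Python, using the third-party cryptography package; the key derivation and the deterministic nonce are illustrative choices of mine, not a vetted design:

    import hashlib, hmac
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM  # pip install cryptography

    def encrypt_convergent(plaintext: bytes, shared_secret: bytes):
        # The key depends on both the plaintext and the shared secret, so only
        # parties holding the secret (and the plaintext hash) can re-derive it.
        content_hash = hashlib.sha256(plaintext).digest()
        key = hmac.new(shared_secret, content_hash, hashlib.sha256).digest()
        # Deterministic nonce keeps identical plaintexts convergent; a real
        # design would think harder about nonce handling.
        nonce = hashlib.sha256(b"nonce" + key).digest()[:12]
        blob = AESGCM(key).encrypt(nonce, plaintext, None)
        blob_address = hashlib.sha256(blob).hexdigest()  # content address of the ciphertext
        return blob_address, blob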
Tangential, but does anyone know why discovery seems to barely work in IPFS recently? A few years ago I could start two nodes in random places and copy files between them with just a little delay. These days I'm rarely able to start the transfer, even though both sides are connected to a few peers. Has something changed? Is it due to more/fewer people in the network?
How does that work? I would expect more nodes = more capacity for answering lookups / spreading the DHT. Why is the result the opposite? If I start a full node it's definitely not overwhelmed with traffic.
Nodes end up in unconnected islands of content knowledge. Whatever node you pick to query has a low probability of knowing about some content posted on some other random node. The workaround is to force the p2p topology to make a reachable subset of nodes that you use for both posting and querying, but that's not too decentralized because someone has to coordinate the topology override scheme.
Do you have any sources where I can read up on this? I've also heard of this complaint and I'm interested in diving into it for my thesis. Do you know what algorithms are suitable to make IPFS scalable?
Bitcoin was co-opted by a censorious minority - Theymos and Blockstream - who have maintained an artificially low block size, ostensibly in order to keep a high number of non-mining but full-history nodes.
Bitcoin’s competitors such as Ethereum and Monero did not allow a software or discussion forum monoculture and instead have scaling block sizes, and still hard fork to add new scaling features (https://ethresear.ch).
Block size puts a limit on the rate of transactions, but increasing the block size wouldn't help the time it takes to spin up new nodes. In fact, increasing the block size would make spinning up new nodes significantly less feasible, since the chain would grow at a faster rate.
Having a large number of independent non-mining full-history nodes is good, but it’s ultimately far less important than the distribution of mining (voting) nodes, and it’s one of many factors that must be considered while working to increase throughput, which is an absolute imperative.
Under 10 tx/sec is glacial, it makes Bitcoin useless for the people with the least access to financial services, and it seems starkly in opposition to Satoshi’s written comments about scaling when adding the 1MB block size.
I do think Bitcoin should be considered a failure if it becomes purely an exercise in hoarding.
I will say I've been too happy with Ethereum and Monero to test Lightning, but I should. I had misgivings about its interface and availability requirements while reading about it in past years.
This does not appear to be IPFS, it's just some crude method of rewriting ipfs:// URLs into HTTP requests to gateways that then do the actual IPFS work. Which, much like Bitcoin, nobody seems to want to run locally, despite that mostly defeating the purpose. On the bright side, none of this ended up polluting libcurl.
I’d love to see IPFS native protocol support in curl, to be honest.
After all, curl already supports dozens of other obscure protocols: DICT, FILE, FTP, FTPS, GOPHER, GOPHERS, HTTP, HTTPS, IMAP, IMAPS, LDAP, LDAPS, MQTT, POP3, POP3S, RTMP, RTMPS, RTSP, SCP, SFTP, SMB, SMBS, SMTP, SMTPS, TELNET, TFTP, WS and WSS…
One challenge of doing this correctly is that curl is intended to be a “one-off” call that downloads and exits without participating in the swarm the way a good BitTorrent client should. Granted, presumably this IPFS gateway solution also has this problem.
People use ipfs to download hosted (by someone) things, so it is a fit. Have a look at https://winworldpc.com
Also, the Internet Archive uses web-seeded torrents to address the problem of insufficient seeds, rather than throwing them into the ether (non-crypto variant) and hoping for mirroring through wishful thinking.
Conclusion: Centralized, unicast-like support for p2p is possible and useful for download-oriented clients.
IPFS Desktop in particular: One shitty thing they did was auto re-opt-in to telemetry without asking the user. With that kind of sneaky bullshit, I refuse to run it. IPFS lost trust and is unable to communicate how their "solution" is useful or make it usable by humans who aren't IPFS Desktop developers.
P2P protocols like IPFS are built with the expectation that clients are persistent. They take a nontrivial amount of time to start up (e.g. to discover peers), and require continuous maintenance from the application to keep running (e.g. to chat with those peers and keep them happy).
These characteristics are incompatible with clients like curl. They're written with the expectation that connection setup is cheap and that connections don't require maintenance while they aren't being used. These expectations are all true of existing protocols like HTTP(S) or FTP; they fail with IPFS. And there isn't really any way to solve that without introducing another process to act as an intermediary -- which is exactly what an IPFS gateway does.
Not really, a transient ipfs node works just fine.
There are stable nodes to bootstrap a new node into the network. More initialization than if you were using HTTP, sure, but it isn't as against the grain as you suggest.
Are there actual perverse incentives that come with "it being married to crypto" — keep in mind that there are demonstrable positive incentives — or is it just aesthetic distaste?
There is a demonstrable disincentive in that approximately everything related to crypto is a scam, so by being married to crypto they demonstrate at least a lack of good judgement.
Honestly, I agree. This is not IPFS support, this is “we ship with a default URL rewriting rule and then make an HTTP call”
It’s a first step, though
At least the rule is invoked safely. curl doesn't ping localhost by default; instead it checks for an IPFS_GATEWAY environment variable or a ~/.ipfs/gateway file, and will fail with instructions if neither is present.
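Roughly that resolution order, as a sketch (the function name and error text are mine, not curl's actual code):

    import os
    from pathlib import Path

    def resolve_ipfs_gateway() -> str:
        # 1. Explicit opt-in via the environment variable wins.
        gw = os.environ.get("IPFS_GATEWAY")
        if gw:
            return gw.rstrip("/")
        # 2. Otherwise fall back to the gateway file a local node writes.
        gw_file = Path.home() / ".ipfs" / "gateway"
        if gw_file.is_file():
            return gw_file.read_text().strip().rstrip("/")
        # 3. No silent default to a third-party gateway: fail with instructions.
        raise RuntimeError(
            "no IPFS gateway configured; set IPFS_GATEWAY or create ~/.ipfs/gateway"
        )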
I'm not that familiar with IPFS I must admit (though it looks great conceptually), but if using the curl CLI tool, why would the operator not just curl the public gateway address?
I'm confused on why such a shallow abstraction was put into something present on every device, and why this seems to be such a big deal to the decentralized community.
Even with its automatic gateway detection, it's purely used to rewrite the URL, which seems like something the operator could easily do themselves.
I wasn't able to find an equivalent in curl with a very cursory search, but wget has `--page-requisites`, which fetches (nominally) every file needed to display an HTML document. If curl does have something analogous, this change would allow HTML of the form:
<img src="ipfs://WHATEVER"/>
to be handled transparently (even when it occurs in a page that is not itself on IPFS). Ideally similar support would be added for "magnet:?..." and "[...].onion/..." URLs, for the same reason.
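If curl (or a wrapper) ever grew such a mode, the transparent handling could be as simple as rewriting embedded ipfs:// references to a gateway URL before fetching them. A rough Python sketch, with a hypothetical helper and the localhost gateway assumed:

    import re

    IPFS_SRC = re.compile(r'(src|href)="ipfs://([^"]+)"')

    def rewrite_ipfs_refs(html: str, gateway: str = "http://localhost:8080") -> str:
        # Rewrite ipfs://<cid-or-path> references to gateway URLs so a
        # page-requisites style fetch can download them over plain HTTP.
        return IPFS_SRC.sub(lambda m: f'{m.group(1)}="{gateway}/ipfs/{m.group(2)}"', html)

    # rewrite_ipfs_refs('<img src="ipfs://WHATEVER"/>')
    # -> '<img src="http://localhost:8080/ipfs/WHATEVER"/>'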
I didn't know what IPFS was, clearly I'm living under a rock.
From the above link:
>>The InterPlanetary File System (IPFS) is according to the Wikipedia description: “a protocol, hypermedia and file sharing peer-to-peer network for storing and sharing data in a distributed file system.”. It works a little like bittorrent and you typically access content on it using a very long hash in an ipfs:// URL. Like this:
My understanding is that the curl position was around making sure UX (defaults) are safe and don't tie the user to any third-party gateway.
Default behavior in the merged curl PR got adjusted and now the only gateway that is used implicitly is the localhost one. Using an external, potentially untrusted public gateway requires explicit opt-in from the user via IPFS_GATEWAY env variable.
FWIW, in recent years the IPFS ecosystem has made content-addressing over HTTP viable and useful. Standards and specifications have been created. Verifiable responses have standardized content types registered with IANA.
So, that's not really IPFS support in cURL. It's just support for IPFS URLs, as it actually consists of rewriting them to use an HTTP gateway, which then does all the IPFS work.
I understand that implementing the IPFS protocol in a tool such as cURL does not make sense. But I don't really see the point of fake support like this.
> I have also learned that some of the IPFS gateways even do regular HTTP 30x redirects to bounce the client over to another gateway.
> Meaning: not only do you use and rely on a total rando’s gateway on the Internet for your traffic. That gateway might even, on its own discretion, redirect you over to another host. Possibly run somewhere else, monitored by a separate team.
> I have insisted, in the PR for ipfs support to curl, that the IPFS URL handling code should not automatically follow such gateway redirects
Which link do you pass to someone when sharing that? It's going to be one of those, and if any of them goes offline your image will be perceived as offline, even though it's there on the other sites.
In a decentralized world with IPFS we can do the same thing through gateways. The gateway itself is a centralized endpoint through which you access content on the IPFS network. Here however the content of "cat.jpg" is hashed to a CID (Content ID). Let's say the content of "cat.jpg" has CID QmbjWyHqUyjxfq7KSiSLBi59CdpqwaAxXfwfby9TpuuigE
At first glance there isn't much of a benefit.
You need a gateway url (a central endpoint) and you need to know the filename (QmbjWyHqUyjxfq7KSiSLBi59CdpqwaAxXfwfby9TpuuigE).
Now for the curl approach and why this is awesome.
Your image (QmbjWyHqUyjxfq7KSiSLBi59CdpqwaAxXfwfby9TpuuigE) is accessible through any gateway. But when you pass this information to someone else, you don't want to tell them to use one specific gateway. You only want to tell them the content hash (QmbjWyHqUyjxfq7KSiSLBi59CdpqwaAxXfwfby9TpuuigE) and let them use their own gateway. You want that because it encourages the data to propagate over the IPFS network and become more accessible.
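In other words, the CID is the identity and the gateway is interchangeable. A small sketch of that idea (the gateway list is illustrative; ideally the first entry is your own local node):

    import urllib.request

    GATEWAYS = ["http://localhost:8080", "https://ipfs.io", "https://dweb.link"]

    def fetch_cid(cid: str) -> bytes:
        last_error = None
        for gw in GATEWAYS:
            try:
                return urllib.request.urlopen(f"{gw}/ipfs/{cid}", timeout=10).read()
            except OSError as e:
                last_error = e  # try the next gateway
        raise RuntimeError(f"no gateway could serve {cid}") from last_error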
The curl patch _prefers_ that you use your own local IPFS gateway. So when you receive a link like "ipfs://QmbjWyHqUyjxfq7KSiSLBi59CdpqwaAxXfwfby9TpuuigE" it would use the gateway it finds locally. And yes, this is just URL syntactic sugar, as it all eventually gets rewritten to a full HTTP URL. If you have a local node, this patch will probably rewrite it to "curl http://localhost:8080/ipfs/QmbjWyHqUyjxfq7KSiSLBi59CdpqwaAxX..." But you could also be using an IPFS node in your local network but not on your local machine, meaning it could also rewrite it to:
"curl http://10.0.3.3:8080/ipfs/QmbjWyHqUyjxfq7KSiSLBi59CdpqwaAxXf..." (or whatever your network is)
But to you, as the user of curl, that all doesn't matter. You just do:
"curl ipfs://QmbjWyHqUyjxfq7KSiSLBi59CdpqwaAxXfwfby9TpuuigE"
Granted, if you use a node that isn't a local one, curl won't find it and you'll have to manually specify it in your "~/.ipfs/gateway" file or in the "IPFS_GATEWAY" environment variable. But these are one-time configuration steps.
Another thing I want to highlight is malicious gateways and verifiable data. This patch should really be considered low-level syntactic sugar for accessing IPFS, not a full IPFS implementation. One can build on top of this and do data verification. That is, at least in my opinion right now, not something that belongs inside curl itself (assumption on my part). You can do a streaming approach and verify it block by block with another application, which should be a valid use case. It's also very much in the philosophy of building one tool for one job: curl gets data from network resources; something else verifies that data.
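As a sketch of that split, here's a hypothetical stdin verifier you could pipe curl's output into. It only checks a plain SHA-256 over the whole stream, which is a big simplification: a real verifier would decode the CID's multihash and walk the UnixFS DAG block by block.

    import hashlib, sys

    def verify_stream(expected_hex: str, chunk_size: int = 1 << 16) -> None:
        # e.g. "curl ipfs://<cid> | python verify.py <expected-hex-digest>"
        h = hashlib.sha256()
        for chunk in iter(lambda: sys.stdin.buffer.read(chunk_size), b""):
            h.update(chunk)
        if h.hexdigest() != expected_hex.lower():
            sys.exit("hash mismatch: output should not be trusted")

    if __name__ == "__main__":
        verify_stream(sys.argv[1])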
Moral of the story is that passing an IPFS URL like ipfs://QmbjWyHqUyjxfq7KSiSLBi59CdpqwaAxXfwfby9TpuuigE becomes completely agnostic of central points. Handling these URLs becomes a local user setting. With IPFS still being young and very much a niche technology, this is still somewhat of a hurdle. But as IPFS gains more adoption like this, those hurdles become easier to clear. At some point in the future, people will know that to use IPFS links they install something that implements the IPFS protocol, just like people know to open http/https links in a web browser, magnet links in a torrent client, and mailto links in a mail client (to name a few examples).
Yeah, verification is needed. A malicious gateway can be passed in via env for example.
There should be another optional argument that also verifies the contents vs the hash.
IPFS gateways can serve you in a manner that allows continuous(?) verification: <https://docs.ipfs.tech/reference/http/gateway/#trusted-vs-tr...>
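For example (a sketch, not curl code): trustless gateway responses use a registered content type such as application/vnd.ipld.raw, so a client can request the raw block bytes and hash them itself, aborting as soon as a block fails. Decoding the expected digest out of the CID needs a multiformats library, which I've skipped here; this assumes a sha2-256 CID whose digest you already know.

    import hashlib
    import urllib.request

    def fetch_raw_block(gateway: str, cid: str, expected_sha256: bytes) -> bytes:
        # Ask the gateway for the raw block bytes rather than a processed
        # response, so the client can hash them itself (trustless mode).
        req = urllib.request.Request(
            f"{gateway}/ipfs/{cid}",
            headers={"Accept": "application/vnd.ipld.raw"},
        )
        block = urllib.request.urlopen(req, timeout=10).read()
        # For a sha2-256 CID the multihash digest is just SHA-256 of the block.
        if hashlib.sha256(block).digest() != expected_sha256:
            raise ValueError("block failed verification; aborting early")
        return block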