BitTorrent vs. HTTP

sametmax · on Jan 24, 2017

I wish browsers and OS would implement torrent by default. Not only would it make the sharing file experience (downoading, sending, etc) much better, but it would open the door to many cool P2P stuff on the client side.

It's a standard, we know it works and how.

zython · on Jan 24, 2017

Yeah, thats cool but I dont want to pay to be a part of Microsoft's/Apple's/ you-name-it's insfrastructure.

I dont want to be (potentiall forced to) distribute software.

With recent developments in net neutrality and data plans, p2p could drastically impact my data plan and cost me money.

Sure I can see why you would think the potential is, but for me p2p data distribution is out of the question, at least for now.

derefr · on Jan 24, 2017

The proper solution for data plans is not to avoid features that are harmful in their presence entirely, but rather for the OS to be aware of when it's connected via a metered/capped connection, and behave differently. Just like how Android devices can detect when they're connected to "open" wi-fi hotspots and auto-wrap their connection in a Google VPN.

Just because you have a logical reason to not want to serve updates, should not prevent IT admins from getting a good default experience when they have 1000 machines on a LAN that all want to update.

dragonwriter · on Jan 24, 2017

> but rather for the OS to be aware of when it's connected via a metered/capped connection, and behave differently.

Pretty much all internet connections are metered/capped, though prior to recent FCC transparency rules the caps on residential fixed broadband were often undisclosed or affirmatively misrepresented under false "unlimited" labels.

Making the OS appropriately sensitive to the costs incurred by each network-involved transaction relating to updates and the system owner's preferences regarding balancing those costs against the value they provide in update experience is abstractly ideal but decidedly difficult.

derefr · on Jan 24, 2017

As i was saying above, the majority use-case of update sharing is sharing updates over a LAN to other computers in the same office.

Note that, even in that case, the connection to the outside world might still be metered/capped—but the connection to LAN peers obviously isn't. That means that "does this traffic cost anything" is an evaluation the OS would have to make per socket (or requested socket from a higher-level library, like Windows' BITS), rather than per interface.

dragonwriter · on Jan 24, 2017

The point you are making about per-socket evaluation here is what I was trying to get at with the "network-involved transaction" reference in GP.

johnnyhead · on Jan 24, 2017

Man, windows 10 already does that. In the advanced update options there is a distributing method menu and it is pretty clear they are using our machines to distribute updates to others.

Disclaimer: Haven't checked in a while and can not do it now, but it was there until not long ago.

ChrisClark · on Jan 24, 2017

Yes it's still there. I just turned it off last week. But still let it share the updates with the other computers in my household LAN.

atmosx · on Jan 24, 2017

Is this on by default?

smhenderson · on Jan 24, 2017

IIRC yes, unless your internet connection is marked as Metered, in which case the default is off.

What I can't remember and am unable to check at the moment is whether Metered is turned on by default or not.

But regardless of defaults, if you mark a connection as Metered than the background update distribution is turned off.

tracker1 · on Jan 24, 2017

Metered is part of your connection settings, iirc, like when you establish a connection to a new wifi network. Other applications may be effected by metered networks, and iirc updates don't run at all when on a metered network.

zython · on Jan 24, 2017

Makes me even happier to use Linux almost exclusively now.

sametmax · on Jan 24, 2017

If you use steam or battle-net, you are already doing it.

Done respectfully, it can be a great way to make everybody's file arrive faster while balancing the traffic better.

wscott · on Jan 25, 2017

Regretfully steam does _not_ use p2p downloads. People running lan parties would be so much happier if steam would just do a lan broadcast to look for local copies of a game before downloading. (a very limited form of p2p)

They already have local streaming for games so they have the infrastructure for this.

tracker1 · on Jan 24, 2017

Agreed, even if peers seed 1:1, then they are only sending out as much as they received and the servers can reach twice as many people.

sedachv · on Jan 24, 2017

> With recent developments in net neutrality and data plans, p2p could drastically impact my data plan and cost me money.

The solution is to pressure your government to make net neutrality a law and vote with your dollars and use ISPs and network providers that are not intent on breaking the Internet. All Internet nodes need to have the same access to the network. "I am too lazy to do anything about it and too cheap to pay for data" is not a valid argument on a discussion forum called Hacker News.

deckar01 · on Jan 25, 2017

There is no reason you should be required to distribute any content to participate. Leeching content from your peers is still a beneficial behavior from the perspective of the content producer due to the reduced cost of distribution.

Seeding a website would be like paying a membership fee with surplus currency that you have been throwing away monthly.

With the proper incentives and controls I suspect there is content you would be compelled to seed.

sametmax · on Jan 25, 2017

It beneficial for you as well. You benefit from faster updates. Just like torrenting film is beneficial to you because you benefit from fast movie download.

kome · on Jan 24, 2017

Opera 12 had torrent...

It's incredible how Opera was innovative at the time. Nowadays it just another useless chrome.

etatoby · on Jan 24, 2017

Opera's layout engine and UI were incredibly fast, responsive, and lightweight, remaining the fastest for decades. I have no doubt it was the product of a very small team of brilliant engineers. Its userscript support was also top-notch, allowing things that are tricky or impossible even today in a browser extension.

Too bad it couldn't keep up with the whole Web 2.0 sh*tstorm.

vbezhenar · on Jan 24, 2017

What's point in bundling unrelated features? I used Opera in the past, but I never used nor IRC, nor mail (their mail was very cumbersome for me, I used The Bat instead), so it was wasted effort for me. Now if Opera would provide JS API for torrents or mail to enable further integration between those technologies, that would be interesting and innovative.

takeda · on Jan 24, 2017

I also didn't like its mail client, but IRC was useful multiple times. Especially when browsing some kind of open source project and wanted to get an answer, I just clicked on a link on their page and was able to join their channel.

The great thing was that Opera 12 even with IRC, mail, torrent and other goodies (such as HTTP server) was still lighter on resources than Firefox and Chrome.

I really do miss the old Opera, and even the Vivaldi is not really good replacement since it still uses Blink engine.

I switched to Firefox after, I like its extensions but it's not perfect.

angry-hacker · on Jan 25, 2017

Somehow they still managed to keep the installer size small, ux reasonable and product fast. If you can do this, I don't care how many useless features you add. Nowadays it's opposite. I didn't use those features either but they never bothered me.

ComodoHacker · on Jan 24, 2017

It also had Unite!

akjainaj · on Jan 24, 2017

Yes, and it also had an IRC client. Having lots of features unrelated to web browsing did not make it better.

manojlds · on Jan 24, 2017

But it was better. The UX was way too good, even if things like standards implementation weren't great.

digi_owl · on Jan 24, 2017

I seem to recall that back then Opera stood shoulder to shoulder with Mozilla about standard correctness.

gsnedders · on Jan 24, 2017

Until ~2010ish, Opera was per-se pretty much competitive (after that, the other vendors were implementing new specs quicker), but there were some notable differences, especially around places where specs allowed multiple behaviours and every vendor except for Opera had converged on one behaviour a decade earlier (rounding of CSS values comes to mind here, but was far from the only case), and that caused more and more problems and web-developer pain as the web became increasingly complex.

Macha · on Jan 24, 2017

Iirc, Opera was actually quite ahead of Firefox in passing acid3, though can't remember if they were first or if WebKit was

xorcist · on Jan 24, 2017

It probably helped that Håkon Lie, who proposed/invented CSS, worked at Opera.

digi_owl · on Jan 24, 2017

I seem to recall that the Mozilla suite had a irc client as well.

smhenderson · on Jan 24, 2017

They did and still do. The SeaMonkey[0] product from Mozilla comes with all the components that Netscape^ shipped with. A browser, an email client, an IRC client and an WYSIWIG editor for HTML called Composer.

[0] http://www.seamonkey-project.org/start/

^ version 4 I think but maybe earlier versions too

u801e · on Jan 25, 2017

Thunderbird can also connect to IRC (as well as XMPP). It also offers choices of Google Talk, Twitter, and Yahoo.

_pctq · on Jan 24, 2017

It also sounds like a cool caching technique : since people already have the resource cached on their local system, why not allow them to distribute it?

HTTP headers can allow to know when to flush cache (just like it's currently done) and provide last known md5/sha1/whatever digest to make sure page is not tempered with (let's say it's checked when the download is complete, and retry a download if the signature does not match: it should not happen often anyway). It obviously won't work for pages which distribute auth related content, but it would be great for assets.

I guess a problem could be that page load will be slower (depends on the ability to parallelize and to contact geographically close peers, I suppose), but it would mean way less heavy load on servers.

rakoo · on Jan 24, 2017

It'd be a huge downgrade in privacy. As much as we don't like it, when I'm watching cat videos only Google knows; with bittorrent, everybody knows what you're watching. As much as I love bittorrent (I really do), this is an aspect for which I don't see easy solutions.

_pctq · on Jan 24, 2017

Indeed. Maybe it should include Tor like features as well :)

_pctq · on Jan 24, 2017

I realize that this would not work as well : tor is routing traffic through exit nodes, so it would basically cancel the advantages of p2p distribution.

The privacy concern makes me realize something else : there's no incentive for users here. They lose privacy, what do they win?

It does not prevent totally the idea of using p2p to distribute content as a low level implementation, but since we (the ones who manage servers) are the only ones to benefit from it, we must first find a way for it to have no impact on users.

The biggest privacy problem is because of how p2p works currently : we have a list of IPs associated to a resource. How can we obfuscate this without going through proxies?

sametmax · on Jan 24, 2017

Not "tor". "tor-like". Onion routing. Something that would make sure only your closest neighbor knows what you request, and won't know if it's for you or your closest neighbor.

rakoo · on Jan 24, 2017

I don't know. It's already hard enough to incentivize people to share content they already have (in current bittorrent), I believe it would be even more difficult to incetivize them to download content they're not primarily interested in just for a neighbor.

It works if we don't do naive P2P but rather Friend-2-Friend; this is what retroshare does, your downloads can go through your friends so that only they know what you download. But that requires a lot more steps than traditional bittorrent, so I'm not sure it could work in general.

derefr · on Jan 24, 2017

Such networks where you "download a bunch of stuff you're not interested in for the sake of" the network already exist.

Perfect Dark (a Japanese P2P system) is a direct implementation of that concept, where you automatically "maintain ratio" as with a private torrent tracker by your client just grabbing a bunch of (opaque, encrypted) stuff from the network and then serving it.

A more friendly example, I think—and probably closer to what the parent poster is picturing—is Freenet, which is literally an onion-routed P2P caching CDN/DHT. Peers select you as someone to query for the existence of a chunk; and rather than just referring them to where you think it is (as in Kademlia DHTs), you go get it yourself (by re-querying the DHT in Kademlia's "one step closer" fashion), cache it, and serve it. So a query for a chunk results in each onion-routing step—log2(network size) people—caching the chunk, and then taking responsibility for that chunk themselves, as if they had it all along.

romaniv · on Jan 24, 2017

For traffic that consists of many small, relatively rare files (which is most HTTP traffic) you would have to do some proactive cahing anyway. I want JQuery 1.2.3. I ask your computer. Your computer doesn't have it, either because it's a rarely used version, or because you cleared your cache. Instead of returning some error code, your computer asks for it from another node, then caches the file, then returns it to my computer. This kind of stuff will be necessary to ensure high availability, and incidentally it would make it hard to see who is the original requestor. It should actually improve privacy. Also, this scheme could be used to easily detect cheating nodes that don't want to return any content. (They usually wouldn't have an excuse of not having the content you requested, so if a node consistently refuses to share, it can be eventually blacklisted.)

rakoo · on Jan 24, 2017

> I want JQuery 1.2.3. I ask your computer [...] returns it to my computer

Good ! Now I know you have JQuery 1.2.3, which is vulnerable to exploit XYZ, which I can now use to target you. This is one reason why apt-p2p and things like that can't be deployed in large; it's way too easy to know what version of what packages are installed on your machine.

_pctq · on Jan 24, 2017

Indeed, local disclosure is a cool idea to mitigate history leak.

This makes me think that an other feature could be to not disclose all peers available, but randomly select some. This would force someone wanting to look someone up to download a possible big amount of times the same list to check for an ip, instead of just pulling it once per ressource to know as a fact.

Both ideas are not exactly privacy shields, but steps to mitigate the problem.

NTripleOne · on Jan 24, 2017

>It also sounds like a cool caching technique : since people already have the resource cached on their local system, why not allow them to distribute it?

Sounds like a cool way to deter people from using it if you ask me.

puddintane · on Jan 24, 2017

Especially with Comcast now introducing a 1 tb cap to its non-business users.

_pctq · on Jan 24, 2017

I'm not from USA, am I correct to think Comcast is a mobile ISP? If so, this is a non issue : mobile apps can already decide not to do big update when not on wifi, browsers can do the same.

theandrewbailey · on Jan 24, 2017

That is incorrect. Comcast is the (second?[0]) largest cable ISP in the USA. About 10 years ago, they were calling people saying that they were using too much data, but that was always nebulous and varied greatly. Then they defined a 250 GB cap in 2008, and later, 300 GB. They only recently expanded it to 1TB. Go over it for three months, and Comcast will get very angry.

I'm not sure if the cap is in effect for places with fiber internet (like Verizon FIOS). It wouldn't surprise me, since they've admitted caps are unnecessary.[1]

[0] Charter + Time Warner Cable (after merge) is larger, I think.

[1] https://consumerist.com/2015/11/06/leaked-comcast-doc-admits...

_pctq · on Jan 24, 2017

I see, thanks. This sounds quite terrible actually : if internet has any historical impact, the last thing you want is for people to think about "not using too much data". I hope this won't last.

theandrewbailey · on Jan 24, 2017

Many big American ISPs are too greedy. Even worse is that most people have no clue how much a gigabyte of data is.

About the time Chairman Wheeler was aiming for net neutrality, he was also looking at data caps and if they were bad. Within a month, Comcast increased theirs to 1 TB. Coincidence?

There's going to be a lot about how 'competitive' ISPs are in the next few years in America now that everything's Republican. But most people have two options: expensive cable ISP that's fast when it wants to be, and slow DSL that's also expensive and not getting better.

bespoke_engnr · on Jan 24, 2017

Unfortunately Comcast is one of the two companies (along with Verizon) that have monopolized the ISP market in the US. Together they own most of the market, and have agreements to stay out of each others' territories. This affects something like 40% of home Internet customers in the US :(.

sametmax · on Jan 24, 2017

Even when I download non stop torrent movies I don't reach this in one month. I doubt it would be a problem, especially since the load would be balanced across all users.

lacksconfidence · on Jan 24, 2017

Try a different quality? Best quality on amazon video is 7GB per hour, works out to about 150 hours a month. Sure this is a lot of video, but spread across a household with say 4 people with their own screens and viewing habits is under 1.5 hours per day per person

puddintane · on Jan 24, 2017

Streaming Netflix alone can cap your bandwidth quickly given the statistics from the Netflix page [1]. I have recently experienced this actually (I do not run torrents from home and during the month of December I did not transfer anything from the remote box). Month of December I used 1022 GB out of 1024 GB over 900 GB was from NetFlix.

Using: 1 terabyte = 8,388,608 megabits

Ultra HD: 25 mbps ~= 93.2 hours HD: 5 mbps ~= 19.5 days SD: Would be good but hey I bought an HD TV for a reason so that would not be fair to skimp out on quality just because they don't want to introduce cheaper bandwidth caps.

side note It is $10 per 50 GB once you go over the cap with Comcast which is pretty huge for overhead if a user wasn't even aware of the cap. Thankfully they cap it at $200 over your bill.

side note2 this number will only increase - so really that 1 TB cap already needs to see a lift to 2 TB just to start to fulfill the requirements of the newer age web.

[1] https://help.netflix.com/en/node/306

NTripleOne · on Jan 25, 2017

> $10 per 50 GB

Wow, that's disgusting.

Back when I was on cheap and cheerful, capped internet - if you went over the (paltry 100GB, although this was the best part of a decade ago now...) cap, I'd pay an extra £6 on my bill and that was it, regardless of how much I went over by.

Ironically, the price to remove the data cap as part of your package was more expensive than the over-cap charge.

xj9 · on Jan 24, 2017

i nearly reached my cap in five days copying data from my NAS to a cloud server last week. i only have ~100GB left for the rest of the month.

stordoff · on Jan 24, 2017

If you download games, you can cut into a 1TB cap pretty quickly. Last I checked, Halo 5 on Xbox One was >90GB.

ReverseCold · on Jan 24, 2017

ATT doesn't have a cap for gigabit customers. Then again I also have a choice of one other ISP, so I will just switch again if they introduce a cap.

tracker1 · on Jan 24, 2017

For me the choice is cable or dsl... and dsl is pretty painfully slow, and the carrier is much less responsive to issues. Fortunately my cable provider doesn't charge much more for a business connection ($140 vs $90, compared to comcast charging 3x as much). Cox tends to respond to technical issues within hours (onsite) and not days, and I don't have any cap issues. Anything gray (tv torrents) I now use a seedbox or vpn for.

takeda · on Jan 25, 2017

I like https://named-data.net/ approach better, although that requires infrastructure.

In short their approach is that instead of connecting point to point and using addresses of hosts, what if we could address based on the data we want.

This suddenly makes routers aware of what data they are forwarding, that knowledge allows them to start caching and reuse the same data packets when multiple people request the same data.

Things like multicasting or multi path forwarding are simpler. Interestingly, the more people are viewing the same video for example the better, essentially CDN is no longer needed.

Most of use cases how we currently use the Internet, are actually easier to do this way.

Things that are harder though (but not impossible to do) is point to point communication, such as SSH.

_pctq · on Jan 25, 2017

This sounds super cool, thanks!

(note to others: the description of project can be found here https://named-data.net/project/ )

Actually, what could be done without "rebooting" the whole internet is to use caching relays, which would operate just like DNS servers do, but caching content instead of IPs (maybe it's what they're suggesting, it was unclear from general description).

EDIT : but I wonder if I'm really ready to deal, as a webdev, with content propagation based on TTL, like we do with dns zones :D

EDIT 2 : actually, this is a non issue. TTL is there because when caching something as short as a IP string, you don't want to have to issue a request to parent node. But when caching assets, it won't be that a problem to issue a HEAD http request to asset owner to ask if the cached content should be invalidated.

digi_owl · on Jan 24, 2017

The likes of IPFS and Freenet work on this principle.

kybernetikos · on Jan 24, 2017

You can implement something very like torrent using web tech.

https://webtorrent.io/

cesarb · on Jan 24, 2017

An interesting tidbit: the original bittorrent implementation mimicked the download dialog for a browser of that era (back then, each download opened a new dialog box, with its own progress bar). That is, the user experience for a normal download (click on a .mov) and for a bittorrent download (click on a .torrent which downloads a .mov) was almost identical.

some-guy · on Jan 24, 2017

I would agree with you 100%, but first you'll have to convince Comcast to lift my 1TB data cap. :)

bnjms · on Jan 24, 2017

You have that exactly backwards. First we have to make it an issue that will place market pressure on the provider to raise data caps. Until their customers begin to consider leaving or even relocating to other solutions the ISPs will not look for a solution that allows greater data caps. (Also 1TB sounds reasonable to me. As long as none of the data is throttled.)

tracker1 · on Jan 24, 2017

When 4K takes off, 1TB is no longer reasonable. Especially when the ISPs are now double-dipping to get money from Netflix and the like.

bnjms · on Jan 25, 2017

Exactly. When 4K takes off that will do more than anything to remedy the situation.

scarface74 · on Jan 24, 2017

That and the abysmal upload speeds of cable internet providers. Comcast's 1Gbps option has 35mbps upload speeds and they charge more than both Google and AT&T. I pay $70 a month for 1Gbps symmetrical upload and download from AT&T - flat rate no additional fees or taxes. On my best computer, I can usually get 950Mbps+ up and down wired. My house is wired for gig-e throughout.

patmcguire · on Jan 24, 2017

That's a cartoonishly large data cap. How are you getting that?

craftkiller · on Jan 25, 2017

Comcast, in late 2016, rolled out a nation-wide data cap at 1TB/month without any change in billing, but offered our unlimited data back to us for another $50/month.

jagermo · on Jan 24, 2017

Oh god, I hope not. Not because the technology is not awesome, but because of laywers. Especially in Germany they target torrent users and try to sue them like crazy.

If you download, for example, a Windows.iso from technet and start sharing it via the integrated p2p system, I can see a shitstorm of lawsuits raging through the web.

thirdsun · on Jan 24, 2017

I doubt that very much. Wouldn't a wide spread use of torrents help to legitimize the technology?

sametmax · on Jan 24, 2017

My thoughts precisely.

fuzzy2 · on Jan 24, 2017

That’s not a good example. You simply don’t have the rights to redistribute a Windows ISO, even if you need a license to use it. So a C&D is justifiable. Microsoft usually doesn’t care though.

You’ll be perfectly fine sharing legal stuff like Linux ISOs or mods for your favorite game using BitTorrent 24/7.

buildbuildbuild · on Jan 24, 2017

Discriminating against users based on the protocol they use rather than by the content they distribute, be it by rate limiting or by angry legal letters, holds back innovation in my opinion.

Users should not fear a content-agnostic protocol.

Mass adoption of bittorrent would render this type of discrimination much more difficult.

adrianN · on Jan 24, 2017

Even if you're not pirating stuff, it seems very difficult to automatically seed things you download. Even if you have the legitimate right to download something that doesn't necessarily mean that you also have the right to distribute it. Everything you download would have to have an appropriate licence.

WorldMaker · on Jan 24, 2017

«open the door to many cool P2P stuff on the client side»

I think this is a part of the exciting proposition intended by WebTorrent (webtorrent.io) in having an entirely JS/client-side-capable BitTorrent stack that can run directly inside a browser (or in Node or Electron or Cordova apps) with potentially minimal setup.

aphextron · on Jan 24, 2017

The problem with WebTorrent right now is interop with regular bittorrent clients. As it is, a desktop client cannot seed to a WebTorrent client. greatly reducing the usefulness of it. Hopefully this can be solved.

WorldMaker · on Jan 24, 2017

There are several desktop clients that support the "bridge" between WebTorrent and BitTorrent. This includes WebTorrent Desktop [1] prominently linked on the WebTorrent site, but also some "traditional" clients like Vuze have started supporting that, too.

[1] https://webtorrent.io/desktop/

loeg · on Jan 24, 2017

It's not nearly as well documented as HTTP RFCs, for example. Is it even a standard? I know there are some documents published by the author and others, but I was not aware any standards body had ratified it.

discreditable · on Jan 24, 2017

BitTorrent can cheat a little with web seeds. It treats an HTTP host as a "seed" in the swarm. https://en.wikipedia.org/wiki/BitTorrent#Web_seeding

derefr · on Jan 24, 2017

Speaking of web seeds, the magnet: URI scheme—with an embedded web seed—has everything it needs to be an all-in-one subresource integrity + CDN-neutral subresource caching mechanism. If browsers could resolve magnet: links natively, you could just link all your page's resources as magnet: resources with the copies on your own servers acting as the web seeds.

kiliankoe · on Jan 24, 2017

> While clients (probably) could ask for chunks in a chronological order and then get the data in a sequential manner [...]

Why probably? This is exactly how the immensely popular Popcorn Time worked.

fnord123 · on Jan 24, 2017

It's literally in the same sentence. In the part that you elided:

"it will not at all distribute the load as evenly among the peers as the "random access" method does."

If people only stay to download the file and then disconnect then there will be a higher concentration of parts from the beginning of the file than to the end. This reduces the resiliency of the swarm.

kiliankoe · on Jan 24, 2017

I only wanted to bring up that the beginning of the sentence made it sound like a theoretical concept that might work, but no one has ever tried or seen in practice. The concerns are definitely valid.

Retric · on Jan 24, 2017

Depends on the algorithm but you need not go 100% one way or the other. The basic rule can be 70% download next part and 30% download rare bits. The swarm is slightly less resilient, but seeders are still generally uploading the rare bits not the start of the file. This also tends to make a faster swarm as new downloads have something to trade.

WhiteOwlLion · on Jan 24, 2017

You could have an algorithm change how pieces are requested based on availability (seed to peer ratio). If a torrent has high availability, then there is little harm in retrieving pieces sequentially.

wjh_ · on Jan 25, 2017

While I admit that this is less than ideal, it's not quite as bad as it may seem.

For instance, if it's a movie being streamed, the majority of connections will finish downloading the movie long before the user finishes watching, making a pretty decent seed time. Those with a slower connection wouldn't have made a particularly great seed anyway.

dest · on Jan 24, 2017

Side note: AFAIK Popcorn time generally uses very crowded swarms (e.g. YIFY torrents) so chronological downloads do not kill the swarms.

shawabawa3 · on Jan 24, 2017

chronological downloads never really kill swarms. If everyone is downloading chronologically, then the earlier pieces are more in demand, but they also have more supply

The only thing that kills swarms is average seed ratio < 1

numerlo · on Jan 24, 2017

They do as the supply of the final pieces is low and if they go offline the torrent could die. Happens a lot in less popular torrents.

geoah · on Jan 24, 2017

The non-random fetching is also how super-seeding works [1].

https://en.wikipedia.org/wiki/Super-seeding

anonymfus · on Jan 24, 2017

Most clients support this in some way. In qBittorrent it's a context menu option, for example.

masklinn · on Jan 24, 2017

Likewise Tixati, on a per-file basis you can ask for a sequential download (prioritise downloading the file in sequential order) and "head and tail" (further prioritise the first and last 500k, as some file formats require a header and a footer to start using).

motoboi · on Jan 24, 2017

He forgot to mention a very important aspect of bittorrent: the uTP (micro-TP) protocol.

uTP is a very nice bittorrent over UDP protocol implementing LEDBAT congestion control algorithm [1].

1 - https://en.m.wikipedia.org/wiki/LEDBAT

JepZ · on Jan 24, 2017

Everything there looks very pro HTTP like... But hey, curl doesn't support bittorrent afaik. So why should haxx.se care ;-)

ryancox · on Jan 24, 2017

Interesting link in that Wikipedia article [1] to Apple's LEDBAT implementation used for downloading updates.

[1] https://opensource.apple.com/source/xnu/xnu-1699.32.7/bsd/ne...

_ofdw · on Jan 24, 2017

One thing I always fail to see in these types of comparison articles is that BitTorrent will happily destroy any ability for those sharing your connection to use it if you let it.

Probably due to the sorry state of affairs that is the consumer router space, running BitTorrent renders even web browsing painfully slow and sometimes completely nonfunctional.

Why BitTorrent does this while normal http transfers do not is not clear to me. Perhaps due to the huge number of connections made.

Either way, when given a choice I'll always take a direct HTTP transfer over a torrent, for no other reason than the fact that I'd like to be able to watch cat videos while the download completes.

pktgen · on Jan 24, 2017

This is caused by bufferbloat, which you can read more about at https://www.bufferbloat.net/projects/bloat/wiki/Introduction.... The good thing is it's easy to fix.

On your upstream side (i.e. data you send out to the Internet), you can control this by using a router that supports an AQM like fq_codel or cake. Rate limit your WAN interface outbound to whatever upstream speed your ISP provides. This will move the bottleneck buffer to your router (rather than your DSL modem, cable modem, etc., where it's usually uncontrolled and excessive) and the AQM will control it.

Controlling bufferbloat on the downstream side (i.e. data you receive from the Internet) is more difficult, but still possible. You can't directly control this buffering because it occurs on your ISP's DSLAM, CMTS, etc., but you can indirectly control it by rate limiting your WAN inbound to slightly below your downstream rate and using an AQM. This will cause some inbound packets to be dropped, which will then cause the sender (if TCP or another protocol with congestion control) to back off. The result will be a minor throughput decrease but a latency improvement, since the ISP-side buffer no longer gets saturated.

the8472 · on Jan 24, 2017

While bufferbloat is an issue it's not the only one. Poorly designed NAT implementations easily suffer from connection tracking table saturation due to the many socket endpoint pairs they have to track when you're using bittorrent. Doubly so when using bitrorent with DHT.

pktgen · on Jan 24, 2017

That's a good point.

The best thing to do is avoid home networking equipment entirely. The Ubiquiti EdgeRouter products are cheap and good, as is building your own router and sticking Linux/*BSD/derivative distros on it.

richbhanover · on Jan 25, 2017

@pktgen Thanks for the good note. You've nailed the science behind this and the proper fix. In your note below, you also note that Ubiquiti router firmware has fq_codel/cake.

I'd like to mention that both LEDE (www.lede-project.org) and OpenWrt (www.openwrt.org) were the platforms used for developing and testing fq_codel/cake. That means that people may be able upgrade their existing router to eliminate bufferbloat.

My advice: if you're seeing bufferbloat (and a great test is at www.dslreports.com/speedtest) then configuring fq_codel or cake in your router is the first step for all lag/latency problems.

xenophonf · on Jan 24, 2017

Thanks for posting this reply. Implementing a traffic shaping policy similar to the above fixed a problem for me today.

etatoby · on Jan 24, 2017

If you can put a Linux router between your LAN and the uplink, you can use the Wonder Shaper.

http://lartc.org/wondershaper/

I've been using it for 15 years and it's still working great. Even with multiple P2P clients, stuff like HTTP, SSH, and gaming keep a low latency. Also you learn a lot about networks just by configuring it :-)

dspillett · on Jan 24, 2017

> Why BitTorrent does this while normal http transfers do not is not clear to me.

Two key reasons, usually, both related to congestion control (or practical lack thereof).

> Perhaps due to the huge number of connections made.

This is one of those reasons: unless the other end of your incoming connection is prioritising interactive traffic somehow packets for each stream will get through at more or less the same rate once the connection is saturated. So if you have a SSH link and are requesting a http(s) stream (for a web page or that cat video) while a torrent process has 98 connections getting data, for every 100 packets down the link only two are for your interactive process. On fast enough link this isn't an issue, but "fast enough" needs to be "very fast" in such circumstances as it is relative to the combined speed of all the hosts sending data. You can mitigate this by telling the torrent client to use minimal incoming connections (limiting incoming bandwidth can have some effect but is generally ineffective as bandwidth limits like that need to be applied on the other side of the link).

The other problem is due to control packets such as those for connection handshakes and so forth fighting for space on the same link as those carrying data. As soon as the connection is saturated in either direction so that there are packets queued for more than an instant, latency in both directions takes a massive hit. This is particularly noticeable on asymmetric links such as many residential arrangements. You can mitigate this by throttling the outgoing traffic either within the torrent client or at other parts of the network (assuming the traffic isn't hidden in a VPN link that means you can't reliably distinguish it from other encrypted packets) and reserving some bandwidth for giving priority to interactive traffic and protocol level control packets but you have very little control (usually practically none) over traffic coming the other way as you the measures have to be taken before the packets hit the choke point and you don't control those hosts your ISP does (they will implement some generic QoS filtering/shaping but more than that requires traffic inspection which we don't want them to do, and they don't want responsibility either legally or in terms of providing/managing relevant computing capacity).

(the above is a significant simplification - network congestion is one of those real world things that quickly gets very complicated/messy!)

anc84 · on Jan 24, 2017

Simply limit your client's allowed number of connections and limit its allowed bandwidth.

_ofdw · on Jan 24, 2017

Of course that works but why isn't the protocol robust enough to do this automatically?

j3097736 · on Jan 24, 2017

Because bittorrent is designed to be easy to implement to favour adoption, they could have created a more performant DHT and added something like eMule's automatic upload speed sense but that would have been more complex.

anc84 · on Jan 25, 2017

Why should it? I like that the protocol itself favours performance. Traffic shaping/QoS should be done by the maintainers of the pipes (ie routers or OS).

madgar · on Jan 24, 2017

The most common problem is with asymmetric network connections with limited upload bandwidth. If you don't limit your upload rate, BitTorrent will consume your upload bandwidth so thoroughly that even TCP ACKs for other applications aren't sent in a timely fashion.

shawabawa3 · on Jan 24, 2017

The reason is that most ISPs offer much higher download speeds than upload speeds.

Bittorrent very quickly uses 100% of your upload speed, which effectively breaks the internet. The solution is to limit upload speed in your client

offa · on Jan 24, 2017

How much is it usually advisable to limit the upload speed, in terms of a percentage of the total?

shawabawa3 · on Jan 24, 2017

It depends.

My upload speed is awful (~30KB/s), so I limit it to 5KB/s

For higher speeds you can probably get away with limiting it to 50-90%

theonemind · on Jan 24, 2017

As mentioned by someone else, yeah, this is almost certainly the TCP ack thing. If you throttle back the upload about 10KB/s under your max upload speed, it won't choke your download ability.

rakoo · on Jan 24, 2017

For a long time the answer was to throttle downloads as others have said; however today the solution is to use uTP (https://en.wikipedia.org/wiki/Micro_Transport_Protocol), which is specifically designed to yield to other applications that want better interactivity. Pretty much any recent bittorrent software implements it.

stavros · on Jan 24, 2017

Really? I'll take Bittorrent any time, because, no matter how large the transfer, I can just leave my client to download even after I close my browser. Since I have a home NAS with Deluge installed, I can even turn my PC off and the torrent will keep going (and will send a notification to my phone when done), etc.

masklinn · on Jan 24, 2017

You seem to have missed GP's point entirely.

sand500 · on Jan 24, 2017

>Some network operators do funny things, so a client can actually get higher transfer performance by connecting multiple times and transferring parts of the data in several simultaneous connections.

Actually there is a FF extension call downthemall that multithreads HTTP downloads which will infact max out your inbound speed just like torrents do. As far as router are concerned if you set a particular device with a higher QOS then the one that is torrenting you should not have the problem.

niftich · on Jan 24, 2017

Bittorrent has innate support for content addressing, while HTTP is location-addressed: there is simply no way to decouple a content's location from its identity with HTTP. You can try to fake it, of course, with more indirection, but each hop requires communications with a specific location reachable with HTTP.

In fortunate circumstances, Bittorrent, it can run fully P2P without needing 'servers' -- depending on whether you have to holepunch a NAT, whether you're okay with peer discovery taking longer in the DHT without a tracker, and whether you're okay with no fallback webseed as a seed-of-last-resort. Magnet links, a distinct concept not only applicable to Bittorrent, are the cherry on top, which enable even the ".torrent" file (or rather, the equivalent information) to be found without having to possess the .torrent file itself, from just a hash.

BT's architecture has a nice effect that popular content consumes fewer of the original host's bandwidth than it would with HTTP, but this works best if peer interest roughly coincides in time. This is why it's a good fit for distributing patches [1][2], or newly released episodic content, and a fairly poor fit for anything else.

The Bittorrent extensions (BEPs) vary widly in quality, clarity, verbosity, and rigor, and aren't always up to the level of HTTP extensions acknowledged by the IETF. Most BEPs were implemented in only one product and then submitted as a BEP retroactively, while throughout its lifetime most of HTTP's enhancements were discussed and developed in a semi-open, but publicly viewable process in the IETF with more emphasis on consensus and early prototyping, rather than final implementation and retroactive standardization. These days this process has partially been subsumed by prominent vendors implementing a behavior, running it for an extended amount of time as a proprietary enhancement, then submitting a slightly altered version to the IETF for discussion, so admittedly the two enhancement processes are a lot more similar now than they've been in the past.

[1] http://arstechnica.com/business/2012/04/exclusive-a-behind-t... [2] http://wow.gamepedia.com/Blizzard_Downloader

Forbo · on Jan 24, 2017

I really hope that things like ZeroNet start to get more attention/integration into browsers. I see amazing potential for the democratization of the web using technologies like this.

https://zeronet.io/

d33 · on Jan 24, 2017

> HTTP is an established and formalized protocol standard by the IETF.

> Bittorrent is a protocol designed by the company named Bittorrent and there is a large amount of variations and extensions in implementations and clients.

Extensions are in many cases standarized. There's a lot of variations in HTTP as well.

One important thing that's missed is that a magnet link also contains a hash of the torrent file, which points to a specific file. Over HTTP, everyone across the network connection can change the contents of the file and there's no automated way of detecting that.

robryk · on Jan 24, 2017

HTTP is standardized, but lots of stuff doesn't follow the standard in the areas that didn't use to matter. Then we get problems like this:

https://bugs.chromium.org/p/chromium/issues/detail?id=616212

that have to be worked around by assuming that other devices might not adhere to the specification.

RedHatTurtle · on Jan 25, 2017

Isnt IPFS the best adaptation of "bittorrent ideas" to the http general use case?

On an unrelated issue, it baffles me how people with uncapped internet refuse to help serve content to other people, like enabling the distribution of windows updates. I would love to have that on linux and would never dare to turn it of unless is was reaching my data cap. It's just liek seeding you torrents, hell i have seed ratios of over 100 for some linux distro images.

LinuxBender · on Jan 24, 2017

My personal preference would be MPTCP, then BitTorrent, Googles idea for UDP HTTP, then plain TCP HTTP.

Many things can benefit from MPTCP however, including BitTorrent.

sleepychu · on Jan 24, 2017

Not sure if Daniel reads this but:

>only gets a few

/s/gets/has/

sigi45 · on Jan 24, 2017

So much wrong/missing and missleading information. It would be better to delete it.

inconclusive · on Jan 24, 2017

Why we still use HTTP is beyond me. And I don't mean about the speed issues. Why have a protocol that's so complicated when most of the things we need to build with it are either simpler or reimplement parts of the protocol.

mreithub · on Jan 24, 2017

Could you elaborate your issues with HTTP a bit? What kind of protocol would do a better job?

Minimal implementations of HTTP (and I'm strictly talking about the transport protocol, not about HTML, JS, ...) is dead simple and relatively easy to implement.

Of course there's a ton of extensions (gzip compression, keepalive, chunks, websockets, ...), but if you simply need to 'add HTTP' to one of your projects (and for some reason none of the existing libraries can be used) it shouldn't take too many lines of code until you can serve a simple 'hello world' site.

On top of all that, it's dead simple to put any one of the many existing reverse proxies/load balancers in front of your custom HTTP server to add load balancing, authentication, rate limiting (and all of those can be done in a standard way)

Furthermore, HTTP has the huge advantage of being readily available on pretty much every piece of hardware that has even the slightest idea of networking. Any new technology would have to fight a steep uphill battle to convince existing users to switch.

Have I mentioned that it's standardized and open?

lmm · on Jan 24, 2017

I do see a lot of application protocols tunnelled over HTTP that have no sane reason to be. Partly to work around terrible firewalls/routers/etc. - but of course the willingness to work around those perpetuates their existence. E.g. one reason for the rise of Skype was so many crappy routers that couldn't handle SIP.

ue_ · on Jan 24, 2017

My friend once mentioned that FTP would be a good option, I'm not sure why though. I think they regarded HTTP as superfluous for the purpose of what we use the web for.

protomyth · on Jan 24, 2017

Anyone who has to deal with ftp on firewalls would say ftp (and ftps) would do well to disappear from the world.

mreithub · on Jan 24, 2017

It's not just firewalls. The fact that (unencrypted) FTP is still widely used today when better alternatives like SFTP (via SSH) have existed for years strikes me as odd.

(I'm speaking about authenticated connections. For anonymous access - which should be read-only anyway - you're usually better off using HTTP anyway)

ashark · on Jan 24, 2017

I once had to provide an FTP-like interface to user directories for the website's users. Couldn't find an easy way to do it with SFTP without creating Linux users. Found an FTPS daemon that would let me call an external script for auth and set a root directory, which made it trivial (once I deciphered its slightly-cryptic docs).

So in that case, at least, I was very glad FTP(S) was still around.

shakna · on Jan 24, 2017

HTTP tends to be faster for what we use the web for: [0]

FTP does have some advantages, but HTTP has more advanced support for resuming connections, virtual hosting, better compression, and persistent connections, to name a few.

[0] https://daniel.haxx.se/docs/ftp-vs-http.html

mreithub · on Jan 24, 2017

Interesting...

I bet if we had used FTP instead of HTTP for serving HTML right from the start, FTP would today have all of the same extensions and the same people would argue for it being too bloated :) (HTTP started as pretty minimalistic protocol back in the day)

I often find the discrepancy between what HTTP has originally been designed for (serving static HTML pages) and all the different things it's being used for today highly amusing. Yes, some of todays applications for HTTP border on abuse, but its versatility (combined with its simplicity) fascinates me.

kodfodrasz · on Jan 24, 2017

No, because FTP is stateful, thus it sould not have scaled well to many HTTP usecases of today, and something alse would have been born probably, to solve the problems with statelessness, which can be solved by statelessness.

The two success factors of http are statelessness and fixed verbs.

kodfodrasz · on Jan 24, 2017

Nice trolling! :)

http://i3.kym-cdn.com/photos/images/original/000/732/170/796...

HTTP is quite a good protocol. Simple, extensible to a sane extent, but not overly extensible (XMPP i'm thinking about you). HTTP is not accidentally successful. FTP is a bad joke. (stateful. binary mode, 7 bit by default. uses multiple connections (unless in passive mode))

jaimehrubiks · on Jan 24, 2017

Basic http is dead simple, it works, and it also has many addons with backward compatibility (one can still use a basic http client or server in most cases) and even new version fully optimized to nowadays needs (and even in binary form)

shakna · on Jan 24, 2017

I'm not as familiar with HTTP 2, so I'll only talk about the previous specifications...

Which are dead simple to construct, send, receive and parse.

Really.

For example, let's curl -L (view everything but the body) for the spec: http://www.ietf.org/rfc/rfc7230.txt

HTTP/1.1 200 OK

Date: Tue, 24 Jan 2017 12:00:55 GMT

Content-Type: text/plain

Transfer-Encoding: chunked

Connection: keep-alive

Set-Cookie: __cfduid=df57c7720b704a40e4c3367bbe248771c1485259254; expires=Wed, 24-Jan-18 12:00:54 GMT; path=/; domain=.ietf.org; HttpOnly

Last-Modified: Sat, 07 Jun 2014 00:41:49 GMT

ETag: W/"3247b-4fb343e4dcd40-gzip"

Vary: Accept-Encoding

Strict-Transport-Security: max-age=31536000

X-Frame-Options: SAMEORIGIN

X-Xss-Protection: 1; mode=block

X-Content-Type-Options: nosniff

CF-Cache-Status: EXPIRED

Expires: Tue, 24 Jan 2017 16:00:54 GMT

Cache-Control: public, max-age=14400

Server: cloudflare-nginx

CF-RAY: 326353e6a6a257a7-IAD

A bunch of newline, (CRLF), seperated key-value mappings. Some with a DSL (Such as Set-Cookie).

It gives you a status message instantly, a date to check against cache, a Content-Type for your parser, acceptable encoding, for your parser, a bunch of other values for your cache. All for free.

As for the body of the content? For a gzipped value like this, it's everything outside the header, until EOF. That's not quite as easy as when the content-length parameter is given, but hardly difficult for parsing.

HTTP is easy.

In fact, HTTP is so easy, that in-complete HTTP servers can still serve up real content, and browsers can still read it.

HTTPS is more complicated, but if you simply rely on certificate stores and CAs, it becomes much easier, but HTTPS is a different protocol.

motoboi · on Jan 24, 2017

> As for the body of the content? For a gzipped value like this, it's everything outside the header, until EOF. That's not quite as easy as when the content-length parameter is given, but hardly difficult for parsing.

This is chunked and keep- alive. Things get a little trickier

shakna · on Jan 24, 2017

True, you keep the connection open, and receive a length of expected bytes, and then said bytes, until 0 is sent. Still simple enough that there are a dozen implementations of less than a page, only a search away.

thrillgore · on Jan 24, 2017

Based on my Snort statistics, a small but increasing number of sites and service providers are starting to expose support for the QUIC protocol.

recrof · on Jan 24, 2017

QUIC is not application-layer protocol.

ainiriand · on Jan 24, 2017

Why we use HTTP you ask? Because it works. How about that.