
Instant.io – Streaming file transfer over WebTorrent - zerognowl
https://instant.io/
======
jzelinskie
I work on some BitTorrent software and while it's a really cool protocol, it
isn't designed to sequentially stream data. Some clients support streaming,
but the act of prioritizing sequential chunks of data rather than chunks that
are most likely to be unavailable in the future is bad behavior for the
collective group of peers.

I haven't personally given much thought to solving the problem of streaming,
but I am surprised that the WebTorrent FAQ doesn't mention why they didn't
take this opportunity to design a protocol that has more suitable trade-offs
than BitTorrent. I'm getting mixed messaging; is their goal to connect the
BitTorrent network with WebRTC or enable high quality P2P streaming via
WebRTC?

~~~
feross
Hi, creator of WebTorrent here.

> [BitTorrent] isn't designed to sequentially stream data

We’re working on improving the algorithm to switch back to a rarest-first
strategy when there is not a high-priority need for specific pieces. In other
words, when sufficient video is buffered, there’s no need to deviate from the
normal piece selection algorithm.
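
As a rough illustration of the strategy described above (this is not WebTorrent's actual picker), here is a minimal Python sketch. The function name `pick_piece`, the `buffer_pieces` window, and the `availability` list are assumptions made up for the example: pieces inside the playback window are fetched in order, and once that window is filled the picker falls back to rarest-first.

```python
from typing import Optional, Sequence, Set


def pick_piece(
    availability: Sequence[int],
    have: Set[int],
    playhead: int,
    buffer_pieces: int,
) -> Optional[int]:
    """Return the index of the next piece to request, or None if nothing is left.

    availability[i] is the number of connected peers that have piece i
    (a hypothetical input; a real client tracks this from peer bitfields).
    """
    total = len(availability)
    # Pieces we still lack and that at least one peer can serve.
    wanted = [i for i in range(total) if i not in have and availability[i] > 0]
    if not wanted:
        return None

    # High-priority window: the next `buffer_pieces` pieces from the playhead.
    window_end = min(playhead + buffer_pieces, total)
    urgent = [i for i in wanted if playhead <= i < window_end]
    if urgent:
        # Buffer not full yet: fetch in playback order.
        return min(urgent)

    # Buffer is full: fall back to rarest-first (ties go to the lowest index).
    return min(wanted, key=lambda i: (availability[i], i))


availability = [5, 5, 1, 3, 2, 4]
print(pick_piece(availability, set(), playhead=0, buffer_pieces=2))   # 0 (sequential)
print(pick_piece(availability, {0, 1}, playhead=0, buffer_pieces=2))  # 2 (rarest)
```

With a small window the picker behaves like a normal rarest-first client most of the time. With a window as large as the whole torrent it degenerates into pure sequential download.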

But the fact is that with the speed of today’s internet connections, the user
is going to finish fully downloading the torrent in a fraction of the time it
takes to view it, so they will still spend more time seeding than downloading.

In practice, the only time that the rarest-first algorithm is important is on
poorly-seeded torrents, or in the first few hours of a torrent being published
when the ratio of seeders to leechers is really bad. I plan to keep improving
the piece selection algorithm so that WebTorrent can be a good citizen.

Also: you should note that not all WebTorrent users stream sequentially.
That's just one option for downloading the data.

Also: It's noteworthy that BitTorrent Inc.'s official torrent client (as well
as the largest player by marketshare), uTorrent, offers sequential
downloading, as well as selective file downloading. And the BitTorrent network
remains very healthy.

> why they didn't take this opportunity to design a protocol that has more
> suitable trade-offs than BitTorrent

BitTorrent is the most successful, most widely-deployed P2P protocol in
existence. It works _really_ well. My goal with WebTorrent was to bring
BitTorrent to the web in a way that interoperates with the existing torrent
network.

Re-inventing the protocol would have made WebTorrent fundamentally
incompatible with existing clients and prevented adoption. The way we've done
it is better. The wire protocol is exactly the same, but there's now a new way
to connect to peers: WebRTC, in addition to the existing TCP and uTP.

Also, re-inventing the protocol is a huge rabbit hole. There was already a lot
of risk when I started the project -- will WebRTC get adopted by all the
browser vendors? Will data channel stabilize and be performant? Is JavaScript
fast enough to re-package MP4 videos on-the-fly for streaming playback with
the MediaSource API? My thinking was: Why add inventing a new wire protocol
and several algorithms to the table?

Thanks for your thoughtful comment. Hope you'll give WebTorrent and our new
desktop app, WebTorrent Desktop, a try!

~~~
the8472
> We’re working on improving the algorithm

That sounds like you prioritized implementing streaming first over being a
good citizen.

> Also: It's noteworthy that BitTorrent Inc.'s official torrent client (as
> well as the largest player by marketshare), uTorrent, offers sequential
> downloading, a

To my knowledge that is only available if the swarm condition allows and is
not purely sequential. But that is second-hand knowledge, so I may be wrong.
But either way, _the default_ is rarest-first.

~~~
chias
> That sounds like you prioritized implementing streaming first over being a
> good citizen.

I read that as "it sounds like you prioritized getting a working proof-of-
concept first over working out the long-term details".

~~~
the8472
But you don't need streaming for a bittorrent-over-webrtc PoC.

And those "long term details" are implemented by all bittorrent clients, so
they're hardly something novel that needs figuring out.

~~~
chias
There's a difference between an industry-wide proof of concept and a personal
one. If I was making a text editor, I would begin by focusing on making a
proof of concept that I could accomplish the features that I wanted in the way
that I wanted to do it -- it wouldn't help much to say "emacs has that feature
so that's fine".

These long term details are implemented by all _established_ bittorrent
clients. I would bet that version 0.1alpha of many of them did not, but were
rather in a state of "holy moly this works! I should go show HN".

~~~
the8472
WebTorrent is 2 years old and seems to have several active contributors. Do
you really think the "0.1 prototype" argument applies here?

Not to mention we're not talking about some optional, nice-to-have feature
here, we're talking about a core aspect of bittorrent which gives it
robustness.

Also, you forgot to address my other argument.

~~~
chias
Yep! I didn't address your other argument because I accept it and there's
nothing about it I disagree with. I don't disagree with any of what you just
said, either.

I just wanted to point out that "streaming over web-torrents" is the feature
being demo'd here, which means that (a) it's a new feature (I assume?) to this
project / these developers, and (b) it's clearly something they feel is a
nice-to-have feature, because they not only chose to spend time making it, but
also announced to HN when they had a working PoC. If people never posted
something to HN until they were "100% complete", I think this place would be a
lot less interesting than it is.

------
ckdarby
I see a lot of points about how this isn't exactly the best practice but I
still don't follow why it isn't the best practice.

Say you're already running a video sharing site and your servers are serving
up all the content to the clients. So, you add your servers as seeders. The
client comes in with support for webRTC, requests packets in order, gets your
servers as seeders along with a couple other people watching the video and
everyone goes along their merry way.

The rare portions don't seem to be an issue because your servers are always
seeds, always running, and already have the capacity to support all the
demand.

Is this not a win/win to reduce some bandwidth consumption?

~~~
rakoo
Absolutely, all the talk about rejecting streaming really concerns the "true"
p2p swarms, where everybody can be a seeder and everybody can be a leecher,
and there is only one "true" source, the original seeder. In those cases the
peers can go down at any moment in time so it is very important for the swarm
vitality that pieces be distributed as efficiently as possible.

Your scenario is more or less the same as what we have today for those swarms
that are comprised of many peers on the desktop and a few high-speed always-on
seedboxes that already act like some kind of CDN.

The more seeders there are, the better, in any situation. The question is
whether, in the swarm we're talking about, you can expect some seeders to be
relatively long-lived (in which case streaming is ok) or whether we are in a
free-for-all (in which case streaming is not). Not all swarms are of the first
type, far from it.

------
amlib
Tried to stream a 20GB mkv and it ate all my memory until the oom killer took
over =(

~~~
nashashmi
Yup, it uses the browser's memory to transfer files.

Until browser apps can be given permission to access the file system, this
will be the case.

~~~
EE84M3i
We have this today, it's just that this site doesn't support it. Only place
I've seen this used (other than thumbnailing images that are drag and dropped
on imgur and whatnot) is on Mega.

Spec: [https://www.w3.org/TR/FileAPI/](https://www.w3.org/TR/FileAPI/)

MDN: [https://developer.mozilla.org/en-US/docs/Web/API/File_and_Di...](https://developer.mozilla.org/en-US/docs/Web/API/File_and_Directory_Entries_API/Introduction)

~~~
feross
I think the spec you're looking for is the FileSystem API
([https://developer.mozilla.org/en-US/docs/Web/API/File_and_Di...](https://developer.mozilla.org/en-US/docs/Web/API/File_and_Directory_Entries_API)),
not the File API.

Unfortunately, this is a non-standard API implemented only by Chrome. You have
to use other APIs like IndexedDB and WebSQL (also deprecated and non-standard)
to get a working solution in all browsers.

This deficiency is really holding back the web.

~~~
erbbysam
Yup, this works -
[https://github.com/erbbysam/webRTCCopy/blob/master/client/js...](https://github.com/erbbysam/webRTCCopy/blob/master/client/js/file-io.js)
(not the cleanest code)

You can then use idb.filesystem.js to add api support for firefox etc. Search
the file above for "is_chrome" for a few idb.filesystem.js-specific quirks.

Looking at that page, it looks like firefox will ship with support in version
50?

~~~
szimek
I'm using idb.filesystem.js in
[https://www.sharedrop.io](https://www.sharedrop.io), so that only a very small
part of the transferred file is stored in memory, but then without asking
users for permission (i.e. using non-persistent storage) you "only" get ~4GB
(not sure exactly, I tested it with files up to 1.5GB).
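
To illustrate the general idea of keeping only a small part of the file in memory, here is a minimal sketch in Python rather than browser code. It is not the idb.filesystem.js API. The class name `DiskPieceStore` and its methods are hypothetical. Each received piece is written straight to its offset in a pre-sized file, so memory use stays around one piece regardless of total file size.

```python
import os
import tempfile


class DiskPieceStore:
    """Hypothetical store that writes each piece to disk as it arrives."""

    def __init__(self, path: str, total_length: int, piece_length: int) -> None:
        self.path = path
        self.piece_length = piece_length
        # Pre-size the file so pieces can be written at any offset, in any order.
        with open(path, "wb") as f:
            f.truncate(total_length)

    def write_piece(self, index: int, data: bytes) -> None:
        # Only `data` (one piece) is held in memory at a time.
        with open(self.path, "r+b") as f:
            f.seek(index * self.piece_length)
            f.write(data)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "download.bin")
    store = DiskPieceStore(target, total_length=8, piece_length=4)
    store.write_piece(1, b"EFGH")  # pieces may arrive out of order
    store.write_piece(0, b"ABCD")
    with open(target, "rb") as f:
        print(f.read())  # b'ABCDEFGH'
```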

------
agumonkey
Cool, a little more standalone than [http://file.pizza](http://file.pizza)

tests:

[https://instant.io/#74ce2f164e3d9ec5d5ee72c9aafc0cf5860e3d92](https://instant.io/#74ce2f164e3d9ec5d5ee72c9aafc0cf5860e3d92)

[https://instant.io/#c241674dc3b257637abfcb08203303fc25de007f](https://instant.io/#c241674dc3b257637abfcb08203303fc25de007f)

------
stryk
I recently found this service: [https://reep.io](https://reep.io) which, I
believe, uses WebRTC to directly transfer between 2 browsers (they claim that
after the initial 'handshake' they are out of the equation). I'm curious how
it compares with Instant.io for simple file sharing use cases (example: send
my mom a movie of my kids that is too large of a filesize to email, in a
manner sufficient for a non-technical person to be able to easily receive,
view, and save)

------
arj
Does anyone know if this works privately? Or if there is a good way to seed
files only between friends?

~~~
rakoo
Introduce something only you and your friends know. Zipping with 0 compression
and a password or a random file will make the swarm completely independent
from any other one.
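
A minimal sketch of why this works, assuming the standard BitTorrent v1 rule that a swarm is identified by the SHA-1 of the bencoded `info` dictionary. The helper names (`bencode`, `info_hash`) and the file names are made up for the example. Adding a random file changes the `info` dictionary, so the torrent gets a different info hash and therefore a different swarm.

```python
import hashlib
import os


def bencode(value) -> bytes:
    """Minimal bencoder for ints, bytes, str, lists and dicts."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must be emitted in sorted order.
        items = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v) for k, v in value.items()),
            key=lambda kv: kv[0],
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def info_hash(name: str, files, piece_length: int = 16384) -> str:
    """SHA-1 of the bencoded multi-file info dict; files is [(path, data), ...]."""
    blob = b"".join(data for _, data in files)
    pieces = b"".join(
        hashlib.sha1(blob[i:i + piece_length]).digest()
        for i in range(0, len(blob), piece_length)
    )
    info = {
        "files": [{"length": len(data), "path": [path]} for path, data in files],
        "name": name,
        "piece length": piece_length,
        "pieces": pieces,
    }
    return hashlib.sha1(bencode(info)).hexdigest()


video = b"x" * 50000  # stand-in for the real file contents
plain = info_hash("share", [("video.mp4", video)])
salted = info_hash("share", [("video.mp4", video), ("salt.bin", os.urandom(32))])
print(plain)
print(salted)  # differs from `plain`, so peers end up in a separate swarm
```

Note that this only separates the swarm. Anyone who learns the salted info hash can still join it, which is where the password-protected zip variant comes in.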

------
petre
Never worked for me. Tried to download Debian, no luck.

------
caub
Why not simply use WebRTC?

------
mirimir
Cool! But be aware WebRTC leaks public IP address for VPN users, and also
leaks hashes of device IDs.[0] And in Chrome, it's very hard to block. This is
a dangerous mix with talk of torrents :(

[0] [https://www.browserleaks.com/webrtc](https://www.browserleaks.com/webrtc)

Edit: From feross I get that WebRTC no longer leaks ISP-assigned IPs when
using VPNs.

~~~
feross
> WebRTC leaks public IP address for VPN users

This is incorrect.

WebRTC data channels do not allow a website to discover your public IP address
when there is a VPN in use. The WebRTC discovery process will just find your
VPN's IP address and the local network IP address.

Local IP addresses (e.g. 10.x.x.x or 192.168.x.x) can potentially be used to
"fingerprint" your browser and identify across different sites that you visit,
like a third-party tracking cookie. However, this is a separate issue from
exposing your real public IP address, and it's worth noting that the browser
already provides hundreds of vectors for fingerprinting you (e.g. your
installed fonts, screen resolution, browser window size, OS version, language,
etc.).
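
For readers unsure which addresses count as "local" here, a small sketch using Python's standard `ipaddress` module. The function name and the sample addresses are hypothetical. It checks an address against the RFC 1918 private ranges, which include the 10.x.x.x and 192.168.x.x examples above.

```python
import ipaddress

# RFC 1918 private IPv4 ranges.
RFC1918 = [
    ipaddress.ip_network(n)
    for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_local_address(addr: str) -> bool:
    """True if addr falls in one of the RFC 1918 private ranges."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in RFC1918)


# Hypothetical addresses, e.g. as they might appear in ICE candidates.
for addr in ("10.1.2.3", "192.168.0.10", "8.8.8.8"):
    kind = "local" if is_local_address(addr) else "public"
    print(addr, kind)
```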

If you have a VPN enabled, then WebRTC data channels will not connect to peers
using your true public IP address, nor will it be revealed to the JavaScript
running on the webpage.

At one point in time, WebRTC did have an issue where it would allow a website
to discover your true public IP address, but this was fixed a long time ago.
This unfortunate misinformation keeps bouncing around the internet.

There's now a spec that defines exactly which IP addresses are exposed with
WebRTC. If you're interested in further reading, you can read the IP handling
spec for yourself.

[https://tools.ietf.org/html/draft-ietf-rtcweb-ip-handling-01](https://tools.ietf.org/html/draft-ietf-rtcweb-ip-handling-01)

~~~
mirimir
Thank you. That's good to know. So is that now the case in all browsers?

~~~
feross
It's the case in Chrome, Firefox, and Brave. I assume Opera is the same since
it uses Chromium under-the-hood. I don't know about Microsoft Edge.

