Tus.io: Open Protocol for Resumable File Uploads (tus.io)
72 points by chrisfarms on Nov 18, 2015 | 28 comments



It will need to monitor latency and induce at most 20-50ms of bufferbloat when uploading, regardless of whether the traffic is high or low priority.

Try this demo for yourself:

1. Run "ping google.com" from another computer on your LAN.

2. Upload a 10-20MB file via Gmail or Dropbox from your computer.

3. Watch the ping times on the other computer skyrocket from around 100ms to upwards of 5-10 seconds.

4. Try a Google search from any other computer on your LAN while this is happening.

As an example, Apple's software update uses a variant of LEDBAT (the delay-sensitive congestion avoidance algorithm from BitTorrent's protocol) when downloading updates, to avoid inducing bufferbloat in the downlink.
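
To make the delay-sensitive idea concrete, here is a minimal Python sketch of an application-level pacer loosely inspired by LEDBAT. It is not LEDBAT itself and not what Apple or BitTorrent ship: real LEDBAT works on one-way delay and a congestion window, while this sketch assumes the uploader can only sample round-trip times and adjust a byte rate. The class name, the 25 ms target (picked from the 20-50ms range above), the gain, and the rate limits are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class DelayBasedPacer:
    """Adjusts an upload rate from RTT samples (LEDBAT-inspired, simplified)."""

    target_delay_ms: float = 25.0      # assumed queuing-delay budget
    rate_bps: float = 100_000.0        # current allowed send rate, bytes/second
    min_rate_bps: float = 10_000.0     # floor so the upload never stalls completely
    max_rate_bps: float = 10_000_000.0
    gain: float = 0.25                 # fraction of the rate adjusted per sample
    base_rtt_ms: float = float("inf")  # lowest RTT seen so far

    def on_rtt_sample(self, rtt_ms: float) -> float:
        # The smallest RTT observed approximates the path with empty queues.
        self.base_rtt_ms = min(self.base_rtt_ms, rtt_ms)
        queuing_delay_ms = rtt_ms - self.base_rtt_ms
        # Positive when under the budget (speed up), negative when over (slow down).
        off_target = (self.target_delay_ms - queuing_delay_ms) / self.target_delay_ms
        # Clamp so a single bad sample can at most halve the rate.
        off_target = max(off_target, -2.0)
        self.rate_bps *= 1.0 + self.gain * off_target
        self.rate_bps = max(self.min_rate_bps, min(self.max_rate_bps, self.rate_bps))
        return self.rate_bps
```

An uploader could feed it one RTT sample per acknowledged chunk and pace the next chunk at the returned rate.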


When you think about it, it's pretty weird that we've had resumable, buffered streaming video modules for server-side stacks for years now, but nothing standardized for sending data in the other direction.

I guess getting things into the server stack can roll out slowly, with very limited and spotty adoption.

Browser support on the other hand...

Well, need I continue?

Suffice it to say that, even without the open warfare between rival browser projects and the nightmare-mode security vulnerabilities in common browser plug-ins, a new protocol will only make a difference once hundreds of millions of users actually latch onto it and it becomes something of a norm.


If anyone is interested in prior art, take a look at Google Cloud Storage's resumable uploads: https://cloud.google.com/storage/docs/json_api/v1/how-tos/up...

It's very domain specific and relies on GCP's JSON API so probably isn't suitable to apply broadly.


Hi, one of the authors here. Indeed, resumable uploads are not a thing we invented. Nevertheless, we think it's handy to have a document describing the approach, along with client and server implementations. Amazon, Dropbox and Google offer some description of how they do it, but nothing that addresses the problems a wider audience could encounter, and no platform for collaboration to make it better. tus was built under the MIT license and on GitHub, and will continue to evolve through contributions from the community (although we consider what we cemented into 1.0.0 ready for adoption).


Resumable file transfers in JS are really cool.

But why a totally new protocol? Why not use bittorrent, rsync, or even ftp? A lot of effort has gone into making those reliable in the face of network congestion. Trying to shove a lot of bits down a single TCP connection with no traffic shaping is just asking for bufferbloat to strike.


Because these protocols are unusable from the perspective of JS on a webpage in a browser. Yes, browsers do support FTP, but I don't believe there are any browsers that support FTP uploads (and even if there are any, I doubt it'd be possible to upload from JS).

HTTP also has the advantage of being allowed through virtually every firewall.

It seems that webapps are paving the way of the future, for better or worse.


Hi, one of the authors here. Thanks for raising a valid concern. Vimeo in fact already offers FTP uploads for premium users, but helped us write this as the advantages of something HTTP based are significant. We can:

- be sure that browsers and servers already speak it (so only small additional libraries are required on both ends)

- gracefully degrade to regular form POSTs should tus support not be possible (IE8, etc)

- easily write hooks for progress bars, and e.g. encoding of the uploaded material

- use existing holes in firewalls. FTP requires many ports (for PORT, DATA, PASV) that are blocked at airports, public libraries, and large corporations

- add existing HTTP components to make tus better (loadbalancing, intrusion detection, auth, proxies, etc). Some of these apply to FTP as well, but the HTTP options are more (advanced).

Bittorrent is not really client-server oriented, which is the case this protocol is trying to solve. Rsync has many of the same disadvantages as FTP listed above.

As for a single TCP connection with no traffic shaping, tus supports splitting up a file into multiple parts, uploading them in parallel (as regular tus uploads, so profiting from checksums, retries, and resumability), and stitching them together on the server side by issuing a 'concat'. This means you can have as many connections as you like, and the individual chunks can complete in any order as well.
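
A rough Python sketch of the split-then-concat idea follows. The header names and values (`Upload-Concat: partial`, `Upload-Concat: final;...`) follow the 1.0.0 concatenation extension as I read it, so check the spec before relying on them. The function names and the `/files/...` URLs are hypothetical, and no network calls are made: the sketch only computes byte ranges and the creation headers a client might send.

```python
from typing import Dict, List, Tuple


def split_ranges(total_size: int, parts: int) -> List[Tuple[int, int]]:
    """Return (start, end) byte ranges, end exclusive, covering total_size."""
    if total_size <= 0 or parts <= 0:
        raise ValueError("total_size and parts must be positive")
    parts = min(parts, total_size)  # never create empty parts
    base, extra = divmod(total_size, parts)
    ranges = []
    start = 0
    for i in range(parts):
        # The first `extra` parts take one additional byte each.
        length = base + (1 if i < extra else 0)
        ranges.append((start, start + length))
        start += length
    return ranges


def partial_creation_headers(length: int) -> Dict[str, str]:
    """Headers for creating one partial upload of the given length."""
    return {
        "Tus-Resumable": "1.0.0",
        "Upload-Concat": "partial",
        "Upload-Length": str(length),
    }


def final_creation_headers(partial_urls: List[str]) -> Dict[str, str]:
    """Headers for creating the final upload; URL order defines byte order."""
    # No Upload-Length here: the server derives it from the partial uploads.
    return {
        "Tus-Resumable": "1.0.0",
        "Upload-Concat": "final;" + " ".join(partial_urls),
    }
```

Each range would be sent as its own ordinary upload, in any order and over any number of connections, and the final request lists the partial upload URLs in the order the bytes should be joined.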

Traffic shaping is not part of the protocol. By default tus should saturate available connections.

That said we're writing down some recommendations in a separate 'developer guide' document that offers best practices for implementing and deploying tus, traffic shaping could be part of it. Feel free to weigh in here https://github.com/tus/tus-resumable-upload-protocol/pull/68


rsync was the first thing that came to my mind as well. You can even use it for entire directories of stuff.


The 100-Continue idea mentioned in the 3 year old top comment seems like a much better idea. The issues on the tus issue tracker seem to discard this prematurely.


Hi, one of the authors here. 100 Continue wasn't discarded that quickly and is part of the protocol. More info here: https://github.com/tus/tus-resumable-upload-protocol/issues/...


I saw that issue. As I read it, it discards using 100 continue to solve the original problem.


Do you have to use the tus server written in go (https://github.com/tus/tusd)? Possible to use a node.js backend?


You could implement the protocol in node.js. Here's the standards document: https://github.com/tus/tus-resumable-upload-protocol/blob/ma...

edit: Looks like someone's already done that: https://github.com/vayam/brewtus (found on http://tus.io/implementations.html, which also has links to Ruby, PHP and CoffeeScript server implementations, and Qt C++, PHP, Go, and Python clients)
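
For anyone weighing up writing their own server, the core offset handshake is small. Below is a minimal in-memory Python model of it, under stated assumptions: the header names, the `application/offset+octet-stream` content type, and the 204/409/415 statuses are as I read the 1.0.0 core protocol, while `InMemoryUpload`, `resume_upload`, and the 400 returned for oversized bodies are made up for this sketch. It does no real HTTP and is not how tusd or brewtus are implemented.

```python
from typing import Dict, Tuple

TUS_VERSION = "1.0.0"
OFFSET_CONTENT_TYPE = "application/offset+octet-stream"

Response = Tuple[int, Dict[str, str]]


class InMemoryUpload:
    """Hypothetical stand-in for one upload resource on a tus-style server."""

    def __init__(self, upload_length: int) -> None:
        self.upload_length = upload_length
        self.data = bytearray()

    def head(self) -> Response:
        # HEAD tells the client how many bytes the server already holds.
        return 200, {
            "Tus-Resumable": TUS_VERSION,
            "Upload-Offset": str(len(self.data)),
            "Upload-Length": str(self.upload_length),
        }

    def patch(self, headers: Dict[str, str], body: bytes) -> Response:
        # Header names are matched case-sensitively to keep the sketch short.
        if headers.get("Content-Type") != OFFSET_CONTENT_TYPE:
            return 415, {"Tus-Resumable": TUS_VERSION}
        if headers.get("Upload-Offset") != str(len(self.data)):
            # The client's offset is stale: refuse rather than corrupt the file.
            return 409, {"Tus-Resumable": TUS_VERSION}
        if len(self.data) + len(body) > self.upload_length:
            # Status code is an arbitrary choice for this sketch.
            return 400, {"Tus-Resumable": TUS_VERSION}
        self.data.extend(body)
        return 204, {"Tus-Resumable": TUS_VERSION, "Upload-Offset": str(len(self.data))}


def resume_upload(upload: InMemoryUpload, payload: bytes, chunk_size: int) -> int:
    """Ask for the current offset, then send the remaining bytes in chunks."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    _, head_headers = upload.head()
    offset = int(head_headers["Upload-Offset"])
    while offset < len(payload):
        chunk = payload[offset:offset + chunk_size]
        status, response_headers = upload.patch(
            {
                "Tus-Resumable": TUS_VERSION,
                "Content-Type": OFFSET_CONTENT_TYPE,
                "Upload-Offset": str(offset),
            },
            chunk,
        )
        if status != 204:
            raise RuntimeError(f"upload rejected with status {status}")
        offset = int(response_headers["Upload-Offset"])
    return offset
```

The point of the design is that the server is the source of truth for the offset: after a dropped connection the client asks where things stand instead of guessing, and a PATCH with the wrong offset is rejected outright.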


Hi, one of the authors here. We're currently working on an official implementation in Node.js (ES6) here: https://github.com/tus/tus-node-server - so yes :)


As much as I'd like to see this using an existing standard, I'm very glad - my upload bandwidth is <1 megabit, and I get several disconnections a day (ADSL desync)


Hi, one of the authors here. I really hope the sites that you're using a lot will all implement tus :) I'm confident that could avoid some frustrations


Does this allow traffic shaping on the client? Or will this "stuff the pipe" once an upload starts?


Hi, one of the authors here. It would, but traffic shaping is not part of the protocol itself and by default tus will saturate available connections.

That said we're writing down some recommendations in a separate 'developer guide' document that offers best practices for implementing and deploying tus, traffic shaping could be part of it. Feel free to weigh in here https://github.com/tus/tus-resumable-upload-protocol/pull/68
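
Since shaping is left to the client, one common way to do it is a token bucket in front of the chunk sender. The Python sketch below is only an illustration of that general technique: it is not part of tus or of any tus client, and the class name, method name, and parameters are hypothetical. The clock is injectable so the behaviour can be checked without sleeping.

```python
import time
from typing import Callable


class TokenBucket:
    """Limits the average send rate while allowing short bursts."""

    def __init__(
        self,
        rate_bytes_per_s: float,
        burst_bytes: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate_bytes_per_s
        self.capacity = burst_bytes
        self.tokens = burst_bytes  # start with a full bucket
        self.clock = clock
        self.last = clock()

    def _refill(self) -> None:
        # Add tokens for the time elapsed, never exceeding the burst capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def delay_for(self, nbytes: int) -> float:
        """Reserve nbytes and return the seconds to wait before sending them."""
        self._refill()
        # The balance may go negative; the debt is repaid by waiting.
        self.tokens -= nbytes
        if self.tokens >= 0:
            return 0.0
        return -self.tokens / self.rate
```

A sender would call `time.sleep(bucket.delay_for(len(chunk)))` before each chunk, which caps the long-run rate at `rate_bytes_per_s` and leaves headroom for other traffic on the link.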


I think traffic shaping is very desirable when using a websocket for a separate command-channel. The last thing you want is an upload blocking all communications over that command-channel.

Furthermore, browsers can open only a maximum number of connections at the same time (I believe the maximum on some browsers is even just 2). What if the application already uses one connection for a websocket? Will there be only 1 connection left for the upload? And what if the browser needs to download other resources in the background, such as images, fonts, etc.?

Just some concerns, good luck with the project :)


Some valid concerns indeed! You may run dedicated uploading infra which allows more connections (different subdomain/IPs), but it's something we want to be explicit about in our developer guidelines. I'll also see if we can make traffic shaping part of the js browser implementation, as this seems to need it most - thanks!


How long till you guys get a Java server library out?

Currently using AWS API for resumable uploads here.


Hi, one of the authors here. You are the first one to request it, I've documented that here https://github.com/tus/tus-resumable-upload-protocol/issues/....

None of the current core members have experience running Java servers in the real world so it's hard to see what would be involved. We're definitely open to discussing it in the issue if you like.


Why are they not prefixing their headers with "X-"?


I recall it became unnecessary at some point(?)

Might be relevant. https://tools.ietf.org/html/rfc6648


Hi, one of the authors here. Correct, we added that to the protocol FAQ too: https://github.com/tus/tus-resumable-upload-protocol/blob/ma...

The only exception is X-HTTP-Method-Override, as people have been standardizing on this, and the header is specifically meant to let people deploy tus in environments that don't support all of HTTP and are apparently hard to change.
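
As a small illustration of what honoring that header could look like on the server side, here is a Python sketch. The function name is hypothetical and the logic is a simplification: it treats the override, when present, as the request's method, and a real deployment may want to restrict which methods can be overridden.

```python
from typing import Mapping


def effective_method(method: str, headers: Mapping[str, str]) -> str:
    """Return the method the server should act on for this request."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    override = lowered.get("x-http-method-override", "").strip()
    return override.upper() if override else method.upper()
```

For example, a client stuck behind a proxy that only passes GET and POST could send a POST carrying `X-HTTP-Method-Override: PATCH`, and the server would route it as a PATCH.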


Thanks


Out of curiosity why did you decide to use NSURLConnection instead of NSURLSession? I recall NSURLSession also had some code related to resume functionality for both uploads and downloads but my memory is a bit foggy right now.


Particularly since NSURLConnection is deprecated in the current SDK. NSURLSession has been available since iOS 7.

While NSURLSession does support resumable downloads, it isn't able to resume uploads.

The needNewBodyStream: delegate method is the closest thing; it is called when first uploading from an NSInputStream, when a transmission error occurs for an idempotent request, or when an auth challenge occurs after some of the stream has been sent. (Streams cannot be rewound.)

Background NSURLSessions can take a file reference to upload, and ideally will perform the upload "at a good time", hopefully while charging and connected to a stable network. (But it won't resume an upload, just retry if possible.)

Disclaimer: I'm "involved", but not speaking for any parties involved. Mostly because I don't get invited to those sorts of parties.



