All heap access must be synchronized. Using pooled buffers may actually be faster because you can ensure that only one thread is accessing a specific pool. Just create a new pool when a new thread is created.
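To illustrate, here is a minimal sketch of that idea in Go: one worker goroutine per pool, so the free list is never shared and needs no locking. The 32 KiB buffer size and the worker/job counts are arbitrary assumptions, not taken from any real system.

    // One pool per worker: only that worker touches its free list.
    package main

    import "fmt"

    type localPool struct{ free [][]byte } // owned by exactly one worker

    func (p *localPool) get() []byte {
        if n := len(p.free); n > 0 {
            b := p.free[n-1]
            p.free = p.free[:n-1]
            return b
        }
        return make([]byte, 32*1024) // allocate only when the local pool is empty
    }

    func (p *localPool) put(b []byte) { p.free = append(p.free, b) }

    func worker(id int, jobs <-chan []byte, done chan<- struct{}) {
        pool := &localPool{} // created together with the worker, never shared
        for payload := range jobs {
            buf := pool.get()
            fmt.Printf("worker %d handled %d bytes\n", id, copy(buf, payload))
            pool.put(buf)
        }
        done <- struct{}{}
    }

    func main() {
        jobs, done := make(chan []byte), make(chan struct{})
        for i := 0; i < 4; i++ {
            go worker(i, jobs, done)
        }
        for i := 0; i < 8; i++ {
            jobs <- []byte("example payload")
        }
        close(jobs)
        for i := 0; i < 4; i++ {
            <-done
        }
    }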
Or, you know, each thread could allocate only when it needs something and handle its own recycling. Node.js has a particular advantage in that it isn't concurrent in any way (which, of course, has many other downsides).
You can certainly build concurrent systems on top of Node; what it doesn't have is preemptive multi-threading, which sucks IMHO. Speaking of Go, I also don't like that it does M:N multi-threading instead of 1:1 like the JVM, as its scheduling is suboptimal.
Draining is easy. Your health check should be hitting a URL like '/ping' anyway, which responds with an OK if the box is in a reasonable state and willing to serve traffic.
I always add an additional check to see if a file called /tmp/down exists, and if it does, return a 500 for the health checks. Existing clients will continue to be served but that instance will get no new connections.
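For example, a minimal sketch of such a health check in Go (the /ping path and the port are just placeholders for whatever your setup uses):

    // Returns 200 OK on /ping while healthy, and 500 as soon as /tmp/down
    // exists, so the load balancer stops routing new connections here.
    package main

    import (
        "net/http"
        "os"
    )

    func main() {
        http.HandleFunc("/ping", func(w http.ResponseWriter, r *http.Request) {
            if _, err := os.Stat("/tmp/down"); err == nil {
                http.Error(w, "draining", http.StatusInternalServerError)
                return
            }
            w.Write([]byte("OK"))
        })
        http.ListenAndServe(":8080", nil)
    }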
> If you have a better suggestion than using the Range header that will still allow clients to send multiple file chunks in parallel, I'd be very interested in it!
I don't see a way to support parallel transfers using only existing HTTP headers (without violating the HTTP spec). I would suggest proposing a new header in the HTTPbis WG; for example, something like Available-Ranges that returns a ranges-specifier indicating the set of ranges that are available.
This could possibly be returned as part of a 416 response when attempting to GET a file that isn't entirely available yet. A HEAD request would thus return the same thing.
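A hypothetical sketch of what that could look like on the server side, in Go. The Available-Ranges header name is the proposal above, not an existing standard, and the 70-of-100-bytes numbers are made up for illustration:

    package main

    import (
        "fmt"
        "net/http"
    )

    func main() {
        // Pretend bytes 0-69 of a 100-byte file have arrived so far.
        received, total := int64(70), int64(100)

        http.HandleFunc("/files/example", func(w http.ResponseWriter, r *http.Request) {
            if received < total {
                // Advertise which ranges exist, then refuse the incomplete GET.
                w.Header().Set("Available-Ranges", fmt.Sprintf("bytes=0-%d", received-1))
                w.WriteHeader(http.StatusRequestedRangeNotSatisfiable) // 416
                return
            }
            // Fully received: serve the file normally (omitted in this sketch).
        })
        http.ListenAndServe(":8080", nil)
    }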
> It's 100. We haven't specified GET requests yet, but a server could stream an upload in this case until all bytes have been received.
The reason I brought up a GET request is because "the metainformation contained in the HTTP headers in response to a HEAD request SHOULD be identical to the information sent in response to a GET request." (section 9.4 of RFC2616)
If you haven't got all 100 bytes yet, your GET request can't return a Content-Length of 100, thus your HEAD request shouldn't be returning 100 either.
I would have thought you would return whatever content you had available (hence the 70 bytes), but if you want to support parallel transfers, then a 416 error response indicating the available ranges might make more sense.
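For comparison, a tiny sketch in Go of returning whatever is available: Content-Length reflects the bytes actually stored, so HEAD and GET stay consistent. The URL and the 70-byte figure are just placeholders:

    package main

    import (
        "net/http"
        "strconv"
    )

    func main() {
        received := make([]byte, 70) // 70 of an expected 100 bytes stored so far

        http.HandleFunc("/files/example", func(w http.ResponseWriter, r *http.Request) {
            // Report only what is actually stored, so HEAD and GET agree on the size.
            w.Header().Set("Content-Length", strconv.Itoa(len(received)))
            if r.Method != http.MethodHead {
                w.Write(received)
            }
        })
        http.ListenAndServe(":8080", nil)
    }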
Your library looks great - thanks for releasing it!
S3 is an incredible offering, but since I'm working on tus.io, I'll focus on what's wrong with it : )
- Multipart chunks need to be at least 5 MB. An interrupted part cannot be resumed. This kills the mobile use case.
- Throughput to S3 is bad from outside of EC2; uploads often start at very slow speeds and in many cases won't reach the capacity of the connection.
- S3 does not let you stream/access an upload in progress easily, so you can't start to transcode a video while it's still uploading.
- The S3 API is the opposite of RESTful.
- S3 is a proprietary service, their protocols are not intended/documented for adoption, and IMO they don't deserve great people like you making free contributions to their ecosystem.
edit: I'm not trying to say S3 isn't a good choice for many people. But our goal is to bring resumable file uploads to every iOS, jQuery, Wordpress, Drupal, Rails, etc. application in the world - S3 is not the right starting point for that.
I love S3, and we use it all over the place at transloadit.
I realized my comment sounded overly negative, so I added a clarification: our goal is to bring resumable file uploads to the entire planet, and S3 or any other proprietary protocol should not be the basis for that.
> I'm also wondering how things like proxies deal with this. A lot of mobile networks have nasty transparent caching proxies in their network.
That's a good question. An HTTP proxy could always cause issues, but most proxies should leave POST/PUT/HEAD requests untouched. That being said, we won't freeze the protocol until we've had a chance to try it against a variety of mobile networks, which is why we're already starting to implement an initial iOS client over here: https://github.com/tus/tus-ios-client (not ready yet, but keep an eye on it).
> Also when uploading a file through Nginx (when the upload works correctly) it won't send anything to the backend until it has the complete data, is this the same if the connection cuts halfway through?
Meanwhile, clients are free to choose a small chunk size for individual PUT requests (e.g. 1 MB), which still gives them resumability (in 1 MB intervals) without changing their architecture.
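A rough client-side sketch of that in Go. The upload URL, the file name, and the use of Content-Range on PUT are assumptions for illustration, not the final protocol:

    // Split the file into 1 MB pieces and PUT each one separately, so an
    // interrupted upload only loses the chunk that was in flight.
    package main

    import (
        "bytes"
        "fmt"
        "net/http"
        "os"
    )

    func main() {
        data, err := os.ReadFile("video.mp4") // hypothetical local file
        if err != nil {
            panic(err)
        }
        const chunkSize = 1 << 20 // 1 MB
        total := int64(len(data))

        for offset := int64(0); offset < total; offset += chunkSize {
            end := offset + chunkSize
            if end > total {
                end = total
            }
            req, err := http.NewRequest(http.MethodPut,
                "http://localhost:8080/files/example", // assumed upload URL
                bytes.NewReader(data[offset:end]))
            if err != nil {
                panic(err)
            }
            req.Header.Set("Content-Range", fmt.Sprintf("bytes %d-%d/%d", offset, end-1, total))
            resp, err := http.DefaultClient.Do(req)
            if err != nil {
                panic(err) // a real client would retry from the last confirmed offset
            }
            resp.Body.Close()
        }
    }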
Last but not least, we'll implement the tus protocol for our commercial uploading service, transloadit.com.
So I'm reasonably optimistic that NGINX won't be a major hurdle for the adoption of the protocol.
> TCP error correction should take care of this for most parts ...
Oh no it doesn't! We have an analytics service receiving HTTP POSTs from browsers all over the world as JSON. There is an astonishing amount of single-bit errors going on. Usually the initial 20 bytes are okay, but after that we see all sorts of patterns, including a bit flip every 8 bytes or so. Note that these will have been received at Google's App Engine servers with the correct checksum. I believe much of the corruption is caused by intermediary devices (e.g. NAT boxes or routers) that mangle the data and then recalculate the checksum, putting a good checksum on what is now corrupted data.
For that service we have to use HTTP (grumble grumble IE grumble). For our regular stuff we use HTTPS, where we do still see the problem, but it is considerably rarer. In that case the cause is most likely the client device having problems (e.g. RAM bit flips, cosmic rays, overclocked/overheated CPUs, etc.)
All else being equal, I'd recommend you add a layer of checksums as a helpful sanity check. Using SSL also does that for you, but it sees the data late.
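For example, a small server-side sketch of such a checksum layer in Go. The X-Content-Sha256 header name is made up for illustration; the assumption is that the client sends the hex SHA-256 of the body in it:

    // Recompute the body's SHA-256 and reject anything that doesn't match,
    // catching corruption that slipped past TCP's checksum.
    package main

    import (
        "crypto/sha256"
        "encoding/hex"
        "io"
        "net/http"
    )

    func main() {
        http.HandleFunc("/events", func(w http.ResponseWriter, r *http.Request) {
            body, err := io.ReadAll(r.Body)
            if err != nil {
                http.Error(w, "read error", http.StatusBadRequest)
                return
            }
            sum := sha256.Sum256(body)
            if hex.EncodeToString(sum[:]) != r.Header.Get("X-Content-Sha256") {
                http.Error(w, "checksum mismatch", http.StatusBadRequest)
                return
            }
            w.WriteHeader(http.StatusNoContent) // body verified, accept the event
        })
        http.ListenAndServe(":8080", nil)
    }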
I don't think it's an issue with "emotionalizing the subject", nor one of people getting offended, but rather one of professionalism.
That sort of language, in a context like that, reminds me of a Zed-style rant. It makes it harder for me to take it seriously, you know? The whole project ends up coming off as an amateur effort, even if that may not be the case.
I don't consider professionalism and the word "fuck" to be mutually exclusive, but at the end of the day we'll focus on what attracts people. Our current choice of words clearly fails at this goal, so we'll consider replacing it.