
Common problems with large file uploads - ananddass
http://blog.filepicker.io/post/27503593299/common-problems-with-large-file-uploads
======
zimbatm
Given the title I was expecting the article to provide a solution.

From personal experience, the bigger the file, the more likely you are to
experience a connection cut in the middle of the upload. That is why the most
important thing is to support resumable uploads.

At the moment there is no clear consensus on how to handle that. Amazon S3 has
one protocol[1], Google uses two revisions of a different protocol, one on
YouTube[2], another on Google Cloud Storage[3]. Both work by first creating a
session that you refer to when uploading the chunks. There is also the Nginx
upload module[4] that delegates the session ID to the client for some reason.

And there is no browser client available to my knowledge.

That's all I know, folks.

[1]:
[http://docs.amazonwebservices.com/AmazonS3/latest/API/mpUplo...](http://docs.amazonwebservices.com/AmazonS3/latest/API/mpUploadInitiate.html)
[2]:
[https://developers.google.com/youtube/2.0/developers_guide_p...](https://developers.google.com/youtube/2.0/developers_guide_protocol_resumable_uploads)
[3]: <https://developers.google.com/storage/docs/developer-guide>

[4]: <http://www.grid.net.ru/nginx/resumable_uploads.en.html>
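The session-based protocols above all share the same shape: create a session, send chunks with Content-Range headers, and on a dropped connection ask the server how far it got and continue from there. A minimal Python sketch of the chunk bookkeeping (the function names and the resume-at-chunk-boundary policy are my own assumptions, not any one vendor's API; the actual HTTP calls are left out):

```python
def chunk_ranges(total_size, chunk_size):
    """Yield (first_byte, last_byte) pairs covering the file,
    suitable for Content-Range headers."""
    for start in range(0, total_size, chunk_size):
        yield start, min(start + chunk_size, total_size) - 1

def content_range(first, last, total):
    """Format the header value for one chunk, e.g. 'bytes 0-3/10'."""
    return f"bytes {first}-{last}/{total}"

# A resumable client would: 1) POST to create an upload session and get a
# session URL back, 2) PUT each chunk with its Content-Range header against
# that URL, 3) on a dropped connection, query the server for how many bytes
# it has and resend from the first incomplete chunk.
def resume_from(acknowledged, total_size, chunk_size):
    """Ranges still to send after the server confirms `acknowledged` bytes
    were received (resumes at chunk boundaries, so the chunk containing the
    cut-off point is resent whole)."""
    return [(s, e) for s, e in chunk_ranges(total_size, chunk_size)
            if e >= acknowledged]
```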

~~~
otoburb
I miss the ZMODEM protocol. Resumable file transfers over 56kbps was the bomb.
Made me feel whole again (pun partially intended) back in '89.

~~~
wamatt
Goddamn I had forgotten all about that haha. Rush of nostalgia from BBS scene.

XMODEM was painful. ZMODEM was leet.

------
ars
For the HTTP/2.0 discussion that was here earlier:

A way to continue an interrupted file upload.

Because POST variables are sent in order, if you put the file first and the
other variables after, the server never sees them if the upload was
interrupted. So when I code a form I always put the hidden fields first, so at
least I can give a useful error message (since I know what the user was trying
to do).

It would be better to decouple them and upload the files and the rest of the
variables separately.
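The ordering trick is easy to see in the wire format. A stdlib-only sketch that encodes the ordinary fields before the file part, so a server parsing the stream incrementally still sees them even if the file part is cut off (field names here are made up for illustration):

```python
import io
import uuid

def build_multipart(fields, file_field, filename, file_bytes):
    """Encode a multipart/form-data body with the plain fields FIRST,
    then the file part last."""
    boundary = uuid.uuid4().hex
    buf = io.BytesIO()
    for name, value in fields:  # metadata first
        part = (f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n")
        buf.write(part.encode())
    head = (f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{file_field}"; '
            f'filename="{filename}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n")
    buf.write(head.encode())
    buf.write(file_bytes)  # the part most likely to be truncated goes last
    buf.write(f"\r\n--{boundary}--\r\n".encode())
    return buf.getvalue(), f"multipart/form-data; boundary={boundary}"
```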

------
tagx
I'd really like to be able to use Dropbox as a magic upload handler for any
file I upload on my local HD, not just those in my Dropbox folder. They handle
the logic of getting all my files into the cloud. Why can't I point a website
to my Dropbox and say here, this is handling the file upload?

~~~
girasquid
Dropbox has an API that will (theoretically) let you do this, but there
haven't been a ton of people jumping up and implementing it yet. It'll be cool
when it shows up.

~~~
banana_bread
I've used the dropbox API before to automatically upload photos in a dropbox
folder to Flickr. It occurs on an interval (cron job every 2 mins). I'm sure
you could do the same thing using FTP or a custom API that exists on your
destination server.

~~~
jmathai
They released a delta API recently. It's a bit of a pain to perform this task
as you can't differentiate between a new file and a renamed file or a file
that's been moved from one location to another.

In all cases you get a delete (if the file isn't new) and then a new file
event.

I implemented it, and it kinda sucks for this sort of thing. The purpose seems
to maintain local state to mirror the state on Dropbox. Not terribly
interested in that...I just want to subscribe to specific events (webhooks,
anyone?).

This _was_ their solution to the frequently requested webhooks. It falls
short. Way short.
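To illustrate the limitation: a consumer of a delta feed only sees (path, metadata-or-None) pairs, so all it can reconstruct is adds and deletes; a rename or move is indistinguishable from an unrelated delete plus add. A rough sketch (the entry shape here is a simplified assumption, not the exact API response):

```python
def classify_delta(entries):
    """entries: (path, metadata_or_None) pairs, as in a delta-style
    response where metadata is None for a deletion. A moved or renamed
    file shows up only as a delete of the old path plus an add of the
    new one, so rename detection is impossible at this level."""
    added = [path for path, meta in entries if meta is not None]
    deleted = [path for path, meta in entries if meta is None]
    return added, deleted
```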

------
ChrisNorstrom
8gb+ files? I found a way, but you have to use a Java FTP applet. I tested
these two here: <http://jupload.sourceforge.net/> and
<http://www.jfileupload.com/>

Dragged and dropped an 8gb+ file and left it on for 5 hours. Worked perfectly.
No time outs, no errors, and I'm on a shared hosting account at 1and1.

My problem with them is that it wasn't possible to hide the FTP username and
password, they were always in javascript files. I whined, I complained, I
bitched, and there was nothing they could do about it. :( So you basically had
to password protect the whole directory with .htaccess and be very careful
with whom you shared the credentials.

If you don't want people to download and install software, just stick with
Java FTP applets.

~~~
mbreese
You could always just hard-code the username/password into the applet and
recompile. That shouldn't be too hard...

Or, if you control the FTP server, you could dynamically add and remove random
virtual users/passwords to the FTP server (hopefully virtual users). Then when
the client javascript gets the username/password, it could only be used once.
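The one-time-credential idea can be sketched without any particular FTP server in mind. Here the store is just an in-memory dict (my assumption for illustration); a real setup would write to the FTP server's virtual-user backend and the entry would also carry an expiry:

```python
import secrets

class OneTimeCredentials:
    """Issue random credentials that become invalid after first use."""
    def __init__(self):
        self._pending = {}

    def issue(self):
        """Generate a random virtual user/password and register it."""
        user = "u_" + secrets.token_hex(4)
        password = secrets.token_urlsafe(12)
        self._pending[user] = password
        return user, password

    def authenticate(self, user, password):
        """Valid exactly once: the entry is removed on the first attempt,
        so even a leaked credential can't be replayed."""
        return self._pending.pop(user, None) == password
```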

~~~
cstavish
One could scan the .class file for string literals with relative ease.
Obfuscation would be an improvement, but still not completely secure.

~~~
gue5t
It would hardly be an improvement. Wireshark would be a first step for most
reverse engineers when there's network authentication involved.

------
rajbot
I've been dealing with browser-based large file uploads, which means dealing
with lots of browser-specific issues.

Fortunately, things are getting better, especially for the webkit-based
browsers. Firefox still has some issues, and I check
<https://bugzilla.mozilla.org/show_bug.cgi?id=678648> pretty regularly. Just
today this bug, which was filed in 2003, changed from Status = NEW to Status =
ASSIGNED.

Today is a good day.

~~~
liyanchang
First, I'm impressed that someone was uploading 2gb files back in 2003...

Agreed. Good to see that firefox is going to be able to do more than 2gb soon.

~~~
rajbot
That title is a bit misleading.. On some platforms you can already use Firefox
to do >2GB uploads, but there is still a 4GB limit..

If anyone wants to help beta-test a HTML5 uploader that calls archive.org's
S3-like endpoint under the hood (no IE or Opera support yet, though Opera 12
is now working..): <http://archive.org/upload/>

------
t4nkd
I've experienced this issue before when establishing a publisher backend for a
D2D PC game business. It seems to be basically impossible without a Java
applet of some kind, and even then it's wonky at best and just 'fails' at
worst. The real fix for the issue seemed to be simply providing an FTP
connection and letting people connect through the native client of their
choosing.

That really seems to be the key for this problem: develop a simple native app
capable of FTP uploads that makes it easy for users to deliver files to your
app within the context of their use. Most browsers are capable of opening
native applications via a unique protocol handler, so you could easily enrich
the process by having the native app be a part of (or try to blend seamlessly
with) major browsers.

------
jasomill
As plenty of file transfer protocols, clients, and servers support resumable
transfers (FTP, SFTP, rsync, proprietary browser-based tools, etc., or even
basic HTTP if you arrange for the file to be pulled rather than pushed and
your "client's server" has byte-range support), perhaps this should be titled
"why you shouldn't use a single HTTP POST request from a browser to upload a
large file". The general reason seems to be "because this is not a use case
this feature is commonly designed for and tested against."

------
abemassry
I ran into this problem with <https://truefriender.com/>. The solution I used
was nginx instead of Apache: nginx streams the file to disk, and then I can
handle it with PHP. I still have the 2GB problem, but I've tested out Perl and
I can go past it; now I just have to implement it.

~~~
liyanchang
Being on Heroku, I've been bitten many times by the 30-second timeout. No
luxury of changing it, let alone moving to nginx.

------
kookster
It may not work for ginormous files, but I've used a flash swf object to
upload to s3, released as part of a Rails gem. The latest version is here:
<https://github.com/nathancolgate/s3-swf-upload-plugin>

------
severin
Hi everyone. We developed a solution just for that! Please feel free to look
at <http://forgetbox.com> and give us feedback.

Our users send 130GB files, directly from Gmail...

------
zampano
Excuse me if this is a stupid question, but why would timeout issues on large
files affect something like Heroku more often than other types of hosting
services?

~~~
re
Heroku enforces the timeout.

<https://devcenter.heroku.com/articles/request-timeout>

<https://devcenter.heroku.com/articles/s3#file_uploads>

[http://stackoverflow.com/questions/7854239/heroku-timeout-
wh...](http://stackoverflow.com/questions/7854239/heroku-timeout-when-
uploading-big-file-to-s3)
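The usual workaround on Heroku is to have the browser upload straight to S3 with a presigned URL, so the dyno never touches the bytes and the 30-second limit doesn't apply. A sketch of the legacy signature-version-2 presigning scheme for a PUT; the bucket, key, and credential values are placeholders:

```python
import base64
import hashlib
import hmac
from urllib.parse import quote

def presign_put(access_key, secret_key, bucket, key, expires):
    """Build a signature-v2 presigned PUT URL (the legacy AWS signing
    scheme). `expires` is an absolute unix timestamp after which the
    URL stops working."""
    # The canonical string for a bare PUT: method, empty Content-MD5 and
    # Content-Type lines, expiry, and the resource path.
    string_to_sign = f"PUT\n\n\n{expires}\n/{bucket}/{key}"
    digest = hmac.new(secret_key.encode(), string_to_sign.encode(),
                      hashlib.sha1).digest()
    signature = quote(base64.b64encode(digest).decode(), safe="")
    return (f"https://{bucket}.s3.amazonaws.com/{key}"
            f"?AWSAccessKeyId={access_key}&Expires={expires}"
            f"&Signature={signature}")
```

The server only computes and hands out this URL; the client PUTs the file body to it directly, so no request ever transits the app dyno.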

------
graup
I use node.js with this plugin: <https://github.com/felixge/node-formidable/>

Works like a charm!

~~~
prezjordan
I immediately thought of node with this post. Makes uploading and streaming a
breeze!

------
frytaz
Split them into rar/zip parts with checksums on the client side, then upload...
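A client-side sketch of that idea: split into fixed-size parts with a per-part checksum, so the server can verify each piece on arrival and ask only for corrupt or missing parts again (MD5 here purely as an integrity check, not for security; the function name is my own):

```python
import hashlib

def split_with_checksums(data, part_size):
    """Split a blob into fixed-size parts, each tagged with its index and
    an MD5 hex digest for server-side verification."""
    parts = []
    for offset in range(0, len(data), part_size):
        chunk = data[offset:offset + part_size]
        parts.append((offset // part_size,
                      hashlib.md5(chunk).hexdigest(),
                      chunk))
    return parts
```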

