
How Uploadcare Built a Stack That Handles 350M File API Requests per Day - awartani
https://stackshare.io/uploadcare/how-uploadcare-built-a-stack-that-handles-350m-file-api-requests-per-day
======
nametube
User-generated content (especially images) is a great attack vector. What do
you do to isolate or mitigate attacks like that?

~~~
BillinghamJ
How so? Obviously they won’t be executing (or likely even analysing) any of
the uploaded content.

Similarly, browsers should not generally be particularly vulnerable to
malicious content being loaded with appropriate MIME types in appropriate
containers (e.g. <img>)

It sounds like you should be asking how browsers protect users from malicious
content. Perhaps you could elaborate?

~~~
nametube
Image and video codecs come under attack quite often; see
[https://blog.sucuri.net/2016/05/imagemagick-remote-
command-e...](https://blog.sucuri.net/2016/05/imagemagick-remote-command-
execution-vulnerability.html)

In this context, the image manipulation they do with Pillow and the underlying
libjpeg would be a potential source of vulnerabilities.

~~~
acdha
Significantly, it's not just libjpeg but every format supported by Pillow
([http://pillow.readthedocs.io/en/3.4.x/handbook/image-file-
fo...](http://pillow.readthedocs.io/en/3.4.x/handbook/image-file-
formats.html)) — many of those vulnerabilities have historically been in
obscure formats where the implementation has had far less attention than the
mainline JPEG or PNG support.
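One common way to shrink that attack surface is to allowlist formats before any decoder runs. The following is a minimal sketch, not Uploadcare's actual approach: the `SIGNATURES` table and the `sniff_allowed_format` helper are hypothetical names, and the choice of JPEG and PNG as the only accepted formats is an assumption made for illustration.

```python
# Hypothetical allowlist: only formats whose decoders get the most scrutiny.
# Keys are format names, values are the leading "magic" bytes of each format.
SIGNATURES = {
    "JPEG": b"\xff\xd8\xff",
    "PNG": b"\x89PNG\r\n\x1a\n",
}


def sniff_allowed_format(data: bytes):
    """Return the allowlisted format name for `data`, or None to reject it."""
    for name, magic in SIGNATURES.items():
        if data.startswith(magic):
            return name
    return None


# Example: a GIF upload is rejected before it reaches any image library.
print(sniff_allowed_format(b"\xff\xd8\xff\xe0rest-of-file"))  # JPEG
print(sniff_allowed_format(b"GIF89a"))                        # None
```

A signature check only decides which decoder is allowed to run; it does not make that decoder safe. Recent Pillow releases also accept a `formats` list in `Image.open`, which restricts the loaders Pillow will try.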

------
coldcode
I find it interesting that these sorts of stacks have tons of moving parts.
Maybe it's the nature of highly scalable systems? Or does it come from
starting with one particular technology and then having to drag in lots of
other things to make it work?

~~~
dmitrymukhin
In the article we tried to convey the main idea behind that — take the best
tool for the job at hand. There's no "one size fits all" framework or product
to put your money on. It's much easier to handle this zoo than to make
something do what it's not supposed to.

Furthermore, to get high scalability, you have to make things as loosely
coupled as possible. This means you have to make some choices.

Hope that makes sense and answers the question :)

------
lukb
That's a great read. I've always been interested in learning how such tech-
oriented companies found their initial traction. Are there any blog posts /
articles / podcasts about Uploadcare's early days and the search for the
product/market fit?

~~~
notrheadagain
I just got wind of it: we at Uploadcare will soon be releasing an article with
more info about the early days :) And, I believe, a podcast or two. Thanks for
this question, btw. Would you elaborate on what you would like to know? It'll
help us compile a great article, thanks :)

~~~
lukb
Great to hear that!

The reason why it's particularly interesting to me, i.e. to someone with a dev
background, is that the lean startup wisdom says you should be very specific
about the customer you're after, and Uploadcare seems like a solution targeted
at a broad spectrum of customer segments. Of course, I'm happy to be proved
wrong if there are one or two dominant customer segments that you address
Uploadcare to. Also, you might as well have started out with a very specific
customer persona and spread to other segments. Whatever it was, I'm curious to
know.

I guess many developers dream up products targeted at developers like
themselves. Selling to fellow developers is hard. It would be great to read a
success story for a change.

------
copyconstruct
I wonder what's the breakdown between unique files delivered as opposed to
files delivered from the CDN cache. Also, what's the breakdown for file
uploads, manipulation and delivery? The 350M API requests per day would make
more sense if we get this breakdown.

~~~
dmitrymukhin
Cached/uncached file delivery is close to the universal 80/20 ratio. Cached
operations are not included in that number.

Unfortunately, I can't say anything more than that.

~~~
copyconstruct
Curious: does that mean you serve close to 1.75 billion requests per day, out
of which 350M are unique requests that exercise your stack instead of being
served from a CDN? It'd be interesting to know more about the number of
transformations you do at peak, if you can talk about it.
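The 1.75 billion figure follows from reading the 80/20 ratio as "80% of all requests are cache hits" and treating the 350M as the remaining 20%. A small sketch of that arithmetic, under exactly those assumptions (the function name `total_requests` is made up for illustration):

```python
def total_requests(uncached_per_day: int, cached_percent: int) -> int:
    """Total daily requests implied by the uncached count and the cached share."""
    uncached_percent = 100 - cached_percent
    # Integer arithmetic keeps the result exact for whole-number percentages.
    return uncached_per_day * 100 // uncached_percent


UNCACHED = 350_000_000  # requests per day that reach the stack (from the title)
total = total_requests(UNCACHED, cached_percent=80)
cached = total - UNCACHED

print(total)   # 1750000000
print(cached)  # 1400000000
```

If the real cached share is only "close to" 80%, the total moves quickly: at 75% it would be 1.4 billion, at 85% about 2.3 billion.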

------
dmitrygr
350M/day = just about 4K QPS. Is that considered impressive nowadays?
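For reference, the conversion is a plain division by the number of seconds in a day. The sketch below gives only the daily average; it says nothing about peak load.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_qps(requests_per_day: int) -> float:
    """Average queries per second for a given daily request count."""
    return requests_per_day / SECONDS_PER_DAY


print(round(average_qps(350_000_000)))  # 4051
```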

~~~
tyingq
Assuming most transactions are a largish file transfer, it seems impressive.
And I assume the transactions aren't evenly spaced, so the peak is likely much
higher.

4k qps of DNS, for example, would be less interesting.

~~~
RX14
In the case of large responses, bandwidth out is a much more interesting
metric. I'm sure their number sounds more impressive though.

------
hlieberman
It looks like your certificate expired.

Since it's just a DV certificate from Comodo, have you considered switching to
Let's Encrypt? Its automated systems could have helped you automatically
update.
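Short of switching issuers, a scheduled expiry check can catch this kind of lapse ahead of time. Below is a minimal sketch using only the Python standard library; the helper names `days_until_expiry` and `fetch_not_after`, and the 14-day threshold in the usage comment, are assumptions for illustration.

```python
import socket
import ssl
import time


def days_until_expiry(not_after: str, now: float) -> float:
    """Days left before a certificate's notAfter time (negative if expired)."""
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400


def fetch_not_after(host: str, port: int = 443) -> str:
    """Return the notAfter field of the certificate presented by host:port.

    The default context verifies the certificate, so an already expired
    certificate makes the handshake raise instead of returning a date.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


# Usage (needs network access; the host name is a placeholder):
#   remaining = days_until_expiry(fetch_not_after("example.com"), time.time())
#   if remaining < 14:
#       print(f"certificate expires in {remaining:.1f} days")
```

Run from cron or a monitoring job, a check like this warns before the certificate lapses, whichever CA issued it.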

~~~
RKearney
It's a wildcard certificate, which isn't supported by Let's Encrypt yet.

~~~
powvans
Site appears to be hosted in AWS. Lots of free SSL goodness to be had from
Amazon as well.

~~~
King-Aaron
It's stretching into off-topic land here, but could you suggest any AWS SSL
resources that are worth investigating?

~~~
manigandham
AWS has Certificate Manager, which provisions free certificates and manages
renewals automatically. Usable across ELB, CloudFront, etc.

[https://aws.amazon.com/certificate-
manager/](https://aws.amazon.com/certificate-manager/)

------
sigi45
By using AWS.

~~~
dmitrymukhin
It's not that hard to get high numbers with AWS, indeed :p

The hard thing is to make it cost effective. To that end, I can proudly say
that the AWS bill is not near the top of the list of Uploadcare expenses.

------
drchaim
All I can say is they don't seem to donate to the Django project.

~~~
orf
If that's all you can say, don't say it at all.

~~~
dmitrymukhin
Oh the irony.

