
What kind of versioning are you referring to? I know some databases have it, but I'm not sure I understand it.

Personally I use S3 or similar for this. At the number of files I'm working with, storing them in a relational database would be crazy expensive.

Schema could be something like:

FileId, VersionId, Name, Data, UploadedTimestamp, UploadedBy, ...

If you’re on AWS, pushing and pulling from an S3 bucket is probably a great solution, and then of course there’s nothing to worry about in terms of backups.

Do you still need to keep an index of the files / metadata in a DB, or can you tag everything you need directly on the S3 objects and just pull the whole bucket?

Even if you’re not in AWS, S3 works great, especially where you need to serve content directly to clients’ browsers. S3 supports either making a file public and available to all, or using pre-signed URLs to grant specific clients short-term authorisation to access a file. No need to pull the bucket at all.

You will hit limits to how fast you can generate those crypto signatures for s3, and a limit to how much you can serve; at a certain scale you will want to use a CDN.

> You will hit limits to how fast you can generate those crypto signatures for s3

Presigned URLs are HMAC-SHA256:


I just measured GetObjectRequest.Presign() at 1063 ns. That's a million signatures per second on my 2015 MacBook Pro using the stock AWS SDK for Go _on one core_. This is plenty fast already, and it's guaranteed to be running in parallel on better hardware in production. There's no way signature generation is a bottleneck in any real application.

> and a limit to how much can serve

I've gotten hundreds of gigabits per second between S3 and EC2 without doing anything special. What problems are you referring to?

> at a certain scale you will want to use a cdn

So use a CDN too. CloudFront fits there neatly, and it can even sign its own requests to S3:


None of those criticisms are valid reasons to avoid S3. S3 is an excellent solution for many common problems.

Having said that, S3 has genuine drawbacks for certain workloads. (For example, the long tail on time to first byte is sometimes a dealbreaker.) In my opinion the biggest limitation is budgetary: typically one will spend all their money on AWS data transfer fees long before hitting S3 service limits.

Tons of tiny files is the problem case, where the ratio of bytes served to signatures generated is low.

And I didn't say to avoid S3. I work for a subsidiary of AWS; I want you to use it a ton.

Let's say you're serving up a bunch of thumbnails, maybe 1024 per user per load.

Or, in my scenario, 25k files. At 25k signs per user, a core doing nothing but signing handles about 40 users per second.
