
We stopped storing our IoT raw data in databases. We still need to search it, but now we store only metadata in the database (we know what we will search by, so we can make appropriate metadata) and store the raw data in S3. So all searches happen in the DB, using the DB for what it is good at. This means that our storage/database cost approaches just the S3 cost, because our metadata is roughly 0.001 times the size of the raw data. It took our product from "we are going to shut this thing down" to making money.

Got the idea from an AWS talk here https://youtu.be/7Px5g6wLW2A. Blew my mind. I coded up all the changes in a day. Took way longer to move the data than to code it.
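
For anyone who wants to see the shape of this, here is a minimal sketch of the split described above, not our actual code. Everything in it is an assumption made for illustration: the table name `readings_meta`, the columns, the key format, and the `fake_bucket` dict that stands in for S3 so the example runs without an AWS account.

```python
import sqlite3

# Stand-in for an S3 bucket: maps object key -> raw bytes (assumption for this sketch).
fake_bucket = {}

# Small metadata index; only the fields we expect to search by (hypothetical schema).
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE readings_meta ("
    " device_id TEXT, start_ts INTEGER, end_ts INTEGER,"
    " object_key TEXT, size_bytes INTEGER)"
)

def store_batch(device_id, start_ts, end_ts, raw):
    """Write the raw bytes to object storage and index only the metadata."""
    key = f"raw/{device_id}/{start_ts}.bin"
    fake_bucket[key] = raw
    db.execute(
        "INSERT INTO readings_meta VALUES (?, ?, ?, ?, ?)",
        (device_id, start_ts, end_ts, key, len(raw)),
    )
    db.commit()
    return key

def find_keys(device_id, from_ts, to_ts):
    """Search runs in the database; object storage is never listed or scanned."""
    rows = db.execute(
        "SELECT object_key FROM readings_meta"
        " WHERE device_id = ? AND start_ts <= ? AND end_ts >= ?"
        " ORDER BY start_ts",
        (device_id, to_ts, from_ts),
    )
    return [row[0] for row in rows]

store_batch("sensor-1", 0, 59, b"\x00" * 1000)
store_batch("sensor-1", 60, 119, b"\x01" * 1000)
store_batch("sensor-2", 0, 59, b"\x02" * 1000)

keys = find_keys("sensor-1", 30, 50)  # time-range search answered by the DB alone
raw = fake_bucket[keys[0]]            # one direct object read for the raw data
```

The database row is a few dozen bytes per batch while the raw object can be arbitrarily large, which is where the cost difference comes from.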

This is one of the ways to use AWS 'the right way'. Without serious optimization, using AWS as an IaaS provider is going to cost more.

AWS provides a ton of building-block primitives that you can build with at a better price point than doing it yourself. If you just try to do it yourself using their IaaS offerings 24/7 (EC2, VPC, etc.), then you're in for a bad time.

How slow is it to retrieve the data from s3 though?

It is slow, but most of the time we don't use the raw data; we use the metadata. It worked great for our app. We got excited and tried to implement some on-the-fly data summaries where we would have to touch a bunch of files every time we wrote anything. That just didn't work because of the speed.

Let's be specific on speed. Most of the time, the MB or two needed for a plot takes a fraction of a second to retrieve, which our customers can deal with. That retrieval is a single object from S3, the way we organize things.

Having said that, the talk I linked to has some great advice - use the "folder" structure to write data so you don't search, you just use the naming scheme to do a direct object read. In addition, we can keep most reads to a single object, which is fairly fast.
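
A rough sketch of what "use the naming scheme to do a direct object read" can look like. The key layout (`raw/<device>/<yyyy>/<mm>/<dd>.bin`), the function names, and the dict standing in for the bucket are all made up for illustration; they are not the layout from the talk or from our system.

```python
from datetime import date

# Stand-in for an S3 bucket so the sketch runs locally (assumption).
fake_bucket = {}

def object_key(device_id: str, day: date) -> str:
    """Build the key deterministically from what the caller already knows."""
    return f"raw/{device_id}/{day.strftime('%Y/%m/%d')}.bin"

def write_day(device_id: str, day: date, raw: bytes) -> None:
    """Store one object per device per day under the computed key."""
    fake_bucket[object_key(device_id, day)] = raw

def read_day(device_id: str, day: date) -> bytes:
    """Direct read by computed key: no listing, no prefix search.

    Against real S3 this would be a single GET for the key, for example
    boto3's get_object(Bucket=..., Key=...).
    """
    return fake_bucket[object_key(device_id, day)]

write_day("sensor-42", date(2018, 3, 7), b"plot-data")
data = read_day("sensor-42", date(2018, 3, 7))
```

Because the key is computed rather than discovered, a plot for one device and one day costs exactly one object read.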

As is always true, you will need to test to see if it fits your speed needs. But even with just the naming scheme and metadata in the database to eliminate all searches on S3, the speed works for us.

Yes, S3 is very slow. Maybe that is to persuade clients to move to their DB solution.
