
How much did storing the data on S3 cost, given that you said "However, the data is on S3"? Or was it there so briefly that it didn't cost much? What about bandwidth costs in and out of S3?

Edit: Actually, I read the S3 parts again. It sounds like the CommonCrawl project pays the S3 costs, I think, since it looks like you're using their domain data?

The results of the Common Crawl project are hosted on AWS Public Data Sets, so the data isn't in my account. https://aws.amazon.com/datasets/

I see. Without CommonCrawl paying for S3 (or maybe AWS eats that cost to help the public), this would be an expensive project.

Actually, on the page linked in your parent post, it says:

> AWS is hosting the public data sets at no charge for the community
