
The results of the Common Crawl project are hosted on AWS Public Data Sets, so they're not in my account. https://aws.amazon.com/datasets/

I see. Without Common Crawl paying for S3 (or maybe AWS eats that cost to help the public), this would be an expensive project.

Actually, the page linked in your parent post says:

> AWS is hosting the public data sets at no charge for the community
