
That's awesome. If you are not already doing so, you can download my set from the torrent and include it in your database.


For exporting, pg_dump -F c greatly compresses the data, so cost-wise you might be able to put it on S3 and publish it as a torrent.
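A rough sketch of that pipeline; the database name, bucket, and tracker URL are placeholders, and it assumes pg_dump, the AWS CLI, and mktorrent are installed:

```shell
# -F c selects pg_dump's custom archive format, which is compressed by
# default; -Z 9 requests maximum compression.
pg_dump -F c -Z 9 -f observatory.dump tls_observatory

# Upload the dump to S3 (bucket name is a placeholder).
aws s3 cp observatory.dump s3://my-public-bucket/observatory.dump

# Build a torrent locally to seed alongside the S3 copy
# (tracker URL is a placeholder).
mktorrent -a udp://tracker.example.org:1337 -o observatory.torrent observatory.dump
```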

Exporting is one possibility, but eventually I'd like to provide read-only SQL access to the database we host. We have a few ideas on how to do this [1], but it isn't implemented yet.

[1] https://github.com/mozilla/tls-observatory/issues/92

Perhaps something like a modified PostgREST could work?


The problem isn't so much exposing the data as a REST API as it is allowing complex queries that may contain table joins, subqueries, or recursive conditions. I only skimmed the PostgREST documentation, but it doesn't mention joining tables, which is a deal breaker for our use case.

Idea from someone just starting to learn about databases (very green :P):

- People request access and get an API key associated with a given load threshold; requests without an API key default to some low threshold

- Any query whose estimated cost, per SQL EXPLAIN, exceeds the threshold returns an error

- Successful requests' load costs and execution time (and possibly CPU, if that can be determined) count toward a usage rate limit

- An SQL parser implements the subset of SQL you deem safe and acceptable and forms a last-resort firewall
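The EXPLAIN-based gate in the second bullet could look something like this. Since running EXPLAIN needs a live Postgres connection, the sketch below just shows the decision logic against a sample of EXPLAIN (FORMAT JSON) output; the threshold values and table name are made up for illustration:

```python
# Sketch of the cost-threshold check: run EXPLAIN (FORMAT JSON) on the
# user's query, pull out the planner's estimated total cost, and reject
# the query if it exceeds the caller's allowance.
import json

DEFAULT_THRESHOLD = 10_000.0    # anonymous callers (made-up value)
KEYED_THRESHOLD = 1_000_000.0   # callers with an API key (made-up value)


def estimated_cost(explain_json: str) -> float:
    """Extract the planner's Total Cost from EXPLAIN (FORMAT JSON) output."""
    plan = json.loads(explain_json)
    return plan[0]["Plan"]["Total Cost"]


def allow_query(explain_json: str, has_api_key: bool) -> bool:
    threshold = KEYED_THRESHOLD if has_api_key else DEFAULT_THRESHOLD
    return estimated_cost(explain_json) <= threshold


# Abridged example of what EXPLAIN (FORMAT JSON) returns for a seq scan:
sample = json.dumps([{
    "Plan": {
        "Node Type": "Seq Scan",
        "Relation Name": "certificates",
        "Startup Cost": 0.0,
        "Total Cost": 52431.5,
        "Plan Rows": 1000000,
    }
}])

print(allow_query(sample, has_api_key=True))    # True: under keyed limit
print(allow_query(sample, has_api_key=False))   # False: over anonymous limit
```

One caveat: planner cost estimates are unitless and can be badly off for pathological queries, which is why the last bullet's SQL-subset parser would still be needed as a backstop.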

Obviously this is a complex solution; I'm curious whether people think it would end up simpler or harder overall in the long run.
