
> rather than using something generic like 'jq' is telling.

The best generic tool for managing (structured) data is SQL. Once the datasets are imported (via the custom readers / loaders), it's just plain SQL, and it works with SQLite, PostgreSQL, MySQL, etc.
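To make that concrete, here is a minimal sketch of the "import once, then plain SQL" idea using Python's built-in sqlite3 module. The CSV content, the `city` table, and the `load_cities` helper are made up for illustration; a real loader would be whatever custom reader fits the dataset.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for a real dataset file.
CSV_TEXT = (
    "name,country,population\n"
    "Vienna,AT,1900000\n"
    "Graz,AT,290000\n"
    "Bern,CH,134000\n"
)


def load_cities(conn: sqlite3.Connection, text: str) -> int:
    """Tiny stand-in for a custom loader: parse CSV text into a table."""
    rows = list(csv.DictReader(io.StringIO(text)))
    conn.execute(
        "CREATE TABLE city (name TEXT, country TEXT, population INTEGER)"
    )
    # Named placeholders are filled from each row dict.
    conn.executemany(
        "INSERT INTO city VALUES (:name, :country, :population)", rows
    )
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
loaded = load_cities(conn, CSV_TEXT)

# From here on it is ordinary SQL, nothing dataset-specific.
totals = conn.execute(
    "SELECT country, SUM(population) FROM city "
    "GROUP BY country ORDER BY country"
).fetchall()
print(totals)
```

The query itself uses nothing SQLite-specific, so the same statement should run unchanged on PostgreSQL or MySQL once the table exists there.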






For large data repositories, especially public/open datasets, a major concern is versioning. While it is not impossible to render a nice diff between two SQLite files, it's not as ingrained in our everyday tooling (e.g. GitHub) as plain-text diffs.

For small to medium-sized datasets, a nice middle ground would be SQL dumps. Put the dumps in Git for versioning and diffing, and load them into $DATABASE for actual queries.
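A rough sketch of that round trip, assuming SQLite as $DATABASE and Python's built-in sqlite3 module. The `city` table and the `dump_sql` / `load_sql` helpers are hypothetical names for illustration; the text returned by `dump_sql` is what you would commit to Git.

```python
import sqlite3


def dump_sql(conn: sqlite3.Connection) -> str:
    """Serialize the database as a plain-text SQL script (one statement per line)."""
    return "\n".join(conn.iterdump()) + "\n"


def load_sql(script: str) -> sqlite3.Connection:
    """Rebuild a queryable in-memory database from a SQL dump."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(script)
    return conn


# Hypothetical source database.
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE city (name TEXT, population INTEGER)")
src.executemany(
    "INSERT INTO city VALUES (?, ?)",
    [("Vienna", 1900000), ("Graz", 290000)],
)
src.commit()

# This text is what goes into version control; diffs are line-based.
dump_text = dump_sql(src)

# Later: load the dump back and run the actual queries.
restored = load_sql(dump_text)
rows = restored.execute(
    "SELECT name, population FROM city ORDER BY name"
).fetchall()
print(rows)
```

One caveat: the dump emits rows in storage order, so for stable diffs you would probably want to insert (or export) rows in a deterministic order.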



