Datasette Facets: Faceted Browse and a JSON API for any CSV File or SQLite DB

simonw · on May 21, 2018

Here's a demo of it running against 190,000 trees in San Francisco, using my datasette-cluster-map plugin to include a map visualization: https://san-francisco.datasettes.com/sf-trees-ebc2ad9/Street...

More about that plugin: https://simonwillison.net/2018/Apr/20/datasette-plugins/

chrisweekly · on May 21, 2018

My favorite tool in this space is [lnav](https://lnav.org), which has an embedded SQLite engine and works with all kinds of structured data. It might obviate the need for Datasette, or maybe complement it in a scripted workflow.

donut · on May 21, 2018

Nice. Looking forward to trying this out.

Reminds me of http://openrefine.org/.

aorth · on May 21, 2018

Indeed! I was just thinking that this could be a substitute for viewing when I don't want to have to fire up OpenRefine, but for editing, GREL, reconciliation, etc., OpenRefine is still king! I've just started playing with reconciliation in OpenRefine via conciliator + Solr. Pretty cool.

https://github.com/codeforkjeff/conciliator

obelix_ · on May 21, 2018

I see something about uploading the CSV file. It's not clear where the SQLite DB is being created. Is it local? Does the browser create it? Ideally I would like to run this locally when I am playing with data.

What's been the largest file tested? Basically what's the max rows/cols it can handle fast?

simonw · on May 21, 2018

You can run it locally - that's the default way to use it. "pip3 install datasette", create your SQLite database file and run "datasette mydb.db" to start exploring.

I've run it successfully on SQLite files up to a GB in size and theoretically it should work with much bigger files than that.

The https://publish.datasettes.com/ tool works by taking your uploaded CSVs and running my "csvs-to-sqlite" script on them, then deploying the datasette application alongside that new SQLite database file.
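For readers who want to see roughly what "CSV in, SQLite file out" involves, here is a minimal sketch using only the Python standard library. It is not the csvs-to-sqlite implementation; the `csv_to_sqlite` helper, the `trees` table and the sample rows are hypothetical, and every column is stored as untyped text.

```python
import csv
import io
import sqlite3


def csv_to_sqlite(csv_text, conn, table):
    """Load CSV text into a new table and return the number of rows inserted.

    Hypothetical helper for illustration only: column names come from the
    CSV header row and no type detection is attempted.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)  # first row holds the column names
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    # The remaining rows of the reader are inserted one by one.
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)
    conn.commit()
    return conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]


# Made-up sample data, not taken from the San Francisco trees demo.
SAMPLE_CSV = "species,caretaker\nSycamore,Private\nMaple,DPW\nGinkgo,Private\n"

# Assumption: an in-memory database keeps the sketch self-contained.
conn = sqlite3.connect(":memory:")
row_count = csv_to_sqlite(SAMPLE_CSV, conn, "trees")
```

Swapping `":memory:"` for a filename such as `"mydb.db"` would write a database file on disk, which could then be opened with the `datasette mydb.db` command mentioned above.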

obelix_ · on May 21, 2018

Thanks! Looks pretty useful. Will be checking it out soon.

hprotagonist · on May 21, 2018

I’ve been keeping tabs (ha!) on this for some time.

Keep up the good work! It is incredibly useful for me to be able to rapidly poke around a new data set and get a lay of the land.

Perhaps equally importantly, being able to do this in a meeting with the semi-higher-ups, live, makes you look fiendishly smart.

brianzelip · on May 21, 2018

The final section of this recent Changelog podcast provides an interview with the Datasette author, https://changelog.com/podcast/296.

gigatexal · on May 21, 2018

This is awesome. Is there something like this for Postgres?

simonw · on May 21, 2018

Not that I've seen so far, but I did write a related tutorial on building faceted search on top of Django and PostgreSQL a few months ago: https://simonwillison.net/2017/Oct/5/django-postgresql-facet...
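As a rough illustration of the underlying idea: a facet is essentially a per-value count over one column. The sketch below uses Python's built-in `sqlite3` so it stays self-contained. It is not code from Datasette or from the linked tutorial, and the `facet_counts` helper, table and rows are hypothetical. The same `GROUP BY` query shape also works in PostgreSQL.

```python
import sqlite3


def facet_counts(conn, table, column, limit=10):
    """Return (value, count) pairs for one column, most common first.

    Hypothetical helper: table and column names are interpolated into the
    SQL, so they must come from trusted code, never from user input.
    """
    sql = (
        f'SELECT "{column}" AS value, COUNT(*) AS n '
        f'FROM "{table}" GROUP BY "{column}" '
        f"ORDER BY n DESC, value LIMIT ?"
    )
    return conn.execute(sql, (limit,)).fetchall()


# Made-up sample data for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trees (species TEXT, caretaker TEXT)")
conn.executemany(
    "INSERT INTO trees VALUES (?, ?)",
    [
        ("Sycamore", "Private"),
        ("Sycamore", "DPW"),
        ("Maple", "Private"),
        ("Ginkgo", "Private"),
    ],
)

species_facet = facet_counts(conn, "trees", "species")
caretaker_facet = facet_counts(conn, "trees", "caretaker")
```

Each facet is one query, so a page showing several facets would run one such query per column, usually with the current filters added as a `WHERE` clause.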

seektable · on May 21, 2018

You can try https://www.seektable.com, which has 'flat table' reports for data browsing and a PostgreSQL connector; however, this is a cloud tool, so your DB must be accessible from the tool's server. BTW, CSV files are also supported.

chrisweekly · on May 21, 2018

Nice publish workflow, leveraging Zeit `now`.