
A Jupyter Kernel for SQLite - Tomte
https://blog.jupyter.org/a-jupyter-kernel-for-sqlite-9549c5dcf551
======
llimllib
If you're interested in this, you might be interested in Datasette:
[https://datasette.readthedocs.io/en/stable/](https://datasette.readthedocs.io/en/stable/)

Which seems to me to be farther along in providing advanced
querying/faceting/visualization/sharing capabilities on top of sqlite.

(I love jupyter, and this kernel seems neat; not trying to throw stones at
anybody, just to link a project in a similar domain)

~~~
seemslegit
Having to rerun the query and reload the page just to change the sort order on
a small dataset is very 1997

~~~
simonw
You gotta re-run the query or you're limited to sorting just the visible
results.

I guess I could let it spot when there are less than the page-size of results,
but then I'd need to be confident that the JavaScript sorting algorithm
exactly matches the underlying SQLite sorting algorithm - taking into account
character sets and custom collations and suchlike.

I've been deliberately avoiding adding any JavaScript to core Datasette almost
to make a point: you don't need it. HTML pages load faster than so-called
"modern" SPA monstrosities. They work really well. They don't require a bunch
of extra steps to avoid breaking the back button etc.

But... Datasette provides a JSON API for everything, and supports plugins.
There's nothing to stop someone who really wants no-page-reload query
execution from implementing it as a plugin.

I myself have built JavaScript plugins for things like bar charts and line
charts ( [https://github.com/simonw/datasette-
vega](https://github.com/simonw/datasette-vega) ) and map visualizations (
[https://github.com/simonw/datasette-cluster-
map](https://github.com/simonw/datasette-cluster-map) )

So yeah, my personal bias here is that building websites like we did in 1997
is deeply undervalued.

~~~
seemslegit
The UI needs to be such so as to make the users understand that by clicking a
column header they would be sorting the results of a previous query rather
than re-running it with a different sort column and that the two are not
generally interchangeable, likewise with instant in-results search.

At least with the server being on the internet and the example setup I
disagree that the current way is faster or even near a modern webui

~~~
simonw
Can you think of a way I could make that clear in the UI ("this will sort all
of the data including the rows you can't see" v.s. "this will sort the rows
that are visible to you right now")?

~~~
seemslegit
I think this is the natural expectation of anyone familiar with spreadsheets,
reporting tools or data grids with sortable column headers, but a message
along the lines of

"Sorting results of the previous query, click 'Run SQL' button again to query
with ${column_name} as the ORDER BY column"

and a "got it" hyperlink to be stored in your session state store of choice
upon first click on a column header would probably make it clear.

~~~
Swannie
I tend to disagree.

I find that data/UI tools that only sort the visible results, and not re-
populate the paged view of the data, extremely counter-intuitive.

Even old libraries like ExtJS supported easy re-querying the server upon the
column sorting event - though it has to be said, many lazy implementers didn't
use this feature.

------
amasad
This is very cool. SQL is a great interactive language, we added it to repl.it
too: [https://repl.it/languages/sqlite](https://repl.it/languages/sqlite)

~~~
3l3ktr4
Oh my god! That's so cool! Thanks for sharing it!

------
seemslegit
Cool hack, but probably wouldn't install a separate kernel or run an entire
notebook just for sqlite work.

What would be useful is for the kernel of your language of choice to provide a
magic for sqlite and return results of queries in language-native data
structures.

Something like this: [https://pypi.org/project/ipython-
sql/](https://pypi.org/project/ipython-sql/)

~~~
seemslegit
Alternatively: a new type of cell in jupyter itself alongside 'code' and
'markdown' for sql work available regardless of kernel choice.

~~~
3l3ktr4
Yeah, I had this idea at first, but it's not trivial to run more than one
kernel at the same time. I know some examples of people who did it, and some
projects that I could plug in and try to make it work with my kernel, but this
was a first version to see how the community responds! Let's see, might do
something like it in the future. I think it'd be really cool and powerful to
make it interact with Python for example.

~~~
cheez
Yes this is a great first step. Would absolutely love a way to use this
alongside my Python code.

------
crazygringo
Wow. It never even occurred to me that this was missing from Jupyter -- but in
hindsight seems _so_ obvious.

Congrats to the Jupyter team on this!

~~~
3l3ktr4
Thanks! <3

------
reallymental
Incredible, but wasn't this kind of available though psycopg2 (postgres
connector), SQLAlchemy or any other kind of database connection library ?

I realise the difference between the kernels altogether, but how is one better
than the other?

~~~
dataminded
Not really. You could use the python kernel and embed your SQL code within
your python code but you couldn't run cells with just SQL. It made for a very
poor analyst experience.

Microsoft got it right with Azure Data Studio.

~~~
yellowapple
Does ADS support SQLite? I already use it for SQL Server, and I'd tried the
PostgreSQL add-on but couldn't get it to work for some reason (but that was
when it was first released so it might've stabilized since then).

~~~
dataminded
No. I've seen success with Microsoft SQL and PostgreSQL.

I've also experienced instability issues, it needs that VS Code level polish.

------
aghillo
I like this. Previously I’ve used notebooks to explain a data pipeline from
different perspectives. One notebook showing the ETL process going from raw
data to RDF triples in a store; one SPARQL notebook showing the raw queries;
and then a final decision support notebook using a Python binding to the
underlying SPARQL query library. It seemed to work well.

------
CrazyCatDog
This is awesome! Makes sql as accessible in the classroom as python—-we waste
so much time installing SQLite and trouble shooting for students (b-schools).
As soon as this is accessible in colaboratory (google), or similar, super
light web interface and notebooks become trivial... THANK YOU!!

------
justinclift
Interesting. Wonder if it'd be possible to embed it for visualisation in a Go
web app?

Was adding basic online chart capabilities to our SQLite publishing website a
few weeks ago (eg:

[https://dbhub.io/vis/justinclift/Marine%20Litter%20Survey%20...](https://dbhub.io/vis/justinclift/Marine%20Litter%20Survey%20%28Keep%20Northern%20Ireland%20Beautiful%29.sqlite)
)

But if people could do Jupyter notebooks and visualise them like this too,
that could be useful.

Hmmm, should probably set up some kind of survey on our website to ask... :)

~~~
3l3ktr4
Hit me up on
[https://gitter.im/QuantStack/Lobby](https://gitter.im/QuantStack/Lobby) if
you think I can be useful. I'm @marimeireles there and on Github too! :)

~~~
justinclift
Thanks, will do. :)

------
gepoch
Excellent! I usually just type up my SQL in triple quotes and execute it from
python, which gives me a little bit of templating too. Excited to give this a
try (and start using more views.)

------
Koshkin
This looks nice. Now if only somebody wrote a Jupyter kernel for (power)shell.

~~~
mweatherill
Have you seen Azure Data Studio? It uses the same code base as Visual Studio
Code and has first class support for notebooks. It includes a PowerShell
kernel

~~~
jonsequitur
On the .NET team, we're working on a polyglot Jupyter kernel that includes C#,
F#, and PowerShell support. We're collaborating with the Azure Data and
PowerShell teams. It's in preview now.
[https://github.com/dotnet/interactive](https://github.com/dotnet/interactive)

------
Keyframe
How's this different from let's say BeakerX or Zeppelin?

~~~
3l3ktr4
Zeppelin is not Jupyter. And I'm not sure how BeakerX works tbh, but it uses a
whole docker image all the time to run right? So, might be very resource
consuming, this kernel was implemented on C++. I honestly have no idea of how
good is the tooling around SQL in BeakerX but might be a better alternative if
you don't care about the docker image and doesn't actually need SQLite, that's
different from SQL, which is what they're offering.

