
Show HN: SpreadsheetDB – A database that you can query with spreadsheets - Clex
https://www.spreadsheetdb.io/
======
codezero
It's a little known fact (is it?) that you can use a SQL like query language
on Google Sheets.

You can query across different tabs on a single sheet, or query a totally
different sheet!

[https://developers.google.com/chart/interactive/docs/queryla...](https://developers.google.com/chart/interactive/docs/querylanguage)

[https://support.google.com/docs/answer/3093343?hl=en](https://support.google.com/docs/answer/3093343?hl=en)

~~~
beering
This feature makes me think, "When all you have is SQL, everything looks like
a relational database."

I recently confused some co-workers when they examined my Google Sheets query
- not knowing Excel very well, I just wrote SQL-ish. Now they have to live
with my SQL-ish spreadsheet or figure out how to redo it with Excel functions.

~~~
andrewstuart2
How about "when all data is fundamentally structured, a Structured Query
Language will be highly useful in many cases?"

Seriously, we live in a universe where everything is composed of lots of
smaller things that exist in relation to each other. Relational Data isn't so
much a database technology as it is fundamental information theory.

~~~
iheartmemcache
Let me preface this with: I'm a _huge_ proponent of SQL when appropriate. I
understand why Cassandra et al were required when generating dynamic live
newsfeeds for each user in an ecosystem when the RDBMS query plan was pretty
much designed in a naive non-uber-scalable-fashion.

That being said - let's take a field which changes fairly rapidly, all things
considered - medicine. You have two forms of changes which can occur within
your data model -- (a) definition changes where your schema stays the same
effectively and (b) paradigm changes. Let's say you're with a patient and need
to code in a ICD[1] value for whatever malady he has. A crude analogy would be
a comparison between the Dewey decimal system and the ICD. As the ICD evolves,
new codes might be added but the fundamental structure remains the same. So, a
new database paradigm TotallyAwesomeRuleBasedImmutableDB pops up, and you just
use the pre-existing code for "Databases" (say 20) and appropriate the next
available integer in that set for TARBIdb(lets say, 73). No problems there.
Essentially, we just inserting a new value (73, or added a new "smaller
thing") into the set of Databases to universe U_1 by without violating any
invariant properties. Consider (a) a non-issue.

The problem occurs when you need to address the issue of your patient being an
alcoholic. Previously, this was referred to as _trunksucht_ (loosely
translated as: "dipsomania"). This would fall under the 305, 303, 291, or 980
codes depending on if you were say, a non-drinker with no psychological
cravings for alcohol who pledged for a frat, drank too much and developed
alcohol poisoning as a result of that. If you were a daily drinker exhibiting
physical withdrawal symptoms, you would be coded from an entirely separate
section. From the ICD-9 to ICD-10, the WHO reclassified this entirely (such
that either patient would be coded in to an entirely new category 'F10').

This breaks a lot of things, from the front-end GUI which would do validation
with a regexp [0-9]{,3}\\.[0-9]* or some such which is trivial enough I
suppose, to the stored procedures which then have to get vendor approved
schema changes, all of which will be rolled up and sent to your IT guys who
have to go through tedious QA to make sure everything works. (And I mean
literally every modal dialog, every validation, every external XML-RPC call to
bill out your patients insurance company, Universe U_2).

I've read Codd's paper, the entire ANSI SQL11 standard, and have a fair grasp
of both major vendors. There is nothing in there about gracefully dealing with
these situations. I suppose one could argue that an RDBMS' purpose is solely
to act as a data-store while retaining availability and data integrity -- that
is, U_1 is all your Oracle instance cares about (manage your tablespace's set
of schema objects and the interaction between U_1 <\--> U_2 is completely out
of the scope of it's set of responsibilities). Maybe so, but when Oracle added
the XMLType with full XPath selector capabilities in 9i (give or take?) they
certainly co-opted the responsibility to deal with changes of a structural
(that is, "of type 'b'") nature. Dynamically typing, weakening of type
systems, writing to an 'duck typing'), serializing via the Protobufs
methodology, etc are all just band-aids; we need a surgeon to go in there with
some surgical sutures.

[1]
[https://en.wikipedia.org/wiki/ICD-10](https://en.wikipedia.org/wiki/ICD-10)

------
falsedan
Other posters have mentioned Microsoft's Excel 20-year-old ODBC support [0]:
you can connect any database as a source of data, then create pivot tables for
live ad-hoc transforms and summarizations of a database. This is a separate
feature to connecting to an Excel spreadsheet using ODBC.

I showed this to my manager and he immediately demanded I explain to the rest
of the team how this black magic worked…

[0]: [https://support.office.com/en-gb/article/Overview-of-
connect...](https://support.office.com/en-gb/article/Overview-of-connecting-
to-importing-data-c0ad7aec-ff1a-4e48-9c21-dc18a102433f)

~~~
lancewiggs
ODBC is over 20 years old.

I was using ODBC from MS Access to query Oracle databases in the early 90s -
both ODBC and (preferably) pass-through queries. The entire system included
then generating (recursive) reports in Access, passing the data to Excel to
integrate with other data and input, opening and updating a Powerpoint file
with linked tables, saving the PPT and returning to Excel for the next in the
series.

All driven by VBA. All in 1995, maybe 1994 too.

I had to restart Windows after each run of 20 or so, and had two computers as
one was always tied up running jobs. Kludgy but at least I wan't writing
COBOL.

Here is some 1994 PR:
[http://archives.cpajournal.com/old/16531700.htm](http://archives.cpajournal.com/old/16531700.htm)

------
fiatjaf
The idea is very nice, but I don't like the fact that it is just an API for
calculations. That kind of defeats the purpose of a spreadsheet format.

At first I thought it was an actual spreadsheet application, or an Excel
plugin or whatever.

I tried multiple times to create a spreadsheet app that would speak to JSON
APIs easily from within cells. The problem is that writing a spreadsheet app
from scratch is not easy (one of my tries:
[http://sheets.alhur.es/](http://sheets.alhur.es/)), but when that is done it
is not hard to integrate it directly with PouchDB, CouchDB -- or any other
database or application, actually, with a custom integration.

~~~
tekromancr
You might want to look at
[https://github.com/handsontable/handsontable](https://github.com/handsontable/handsontable)
for the front end. You could probably mash it up with this to build what you
describe.

------
ukd1
Cool idea, but that logo just reads as "preadsheet", which, is not exactly
positive.

~~~
Zombieball
Good call out. I just want to add that the issue is exacerbated on a mobile
phone. iPhone 6, portrait mode, the "S" and "preadsheet" are even on different
lines!

The linewrap goes away when I rotate to landscape mode.

~~~
Clex
Thanks for the feedback, that will get fixed soon.

~~~
cyberferret
Also, you mis-spelled "guarantee" under "99% uptime"...

~~~
Clex
Fixed, thank you.

------
Mister_Snuggles
I'd love to see this concept taken a step further and be able to write back
changes (within reason, of course) to a SQL database.

The piece I'd most love to see is some form of editable Pivot Table, where the
changes get written back to the appropriate place in the database. For
example:

    
    
        CREATE TABLE Balances(Account varchar, Scenario varchar, Amount numeric, primary key(Account,Scenario));
        -- Pretend that this is pre-populated with all Account and Scenario combinations
    

Using this schema, I would want to get a Pivot Table out of this where Account
makes up the row labels, Scenario forms the column labels, and the Amount
values are the data. Where the magic happens is that I want to be able to make
changes to the amounts and have it write those back into the database with the
appropriate UPDATE statements.

~~~
cyberferret
Be prepared for a bill from your DBA for his/her blood pressure medication
then... or the accounts team when they realise that a missing conversion of a
string to a numeric caused you to zero out every customer's account balance...
;) (Based on real world experience with my clients over 3 decades)

Even 20 years ago good old Microsoft with their MSQuery capability in Excel
stopped you from doing this every way they could.

~~~
Mister_Snuggles
Depending on how it's done it can be perfectly safe.

I've actually built a thing, in Lazarus of all things, to do this very thing.
Unfortunately it's very specific to my use-case, but it is "safe" in the sense
that it doesn't screw things up too badly. My particular use case is actually
very spreadsheet-like, but I also want some things that are a lot easier to do
in SQL than in Excel.

My use case is to have a tool to help me evaluate monthly financial statements
for a smallish organization, enter my own adjustments, and come up with a
final statement.

It's spreadsheet-like in the sense that I enter everything into a grid of rows
and columns. The rows are accounts (in the income statement sense, not
customer accounts), columns are a "type" (e.g., actuals, my adjustments,
adjustments made by the auditor, adjusted actuals, budget, budget vs actuals,
etc). Some accounts and some types are calculated, thus read-only in the
table. There's also a drop-down at the top to select a period.

Basically, it's everything that a basic Excel Pivot Table does, but database-
backed.

My Lazarus program lets me fill in that table however I want, when I hit save
it persists it all to the database and runs a stored procedure to recalculate
all of the calculated values. The database makes sure that I don't try to put
a string in a numeric field, and it (PostgreSQL) is smart enough not to
convert strings to zeroes.

The thing that makes this better than a spreadsheet is that I can just add a
new account/scenario/whatever and run a stored procedure to insert the missing
rows in the data table. When I did a similar thing with Excel, it was easy
enough to insert rows. The hard part was inserting all of the rows everywhere
they needed to go (I had a tab for budget, a tab for actuals, a tab for
adjustments, etc) and updating all of the formulas.

------
franciscop
I created something simpler but based on Google Docs and just retrofitted
promises into it: [https://github.com/franciscop/drive-
db](https://github.com/franciscop/drive-db)

------
eob
Cool idea. We've built a spreadsheet-powered web platform (Cloudstitch - YC
S15) that lets you publish web forms and widgets that rely on spreadsheets as
a backend.

Several folks use us just for the spreadsheet API -- as this service seems
oriented. We provide a simple, common API that can talk to both GSheets and MS
Excel 365, support various forms of row-level access control, and Ruby on
Rails-like join syntax. Would love to hear feedback.

------
garyclarke27
I was an Excel expert before becoming a sql expert. Following this I changed
the way I created large excel models, now 90% of my data is in Excel tables.
The main benefit of these is they are much easier to use and manage than range
names and the ranges automatically change their size to contain the data. Wish
ms would add unique row ids (as we'll as the unique column ids they have) is a
bit cumbersome having to setup index match everywhere to link rows in
different tables. It's amazing what you can do with sumifs and countifs these
are far more powerful then the sumif - very similar to sql where clause, with
multiple criteria such as <= , works with text and numbers. I used to use tons
of pivot tables but no longer, they are too unstable and confusing and very
hard to format nicely.

~~~
sidegrid
Any good resources you used to learn this (to replace pivot tables)?

------
sandGorgon
pretty cool - I have been a fan of tools like Airtable.com (IMHO the best
cloud spreadsheet I have seen). In my own startup, I have hacked together a
version of a cloud spreadsheet using Handsontable + reactjs + flask api.

Can you talk about the stack you are using ? really interested to hear about
it.

~~~
Clex
Thank you for the kind words!

Airtable is really neat indeed. But SpreadsheetDB aims to be more a database
with an alternative paradigm than an actual spreadsheet editor.

Regarding the stack, it is made with Go and Postgres.

------
brilliantcode
Not sure what pain problems this solves. You can already programmatically
access and manipulate Google Sheets.

And no, you are not going to run in to their upper bound limitations on rows
and columns often.

------
SEJeff
Aka Access?

~~~
londons_explore
SSSh - we like reinventing the wheel in web based form!

------
nodesocket
Somewhat related. I've been using AirTable
([https://airtable.com/](https://airtable.com/)) to manage my consulting
startups clients and even track hours. Absolutely loving it.

You define tables and scheme in their beautiful excel like web interface.
AirTable then automagically generates a RESTful API you can build on-top of.
They even have a great Node.js ORM[1] library. Finally, you can export public
embeddable forms for inserting data into tables.

I'd love to see somebody create a React based front-end that pulls data and
schema from AirTable and automatically creates a CRUD interface like Rails
does. All inclusive authentication (Twitter, Facebook, GitHub) and user level
permissions.

[1] -
[https://github.com/airtable/airtable.js](https://github.com/airtable/airtable.js)

------
sirrice
There's some related research out of UIUC that's pretty neat.
[http://dataspread.github.io/](http://dataspread.github.io/)

