
Show HN: Octo – Generate a serverless API from an SQL query - khalidlafi
https://octoproject.github.io/octo-cli/
======
eatonphil
I've got a similar project that reads your db schema and generates a Go REST
API and a TypeScript/React web interface. (The code-generation is language
agnostic so at some point I'd like to add at least a Java REST API as well.)
It supports PostgreSQL, MySQL, and SQLite.

Unlike PostgREST/Hasura and some other dynamic tools you can "eject" at this
point if you'd like and continue on development without the generator in a
language you already know. But I'm working on exposing Lua-based hooks you
could carry across whatever backend language you choose to generate and avoid
the need to eject.

It has builtin support for paginated bulk GET requests with filtering,
sorting, limiting. Built-in support for bcrypt-password authentication and
optional SQL filters specified in configuration for authorization of
particular endpoints based on session and request metadata.
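As a rough illustration of what a paginated bulk GET with filtering and sorting boils down to (column/table names here are hypothetical, not dbcore's actual schema), a request like `GET /notes?filter=...&sort=created_at&order=desc&limit=25&offset=50` would typically translate to:

```sql
-- Hypothetical translation of a filtered, sorted, paginated bulk GET.
SELECT id, title, body, created_at
FROM notes
WHERE title LIKE 'a%'         -- filter from the query string
ORDER BY created_at DESC      -- sort + order parameters
LIMIT 25 OFFSET 50;           -- pagination window
```

An authorization SQL filter in configuration would then just be an extra predicate ANDed onto the WHERE clause, parameterized by session metadata.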

Still very much a work in progress but the goal is to push the envelope on
application boilerplate.

Screenshots are of the example/notes project in the repo.

[https://www.dbcore.org/](https://www.dbcore.org/)

[https://github.com/eatonphil/dbcore](https://github.com/eatonphil/dbcore)

~~~
MuffinFlavored
I feel like projects like this work for simple stuff but as soon as you need
analytics/insights or actual business logic, you almost always need to just
"roll your own" API. Am I wrong? Do other people feel this way? Can anybody
think of a few projects they've worked on that would be too complex/a ton of
work to get working with these kinds of simple template generators?

~~~
robmccoll
It seems like when working with generators, the trick is to have the right
boundaries between generated code, points where you can extend the generated
code, and the API through which you use the generated code. If successful, you
should never feel the need to hand edit the generated code itself, and you
shouldn't need to worry too much about re-running the generator breaking
things or stomping on your code.

~~~
xgenecloud
>> you shouldn't need to worry too much about re-running the generator
breaking things or stomping on your code.

Couldn't agree more! There are code parsers that give you a DOM through which the code can be manipulated, but it's a bit of work. We are trying this concept in our framework.

------
Trisell
The idea is interesting. But it looks like you end up with a yaml file that
enumerates each of your tables/endpoints and the queries that back them. So
are we exchanging the “complexities” of code, where we have control and
testing, for the “lack of complexity” of yaml that becomes unwieldy and
untestable in the name of “simplicity?”

~~~
stingraycharles
Don’t forget that at some point, you’ll want to generate the yaml from code,
because otherwise it becomes impossible to maintain. And quickly you’ll find
yourself back at square 1. :)

------
Glyptodon
One of the things that's not obvious to me about things like this (and other
similar tools) is where/how scopes/limitations/permissions are handled. I
assume they either are or can be, I just never see it spelled out clearly.
What am I missing?

~~~
lukeramsden
I can't speak for this project specifically but for some context,
Postgraphile's way of solving this, as it only supports Postgres, is to use
Postgres's Row Level Security feature, whereby you enforce scopes and
permissions at the data layer, as well as using table grants to roles
specified in JWTs and such.
([https://www.graphile.org/postgraphile/postgresql-schema-desi...](https://www.graphile.org/postgraphile/postgresql-schema-design/#row-level-security))
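For context, a minimal RLS setup in Postgres looks roughly like this (table, role, and claim names are illustrative; `jwt.claims.user_id` follows the convention Postgraphile-style setups use to expose the authenticated user to SQL):

```sql
-- Enable row-level security on the table and restrict rows to their owner.
ALTER TABLE notes ENABLE ROW LEVEL SECURITY;

-- Every query through this table now sees only rows matching the policy.
CREATE POLICY notes_owner ON notes
  USING (owner_id = current_setting('jwt.claims.user_id')::integer);

-- Combined with ordinary table grants per role:
GRANT SELECT, INSERT, UPDATE, DELETE ON notes TO app_user;
```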

This project doesn't seem to have any inbuilt AuthZ functionality, so unless
your database has that built in like Postgres, or you need something that's
not possible in-database, I guess you just... can't.

~~~
ecf
I recently started a side project on top of Postgraphile and I had to end up
scrapping it in favor of something I was more familiar with.

The biggest problem is that unless the main language you’re familiar with is
PL/pgSQL, you’ll eventually run into the roadblock that is having everything
reside in the DB.

In my case, I simply could not figure out how to use the Users table without
having the password returned in all queries. I could turn off the
automatically generated queries that Postgraphile made, but at that point why
bother with it?

Other problems are that there isn’t a mature and well established method of
maintaining development, staging, and production database schemas on projects
with more than one developer.

Graphile makes
[https://github.com/graphile/migrate](https://github.com/graphile/migrate),
however, it seems incredibly brittle and is built on a workflow that will
absolutely break production if it is slightly deviated from.

I really liked using it! I just hope some of these developer ergonomics issues
could be better handled.

~~~
lukeramsden
> In my case, I simply could not figure out how to use the Users table without
> having the password returned in all queries

The recommended way of doing this is to store anything you don't want public
in a separate table with a one-to-one relationship, and then controlling
access to that table through computed columns and such
([https://www.graphile.org/postgraphile/postgresql-schema-desi...](https://www.graphile.org/postgraphile/postgresql-schema-design/#storing-emails-and-passwords))
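Concretely, the pattern is to split public and private data across schemas so the generator never exposes the secrets (a sketch; the `app_public`/`app_private` schema names follow the Graphile starter convention):

```sql
-- Public profile data, introspected and exposed via the GraphQL schema.
CREATE TABLE app_public.users (
  id serial PRIMARY KEY,
  username text NOT NULL UNIQUE
);

-- Secrets live in a schema Postgraphile is never pointed at,
-- linked one-to-one to the public table.
CREATE TABLE app_private.user_secrets (
  user_id integer PRIMARY KEY REFERENCES app_public.users,
  password_hash text NOT NULL
);
```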

What I can't recommend enough to people starting with Postgraphile is to check
out graphile-starter, and especially its first migration:
[https://github.com/graphile/starter/blob/main/%40app/db/migr...](https://github.com/graphile/starter/blob/main/%40app/db/migrations/committed/000001.sql)

That really helped me understand how to structure my schema better, since separating concerns within a model across different tables is not something you need to do in a regular application-in-front-of-datastore type app.

> Other problems are that there isn’t a mature and well established method of
> maintaining development, staging, and production database schemas on
> projects with more than one developer.

Yes. I haven't had to grapple with this issue, so I don't really have much to say about it. Graphile-migrate works great for me, but I can see how it could be an issue for larger projects, although I would think that database migrations are simply tricky on their own, regardless of Postgraphile.

~~~
ecf
You’re completely right about the starter and I feel if I had discovered it
first instead of trying to roll my own migration system with Knex and stuff, I
would have stuck with Postgraphile.

But instead of rebasing my project with the starter I decided to go back to
what I was comfortable with and do a Rails gql api.

I’ll definitely be trying it out in the future though! I felt so incredibly
productive with rolling stuff out before auth became a concern.

~~~
lukeramsden
Yes, it's been incredibly quick for me to iterate and build my current project. You're still right about migrations though; I'm not sure how that would work in a bigger team with feature PRs etc.

------
akie
Perhaps I'm old, but who needs an API for an SQL query? I'm not sure I
understand the use case, or the advantage of something like this over a
regular API call to a backend which would also allow you to do e.g.
authentication. Enlighten me?

~~~
chadhutchins10
One of the best use cases for this is when you have a backend/internal system and you want other things to start interacting with it. Instead of having to write an API to interface with it, you can just use something like this, and with little effort you have an API and can talk to the database.

~~~
alangibson
I think the point he was making is: why the API if you just want to talk to
the db? You can connect to a SQL db over the network and protect the data with
views and stored procedures.
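A sketch of that approach, assuming a `users` base table with sensitive columns and a restricted `reporting_role` (all names illustrative):

```sql
-- Hide the base table from the connecting role...
REVOKE ALL ON users FROM reporting_role;

-- ...and expose only a sanitized view of it.
CREATE VIEW public_users AS
  SELECT id, username, created_at FROM users;

GRANT SELECT ON public_users TO reporting_role;
```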

~~~
RussianCow
Because you can't (easily) connect to an SQL DB from any arbitrary client,
like a web browser or mobile app.

~~~
chatmasta
That's actually exactly what we're trying to build at Splitgraph [0]. :) We're
building a "data delivery network" (DDN), which is like a CDN, except instead
of forwarding HTTP requests to upstream web servers, we forward SQL queries to
upstream databases.

The premise of the idea is that we can cut out the middle-man for a lot of
data distribution use cases. We give you a way to deliver your data in native
SQL, using the Postgres wire protocol. We've decoupled authentication from the
database, so we can do it in a gateway / LB layer using PgBouncer + Lua/Python
scripting. Any SQL client can connect to the public Splitgraph endpoint (as
far as a client is concerned, Splitgraph is just a really big Postgres
database). You can write queries referencing and joining across any of the 40k
datasets on the platform.

In fact, just this week we've been working on v0.0.0 of our web client. This
lets you do things like share and embed SQL queries on Splitgraph, e.g. [1]
(this query actually joins across two live data portals at
data.cityofchicago.org and data.cambridgema.gov).
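(Decoded, the query embedded in the [1] link reads:)

```sql
-- Join across two tables at different government data portals (Chicago and Cambridge)
-- Splitgraph will rewrite the queries into the providers' query language, get the data
-- and run the JOIN, returning the results over the PostgreSQL protocol.
SELECT
    cambridge_cases.date AS date,
    chicago_cases.cases_total AS chicago_daily_cases,
    cambridge_cases.new_positive_cases AS cambridge_daily_cases
FROM
    "cityofchicago/covid19-daily-cases-deaths-and-hospitalizations-naz8-j4nc".covid19_daily_cases_deaths_and_hospitalizations chicago_cases
FULL OUTER JOIN
    "cambridgema-gov/covid19-cumulative-cases-by-date-tdt9-vq5y".covid19_cumulative_cases_by_date cambridge_cases
ON
    date_trunc('day', chicago_cases.lab_report_date) = cambridge_cases.date
ORDER BY date ASC;
```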

There's also an example here of using an Observable notebook with the
Splitgraph REST API [2]. It also works with the Splitgraph DDN configured as a
Postgres database, but that's only supported in private notebooks for now
(since normally it's a bad idea to expose your DB to the public!)

In general, we like the idea of adding more logic to the database. Tools like
OP's are useful in this regard. In fact, at Splitgraph we use Postgraphile
internally (along with graphql-codegen for autogenerated types) and we have
nothing but good things to say about it.

[0] [https://www.splitgraph.com/](https://www.splitgraph.com/)

[1]
[https://www.splitgraph.com/workspace/ddn?layout=hsplit&query...](https://www.splitgraph.com/workspace/ddn?layout=hsplit&query=--+Join+across+two+tables+at+different+government+data+portals+%28Chicago+and+Cambridge%29%0A--+Splitgraph+will+rewrite+the+queries+into+the+providers%27+query+language%2C+get+the+data%0A--+and+run+the+JOIN%2C+returning+the+results+over+the+PostgreSQL+protocol.%0ASELECT%0A++++cambridge_cases.date+AS+date%2C%0A++++chicago_cases.cases_total+AS+chicago_daily_cases%2C%0A++++cambridge_cases.new_positive_cases+AS+cambridge_daily_cases%0AFROM%0A++++%22cityofchicago%2Fcovid19-daily-cases-deaths-and-hospitalizations-naz8-j4nc%22.covid19_daily_cases_deaths_and_hospitalizations+chicago_cases%0AFULL+OUTER+JOIN%0A++++%22cambridgema-gov%2Fcovid19-cumulative-cases-by-date-tdt9-vq5y%22.covid19_cumulative_cases_by_date+cambridge_cases%0AON%0A++++date_trunc%28%27day%27%2C+chicago_cases.lab_report_date%29+%3D+cambridge_cases.date%0AORDER+BY+date+ASC%3B)

[2] [https://observablehq.com/@mbostock/hello-splitgraph](https://observablehq.com/@mbostock/hello-splitgraph)

------
cube2222
Looks great!

If you like this, check out OctoSQL[0]... Also in Go... Though OctoSQL lets
you query multiple databases / files / event streams like kafka using SQL from
your command line, not as a server, so a fairly different use case, but you
should check it out nevertheless!

The naming clash is funny.

[0]:
[https://github.com/cube2222/octosql](https://github.com/cube2222/octosql)

~~~
jhoechtl
I really like your tool. In fact, I am slowly integrating it into a solution which will expose a REST API and workspaces identified by a UUID. In our organisation it is very common to receive an Excel or CSV file which you have to join with the database. OctoSQL is great for that.

I am wondering what role Badger will play in the future? It would also make a great additional KV backend, btw.

~~~
cube2222
That's really great to hear!

We're considering moving to a more in-memory model, as we're not sure if the
badger storage idea was a good one and worth it.

TBH we're still not quite sure in what direction we'll be continuing. Though
we're surely gonna be developing it further.

But currently we're considering a rewrite with multiple assumptions changed (column-oriented).

------
ForHackernews
Looks vaguely similar to [http://postgrest.org/](http://postgrest.org/)

~~~
lukeramsden
And Hasura, Postgraphile, et al. These, as well as PostgREST, also give you much more flexibility in the form of plugins, a library mode, and other such things. They also generate the actual queries for you via introspection, as opposed to this project, which requires you to write the query yourself.

I think there's certainly space for this project, i.e. hand-written queries,
on any database (Postg[REST|raphile] both only work with Postgres of course,
not sure about Hasura). Not sure it will succeed without support for more
forms of Serverless deployment, primarily Lambda.

------
alexellisuk
Nice to see openfaas featured here and thanks for your PRs to Arkade. I do
wonder what your strategy is on connection pooling and authentication?

Also, I'm not keen on the passwords being kept in a plaintext file - someone will check that into git. OpenFaaS has secret support which you can use, Amal. So does Knative.

------
mitjam
Reminds me of the venerable Datasette by Simon Willison:
[https://github.com/simonw/datasette](https://github.com/simonw/datasette)

------
o1lab
Interesting concept, and I quite liked the playful logo. Can we pass env variables to the db connection?

We are in a similar space: we take database connection params as input and generate CRUD APIs with Auth+ACL, and the APIs are then packed into a single lambda function. There is support for the serverless framework as well.

[1]:
[https://github.com/xgenecloud/xgenecloud](https://github.com/xgenecloud/xgenecloud)

~~~
amal_kh5
Yes, for example:

? Enter the database password: ${DB_PASSWORD}

------
alexzender
Interesting, I've built a similar project that generates GraphQL API based on
your database schema - [https://okdb.io](https://okdb.io)

------
bobbywilson0
My main use for tools like these has always been prototypes or hobby one-off type stuff: for SPAs, or a sketch with a Jupyter notebook. They're great for this sort of thing because, in my experience, this used to require building some sort of API just to get a simple JSON interface to the database. It was my understanding that the purpose of these types of tools was mostly that.

Are folks using these kind of things for non-trivial production applications?

------
hn_throwaway_99
I fear that all of these "expose your DB as an API" tools like this,
Postgraphile, Hasura, etc. are going to set folks up for a world of hurt down
the road. Tightly coupling your end clients to your database schema can make
it extremely difficult, if not impossible, to refactor your DB if you need to
(which is highly likely).

~~~
brycelarkin
I’m building a project using one of those tools. I imagine that difficulty refactoring your database is more a problem of bad schema design than of the tool. If you normalize and abstract the implementation details out into views, I can’t see how refactoring would be difficult. I haven’t built anything at scale with Postgraphile/Hasura, so I'm just wondering if I’m missing anything here.
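For example, clients could depend on a view rather than the underlying tables, so a refactor only has to preserve the view's shape (schema and column names here are illustrative):

```sql
-- Clients query the view; the base tables can be refactored freely
-- as long as the view definition is updated to produce the same shape.
CREATE VIEW api.orders AS
SELECT o.id,
       o.created_at,
       c.email AS customer_email
FROM internal.orders o
JOIN internal.customers c ON c.id = o.customer_id;
```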

------
modarts
Looks like someone took this tweet literally
[https://twitter.com/davecheney/status/1296033304756404225](https://twitter.com/davecheney/status/1296033304756404225)

------
rimkms
Logo is looking good, gj

------
revskill
Do you know any similar tool which supports GROUP BY queries?

------
amani92
Very impressive, great job.

------
WrtCdEvrydy
Your timing is perfect.

------
ahmadbana
Very interesting project, and it can be scalable. Keep up the good work.

