GraphJin – GraphQL to SQL Compiler (graphjin.com)
80 points by nikolay 76 days ago | 80 comments

> After working on several products through my career I found that we spend way too much time on building API backends. Most APIs also need constant updating, and this costs time and money.

> It's always the same thing, figure out what the UI needs then build an endpoint for it. Most API code involves struggling with an ORM to query a database and mangle the data into a shape that the UI expects to see.

Most business APIs have business-specific, per-user read/write access roles.

I am starting to think that a better alternative to GraphQL and other DSLs is to:

- A. Make the iteration speed of adding a new plain HTTP endpoint very fast.

- B. Use a typed language, and some kind of macro to extract the types into an Open API spec.

This way everything is just a regular function in your general language:

- http_handler(request) -> response

- user_has_access(user, resource) -> bool


Regular functions have no external dependencies, are easy to understand and edit, and stand the test of time better than DSLs.
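A minimal Go sketch of this approach (the names `userHasAccess` and `postsHandler` are illustrative, not from any framework):

```go
package main

import (
	"encoding/json"
	"net/http"
)

type User struct{ Role string }

// userHasAccess is the plain-function access check: no DSL, just code
// you can read, test, and step through.
func userHasAccess(u User, resource string) bool {
	return u.Role == "admin" || resource == "public"
}

// postsHandler is a plain HTTP endpoint: decode, authorize, query, encode.
func postsHandler(w http.ResponseWriter, r *http.Request) {
	u := User{Role: r.Header.Get("X-Role")} // however you derive the user
	if !userHasAccess(u, "posts") {
		http.Error(w, "forbidden", http.StatusForbidden)
		return
	}
	// ...query the database here...
	json.NewEncoder(w).Encode(map[string]string{"status": "ok"})
}

func main() {
	http.HandleFunc("/posts", postsHandler)
	// http.ListenAndServe(":8080", nil)
}
```

Extracting an OpenAPI spec from the typed request/response structs is then a code-gen step on top, rather than a runtime concern.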

If GQL does what you need out of the box, it seems a win. But I would assume some endpoints need to fall back on the above approach anyway, giving you a mixture of GQL-generated and hand-written handlers.

Having used APIs from SOAP to REST to GraphQL and back, I would agree. It got to the point in a recent project (written in Go) where just slapping another endpoint on, even with fairly particular nesting depth, was a 5-minute affair. That contrasts in my mind against another recent project using interpolated GraphQL which has been a struggle on almost every front: struggle to CRUD data, struggle to fetch certain corner cases, struggle with the performance/mem usage of the autogenerated graphql -> sql translation, struggle to bolt on custom graphql handlers for things not mapping neatly to models, struggle to debug/test, on and on. Just write a goddamn JSON handler.

I’ve been building [1] for the past few years and it’s pretty close to what you’re describing (and also does much more).

[1] https://encore.dev

Sounds close to FastAPI[1]. Anyone know of equivalent / better frameworks in other languages?

[1]: https://fastapi.tiangolo.com/

We use not so much frameworks as a combination of lightweight libraries:

- runtime assertions [0] - to map unknown values at the I/O boundary into statically typed code (RPC input parameters, SQL results, etc.)

- template based sql combinators to sanitize sql/generate sql [1]

- jsonrpc over websockets - for bidirectional comms between f/e and b/e

It works very well for us across hundreds of RPCs - it's easy to manage things like permissioning/narrowing visibility and other aspects you need in a business/enterprise-like setting; e.g. performance monitoring is very easy/straightforward.

[0] https://github.com/appliedblockchain/assert-combinators

[1] https://github.com/appliedblockchain/tsql

We build Taxi (https://taxilang.org), which is the type system for APIs, and then Vyne (https://vyne.co), which uses those types to automate all the plumbing.

It's pretty close to what's being described here, and removes the boilerplate in API orchestration.

How is taxilang different from openapi? Does it have client code gen capabilities?

GraphJin can be used within your own HTTP handlers (REST), sort of like an ORM. Or you can use the @script directive to add a JS script to handle the before and after of the request.

Here's an example of a relatively complex query that you'd need when building something like a blog. GraphJin will compile this nested query into a single efficient SQL statement.
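The query from the original comment wasn't captured here; an illustrative nested blog-style query of the kind described (field names are hypothetical, held in a Go raw string since GraphJin is used from Go) might look like:

```go
package main

import "fmt"

// An illustrative nested query of the kind described: posts with their
// author and comments, fetched in one round trip. A compiler like
// GraphJin turns a query shaped like this into a single SQL statement.
// (The field names are hypothetical, not any real schema.)
const blogQuery = `
query {
  posts(limit: 10) {
    id
    title
    author {
      full_name
    }
    comments(limit: 5) {
      body
      author {
        full_name
      }
    }
  }
}`

func main() {
	fmt.Println(blogQuery)
}
```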


Postgraphile allows for user roles, etc.

This is where something like OPA shines. Given a context, you can make an up-or-down decision as to whether or not the request is allowed. Then you're not having to wedge security into your ORM reshape code.

I think tRPC.io works well for this, but haven’t used it for anything big. Curious what other folks think of it.

That assumption is wrong. GraphQL can deliver auth, file downloads, even streams.

+1 to that. Additionally, there is nothing like subscriptions, defer, or async on the REST side of things. These are really great capabilities to have for building snappy, data-rich apps.

Maybe this is a dumb observation, but at this point what's the point of even having a backend or GraphQL (except for maybe authentication)? If you are mapping the GraphQL to SQL, why not just send direct SQL commands? If you aren't adding anything extra, what's the point of even using it? Just query SQL from the frontend.

In my opinion GraphQL is operating on an entirely different level to SQL in one meaningful way: manipulating structured data

Manipulating rows in SQL is easy, but doing nested joins and trying to represent one-to-many relationships correctly in the response is non-trivial. I would say you have to be good at SQL to leverage the DB to make structured data; returning tables and rows is entry level.
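The non-trivial part is concrete: a flat join result repeats the parent row once per child, and something has to regroup it. A sketch of that reshaping work in Go (illustrative types):

```go
package main

import "fmt"

// A flat row as it comes back from `SELECT ... FROM posts JOIN comments ...`:
// the post columns repeat once per comment.
type row struct {
	PostID  int
	Title   string
	Comment string
}

type post struct {
	ID       int
	Title    string
	Comments []string
}

// nest regroups the flat join result into the one-to-many shape the UI
// wants; this is the work that GraphQL servers (or json_agg in SQL)
// otherwise do for you.
func nest(rows []row) []post {
	index := map[int]int{} // PostID -> position in out
	var out []post
	for _, r := range rows {
		i, ok := index[r.PostID]
		if !ok {
			i = len(out)
			index[r.PostID] = i
			out = append(out, post{ID: r.PostID, Title: r.Title})
		}
		out[i].Comments = append(out[i].Comments, r.Comment)
	}
	return out
}

func main() {
	rows := []row{
		{1, "Hello", "first!"},
		{1, "Hello", "nice post"},
		{2, "Bye", "sad to see you go"},
	}
	fmt.Println(nest(rows))
}
```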

In GraphQL on the other hand, it’s seamless / entry-level to query for nested data and represent it really semantically, which is very appealing when doing presentational work (read: UIs, I suppose).

Therefore, the purpose of constructs like this is to allow those working on the presentational layer to be able to construct queries for semantic, structured data themselves with no requirement for SQL / backend expertise. This is a pretty meaningful improvement to unblocking development for both sides of the stack, in my opinion, and is why I apply Hasura everywhere I can.

There are existing frontend tools that can compose GraphQL but only string builders can compose SQL, AFAIK; it also seems to act as a "paper over the underlying database specifics" by using what seems to be MongoDB-esque criteria: https://github.com/dosco/graphjin/wiki/Guide-to-GraphQL#othe... meaning the consumer need not know the postgres-vs-mysql-isms (in theory, of course)

Also, don't overlook the whiz-bang of the GraphQL introspection tooling -- it's super handy for just kicking the tires on something in ways that "dump the SQL schema to the browser" likely wouldn't do

The related pg_graphql posted a while back (https://news.ycombinator.com/item?id=29430720) actually mentions GraphJin positively, and talks about a bunch of competing implementations, although it's not one-to-one with GraphJin because it seems to support mysql whereas pg_graphql is of course a PG extension

Also, GraphJin is the only one built in Go and therefore can be used as a library within your own Go app, allowing you to use it while also adding any business logic you might need.

The iql tool is pretty amazing for coding up a query and introspection.

Because sometimes writing the equivalent SQL can be a pain in the ass. Pagination is such an example. PostGraphile does a good enough job of it that I don't need to reason about it.

This is called PostgREST: https://postgrest.org/. Never used it in production, but PostgREST leverages multiple PostgreSQL features: row-based security, schemas, authorizations, etc. I quite like PostgREST, and I'd use it for internal stuff.

We used PostgREST in a project at my previous company, and it's the most horrendous thing imaginable. The project is a meme in the company and nobody wants to work on it, including its author (who decided to switch teams because of how much he hated it).

I don't know if this was because of PostgREST or because all the logic was encoded in grotesque-looking SQL that nobody understood (and changing some things requires dropping and re-adding things).

It is an extremely cool project, don't get me wrong, but it's one of those things that's "I know this already and I'll use it for a weekend/PoC project to not deal with writing a backend", and not for something important. And again, I don't know if this was just poor usage or something that manifests itself in every project that relies on it. Also, the DSL is very much its own thing, so migrating away from it is painful, especially if you inherited the project and don't want to read the documentation. But again, sample size is 1, so it's anecdotal.

I tried PostgREST for a small internal tool and it was a decent experience. I followed these rules:

1- Don't expose your real database, but instead expose a database consisting of views on your real database so you can refactor.

2- Wrap your business logic in stored procedures to keep your SQL readable

3- Use JWTs for authentication

That said, I would not recommend it for a large-scale application. I consider it, maybe wrongly, as the MS Access of the backend.

This is basically the idea behind [Thin Backend](https://thin.dev/). Instead of exposing raw SQL strings, we have a few high-level functions to compose queries in a nice way and to do all kinds of CRUD operations. These functions map 1:1 to SQL.

We use a WebSocket connection to keep all queries fast. Auth is solved with row level security.

It's even in the name: The backend layer is just a "thin" layer over the database. Most of the business logic is then implemented in a "rich" client/frontend.

GraphQL is the extra. It's usually a lot easier to reason about a GraphQL query and the necessary data for a frontend component than to write a one-off SQL query for the same purpose.

GraphJin does automatic DB discovery and builds a relationship graph, which it then uses to write out a single efficient SQL query for any GraphQL query or mutation, even nested ones. As GraphJin improves and uses new capabilities of newer versions of the DB, you get that for free. In the backend your query is compiled only the first time, into a prepared statement, and from then on requests pretty much go directly to the DB.

Check out some of the generated queries this extension [1] pumps out and you might have an answer.

[1] https://github.com/supabase/pg_graphql

Oh boy, we got to the point where we're wondering why we need a "backend" anymore.

The most curious part of these tools (for me) is how they handle cursor pagination.

Unfortunately, there are two closed-out issues, seemingly with no resolution, that seem to indicate cursor-based pagination doesn't work with Relay, and perhaps doesn't work correctly at all.

[0] https://github.com/dosco/graphjin/issues/44

[1] https://github.com/dosco/graphjin/issues/47

Thank you. Any project that closes a real issue without solving it is a joke to me.

No real issues have been closed without being solved; cursor pagination has been working for years. The issue clearly says "frederikhors closed this as completed on Mar 3, 2020", as in closed by the person who opened it.

Cursor pagination works fine: just ask for a cursor with any query and you can then pass it back with the next query to paginate. We don't support Relay at all.
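Mechanically, an opaque cursor is usually just the last row's sort key, encoded so the client can't meaningfully tamper with it. A generic sketch of the pattern (not GraphJin's actual implementation):

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strconv"
)

// encodeCursor turns the last-seen sort key into an opaque token the
// client echoes back with the next request.
func encodeCursor(lastID int64) string {
	return base64.StdEncoding.EncodeToString([]byte(strconv.FormatInt(lastID, 10)))
}

// decodeCursor recovers the sort key so the next page's query can be
// `WHERE id > $1 ORDER BY id LIMIT $2`.
func decodeCursor(tok string) (int64, error) {
	b, err := base64.StdEncoding.DecodeString(tok)
	if err != nil {
		return 0, err
	}
	return strconv.ParseInt(string(b), 10, 64)
}

func main() {
	tok := encodeCursor(42)
	id, _ := decodeCursor(tok)
	fmt.Println(tok, id)
}
```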

I'm always befuddled by why the GraphQL haters have to come out of the woodwork to shit on a new project in the space, rather than just quietly saying to themselves "this isn't for me," and moving onto the next story.

From my personal experience, there's many backend developer types who resist tools or libraries that eliminate or minimize their bespoke API development. (Even when they complain about constantly making new endpoints for some frontend)

To many folks, it probably reeks of the horrible RPC frameworks of the early 2000’s, which couldn’t be further from the truth, unless they roll their own server like many people seem to be doing for no reason.

I think folks have a hard time understanding the value until they use it themselves.

Since founding GraphCDN, I've been surprised by how many customers we see using Hasura. I was aware of Hasura beforehand, but sort of wrote them off after seeing the fates of GraphCool (pivoted to Prisma) and Scaphold (acq. Amazon).

Apparently, there's strong demand in the market for quicker ways to create GraphQL APIs (see Graphile and now Graphjin), so I'm excited to see more innovation in this space!

Author of GraphJin here, I agree Hasura is pretty great as well.

"automagical GraphQL to SQL compiler" makes no sense. Is this just another Hasura? Is it a compiler or a service?

It's similar to Hasura, but can be used as a library in Go. Makes a ton of sense if you have an already-developed service in Go and you want to add a /graphql endpoint and have a database schema exposed with your already-existing authentication.

Hasura would most likely be deployed as a separate service with separate authentication.

I want to say something nice about this... but unless your entire enterprise is a pure CRUD app with just a SQL database I don't see how this is useful. Complex application backends pull data from multiple sources - SQL and NoSQL databases, REST and SOAP endpoints, message queues, service buses - and perform all sorts of transactions and data transformations.

One of the strengths of GraphQL is you can transform data from multiple sources into a common model the UI can use. If your whole backend is a single SQL database schema GraphQL doesn't seem to add a whole lot of value there. I'd rather have a code generator that builds a simple Go microservice and use protobuf JS in my front end.

> I'd rather have a code generator that builds a simple Go microservice and use protobuf JS in my front end.

You might, but there's something to be said for consistency within an org. GraphQL is powerful and there's many a use case for querying data via a simple GraphQL API in the frontend. You could have just stated you have a bias against use of GraphQL without the need for multiple data sources, and left out the rest of the FUD.

Projects like Postgraphile, Hasura, AppSync, and Supabase's pg-graphql aren't popular because people are lazy or are out there in droves implementing APIs laden with antipatterns. They're popular because they're useful and powerful. And for many companies that funnel all of their data into single stores like Postgres or other relational databases, these kinds of tools are invaluable.

You can do this with GraphJin. We support adding custom resolvers; currently we have an HTTP resolver built in, which allows you to join your DB GraphQL query with a remote HTTP REST endpoint. For example, say you want users from your DB and their subscription data from Stripe. It works out of the box: GraphJin will write out the correct REST call using data from the DB call, e.g. stripe_customer_id from the DB will be inserted into the REST call. Additionally, it's smart enough to handle rate limits and parallel fetches if required.

Here's an example https://pkg.go.dev/github.com/dosco/graphjin/core#example-pa...

And here's an example of a Redis custom resolver to join with data from Redis. https://pkg.go.dev/github.com/dosco/graphjin/core#Resolver

this is conceptually very cool, and graphql can be great, but hooking graphql straight up to a persistent datastore as a means of exposing things to external/front-end clients is an absolutely terrible idea and nothing is going to change my mind.

GraphJin DOES NOT hook GraphQL straight to a DB. It is a backend service which in DEV mode saves queries to an allow list, and in production only queries from this list are allowed to run. GraphQL does not mean clients are allowed to change the query willy-nilly in production; that would be a nightmare.

Also, in production the query is compiled into a DB prepared statement the first time, and after that queries, once validated, go directly to the prepared statement. This is no different from a hand-written HTTP endpoint + ORM.
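The allow-list mechanics are simple to picture. A sketch of the general pattern (not GraphJin's code; names are illustrative):

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
)

// allowList maps a hash of each query captured in dev to the name of
// the prepared statement it was compiled into at startup.
var allowList = map[string]string{}

func register(query, stmtName string) {
	allowList[hash(query)] = stmtName
}

// lookup is the production path: unknown or modified queries are
// rejected before anything reaches the database.
func lookup(query string) (string, error) {
	stmt, ok := allowList[hash(query)]
	if !ok {
		return "", errors.New("query not in allow list")
	}
	return stmt, nil
}

func hash(q string) string {
	sum := sha256.Sum256([]byte(q))
	return hex.EncodeToString(sum[:])
}

func main() {
	register(`query { posts { id title } }`, "stmt_posts")
	fmt.Println(lookup(`query { posts { id title } }`))
}
```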

Additionally, if using the standalone service is not for you, then you can use GraphJin as sort of an ORM within your own HTTP handlers.

The question around why use it is complex. Some reasons: even nested queries and mutations are compiled to a single SQL query. Knowledge of indexes is used to write a fast query. There's support for things like joining against Postgres array columns, JSON columns, fast cursor pagination, graph queries using recursive CTEs, full-text search, and a lot more out of the box, and it will work on Postgres, MySQL, CockroachDB, YugabyteDB, and I'm working on more.

> GraphJin DOES NOT hook GraphQL straight to a DB. It is a backend service which in DEV mode saves queries to an allow list, and in production only queries from this list are allowed to run. GraphQL does not mean clients are allowed to change the query willy-nilly in production; that would be a nightmare.

That's "straight to a DB" by my definition. But I agree that saved/allowlisted queries are a key problem that needs to be solved in any prod environment, regardless of the tooling you are using.

I'm not saying there isn't a use case for it, but I think there is a better solution 100% of the time. I am clearly not the target audience for this though, so my opinion is kind of irrelevant.

In my world GraphQL is a tool for aggregating microservices into a supergraph, when you are at a point where a monolithic DB (or a couple of them) with data from lots of different domains no longer exists. In that sort of environment, which is an extremely common use case for GQL, doing this to expose your domain data in a graph breaks microservice isolation patterns and directly ties your domain contracts to internal data models; both of which are almost always bad long-term.

If the problem you are trying to solve is "spend less time building HTTP endpoints for CRUD apps", then something like GraphJin is a win, but I'd argue it's not a pattern you'll want to use forever. If you are using GraphQL to aggregate cross-domain services in a large engineering organization, this is a bad idea.

I understand that sentiment, since I was similar before, but I've come to realise that it's little different from slapping an ActiveRecord-style ORM and framework on top of it. The difference is where you encode access rules.

With frameworks they're in code; with GraphQL adapters, they're declaratively managed (idk if this project does it, only talking conceptually, but Hasura does it). This is kinda better when you think about it, because it's easier to simulate/validate the ACL. And on top of that, it's possible to build no-code interfaces on top of it so those rules can be managed by anyone in an organisation. Add hooks and you can add complex business logic alongside these access rules, without encumbering the business logic with ACLs.

I realise this is a bit idealistic, but I don't think it's an unachievable goal with the current tech we have out there.

With all that said... Even though I sound like a proponent of this now, I'd still be a bit nervous and on the fence about having this in production.

When I run PostGraphile, I have it behind another backend because of this concern.

It does spare you from having to write SQL, but instead you have to write the GQL documents. It also spares me from having to come up with pagination SQL, as PostGraphile handles that, along with text searches.

PostGraphile also lets you selectively expose which tables / fields should be available to a GQL client, but if PostGraphile itself can be attacked, then yes, an FE client could get to the underlying DB.

What does this "middle end" do? Does it proxy to PostGraphile, or does it do something with the query beforehand? Or does it not proxy it at all?

Can be a mix of both or a straight proxy to what the client is allowed to access. For example, if you need to perform multiple operations on the same resource in a single call, the middle tier would do those multi-operations instead of the FE client.

You're pretty much having the middle tier act as the business logic layer as the FE client(s) doesn't have to worry about calling a sequence of operations to do a task, and instead just have to call a single operation on the middle tier instead.
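A sketch of that middle tier in Go: the client calls one operation, and the tier performs the multi-step sequence against the backend (stubbed here; all names are illustrative) so the frontend never has to orchestrate it:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// backend stands in for whatever the middle tier calls
// (PostGraphile, REST services, etc.).
type backend interface {
	CreateOrder(userID int) (orderID int, err error)
	ReserveStock(orderID int) error
	SendReceipt(orderID int) error
}

// placeOrder is the single operation the client sees; the sequencing
// lives here, in the business-logic layer.
func placeOrder(b backend, userID int) ([]byte, error) {
	orderID, err := b.CreateOrder(userID)
	if err != nil {
		return nil, err
	}
	if err := b.ReserveStock(orderID); err != nil {
		return nil, err
	}
	if err := b.SendReceipt(orderID); err != nil {
		return nil, err
	}
	// One response back to the client for the whole sequence.
	return json.Marshal(map[string]int{"order_id": orderID})
}

// stub backend for demonstration.
type stub struct{}

func (stub) CreateOrder(int) (int, error) { return 1001, nil }
func (stub) ReserveStock(int) error       { return nil }
func (stub) SendReceipt(int) error        { return nil }

func main() {
	out, _ := placeOrder(stub{}, 7)
	fmt.Println(string(out))
}
```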

Ah alright. So you do complex input transformation (for example) as part of the business logic in the middle tier, and then forward it to the DB? Does that mean you parse the query in the middle end too, or just parts of it? (I'm not that knowledgeable about GQL, so sorry if the question makes no sense.)

We are soon going to be evaluating a similar idea at work, so I'm just curious how it works. If it's lengthy you don't have to go into too much detail if it's a bother to explain!

It's called a "BFF" (Backend For Frontend) https://blog.bitsrc.io/bff-pattern-backend-for-frontend-an-i...

It's a really nice pattern particularly for multi-client applications consuming the same resources. Instead of trying to make a single REST api work for desktop, mobile, and web, you can have a BFF for each which all share the same GQL definitions and data access patterns, but can have platform specific routing/handling/auth logic.

The middle tier does not have to be a GQL server; it can be REST or some other protocol like gRPC. The middle tier has a GQL client that calls the PostGraphile backend.

Yes to your question overall.

As another reply states, it's exactly a backend for a frontend.

I dislike working with closed-minded folks for reasons such as this. Where's the danger in exposing GraphQL to a read-only replica containing non-sensitive data?

GraphJin takes security pretty seriously. The way we think of it, GraphQL is a DSL for building APIs. The queries you issue in DEV are saved to an allow list, and only those queries are compiled in production. Any other queries (or changes to existing queries) from the client are ignored. So you can use it to expose your primary DB. It's really no different from writing code to build an HTTP endpoint, except of course here you only write GraphQL.

Data Exfiltration As A Service

I thought that was an Azure trademark…

But really, with proper authentication in place, and preferably built into a GraphQL-enabled datastore, a GraphQL server can be more secure than your average self-implemented HTTP API.

The latter can easily leak through e.g. haphazard app-level joins. Meanwhile, the GraphQL server can secure things at object level, much like if people actually integrated their frontend authentication all the way to their (No)SQL server.

That's honestly not my biggest issue with it, but it's also a problem (a solvable one, though).

If you're looking for something like GraphJin, PostGraphile or Hasura but with less boilerplate and complexity, more end-to-end typesafe approach and optimistic updates, check out Thin Backend https://thin.dev/ (https://github.com/digitallyinduced/thin-backend)

Thin Backend takes a bit more of a high-level approach to database operations than services like GraphJin, but solves fundamentally the same problem. Doing things in a more structured way also allows us to do things like optimistic updates by default, which require manual work with GraphQL tools.

To see some code examples, here's a small example project done with thin-backend: https://github.com/digitallyinduced/thin-backend-todo-app It's running on Vercel here: https://thin-backend-todo-app.vercel.app/

If you're interested in what building a small demo app with Thin looks like, check out this video https://youtu.be/-jj19fpkd2c

Happy you're sharing this on here, glad to have more options on the table. However, I will say GraphJin is not complex to use, nor does it have any boilerplate. It's feature-rich and hence has knobs to help you tune it to your needs. However, its defaults are well thought out and should work out of the box in most cases; using it as a library in your own code takes only a few lines.



> less boilerplate and complexity

Kind of lame to pitch this in the comments on GraphJin. Make your own post and try to get on the front page on your own merits without bagging on this.

I tend to disagree. A lot of discovery happens on HN.

Many posts will have alternatives posted in the comments. Often the creator is the best to provide links and describe the differences.

My alternative is a generalized DSL-to-code framework powered by CUE. You can write directly in the output and later regen, because it uses diff3. Generate all the things, in any language!


We have a couple of users recently who are switching from different GraphQL services, so I expect this to be relevant for people checking out GraphJin :)

Thin was on the Frontpage of HN around a month ago: https://news.ycombinator.com/item?id=31164799

Author of GraphJin here, thanks for all the comments. It's been a labour of love for me over several years. Today it's so useful to me and others that I'm probably going to work on it forever. GraphJin is used as the backend for my startup https://42papers.com, a site to discover and read top trending research papers in CS, DL, ML, and other fields.

Main question, how do you deal with authorization? PostgreSQL Row Level Auth is horrible to use when you scale up compared to implementing auth on the app level. Both Supabase and Hasura suffer from this problem. The moment your data auth needs become slightly complex, you end up with a mess of calling out to external functions and ad-hoc SQL functions in the database.

> ad-hoc SQL functions in the database

If you use tools like Sqitch, Metagration or any other suitable migration management system then functions in your database stop being scary.

Sqitch: https://sqitch.org/ Metagration: https://github.com/michelp/metagration

> PostgreSQL Row Level Auth is horrible to use when you scale up compared to implementing auth on the app level.

In what way?

When you have RBAC based on multiple different conditions, e.g. "does user have XYZ?" you will have to architect your entire app around putting these claims into JWT so the GraphQL engine can properly map those onto RLS. In a traditional app auth is on the app level and defined in code, not a mix of code, SQL, JWT claims and Hasura/Supabase configs.

> When you have RBAC based on multiple different conditions, e.g. "does user have XYZ?" you will have to architect your entire app around putting these claims into JWT so the GraphQL engine can properly map those onto RLS.

If it's RBAC, why am I not just creating per-user Postgres roles inheriting from appropriate roles for the relevant features?

> In a traditional app auth is on the app level and defined in code, not a mix of code, SQL, JWT claims and Hasura/Supabase configs.

Doing authz via DB-side permissions attached to per user DB accounts is an equally, and more well-established, traditional app approach.

I just wrote an article on this (adding custom claims to the JWT) for Supabase projects: https://dev.to/supabase/supabase-custom-claims-34l2

You don't necessarily have to put those claims into a JWT (other than the user ID). You can just store the "XYZ" that the user has into a table.

That is definitely true (and probably easier than doing custom claims), but the reason I use claims is that it saves doing a join against that extra table inside your RLS policies, which can sometimes have a big performance impact. Those claims are also available on the client side, too, saving a round trip to the server if you need to check the claims.
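The trade-off sketched in Go terms: a claim check reads from the already-verified token payload, while the table approach costs a lookup (or a join inside the RLS policy) per request. Types and names here are illustrative; real JWT verification is omitted:

```go
package main

import "fmt"

// Claims as they appear in an already-verified JWT payload.
type claims map[string]any

// hasClaimXYZ answers "does user have XYZ?" from the token itself:
// no extra query, but the claim is stale until the token is reissued.
func hasClaimXYZ(c claims) bool {
	v, ok := c["xyz"].(bool)
	return ok && v
}

// userFlags stands in for the extra table: always fresh, but one more
// lookup per request (and one more join inside an RLS policy).
var userFlags = map[int]bool{42: true}

func hasFlagXYZ(userID int) bool {
	return userFlags[userID]
}

func main() {
	fmt.Println(hasClaimXYZ(claims{"xyz": true}), hasFlagXYZ(42))
}
```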

I built something similar to this for SQLite, as Datasette plugin: https://datasette.io/plugins/datasette-graphql

You can try a demo here: https://datasette-graphql-demo.datasette.io/graphql?query=%0...

I have never understood why people adopted graphql.

Despite years of asking people to explain its use, any example cases given can always be resolved with regular old http/REST style apis.

You still have to write sql queries to back your graphql calls anyway, so why are we spending extra obfuscation and effort instead of just getting the work done.

Facebook has so corrupted the engineering environment over the years that if a company recruiter contacts me and I find they're using any FB tech other than React, it's an instant no. I've walked over that luau bbq pit enough. React only gets the pass because it's not specifically bad tech, and it's too ubiquitous.

I doubt this will change your mind or even pique your interest, but - Look into how PostGraphile and pg-graphql combine an entire request into a single SQL statement. Something that REST is not well-suited for and requires multiple requests to accomplish.

I won't comment on GraphQL itself. But GraphJin builds an internal graph of your DB relationships, etc. and uses that to write a single efficient query. Even if you have a deeply nested query, insert, or update, it will only result in a single efficient SQL query. Additionally, we also support GraphQL subscriptions out of the box, and soon defer and async as well, allowing you to build live-updating apps using the same GraphQL queries.

Wonder how this compares with PostGraphile, which I think does the same thing and is pretty well-established.


For one thing, GraphJin alleges to support MySQL and YugabyteDB: https://github.com/dosco/graphjin#features

Also CockroachDB; it just needs to get in officially.

Another parent mentions the lack of support in GraphJin for cursor pagination. That's something that Graphile does very well.

It's had support for cursor pagination for years now. You can even have multiple cursors per query.
