GraphQL Query Rewriter (github.com)
65 points by chanind 31 days ago | hide | past | web | favorite | 23 comments

I would personally caution against using an implicit rewriter like this. It is implicit in the sense that once you change your schema, there's no documentation about the deprecated queries that are still supported. Yes, old queries will still work, but people who stumble upon them will be very confused as to why they are working, since the schema will say otherwise. Tools like IDE auto-completion for queries[1] or the graphical interface GraphiQL[2] will also ignore these rewrite capabilities and so will not help with writing, editing or running the old-but-still-supported rewritten queries.

Instead of that, I'd personally recommend either sticking with your old schema if the change is as superficial as the one mentioned on the project's README (changing the type of the `userById(id)` field parameter from `String!` to `ID!`), or shamelessly embracing versioning your fields. In the example case mentioned on the README, that could mean adding a @deprecated directive on the `userById(id: String!)` field, and then adding a new `userByIdV2(id: ID!)`. Users of the new field can alias it to a friendlier versionless name, like:

  query {
    user: userByIdV2(id: 123) {
      # ...selected fields
    }
  }

This way, the changes on the schema are much more explicit: users can still use tools like GraphiQL or schema-aware text editor plugins to write their queries, while receiving feedback about their use of deprecated fields, and what they can do about them :)

[1] Like IntelliJ IDEA's GraphQL plugin: https://plugins.jetbrains.com/plugin/8097-js-graphql

[2] https://github.com/graphql/graphiql

I wouldn't necessarily recommend using query rewriting for public graphql apis, but for internal apis or private apis to power mobile apps or web apps it should be fine. Just make sure you add tests for the old queries to ensure they keep working. Once you see the deprecated queries stop being used as users update their clients you can drop the rewriting entirely too.
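
A minimal sketch of such a regression test, in Python for brevity. `rewrite_legacy_query` is a hypothetical regex-based stand-in for a rewriter (a real one would work on the parsed query, not on raw text), the `name` field is invented, and the `String!` to `ID!` change follows the README example:

```python
import re


def rewrite_legacy_query(query: str) -> str:
    """Toy rewrite: retype the variable passed to userById(id: ...) as ID!."""
    # Find the variable name used as the `id` argument of userById.
    match = re.search(r"userById\(\s*id:\s*\$(\w+)\s*\)", query)
    if match is None:
        return query  # nothing to rewrite
    var = match.group(1)
    # Retype only that variable's declaration.
    return re.sub(rf"\${var}\s*:\s*String!", f"${var}: ID!", query)


# Hypothetical queries: what old clients still send vs. what the schema expects.
OLD_QUERY = "query getUser($id: String!) { userById(id: $id) { name } }"
NEW_QUERY = "query getUser($id: ID!) { userById(id: $id) { name } }"


def test_legacy_query_is_rewritten():
    assert rewrite_legacy_query(OLD_QUERY) == NEW_QUERY


def test_current_query_is_untouched():
    assert rewrite_legacy_query(NEW_QUERY) == NEW_QUERY
```

Once the old query stops showing up in your logs, the rewrite rule and its test can be deleted together.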

That seems like a good approach for those cases, yeah :)

I guess the difference then boils down to whether you're OK with having a bunch of deprecated stuff in your schema (you can also remove those fields once they are not used anymore) and some "V2", "V3", etc. in your field names, or you prefer having a more pristine schema but paying some price in discoverability and tooling support for those deprecated features.

I'd reckon the vast majority of real-world use of GraphQL is bog-standard web-api apps. With this, you could deploy a single schema changeset which updates the server, client, and middleware; then remove the middleware in a week once the web client has been guaranteed to have been refreshed or reloaded. Best of both worlds really.

What I don't like about GraphQL is that it enables stringly typed queries. Why not enforce JSON instead?

Stringly typed queries are too hard to manipulate.

> Stringly typed queries are too hard to manipulate.

With params and fragments, the JSON would be too verbose, so you'd end up always building your queries programmatically, which would make sharing queries between GraphQL implementations and/or runtimes/languages harder.

The GraphQL language for queries and schemas adds a layer that describes what your runtime and/or compile time has to conform to, which is incredibly helpful for stacks with heterogeneous environments.

In the end, you can always parse and generate those full-text documents and manipulate them programmatically, as much as you want (just be cautious about performance). So the apparent fragility can be addressed in many ways, with the huge advantage of having common ground across all stacks.

With JSON, you get many benefits, such as syntax errors from the JSON parser. That's impossible with a stringly typed query.

When generating queries dynamically, handling JSON is easier than handling a stringly typed query. You don't need a library like graphql-tag to do that.

Instead of posting a string as the request body, you post a JSON request, which has been the standard way for a long time.

In general, a stringly typed GraphQL query is not a good choice to me.

I hate to break it to you, but JSON is a string too. i.e. There is no such thing as a "JSON value" in JavaScript, just a method that will convert a subset of JavaScript primitives to a JSON string encoding.

GraphQL is a well-defined syntax that can be validated by both client and server. An invalid query can be rejected with a syntax error.

Yes, it is harder to dynamically construct a GraphQL query on the client, but you shouldn't do that anyway. The language has directives and variables that can introduce dynamic elements.
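
For what it's worth, the usual GraphQL-over-HTTP convention already posts a JSON body: the query text stays a fixed string and the dynamic parts travel as variables. A rough sketch in Python, where the `userById` query, the `withName` variable and the `build_request_body` helper are illustrative assumptions rather than anything from the project:

```python
import json

# The query document is a constant; it is never assembled at runtime.
USER_QUERY = """
query getUser($id: ID!, $withName: Boolean!) {
  userById(id: $id) {
    id
    name @include(if: $withName)
  }
}
"""


def build_request_body(user_id: str, with_name: bool) -> str:
    """Serialize the request; only the variables change between calls."""
    payload = {
        "query": USER_QUERY,
        "variables": {"id": user_id, "withName": with_name},
    }
    # json.dumps returns a str: the JSON envelope is itself just text.
    return json.dumps(payload)


body = build_request_body("123", True)
```

The directive decides at execution time whether `name` is fetched, so the client never has to splice strings together.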

JSON here is used as a structured format, with clear specification.

A GraphQL query also has a specification, and you can encode it in JSON format, too.

Why not?

What do you mean by structured? How is GraphQL structured? Would you consider something like PHP's serialize() function to be structured?

   serialize(array('1' => 'elem 1', '2'=> 'elem 2', '3'=> 'elem 3'));
   // a:3:{i:1;s:6:"elem 1";i:2;s:6:"elem 2";i:3;s:6:"elem 3";}

There are a lot of advantages to creating a custom syntax. In your example what does `query: { user: { id: 1, name: "true", } }` yield? How do you represent fragments? How do you represent variables and directives?

You could implement conventions like `$ref`s in JSON Schema, but then you've ended up creating another custom syntax... just this time it's piggybacking on JSON instead of plain text.

parse the query, emit ast as json, done.

to me one of the important points of graphql is that it isn't json.

Can you give an example of what you mean?

I mean, instead of

query { user { id name } }

why not

query: { user: { id: true, name: true, } }
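
To make the comparison concrete, here is a small Python sketch that turns that JSON-style shape into the usual query text. `to_graphql` is a made-up helper, and it only handles plain nested fields marked `true`; arguments, variables, fragments and directives are not covered.

```python
def to_graphql(selection: dict) -> str:
    """Render a nested {field: True | {subfields}} dict as a selection set."""
    parts = []
    for name, value in selection.items():
        if value is True:
            # Leaf field: just the name.
            parts.append(name)
        elif isinstance(value, dict):
            # Nested selection: name followed by its own selection set.
            parts.append(f"{name} {to_graphql(value)}")
        else:
            raise ValueError(f"unsupported value for {name!r}: {value!r}")
    return "{ " + " ".join(parts) + " }"


doc = {"query": {"user": {"id": True, "name": True}}}
text = "query " + to_graphql(doc["query"])
```

Anything other than `true` or a nested object is rejected here, which is exactly where a JSON encoding would need extra conventions of its own.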

Generally you'd use a library with much stricter conventions than JSON to generate and parse GraphQL queries, rather than writing them as a string. If you're using JS/ES/TS, you can even configure a linter[0] to validate your queries against a schema as you're writing them.

Insofar as JSON is a type system, it's a weaker one than GraphQL, and would be a lot more verbose. Consider the query

    query($id: Int, $hasUser: Boolean!) {
      user(id: $id) @include(if: $hasUser) {
        name
      }
    }

It would be possible to represent this in JSON, but it would be a lot less elegant.

      {
        "query": {
          "args": {"id": "int", "hasUser": "boolean"},
          "fields": {
            "user": {
              "args": {"id": "$id"},
              "fields": {"name": true},
              "includeIf": "$hasUser"
            }
          }
        }
      }

The JSON still depends on stringly-typed values like "$hasUser", but it's much harder to read and has many more unnecessary characters.

EDIT: if you want to manipulate GraphQL queries, you need to parse them[1] and then look at the AST. This is not very well documented, so the best way of doing it is by experimenting with the values you've got, and then verifying your approach against graphql-js.

[0] https://github.com/apollographql/eslint-plugin-graphql

[1] https://github.com/apollographql/graphql-tag

I was originally skeptical, as GraphQL already comes with the `@deprecated` directive for this use case and it’s recommended for API evolution. Was actually delighted to see that this can let you make changes to your schema transparently without breaking clients!

Query rewriting is always scary to make use of, because it only takes like 2 rounds of rewrites (original | rewriter1 | rewriter2 | db) before all hope is lost

If you end up using query rewriting to upgrade client queries, you'll very quickly end up with divergent SQL schemas spread across many clients, with little proper way to identify them (client A module B was written targeting DB v0.1, but module C was looking at DB v0.2, and client B was only written when v1.0 released; A was upgraded by query rewriting from v0.1/0.2 to v0.5 and then v1.0, so the codebase still features ancient references, and applies two rounds of rewrites to find the final query output)

Basically you would have to prove that your rewrites are confluent in the sense of a term-rewriting system[1]. I'm not sure if the project here gives you any help in doing that, but it would be pretty cool if you could somehow prove the rewrites are confluent.
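
Short of a proof, you can at least test a weaker property empirically: that the rules give the same result in any order on the queries you care about. A rough Python sketch, where rewrite rules are modeled as plain string-to-string functions (an assumption for illustration, not how the project represents them) and the field names are invented. This is only a sanity check on sample inputs, not a confluence proof.

```python
from itertools import permutations
from typing import Callable, Iterable, Sequence

Rule = Callable[[str], str]


def rename_field(old: str, new: str) -> Rule:
    # Toy rule: textual rename of a field; a real rewriter works on the AST.
    return lambda query: query.replace(old, new)


def order_independent(rules: Sequence[Rule], samples: Iterable[str]) -> bool:
    """True if every ordering of the rules gives the same output per sample."""
    for query in samples:
        results = set()
        for ordering in permutations(rules):
            rewritten = query
            for rule in ordering:
                rewritten = rule(rewritten)
            results.add(rewritten)
        if len(results) > 1:
            return False
    return True


samples = ["query { user { fullName } }"]

# Independent renames: the order does not matter.
safe = [rename_field("user", "account"), rename_field("fullName", "name")]
# Overlapping renames: the result depends on which rule runs first.
clashing = [rename_field("fullName", "name"), rename_field("name", "displayName")]

safe_ok = order_independent(safe, samples)
clashing_ok = order_independent(clashing, samples)
```

The second pair is the kind of thing that bites after a couple of schema versions: the output of one rule becomes the input of the next.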


That only half-helps though, doesn’t it? It saves you from rewrite-order dependence, but doesn’t remove the requirement that one has to follow ten rewrites (patches?) to figure out what was actually expressed by a query in the codebase. Which takes it down from hell-mode to hard-mode difficulty.

Yeah that's true, and now that you mention it, maybe patch theory could help (in the sense of Pijul, Darcs, etc)

Though I don't know how much effort I would spend on trying to make this work.

The problem, I think, is that at least with patching, it's a run-once situation; once it's updated, you never have to see that code/binary again.

With query rewriting, it's a continual patch, which applies both to the old queries and to those old queries after they are modified; that is, two divergent codebases now exist, referring to different schemas, but unknowingly one codebase gets warped into a different universe where its query somehow works.

Both codebases continue to exist, and continue to be worked on. Divergent, yet somehow converging. Every client hit by query rewriting is another divergence.

The issue is that the codebase’s query in fact needs to be rewritten... once, and finally! If you only modify it in-stream, then the rewrite rule must exist until the end of time, or until the codebase is updated manually.

So really you want a way to patch codebases you probably don’t control.. but I think if you can solve that, you’ve solved APIs everywhere

So maybe what's really needed is an editor plugin for GraphQL that lets you auto-rewrite queries, or if the queries are generated by a library, then that library should do it.

The thing though is that you don’t actually want to rewrite at runtime; that's the hacky solution which ends up as layers of rewrites.

You want to rewrite at compile-time, rewriting the codebase itself, and save the changes to version control. As if a human were updating to a new schema. You don’t want to allow stale, unversioned schema-references littered around the codebase itself; being valid by the time it executes is fine and well functionally, but hell to follow.
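
As a rough illustration of rewriting the codebase once instead of rewriting in-stream, here is a Python sketch of a one-off codemod. Everything in it is hypothetical: the `migrate_sources` helper, the file names, and the `String!` to `ID!` rule borrowed from the README example. Sources are passed as an in-memory dict so the sketch stays self-contained; a real tool would read and write files and commit the result.

```python
import re
from typing import Dict

# One schema migration, expressed as (pattern, replacement) on query text.
# Hypothetical rule: the `$id` variable is now typed ID! instead of String!.
MIGRATION = (re.compile(r"\$id\s*:\s*String!"), "$id: ID!")


def migrate_sources(sources: Dict[str, str]) -> Dict[str, str]:
    """Return only the files whose contents changed, with the new contents."""
    pattern, replacement = MIGRATION
    changed = {}
    for path, text in sources.items():
        new_text = pattern.sub(replacement, text)
        if new_text != text:
            changed[path] = new_text
    return changed


sources = {
    "profile.js": "const Q = `query getUser($id: String!) { userById(id: $id) { name } }`;",
    "feed.js": "const Q = `query feed { posts { title } }`;",
}
changed = migrate_sources(sources)
```

Because the changed files are saved, a second run finds nothing to do, and the rule can be thrown away afterwards instead of living in a middleware forever.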

You just described healthcare data formats in a sentence.
