From REST to GraphQL (jacobwgillespie.com)
160 points by edoloughlin on Oct 10, 2015 | hide | past | favorite | 43 comments


It is kind of amazing to me that we've almost come full circle from the early days of database-driven, server-rendered pages. Back then, before ORMs, before even many templating solutions, rendering a page was basically two steps:

- Build usually one, but maybe a few, SQL queries for the database. These queries (as part of their WHERE, GROUP BY, etc. clauses) would handle aggregation, summation, and even role-based security, since the WHERE clause could constrain results by role based on the user's cookie.

- Render the page out by walking this SQL result.

It's been interesting to see the UI layer move out of server-rendered pages and into the client, and I do think that moving the web from "rendering an HTML document once" to "rendering a DOM in a render loop" is a step forward, not just in flexibility but conceptually. But the data access story is still wildly in flux, and the fact that we are now effectively building a query language and query optimizer that in practice will sit on top of SQL and the RDBMS optimizer makes me feel like this abstraction is definitely too thick. I hope the logical outcome of all this will be a comprehensive data system that ties these pieces together (schema, migration, modelling, performance, data binding, etc.).

I've hoped that this would happen for a while, but perhaps, as with most things in software engineering, a change in the surface syntax and notation (GraphQL) re-frames the problem in the heads of the right people, who will be both motivated and able to kill off some sacred cows. I think a data system that draws on the lessons of the RDBMS work of the last several decades and the distributed systems work of just the last decade, while also being in touch with the needs of modern applications, has yet to come along -- perhaps it has and I haven't seen it yet.


Another possible solution while sticking to REST: instead of embedding all the sub-objects, have an "expand" query param where you can list all the sub-objects/relationships you'd like returned (e.g. GET /playlists/ID?expand=tags,tracks). This way you can still do everything in one request but not bloat the response data structure for situations that don't need the sub-objects.
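For illustration, a minimal sketch of this expand parameter in Python (the handler, the playlist data, and the relation names here are invented, not from any real API):

```python
def get_playlist(playlist_id, expand=""):
    """Return a playlist dict, expanding only the requested relations."""
    playlist = {"id": playlist_id, "name": "Road Trip"}   # stand-in for a DB row
    requested = {name for name in expand.split(",") if name}
    if "tags" in requested:
        playlist["tags"] = ["rock", "summer"]             # would be a second query
    if "tracks" in requested:
        playlist["tracks"] = [{"title": "Track 1"}]       # would be a join
    return playlist

full = get_playlist(42, expand="tags,tracks")  # GET /playlists/42?expand=tags,tracks
lean = get_playlist(42)                        # GET /playlists/42
```

The lean response stays small for views that only need the playlist itself, while one request can still pull everything when asked.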


Yes, people do that all the time (e.g., off the top of my head, the Google Drive API).

GraphQL is the natural extension of that, but more flexible and composable. I can get a user's name and email, and their friends' names, at the cost of only one request and one round trip of latency.

You could change your REST API to do the same, but eventually you are reproducing the GraphQL equivalent.


This is how JSON API works (Yehuda Katz et al.): http://jsonapi.org/

`GET /articles?include=author,likes&fields[articles]=title,body&fields[people]=name`


Interestingly, this is what Facebook does with their REST Graph API: https://developers.facebook.com/docs/graph-api/using-graph-a...


GraphQL allows queries on those "expanded" sub-objects/relationships e.g.

  {
    playlist(id: 123) {
      tags,
      tracks(top: 5)
    }
  }


It's great that GraphQL's solved some of your issues, and you've encouraged me to look into it a bit more. One thing you should worry about, though: if at any point there's user-specific data in an HTTP response that doesn't have the user's ID in the URL (or in something pointed to by the 'Vary' header), it kills the cacheability of the request, whether you're using what Rails gives you or GraphQL or XML-RPC or whatever, and cacheability is pretty much the whole point of REST. It's not just REST pedantry; this completely destroys the usefulness of any cache past that point if it doesn't understand the internals of your API (as you seem to have realized at the end of the article). It might make sense to just not expose GraphQL to the client, so you can reason about the cacheability of your requests from the backend and keep making use of caching proxies and the client-side cache.


That's correct, you destroy HTTP-level caching.

(For one, your queries can be large, so they are POSTed anyway.)

Relay is a caching layer that understands GraphQL semantics, but you no longer have transparent HTTP proxies, browser caching, etc. Though HTTP proxies are dead thanks to HTTPS, and with modern browser APIs, you can do your own limited caching.

It's certainly a tradeoff around what you expect to give you the best performance gains: fewer roundtrips, or more caching by third parties.


> Though HTTP proxies are dead thanks to HTTPS

Well, they're not as broadly useful as without HTTPS, but you can still always MITM yourself to get a free transparent caching layer between the client and your end server.


I have "read Relay source" on my todo list, since that's something that Relay is intended to solve (using GraphQL schema introspection to understand the responses from the server and intelligently compose queries and cache the data). We may eventually be able to utilize Relay on the web, but I'm more interested in understanding how Relay works conceptually in the interest of porting the logic to native platforms (iOS / Android). But I agree, it makes caching more difficult, especially with caching proxies.


Do you have some way of sharing non-UI code between native mobile clients, e.g. Xamarin, RoboVM, or something JavaScript-based? React Native would be particularly appropriate here since you're using GraphQL and looking at Relay. Rewriting non-UI client code in two or more languages is unacceptable in my view, because it either encourages dumb clients, which is the opposite of what you want if you're using GraphQL, or leads to subtle discrepancies between platforms.


It may be worth looking into HTTP/2. While it may or may not be supported widely enough for production yet, it provides for things like server push, which should allow you to minimize network round trips while keeping components separately cacheable (and therefore not requiring you to do gymnastics with your API to keep your cache granularity down). That should be helpful both with and without GraphQL. Either way, I would recommend thinking about organizing your objects around how quickly they spoil for their given cache key (usually the URL) across multiple clients, since that ultimately determines where you can hit cache and where you need to make a new request, regardless of what libraries you're using.


If you needed this level of flexibility in your mobile apps, why not just download the datastore to the client and use something like SQLite, or a JSON query language on the client (similar to Mongo's; there are lots of options)?

In addition, in reading the thing that "pushed you over the edge"... I really don't see how REST has failed you here, in terms of how you are partitioning your data. I can't imagine that some kind of intelligent caching of data (no need to re-download playlist data to the client when it hasn't changed, etc.) wouldn't have been a more straightforward approach here.

I find the GraphQL stuff interesting, but your problems would have been better served by refactoring what you had rather than moving to a completely new API and query format, at the cost of sacrificing an easy-to-use interface for the internal and (maybe) external consumers of your API. I'm not convinced you won't have the same issues with GraphQL, and this could stand a bit more pragmatism.


This is exactly what I thought too. From what the author described, it seems that they were just using REST incorrectly. When a view wants tags from a playlist, the solution is not to include tags in every other request. You can simply include a query parameter instead:

    GET /playlists/ID?tags=true
This lets the server know to include the extra information for that request, keeping your original endpoint clean.

We don't know all the details, but from what the author wrote, it seems they were shortsighted to switch to a shiny new technology instead of looking for better ways to refactor what they already had.


Terrific writeup - thanks for sharing. IMHO this is the most important item: "We have yet to solve caching GraphQL responses on the client."

Mobile client-side caching and sync/refreshing (or client-side change push) is where users get the biggest gains in perceived speed. Fielding's REST handles this well, in my experience, though it takes some work to get the resource granularity correct, especially as a mobile app's feature needs change.

In my experience it's often the case that a mobile app can benefit from Fielding's REST (with HATEOAS) and from using resources that are mobile-specific and quite different from the out-of-the-box server-side models that map 1:1 to database rows. As an example, the JSON API team describes how to nest resources efficiently and cacheably, and also how to send HATEOAS links.

I'll be very interested to hear more about your next steps, including how you decide to do caching, puts, and any kind of pub/sub or equivalent. Thanks again!


Glad you enjoyed it! As someone pointed out below, what we were doing before is not true REST, and we may have been able to solve our problem in other ways (HATEOAS, "true REST", etc.) - we experimented with some HATEOAS-esque URI templates early on, similar to how GitHub's API implements them, though personally I feel like I need to study more to understand true REST and HATEOAS.

I will definitely follow up with our findings as we progress towards solving our remaining challenges.


IMO, the enabling library in the article is Facebook's DataLoader.

When I first saw GraphQL, my thought was that it was a nice abstraction for client-driven APIs. But it lacked an actual implementation and provided no answer to the "what about performance?" question.

DataLoader answers that question elegantly.
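The batching idea behind DataLoader can be sketched roughly like this in Python (a toy stand-in, not the real library, which batches via promises on the JavaScript event loop):

```python
class BatchLoader:
    def __init__(self, batch_fn):
        self.batch_fn = batch_fn   # takes a list of keys, returns matching values
        self.queue = []            # keys awaiting the next dispatch
        self.cache = {}            # per-request memoization
        self.batches_run = 0       # for demonstration only

    def load(self, key):
        # Instead of fetching immediately, remember the key and hand back
        # a thunk that resolves after dispatch() has run the batch.
        if key not in self.cache and key not in self.queue:
            self.queue.append(key)
        return lambda: self.cache[key]

    def dispatch(self):
        # One batched fetch (think SELECT ... WHERE id IN (...)) instead
        # of N point lookups.
        if self.queue:
            self.batches_run += 1
            for key, value in zip(self.queue, self.batch_fn(self.queue)):
                self.cache[key] = value
            self.queue = []

def fetch_users(ids):              # invented stand-in for the data layer
    return [{"id": i, "name": "user%d" % i} for i in ids]

loader = BatchLoader(fetch_users)
a, b, c = loader.load(1), loader.load(2), loader.load(1)
loader.dispatch()                  # a single batch serves all three load() calls
```

This is how naive per-field resolvers avoid the N+1 query problem: each resolver calls load(), and the loader coalesces everything into one backend round trip.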


Yes, once I stumbled across DataLoader for the first time, I was amazed that what I thought was a major issue for GraphQL had such a straightforward solution.


Hmm. From what I understand of GraphQL, it gives you:

- less boilerplate (no need to create GET REST endpoints and corresponding JS AJAX calls)

- consequently, more flexibility (just request more data in the client, no need to update server code)

- formalized query language (and apparently, the ability to check it?)

However, the idea of exposing your schema to arbitrary requests sounds much too dangerous to me. I see the advantages for really large applications, but I'll stick to REST with a 'format' parameter for the foreseeable future.


This seems to be a common misconception. GraphQL is just a protocol; query fields don't have to be tied to your schema at all, so you don't have to expose anything you don't want to, in the same way that you don't have to make every table accessible via REST.


While it doesn't have to be tied to your schema, it provides too much flexibility. I worked on an API that provided a data-graph style of way to fetch desired data. The problem quickly became that we were not able to predict the exact paths clients would request, which made it very difficult to tune performance. We didn't realize this early, but as API usage grew, we realized the flexibility was a mistake.

The article mentions pre-processing the graph to determine complexity, some caching, and some timeouts. This is not good enough and can/will be worked around. Inflexible APIs with (mostly) single, predictable paths to data provide a great way to tune performance, at the (IMO) negligible cost of extra HTTP requests.
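The complexity pre-processing mentioned here can be sketched as a coarse depth guard (a toy that just counts brace nesting; real servers analyze the parsed query AST and weight fields):

```python
def query_depth(query):
    """Deepest selection-set nesting in a GraphQL query string."""
    depth = max_depth = 0
    for ch in query:
        if ch == "{":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == "}":
            depth -= 1
    return max_depth

def check(query, limit=4):
    """Reject overly nested queries before execution."""
    if query_depth(query) > limit:
        raise ValueError("query too deep, rejecting before execution")
    return True

check("{ playlist(id: 123) { tags tracks(top: 5) } }")  # depth 2: accepted
```

A pathological query like `{ a { b { c { d { e } } } } }` would be rejected before it ever touches the data layer.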


Let me rephrase that: you are exposing a way to trivially traverse the entire object graph you expose through the API, putting you at risk of DoSing your own server when your friend Joe the frontend dev makes a small mistake in his query. You can obviously get the same result with REST, but usually you have to work a lot harder for it.


This is actually solvable. The queries that will be executed by an app can be analysed statically, and therefore are available as part of a build pipeline. So it's possible to create a whitelist of queries automatically, or even go one step further and give each query a short unique ID and just say "execute query foo with these parameters" rather than sending the actual body of the query over the network.
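A minimal sketch of that whitelisting idea in Python (the persist/execute helpers and the hashing scheme are invented for illustration of the approach, often called persisted queries):

```python
import hashlib

def persist(queries):
    """Build-time step: derive a short ID for each query the app can send."""
    return {hashlib.sha256(q.encode()).hexdigest()[:8]: q for q in queries}

WHITELIST = persist([
    "{ playlist(id: $id) { tags tracks(top: 5) } }",
])

def execute(query_id, variables):
    query = WHITELIST.get(query_id)
    if query is None:
        # Arbitrary query bodies from the network are simply rejected.
        raise PermissionError("unknown query id")
    return ("would execute", query, variables)  # stand-in for a GraphQL engine

known_id = next(iter(WHITELIST))
execute(known_id, {"id": 123})  # the client sends only the short ID
```

The client then transmits just the ID and variables, which both shrinks requests and closes the door on ad-hoc queries in production.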


When a solution requires this kind of hack for a task as simple as rate limiting, maybe the solution is a bad one to begin with.

With REST it is fairly easy to do rate limiting on endpoints.

With GraphQL, it is fairly easy to DDoS with a complex query that asks the server to fetch an insanely complex data structure. And there are no endpoints anymore.


Yeah, but at that point the quantity of magic will start to make me uncomfortable.


I quite like GraphQL, but only as a filter and composition layer in addition to a REST interface.

REST gives elegance to the non-GET operations, and it is only the caching, JSON size, and composition of multiple resources that can sometimes be a strain for some applications (or all applications, if the REST API is designed badly).

I'm probably going to investigate providing a common GraphQL query interface to my REST API. This would internally be an orchestration layer providing the filtering of properties as well as the composition of resources, and of course the instructions to that layer would be GraphQL queries. This would allow internally cacheable resources (perhaps via ESIs) and the filter would be applied in front of the cache layer.
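A toy sketch of such an orchestration layer in Python, with fetch_resource standing in for cacheable HTTP GETs against the REST API (all paths, names, and data here are invented):

```python
def fetch_resource(path):
    # Stand-in for an HTTP GET that a cache layer (or ESI) could serve.
    fake_backend = {
        "/playlists/1": {"id": 1, "name": "Mix", "owner": "ann", "private": True},
        "/playlists/1/tags": ["rock", "summer"],
    }
    return fake_backend[path]

def resolve(path, fields, relations=()):
    """Compose one response: filter properties, then pull in relations."""
    resource = fetch_resource(path)
    result = {f: resource[f] for f in fields}      # property filtering
    for rel in relations:                          # resource composition
        result[rel] = fetch_resource("%s/%s" % (path, rel))
    return result

trimmed = resolve("/playlists/1", fields=["id", "name"], relations=["tags"])
```

Because fetch_resource operates on whole resources, the cache stays resource-granular while the filter trims the payload after the cache hit, which is exactly the split described above.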

I'm not yet at the point where I think GraphQL replaces REST, instead it complements it.


It seems like a really common use case where

1) The service wants to keep its API general, and not client-aware

conflicts with

2) The client wants to assert a client-specific spec on the returned data

(e.g. the track tags, etc)

One of the other possible options would have been to keep the general API and then have a separate client-aware bespoke service that consumes it and serves as the intermediary. Back in the SOA days this was called "service composition". There are real tradeoffs there, though, in that you also want to avoid fully synchronous (nested) service-calling-service scenarios, because then latency and SLAs become the choking point. Caching doesn't fully solve that. More reactive architectures start to help at that point, where consumers are not necessarily waiting for the full response before they take action.
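The latency point can be illustrated with concurrent fan-out: an intermediary that calls its dependencies in parallel pays roughly for the slowest one rather than the sum (a toy sketch with invented service names):

```python
import asyncio

async def call_service(name, delay):
    await asyncio.sleep(delay)     # stand-in for a downstream HTTP call
    return {name: "data"}

async def compose():
    # Sequential awaits would cost the sum of the delays; gather() costs
    # roughly the max, keeping the intermediary's latency bounded.
    results = await asyncio.gather(
        call_service("tracks", 0.02),
        call_service("tags", 0.01),
    )
    merged = {}
    for piece in results:
        merged.update(piece)
    return merged

merged = asyncio.run(compose())
```

Even so, the composed response is only as fresh and as fast as its slowest dependency, which is why fully synchronous nesting becomes the choking point as the graph of services deepens.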


> Future Puzzles

> Looking forward, here are a few things we are currently looking to solve:

> Mutations (Writes)

I agree that this is a more compelling way to structure GET requests, but REST and GraphQL are not comparable until they are solving the same problem.

Writes are a huge piece of RESTful interfaces (PUT, POST, PATCH, DELETE). I'm happy Facebook is open sourcing their work, but it does a disservice to GraphQL to compare it with REST.

Perhaps the title would have been better as "From REST reads to GraphQL."


> Writes are a huge piece of RESTful interfaces (PUT, POST, PATCH, DELETE)

GraphQL handles mutations fine. This guy just hasn't implemented them yet.


Indeed. He implemented a read-only system and wrote it up (very well I might add). Since he hasn't used mutations yet, he didn't feel ready to write about them.


This sounds like incompetent back-end design in the initial REST API. I never thought GraphQL would be more performant than a REST API plus optimized queries. I don't have any experience with the current set of GraphQL libraries, but I've assumed that GraphQL is a solution to a completely different set of issues, possibly with a performance penalty.


> This sounds like incompetent back-end design in the initial REST API.

It doesn't even qualify as REST. They've got hard-coded URI hierarchies and IDs everywhere. That's the opposite of REST.

REST APIs must be hypertext-driven: http://roy.gbiv.com/untangled/2008/rest-apis-must-be-hyperte...


Yep, you are right - I think our previous implementation would be better described as a "typical HTTP Rails JSON API." As I briefly alluded to in the article, we actually had a hypertext-driven API initially, using RFC 6570 URI templates, but eventually had to change designs to increase performance for mobile apps. Payloads needed to be smaller, and we couldn't afford the extra requests necessary to traverse the endpoints.

But yes, I agree, what is commonly called "REST" isn't usually.


So by increasing the payload you've improved the performance of the mobile app by reducing the number of requests you had to make?


I'm as cautious about GraphQL as one can get, but I believe the greatest failure of REST is that it is unable to set clear rules and boundaries as to what is REST and what is NOT REST. I want to see a canonical implementation of what an ideal REST architecture is. Never saw one. GraphQL has a clear spec, which makes implementation easy. REST is at best a set of vague ideas, and I understand why a lot of developers do not like that, because I don't either.


> I believe the greatest failure of REST is that it is unable to set clear rules and boundaries as to what is REST and what is NOT REST.

It hasn't failed at this. The principles are very unambiguous. The problem is that people don't actually read the source material; they just follow crappy tutorials and ape other systems without understanding what REST is. Then they go on to build things that aren't REST, they write crappy tutorials themselves, and everybody learning from them and aping their systems fails just as badly. The problem is not clarity of communication; it's bad PR.

> I want to see a canonical implementation of what an ideal REST architecture is. Never saw one.

You're using a pretty good example right now. The World-Wide Web is an implementation of the REST architecture. It's not perfect, but it's the archetype REST was born from.

> REST is at best a set of vague ideas, and I understand why a lot of developers do not like that, because I don't either.

Look, it's really not difficult to find the concrete, unambiguous description of REST. It's right here:

http://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch...

All you had to do was Google it or look at the Wikipedia page. There is nothing vague about it whatsoever. The problem is not that REST is vague; the problem is that people just don't bother reading the source material.


Isn't it reinventing the SPARQL endpoint?


Nice. I'll be very interested to hear more about your next steps.


I've heard some rumblings about adding GraphQL as a feature to Elixir / Phoenix. Anyone know more about this?


I am a member of the Phoenix team. We are not adding GraphQL to Phoenix. What Chris McCord said in his keynote is that some of the ideas behind GraphQL/Falcor, like co-locating your query with your view, are also useful on the server side, because they make the code more maintainable (for example, you no longer need to write a big query in the controller with knowledge of all the view pieces). It also makes views easier to cache, easier to detect when the cache has expired, easier to compose, and so on.

How that will affect Phoenix is yet to be seen but developers shouldn't expect a big departure. In any case, if folks want to build their own package that provides GraphQL/Falcor server and clients on top of Phoenix, by all means, go ahead and have fun!


Would you please share a link to said keynote? I've been playing around with Elixir and plan on diving into Phoenix as well, so it sounds like it'd be interesting.

I've seen there's a library for creating GraphQL services with Phoenix, and I was planning on using that as well.


ElixirConf 2015 videos are not yet on Confreaks. Keep your eyes peeled.


The mentioned talk "What's Next for Phoenix" by Chris McCord is now online https://youtu.be/-7Q3bD4qSVE?list=PLE7tQUdRKcyZb7L66A9JvYWu_...



