Seriously. For internal stuff you want to be as specific as humanly possible. You want to optimise the living fuck out of that hot path thousand-times-a-second query, and build an entirely separate service to handle that particular URL if necessary.
For a public API, why would you ever encourage submission of arbitrary queries? They will destroy your database servers.
Note that SQL vs. NoSQL is not even a competition here. An RDBMS can handle arbitrary queries much better than any brute-force map-reduce system, and doesn't require spending thousands of man-hours writing your own query planner. The difference is that it doesn't (always) automatically scale horizontally.
So from where I'm sitting, GraphQL is nothing more than an invitation to maintenance and performance headaches, predicated on the idea that everyone scales infinitely horizontally and can brute-force every query.
Personally I prefer being able to serve a thousand queries a second from a single server managing a 1TB database with a 50GB working set, with a latency under a second even when (looking at the raw query) it 'should' touch more rows than there are atoms in the universe.
In light of the replies, I should express some surprise that building REST endpoints is expensive. If you can execute at runtime an automatic combination of various other endpoints to produce a result, is it not equally simple to generate the code necessary to do the same? That can at least be examined and maintained more easily than dealing with the massive array of possible variations to known-working queries which comes with modifying the rules driving the GraphQL evaluator...?
Note that GraphQL does not allow the specification of arbitrary queries - much like REST does not allow access to arbitrary resources. The server defines what queries are available, and the user can choose to request a selection of them and to pluck data from them (or follow links - simply a JOIN).
In other words, GraphQL lets you combine a subset of pre-defined queries in one go.
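To make that concrete, here's a minimal sketch (hypothetical, nothing like a real GraphQL executor) of the point: the server defines which root queries and fields exist, and the client can only select from them, never invent its own.

```python
# The server's "schema": the only query and fields clients may request.
ALLOWED_FIELDS = {"user": {"id", "name", "email"}}

def resolve_user(user_id):
    # Stand-in for a real data source lookup.
    return {"id": user_id, "name": "Ada", "email": "ada@example.com"}

def execute(query_name, requested_fields, **args):
    allowed = ALLOWED_FIELDS.get(query_name)
    if allowed is None:
        raise ValueError(f"unknown query: {query_name}")
    unknown = set(requested_fields) - allowed
    if unknown:
        raise ValueError(f"unknown fields: {unknown}")
    full = resolve_user(**args)
    # The client plucks a subset of the pre-defined fields.
    return {f: full[f] for f in requested_fields}

print(execute("user", ["name", "email"], user_id=1))
# → {'name': 'Ada', 'email': 'ada@example.com'}
```

A request for a field the server never defined fails up front, which is exactly why this is closer to "a menu of queries" than to handing the client raw SQL.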
AFAIK Facebook created GraphQL specifically for their gateway API - a service used as a facade between the internal service mes(s/h) and their clients - not for the internal services themselves. That's why things like schema stitching didn't come from FB - they weren't using it in that context.
I may be misunderstanding something here, but when 'general' queries are combined externally a lot more work is done than is necessary. Which may be fine for small intermediate sets. But treating it as any kind of general solution is silly.
That said, people do similarly stupid things within individual codebases running in individual services to a single database, so as usual it likely comes down to how the tool is used rather than how it can be abused. Still, spreading these things across services and processes looks like it only makes abuse easier.
Let's say you call UserService::batchGetUsers(userIds) to get a list of users from a service backed by a MySQL DB, call WidgetService::widgetsForUser(userId) to get a list of a user's widgets from a separate service backed by a Redis cache and another MySQL DB, and return them. What's the problem here? Lack of transactions? Unnecessary fields sent down the wire? Something else?
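For reference, a sketch of the aggregation being described, with the two services faked as local functions (the names `batch_get_users`/`widgets_for_user` mirror the hypothetical `UserService`/`WidgetService` calls above; the real ones would be RPCs backed by MySQL and Redis):

```python
def batch_get_users(user_ids):
    # Stand-in for UserService::batchGetUsers - one batched call.
    return [{"id": uid, "name": f"user-{uid}"} for uid in user_ids]

def widgets_for_user(user_id):
    # Stand-in for WidgetService::widgetsForUser - one call per user.
    return [{"owner": user_id, "widget": w} for w in ("a", "b")]

def users_with_widgets(user_ids):
    users = batch_get_users(user_ids)
    for user in users:
        user["widgets"] = widgets_for_user(user["id"])
    return users
```

Note the shape of the question: the user fetch is batched, but the widget fetch fans out per user, and nothing ties the two reads to one consistent snapshot - which is presumably what "lack of transactions" is getting at.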
The parent author holds some very stubborn beliefs about how systems are built (his/her way is the correct way!), which is great for discussion, but probably not the best example of how to actually build big systems.
>avoid having to write a custom endpoint for every single REST query
So what are you saving, really? Could you better implement GraphQL-style functionality as a client-side wrapper for traditional REST APIs? Then you could keep the traditional tooling as well as simple query-join semantics for client devs.
Of course we're very concerned about the performance drawbacks and are trying to plan for them. We don't expect it to be web scale and aren't replacing our REST APIs that serve the public. Even so we expect to need to be smart about calculating query complexity.
So while most of the criticisms of GraphQL on this page do apply, for us the net value is looking quite positive.
One easy way to fix this is by using GraphQL instead of REST. Now the frontend teams have a way to get the data they need and are thus unblocked.
I'm just too used to being the person who then has to make the server deal with this particular use case (meaning a million possible variations on the same query) run a thousand times faster :P
So if it's understood at all stakeholder levels that there are tradeoffs, it could be a good tool. If. >_>
Sure, if you try to shoehorn an API that does not fit the database, it can be challenging, but you can always start with the simple case, doing multiple queries to the DB for one GraphQL request and only later see what is actually used and could benefit from optimizing.
You have more opportunities to optimize. With REST, if a client needs to list some resource and then access all of them one by one, it is going to be n+1 requests and you can't do anything on the back end to change that. With GraphQL you can look at the query holistically and optimize as needed.
All in all it feels to me that GraphQL gives the client the ability to better communicate the intent of what they are trying to do. Declarative over imperative.
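The n+1-requests point is easy to see with a toy client (in-memory data, hypothetical API names - a real REST client would be making network round trips):

```python
POSTS = {1: {"title": "a"}, 2: {"title": "b"}, 3: {"title": "c"}}

class RestClient:
    """Counts round trips for a list-then-fetch-each access pattern."""
    def __init__(self):
        self.requests = 0

    def list_post_ids(self):
        self.requests += 1
        return list(POSTS)

    def get_post(self, post_id):
        self.requests += 1
        return POSTS[post_id]

rest = RestClient()
posts = [rest.get_post(pid) for pid in rest.list_post_ids()]
print(rest.requests)  # → 4, i.e. 1 + n with n = 3
```

A GraphQL-style endpoint can serve the same intent in one request, because the server sees the whole query up front and can decide how to satisfy it.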
But this one thing is so useful for almost everyone that for internal APIs using GraphQL is usually a no-brainer.
You actually don't want to be as performant as possible for internal APIs. There is a performance-flexibility trade-off involved and GraphQL lets you choose a different point on the Pareto frontier than maximum performance.
An API is like a promise.
In the best case, people decide to use your API and build on it. Then, if for whatever reason and in whatever way, you break your API, you force everyone affected to reimplement at least to some degree.
With GraphQL you’ll be promising the sun the moon and the stars if you aren’t very careful. (Even if you are very careful you’re still promising a lot, though hopefully the available tools will help you out.)
With GraphQL you put yourself in a very tough position of either keeping big promises or breaking big promises.
Most of the time you are going to be better off keeping small promises.
For a front-end/public API, parsing a query against a queryable graph enables ad-hoc-ness and co-location: a lot of manually written controllers are replaced with far fewer resolvers, which also removes a huge amount of accidental complexity.
Say there are 10 different pages that need to read the 'Article' resource: on the index page you need the title and the summary, on the details and edit pages you need the content, and on the details page you also need the comments.
The thing is, all of the pages read 'Article' - they need it - but each one cares about only some parts of it, and some reads also require extra, possibly cascaded, joins with other resources.
In most medium-sized web apps people will tend to write a bunch of different APIs to fulfill these needs. It's grunt work and hardly consistent, because they're duplicating things. Or, even worse, they will invent their own queries - a half-baked GraphQL everywhere - which is inconsistent and sometimes buggy.
Our application has an estimated 3000 internal APIs as of last year, and it's becoming harder and harder to even find which API to use because there are so many nuances. Almost half are doing different kinds of reads. If we can put everything in a queryable graph (ironically, almost all businesses are graphs, yet very few people explicitly treat them as graphs), there will be far fewer APIs and they will be easier to understand.
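The "one resource, many views" idea can be sketched in a few lines (the article fields and page names here are hypothetical, mirroring the index/details/edit pages above):

```python
# One canonical 'Article' read...
ARTICLE = {"title": "t", "summary": "s", "content": "c", "comments": ["hi"]}

# ...and per-page field selections instead of per-page endpoints.
PAGE_FIELDS = {
    "index":   ["title", "summary"],
    "edit":    ["title", "content"],
    "details": ["title", "content", "comments"],
}

def view(page):
    return {f: ARTICLE[f] for f in PAGE_FIELDS[page]}
```

Each page declares what it needs from the same graph, rather than each page getting its own hand-rolled API.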
Also, I fail to see how GraphQL can simplify an API: GraphQL operates on the actual APIs, so if you don't understand those, you'll probably not understand the GraphQL version of them either.
With that said, I'm not a big fan of GraphQL, but this particular trade-off seems like a win over REST.
And on the back end, after the query is parsed it is split into a bunch of requests, depending on your data source.
If the data source is in-memory, good - everything's done.
If the data source is an RDBMS, it would naively result in an n+1 query. However, people almost always use a data loader with GraphQL, which batches the lookups and converts the n+1 query into two queries. REST APIs usually only care about a single kind of resource per API, so if there are cascaded joins (e.g. get the user and its posts and comments and comments' comments), there will be fewer requests overall if you think in terms of amortization.
If the data source is another REST API and there are batch APIs for the resources, a similar approach to the RDBMS case can be taken.
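The data-loader trick above can be sketched like this (a deliberately simplified, synchronous version - real data loaders such as Facebook's DataLoader batch per event-loop tick and cache per request; the fetch functions and query counter here are fakes):

```python
QUERIES = {"count": 0}

def fetch_posts(post_ids):
    QUERIES["count"] += 1  # query 1: all requested posts at once
    return [{"id": pid} for pid in post_ids]

def batch_fetch_comments(post_ids):
    QUERIES["count"] += 1  # query 2: comments for ALL posts in one go
    return {pid: [f"comment-for-{pid}"] for pid in post_ids}

def posts_with_comments(post_ids):
    posts = fetch_posts(post_ids)
    # Instead of one comment query per post (n+1 total), batch them.
    comments = batch_fetch_comments([p["id"] for p in posts])
    for p in posts:
        p["comments"] = comments[p["id"]]
    return posts

posts_with_comments([1, 2, 3])
print(QUERIES["count"])  # → 2 queries total, instead of 1 + 3
```

The resolver code still *looks* like it fetches comments per post; the batching happens underneath, which is the whole appeal.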
E.g. I want all of my posts and their respective comments and the users making those comments. You can narrow things down a lot by specifying fields: only give me the name and avatars of the commenters. With REST I've done this in the past with N API calls: give me my posts, iterate over the IDs of those posts to make N API calls for their comments, and another N calls for the comment user data I need.
At that point, a custom API endpoint could trim down the network calls a lot, and this is where GraphQL shines in comparison. You write resolvers for user, comment, and post on the backend, and GraphQL server frameworks can piece those together for you.
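A rough sketch of that resolver composition, with the posts/comments/users data faked in memory (a real GraphQL framework would walk the client's query and invoke these resolvers itself; here the traversal is written out by hand):

```python
USERS = {1: {"name": "Ada", "avatar": "a.png", "bio": "long private text"}}
COMMENTS = {10: [{"id": 100, "author_id": 1}]}
POSTS = [{"id": 10, "title": "hello"}]

def resolve_posts():
    return POSTS

def resolve_comments(post):
    return COMMENTS.get(post["id"], [])

def resolve_commenter(comment, fields):
    user = USERS[comment["author_id"]]
    # Only the requested fields (name, avatar) go down the wire - not bio.
    return {f: user[f] for f in fields}

result = [
    {
        "title": p["title"],
        "comments": [
            {"commenter": resolve_commenter(c, ["name", "avatar"])}
            for c in resolve_comments(p)
        ],
    }
    for p in resolve_posts()
]
```

One round trip, three small resolvers, and the field selection keeps unrequested data out of the response.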
Also, if someone creates a query that triggers a path where the n+1 problem absolutely wreaks havoc with the backing data storage, you could have serious problems. By fixing an API you get predictability.
There's also the potential mismatch between data versions, and the question of which version or view you are operating on.
Things like dataloader exist but the behavior of your schema gets harder and harder to reason about when different caching mechanisms get thrown on top.
I think everyone agrees that the client facing api for GQL is fantastic... maybe that means graphdbs are the next wave :P
If you're using a column-store structure for most data, you're mainly doing individual lookups based on a single key, with graph data under another key or keys, and related keys looked up separately - each optimized for single-record access.
Especially since, distributed, this data can collectively be accessed faster than a single RDBMS would be able to manage.
That said, if your application can/does use a single sql datastore, and you don't need that much scale, the effort to setup/configure GQL may not be worth it in a given instance.
* It allows rapid product iteration over the "social graph". It allows each product to have unique behaviours and data requirements, and new product capabilities can be rolled out quickly. Facebook's entire data layer (TAO, Ent framework) is optimised towards rapid iteration.
* It optimises client load times. Mobile networks are high latency and low bandwidth. GraphQL allows each application to load data in the most efficient way possible.
We're throwing decades of web architecture to the wind here. Caching is an exercise left to the reader.
If we're going as far as GraphQL, why _not_ open your DB to the public?
TL;DR: it's not really a query language like SQL. It's more of an opinionated RPC framework. I don't think building out one's entire data model in a GraphQL schema is a good idea in general, but that's also not the true value-add of the system.
Ideally though, you don't use dependent fetches and you have a bespoke endpoint that collapses the dependent operations possibly as tight as a single join query. If this is the case you're trying to solve, then why shouldn't we strive for something like a declarative query language?
The bespoke endpoint thing is a big part of what GraphQL avoids. It gives you something explorable, but as a team, you can decide how comprehensive you want it to be. If you have a monolithic database, you can use SQL to explore it, but GraphQL often sits in front of a service layer, which connects to my next point.
GraphQL's mutations often map to entire processes, which might change a data store, like a DML SQL operation, but also have other sorts of effects.