
Best Practices for Designing a Pragmatic RESTful API - johnchristopher
https://www.vinaysahni.com/best-practices-for-a-pragmatic-restful-api
======
teddyh
> _Should the media type change based on Accept headers or based on the URL?
> To ensure browser explorability, it should be in the URL. The most sensible
> option here would be to append a .json or .xml extension to the endpoint
> URL._

I disagree with this. The most elegant solution is to use Accept headers, and
you should therefore implement that. Of course, since those are hard to use
from a browser, you should also solve _that_ problem, but solve that problem
_separately_. I usually do that by supporting an extra ?type=application/json
query parameter, which internally the server-side code converts to an Accept
header, which is then interpreted normally. Note that I use the _media type_,
not a possibly ambiguous “.json” extension. Would “/foo.json” mean that the
data is of type application/vnd.hal+json or maybe application/vnd.api+json?
Who knows?
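
A minimal sketch of that conversion, assuming Flask (the endpoint and app
structure are illustrative; the `type` parameter is the one described above):

```python
from flask import Flask, request

app = Flask(__name__)

@app.before_request
def promote_type_param_to_accept():
    # Browser users can pass ?type=application/json; rewrite it into the
    # Accept header before normal content negotiation runs. Werkzeug reads
    # headers from the WSGI environ, so downstream code sees the override.
    media_type = request.args.get("type")
    if media_type:
        request.environ["HTTP_ACCEPT"] = media_type

@app.route("/foo")
def foo():
    # Downstream code only ever looks at the Accept header.
    return {"negotiated": request.headers.get("Accept")}
```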

IMO, file name extensions _do not_ belong in URLs. URLs were never meant to be
files, and we should try to avoid .html, .cgi and .php in our URLs. See also
_Cool URIs don't change_ from 1998:
[https://www.w3.org/Provider/Style/URI](https://www.w3.org/Provider/Style/URI)

~~~
wwweston
Using Accept headers is the right thing, for the reasons you specify, and also
for the additional reason that they allow the client to specify a list of
possibilities that can be negotiated down to a result (server doesn't
understand `application/vnd.hal+json`? maybe it can still send you
`application/json`).
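
For example (a sketch, assuming Flask/Werkzeug; the route and payload are
illustrative):

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

# Representations this hypothetical server can produce, best first.
SUPPORTED = ["application/vnd.hal+json", "application/json"]

@app.route("/tickets/12")
def ticket():
    # A client can offer a ranked list, e.g.:
    #   Accept: application/vnd.hal+json, application/json;q=0.8
    # best_match() negotiates it down to a type both sides understand.
    best = request.accept_mimetypes.best_match(SUPPORTED)
    if best is None:
        return "Not Acceptable", 406
    resp = jsonify(id=12, subject="printer on fire")
    resp.headers["Content-Type"] = best
    return resp
```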

That said: the extension-implies-media-type approach may not be right for an
application server that renders a resource on the fly, but it does seem to
have a place in the specific kind of filetree-via-HTTP web server, where the
resource specified by a given URL is already rendered to a specific media
type, and the server really only has two choices for figuring out what that
type is: parse the file (potentially expensive) or apply a heuristic set up in
the server configuration to the filename (potentially not specific enough or
outright wrong). Neither choice is necessarily wrong for that subcase.
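
Python's standard library ships exactly that configured-table heuristic, for
illustration:

```python
import mimetypes

# The static-server heuristic: map extension -> media type from a
# configured table, without ever opening the file.
print(mimetypes.guess_type("report.json"))  # ('application/json', None)
print(mimetypes.guess_type("index.html"))   # ('text/html', None)
print(mimetypes.guess_type("mystery.hal"))  # (None, None): heuristic comes up empty
```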

I'm iffier on saying as much for the media-type-in-query-string method. It's
easy enough to use a client built for sending headers, like Postman or curl,
or to augment common browsers with extensions, that using the understood HTTP
convention seems like the right thing for most cases. The only exceptions I
can think of would be those where debugging a media type specific issue needs
to happen on machines devs don't control. Needing to debug issues on machines
devs don't control is common enough, but issues specific to rendering one
media type should be rarer, and the intersection of both of them should be
vanishing unless something isn't right elsewhere in the dev process.

~~~
Bombthecat
Also, REST is interesting because it solves a lot of "base protocol"
questions. The Accept header is part of standard HTTP.

So that should be the default way.

~~~
Supermancho
That isn't a problem; your API docs are the solution. Mapping communication
protocols onto your application gives zero benefit over any other strategy.
This idea has been a hindrance at various cargo-cult teams for decades. At
least the RFC doesn't try to fool others into the fantasy.

------
rumanator
Great Sunday reading. However, in the section "But how do you deal with
relations?" the author presents nested resources as a best practice. I don't
feel this is the best course of action. Instead of nesting message resources
in, say, `/tickets/12/messages/5` shouldn't a better approach be to store them
in `/messages/5` and keep `/tickets/12/messages` as a collection of IDs or
summary resources? I mean, messages are a separate entity which might even be
moved to a dedicated microservice. Why is it a good practice to nest them
within an API?

~~~
LoSboccacc
there's a trade-off involved in latency, payload size, and server hits

shipping the name and description of resources that are often read together
allows you to minimize cost and optimize performance, up to a point

ultimately the client knows what the UX needs and it's in the optimal position
to ask for the minimal resources needed, hence graphql et al.

but that leaves the REST API designer in a rut: where to make the call for
nesting, and where for searching related resources?

if I had to draw the line: if the related entity has a non-null foreign key
(or the nearest applicative equivalent) toward the parent, it's a prime
candidate for nesting

~~~
rumanator
> there's a trade-off involved in latency size and server hits

I might not have conveyed the point adequately, but my point was orthogonal to
networking or HTTP calls. I was referring to how resources were being
needlessly nested, thus leaking direct dependencies on other resources. More
specifically, although tickets might refer to messages, I didn't understand
why nesting messages within a ticket is passed off as a best practice. I mean,
if a ticket already provides a collection resource of message IDs, why is the
path to message resources being defined specifically with regard to specific
tickets?

~~~
bsaul
I think there’s some confusion on what the original problem is: how do you get
the ticket, together with its messages, in the minimum number of API calls?
That’s why the parent post mentions latency and GraphQL.

The other problem, “having already queried the ticket, how do I get its
messages?”, does indeed allow for various answers: /tickets/{id}/messages,
/messages?ticketid=xx, or even /messages?ids=a,b,c,d are all valid, depending
on how orthogonal tickets and messages are, and don’t depend on latency,
indeed.
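
A sketch of those shapes side by side (assuming Flask; the payloads are
placeholders):

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

# Nested: messages addressed relative to their parent ticket.
@app.route("/tickets/<int:ticket_id>/messages")
def ticket_messages(ticket_id):
    return jsonify(ticket=ticket_id, messages=[])   # placeholder payload

# Flat: messages as a top-level resource; the ticket becomes a filter.
@app.route("/messages")
def messages():
    ticket_id = request.args.get("ticketid")        # /messages?ticketid=xx
    ids = request.args.get("ids", "").split(",")    # /messages?ids=a,b,c,d
    return jsonify(ticketid=ticket_id, ids=ids)     # placeholder payload
```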

------
jayd16
Some questions that aren't answered in the article:

How do you structure batch operations like creating multiple things? Is the
current answer to make N queries and hope you're using http/2?

What's the best practice for media types in a JSON API? Should every object
type have a specific media type, or maybe just a normal and an error wrapper
type? Seems like most of the big tech APIs don't actually get more specific
than 'application/json'.

~~~
veesahni
> How do you structure batch operations like creating multiple things? Is the
> current answer to make N queries and hope you're using http/2?

OP here. Two approaches:

1. Have a special endpoint (POST /batch) where you send an array of requests
[ {method: "", path: "", body: ""} ] and get an array of responses (see the
sketch after this list)

2. Yes, assume HTTP/2
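
A minimal sketch of #1, assuming Flask (replaying sub-requests through the
test client stands in for a real internal dispatcher; a real implementation
would also cap the batch size):

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route("/batch", methods=["POST"])
def batch():
    # Body: [ {"method": "POST", "path": "/tickets", "body": {...}}, ... ]
    client = app.test_client()  # stand-in for real internal dispatch
    responses = []
    for sub in request.get_json():
        r = client.open(sub["path"], method=sub["method"], json=sub.get("body"))
        responses.append({"status": r.status_code, "body": r.get_json()})
    # Note the flaw discussed below: a sub-request can't depend on the
    # response to an earlier one in the same batch.
    return jsonify(responses)
```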

Based on conversations with teams who've based their API designs on this post,
I've previously recommended #1. But it has its flaws - a request that's
dependent on the response from another can't be part of the same batch.

Today, I'd say it makes sense to make sure you implement your API with HTTP/2
and reduce per-hit rate limit "costs" for clients connecting with HTTP/2.
This way, you're encouraging HTTP/2 adoption among heavy API users.

Note, if you need to batch only on GET, then something like GraphQL is also
interesting.

~~~
jayd16
>Have a special endpoint (POST /batch)

This is what always seems hacky to me. Why don't we start with batch handling?
Why shouldn't every (POST /resources) accept an array of new resources? We've
already resigned ourselves to using plurals everywhere.

HTTP/2 would be nice, but as a dev who has to serve Unity clients, we can't
even design APIs that require PATCH.

------
thexa4
If you're interested in API design, the upcoming RFC might be of interest:
[https://tools.ietf.org/html/draft-ietf-httpbis-bcp56bis-06](https://tools.ietf.org/html/draft-ietf-httpbis-bcp56bis-06)

~~~
avdempsey
While this link is very interesting, its advice doesn’t seem to completely
pertain to the kind of “single deployment” APIs most of us are probably
making.

From the draft: “This document specifies best practices for writing
specifications that use HTTP to define new application protocols, especially
when they are defined for diverse implementation and broad deployment (e.g.,
in standards efforts).”

That’s not to say there aren’t useful ideas here (I found it very interesting
in its own right), but the provisions against fixed URL schemes are followed
by no commercial HTTP API I’ve ever seen.

------
bradleyjg
_An API that uses the Link header can return a set of ready-made links so the
API consumer doesn't have to construct links themselves. This is especially
important when pagination is cursor based._

In the header is nice because then there’s no need to parse the payload to get
the next page. But better still is to avoid cursor-based pagination. Instead,
give me a cheap endpoint to get the total number of results and the configured
max results per page, and have constructable URLs (e.g. “?page=4” or
“?offset=500”). This way, generating all the URLs can be a completely separate
process from pulling the results.
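
Something like this (a sketch; the /things endpoints and field names are
hypothetical):

```python
import math
import requests

BASE = "https://api.example.com"

# Hypothetical cheap endpoint returning {"total": 4813, "per_page": 100}.
meta = requests.get(f"{BASE}/things/count").json()
pages = math.ceil(meta["total"] / meta["per_page"])

# URL generation is now completely separate from fetching; this list could
# be handed off to a pool of workers.
urls = [f"{BASE}/things?page={n}" for n in range(1, pages + 1)]

for url in urls:
    print(requests.get(url).json())
```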

~~~
mtsr
That's indeed the nicer way of paginating, but it breaks if the underlying
result set changes between requests - which is exactly when cursor-based
pagination is generally used.

~~~
bradleyjg
If the changes in the result set are additive, it's no problem as long as they
are sorted in such a way that new results go to the end (which the API should
at least make an option if possible). Updates to data within results may be a
problem because you can end up with a dataset that has a view of the world
that doesn't represent any particular time, but in many cases they are safe.
Deletions screw everything up and should be avoided if possible.

The general solution to this problem is to let the query specify a particular
point in time whose state of the world the results should reflect, but that's
obviously going to be expensive to serve.

------
rumanator
Regarding API versioning, FTA:

> There are mixed opinions around whether an API version should be included in
> the URL or in a header. Academically speaking, it should probably be in a
> header. However, the version needs to be in the URL to ensure browser
> explorability of the resources across versions (remember the API
> requirements specified at the top of this post?).

I don't see how this is relevant to a RESTful API. In REST, resources are
transparent and found through HATEOAS/autodiscovery. Thus it's really
irrelevant whether the URL found through HATEOAS includes an API version or
not.

However, URL versioning is indeed a side effect of having multiple services,
each dedicated to serving one version of an API.

In the end, path versioning vs. media-type versioning doesn't feel like a
relevant issue, because it's not an either/or situation - the two are actually
complementary.

~~~
PretzelFisch
When you see versioning, it's a pretty good hint that they use it as a
buzzword and it's just RPC wrapped in REST clothing.

~~~
rumanator
RPC-over-HTTP passed off as REST is indeed a nuisance but unfortunately that
doesn't get rid of the need to support multiple versions of the same API,
especially if you don't control which clients are consuming your services.

------
injb
It's worth reading, but his arguments against using hyperlinks are vague and
frankly weak. He says there's not much to be gained because you can't make
"significant" changes without breaking client code; that may be true for
certain values of "significant", but the bottom line is that you can change a
lot more without breaking clients if you use hyperlinks everywhere, than if
you don't.

It's not totally clear what he means by "not ready for prime time", but
HATEOAS has been achieving what it was designed for for a long time - that is,
reducing the interdependence between client and server code.

It's also worth noting that using server-generated links everywhere eliminates
the need for an entire category of documentation, and makes debugging much
more pleasant and efficient (especially when the person debugging didn't write
either the client or the server code).

Honestly, if there's one feature that determines whether an API should be
described as RESTful or not, it's this. Think very carefully before building
an API that requires the client to know how to build URLs!
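
The difference in client code (a sketch, assuming a HAL-ish `_links` shape;
the URLs and field names are illustrative):

```python
import requests

# URL-building client: server routing knowledge is baked into the client.
msgs = requests.get("https://api.example.com/tickets/12/messages").json()

# Link-following client: only the starting resource is known; the rest
# comes from the representation itself, so the server can move the
# messages collection without breaking this code.
ticket = requests.get("https://api.example.com/tickets/12").json()
msgs = requests.get(ticket["_links"]["messages"]["href"]).json()
```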

------
JamesSwift
This has been my go-to reference for API design for years now, and is what I
send others when discussing API design. Nice to see it pop up here.

------
ptman
RFC 7807 for JSON errors, instead of everyone inventing their own schema:
[https://tools.ietf.org/html/rfc7807](https://tools.ietf.org/html/rfc7807)
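
For reference, a problem-details response looks like this (a sketch, assuming
Flask; the member names - type, title, status, detail, instance - are the
ones RFC 7807 defines, and the content is its out-of-credit example):

```python
from flask import Flask, jsonify

app = Flask(__name__)

@app.errorhandler(403)
def forbidden(e):
    resp = jsonify(
        type="https://example.com/probs/out-of-credit",
        title="You do not have enough credit.",
        status=403,
        detail="Your current balance is 30, but that costs 50.",
        instance="/account/12345/msgs/abc",
    )
    resp.status_code = 403
    resp.mimetype = "application/problem+json"  # RFC 7807 media type
    return resp
```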

------
jbjohns
This has nothing to do with REST (I realize it says RESTful, but I wish that
term would die, as RESTful has nothing to do with REST either). It seems to
reinvent OData as well.

I don’t know how it has gone so wrong with REST as it seems pretty easy to
understand: it’s how your browser has always worked. Your browser doesn’t know
_any_ URLs at all (yes I know about the search engines but that’s
configuration). What it knows are data types. It can be taught new data types
(e.g. PDF) but not new URLs because it doesn’t know any. So if you want a REST
API you need to be designing data contracts between client and server. URLs
are a server side implementation detail and completely irrelevant to the
discussion.

The advantage of this kind of architecture is that the server and clients can
develop in a more decoupled manner. They need only agree on data types. A new
web site coming into existence never requires rebuilding a browser - only
supporting a new kind of data (e.g. HTML5) does.

But one thing to consider, as with all architectural patterns, is whether it
fits the domain you’re architecting. REST isn’t optimal for every possible
problem. Sometimes HTTP-RPC (what the article describes) is an easier fit. But
once you realize and accept this, you no longer have to follow standards that
seem not to make sense for what you’re doing (e.g. HATEOAS, which comes
automatically if you’re really doing REST, and seems so inefficient if you’re
not).

------
duregin
gzip + SSL is still (and will always be) a risky choice, isn't it?

Do any of the newer compression algorithms fix that problem?

~~~
deathanatos
It depends.

The problem was that SSL supported compression directly, so it would compress
the entire encapsulated stream. In HTTP, say the Cookie header contains the
user's session cookie, and part of the request is somewhat controllable by an
attacker (e.g., by making CORS requests in the background). The attacker could
repeat "Cookie: auth=a" many times; if your auth cookie started with "a", the
stream would compress slightly better, as both could get compressed together.
The attacker could then use the size of (or time taken by) the resulting
traffic to discern that he'd gotten the first character correct, and move on
to the second.

See:
[https://en.wikipedia.org/wiki/CRIME](https://en.wikipedia.org/wiki/CRIME)
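
The size half of that oracle is easy to see with zlib (a toy illustration,
not an actual attack):

```python
import zlib

SECRET = b"Cookie: auth=s3cret"  # travels in the same compressed stream

def observed_size(guess: bytes) -> int:
    # Attacker-injected text and the secret get compressed together;
    # matching substrings shrink the output.
    return len(zlib.compress(guess + SECRET))

# The better-matching guess tends to compress a byte or so smaller; real
# attacks average over many observations to smooth out the noise.
print(observed_size(b"Cookie: auth=s"))
print(observed_size(b"Cookie: auth=x"))
```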

The HTTP compression mentioned in the article only compresses the body. It's
still possible to execute the same sort of attack situationally, if there's
some part of, say, a response body that an attacker can control, and another
part of that response body that the attacker doesn't control, that is
sensitive, and that the attacker wants to learn with access only to the size
or timing information.

While there is a Wikipedia article on this variant ("BREACH"), I think this is
more informative:
[https://security.stackexchange.com/questions/20406/is-http-compression-safe](https://security.stackexchange.com/questions/20406/is-http-compression-safe);
it lists a decent example of trying to get at a CSRF token.

But generally, JSON responses don't mix secret data + attacker-controllable
data, I feel, so compression should _usually_ be okay. (And IME, it's
typically done.) SSL/TLS compression should usually be left off, as that seems
much easier to exploit.

------
Supermancho
Anything that includes PUT/PATCH/DELETE is not pragmatic. Changes have side
effects in any nontrivial system, making the semantic goal little more than
wishful thinking. As others (even within the first few comments) figured out,
being able to specify every mutable element from a GET is the most pragmatic
quality of a modern web API.

