Common REST API mistakes and how to avoid them (logrocket.com)
40 points by mooreds 9 days ago | 26 comments

Version your API from day one (instead of company.com/api, support company.com/api/v1). This makes it a lot easier to support legacy users.
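A framework-free sketch of what versioning from day one buys you (the route table and handler here are hypothetical; in Flask this would be a Blueprint mounted with url_prefix="/api/v1"):

```python
# Minimal sketch: register every handler under a versioned prefix from day one.
ROUTES = {}

def route(path):
    def register(handler):
        # Register under the versioned prefix only; a future /api/v2 can
        # coexist later without touching v1 clients.
        ROUTES["/api/v1" + path] = handler
        return handler
    return register

@route("/users")
def list_users():
    return [{"id": 1, "name": "Bob Jones"}]

# "/api/v1/users" resolves; a bare "/api/users" never existed, so there are
# no legacy unversioned clients to migrate later.
print(ROUTES["/api/v1/users"]())
```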

Don't put your API on the same Domain as your website. Use api.company.com or a dedicated domain instead of company.com/api.

Or you could learn how to build REST-ful web interfaces correctly, in which case the problem goes away.

The only #SeparationOfConcerns that actually matters is:

1. Verbs describe actions upon resources (e.g. add, read, delete)

2. URIs identify individual resources you may wish to act upon

3. Content Types (and content negotiation) determine how a particular resource’s information will be encoded for transfer between client and server.

For instance, if a client has the following URI that points to a “person” resource:

    example.org/persons/12345
and the public documentation states that a person’s information can be represented in any of these formats:

- "text/html" (a standard human-readable webpage)

- "application/org.example.person+json" (JSON-encoded machine-readable data)

- "application/org.example.person+xml" (XML-encoded machine-readable data)

then the client can GET the resource’s data in one of three different formats, according to preference and/or need; e.g. standard web browser, custom smartphone app, Traditional Enterprise Application.
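A minimal server-side sketch of that negotiation (the renderers and the first-acceptable-match policy are illustrative assumptions; q-weights are ignored here, and a real server would likely lean on its framework's negotiation support):

```python
import json

# The "person" resource from the example above.
PERSON = {"name": "Bob Jones", "age": 42}

# One serializer per documented representation of the resource.
RENDERERS = {
    "application/org.example.person+json": lambda p: json.dumps(p),
    "application/org.example.person+xml":
        lambda p: "<person><name>%s</name><age>%d</age></person>"
                  % (p["name"], p["age"]),
    "text/html":
        lambda p: "<h1>%s</h1><p>Age: %d</p>" % (p["name"], p["age"]),
}

def negotiate(accept_header):
    """Return (content_type, body) for the first acceptable type, or None.

    None would map to a 406 Not Acceptable response. Media-type parameters
    such as q-weights are stripped and ignored in this sketch.
    """
    for offered in (t.split(";")[0].strip() for t in accept_header.split(",")):
        if offered in RENDERERS:
            return offered, RENDERERS[offered](PERSON)
    return None

print(negotiate("application/org.example.person+json"))
print(negotiate("text/plain"))  # unsupported type -> None (406)
```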


Interesting. What's the rationale?

It’s the separation-of-concerns best practice extended to domain names. Not having to think about path collisions when the website and the API are in the hands of different teams is a plus. It’s also a lot better to avoid the root domain, which is less flexible than a subdomain. For instance, you can’t have a CNAME on a root domain.

Flexibility. A CNAME is easier than a reverse proxy.

Security. Don't share cookies with your site.


What if sharing cookies with your site is the intended behavior, e.g. for API's that you're calling directly from your frontend?

I actually have heard it is better to use headers for versioning. Here is a link with three options for versioning: https://restfulapi.net/versioning/

That's extremely fussy compared to versioning the endpoints. Harder to test, harder to implement, and delivers no real benefit.

How exactly? How different would it be to pass a header instead of switching the path?

I was writing such tests in Spock and pacts.io, and both were extremely easy to do.


For the client, it's only marginally more difficult because you have to specify an extra header--since you'd have to specify the path anyway, versioning via path gives the client one less thing to specify.

For the server, it's annoying because every backend framework I've encountered ultimately boils down to "here is a function that gets called when the server receives a request to a given path". If you're versioning by path, you just write a new function for the new path. If you version by header, now you have to have a single request-handling function that performs conditional logic based on a request header--at which point I would probably just dispatch to separate functions for each version, which is exactly what path-based versioning would give me for free.
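The contrast above can be sketched framework-free (the handler names and the `Api-Version` header are hypothetical):

```python
# Two versions of the same endpoint, as plain functions.
def get_customer_v1(cid):
    return {"id": cid, "name": "Bob Jones"}

def get_customer_v2(cid):
    return {"id": cid, "firstName": "Bob", "lastName": "Jones"}

# Path-based: the router performs the version dispatch for free.
PATH_ROUTES = {
    "/v1/customers": get_customer_v1,
    "/v2/customers": get_customer_v2,
}

# Header-based: a single route plus conditional logic on a request header --
# which in practice dispatches to the same per-version functions anyway.
HEADER_VERSIONS = {"v1": get_customer_v1, "v2": get_customer_v2}

def get_customer(headers, cid):
    version = headers.get("Api-Version", "v1")  # header name is an assumption
    return HEADER_VERSIONS[version](cid)

# Both styles end up at the same function; the header style just adds
# one layer of indirection inside a shared handler.
print(PATH_ROUTES["/v2/customers"](1))
print(get_customer({"Api-Version": "v2"}, 1))
```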

I'm also not exactly sure how, if at all, it's possible to document this type of behavior in OpenAPI/Swagger, if that's of any concern or relevance.

All in all, versioning by header isn't dramatically more annoying than versioning by path, but I see virtually zero concrete benefit from incurring the cost in the first place.


I think there are more than enough resources on the web to understand the standpoints of each of those solutions.

To me it seems like those are just tools created to tackle specific problems. Each has its own pros and cons. Depending on the use case, URL-based versioning may be worse than MIME-based versioning, and vice versa.

I just wanted to point out that API versioning can be done equally easily both ways.


It’s not “equally easy” though. I’m asking what the pros are, of versioning by header, and I’m not actually hearing a sensible response.

From the article I posted:

"Using the URI is the most straightforward approach (and most commonly used as well) though it does violate the principle that a URI should refer to a unique resource. You are also guaranteed to break client integration when a version is updated."

So versioning with content headers is useful when

* it's really important that there is a one to one mapping between a URI and a resource (not /v1/customer/1 and /v2/customer/1 URIs which both refer to customer 1). I'm not familiar enough with API construction to know why this might be important, but maybe system clarity?

* You have far flung clients that are not easy to update (iot, mobile apps, software that needs to be manually configured) and you want all clients to always go to the same URI (perhaps for whitelisting through a client firewall).


> it's really important that there is a one to one mapping between a URI and a resource (not /v1/customer/1 and /v2/customer/1 URIs which both refer to customer 1). I'm not familiar enough with API construction to know why this might be important, but maybe system clarity?

This isn't important unless you take "True REST" seriously. This notion is a fussy little hobgoblin that most people rightly dispense with.

> You have far flung clients that are not easy to update (iot, mobile apps, software that needs to be manually configured) and you want all clients to always go to the same URI (perhaps for whitelisting through a client firewall).

Surely if I'm not updating some of the clients, they can just continue using the v1 endpoint while other clients use a v2 endpoint. I don't actually see how this helps.


Know how to design a RESTful interface correctly, and you don’t need “API versioning”. Certainly not URL-based versioning, which is as anti-REST as it gets.

The correct way to manage non-backwards-compatible changes in a RESTful system is to define a new content type for the resource representation that has changed.

For example, here is a JSON-encoded representation of a Person resource:

    {"name": "Bob Jones", "age": 42}
To describe this particular data structure + encoding, we give it its own content type:

    "application/org.example.person+json"
and provide public documentation for this representation type.

To retrieve a description of a given person in this exact format, a client sends a GET request with an "Accept: application/org.example.person+json" header.†

Let’s say after a while you decide to replace the "name" field with separate "firstName" and "lastName" fields:

    {"firstName": "Bob", "lastName": "Jones", "age": 42}
This new representation clearly isn’t backwards-compatible, so give it a new content type that reflects this:

    "application/org.example.person.v2+json"
and provide public documentation for this new representation type, alongside the documentation for the older format.

A newly written/updated client that requires this improved information sends a GET request with an "Accept: application/org.example.person.v2+json" header.

Existing clients that still use the old format continue to send GET requests with an "Accept: application/org.example.person+json" header.

Where there are many such servers all around the world, it is likely that some will be older than others. A client that much prefers the new, more detailed, representation but is prepared to work with the old-style representation if that’s all it can get will send an Accept header containing both content types weighted by preference:

    "Accept: application/org.example.person.v2+json;q=1.0,
             application/org.example.person+json;q=0.3"
No sequentially version-mangled URLs. No Grand New Major API Release Announcements. No “we have ended support for the v1 API so now you must switch to the v2”. Just flexible, reliable interactions between any number of clients and servers, where each client and each server is free to evolve naturally and non-disruptively over time.
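That weighted negotiation can be sketched as follows (the parser and the `SUPPORTED` set are illustrative; a production server would use its framework's or an HTTP library's Accept parsing):

```python
def parse_accept(header):
    """Return [(media_type, q)] sorted by descending client preference."""
    prefs = []
    for part in header.split(","):
        fields = part.strip().split(";")
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.strip().partition("=")
            if name == "q":
                q = float(value)
        prefs.append((fields[0].strip(), q))
    return sorted(prefs, key=lambda p: -p[1])

# An older server that doesn't know the v2 representation yet.
SUPPORTED = {"application/org.example.person+json"}

# The weighted Accept header from the example above.
accept = ("application/org.example.person.v2+json;q=1.0, "
          "application/org.example.person+json;q=0.3")

# Walk the client's preferences in order; serve the first one we support.
chosen = next((t for t, _ in parse_accept(accept) if t in SUPPORTED), None)
print(chosen)  # the old format: the client still gets a usable response
```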

TL;DR: Everything you think you know about RESTful HTTP is completely and utterly WRONG. Same goes for everyone you learnt it from, and so on. #FractallyWrong

--

† A client that doesn’t care what it receives as long as it’s JSON encoded can, of course, send an Accept header containing the basic "application/json" and if the server is happy to serve that particular idiot^Wtype then that’s what it gets back. A server is also free to represent the same resource in any number of encodings, e.g. "application/org.example.person+xml" (for All your Enterprisey™ clients), "text/html" (your Auntie Ena in her Internet Explorer 6 will always ask for this), and so on.


> The correct way to manage non-backwards-compatible changes in a RESTful system is to define a new content type for the resource representation that has changed.

And this is the fundamental problem with REST and why nobody actually uses "True REST". People don't merely use API's to exchange representations of resources. They use API's to order pizzas, hail Ubers, subscribe to YouTube channels, update parameters for Netflix recommendations, and do lots of other things that have real-world side-effects. You can of course shoehorn all of this into a REST abstraction, just as you can represent all computation as a Kingdom of Nouns (https://steve-yegge.blogspot.com/2006/03/execution-in-kingdo...), but is there any actual benefit in doing so?

The more general, time-tested, and useful abstraction has always been RPC, which is why virtually every real-world "REST API" is just RPC over HTTP, except without the heavyweight mumbo-jumbo of SOAP and with some attention paid to the proper use of HTTP verbs. This is not a bad thing.


“They use API's to order pizzas, hail Ubers, subscribe to YouTube channels, update parameters for Netflix recommendations, and do lots of other things that have real-world side-effects.”

That’s three POSTs and a PUT, with the appropriate content types; absolute bread-and-butter RESTful interactions. At least try to pick something behaviorally awkward for HTTP, like COPY or SEARCH.

If you can’t see how to express changes to a remote state machine in RESTful terms, with cascading sequences of automatic behaviors and subsequent state updates triggered off the back of that, I seriously doubt your competence to express them any more reliably using plain dumb RPC without a whole host of $UNDEFINED_BEHAVIORs falling out its ass.

Because, get this: what’s important is not the REST/RPC in the middle; it’s the state that exists at each end. And in the wonderful unpredictable hell that is non-deterministic computing, unless your management over time of all that distributed replicated state is solid as rock then your customers’ information WILL go to fuck.

Frankly, the only thing ad-hoc RPC really does is enable lazy irresponsible incompetent coders to ship lazy irresponsible incompetent code that flings shit all over the world with absolutely zero effort. Having had to deal with other web apps’ RPC APIs in the past, I can confirm. They’re a bunch of fucking punks.

At least expressing all interactions as state changes on a remote graph forces you to think in terms of state machines, and ensures a clear and complete set of rules for telling you to go 40x yourself when—whether through idiocy, malice, or the simple inherent reality of race/lock conditions—your request tries to fuck things up. Plus, if everyone’s following a common set of UX/UI patterns, after a while you can likely pull out a whole lot of reusable data formats that everyone can now adopt.

Also, WTF does that Steve Yegge Java rant have to do with distributed system design? There’s one reference that matters here, and this is it:

https://en.wikipedia.org/wiki/Fallacies_of_distributed_compu...

Normally I’d suggest that you’ve failed to see the forest for the trees, but I suspect in your case it’s because you’re standing in the ocean.


You're taking an extremely aggressive tone, which is not only extremely unpleasant, but also does not, in my experience, characterize a person as someone who actually knows what he is talking about. Otherwise you wouldn't be so defensive and eager to resort to accusing others of incompetence.

Let's break down the actual substance you are presenting.

> [Ordering a pizza, hailing an Uber, subscribing to YouTube channels, and updating parameters for Netflix recommendations are] three POSTs and a PUT, with the appropriate content types; absolute bread-and-butter RESTful interactions.

Sure, it's easy to represent them this way. (Although even here, this can be a little wobbly: for instance, depending on how I would model a recommendations engine, a PATCH might be more appropriate than a POST or a PUT if I'm merely updating some of the parameters rather than rewriting them all whole-hog.) Content-type aside, I would also use the POST verb as a bare minimum for these operations.
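The PATCH-vs-PUT distinction mentioned here can be sketched in-memory (the resource shape, a hypothetical set of recommendation parameters, is an illustration):

```python
# A hypothetical recommendations-parameters resource.
params = {"genre_weight": 0.7, "recency_weight": 0.2, "diversity": 0.5}

def put(resource, body):
    """PUT semantics: full replacement -- fields absent from the body are gone."""
    resource.clear()
    resource.update(body)

def patch(resource, body):
    """PATCH semantics: partial update -- untouched fields survive."""
    resource.update(body)

patch(params, {"diversity": 0.9})
print(params)  # the other weights are intact

put(params, {"diversity": 0.9})
print(params)  # {'diversity': 0.9} -- everything else replaced
```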

> If you can’t see how to...

I can see how to do that. What I can't see is why I should adopt what seems to me like an unnecessary and unnatural abstraction, other than some jerk on the internet abusing me over it. (Which, to answer another question of yours, is analogous to what Yegge is ranting about.)

Even if that abstraction serves some purpose in the abstract, the sheer reality is that most tooling and most developers don't slavishly follow it. That leaves me with two choices: bitterly abuse people who write API's that aren't compliant with the abstraction, or simply follow working examples, even if those examples are effectively RPC-over-HTTP-with-JSON-payloads.

> Because, get this: what’s important is not the REST/RPC in the middle; it’s the state that exists at each end. And in the wonderful unpredictable hell that is non-deterministic computing, unless your management over time of all that distributed replicated state is solid as rock then your customers’ information WILL go to fuck.

Agreed. But, as you yourself point out, most people don't follow true REST, and many of them manage to get things working.

> Frankly, the only thing ad-hoc RPC really does is enable lazy irresponsible incompetent coders to ship lazy irresponsible incompetent code that flings shit all over the world with absolutely zero effort. Having had to deal with other web apps’ RPC APIs in the past, I can confirm. They’re a bunch of fucking punks.

I'm not advocating incompetence. If you're using HTTP, use HTTP verbs properly and use HTTP status codes as-properly-as-feasible. Design your endpoints to be as-idempotent-as-feasible. If you want to commit to RPC, use a well-considered RPC framework like gRPC.
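The "as-idempotent-as-feasible" point can be sketched with an in-memory store (the store, IDs, and naive ID generation are all hypothetical):

```python
store = {}

def put_order(order_id, body):
    """Idempotent: retrying after a timeout cannot create a second order."""
    store[order_id] = body
    return store[order_id]

def post_order(body):
    """Not idempotent: each retry appends a fresh order."""
    order_id = len(store) + 1  # naive ID generation, illustration only
    store[order_id] = body
    return order_id

put_order("abc-123", {"item": "pizza"})
put_order("abc-123", {"item": "pizza"})  # client retry: still exactly one order

post_order({"item": "pizza"})
post_order({"item": "pizza"})            # client retry: now a duplicate exists

print(len(store))  # 3: one idempotent order plus two POST duplicates
```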

> At least expressing all interactions as state changes on a remote graph forces you to think in terms of state machines, and ensures a clear and complete set of rules for telling you to go 40x yourself when—whether through idiocy, malice, or the simple inherent reality of race/lock conditions—your request tries to fuck things up.

It's not clear to me that REST is either necessary nor sufficient to solve all the problems inherent in distributed systems. Can you express your case for it in non-abusive terms?

> Plus, if everyone’s following a common set of UX/UI patterns, after a while you can likely pull out a whole lot of reusable data formats that everyone can now adopt.

Can you clarify what you're trying to get at here?

> There’s one reference that matters here, and this is it: https://en.wikipedia.org/wiki/Fallacies_of_distributed_compu...

Yes, I'm aware of these. Can you clarify how they apply?

> Normally I’d suggest that you’ve failed to see the forest for the trees, but I suspect in your case it’s because you’re standing in the ocean.

This is just uncalled for.


Article has nothing to do with “REST”.

Thankfully so, because 110% of what is written about “REST APIs” is an absolute bag of shit. #OxyMoron


Actually using the “REST” abstraction and taking it seriously is perhaps one of the most common mistakes!

Great advice, but I prefer to stick with the https://jsonapi.org specification. There are good libraries to serialize JSON objects into JSON:API-compliant objects.

Or Wildcard API (https://github.com/reframejs/wildcard-api) for Node.js <-> Browser

(Disclosure: I'm Wildcard's author.)


Why do so many prefer strings over unix timestamps? Do they like making things that much more complicated for the sake of human readability?

ISO 8601 should solve both: human readability and standardized time with zone.
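For example, in Python's stdlib (the timestamp value is an arbitrary illustration):

```python
from datetime import datetime, timezone, timedelta

ts = 1565000000  # Unix timestamp: compact, but opaque to humans

# The same instant as ISO 8601 in UTC: readable, sortable, and it carries
# the zone offset, which a bare number cannot.
utc = datetime.fromtimestamp(ts, tz=timezone.utc)
print(utc.isoformat())    # 2019-08-05T10:13:20+00:00

# ISO 8601 can also preserve a local zone offset:
tokyo = utc.astimezone(timezone(timedelta(hours=9)))
print(tokyo.isoformat())  # 2019-08-05T19:13:20+09:00

# And it round-trips losslessly back to the original timestamp:
assert datetime.fromisoformat(tokyo.isoformat()).timestamp() == ts
```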

There are also use cases where Unix timestamps/UTC are actually the wrong solution.


