> I’m not completely sure they do matter.
> There’s a lot of smart people at Facebook and they
> built an API that only ever returns 200.
This is the best bit of the article.
Most (many?) APIs, whether or not their authors call them REST, are single URL APIs that you throw JSON/YAML/whatever at. If the caller is going to get back a serialized status message (["Not found"]), then the HTTP request was successful, and it should return 200.
I'm not sure why people insist on spreading their APIs out between the transport and content layer.
If the API is indeed RESTful, and it complies with the concept that a URI represents an address of a specific resource, then returning 200 even when the resource was not found does not make any sense. The request was not "successful", rather the request failed to find what it was looking for!
That said, "single URL APIs that you throw JSON/YAML/whatever at" are not RESTful anyway. For a non-REST API, then it may make sense to return 200 even if something was not found, but it is still not a good idea to just ignore the properties of the upper-level protocol your data is ferried on.
> I'm not sure why people insist on spreading their APIs out between the transport and content layer.
HTTP is not the transport layer. It's an application protocol meant to convey application data. People spread their API over it because that's how it's supposed to be used. If you're just using HTTP as a means to toss random data between computers, you should consider TCP.
On the client side, whether a GET returned a successful (2xx) or error (4xx/5xx) status is observable for cross-origin requests. This can leak information about the user, particularly if you have cookie-authenticated resources that 503 depending on the user's identity (e.g. Facebook 503ing if you're not friends with X). This can be resolved by requiring a CSRF token or fancy header, but that muddles the RESTful semantics.
A common solution is to make your public API RESTful and authenticated differently from your browser cookie sessions, and make the private web APIs always return 200.
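A rough sketch of the "fancy header" mitigation, assuming Express and a made-up /private/friends endpoint. A cross-origin <img>, <script>, or no-cors fetch can't attach custom headers, so every such probe gets an identical response before any user-specific logic runs:

    import express from "express";

    const app = express();

    app.use("/private", (req, res, next) => {
      // Simple cross-origin requests (img/script/no-cors fetch) can't set
      // custom headers, so they all stop here with the same response
      // and learn nothing about the logged-in user.
      if (req.get("X-Requested-With") !== "XMLHttpRequest") {
        res.status(403).json({ error: "missing_header" });
        return;
      }
      next();
    });

    app.get("/private/friends", (req, res) => {
      // ...cookie-authenticated, user-specific logic lives behind the gate...
      res.json({ friends: [] });
    });

    app.listen(3000);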
403 would be the proper status if the user was authenticated but forbidden. 503 would imply an error on the server and that the service is unavailable.
Here the 403 is appropriate because the user is logged in and trusted to some degree by your system, but isn't allowed to access that URI.
If the user were unauthenticated and tried to access the same URL, he should get a 401 for Unauthorized, which is the same response he should get for every URI in your system, thus exposing nothing about your underlying service.
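Sketched as a handler (Express assumed, the helpers are made up), that ordering looks roughly like this:

    import express from "express";

    const app = express();

    // Hypothetical helpers standing in for real session/ACL lookups.
    function currentUser(req: express.Request): { id: string } | null {
      return req.get("Cookie") ? { id: "u1" } : null;
    }
    function canAccess(userId: string, projectId: string): boolean {
      return projectId.startsWith(userId); // stubbed authorization check
    }

    app.get("/projects/:id", (req, res) => {
      const user = currentUser(req);
      if (!user) {
        // Same 401 for every URI when unauthenticated: reveals nothing.
        res.status(401).json({ error: "unauthenticated" });
        return;
      }
      if (!canAccess(user.id, req.params.id)) {
        // Authenticated but not allowed: 403 (or 404 if you'd rather hide
        // that the resource exists at all).
        res.status(403).json({ error: "forbidden" });
        return;
      }
      res.json({ id: req.params.id });
    });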
> GitHub returns 404 instead of 403 to prevent these information leaks.
This behavior is explicitly permitted by the standard, FWIW: "An origin server that wishes to 'hide' the current existence of a forbidden target resource MAY instead respond with a status code of 404 (Not Found)."
Ah yes, thank you, that should have been 403. I suspect a 401 would still leak info, as the <img> or <script> tag will let you differentiate between a 200 and other failing status.
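For anyone following along, the probe looks roughly like this (the URL is made up). The attacking page never reads the body, but onload vs onerror tells it whether the victim's cookies made the request succeed:

    // Sketch of the cross-site status probe described above.
    function probe(url: string): Promise<boolean> {
      return new Promise((resolve) => {
        const img = new Image();
        img.onload = () => resolve(true);   // 2xx and a decodable image
        img.onerror = () => resolve(false); // 4xx/5xx (or not an image)
        img.src = url; // sent with the victim's cookies attached
      });
    }

    // e.g. infer a relationship from whether a private photo loads:
    probe("https://social.example/photos/private/12345.jpg")
      .then((visible) => console.log(visible ? "friends" : "not friends"));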
I'm not sure I follow how something that's wrapped in a TLS session can leak information if it's returned as an encrypted header, but not if it's returned as an encrypted body?
That's by design - allowing arbitrary TCP in a browser is just asking for security and DDoS problems.
> HTTP works through proxies
So does TCP (SOCKS).
> Web servers
Using a web server wouldn't make sense if you were using straight TCP.
The real reasons HTTP is used when plain TCP would be more appropriate are:
1) Lazy/stupid firewalls configured to only allow port 80/443. This leads to everyone doing everything over HTTP, moving the problem and making the firewall much less useful.
2) NAT. When the primary benefit of the internet - where each peer is equivalent in the protocol - cannot be counted on, centralized servers (usually using HTTP because of #1) are used instead of direct TCP.
Using an architecture shouldn't be a dogma but a choice. REST defines a set of constraints that provides a set of benefits. REST emerged because people noticed that the 'Web' was more scalable and robust than other distributed architectures (CORBA, DCOM...).
So if your use cases would benefit from caching, discoverability, and so on, then you may want to choose REST.
Reading that old thread really makes me realise how perceptions have changed over time. I think more folks understand how useful REST can be, as it provides a standard set of semantics for talking about resources, and those semantics are pretty widely applicable across various situations.
In that respect, REST is a lot like relational calculus: certainly not a panacea, certainly not universally-applicable, but an extremely useful way to approach a large number of different problems. And just as one should have to justify why one is not using an RDBMS (and there are a huge number of reasons not to!), one should need to think hard before abandoning REST for RPC.
> If the API is indeed RESTful, and it complies with the
> concept that a URI represents an address of a specific
> resource
I suspect the maintainers of both of those APIs are doing it right... ;-)
More seriously, almost every API I've seen that tries to be (or claims to be) RESTful is just somewhat randomly distributing parameters between the URI, headers, and a post body.
> HTTP is not the transport layer. It's an application protocol
> meant to convey application data
Nope, it was definitely designed as a transport layer, which describes meta-data about content going both ways. You can tell it's not a real application protocol because people shoe-horn serialized data structures in both directions.
> If the caller is going to get back a serialized status message (["Not found"]), then the HTTP request was successful, and it should return 200.
Now all clients have to implement their own error checks and string parsing just to discover something the HTTP layer could have told them explicitly: that the requested resource could not be found.
If an API treats data as resources, then the API should return HTTP status codes that clarify how the query against the resource was handled.
"Could I have this?" = "404 Not Found" is a great response.
"Could I have this?" = "200 OK" (not really) is a crap response.
Yeah, I hate APIs that return 200 with a JSON (or XML) body that says the response was anything but OK.
Another bugbear is a server suddenly switching content types for errors. If you're on IIS, make sure you're handling your own damn error pages so I don't suddenly have to account for an HTML 500 error page in my javascript code. Cisco's AXL API was a terrible offender for this one.
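This is the defensive wrapper you end up writing once a server has handed you an HTML error page where JSON was expected (a sketch using fetch):

    // Defensive parsing when a server may swap JSON for an HTML error page.
    async function fetchJson(url: string): Promise<unknown> {
      const res = await fetch(url);
      const contentType = res.headers.get("content-type") ?? "";
      if (!contentType.includes("application/json")) {
        // e.g. a server-generated HTML 500 page: surface it clearly instead
        // of letting JSON parsing blow up with a cryptic error.
        const text = await res.text();
        throw new Error(`Expected JSON, got ${contentType || "unknown"} (HTTP ${res.status}): ${text.slice(0, 100)}`);
      }
      return res.json();
    }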
I've never used IIS, but isn't that confusing protocol codes with content codes?
IMO HTTP status codes should return information about the HTTP transaction, and only about the HTTP transaction.
If you run an API layer on top of that, that's a separate concern.
So "Server busy" should not be in the same space as "That API request was bad."
On my last project I returned 200 for successful transactions, 404 for missing pages, and 444/closed connection for "Piss off and stop making inept attempts to hack this server, use it as a free proxy, or rack up your SEO stats."
IPs trying any of the latter are blocked by Fail2Ban.
444 seems to have cut down on spam requests from botnets, and it's cheaper than serving a full 404 page.
* Make request
* Has error string message?
* Error
Fairly similar code paths. If there were a very consistent implementation of REST APIs and how they worked/responded, this argument might have more legs, because client libraries could share a base library which handled this logic. As it is, API-specific libraries have to implement error-handling logic, so it doesn't really matter.
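As a sketch of that shared base library, assuming a hypothetical convention where errors always arrive as an { "error": ... } body, one wrapper can normalise both paths:

    class ApiError extends Error {
      constructor(public status: number, public code: string) {
        super(`${status}: ${code}`);
      }
    }

    async function call(url: string): Promise<unknown> {
      const res = await fetch(url);
      const body: any = await res.json().catch(() => null);
      // Path 1: the HTTP layer said it failed.
      if (!res.ok) throw new ApiError(res.status, body?.error ?? "http_error");
      // Path 2: 200, but the body carries the API's own error message.
      if (body && typeof body === "object" && "error" in body) {
        throw new ApiError(res.status, String(body.error));
      }
      return body;
    }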
The problem there is that error strings tend to change over time... an error match today may fail tomorrow.
Additionally, too many clients think that error messages are safe to show to end users. In reality, error messages are seldom if ever fit to be shown to end users, not least because they're not localised, but also because the target audience for an API's error message is the developer, not the end user.
What's the problem with including an additional user-friendly, alert-friendly, localized message in the payload? That's a much better general solution than trying to figure out which nonsensical HTTP status code to return.
How does an API know the context surrounding a message to be presented?
i.e. is the client a mobile one? Does it have limited space to display a message? Is the client actually a desktop client calling the API with specialist needs to show a system dialog with specific properties filled that is more than just a single error string?
An API is an interface to interact with the web application, beyond providing the means to interact with resources it should make few to no assumptions about how clients of the API will act, their environment or constraints, their locale requirements, or anything.
The client is best placed to decide how to communicate to the end user, and the best thing the API can do is provide clean and unequivocal instruction to the client that something has happened, so that the client can do its job of handling it however it sees fit.
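That split can be sketched as: the API sends a stable, machine-readable code (plus developer-facing detail), and each client maps it to whatever wording, length, and locale suits it. The codes and strings below are invented:

    // The API sends a stable code; presentation is the client's problem.
    interface ApiErrorBody {
      code: "quota_exceeded" | "not_found" | "forbidden";
      detail?: string; // for the developer, not the end user
    }

    const MESSAGES: Record<ApiErrorBody["code"], Record<string, string>> = {
      quota_exceeded: { en: "You have hit your limit.", de: "Limit erreicht." },
      not_found:      { en: "That item no longer exists.", de: "Nicht gefunden." },
      forbidden:      { en: "You do not have access.", de: "Kein Zugriff." },
    };

    function userMessage(err: ApiErrorBody, locale: string): string {
      // Each client decides wording, length, and locale for itself.
      return MESSAGES[err.code][locale] ?? MESSAGES[err.code]["en"];
    }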
This just complicates the handling of the request on the client for no apparent benefit. And there are many more HTTP codes besides 404 which are mostly meaningless (payment required, anyone?), or fit very poorly within the application logic. It is a layer of legacy cruft that is best avoided. Non-200 codes should indicate some exceptional situation on the server and could be handled client-side as "server configuration exception".
A lot of client libraries throw exceptions when a non-200 status code is returned, something that plenty of newbies and not-so-great developers struggle with.
I imagine that always returning 200 drastically cuts down on support costs and basically-irrelevant forum posts, regardless of the technical rights and wrongs of doing so.
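For what it's worth, this is the pattern those developers are tripping over, sketched with axios (which rejects the promise for any non-2xx status by default); the URL is made up:

    import axios from "axios";

    async function getProfile(id: string) {
      try {
        const res = await axios.get(`/profiles/${id}`);
        return res.data;
      } catch (err) {
        // Non-2xx arrives as a thrown exception, not a value to inspect;
        // forgetting this catch is the classic stumble.
        if (axios.isAxiosError(err) && err.response?.status === 404) {
          return null; // expected case, handled explicitly
        }
        throw err; // genuinely unexpected: let it surface
      }
    }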
>an exception is better than a silent failure in almost all situations.
>this is an argument for not using 200 for all; not against
The practical side of me disagrees.
Experience bears out that plenty of developers are at a loss when e.g. a login callback page shows an unhandled exception error despite error-checking code being in place and running against the response's "error" node or whatever is used.
The problem is that there just aren't enough people out there that can understand the entire stack, and the end result is that people who've paid (good/bad) money for development - and their end users - end up with services that'd otherwise be decently functional showing the odd exception error instead.
(the technical side of me agrees with you absolutely!)
> Most (many?) APIs, whether or not their authors call them REST, are single URL APIs that you throw JSON/YAML/whatever at. If the caller is going to get back a serialized status message (["Not found"]), then the HTTP request was successful, and it should return 200.
Having spent three years of my life writing tooling to deal with such a vendor's API, I would really, really prefer that people avoid this anti-pattern. REST is a really nice idea, and it works out tremendously well in practice. RPC-over-HTTP is not such a nice idea, and it works out rather painfully in practice.
As an aside, the mere fact that the message is serialised doesn't mean that HTTP and REST are inapplicable: HTML, XML, JSON, YAML, protobufs, s-expressions &c. are all just different forms of serialisation which can be applied to data. The RESTful way to indicate Not Found is to return a 404 status code, with a body which is useful to the client in some way. A human being might accept plain text or HTML; a programmatic client might accept a JSON or protobuf response, and probably doesn't even need one for a 404.
Having spent more than three years of my life writing tooling to deal with such vendors' APIs, I wish people would stick more closely to the pattern of error messages in 200 responses. The application stack is not the web server, and I don't rely on status codes when dealing with the API through various transports.
> Why doesn't rpc over http work out well in practice?
It's not so much the nature of HTTP as the nature of RPC. RPC is all about calling remote procedures (hence the name); the issue is that there's almost always some internal state which is being mutated. REST makes clear what that state is, and provides reasonably clean semantics for idempotent updates; RPC tends to obscure that state. RPC can certainly be done right (after all, REST can be thought of as a constrained form of RPC…), but in many cases REST is the right level of abstraction to work at.
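A quick sketch of where the mutated state lives in each style (both endpoints invented):

    async function demo() {
      // RPC-over-HTTP: the state change hides behind a verb name and arguments;
      // nothing in the request says what it touches or whether a retry is safe.
      await fetch("/api", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ method: "deactivateAccount", params: { userId: 42 } }),
      });

      // REST: the resource being changed is named in the URL, and PUT is
      // idempotent, so replaying the request after a timeout leaves the
      // same end state.
      await fetch("/accounts/42", {
        method: "PUT",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ status: "deactivated" }),
      });
    }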