Designing a Pragmatic RESTful API (vinaysahni.com)
311 points by veesahni on June 4, 2013 | 132 comments



We've built a first version of an API that we have in testing at the moment, and it follows a lot of the things laid out in an ebook✝ and in the linked article.

The epiphany we had was that whilst machines do access the API, the developer is always the customer and user. Everything we do should help the developer, and if we have to break rules to help them... then largely we should.

I've built a couple of very pure REST APIs in the past, but had a lot of developers pushing back and demanding something simpler. To not use media types so precisely, to be more accepting of what data is sent, to provide meta-data along with the resource (most seem to prefer an envelope), to prefer composite resources over very decoupled interfaces, etc.
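To make the envelope idea concrete, here's a toy sketch in Python (the field names "status", "meta" and "data" are invented for illustration, not from any particular API):

```python
def envelope(data, status="ok", **meta):
    """Wrap a resource in a response envelope carrying metadata.

    Field names ("status", "meta", "data") are illustrative only.
    """
    return {"status": status, "meta": meta, "data": data}

# A list response carrying pagination metadata alongside the resource:
body = envelope([{"id": 1}, {"id": 2}], total=42, offset=0, limit=2)
# body["data"] is the resource itself; body["meta"] holds the pagination info
```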

This time I haven't even tried to build a pure REST API; I've just mixed together the bits that developers I've spoken to liked and preferred, adjusting as I went depending on how it was received.

http://microcosm-cc.github.io/

That's the docs for it, and we get the arguments out the way right at the beginning. All we're trying to do is build an API that helps developers get their task done. We're not done, and I know it's not pure anything... but the feedback we're getting is far more positive than any pure REST API I've ever built.

✝ If you are willing to give out a fake email address then this free eBook is a great resource and has a lot of sane information presented clearly: http://pages.apigee.com/web-api-design-ebook.html


    The epiphany we had was that whilst machines do access
    the API, the developer is always the customer and user.
    Everything we do should help the developer, and if we
    have to break rules to help them... then largely we
    should.

Thank you, this articulates my disagreement with the idea of hashing all URLs so that client developers are forced to follow links returned by the API instead of generating their own URLs.

http://blog.ploeh.dk/2013/05/01/rest-lesson-learned-avoid-ha...


There's nothing really wrong with RPC-like APIs. They're often much simpler to use in modern web development, for developers with access to documentation.

Hyperlinking has a few benefits, the biggest being discoverability without the need for browsing some documentation that explains how to build URLs, but it's not always the best possible solution for every application -- and it typically leads to chatty applications.

The only thing that bothers me, though, is when these RPC APIs are called "RESTful" just because they use HTTP verbs correctly.


This speaks to something that has baffled me for a long time. If we have to bend/break the rules of REST to make things usable in the real world (and I believe we do), why have REST as an ideal in the first place?

Why work halfway towards A, when we could define a more realistic B and implement it fully? We spend too much time justifying which parts of the holy book to ignore.


You have different objectives than a purely RESTful system does. If you had REST's objectives in mind, then the architectural constraints would be sensible.


But who has truly RESTful objectives in mind? Are there any widely used REST APIs that truly adhere? I acknowledge that there may be. It's just the ones I come across always make significant concessions. Sometimes people tell me that "the web as a whole" or RSS are examples, but those seem too fundamentally different from any API I might create.


The GitHub API is pretty damn close these days, and I can tell you that there's a very big company you've heard of with a team in the double digits working on a HAL-based hypermedia API right now. Twilio has been pretty good too.

There are lots of private APIs that operate this way; for example, much of Comcast's internal stuff is pure, hypermedia driven REST. But it's not open source, so you don't hear about it.

A YC-funded company, Balanced Payments, does an excellent job as well.

> those seem too fundamentally different from any API I might create.

Right! That's because you're primarily thinking of RPC styles, so of course it will seem foreign. Try this sentence on for size, from a different time period:

"But who has truly object oriented objectives in mind? Some people tell me that Smalltalk or C++ are examples, but those seem too fundamentally different from any code I might write."

That's not to say RPC is a bad thing: oftentimes it's just fine. But if you have the problems REST is designed to solve, REST will solve them much better.


The other benefit of defining the endpoint in the resource is that you can send the request somewhere else. Sharding your API, so to speak.


The apigee doc is a great resource. The PDF is also indexed directly via google if you don't want to sign up: info.apigee.com/Portals/62317/docs/web%20api.pdf


I don't think what the developers are asking for is inherently not RESTful, as long as there is already a strict REST implementation. Adding composite resources in addition to the decoupled ones seems like an intuitive way to improve the usability of the API and the performance of the application (so there aren't so many API calls for common procedures).


"the developer is always the customer and user"

My own criterion is that RESTful interfaces should be easy to use from the command line using curl (or any other similar tool) - not that this is the main way an interface will be used, but it helps a lot with exploration and troubleshooting.


This is a well-written and informative article, kudos.

That said, it reminds me how fucking overcomplicated REST HTTP API is for 99% of uses. As an API user, all I want is to call a function on a server, pass it some arguments and get a result. I want it to be dead simple, and REST is probably the opposite of that.

Finally, it also occurs to me that most API calls may call one function which returns lots of data that I don't need. Specifying the data types and field names I want in the query would simplify parsing and potentially reduce bandwidth use (if you've ever seen an API call that returns a user's profile when all you wanted was their last login timestamp, you know what I mean). edit Whoops, didn't see he mentioned the field limiter... why don't more people do that?
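The field limiter can be sketched in a few lines. Here's a toy server-side version, assuming a comma-separated ?fields= query parameter (the parameter name and shape are just one common convention):

```python
def limit_fields(resource, fields_param):
    """Return only the requested fields of a resource dict.

    fields_param is the raw value of a ?fields=a,b,c query parameter;
    None or an empty string means "return everything".
    """
    if not fields_param:
        return resource
    wanted = {f.strip() for f in fields_param.split(",")}
    return {k: v for k, v in resource.items() if k in wanted}

profile = {"id": 7, "name": "Ada", "bio": "...", "last_login": "2013-06-04T12:00:00Z"}
limit_fields(profile, "last_login")  # -> {"last_login": "2013-06-04T12:00:00Z"}
```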


REST only looks complicated because you're seeing it with RPC goggles. In REST, you don't call functions, you just ask for documents and send documents to the server(s). There's nothing particularly complicated in it, it's just a different approach.

That said, this article isn't particularly faithful to REST.


Sure, but peterwwillis was likely making the point that REST sometimes just gets in the way. REST itself is not complicated, but trying to make it work where it shouldn't can complicate what you're trying to do. Not all APIs can be modeled well with the "resource" or "document" concept. A lot of times all you really want is to ping an endpoint and get/post some data.


If you just want to get/post some data, why are you modeling the system using functions (computations)? If anything, a document oriented approach is much more suited for that use case.


Sorry. I shouldn't have over simplified that point about it being just data. I meant to emphasize that you really want to just call an operation and "just do some work". The data being just function parameters with optional data returned.

I'm currently working on a few large B2B APIs and it's been difficult to implement REST. The value of these APIs comes from the actual work performed and not just updating the state of a few rows in a DB. I have very "business logic" heavy endpoints which take many parameters and return very different payloads. As much as I hate it, sometimes business and practicality come before purity. :)

I actually asked for some feedback in a comment below buried somewhere: https://news.ycombinator.com/item?id=5819821


Oh no, I certainly don't think there's anything wrong or impure about not implementing REST. REST is an architectural style that, by imposing some constraints, gives you certain benefits (Fielding's paper talks about this). It's not a panacea, and it's certainly not right for every case.

Just don't make an RPC API and call it RESTful ;)


I fully agree. Fielding warns in his dissertation about design by buzzwords himself.

Almost all web APIs out there aren't REST APIs because they don't respect the "driven by hypermedia" constraint which makes sense because this constraint was put in place for human users driving applications through web browsers, not for machines.

I've started to formalize a new "Web API" architecture style that takes the best of REST but leaves the requirements and constraints that don't make sense. See this blog post: http://blog.restlet.com/2013/05/02/how-much-rest-should-your...


Maybe HTTP is simply missing the INVOKE verb for executable documents.


If you want to retrieve some data based on some parameters, you can just GET it and pass some query strings. You shouldn't need to care if the data comes from an executable file (which it usually does anyway) or someone typing it in on a terminal.


That's already POST, basically.

Furthermore, if you really wanted to, HTTP does allow new verbs to be added, so...


Sounds good to me :)


I guess the way I think of REST is that you make requests of different kinds to resources, some of the responses containing documents that contain references to other resources - and all you need to start is the "root" document for the service and an understanding of what the contents of each document mean.
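A toy sketch of that style of client (an in-memory dict stands in for HTTP GETs, and the link-relation names "posts" and "latest" are invented):

```python
# Hypothetical documents keyed by URL; a real client would GET each one
# over HTTP instead of looking it up here.
DOCS = {
    "/": {"links": {"posts": "/posts"}},
    "/posts": {"links": {"latest": "/posts/42"}},
    "/posts/42": {"title": "Hello", "links": {}},
}

def follow(start, *rels):
    """Walk link relations from a root document, never constructing URLs."""
    doc = DOCS[start]
    for rel in rels:
        doc = DOCS[doc["links"][rel]]
    return doc

follow("/", "posts", "latest")  # -> {"title": "Hello", "links": {}}
```

The point is that only "/" is hard-coded; everything else is discovered from the documents, so the server can move "/posts/42" without breaking this client.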


It seems more complicated to me to define a document format that provides URLs, than to just tell people what the URLs are.

In my mind I think of some things, e.g. Google search, as a function, not a resource. Trying to think about what the resource might be seems tangential to what I am trying to do.


> As an API user, all I want is to call a function on a server, pass it some arguments and get a result. I want it to be dead simple, and REST is probably the opposite of that.

How is this not dead simple in REST? Can you provide a specific example and how REST makes it complicated? Because I don't see it.


I think you are confusing the qualities of a good public API and a good internal/private one (for application architecture purposes only.) For internal, I believe REST is the way to go, no contest. Externally is a different story. You would typically only expose a subset of functionality, use different authentication schemes and use a format that can be more easily accessed across domains like JSONP.


The problem with limit is the programming overhead in the frontend and backend.

If you want a performance advantage, you must parse the limit params and select only the needed fields from the DB before sending them to the client. But this makes the backend complicated.

If something is misspelled (frontend/backend), you need more time to find the error.

It is easier and cleaner to provide some additional calls for specific data (like lastLogin).

If you use something like Backbone on the frontend, you're happy to have all the data so easily accessible; why restrict it?

OK, if you have a heavily used, performance-critical application, you can start optimizing. But why? Welcome to the cloud; just add some processes ;-)


RESTful: almost correct usage of HTTP verbs to implement something that has nothing to do with REST. But it sure sounds nice.


The obligatory "you're doing it wrong" comment following virtually all REST articles.


Once people stop calling it the wrong thing, the discussion can be about something else.

I also hate this nitpicking, but it's clearly not going away so you're better off not inviting it by using the term incorrectly.


It's why I've come to prefer the term "RESTish". It probably still isn't enough to mollify hard-core purists but indicates that, e.g. the user knows versioning ought to be in the accept header rather than the URL, but also that hardly anyone either creating or using web APIs cares.


I have always considered "RESTful" to be analogous to your "RESTish" meaning.

If I am 100% REST then I would say "I have a REST API". If, like most companies, I am not following all the REST conventions then I would say "RESTful" API.

The moaning about "RESTful has been hijacked by people who don't know REST" by the REST purists always struck me as strange when they could have made that simple distinction.


> If I am 100% REST then I would say "I have a REST API". If, like most companies, I am not following all the REST conventions then I would say "RESTful" api.

If you aren't following REST conventions, it's probably better to say "HTTP API" and not make any claims at all related to REST (except, perhaps, negative ones like "non-REST".)


The suffix -ful literally means "as much as will fill", e.g. spoonful, and usually takes the broader meaning "characterized by", e.g. careful. As such, I've always regarded "RESTful" as a full REST implementation, and it seems most other people who have an opinion on the matter do too.

I borrowed RESTish from Dan Savage's concept of "monogamish", meaning a relationship that conforms generally but not rigidly and precisely to the norms of monogamy. As such, a RESTish API conforms generally but not rigidly and precisely to the norms of REST.


Question: in all discussion about API design, the hairiest to me is always authentication.

The article recommends SSL, but the internet says that "SSL is slow." Is there a guide to using SSL correctly, and techniques for making this more efficient? An SSL primer?

It also recommends using oauth. There are hundreds of libraries for consuming oauth APIs. What exists (I'm a Python+Flask guy, but really any help would be great) for implementing oauth authentication for my own API?


SSL is not slow.

Grab the latest version of Nginx, turn on SPDY, enable SSL session cache.

You must always serve your API over SSL, as auth information is going to be in headers or the querystring and both would be readable by a MITM if you do not use SSL.

Nginx 1.4 statements to pay attention to (sample config):

    ssl                       on;
    ssl_certificate           /etc/ssl/domain.crt;
    ssl_certificate_key       /etc/ssl/domain.key;
    ssl_session_cache         shared:SSL:10m;
    ssl_session_timeout       10m;
    ssl_ciphers               RC4:HIGH:!aNULL:!MD5;
    ssl_prefer_server_ciphers on;
    ssl_stapling              on;
    spdy_headers_comp         1;
If you're not using Nginx, why not? Just use it as a reverse proxy and drop it in front of whatever you are using.


While we're dispelling SSL myths, let me add the reminder that SSL does encrypt the URL and querystring and HTTP headers. It doesn't look like it in browsers because they still show the URL cleartext onscreen, but over the wire all of that is indeed inside the encryption envelope. Only the destination IP address isn't encrypted.


And whilst we're still here... SNI and SSL.

Thanks to Internet Explorer, and the early versions of the stock browser on Android, you will need a unique IPv4 address for your SSL endpoint.

You cannot safely serve multiple SSL sites on the same IP.✝

Basically: the hostname is only sent inside the encrypted stream, so without SNI the server can't know which certificate to present, and SSL requests from those browsers require a unique IPv4 address. Your provider will give you one if you say the magic word "SSL" to them.

✝There are ways for most browsers, but not for IE and early Android browsers. Thankfully mobile device churn will cure us of the Android issue, but the affected IE versions will take longer to die.


Internet Explorer has supported SNI since IE7. However, Windows XP doesn't, even if running IE7 or newer.

That said, we're on our way to a bright SNI filled future, we just have to get over the hump of old versions of Windows and some old mobile devices before it's common enough to be used reliably.


Huh, TIL. What was the reasoning behind this?


When a client connects to a server, it first does the SSL handshake. Then it sends the HTTP headers.

As a result, SSL really has nothing to do with HTTP and can be used to wrap other protocols. Check out stunnel ( https://www.stunnel.org/index.html ), which can be used to encrypt communications for any TCP-based protocol.


And one of the first parts of the SSL handshake is for the server to send the certificate to the client, so the server has to know which certificate to send based only on the IP address that the client is connecting to. The client then verifies that the certificate matches the host name it's trying to connect to. Then it sends the Host: header with the request.

It's a strict layering: TCP - SSL ("Secure Socket Layer", right?) - HTTP

There are two (and a half) ways of using an SSL certificate for multiple hostnames on the same IP address / interface: SNI (Server Name Indication) and Subject Alt Names or wildcard certificates. SNI extends the SSL protocol to send the hostname during the client handshake. The Subject Alt Name extension, which has been more reliable and available for me, adds multiple hostnames to the certificate for the client to match against. Wildcards do the same thing with a pattern: *.example.com.
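A rough sketch of the client's matching step (heavily simplified; real certificate validation has more rules, but the one shown here is the common one: a wildcard covers only a single left-most label):

```python
def name_matches(pattern, hostname):
    """Match one certificate name against a hostname (simplified).

    A leading "*." wildcard is honored only as the entire left-most
    label, so "*.example.com" matches "www.example.com" but neither
    "example.com" nor "a.b.example.com".
    """
    if pattern.startswith("*."):
        labels = hostname.split(".")
        return len(labels) > 1 and ".".join(labels[1:]) == pattern[2:].lower()
    return pattern.lower() == hostname.lower()

def cert_matches(cert_names, hostname):
    """True if any Subject Alt Name (or the CN) matches the hostname."""
    return any(name_matches(p, hostname.lower()) for p in cert_names)

cert_matches(["example.com", "*.example.com"], "www.example.com")  # -> True
```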


Who knows. I prefer to focus on what needs to be done to make things work.


Just to add: make sure your applications handle SSL certificates properly, and only trust either verified signed certificates or only the known certificate from the API server. Otherwise MITM could still expose everything.


Something to consider: the author advocates JSON responses across the board, which is pretty good advice; however, it's incompatible with `204 No Content` responses (also recommended). If you're building a client that connects to an API that could potentially return a `204 No Content` response, you can't assume that there will be JSON in the body and automatically parse the response.


I don't see the problem here. You get a 204, you don't expect content. If your consumer hits a URL, ignores status codes, and expects content, You're Doing It Wrong.

To me, it looks like the HTTP equivalent of a C function that returns NULL.
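In client code the check is one line. A sketch, using a bare (status, body) pair rather than any real HTTP library's response object:

```python
import json

def parse_body(status, body):
    """Parse a JSON response body, treating 204 as "no content".

    Returns None for 204 (the HTTP analogue of returning NULL);
    anything else is parsed as JSON.
    """
    if status == 204:
        return None
    return json.loads(body)

parse_body(204, "")           # -> None
parse_body(200, '{"id": 1}')  # -> {"id": 1}
```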


Agreed. Just throwing it out there since I've encountered it before, more than once, where others assume there will always be a response, regardless of status code.


I've been spending many weeks reading articles and books about building a RESTful API. There are a lot of 'ivory tower' guidance articles and books on REST that make little sense in the trenches.

That article sums up exactly my own distillation into practical terms of all that information out there. I wish I had read it first.


Look, REST wasn't something invented in Fielding's study; it was a distillation of an architectural style that was already extremely common: websites. Many (and, at the time, most) are stateless, follow HATEOAS, use only a standardized set of methods, etc.

There's nothing difficult or impractical about REST, and the proof is that we use it every day.

Now, it's not applicable to every case, of course. In those cases, just use something else, and don't call it REST.


I am not sure I follow his point about why HATEOAS is not practical, but I know that I have been able to make it work in my own REST APIs using content types. I only return JSON if the Accept header specifies "application/json". (Which is probably what you should be doing anyhow.) I usually also allow an HTML fragment response for the "text/html-fragment" Accept type.

The default response type (or if "text/html" is explicitly requested in the Accept header) is the HTML fragment data wrapped in document markup, with any forms necessary to expose all functionality in the browser (without JavaScript, but just enough CSS to make it pleasant). It takes more work, but not only is it the best documentation, you can also point the href property of JavaScript click-handled hyperlinks right at the REST URIs for proper fallback.


HATEOAS is first and foremost about the API being self describing.

Quoting from wikipedia: "A REST client enters a REST application through a simple fixed URL. All future actions the client may take are discovered within resource representations returned from the server."


I would disagree and say it is first and foremost about using hypermedia as the engine of application state.


Does your JSON response at the root entry point to the API return something signifying what all possible operations are?

Does the JSON representation of an Employee object return URLs as part of the body, giving the resource addresses of any "child" objects?

I don't think it can be considered HATEOAS if the answer to either is no.


>Does your JSON response at the root entry point to the API return something signifying what all possible operations are?

Why, no, it doesn't, because it takes 3 parameters, each of which can take 10,000 possible values, and I don't want to transmit a trillion options every time someone pings the root.

I mean, I considered documenting how to pass the parameters on a client's first entry but the HATEOAS crowd told me I was just re-creating RPC, and I agreed. So no HATEOAS.


Of course not - it is called HYPERMEDIA as the engine of application state (not JSON). That's why it's not called JATEOAS. The information you are asking for sounds more appropriate in an OPTIONS request.


> Of course not - it is called HYPERMEDIA as the engine of application state (not JSON.)

If the answer to those questions is no (whether the format of the response is JSON or anything else), then you aren't using hypermedia as the engine of application state.


Content types and HATEOAS are orthogonal properties, I'm not sure how you made the latter work using the former?


> Content types and HATEOAS are orthogonal properties, I'm not sure how you made the latter work using the former?

They aren't orthogonal. Content-types are central to HATEOAS:

From one of the key descriptions [1] of the HATEOAS constraint on REST:

A REST API should spend almost all of its descriptive effort in defining the media type(s) used for representing resources and driving application state, or in defining extended relation names and/or hypertext-enabled mark-up for existing standard media types. Any effort spent describing what methods to use on what URIs of interest should be entirely defined within the scope of the processing rules for a media type (and, in most cases, already defined by existing media types). [Failure here implies that out-of-band information is driving interaction instead of hypertext.]

[1] http://roy.gbiv.com/untangled/2008/rest-apis-must-be-hyperte...


That's media types. I was referring to the strings used in content-type, the HTTP header. You obviously need to support some media type, otherwise how would you represent a resource? But you don't need to support the Content-Type header to have HATEOAS.


> That's media types. I was referring to content-type, the HTTP header. You obviously need to support some media type, otherwise how would you represent a resource? But you don't need to support the Content-Type header to have HATEOAS.

If you are doing a REST architecture with a protocol other than HTTP, sure.

But since the Content-Type header is the mechanism by which HTTP communicates media types, and since in-band, rather than out-of-band, communication of resource locations and media types is essential to HATEOAS, the Content-Type header is a pretty important mechanism in HATEOAS when using HTTP.


  file document
  document: HTML document, UTF-8 Unicode text
You can distinguish between media types without using the Content-Type header or out-of-band communication. In fact, there's a (non-standard) header for preventing some browsers from replacing the Content-Type value with their own guess.

It's obviously useful, especially for distinguishing between similar media types (e.g. JSON document with different schemas), but not necessary for HATEOAS.


Agreed, I am just trying to guess where the author is confused and reply in a useful manner. I am guessing that too many so-called "REST APIs" return JSON response bodies without explicitly requesting them in the Accept request header. By simply using the Accept header how it is meant to be used, you can return HATEOAS, JSON, XML or whatever format you want specifically designed for the target client.


What does it mean to "return HATEOAS"? HATEOAS is an architectural constraint, not a format.


True. What I meant was a document that exposes all functionality of the network resource using hypermedia. I did not think that all responses had to follow HATEOAS. Some responses should be OK returning partial data or limited functionality, as long as something discoverable is explicit and comprehensive.


RESTful design is still one of the tricky problems I meet on a daily basis. For example, I have blog application in which each blog entry can have multiple versions in its revision history. What should the restful api look like to provide the ability to revert to a certain version?


Off the top of my head—have a BlogPost and a BlogPostVersion resource. BlogPostVersion has all the content, and BlogPost simply has a canonical URL and a link to a BlogPostVersion.

You could then PATCH the BlogPost with the link to whatever BlogPostVersion you want to update to.

Curious to see what others would recommend.


Indeed. Creating a new version of your post should be a POST to /posts/X/versions. This would create something like /posts/X/versions/U-U-I-D.

Since REST is all about mapping to the underlying semantics of HTTP you'd then want to make /posts/X redirect to /posts/X/versions/U-U-I-D

Since there's nothing wrong with updating your resource under the hood (think of e.g. http://www.weather.com/weather/right-now/) posts/X would simply always redirect to the latest version.

If you don't always use the latest version by definition, then you'd probably do a PATCH with the new version id to /posts/X, or a PATCH with 'active: true' to /posts/X/version/O-L-D.


If I understand your solution correctly, does it mean that it's now the client's obligation to get the content from the revert-to version, create a new version, and POST it to /posts/X/versions? That should work, but what if I don't want to give the client the ability to create versions arbitrarily (only allow reverting to a pre-existing version)?


My first solution was assuming you could not revert. If you want to allow revert then the client would first call /posts/X/versions, get a list of all versions and then either do

    PATCH /posts/X
    { "version": "older-revision" }
or

    PATCH /posts/X/versions/older-revision
    { "active": true }
Access control is completely orthogonal to this; so for your sample case you would just return a 403 for any other calls (like e.g. POSTs to /posts/X/versions)


Thanks for the reply, but maybe I didn't make our requirements clear. The history should be kept intact after the revert - a new version should be created that duplicates the reverted version and becomes the current version. Thoughts?


Um, how about POST /blog/id/ and in the request body revert=version#

Seriously REST isn't a mystery. I think the problem is few understand what it is. Here is my 30-second version:

1. Identification of resources and manipulation through representations. This means a network resource should have a URL that is the same no matter what you are doing to it - getting, changing, removing, modifying or any custom manipulation. For the Web, use HTTP verbs and request body data to define the actions.

2. Self-descriptive error messages. Don't return 200 OK for everything. Use status codes and verbose response bodies to describe what happened.

3. Hypermedia as the engine of application state. Expose ALL functionality in HTML using hyperlinks and forms.
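To make that revert concrete, here's a toy in-memory sketch (class and method names invented) where reverting duplicates the old content as a brand-new version, so the history stays intact:

```python
class Post:
    """Toy post whose history is an append-only list of versions."""

    def __init__(self, content):
        self.versions = []  # list of (version_id, content), oldest first
        self._next_id = 1
        self.add_version(content)

    def add_version(self, content):
        vid = self._next_id
        self._next_id += 1
        self.versions.append((vid, content))
        return vid

    def revert(self, version_id):
        """Duplicate an old version as a brand-new current version."""
        content = dict(self.versions)[version_id]
        return self.add_version(content)

post = Post("draft 1")
post.add_version("draft 2")
post.revert(1)        # copies "draft 1" into a new version 3
len(post.versions)    # -> 3; nothing was deleted
```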


I am not sure how RESTful that looks to me. I would like to route to the action from the url rather than having to read what's in the body. In your solution, I would need that body parser to differentiate this revert action from an update action.


Irrespective of what is good or bad design, defining an action in the URL goes against the most critical REST principles. Design it whatever way you like, but if you do this, it is not RESTful and you should not refer to it as such.


I think what you have said is correct of how, but not why.

However, I think what this whole discussion is missing is the properties you derive by adhering to a REST design.

I think there should be some tests of a design to see if you are actually getting the benefits of REST.

For example, in a REST design you can rearrange the internal URLs on a server and the site is still usable to clients, because they are following links from the root.

Why do we want this ability to rearrange URLs?


The concept of putting URLs in the document is not just so that you can re-arrange them, it's so that you can use HATEOAS.

Example: think of posts here on HN. One thing a specific media format would include is a "reply" link, but on hellbanned posts that link would be absent, so that state would be inaccessible to clients.

Or say you've used comments on your blog, so each post has a link to the list of comments about it. Now you switch to Disqus, and so you could change the URL to point to their comments pages instead, and a decent client would use it transparently (assuming good media types).

All of this is taken for granted and used a lot on the real RESTful space: the HTML Web.


RESTful interfaces use HTTP verbs (actions), and the urls contain nouns (things)


All the blogs and howtos describe only how API access changes with new versions, not how to handle different versions of the backend data.

One way is to use the newest version in the client and have the server convert old data to the new format before delivering it. But sometimes that's not so easy, depending on the complexity of your data.

The other way: convert all data to the new version and change the data access in the old API version (but here you hit the problem of "never change a running system"; things that worked before could go wrong). And if you have a huge site with many users, interrupting the service is not possible.

Maintaining many versions is OK for a short period, but not practicable for longer.

Does anybody have experience with that?


Great article. I'm actually in the middle of building out a new API. I've built many RESTful APIs but I'm starting to rethink a couple of things with this new one. Does anyone have any good resources on when it's NOT appropriate to use REST? Or is the assumption that it should generally work for anything if you model it right?

I ask this because the API I'm building is for a B2B product and a lot of the "actions" are not state change requests. In fact, many of them are verbs which fire off lots of business logic and don't map well to a single entity. Some endpoints also need to return very large and deeply filled entities in a single call.

I've started to investigate JSON-RPC, is that a good option?


> Or is the assumption that it should generally work for anything if you model it right?

It should generally work for anything if you model it right.

> I ask this because the API I'm building is for a B2B product and lot of the "actions" are not state change requests.

How can anything both be an action and not be a state change request?

> In fact, they are a lot of verbs which fire off lots of business logic and don't really map well to a single entity.

A "verb that fires off lots of business logic" sounds like an RPC-style metaphor.

In a REST architecture -- and they don't necessarily map perfectly, so with more description I might characterize this differently -- I'd characterize that as most likely an entity creation (HTTP POST) action (the entity being a particular invocation of the underlying logic, containing all the necessary parameters).
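A toy sketch of that modeling (all names invented): the "verb" becomes a POST that creates an invocation resource recording the parameters and the result:

```python
import uuid

# In-memory store of "invocation" resources; each records one run of the
# business logic with its parameters and result. All names are invented.
INVOCATIONS = {}

def run_report(params):
    """Stand-in for the heavy business logic."""
    return {"rows": len(params.get("accounts", []))}

def create_invocation(params):
    """Model the verb "run a report" as POST /report-runs -> 201 Created.

    The invocation itself becomes an entity that can later be GET'd,
    listed, or audited.
    """
    rid = str(uuid.uuid4())
    INVOCATIONS[rid] = {"params": params, "result": run_report(params)}
    return 201, "/report-runs/" + rid, INVOCATIONS[rid]

status, location, body = create_invocation({"accounts": ["a", "b"]})
# status == 201 and body["result"] == {"rows": 2}
```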

> Some endpoints also need to return very large and deeply filled entities in a single call.

How does this conflict with REST? REST has nothing against "large and deeply filled entities". (Remember that HTTP is itself a RESTful API, and obviously is designed for a use case where "large and deeply-filled entities" are frequently returned.)

You may want to define specific media types for each of these types of entities to do REST properly, but since in practice you are going to have to define the structure of the entity returned no matter what application architectural style you are using, this isn't really a substantial extra workload for REST.


> It should generally work for anything if you model it right.

No, this is not true. Each constraint of REST comes with drawbacks, and if you can't afford those drawbacks, you can't do it RESTfully. Fielding's thesis is very upfront about this.

The biggie: latency. If you need sub-10ms responses, REST is the wrong way to go about modelling your problem domain.

The second: client-server. If you want the server to initiate behavior on the client, REST is the wrong way to go. See the wealth of WebSockets/Meteor/Real Time Web (tm) frameworks and their hype for examples of when you'd want to do this.


First, I think that you misunderstood what I meant by "anything" -- I meant any logical model of what an API does, which seemed in context to be what the post I was responding to was asking about. I wasn't saying "anything" in the sense of any combination of performance requirements.

That being said, I'm not convinced that your particular objections, aside from responding to "anything" in a different sense than intended in context, are really accurate.

> The biggie: latency. If you need sub-10ms responses, REST is the wrong way to go about modelling your problem domain.

HTTP might be problematic here, but I don't see why REST is problematic. (REST doesn't rely on HTTP -- in fact, HTTP is itself a REST-based system -- and can be implemented over protocols with different performance characteristics.)

It's obviously a problem for every request to navigate from the API root if you have tight latency constraints, but there is nothing unRESTful about having a client cache the locations of the key resources it is interested in after the first access. In fact, reducing latency by encouraging cacheability is an explicitly-cited motivation for REST.

> The second: client-server. If you want the server to initiate behavior on the client, REST is the wrong way to go.

If you want system A to initiate a behavior on system B, then in the context of REST with regard to the behavior at issue, A is a client consuming an API and B is a server providing an API. If it is necessary for other reasons for A to be an HTTP server and B to be the HTTP client, then you obviously aren't going to be doing typical REST-over-HTTP to implement the API that A is consuming and B is providing. But there is no reason that you can't use a REST architecture for the API. (OTOH, since, in simple cases, the API implementation will likely be provided as Code-on-Demand to B by A, there may not be a lot of reason to use REST, but it could help reduce coupling between different components on A.)


> I meant any logical model of what an API does,

Gotcha. You're right that I got this wrong, but I think my objection still stands; a peer-to-peer interaction model is still not RESTful.

> It's obviously a problem for every request to navigate from the API root if you have tight latency constraints,

Even in truly RESTful systems, 'every request' wouldn't navigate from the root; the first interaction starts there, but it's not like you keep going back to the root every single time you want to do anything.

> In fact, reducing latency by encouraging cacheability is an explicitly-cited motivation for REST.

Absolutely. I was thinking of the 'layered system' constraint. From http://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch... :

> The primary disadvantage of layered systems is that they add overhead and latency to the processing of data, reducing user-perceived performance.

You are right that caching helps balance this out, but not everything can be cached; for example, a first-person shooter game would be hard to cache.

> A is a client consuming an API and B is a server providing an API.

Even if it's 'oh they're just two different APIs working together', it's not a singular API, which is what we're talking about here.


On latency and the layered system constraint you have a point.

> Even if it's 'oh they're just two different APIs working together', it's not a singular API, which is what we're talking about here.

I thought we were talking about the utility of REST architecture for the API(s) involved. Obviously, if you have requirements which require two different APIs where the consumers of one API are the providers of the other API, then regardless of architecture, it won't be one API, but that's orthogonal to the architecture appropriate to either or both APIs.


I can think of an example:

Send a message to the server to process all approved cases, which has no connection to an individual resource. The client has no fundamental knowledge of all server-side resources that may or may not be affected, and may not even be allowed that information.

It's an action, but it's not really a post. You're not creating a new resource. You're not patching anything, you're not really getting anything... It's closest to a PUT, but you're not really updating a particular resource...

This may not be a document-based API like REST expects, but it is a fairly common enterprise requirement for a system.


> Send a message to the server to process all approved cases, which has no connection to an individual resource.

"Individual resources" are defined by the needs of the API. If you need an endpoint that can be given a command to process all approved cases, then that is an "individual resource".

The particular kind of resource I'd normally model it as is one which is or has a collection resource in which individual command instances are the members of the collection.

> It's an action, but it's not really a post.

I disagree. Submitting a new request to initiate the action is exactly a request to create a new command resource subordinate to the collection of commands subordinate to the command processing endpoint resource, which naturally maps to an HTTP POST action to the collection. The processing of approved cases, and the resulting changes to the backend data store, are consequences (side effects) of the creation of that resource.

> This may not be a document-based API like REST expects

REST doesn't expect a "document-based API". It expects a resource based API. Commands, collections of commands, and endpoints which have collections of commands as well as other subordinate resources are all, themselves, valid resources, whether or not they are sensibly described as "documents".
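The command-as-a-resource idea can be sketched in a few lines. Everything here is hypothetical and purely for illustration (the in-memory store, the URL, the field names): a POST to a commands collection creates a command resource, and processing the approved cases is the side effect of that creation.

```python
import itertools

# In-memory stand-ins for the backend store and the command collection.
cases = [{"id": 1, "approved": True, "processed": False},
         {"id": 2, "approved": False, "processed": False}]
commands = {}
_next_id = itertools.count(1)

def post_command(collection_url="/process-approved-cases/commands"):
    """Handle POST to the command collection: create a command resource
    whose side effect is running the business logic over approved cases."""
    cmd_id = next(_next_id)
    processed = [c["id"] for c in cases if c["approved"]]
    for c in cases:
        if c["approved"]:
            c["processed"] = True
    command = {
        "href": f"{collection_url}/{cmd_id}",
        "status": "completed",
        "processed_case_ids": processed,
    }
    commands[cmd_id] = command
    return 201, command  # 201 Created, with the new command resource

status, cmd = post_command()
```

The client never needs to know which cases exist; it only learns, via the returned command resource, what that particular invocation did.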


"REST doesn't expect a "document-based API". It expects a resource based API. Commands, collections of commands, and endpoints which have collections of commands as well as other subordinate resources are all, themselves, valid resources, whether or not they are sensibly described as "documents"."

Ah, well said. It's so hard to find good examples of this though. Most REST tutorials focus on simple nouns that happen to map nicely to tables. But you're suggesting that "resources" could be far more abstract. But, if I were to treat Commands as resource and perhaps make it my only resource, isn't that essentially RPC?


> But, if I were to treat Commands as resource and perhaps make it my only resource, isn't that essentially RPC?

If you have a root URL for the API, and all the endpoints are located via links from the document at the root URL, and submitting commands gives back a results resource that either is or provides a URL for the output, and all the different resources have media types that define what is needed to understand/process them without requiring out-of-band information beyond that describing the media types and the root URL of the API, then it can still be REST.

I think it's probably fairly common that there are situations where the "active" side of a REST API will largely look that way, even if there is a read-only component that looks like the collection-resources-as-tables, individual-resources-as-table-rows business data view.

That being said, it's probably not really good REST if things being modeled as abstract commands with side effects on multiple entities really could be modeled as changes to some particular business entity that also had side effects on other business entities. But whether that applies to the commands you are using will depend on your use case.


Great article. I wish you would have written this a year ago when I needed this info. I had to learn all of this stuff the hard way. I'm definitely going to use this as a first resource to give to people that I know who are building restful APIs.

I'm slightly a bigger proponent of HATEOAS, but if you don't need it yet, I think you can always add it in later. I've been the giver and receiver of bad things when it's not followed, but that is generally around massive projects.


I prefer providing a "Range" header for pagination. It's typically used for retrieving byte-range chunks of large objects, but it's also applicable to return a subset of a result set, like "Range: records=0-10". This has the drawback of not being easily applied from within a graphical browser, but I don't consider that a big priority for a REST API in the first place. Viva la curl!
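As a sketch, a server-side parser for such a custom range unit might look like this (the `records=` unit is this commenter's convention, not a registered HTTP range unit):

```python
def parse_record_range(header_value):
    """Parse a custom 'Range: records=start-end' value into (start, end).

    Returns None for anything that doesn't match the expected shape,
    so the server can fall back to a default page size.
    """
    prefix = "records="
    if not header_value.startswith(prefix):
        return None
    start_s, sep, end_s = header_value[len(prefix):].partition("-")
    if sep != "-" or not (start_s.isdigit() and end_s.isdigit()):
        return None
    start, end = int(start_s), int(end_s)
    return (start, end) if start <= end else None

print(parse_record_range("records=0-10"))  # (0, 10)
print(parse_record_range("bytes=0-10"))    # None: not our unit
```

The matching response would carry a `Content-Range`-style header so the client knows the slice and total it actually got.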


That's great as long as you're using something with efficient random-access (SQL). If your datastore/back-end is btree-based, however (say, CouchDB for example, or Google search results) you're better off with 'next', 'prev', 'first', etc. pagination. Asking for the 50,000th record means skipping the first 49,999! So, you may be painting your back-end into a corner by counting on random-access being an efficient operation.
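A cursor-style pager can be sketched like this (hypothetical helper; a real B-tree store would seek to the key directly rather than bisect an in-memory list):

```python
from bisect import bisect_right

def page_after(sorted_ids, after=None, limit=3):
    """Return one page of ids after the cursor, plus the next cursor.

    Seeks by key rather than skipping N rows, so the 5,000th page
    costs roughly the same as the first.
    """
    start = 0 if after is None else bisect_right(sorted_ids, after)
    page = sorted_ids[start:start + limit]
    next_cursor = page[-1] if len(page) == limit else None
    return page, next_cursor

ids = list(range(100, 120))
page1, cur = page_after(ids)       # first page, no cursor yet
page2, cur = page_after(ids, cur)  # follow the 'next' cursor
```

The API would surface `cur` as an opaque `next` link, which is exactly the 'next'/'prev' style pagination described above.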


The header's ranges-specifier doesn't specify a syntax, so you can use a docid-based pager just as easily.


"A RESTful API should be stateless. This means that request authentication should not depend on cookies or sessions. Instead, each request should come with some sort of authentication credentials."

Doesn't that mean you need to do a database lookup to verify the user with every request? Seems like a lot of overhead just for the sake of avoiding sessions.


Assuming you are using a distributed architecture, there is no way to verify a user without at least one database lookup because the request could be coming into any API server. So in most cases we're not avoiding cookies and sessions just for the sake of it.


You don't need to do a database lookup if you stuff some context into your token and encrypt it with a secret key. When the server receives the request, it can simply decrypt the token and deserialize it into some sort of strongly typed user context.


Doesn't this open you up to replay attacks, though? Since you can't store that a token was already used.


I failed to say that your token context should have a time-based expiration: a new token is reissued periodically, as defined by you and your needs. I would refer to the ASP.NET Forms Auth mechanism with its sliding expiration.


Sure you can. Just include a timestamp, and expire the token, at, say, time + 90 seconds, or whatever makes sense for the application.
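A minimal sketch of such a self-contained, expiring token (the format and secret handling here are illustrative; a real service should use a vetted library and rotate keys):

```python
import base64
import hashlib
import hmac
import time

SECRET = b"server-side-secret"  # illustrative; load from config in practice

def issue_token(user_id, now=None, ttl=90):
    """Return 'base64(payload).hexsig' where payload is 'user:expiry'."""
    expires = int((now if now is not None else time.time()) + ttl)
    payload = f"{user_id}:{expires}".encode()
    sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return base64.urlsafe_b64encode(payload).decode() + "." + sig

def verify_token(token, now=None):
    """Return the user id if signature and expiry check out, else None."""
    try:
        payload_b64, _, sig = token.rpartition(".")
        payload = base64.urlsafe_b64decode(payload_b64.encode())
        user_id, _, expires = payload.decode().rpartition(":")
        expires = int(expires)
    except Exception:
        return None
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return None  # tampered
    if (now if now is not None else time.time()) > expires:
        return None  # expired: this bounds the replay window
    return user_id
```

The expiry is baked into the signed payload, so no server-side session lookup is needed to reject a stale token, which is exactly the trade-off being debated here: the replay window shrinks to the TTL but never to zero.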


I get that you can expire it, and that helps, but it's not the same as use-once. Of course, just using a timeout is probably fine in many cases, especially if it's used with SSL. But replay attacks are still possible since there's a window where it can be re-used.


Replay attacks are always going to be possible unless you use a one-time token or signature; them's the breaks... unless you wish to get into the something-you-have-and-something-you-know model. How can you do a use-once token with concurrent requests without a strong client-side authentication mechanism, such as issuing private keys to clients... and all the PKI admin overhead that entails? I think it's safe to say that a RESTful API should be stateless, and bottlenecks such as session state are not necessary.


I'll take that as a "yes" ;)

AFAIK, neither signatures nor "something you have, something you know" alone fixes replay attacks. Since this is a well-known problem in cryptography, many solutions exist. All of them are probably overkill for this use.


At least with the use of a digital signature and nonce you can guarantee that the request hasn't been tampered with!


Use redis/memcache to avoid constant DB hits.

Alternatively, if a time limited token is practical, use a self-signed expiring token.

One reason to avoid sessions is the security aspect: cookies are attached automatically by the browser, which opens the API up to CSRF.


Use an expiring token like mechanism.

The API user first gets a token using credentials. Future requests use the token for authentication.

A new token will be required periodically.


What are the benefits of doing this instead of just requiring authentication with every request?

Is it because the authentication part is a lot of work for the server or client?

Is it for the negligible (in this context) security benefits of not using the same secret-key for all traffic?


The hashing for the authentication is intentionally computationally slow (thus mitigating brute-force validation). The token issued is basically like a session id: validating it is really just a string compare, so it's much, much faster.
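To illustrate the asymmetry, with PBKDF2 standing in here for whatever slow hash the service actually uses:

```python
import hashlib
import hmac
import os

def hash_password(password, salt=None, iterations=200_000):
    """Deliberately slow: each guess costs an attacker 200k iterations."""
    salt = salt if salt is not None else os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt, digest

def check_password(password, salt, digest, iterations=200_000):
    """Recompute the slow hash and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return hmac.compare_digest(candidate, digest)

def check_session_token(presented, stored):
    """Deliberately fast: just a constant-time string compare."""
    return hmac.compare_digest(presented, stored)
```

The credential check is paid once at token issue time; every subsequent request pays only the cheap token compare.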


How is that fundamentally different from storing sessions in a database (which is common to be able to scale horizontally)?

Whether or not that is a lot of overhead depends also on how much work you were going to do to process the rest of the request. If your API allows sorting and filtering of a large data set, like the article suggests, then the authentication overhead is probably relatively small.


You don't need to do a database lookup if you stuff some context into your token and encrypt it with a secret key.

When the server receives the request, it can simply decrypt the token and deserialize it into some sort of strongly typed user context.


Others have already commented, but also keep in mind that subsequent requests should be pulling from cache. So the overhead is lighter than a full DB request.


Just wanted to say thank you for this. I have been searching for a good article explaining the details of thinking through building an API. This really could not have come at a better time for me! Cheers!


[deleted]


Obviously off-topic, but under what circumstances would you ever be returning a field containing a "password" (even a hashed one) over the network in the first place?


OP has mentioned that 'search' is not a noun and hence can't be modeled as a resource. First of all, 'search' can be used as a noun in the sense of quest/query - 'His search for the Holy Grail was fruitful.' Secondly, we happen to have a multi-model search in our application and we treat it as a resource. This has worked out very well for us and our users can easily craft search queries just like for the rest of the RESTful resources.


REST isn't enough.

To do it right use the HATEOAS constraint. http://en.wikipedia.org/wiki/HATEOAS

And use application/hal+json http://stateless.co/hal_specification.html

The more we follow these simple constraints the more we can start to build tooling to consume any API without having to know much about it ahead of time.
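For example, a minimal hal+json-style resource might look like this (field names and URLs are made up for illustration):

```python
import json

# A minimal hal+json-style resource: '_links' carries the hypermedia,
# ordinary keys carry the resource state.
order = {
    "_links": {
        "self":     {"href": "/orders/523"},
        "customer": {"href": "/customers/42"},
        "next":     {"href": "/orders?page=2"},
    },
    "total": 30.00,
    "currency": "USD",
    "status": "shipped",
}

body = json.dumps(order)
# A HAL-aware client navigates via the link relations ('customer',
# 'next'), never by constructing '/orders/...' URLs itself.
```

That last point is what makes generic tooling possible: the client only needs to understand the media type and the link relations, not any URL structure.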

That said, well written article.


> REST isn't enough.

> To do it right use the HATEOAS constraint.

The HATEOAS constraint is part of REST; if you aren't using it, you aren't doing REST.


Great article, I've read many like it, BUT what I can't find is information pertinent to how one should host their API. I submitted an "Ask HN" (https://news.ycombinator.com/item?id=5820761) earlier, but could some folks please advise me on the hosting side of "Designing a Pragmatic RESTful API"?


This is a great article, and although it may raise the ire of some of the REST purists out there, I completely agree with the pragmatic approach. It prompted me to finally publish some of my thoughts on API design that complement the more technical ones in the article:

https://news.ycombinator.com/item?id=5831253


Before leaving my current place, our Architect built a PHP micro-framework for our APIs that follows 90% of what's outlined here. I never really understood the magnitude of his work until I tried developing a new API in a different framework. Intelligent routing and query param handling all baked in is so rad.


Excellent article. I've just written up REST API design patterns for our group here and it maps very closely to the guidelines you've outlined. Many other posts about REST design are lacking real world experience items like rate limiting.


What about jsonrpc (http://json-rpc.org/)? I am using it in a project and I really appreciate the simplicity. Would be curious to hear from others who have experimented with it.


Well, just use the right paradigm; REST is not always the best model to use.

I built a tool against a third party's RESTful API when they should have built it using message queuing, as that suited the application.


Don't limit yourself to JSON. Your dislike of XML does not mean that JSON is always the right answer. Instead, write code that flexibly can render to any of a set of formats, and use content negotiation to determine which format the user agent wishes to consume. This lets you do things like let the read-only portion of your API be accessible via browser, and it will future-proof you.
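A stripped-down negotiator might look like this (it honors the client's listed order but ignores q-weights and most wildcard subtleties that a production parser must handle):

```python
def negotiate(accept_header, supported=("application/json", "application/xml")):
    """Pick the first supported media type the client accepts.

    Simplified: walks the Accept header in client order, ignores
    q-weights; '*/*' falls back to the server's preferred (first) type.
    """
    for part in accept_header.split(","):
        media = part.split(";")[0].strip().lower()
        if media in supported:
            return media
        if media == "*/*":
            return supported[0]
    return None  # caller should respond 406 Not Acceptable

print(negotiate("application/xml,application/json"))  # application/xml
print(negotiate("text/html,*/*;q=0.8"))               # application/json
```

The renderer picked here is exactly the seam that lets the read-only API double as a browser-viewable resource: browsers send `text/html,...,*/*` and fall through to the server's default.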


I did think that the criticism of XML for being hard to parse was a bit odd. Although I generally prefer JSON for simpler stuff, XML is pretty easy to parse and sift through with XPath.


It has to do with the awful XML generated by automatic tooling (SOAP by WCF and JEE).

Hand-generated XML can be just as nice as using hand-generated JSON, and on the other hand automatic domain-model-to-JSON mapping can be as ugly as any XML monstrosity.


Most languages have the concept of arrays and hashes, which map unambiguously & idiomatically to JSON via generic serializers.

Creating idiomatic XML, on the other hand, is a little more tricky. Should something be a tag or an attribute? What should it be named?


Though for the same dataset, XML will be slower to parse. For performance's sake use JSON.


> Though for the same dataset, XML will be slower to parse. For performance's sake use JSON.

Loading a huge JSON file is almost certainly slower than using a SAX parser for a huge XML file. Maybe there are SAX-like approaches for JSON, too.


Yes, it's called "streaming". Like in XML, you can also used a mixed approach. See e.g. https://sites.google.com/site/gson/streaming


I suspect that depends a lot on what platforms are used at the client and server.


When he says "use JSON where possible, XML only where you have to", I thought it was funny, as if developers out there are just dying to use this verbose behemoth instead of its terser cousin JSON. Anyway, I agree with everything you're saying; just wondering if there actually are any devs out there who adore some sweet XML (seriously)?


OP here. I got an email with an interesting argument on the topic. Paraphrasing: today JSON is hip, tomorrow it may be something else. By supporting XML by default and using XSLT to translate it to alternate outputs, you're able to support multiple formats without having to modify your software itself. Of course, this does come with the cost of having to maintain XSLT files.

To businesses that need to support multiple formats (enterprise requirements?), XML + XSLT sounds like a fair approach - it allows you to simultaneously create idiomatic XML & JSON


> By supporting XML by default and using XSLT to translate it to alternate outputs, you're able to support multiple formats without having to modify your software itself.

By separating out the rendering-to-an-output format from the basic logic of the application, you get similar benefits without creating a dependency on XML handling libraries and requiring another implementation language (XSLT) for the rendering component.


There's no benefit to using XSLT instead of just writing code to transform your data into another format.


I'm not a fan of XML, but I do like RDF, and unfortunately RDF/XML is the most common encoding format.



