Hacker News
Common REST Mistakes (prescod.net)
40 points by pan69 on Feb 12, 2010 | 39 comments



Several of these points are highly prescriptive without offering any motivation for the advice.

For example, "your public API should not depend on the structure of your URIs. Instead there would typically be a single XML file that points to the components of your service." It's difficult to understand this kind of point without an example in hand - preferably an example taken from an actual website.

The whole list presupposes a familiarity with REST theory and jargon. And the assertion that 'sessions are irrelevant' instantly raises my suspicions about the author's relationship with reality.


By "sessions are irrelevant", he basically means "don't recreate circuit switching on top of your packet-switched TCP/IP connection." Each HTTP request should carry with it all the state required to do what it needs to do (including state stored in cookies et al.); it shouldn't be dependent on what has come before or what will come after it.
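
A minimal sketch of what "each request carries its own state" could look like, assuming the state travels in a cookie that the server signs, so no server-side session store is needed. `SECRET`, `pack_state`, and `unpack_state` are hypothetical names for illustration, not anything from the article.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional

# Hypothetical secret shared by every server in the pool (an assumption).
SECRET = b"change-me"

def pack_state(state: dict) -> str:
    """Serialize the state and append an HMAC so the client can carry it."""
    payload = base64.urlsafe_b64encode(json.dumps(state, sort_keys=True).encode())
    sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + sig

def unpack_state(cookie: str) -> Optional[dict]:
    """Return the state if the signature checks out, otherwise None."""
    payload, _, sig = cookie.rpartition(".")
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))
```

Any server holding the secret can handle any request, which is the scalability point: nothing about the request depends on which machine saw the previous one.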


Right, this is a huge item when it comes to scalability. I argue this with developers all the time. The problem is that session state is like crack: it starts with just a little. It is so easy to get caught up in stashing some info in a property bag or List for later use, but that one little pattern binds you to so much hardware and software infrastructure.

I am old enough to remember the first time the session pattern was used in a web framework, and I remember cringing then just as I do now when I see it used.


I agree. A lot of the more pedantic REST advocates seem to use this kind of thinking, such as implying that we shouldn't use descriptive URIs because the client or programmer shouldn't need to understand how the URIs are structured. If that's the case, surely we could break REST convention as much as we want and just make the programmer or client go along with it.

I think when most normal people say REST, they simply mean some sort of vague messages-over-HTTP metaformat that doesn't necessarily enforce odd restrictions about POST vs. PUT, is somewhat self-documenting and easy to work with due to the limited scope of the medium, etc.


I don't think it was implied that we shouldn't use descriptive URIs—just that descriptive URIs should be treated in roughly the way we treat comments—they're for humans to work with, not computers, unless there is an established microformat (like the various -Doc formats used in comments) that the service can be guaranteed to obey. That microformat would, obviously, be linked to in place of the index XML file—as it would be able to be used to procedurally generate same.


I agree too. Roy Fielding's dissertation on REST does say that URIs should not contain information about the identity of the user nor the details of the implementation of the resource on the server (no .php, .cgi, .aspx) but in a section on URIs (6.2.1) he does say "...REST accomplishes this by defining a resource to be the semantics of what the author intends to identify..." I read that as a requirement for descriptive URLs.

And I think of Fielding's dissertation, together with a few RFCs like 2616, as the founding documents of REST.


tlack, when you write 'A lot of the more pedantic REST advocates' you make it seem as if there was a choice to be 'strict REST' or 'less pedantic REST'. The problem with that is as follows: REST is an architectural style, which means that it consists of a set of constraints it imposes on an architecture. The benefit is that such constraints induce a set of system properties (e.g. cacheability, scalability). Usually, the choice for an architecture means that you are interested in your system having these properties. Now, if you drop constraints at will, you modify the style and do not get the desired system properties. The notion of 'non pedantic REST' (or 'Low REST') is missing the point.


the idea is that instead of documenting your project list to be at /projects/ and individual projects to be at /projects/PROJECT_ID you should have a SINGLE entry point into your API which should have LINKS to your various services.

so you will have something like

    <api>
      <projects>http://../projects/</projects>
      <users>http://../users</users>
    </api>
or even this:

    <service>
      <name>projects</name>
      <href>http://.../projects/</href>
    </service>
    ...
and then your project list should not just return project IDs for you to construct a URL with; instead it should provide full URLs to access the relevant resources.

The rule of thumb is: with a PROPER REST API you should never do any "URL generation"; you should get ALL your URLs (except for the SINGLE entry one) from the API.
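
A rough sketch of a client that obeys that rule, assuming the second XML layout above and a project list that carries an `<href>` per project. The host `api.example.com`, the `<project>` element layout, and the helper `links_by_name` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents a server might return; hosts are placeholders.
ENTRY_DOC = """
<api>
  <service><name>projects</name><href>http://api.example.com/projects/</href></service>
  <service><name>users</name><href>http://api.example.com/users/</href></service>
</api>
"""

PROJECT_LIST_DOC = """
<projects>
  <project><name>Alpha</name><href>http://api.example.com/projects/17</href></project>
  <project><name>Beta</name><href>http://api.example.com/p/beta-moved</href></project>
</projects>
"""

def links_by_name(xml_text: str, item_tag: str) -> dict:
    """Map each item's <name> to the <href> the server handed out."""
    root = ET.fromstring(xml_text.strip())
    return {item.findtext("name"): item.findtext("href")
            for item in root.iter(item_tag)}

# The client only ever looks links up; it never concatenates URL pieces.
services = links_by_name(ENTRY_DOC, "service")
projects = links_by_name(PROJECT_LIST_DOC, "project")
```

Note that "Beta" lives at a URL that does not fit the /projects/ID pattern, and the client does not care, because it never assumed the pattern.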

check out this link for some more info: http://www.theamazingrando.com/blog/?p=107


Point 3 is phrased well in the title, but I'm kind of annoyed at the implication that you should intentionally obscure your URLs. I can see why he makes that point, and I see the benefit, but I also like URLs that make sense when I'm cutting and pasting.

I have a much easier time finding a cut-and-paste error from email in http://bob.com/posts/37/comments/79 than in http://bob.com/postthing/1f724c302a3b69ef9327313adb269a8d, even if the latter is convenient when the site later restructures things or adds a server.


Since a number of people have mentioned point #6 (Sessions are irrelevant), I have a question: Can anyone point to a concrete, more detailed discussion of how to do authentication and authorization in a RESTful manner?

I may be missing something very obvious, but I just don't see how this should/can/might look. (Just to be explicit, I'm not trolling. I'm confused.)


One way to do this is to have the clients pass the authentication information in the headers with every request.
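
For the simplest case, HTTP Basic, that header is just the base64 of "user:password". A minimal sketch (the function name `basic_auth_header` is an assumption; Basic should only be sent over an encrypted connection):

```python
import base64

def basic_auth_header(username: str, password: str) -> dict:
    """Build the Authorization header sent along with every request."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return {"Authorization": "Basic " + token.decode("ascii")}
```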


Basically, http://en.wikipedia.org/wiki/Digest_access_authentication#Ex... but with anything you like in place of the GET.
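
A sketch of the Digest response computation for the `qop=auth` case, assuming MD5 as in the original scheme. The HTTP method is one of the hashed inputs, which is why any method can stand in for GET. The function names are hypothetical.

```python
import hashlib

def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()

def digest_response(method, uri, username, realm, password,
                    nonce, nc, cnonce, qop="auth"):
    """Compute the 'response' field of a Digest Authorization header."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")  # who is asking
    ha2 = md5_hex(f"{method}:{uri}")                 # what is being asked
    return md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")
```

The client recomputes this for every request, so the server never needs to remember a login between requests, only the nonces it has issued.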


Is the well known and used cookie based authentication in conflict with REST? In my mind it's not, it's orthogonal to it.

If not logged in, redirect to a login page (resource) which upon success redirects back, e.g. GET login?next=desired_resource.

If logged in but not authorized to perform the action on the resource, return 403 (401 is meant for missing or invalid credentials).

That's it basically, isn't it? Also not trolling, challenge me if I'm missing something please.


Cookie-based authentication is generally only applicable to the browser interface. From my point of view where REST really shines is at the API level. HTTP authentication while not particularly great in the browser ( mainly because of UI/logging out issues ) is good at the API level where you send your credentials with each request.


> HTTP authentication while not particularly great in the browser ( mainly because of UI/logging out issues )

You know, with all the little browser-chrome experiments that have been going on lately, I'm really surprised that no one has tried to make HTTP-authentication-based login/logout/account management as painless as HTML-served variants. I'd imagine that it would couple with the little "lock" icon in the URL/status bar, making it have four states instead of two: "insecure, secure, insecurely logged in, and securely logged in" where clicking on it brings up a menu to both view credentials and change your password/edit profile/log out/close account/sign up/anything else browser makers want to implement. (They could be distinct, of course, but I like the idea of making "insecure" look scary so that users would be deterred from sending credentials through it.)


What if you have to serve both interactive users and automatic clients? Would you use the url structure for both or would you have a separate api structure?


If you know Ruby, you could check out AuthLogic.


The HTTP spec includes the Basic and Digest methods of auth.


Point 6 (Sessions are Irrelevant) is really tricky. Fielding says much the same about sessions as this article but there's a lot of room for clarification.

http://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm


My take on "sessions are irrelevant" is that you don't start and close a session. You start by creating a resource: itinerary, document, submission or whatever you call it. You do that by POST-ing to the list of such resources and you edit it (PUT) until you're done. You could always come back to it (GET) and edit some more (PUT). Or you could delete it (DELETE).
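
That lifecycle can be sketched with an in-memory stand-in for the collection. `ItineraryStore`, the `/itineraries/` path, and the status codes chosen here are assumptions for illustration, not a real framework.

```python
import itertools

class ItineraryStore:
    """In-memory stand-in for a collection resource such as /itineraries/."""

    def __init__(self):
        self._items = {}
        self._ids = itertools.count(1)

    def post(self, body: dict):
        """POST to the collection: create a resource and return its URL."""
        url = f"/itineraries/{next(self._ids)}"
        self._items[url] = dict(body)
        return 201, url

    def get(self, url: str):
        """GET: come back to the resource at any later time."""
        if url not in self._items:
            return 404, None
        return 200, dict(self._items[url])

    def put(self, url: str, body: dict):
        """PUT: replace the stored representation with a new one."""
        if url not in self._items:
            return 404, None
        self._items[url] = dict(body)
        return 200, dict(body)

    def delete(self, url: str):
        """DELETE: remove the resource."""
        if self._items.pop(url, None) is None:
            return 404, None
        return 204, None
```

The "session" is simply the resource itself: everything the user has done so far lives at the URL returned by the POST.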

Of course, there can be authentication and authorization but, really, there's no need for a session since you have to do those two on each http request anyway.


I believe #3 is actually a cornerstone of modern Ruby on Rails usage :-)


I have not read Roy Fielding's dissertation, and so I am speaking out of partial ignorance. But isn't most of this the opposite of what REST is supposed to be? URLs aren't supposed to represent resources? Whahh?


The contents of URL strings are not supposed to represent resources -- If a client of your "REST API" needs to concatenate strings together to form new URLs based on an out-of-band 'specification', YOU'RE DOING IT WRONG

Repeat after me: Hypertext. Is. The. Engine. Of. Application. State.

URL strings should be completely opaque -- you should be finding links to new resources in other resources, not constructing them yourself (query strings added to found resources are permitted).

The whole "meaningful URL" thing is a strong signal of False REST, and it infects the Rails ecosystem pervasively. REST has nothing to do with your oh-so-clever request routing rules.


One can build a service that has so-called "Rails-style meaningful URIs" that also returns hyperlinks to new resources. They aren't mutually exclusive.

That said, yes, everyone seems to forget about hypertext. I'm curious: if I'm building a REST-based JSON-formatted service, what's the convention for returning related URIs? Ex: if I return a Product resource and I want to have links to the product's image, the owner's resource, edit this product, and so on, is there a standard way to return them in the JSON structure, or does this not matter?
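
One shape that could work, offered as an assumption rather than an agreed standard: borrow the `rel`/`href` pair from HTML and Atom link elements and put a `links` list in the JSON. The field names and both helper functions below are hypothetical.

```python
import json

def product_representation(product: dict, links: dict) -> str:
    """Attach a list of {rel, href} links to a product before serializing."""
    doc = dict(product)
    doc["links"] = [{"rel": rel, "href": href} for rel, href in links.items()]
    return json.dumps(doc, indent=2)

def follow(doc_text: str, rel: str):
    """Client side: find a related URL by its relation name, or None."""
    doc = json.loads(doc_text)
    for link in doc.get("links", []):
        if link["rel"] == rel:
            return link["href"]
    return None
```

The client then asks for "the image link" or "the edit link" by name and never assembles the URL itself.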


Can you explain more? It seems like you're saying that RESTful APIs should somehow be self-documenting, and that there should be no need for an API specification. I've never heard this argument before and would love to know more.


That is not exactly what was said. The idea is that URLs are not the same as resources, and that similar URLs do not necessarily imply a relationship between two resources.


REST has always felt a little bit dirty to me; an improper commingling of what should be two different layers. To me, using aspects of the HTTP protocol to have meaning for a higher level service is a bit like using aspects of the TCP protocol as part of the HTTP definition, where in a more ideal setup they should be independent, but layered on top of one another. Almost all of the REST mistakes mentioned in this article are due to the unholy matrimony of the two layers, and the misunderstandings and abuses that arise from that. Additionally, using HTTP error codes can break compatibility for anything running in a browser plugin (i.e. Flash) because the browser eats the error code and does not pass it on to the plugin. Needless to say, I'm a big fan of RPC which runs over HTTP POST normally but isn't married to the underlying protocol. I think SOAP gives RPC a bad name (I certainly hate the crap out of it) and doesn't properly separate the protocols either, using HTTP codes to indicate errors and such.


I think you might be wrong about the 'underlying protocol'. HTTP is an application layer protocol. If something feels dirty it is to put another protocol on top of it especially if it doesn't add value.

With REST, you are not adding more special meaning than the HTTP methods. Using those methods according to the spec is all there is to it. If that's enough to do your business, why add more to it? Things should be as simple as possible.


Simplicity is good, but so are flexibility and compatibility. As I mentioned, parts of REST aren't good enough; anything that doesn't have access to HTTP status codes won't be fully compatible. A decent RPC protocol like JSON-RPC is both flexible and simple. An added bonus of not marrying your protocol to HTTP is that your API will work seamlessly over other protocols should the need arise. I have an API that was originally intended for web clients, but later the need arose for some internal tools to be able to call the same methods remotely and asynchronously. Instead of going over HTTP, the jobs are submitted to Gearman in JSON-RPC format and everything works without having to modify the server or unnecessarily and inefficiently route the calls externally through Apache.
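
A minimal sketch of that transport independence, assuming JSON-RPC 2.0 style messages (the comment does not say which version was used). `METHODS`, `rpc_method`, `add`, and `handle` are hypothetical names; the point is that `handle` only sees strings and has no idea whether they arrived over HTTP or from a job queue.

```python
import json

METHODS = {}

def rpc_method(func):
    """Register a function so it can be called by name over RPC."""
    METHODS[func.__name__] = func
    return func

@rpc_method
def add(a, b):
    return a + b

def handle(raw_request: str) -> str:
    """Take a request string from any transport and return the reply string."""
    request = json.loads(raw_request)
    reply = {"jsonrpc": "2.0", "id": request.get("id")}
    func = METHODS.get(request.get("method"))
    if func is None:
        # Errors travel inside the message, not in an HTTP status code.
        reply["error"] = {"code": -32601, "message": "Method not found"}
    else:
        params = request.get("params", [])
        if isinstance(params, dict):
            reply["result"] = func(**params)
        else:
            reply["result"] = func(*params)
    return json.dumps(reply)
```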


I used to argue in favor of JSON-RPC as well, but I have to say that I have allied with the dark side now, if for no other reason than caching: with REST you have fine control over which addressable resources get cached and which do not. This is huge for performance in some systems. I have found that once you use REST semantics in earnest to build an application, it is hard to argue against them. So much of the web's infrastructure starts to work with you and not against you once you start to use it.
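
A sketch of what per-resource cache control could look like, assuming a made-up policy keyed on URL path. The paths and lifetimes are hypothetical; the `Cache-Control` directives themselves are standard HTTP.

```python
def cache_headers(path: str) -> dict:
    """Pick caching headers per resource (hypothetical policy table)."""
    if path.startswith("/products/"):
        # Same for everyone: browsers and shared proxies may keep it an hour.
        return {"Cache-Control": "public, max-age=3600"}
    if path.startswith("/account/"):
        # Per-user data: must not be stored by any cache.
        return {"Cache-Control": "private, no-store"}
    # Everything else: may be stored, but must be revalidated before reuse.
    return {"Cache-Control": "no-cache"}
```

With everything funneled through a single POST endpoint, intermediaries have no URL to key on, so this kind of per-resource policy is not available.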


It's possible my perspective is skewed. I write APIs for stateful applications, so clients don't ask for the same information over and over, reducing the need/utility of browser caching. I can see how REST would help solve the caching problem.


I agree with that. But here you aren't talking about HTTP you are talking about RPC. You could have gone with RPC directly from the very beginning. Typically, you put that on top of HTTP to avoid firewalls, etc.

I said simple as possible, but the real world is always complex and one day you may have to use something other than JSON-RPC in which case someone will have to create an adapter or an abstraction. Thankfully this is easy to do in software even if it sometimes looks nightmarish.


Point 6 (Sessions are irrelevant) is where REST really falls down as a way of explaining the success or practical operation of the web.


What REST actually is, is not explained on this page. More information can be found here: http://en.wikipedia.org/wiki/Representational_State_Transfer



I'm actually designing a REST system to connect together the subsystems of a robot with a number of independent microcontrollers, so this is very relevant. Keep the REST articles coming, please!


He lost me at 'use PUT and DELETE'. There's a whole range of clients (e.g. most browsers' XMLHttpRequests, Flash) that don't support anything but GET and POST.


Not true, XHR lets you use DELETE and PUT without any problems.

But yes - Flash can only use POST and GET (or maybe only POST? It's been quite a long time since I used ActionScript)


I think he means HTML forms. PUT and DELETE are broken in at least some (even modern) browser implementations.



