The Good, the Bad, and the Ugly of REST APIs (oreilly.com)
115 points by apievangelist on June 5, 2011 | 37 comments



Map your API model to the way your data is consumed, not your data/object model

I'm glad this was brought up. I feel like I'm constantly battling with myself on this issue. The purist in me always wants to create REST endpoints that return just the minimum info you'd expect from that URL and that represent my model perfectly. The result is that customers have to call the service a lot. But sometimes, if you take a step back and think about how the API will be consumed, you can really do your consumers/customers a favor. Although, if it's not carefully thought through, you could end up doing more 'work' and returning data that is always discarded from the response.
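A toy illustration of what I mean (the endpoints and fields are invented): the purist shape forces extra round-trips, while the consumer-oriented shape embeds what the client actually renders, possibly wastefully.

    # Purist: the order resource links to everything, so rendering one
    # screen takes three requests.
    purist = {"id": 42,
              "customer": "/customers/7",
              "items": "/orders/42/items"}

    # Consumer-oriented: one request, but the embedded customer may be
    # discarded by clients that never show it.
    consumer_oriented = {"id": 42,
                         "customer": {"id": 7, "name": "Alice"},
                         "items": [{"sku": "A-1", "qty": 2}]}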


This thinking can, however, also be taken too far. The Dutch Railways recently released an API for all their trip planning, live departures, and pricing info. Unfortunately, it's mainly a way to get at the API their own apps call: it's all presentation-based.

For instance, up until recently there was no way to tie a certain train service listed in a given trip to the same service in the live departure endpoint. It's also still not possible to tie a train service to its schedule.

They've taken their API and made it all consumer-oriented and it's gone overboard.


"I know you love {JSON,XML} and you think everyone should be using {JSON,XML} and that the people who use {XML,JSON} are simply stupid."

Does anyone love XML and hate JSON?


I don't love XML or hate JSON, but part of my job involves writing Flash/AVM1 (ActionScript 2) code that runs on slow CPUs (yes, feel free to pity me).

I can get reasonable performance out of large XML results because the XML is parsed inside the Flash VM by native code. There is no such luxury for JSON, so I have to resort to writing hacky non-compliant "JSON parsers" in ActionScript that are optimized for specific data sets (because the slow-CPU/Flash-AVM1 combo means a true compliant JSON parser for the full API result is just way too slow).
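The idea, transliterated to Python pseudocode since AS2 won't mean much to most here (the field name is invented): the "parser" is little more than targeted string extraction.

    import re

    # Fragile by design: only safe because we control the producer and
    # know the exact shape of the payload.
    TEMP = re.compile(r'"temperature":\s*(-?\d+(?:\.\d+)?)')

    def fast_extract(raw):
        # Skip full JSON parsing; just pull the one field we need.
        return [float(m) for m in TEMP.findall(raw)]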

So, while I might much rather be using JSON (or better yet, protobufs or Thrift) for everything in a totally different environment, I do appreciate APIs giving me XML results because of my specialized situation.


It's embarrassing that most live JSON-based protocols, including my employer's, hearken back to the painful garbage-in garbage-out dawn of the industry (what with ad hoc validation and no schema at all). It's hard to overstate the importance of every implementation agreeing on whether or not a given message should work. That's the only knock against JSON I know of, apart from bloat of course (but XML is worse).


I'm sure there are some programmers who know their way around their favorite XML library in their favorite language and would refuse to learn another.


But there isn't much to learn to parse JSON. It maps nicely onto high-level languages. It may make your knowledge of XML useless, but it doesn't have the cognitive overhead of learning something new, which takes time and effort.


I wouldn't say love and hate, but I've met people who like XML because it's familiar and don't care for JSON because to them it's foreign.


You've got to know when each is suitable. Doing REST properly in JSON involves hacks and trade-offs that you don't have to make with XML. The classic example is how you encode a link with a relationship: XML is rich enough to make it trivial, while with JSON you either have to specify a microformat, or rely on mutually agreed regex-matching rules on raw strings (which sucks, but might be enough in a restricted scenario).
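To make that concrete, here's a rough sketch; the JSON "links" shape is one made-up convention, not a standard.

    # XML: Atom gives you typed links out of the box.
    xml_doc = ('<order xmlns:atom="http://www.w3.org/2005/Atom">'
               '<atom:link rel="payment" href="/orders/42/payment"/>'
               '</order>')

    # JSON: you have to invent (or adopt) a convention for the same thing.
    json_doc = {"links": [{"rel": "payment", "href": "/orders/42/payment"}]}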

Then you've got namespaces, entities and validation - all things which might cause someone to say "I'm inevitably going to need X in all my applications, therefore JSON can never be suitable." They'd probably be overgeneralising, but I can certainly see how it would lead to a preference for XML over JSON.


Right, I can't think of anyone who prefers XML over JSON. XML seems too bloated, and DOM and SAX parsers require lots of code to walk trees, compared to a single JSON parse call after which you can directly access things in an object. Though I have had to write several JSON encoders/decoders for special objects.

For my internal REST APIs, the only response format I provide is JSON; this is for MVP purposes. I will add XML when I publicize these APIs.


My understanding is that REST imposes a bunch of restrictions on how an API is supposed to work.

For example, the idea that you communicate just a single base URL and the document formats, and other URLs are deduced from that. This adds complexity, but makes the system more able to evolve.
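For instance, a client might discover URLs rather than hard-code them; a minimal sketch, with the root endpoint and link names invented:

    import requests

    # Only the base URL is known in advance; everything else is
    # discovered from the representations themselves.
    root = requests.get("https://api.example.com/").json()
    orders = requests.get(root["links"]["orders"]).json()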

My thinking is that these constraints are inappropriate for an internal API, where you could simply embed knowledge of how to construct a number of URLs in the clients.

Also, some tools, like .NET, have good support for mapping XML schemas to classes, and I don't know of equivalents for JSON.


This is like XML Myths 101.

> xml seems too bloated

To cite Wikipedia¹, “The XML encoding may therefore be shorter than the equivalent JSON encoding.” Please provide a fragment of XML document that is significantly more verbose than its JSON counterpart.

> dom and sax parsers require lots of code to walk trees compared to just a single json parse call and then you can just directly access things in an object.

Using DOM/SAX (especially SAX) is like walking JSON's AST by hand. You're doing it wrong; this is too low-level.

The second problem with this argument is that the people using it are selling you something that isn't real. You can't just 'directly access things in an object.' You need to deal with untrusted input that just happens to map to your very basic data structures (dictionaries, lists, numbers, and strings). You will have to write a schema; it will most likely be something different in each ecosystem (Y.Dataschema.JSON for JavaScript, Colander for Python, etc.). And then you will consume exactly what you would consume with any other format, be it XML or Thrift or PB: your DAL objects, or your representation objects if you're doing REST.
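For instance, the Colander flavour of this looks roughly like the following (field names invented):

    import json
    import colander

    class Person(colander.MappingSchema):
        name = colander.SchemaNode(colander.String())
        age = colander.SchemaNode(colander.Int())

    def load_person(raw):
        # deserialize() raises colander.Invalid on untrusted garbage,
        # instead of handing you a random dict to poke at.
        return Person().deserialize(json.loads(raw))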

If you are doing it any other way, if you're doing it the naïve way, you're living in a fantasy land: you're using something that not only won't scale, but is unsafe and, yes, unnecessarily verbose.

A very simple case for a web application (and this is where I argue) is that by using json.loads() you're loading everything into memory. That makes you vulnerable to a simple, dumb attack like this:

    {"malicious":{"document":[{"how":{"many":{"nested":{"dictionaries":{"can":{"your":{"webapp":{"handle":{"per":{"how":{"many":{"webapp":{"requests":{"before":{"it":{"runs":{"out":{"of":{"memory":{"?"}}}}}}}}}}}}}}}}}}}}]}}
So you need to use a streaming parser with a schema, to drop the request at the first sign of trouble, or you need to limit your webapp to requests that are no more than thiiis big, which of course limits what you can include in that JSON document (if you want to include a 5 MB file, you will need to be ready for 5 MB of nested dictionaries and lists). Not that you can't do that with JSON, but it's not just json.loads() anymore, now is it?
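A crude sketch of the 'drop it early' idea (limits invented; the pre-scan also counts brackets inside strings, so it's deliberately conservative):

    import json

    MAX_BODY = 1 << 20   # refuse anything over 1 MB
    MAX_DEPTH = 32       # refuse absurd nesting

    def safe_loads(raw):
        if len(raw) > MAX_BODY:
            raise ValueError("request body too large")
        depth = 0
        for ch in raw:
            if ch in "{[":
                depth += 1
                if depth > MAX_DEPTH:
                    raise ValueError("nested too deeply")
            elif ch in "}]":
                depth -= 1
        return json.loads(raw)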

You asked why someone would prefer XML over JSON. Well, apart from the fact that XML dealt with all of these issues a long time ago and is widely supported, how about the fact that XML is great for the Open Web and JSON will slowly kill it?²

¹ http://en.wikipedia.org/wiki/Json

² http://news.ycombinator.com/item?id=2588606


It's not a matter of love/hate, but I've found it's a lot easier to sell something as "a web service using XML" than "a RESTful API using JSON". People know what XML is, not everyone has heard of JSON.


REST is good, SOAP is bad

The irony is that SOAP is a lot closer to being RESTful than RPC-over-HTTP aka "REST API", because SOAP actually attempts to have a uniform interface driven by hypertext (WSDL). That doesn't mean it's easier, but REST is not supposed to be easy.


"Meaningful error messages help a lot"

The beauty of REST is that you piggyback on the error codes of the HTTP protocol, e.g. 403 Forbidden, 404 Not Found, 500 Internal Server Error. Of course you could always add text in the response providing more detail, but I'd argue this is not helpful in all cases: what can you, as the API consumer, do if the server is barfing back something cryptic (503)? The use case I can think of is an errant parameter in the request. I'm a firm believer in the client app doing field validation, though of course that doesn't absolve the server from having to do it too. But instead of returning a 500 (Internal Server Error), the server could return something more meaningful, like a 400 (Bad Request) accompanied by a response parameter with more detailed information, such as "your address field contains a float".
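Something like this Flask sketch, purely illustrative (the framework and field names are my choice):

    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.route("/users", methods=["POST"])
    def create_user():
        data = request.get_json(silent=True) or {}
        if not isinstance(data.get("address"), str):
            # 400 plus a machine-readable hint, instead of a bare 500
            return jsonify(error="invalid_field",
                           detail="your address field contains a float"), 400
        return jsonify(status="created"), 201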

The example given for authentication is a poor choice: there are security reasons why you would want to be vague. Returning the fact that the password is mistyped only tells the consumer of the API that this user exists. Please, hack me.

The one thing I hate about REST is figuring out how to document the endpoints and the various HTTP statuses that could be returned. If someone could put together a script to help with that (for Python), it would be appreciated.


Rails uses 422 for invalid input.

Which Wikipedia describes[1] as: 422 Unprocessable Entity (WebDAV; RFC 4918), "The request was well-formed but was unable to be followed due to semantic errors."

1: http://en.wikipedia.org/wiki/List_of_HTTP_status_codes


"And while you're at it, don't use HTTP authentication either. Use signed queries that authentication each API call individually."

What is the problem with using digest auth with proper request counter (nc) incrementing done over HTTPS?


There's nothing wrong with it or with HTTP Basic Auth over HTTPS either for that matter.

The author doesn't seem to understand that HTTP Basic/Digest Auth is effectively exactly what he wants: a signed request that authenticates each API call. The fact that it goes in an HTTP header rather than in a query parameter or the request body is only relevant in the case that users of your API can't manipulate headers, which is a vanishingly small set of users. Whether they know how to do it or not is another matter.
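Concretely, "signing" usually just means an HMAC over the interesting parts of the request. A rough sketch (header names and canonicalization invented; real OAuth 1.0 is stricter about the signature base string):

    import hashlib
    import hmac
    import time

    def sign_request(method, url, secret):
        # secret is a bytes key shared with the server
        ts = str(int(time.time()))
        msg = "\n".join([method, url, ts]).encode()
        sig = hmac.new(secret, msg, hashlib.sha256).hexdigest()
        # The server recomputes the HMAC and rejects stale timestamps.
        return {"X-Timestamp": ts, "X-Signature": sig}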


The problem with it is that you can't cache the response in an intermediate server like Varnish or Squid. You have to `Vary` on the Authorization header, and with a counter that increments on every request, that invalidates the cache on every request.

Better is Basic over SSL, since the Authorization header never changes.
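Roughly, the response headers you'd want look like this (values illustrative; by the HTTP spec a shared cache won't store a response to a request that carried Authorization unless you explicitly allow it):

    headers = {
        # "public" (or s-maxage) is needed before a shared cache will
        # store an authenticated response at all.
        "Cache-Control": "public, s-maxage=60",
        # Stable per client with Basic, so cache hits are still possible.
        "Vary": "Authorization",
    }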


What's the problem with HTTP Basic auth over HTTPS?


Cost?


Isn't "signed queries" the same as 2-legged OAuth?


>>And while you're at it, don't use HTTP authentication either. Use signed queries that authentication each API call individually.

Beyond server-to-server API calls and authentication, I'm curious to know what folks think about authentication for individual-user client apps making API calls to a server.

If you build a web service whose REST APIs are only intended to be consumed by customers of your $0.99 mobile app, one option is to use no authentication at all. This means that a third party could potentially write another $0.99 app that "steals" your web-server resources.

Has this been a problem, and what schemes, if any (e.g. digest auth, Facebook login, etc.), have you used to mitigate it? (Facebook login doesn't fully address the problem unless you can distinguish between Facebook users who have purchased the app and those who haven't.)


Why not use a client cert and HTTPS, or HTTP digest authentication with the same password across clients? In order for a third party to create their own client, they would have to reverse-engineer yours, which would be unlawful.


You expect that a third party co-opting your server resources would be within US legal jurisdiction?


I expect this will give me firmer grounds whenever I need to send DMCA takedowns.


Failing to realize that a 4xx error means I messed up and a 5xx means you messed up

More often than not, this seems to boil down to bad error handling rather than a lack of understanding.


I love the simplicity provided by a good REST API.

ESRI's REST platform is a good example of a nice, clean API. Very easy to diagnose and resolve issues, especially with tools like Fiddler.

A really nice feature to have in a REST API is the ability to run operations in an HTML form. With this, you can re-run requests in the browser and tweak parameters to help you diagnose issues. Very useful.


I prefer Charles: http://www.charlesproxy.com/


Charles is awesome. We used it to debug connectivity with a third-party web service over HTTPS -- absolutely worth the money.


Absolutely. Charles can sniff SSL by acting as a man-in-the-middle Certificate Authority. Very clever!


Looks good. I'll give it a try, thanks.

One problem I have is viewing my Comet connection (a long-lived XHR). I stream JSON to the browser; Fiddler picks up and is able to display all this data, but prevents it from being received by the browser. It would be nice not to have this problem, so I could see my JSON events both in Fiddler and in my UI.


Have you tried tcpTrace (http://www.pocketsoap.com/tcptrace/)? It's dead simple, so I wonder if that may help. I haven't ever used it with long-lived XHR, so I can't say it'll work, but it's a free, 224 KB standalone, so it can't hurt to try.


Can you let us know if you had better luck with Charles?


Sure thing.


Speaking as someone who's currently building a cloud API, this makes a lot of sense.


To me, throttling and chatty APIs seem to be orthogonal. These apply to REST or SOAP or any other API.

EDIT: I'm wrong, chatty is specific to REST.



