Hacker News
Stop Designing Fragile Web APIs (fenniak.net)
63 points by mfenniak on Apr 22, 2013 | 28 comments



No, no, no. I disagree strongly that APIs should be vague about what they provide.

An API designer who deliberately leaves out documentation of important features (like pagination, sorting, etc), and fails to specify the format of what comes back any further than "this will be what you intended", hasn't designed an API at all.

APIs are well-defined interfaces that other programmers can rely on. If there is no well-specified contract, there is no API.

Defining such an interface so that it's reasonably future-proof? Yup, that's hard. And you're not an API designer until you take on that challenge. If you instead punt on all that responsibility, and provide the user few or no guarantees about how your service works, you've taken the cowardly way out.

The essence of software architecture is isolating and tackling complexity. Isolate it in the API, don't push it out to every one of your users.

I don't argue that an API needs to specify absolutely everything about its operation, but obscuring and removing features from an API without changing the version number is foul play. That is fragility.


It is very important to also document what the API doesn't provide, in addition to what it provides.

If the documentation doesn't specify the sorting order, the client must not make any assumptions about the sorting. The client either has to take the sorting as is (and trust the API provider to always use a user-friendly order), or has to re-sort the result on the client side to stay in control.

A similar argument holds for the result set: If the API docs specify that there may be "additional fields", the client must be flexible enough to ignore unknown fields.
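To make the two points above concrete, here is a minimal sketch of a client that ignores unknown fields and re-sorts locally. It assumes a JSON list response; the `name` field, the sample records, and the function names are hypothetical, and `notoriety` is borrowed from the article's FBI example.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Fugitive:
    name: str        # hypothetical field
    notoriety: int   # field name taken from the article's example


def parse_fugitives(payload: list[dict[str, Any]]) -> list[Fugitive]:
    """Keep only the fields this client relies on; ignore anything else."""
    return [Fugitive(name=item["name"], notoriety=item["notoriety"]) for item in payload]


def most_notorious_first(fugitives: list[Fugitive]) -> list[Fugitive]:
    """Re-sort locally, assuming the docs promise no particular order."""
    return sorted(fugitives, key=lambda f: f.notoriety, reverse=True)


# Hypothetical response: unordered, with a field the client has never seen.
response = [
    {"name": "B. Example", "notoriety": 3, "added_in_some_later_release": True},
    {"name": "A. Example", "notoriety": 9},
]
ranked = most_notorious_first(parse_fugitives(response))
print([f.name for f in ranked])
```

The client only breaks if a field it actually reads disappears, which is the part the docs would need to guarantee.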

There is an important difference between being deliberately vague (being clear on what the API doesn't guarantee) and just being vague (sloppy documentation). The latter is of course a PITA, but it is clear from the article that the former is what the author had in mind.


Hi Pete,

When I read your comment, I'm left with the distinct impression that we agree on 90% of what you're saying.

Of course you're going to document your API and define the format of what comes back from your requests. An API is a contract. Yeah, you isolate complexity in your API. You design an API, which is a difficult job, rather than just saying to your user, "here's a mapping to my database, do whatever you want with it."

This article does not say that APIs "should be vague about what they provide." It says the opposite: they should be very specific about what they provide.

You design an API by providing specific tools for your users. You give them hammers, not chunks of metal that they can mould however they want. You don't throw "features" into your API, you provide capabilities that your users need.

Mathieu


I think it depends on why you are making the API.

If the intent is to publicize your business by letting other programmers integrate their app/website with your service, then I agree with designing an "API" that way, because you don't want people to do things with your API that you didn't intend.

If the intent is to give other programmers the ability to work with your data and create their own thing using your service, then the API needs to be more flexible and give users as much freedom as possible.


The author is basically advocating moving application logic inside of the API. This is counter to the entire purpose of an API.

Moving application logic inside the API will indeed protect you from having problems where the API changes in ways that make it incompatible with the Application, but that's because the API is now part of your Application and you're just going to solve those same issues within the API.

An API which provides every relevant data field is not "a programmer's API", it's an actual API.


I could be wrong, but I don't think it's that black and white. I think the intention is to advocate thinking about user intent. If the user really does want/need a flexible API method, then I think that makes sense to actually provide that.


The first example he gave was a leaky abstraction tightly coupled to the underlying storage, not an API. Application logic is precisely what an API should contain, but that doesn't mean that your app shouldn't permit parameters as per the second example (the article presented a false dichotomy).


The recommendations of this document are definitely at odds with the design of the API my team is currently working on.

We are trying to build a data access API that can provide a functional and flexible view into data where we know some of the requirements and expectations of the initial users, but we don't know how it will be extended and transformed in the future.

If we don't provide things such as customization of the returned fields, desired date ranges, and ordering, then I believe we would have to rely on waiting for feedback from users/devs that we did not satisfy and attempt to build new end-points to correct that deficiency. That seems to be almost guaranteed to get us off to the wrong start for a large portion of users and they might just decide it isn't worth asking us to make changes since we didn't appear to be catering to them in the first place. I understand that we can't meet all the unknown needs, but if we can build features into the API to meet general classes of needs, that seems better than ignoring them until the specifics become known and then supplying further endpoints that answer only those specifics.

How would you extend this philosophy to a large data portal such as data.gov?

Even in the FBI example, if there is no extensibility or transform-ability in the API, how are the unmet needs of users handled? Do you have a system that gathers complaints from users about the data they suspect is possible but not available and then build new endpoints for each distinct need? Do you just ignore the potential transformative uses that aren't directly applicable to the needs of the customers to which you are paying attention?


A difficulty is that you can't remove options once they are published. A second problem is that you don't want your API to leak the underlying implementation or it will be brittle. But that doesn't mean that you shouldn't allow parameters - both examples given were a cop-out really.

I've had good results with basic parameters - date ranges, index ranges for paging, ordering by a couple of important and relevant fields (could well include some kind of computed value like 'hotness'), and probably most importantly - a query parameter (search algorithm being blackboxed in API).
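As a rough illustration of that kind of small, fixed parameter set, here is a sketch of server-side validation. The parameter names (`start`, `end`, `offset`, `limit`, `order_by`, `q`), the allowed ordering fields, and the page-size limit are all assumptions made up for the example, not taken from any real API.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Hypothetical contract: only these orderings are published.
ALLOWED_ORDER_FIELDS = {"date", "hotness"}
MAX_PAGE_SIZE = 100


@dataclass(frozen=True)
class ListQuery:
    start: Optional[date] = None
    end: Optional[date] = None
    offset: int = 0
    limit: int = 20
    order_by: str = "date"
    q: str = ""  # free-text query; the search algorithm stays server-side


def parse_query(params: dict[str, str]) -> ListQuery:
    """Validate raw query-string values against the published contract."""
    order_by = params.get("order_by", "date")
    if order_by not in ALLOWED_ORDER_FIELDS:
        raise ValueError(f"order_by must be one of {sorted(ALLOWED_ORDER_FIELDS)}")

    offset = int(params.get("offset", "0"))
    limit = int(params.get("limit", "20"))
    if offset < 0 or not 1 <= limit <= MAX_PAGE_SIZE:
        raise ValueError("offset must be >= 0 and limit within 1..MAX_PAGE_SIZE")

    start = date.fromisoformat(params["start"]) if "start" in params else None
    end = date.fromisoformat(params["end"]) if "end" in params else None
    if start and end and start > end:
        raise ValueError("start must not be after end")

    return ListQuery(start, end, offset, limit, order_by, params.get("q", ""))
```

Because `order_by` is checked against a whitelist rather than passed through to storage, the underlying columns can change without the published options changing.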

If you need many more params, it may be a sign that the underlying resource is becoming overcomplex and can be decomposed - I've made applications where the API interfaces were completely uniform across resources, which worked out really well and encouraged takeup (the most important measure for an API).

The other main question is whether to allow the caller to request which fields to return (cache-filling and complexity vs bandwidth).


Ooh, very interesting questions. :-) I'm not sure I have the answers for these questions, but I do have some ideas.

Most APIs don't want to provide data analysis capabilities. They want to provide data. Data analysis is a really hard problem space. For the FBI example, the FBI doesn't want to build an analytics tool for you to use; that's not their core business, and it's not something that helps them achieve their goals.

Data analysis and data delivery are two different problems with different classes of solutions; for example, a valid approach to delivering data might be an API to retrieve compressed, batch (nightly) downloads of events or deltas. That's not the only solution, of course, but I just mean to create an example from a different point of view.

If you intend to provide data analysis from your API, I would say that is the "intent" of your API and you should design it as such, with the appropriate flexibility and user-defined behaviour that is required to answer arbitrary queries from users. This is a hard API to design, build, and scale.

When designing a data analysis API, you should address scenarios like a user making a valid API request that can't be delivered in real-time; is it your intention to make your API accessible in real-time, in which case you have to either crash horribly or tell this request to piss off? Or is it your intent to provide the analysis no matter what, in which case perhaps it would be valid to queue such queries, process them in batches with limited concurrency, and deliver the results when completed.
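The "queue and process with limited concurrency" option could look roughly like the sketch below. It uses `asyncio.Semaphore` to cap how many queries run at once; the limit of 2, the query names, and the `asyncio.sleep` stand-in for real work are all assumptions for illustration.

```python
import asyncio

MAX_CONCURRENT_QUERIES = 2  # hypothetical limit


class Gauge:
    """Tracks how many queries are running and the highest count seen."""

    def __init__(self) -> None:
        self.current = 0
        self.peak = 0


async def run_analysis(query: str, slots: asyncio.Semaphore, gauge: Gauge) -> str:
    """Wait for a free slot, then run the (simulated) expensive query."""
    async with slots:
        gauge.current += 1
        gauge.peak = max(gauge.peak, gauge.current)
        await asyncio.sleep(0.01)  # stand-in for the real analysis work
        gauge.current -= 1
        return f"result for {query}"


async def process_batch(queries: list[str]) -> tuple[list[str], int]:
    """Run every queued query, never more than the limit at a time."""
    slots = asyncio.Semaphore(MAX_CONCURRENT_QUERIES)
    gauge = Gauge()
    results = await asyncio.gather(*(run_analysis(q, slots, gauge) for q in queries))
    return list(results), gauge.peak


results, peak = asyncio.run(process_batch(["q1", "q2", "q3", "q4", "q5"]))
print(results, peak)
```

A real service would also need to persist the queue and tell the caller where to collect the result later; this only shows the concurrency cap.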

Regarding the unmet needs of users, you should always provide a feedback mechanism for your API. This is something that varies a lot depending upon your customer; a lot of developers work in "enterprise" software, where the feedback channel is usually pretty apparent. Public APIs with an undefined audience will have a harder time getting that feedback, but usually you can bet that the importance of an issue is proportional to the number of times you hear about it.


I'm not sure this works once other people start building on top of your API though. I would rather bump the API version than break API compatibility by adding/removing/renaming fields.

GitHub (which you used in your example of good design) does change the fields returned in its API responses, but only between API versions, not within a version. The link you gave [1] is to Version 3 of their API, which is different to Version 2 and will be different to Version 4. I don't really want to be programming against APIs where identical requests can return varying responses based on supposed intent.

1. http://developer.github.com/v3/repos/statuses/
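One way to picture "fields change only between versions" is a serializer pinned to each version, so a field can be dropped in a new version without touching existing callers. This is only a sketch: the version labels, the `name` field, and the function names are hypothetical, and it is not how GitHub implements its API.

```python
from typing import Any, Callable

Record = dict[str, Any]


def serialize_v1(record: Record) -> Record:
    """Response shape frozen for v1 callers."""
    return {"name": record["name"], "notoriety": record["notoriety"]}


def serialize_v2(record: Record) -> Record:
    """v2 drops "notoriety"; v1 responses are unaffected."""
    return {"name": record["name"]}


SERIALIZERS: dict[str, Callable[[Record], Record]] = {
    "v1": serialize_v1,
    "v2": serialize_v2,
}


def render(version: str, record: Record) -> Record:
    """Pick the response shape for the version the caller asked for."""
    if version not in SERIALIZERS:
        raise ValueError(f"unsupported API version: {version}")
    return SERIALIZERS[version](record)


stored = {"name": "A. Example", "notoriety": 9, "internal_id": 42}
print(render("v1", stored))
print(render("v2", stored))
```

Identical requests against the same version then always get the same shape, whatever happens to the stored record.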


Hi Alex,

I never advocated breaking compatibility by adding, removing, or renaming fields in this article. Would you please let me know what gave you that impression so that I can clarify it?

I also did not state that having a versioned API is a bad thing. GitHub's API was a great example because it's both well-designed and versioned. Good design is going to make your API less fragile and far less prone to version changes, but I'm strongly in favour of versioning your API.

Mathieu


Hi Mathieu,

You suggested that the FBI could re-write the API request to be: http://api.fbi.gov/wanted/most

Which would return a set of known fields. You then go on to say:

"After complaints that “notoriety” is a made-up number, the FBI can hide the field; the intent-driven design is unchanged."

This suggests that the API author is free to add, remove or rename fields in the response without bumping version as it's part of the "intent-driven design".

In response to your second point, I then don't understand what the real purpose of intent-driven API design is. The only difference appears to be that you've given more RESTful URLs rather than using loads of query parameters?


Ah, thank you for pointing that out. That's definitely a mistake in the article. I've amended point #4 with another possible approach.

Intent-driven design reduces the fragility of your API and gives you a wider variety of backwards-compatible changes that you can make to it.

If it's done perfectly, and your software solves the same business problem tomorrow as it does today, then you will never need to increment your API version number. The reality is that it will never be designed perfectly, and requirements are never that stable, so having a versioned API is a practical and prudent choice.


Artificially constrained APIs are annoying as hell. It's maddening when you want to build something that doesn't fit the scenarios the API builder had in mind, and you know the data is just below the surface, but you can't because the builder didn't expose it.

It's great to include simplified APIs like "/wanted/most" but I also want the more powerful version.


I agree with this. Also, one annoying nugget is what happens if the API owner changes the logic behind wanted/most and that breaks some functionality in something I've built?


Intent-driven API gels with a talk from Netflix that came out just a few days ago: http://www.infoq.com/presentations/API-Revolution

The speaker's argument is that we're in the middle of a switch from RESTful resource-oriented APIs to what he calls "Experience-Driven APIs". He explains how Netflix has made this shift with its PS3 app. They effectively introduced a new facade layer inside the server and that's what the PS3 talks to.

The main thing about this talk is that this whole question constantly repeats itself with the pendulum swinging back and forth. There's really no right answer and it depends a lot on the reason for providing the API in the first place.

And that gets back to another reason we are probably seeing the trend, which is that companies like Twitter are becoming more protective of their APIs. While the OP is mostly arguing this from the perspective of technical efficiency, this commercial argument could sway companies even further towards intent-driven APIs.

I imagine we will end up in a dichotomy, where companies who provide their own UI and services will typically support intent-style APIs, while companies who specifically have a business model of charging for their API will continue to offer the more precise resource-style API.


TL;DR If you remove features from an API then there will be less to break.

But the article just assumes that those features you removed weren't needed. Seems a bit dubious to me.


Nice and fresh idea about API design, thank you fenniak. You almost answered my questions as they popped into my head, especially about DRY!


Awesome, I'm so glad to hear that. Thanks for your feedback. :-)


How is the simpler API more "fragile"? It may lack some virtues (e.g. a paging interface) but how could it be more robust?

Also note that it exemplifies many of the virtues cited (ease of pre-generation, ease of caching, etc.) over the more complex option. Indeed, I'd suggest many API designers would prefer to offload busywork such as sorting to the client for anything other than very large datasets (your four-core 3.2GHz CPU (a) is running idle, and (b) costs me, the API designer, nothing).

Consider that the simpler API is: a) simpler to document b) simpler to implement c) simpler to consume d) scales better

Note that every time the user changes sort order the complex API gets hit. The simpler API is out getting coffee and scoring with the sexy APIs from across the street.

How is it worse?


I agree web APIs need to be less fragile, but I definitely wouldn't advocate following this approach. To be honest, I'm actually surprised how well the original followed good REST principles. HATEOAS is a good way to make your API less fragile to change, btw.


I don't think you are making a convincing argument for your point. Most likely it will be a developer interpreting the API, and generally speaking they are likely to want some flexibility in how they query a resource.

I would say your proposed suggestion is much more limiting; you have basically removed the query string and changed the name.

So your proposed version returns all the data, in most-notorious order, and that could be a tonne of data. You are going to need query strings for paging, filtering, and sorting. This is perfectly good API design.

I would recommend odata.org for a read. I would also recommend the apigee.com white paper on good API design.

Also, with versioning in mind, you may wish to include api/v1.x/mostwanted


Am I the only one who was really disappointed that there isn't actually an FBI API?


All I could find on data.gov was this widget https://explore.data.gov/widgets/hwp2-9a2f


Well, vague on details is a bit tricky. What happens when the data set becomes huge and they need to include pagination? Sometimes the intent-based API will start to include payloads in the body of the request to specify things like page number etc. Is this less fragile? Or have we just moved the complexity into a different place?


After building out dozens of APIs, I have come to the conclusion that in order to design a great API you have to start by writing a client for it first.


I feel as though that idea is related to test-driven development.



