API Practices If You Hate Your Customers (acm.org)
335 points by blopeur 37 days ago | 248 comments



I was expecting to see two of my pet hates - returning a null array to represent no items, and returning a single object without an array to represent one item.

I also once worked with an API where you had to send the data in POST format - abc=123&def=456. After much pressure from their customers, they finally relented and added an XML version of their API... where your request could look like this: <request>abc=123&def=456</request>.

Even better, note the & is not an &amp; - you can't even use a standard XML parser to interpret the result.
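To make the parser complaint concrete, here is a minimal Python sketch. It assumes the request body looks exactly like the example above; the `parse_request` helper is made up for illustration and is not part of that vendor's API.

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qs

RAW = "<request>abc=123&def=456</request>"          # as described above
ESCAPED = "<request>abc=123&amp;def=456</request>"  # the well-formed equivalent


def parse_request(xml_text):
    """Return the form fields inside <request>, or None if the XML is not well-formed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # A bare '&' starts an entity reference, so a standard parser gives up here.
        return None
    return parse_qs(root.text or "")


raw_result = parse_request(RAW)
escaped_result = parse_request(ESCAPED)
print(raw_result)
print(escaped_result)
```

With the escaped form, a stock parser hands back the text and the usual form decoding works on it. With the raw form, the only options are string hacking or a lenient parser.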

Also: APIs that have a mix of strongly and eventually consistent calls and no documentation about which is which. Or, an API that was at one point strongly consistent turns eventually consistent without warning.


I had this happen with a vendor that was told that they needed to add REST support to their legacy Java app. They turned up with an API that encoded XML as base64 and put that in a POST variable on an HTML page. Then they added their own crypto on top with hard-coded RSA keys. They helpfully included both the public and private keys in their documentation.

When they turned up for a meeting to present the fruits of their labour they got patted on the back by all the managers for a job well done. Record time too!

Their faces fell when I told them that I'm flat out rejecting their API design and that they will have to start from the beginning.

Their excuse for the design was that ampersands are too hard to process, so base64 encoding made them "safe" to include in the REST API.

Sometimes you just have to put your foot down about this kind of idiocy.


Unfortunately I have vivid memories of one API I worked with (many years ago) that included entire encoded XML documents as attribute values within top level XML. At least there was only one level of recursion.

Still, perhaps not as bad as using XML documents as database keys...


A common misconception is that you can put anything in xml via CDATA so long as it doesn't have the ending delimiter embedded.

https://stackoverflow.com/questions/21087648/xml-invalid-cha...

I know about this because I received a file with illegal CDATA characters once.
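A quick way to see it, sketched with Python's standard-library parser (the sample strings are made up): markup characters are fine inside CDATA, but a control character such as BEL is not a legal XML 1.0 character anywhere, CDATA included.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """True if the standard-library parser accepts the document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# '<' and '&' are allowed inside a CDATA section...
markup_in_cdata = is_well_formed("<doc><![CDATA[a < b && c > d]]></doc>")
# ...but a control character (here BEL, 0x07) is still rejected.
control_in_cdata = is_well_formed("<doc><![CDATA[bell: \x07]]></doc>")
print(markup_in_cdata, control_in_cdata)
```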


Receiving XML that is almost valid can be a real pain - it certainly used to be that some XML parsers didn't report some weird errors in a particularly helpful way requiring manual scrutiny.

I have unpleasant memories of trying to work out why some SOAP web service was breaking and it turned out the WSDL was invalid in a subtle way.


I saw that in HTML. Not from a professional developer but it still blew my mind. That was the final straw that made me quit the project.


I've needed to do that for HTML. If you need to put part of a document inside an iFrame (perhaps because you don't trust it) you have your backend generate <iframe sandbox srcdoc="..."> and nest it.


There's often a semantic difference between null and empty.

Like, if I'm checking the result of a batch processing job, I want to know if the job finished & resulted in an empty set, or there is simply no result yet.

It's "the thing you're looking for doesn't exist" vs. "the thing you're looking for exists, but is empty."


Right, which is why it's really confusing when an API returns null to mean empty.


That's fair to say I guess - but it should be accompanied by some other property to indicate a condition like that.

The API I'm thinking of though was effectively a wrapper around a database query which retrieved items.

That's what gets me about that decision you see. When you get nothing from a database, you get an empty set. The runtime was some version of .NET Framework, which by default would write an empty set as an empty array. So it wasn't even an accident - someone actually had to add extra logic to make it return null.


Perl's DBI module gets around this with a value "zero but true" aka 0E0.

It is useful in cases such as indicating an operation was successful, but zero rows were affected. Using it as a number resolves to zero, using it as a boolean results in true.
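For anyone who hasn't met it, a rough Python analogue (not DBI itself; `rows_affected` is a made-up helper). Python doesn't convert strings to numbers implicitly, so the numeric side needs an explicit conversion, and unlike Perl the string "0" would also count as true here.

```python
ZERO_BUT_TRUE = "0E0"  # what DBI hands back for "succeeded, zero rows affected"


def rows_affected(result):
    """Interpret a DBI-style result: None means failure, otherwise a row count."""
    if not result:              # None (or "") -> the call failed
        return None
    return int(float(result))   # "0E0" -> 0, "3" -> 3


succeeded = bool(ZERO_BUT_TRUE)          # a non-empty string is truthy
count = rows_affected(ZERO_BUT_TRUE)     # numerically it is zero
print(succeeded, count)
```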

I think in the general sense under API though, a status flag is the best way to avoid confusion as you say. Null could just as easily mean 'error' as 'not yet finished'.


I've seen this happen due to system evolution. At first some entity may have a parent record or not. So when you ask for the parent, you either get it, or null.

But then the system evolves to not be many-to-one, but many-to-many. To avoid breaking old clients, they make it so the only difference is when they return multiple related records, in which case they're given in an array.

Thus you now have: null for empty, the record itself if there is only one related, and an array of records if there is more than one related.
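Clients of an API like that typically end up with a shim along these lines. This is a sketch of the workaround, not anything the API provides; `as_list` and the sample records are invented for illustration.

```python
def as_list(value):
    """Normalise a 'null / bare record / array of records' field into a list."""
    if value is None:
        return []            # null meant "no related records"
    if isinstance(value, list):
        return value         # already the many-to-many shape
    return [value]           # a bare record meant "exactly one"


# All three shapes can now go straight into the same loop.
for payload in (None, {"id": 1}, [{"id": 1}, {"id": 2}]):
    for record in as_list(payload):
        print(record["id"])
```

Every consumer has to write this once per field, which is the cost the API designers pushed onto their clients.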


There are still evolutionary justifications which I'm sympathetic to now and then, but mostly not. In many (most?) cases, whether something should be one-one or one-many is known up front. Or should be known. We have decades of examples of best practices with many common data structures. Hard coding a customer account to only ever have one address, for example - no. I don't buy that justification - a customer/address thing - for any size company/project - should just be modeled as one-many (at least). It may be slightly more 'work' up front, but that work avoids potentially major breaking changes and work later on.

I get countered with "YAGNI" now and then, but after 25+ years of doing this (and, again, decades of examples of your exact use cases already in google ready to learn from), I can usually tell when you ARE going to need it.


My example was pretty contrived– I would also expect some sort of flag representing the state of the job in that case.

My point was just that the absence of a value is distinct from an empty value. And that intentionally modeling those 2 cases separately can remove a lot of ambiguity.
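Roughly what I mean, as a sketch. The `JobStatus` / `JobResult` names and fields are made up for illustration: the state is explicit, so an empty list only ever means "finished with nothing".

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class JobStatus(Enum):
    PENDING = "pending"
    DONE = "done"
    FAILED = "failed"


@dataclass
class JobResult:
    status: JobStatus
    items: List[str] = field(default_factory=list)  # always a list, never None
    error: Optional[str] = None


def describe(result):
    """Turn a result into a human-readable summary without guessing at nulls."""
    if result.status is JobStatus.PENDING:
        return "no result yet"
    if result.status is JobStatus.FAILED:
        return f"failed: {result.error}"
    return f"finished with {len(result.items)} item(s)"


print(describe(JobResult(JobStatus.PENDING)))
print(describe(JobResult(JobStatus.DONE)))
```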


Ehm you shouldn’t GET for the result, but for the job descriptor. If you really need to GET the result then you should expect an HTTP 4xx or 3xx response.

There is no justification for NULLs


If your API is RESTful. There are plenty of “rest” APIs out there that are really JSON-based RPC endpoints.


Of course there is, but generally speaking there's an infinite number of possible reasons for an empty or null value, not precisely two. That's why if you don't watch out they multiply.


> returning a null array to represent no items, and returning a single object without an array to represent one item.

This just reminded me of a long ago incident at a former employer. I worked in back-office at a hedge fund and was responsible for maintaining several APIs & services written in C++. The APIs were pretty straightforward: query for object(s), get back a vector of objects. If nothing was found, you'd get back an empty vector.

The company also had a program where they'd hire grad students right out of school and immerse them in the trading desk, working on trading algorithms and such. One such kid sent in a "bug report" directly to me, CCing all of front office. The "bug"? He was querying for something that didn't exist, so he got back an empty vector. He didn't check that it wasn't empty, and immediately accessed the front of the vector. Boom, segfault.

Normally, I would have been kind to such a junior dev, but their original email was very condescending and went to an unnecessarily large audience - like hundreds of people that didn't have time for this. His email read something to the effect of "Your API is crashing my code. Fix your API."

Instead, this junior got a whole other lesson. I replied all, and included all of back-office on the email (so the rest of the team was aware to watch out for this clown). "The API is working perfectly fine and as designed. The fuck up is on your end. You queried for something that doesn't exist and proceeded to dereference an empty vector. Fix YOUR fucking code."


I really don't like your story. The junior is clearly behaving immaturely and inappropriately by CCing others and not using an appropriate tone, but your response is to... retaliate by doing the exact same thing to him? What lesson is the junior supposed to learn here? That acting that way is A-OK as long as you have seniority and are factually correct? At least the junior has the "excuse" of being a junior, but really as a more senior person you should have known better.

Surely there must have been a better way for you to handle this situation.


I fundamentally disagree with the idea that it's a virtue to act politely and "professionally" in all circumstances, no matter what the provocation. I think it can be entirely reasonable (depending on specific circumstances) to publicly shame someone who has been publicly acting like an asshole.

Simply being wrong in public deserves a gentle correction (possibly in private, again, depending on the circumstances), but often I don't really care to be nice to people who aren't being nice. It's certainly possible that being nice all the time in situations like this gets you better outcomes overall, but we all have a limited amount of patience to dole out, and some people just don't deserve it sometimes.

And on the other hand, some people don't get the message when you respond politely. Responding politely in this case could just as easily teach the junior that his debugging strategy (blame anyone but himself) and dickish tone were completely ok. There's certainly a polite way to inform the junior that his behavior won't be tolerated, but I can't fault someone for simply responding in kind.


I had a similar situation where a certain person would blame my API for all of his problems in a very public manner. First few times I responded quietly pointing out the mistakes but then I started responding publicly pointing out that the person should first learn the programming language before blaming others with examples from previous interactions. The behavior stopped immediately. Sometimes you have to bully the bully and make clear that being new is not an excuse for bad behavior.


I'm not seeing anything wrong with what OP did. By CCing everyone, the junior is trying to make themselves look good by making OP look bad. The junior learned that if they act like an ass, especially an incorrect ass, the other employees won't tolerate it.


That's the DESIRE for what the junior learned. But is that what they actually learned? One could just as easily (or more easily) decide instead that this is the way business is done. It's literally all they've ever seen.


Responding politely could also teach the junior that it's ok to act like an asshole, because he'll get a polite response no matter what.

I suppose the ideal response would have been a polite correction, followed by a polite rebuke regarding the junior's tone and behavior, but I can't fault someone for responding to fire with fire on occasion.


A polite response does not have to be a meek response. You can be blunt, you can point out all the problems, while still being polite.

That also has the advantage of being 100% clear.

"You just told hundreds of people my code is poor. You attacked me, and my professional reputation, and you did it inaccurately. This gives me little reason to respect you or want to help you. Asking without copying the world and without assuming you were flawless would have done a lot to help your case. Because I AM professional and you are new to the profession, I'll hold back on the reflex to point out to every one of these hundreds of peers that influence your career how rude, presumptive, and WRONG you were. Instead, I'm pointing out in private that you are wrong and not behaving in a way someone looking to succeed is behaving. This tolerance is not something you should expect from me again, and you should be grateful to be getting it now because your behavior does NOT call for cooperation. If you do this again, the next person will likely not show this patience and understanding. Do you understand that you made the error in your code? Do you understand how your behavior invites people to think poorly of you? Do you understand how to change both your code and behavior for the future? "

This has the senior developer take on more work instead of getting a sense of vindication. However, this is much more likely to get a positive result in the long term - there's no mystery hope that someone that clearly missed out on social repercussions will suddenly understand them when attacked.

Being a senior developer is not about winning battles, it's about learning to win without battles, and even more importantly, it's about TEACHING that.


I've gotta say, if I was a third party to the exchange that involved your suggested message I would think the senior developer was a pompous and condescending ass. In the UK we (or at least I) wouldn't consider what you wrote to be a polite response.


Funnily enough, my last team worked with a software supplier from the UK, and I’m 99.99% sure the above text was copy-pasted from an email reply to one of my overzealous team members. We thought it was over the top, but chalked it up to British culture.


There is also a difference between teaching and grandstanding; the reality of the situation is that there is no single correct response for this kind of situation: the junior maybe was zealous, maybe he was trying to shame the senior, maybe he just copied in CC the wrong mailing list.

Obviously the original answer was not appropriate in every situation, but also likely it was appropriate in some.


So what they've learned is, if someone makes an incorrect assumption, sends a nasty email, and CCs the entire department, I get to respond by sending an email back pointing out their error and CCing an even larger audience.

I don't think that's necessarily a bad thing to learn, even for the junior...


There's no way to know in advance what someone will learn from an experience. You also don't know what they learn 'now', and what they might reevaluate and relearn years from now about that same situation. Basing your response decision primarily around what someone might learn isn't a great way to decide how to respond.

Couple folks I'm working with right now, and I had thought a couple of times "well, this wasn't a great scenario, but at least they'll learn ABC from it". One did, one didn't, and keeps making the same moves (I hesitate to say 'mistakes', but in my view they are).


> There's no way to know in advance what someone will learn from an experience.

Correct. You can, however, improve the odds. Being explicit about what you want them to learn as opposed to expecting a certain degree of interpretation cannot hurt your odds, though it can't guarantee them.

> Basing your response decision primarily around what someone might learn isn't a great way to decide how to respond.

In this context, what else was the point of the response? I was comparing to the STFU tell-off of the above post and the defense of it that they would learn from the experience.


> what else was the point of the response?

some other public reaction/recognition for the OP, not for the benefit of the original sender.

emotional venting? that's sometimes its own reward.


It wasn't wrong, but it also betrays that OP wasn't really a senior either. Just an older junior.


>Surely there must have been a better way for you to handle this situation.

The junior was in the wrong, so I think you'd agree that a correction is in order. Replying-all is one way to do such a correction. The alternative would be to force him to issue a retraction himself.


There's no need to shame people who are wrong.


True. If the junior dev was simply wrong, but polite about it, it might (and should) have ended without a shaming.

While there is perhaps no need to shame someone who is acting in a shameful way, I find it hard to criticize someone for doing so. Sometimes an aggressive statement deserves an aggressive response.


If they're trying to shame or chastise others, while being wrong, seems fair to me. Then again, I subscribe to the "play stupid games, win stupid prizes" school of dickish behavior correction.

Or, to defer to the ever-wise Gods of the Copybook Headings:

"What's good for the goose is good for the gander".


I've noticed that if you stay strictly professional, folks think higher of you and they feel shame for having done this. You also appear way wiser and likeable - all of which gives you more clout. And if you're more often right than your peers, everyone benefits by your having more clout.

I've been in this situation before and I've found this approach beneficial.



"I've noticed that if you stay strictly professional, folks think higher of you and they feel shame for having done this."

Some do, some think it is a sign of weakness and start behaving even more unreasonably.


And since the boss knows you are right because of your previous behaviour, all you need to do is ask the boss to take care of the disturbing element if it continues.


That's making a lot of assumptions about your 'boss'.


Yes I'm aware. I have been extremely lucky with my jobs so far.


Ah, but that's the kinder, gentler version of "what's sauce for the goose is sauce for the gander"[1] which seems even more apropos.

1: https://en.wiktionary.org/wiki/what%27s_sauce_for_the_goose_...


Normally that would be true.

But it's necessary to shame people who work the trade desks, or else they will never learn.


I honestly think people learn better if you don't shame them.


I honestly think it depends on the entire context including the people and all this generalization is useless and a waste of precious Internet bits.


not unless they're being dicks about it, like in this example ;)


Live by the sword, die by the sword.


Everyone dies from trying to dereference empty vectors that are != nullptrs at some point in their life. It isn't only APIs that can set traps...

Many people don't see mailing everyone as self promotion for showing off, most often they just don't know whom to address with stuff like this so they try shotgun mailing.

That alone can still be bad though, since it gives every recipient an opportunity to be distracted if they want to be.


It depends on the company. Often the best thing to do is fire back without escalating. With someone like that junior sending an email like that without first talking to someone is a good way to get fired or put on a PIP depending on how long they’d been with the company.


Really, there are many reasons to run internal mailing/distribution lists and designate emails from them in some manner, such as a subject prefix. One of which is that people are less likely to send an email to the "Staff" list if most of the correspondence there is from management and about company level things, and then it's also very appropriate to reply to the email with content that addresses the request (in this case, you're wrong, don't do that) and also a rider about how this is not the appropriate place for this, and they should submit to the appropriate list/form.


How did your API endpoint signal internal errors or parameter errors to the client?


I had the pleasure of working with the API of a customer that wanted to expose a JSON/REST API to their existing XML/SOAP backend. Instead of going the sane route and re-using the XSDs to serve as the structure for the JSON, they just made the JSON structure up on the go.

One child node? That would be one JSON object / value for you sir. Multiple child nodes? That would be one JSON array for you sir. No child node? No JSON type for you sir.

So what happened when the amount of child nodes was dynamic? You would either get: nothing, an object, or an array of objects. Oh the fun times we had!


That reminds me of a vendor who wraps SAP Business One in their own webservice. This webservice has two business methods.

The first one, ExecuteXML, takes an <XmlBody> representing a regular SAP B1 XML request and passes it on to one of the real SAP services. We have to find our own XSDs for the inner part, because they sure as hell don't have those.

The second one is ExecuteSQL. It lets us run raw SQL against the SAP database. It doesn't have any support for prepared parameters. What it does have is a blacklist to prevent DDL and other funny business, such as semicolons. This blacklist runs on the raw string you send, and doesn't understand any escape characters. To send a string containing a literal semicolon, I had to turn it into CONVERT(VARCHAR(MAX), 0x...).
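For contrast, this is what bound parameters buy you. The sketch uses Python's built-in sqlite3 purely as a stand-in (the vendor's service sits in front of the SAP database and offers nothing like this); the table and values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")

# The value contains a semicolon and a comment marker. Because it is bound as a
# parameter, it never becomes part of the SQL text and needs no escaping.
text = "first; second -- not a comment"
conn.execute("INSERT INTO notes (body) VALUES (?)", (text,))

stored = conn.execute("SELECT body FROM notes WHERE body = ?", (text,)).fetchone()[0]
count = conn.execute("SELECT COUNT(*) FROM notes").fetchone()[0]
print(stored, count)
```

No blacklist and no hex-encoding trick: the driver keeps the data and the statement apart.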


I bet $50 there was a junior dev promoted too high too fast proud of that technical feat.


That sounds like an absolute nightmare - at least keep a convention.


For those Gophers reading this and sobbing silently, there may be hope: https://github.com/golang/go/issues/27589


Why would you want to marshal a nil slice to json as non-null? That just hides the fact that you had a nil slice. In fact, that's the exact opposite problem; representing nulls as empty arrays.


As explained in the thread, that is because Go best practice is to treat `nil` and `[]T{}` as the same thing in every API. If you want an explicit `null` in your JSON you can use `*[]T` as your field type, and then it makes more sense.

The very idea of allowing the nil value for slices seems to be very strange and inconsistent, as slices are struct types, not pointer types in Go (they contain a pointer and other members). Just one of the many ways that builtin types in Go are fundamentally different from user-creatable types, I guess...


It is admittedly very weird that Go supports nil slices/maps instead of just having the nil value be an empty slice/map that points to constant storage. But as long as Go has a semantic difference internally, representing that as null externally makes sense. Though I suppose as long as the conversion from null to empty array/object is opt-in it's fine.

For context, we recently had a bug where backend forgot to initialize their map, so they were sending us a null where we expected an object, and it would have gone undetected for much longer if the JSON didn't contain a literal null there.


The nil value for a slice is an empty slice that points to constant storage: Data=0, Len=0, Cap=0. That's just not the same as Data=somethingelse, ... which you get if you allocate something.

(And all Go zero values are exactly what you get with the relevant RAM filled with the zero byte.)


If the nil value for a slice is the empty slice, then why do the following two variable definitions differ in behavior?

  var x []int;
  y := []int{};
Both produce a slice of capacity zero, except `x` serializes to null and `y` serializes to an empty array.


returning a null array to represent no items

To be honest, that's what I'd expect. What do you dislike about that result, and what would you prefer to see returned? Jump straight to 404?

returning a single object without an array to represent one item.

So an array if there's multiple results, and a bare object for a single result? That's unpleasant.


> Jump straight to 404?

That's my pet peeve. I completely understand the logic - using the HTTP status code to show the result of interaction with the resource is very RESTful - but my complaint is that unless you provide more context, a 404 doesn't tell me if I'm getting a response of "that resource doesn't exist" vs "you're accessing a url pattern that will never work".

So if you want this random internet user to be pleased, never return a 404 response unless the caller is able to see this difference. (Usually having SOME form of body unique to the API is sufficient to prove the 404 is not because of a bad API call).
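Something like this would do, sketched as a bare routing function rather than any particular framework. The `/users/<id>` route, the `USERS` data, and the error field names are all made up for illustration.

```python
import json

USERS = {"42": {"name": "Ada"}}  # hypothetical data store


def handle(path):
    """Return (status, content_type, body) for a GET request to the API."""
    parts = path.strip("/").split("/")
    if len(parts) == 2 and parts[0] == "users":
        user = USERS.get(parts[1])
        if user is not None:
            return 200, "application/json", json.dumps(user)
        # Valid route, missing resource.
        body = {"error": "resource_not_found", "detail": f"no user with id {parts[1]}"}
        return 404, "application/json", json.dumps(body)
    # The URL pattern itself will never work.
    body = {"error": "unknown_route", "detail": f"{path} is not an API endpoint"}
    return 404, "application/json", json.dumps(body)


print(handle("/users/7"))
print(handle("/usrs/42"))
```

Both are 404s, but the caller can tell a typo in the URL from a record that doesn't exist.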

Giving me the HTML to your site 404 page when I called a non-HTML API is likewise sad-inducing.


We have only one pattern like this... when there's a client resource (containing personal information for contacting them) that the client has asked to be private... we return an empty JSON object (versus an object with the contact information keys)


If an API is expected to return an array, but the data set is empty, I expect an empty array. I should be able to go straight from the request to a for-each loop without checking for null.


> What do you dislike about that result, and what would you prefer to see returned?

It’s inconsistent and means I need to write special case code to check for it, when before I could choose to. It should return an empty array.


null array sounds like it's an array (presumably an empty one), it would have been clearer if OP had said null instead.


Not sure why you're getting downvoted - that was how I read it as well.


Presumably they would prefer an array always be returned. If zero items, an empty array; if one item, an array of length one; if more items, a longer array.


>> returning a null array to represent no items

> To be honest, that's what I'd expect. What do you dislike about that result, and what would you prefer to see returned? Jump straight to 404?

I'd expect an empty array to represent no items. It's the difference between "" and "[]".


404 indicates an error, if the result is just empty but not erroneous 204 "no content" may be better.


+1, 204 is underutilized for no-result cases.


In the browser it also won't trigger a page refresh, as there is nothing to render.


“returning a single object without an array to represent one item.”

I saw the same thing when working with PHP consuming data from a SOAP endpoint. First I thought PHP is stupid. But then I realized that XML can’t model single item arrays versus single objects. They look the same. JSON is better that way. You can model empty arrays and single item arrays.


> XML can’t model single item arrays versus single objects. They look the same.

    <results>
      <result>
        <property>value</property>
        ...
      </result>
    </results>
I don't really see the problem.


You need to know that “results” is an array so you need the schema. In JSON an array is an array without doubt.


I mean, sure, but if you're creating an XML API, you've almost certainly defined the schema. If I take JSON like

    [
      { "fruit": "banana" },
      { "color": "yellow" },
      [ 1 ]
    ]
...and turn it into XML like...

    <document>
      <fruit>banana</fruit>
      <color>yellow</color>
      <ids>
        <id>1</id>
      </ids>
    </document>
It's going to be fairly clear what's going on.

I'd generally prefer to work with JSON, too. But the silver lining of XML is that you can define virtually any schema you want, and there are times that's conceivably going to be clearer or even more concise than the equivalent JSON.


If you don't know that "results" is an array... how do you handle the uncontroversial case of getting back an array of two results?


I know that but the PHP deserializer doesn’t know and gives me back an object instead of an array of one. If it’s two it detects an array.


But it's always a single object. If it has two parallel nodes at the document root, it isn't valid XML.

So an array of two results looks like this:

    <results>
      <result></result>
      <result></result>
    </results>
and PHP automatically treats results as a two-element array, but for the one-element array

    <results>
      <result></result>
    </results>
you just get a parse error?


Not a parse error but a single “result” object, not an array of length 1. With two elements it recognizes an array.


If you can parse

    <results>
      <result></result>
    </results>
and you end up with a single "result" object, you must have identified "results" as being an array of one element so that you could discard it. (You didn't get a single "results" object.)

That's not a problem in what XML is able to model, it's a problem with you throwing the information you got away and then pretending you didn't receive it.

On the other hand, if you meant that you get a single "results" object, then you're saying that your code is written to require invalid XML and fall down when valid XML is provided. Again, that sounds more like a problem with your code than with the modeling capabilities of XML. There's an argument to be made for accepting invalid input; rejecting valid input is ridiculous.
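For comparison, a sketch with Python's ElementTree (not the PHP SOAP deserializer under discussion): asking for the `result` children of `results` gives a list whether there are zero, one, or two of them. The element names are taken from the examples above.

```python
import xml.etree.ElementTree as ET


def results(xml_text):
    """Return the <result> children of the root as a list, whatever their count."""
    root = ET.fromstring(xml_text)
    return root.findall("result")


none = results("<results></results>")
one = results("<results><result>a</result></results>")
two = results("<results><result>a</result><result>b</result></results>")
print(len(none), len(one), len(two))
```

The wrapper element carries the "this is a collection" information; whether it survives is up to the deserializer.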


You do understand that php and quite a few other languages provide pre-built serialization and de-serialization from XML, right? The OP was talking about the out of box behaviour for those. Your answer seems to predicate on having written a custom deserializer.


My answer is predicated on the idea that you should be using a deserializer that doesn't rely on receiving invalid input, regardless of whether that deserializer comes from a standard library.


A sample output (either XML or JSON) is not a substitute for a specification.

Looking at the XML when a list has only a single element, you wouldn't know if there could be more. You need an XSD to accompany the XML, which is what is typically provided.

When looking at a JSON document you might run into the same situation where the document you received doesn't contain all possible elements, and you wouldn't be the wiser. You'd still need a proper JSON schema to accompany the document.


You can force PHP to handle single-element arrays consistently with the SOAP_SINGLE_ELEMENT_ARRAYS option; the behaviour is much saner when enabled.


Good to know! Unfortunately I have a call to a “FixSoapArray” all over the code already :(


What would we do without ten different ways to represent nothing?

  if ismissing(x) then
       x = empty
  end if

  if isempty(x) then
       x = ""
  end if

  if x is nothing then
       x = new foo
  end if

  if right(typename(x), 2) <> "()" then
       x = Array(x)
  end if


I worked with a product for natural language processing that wanted the text in a query string. This led to 100+ page documents being sent as a string in the request. My usual REST testing app would freeze up if I wanted to test some of the largest documents in the data set.


> This led to 100+ page documents being sent as a string in the request

How were you able to do this when the standard maximum length of a query string is 1024 bytes? I guess you could flout the standard as you were responsible for the backend.


Late response, but maybe you'll still read it...

We had some issues because there was no real security on the API of our NLP tooling. So we put NGINX in front of it to create a form of API key auth. NGINX would deny the larger messages by default, so we had to increase some parameter so it would pass the large documents. And indeed, since this was only relevant in the backend, it didn't matter. For the front-end we built our own API that was a lot more sensible.


Except that standard...isn't. It's what some ancient version of MSIE did and that used to count for a standard in the Triassic; nowadays, it gets passed around as cargo cult advice. (The relevant RFC recommends supporting request lines of at least 8000 bytes, sure.)

https://stackoverflow.com/questions/417142/what-is-the-maxim...


It's not cargo-cult: anyone who's worked on cross-browser front-end development probably encountered this at least once. I ran into this personally a few years back when I was trying to be too clever by half with base64-encoded queries. It's not just ancient IE that has limits - in my case it was a corporate proxy that was truncating the query (which is why only that customer was getting that bug). A year or 2 before that, I ran into the MSIE limit (must have been IE8 or 9: don't know if that counts as the Triassic period, because I don't know what we'd call the IE4-6 era)
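Whatever the real limit is on a given path, the usual way around it is to move the payload out of the URL. A sketch using only the standard library; the endpoint is hypothetical and nothing is actually sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "https://nlp.example.invalid/analyze"  # hypothetical endpoint
document = "lorem ipsum " * 20000                 # stand-in for a long document

# Query-string style: the whole document ends up in the URL.
as_query = Request(ENDPOINT + "?" + urlencode({"text": document}))

# Body style: the URL stays short and the document travels as the request body.
as_body = Request(
    ENDPOINT,
    data=document.encode("utf-8"),
    headers={"Content-Type": "text/plain; charset=utf-8"},
    method="POST",
)

print(len(as_query.full_url), len(as_body.full_url))
```

Proxies and browsers that truncate or reject long URLs don't care about the body in the same way, though servers can still cap body size.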


I did encounter this multiple times...not in the last decade though.


> returning a null array to represent no items, and returning a single object without an array to represent one item

this happens when you add JSON version for old XML-based APIs (seen this multiple times)


> this happens when you add JSON version for old XML-based APIs (seen this multiple times)

...without using an XSD schema. With a schema, not an issue
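
A minimal Python sketch of why a schema-less XML-to-JSON conversion produces exactly those three shapes. The converter below is a deliberately naive illustration, not any particular vendor's code: with no schema, it cannot know that `item` is repeatable, so it only builds a list when it happens to see more than one.

```python
import xml.etree.ElementTree as ET


def naive_to_json(elem):
    """Convert an element to JSON-like data without any schema knowledge."""
    children = list(elem)
    if not children:
        # Leaf or empty element: text is None when there is nothing inside.
        return elem.text
    out = {}
    for child in children:
        value = naive_to_json(child)
        if child.tag in out:
            # Second occurrence of a tag: only now do we learn it repeats.
            if not isinstance(out[child.tag], list):
                out[child.tag] = [out[child.tag]]
            out[child.tag].append(value)
        else:
            out[child.tag] = value
    return out


zero = naive_to_json(ET.fromstring("<items></items>"))
one = naive_to_json(ET.fromstring("<items><item>a</item></items>"))
many = naive_to_json(ET.fromstring("<items><item>a</item><item>b</item></items>"))
```

Here `zero` comes out as null, `one` as a bare value under `item`, and `many` as a list, which is the inconsistency the grandparent was complaining about.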


I actually really like POST format. Very easy, very standard, very readily available.

I also like that it encourages flatter data models.


What would you suggest instead of an empty array for a 0 item return?


The person you replied to doesn't like returning "null". He/she would probably prefer returning an array that contains 0 items.


He called it a "null array" which I interpret as an array containing zero items. If it was just a plain old null then yeah, that's super annoying.


This makes sense now. If I did a GET to posts and got a "null", I would think I made some kind of error. If I get an empty array, I would (correctly) assume there are no posts yet.

Basically, GET to posts would always return an array of posts. Whether there are 0 (empty array), 1 (just one object in the array), or many (array with n length). This makes API logic way easier to handle without having to check what structure I received even with a 200 status code


Also makes a lot of iteration easier... just a for loop over the array to dump something on the page with no edge cases.

You can add a warning by just checking the length against that array blindly.


If you return an empty array, I can always process the result by looping through the array.

If it's sometimes an array and sometimes null, I need branched logic for each call.
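
To make the branching concrete, this is roughly the shim a client ends up writing in Python when the API is not consistent. The name `as_list` is just an illustrative helper:

```python
def as_list(value):
    """Normalise 'null', a single object, or an array into a list."""
    if value is None:
        return []          # API sent null for "no items"
    if isinstance(value, list):
        return value       # API sent a proper array
    return [value]         # API sent one bare object for "one item"


def titles(posts):
    """Loop without edge cases once the shape has been normalised."""
    return [post["title"] for post in as_list(posts)]
```

If the API always returned an array, `as_list` would not need to exist and `titles` could loop over the response directly.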


I was totally expecting to see something about using a protocol in an unexpected way, because "the protocol is not good enough".

I had to work with an API where the company decided everything should return http code 200 (well, at least all 4XX errors), and give the error code in the JSON response, mixing existing 4XX errors and their own errors.

When pointed out, the support answer was "we chose to give meaningful error messages instead of HTTP codes, that's why we respond with 200 in case there's an error in the request". Not the answer I was expecting.

Another annoying practice is to answer 2XX, put the request in a queue, and not provide ANY information about the queue, which can be minutes or even HOURS long. Debugging is a nightmare. The one I worked with who did that did not even have the excuse of being a startup or small company.
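
For contrast, a sketch of what a friendlier queued response could look like: 202 Accepted plus a pointer to somewhere the client can poll. The `/jobs/...` path, the field names, and the function itself are all invented for illustration; this is one common convention, not something the API in question offered.

```python
import json


def accepted_response(job_id: str, queue_position: int):
    """Build (status, headers, body) for a request that was queued, not completed."""
    status_url = f"/jobs/{job_id}"  # hypothetical status resource
    body = json.dumps({
        "job_id": job_id,
        "state": "queued",
        "queue_position": queue_position,
        "status_url": status_url,
    })
    headers = {
        "Content-Type": "application/json",
        "Location": status_url,  # where the client can check progress
    }
    return 202, headers, body
```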


> I had to work with an API where the company decided everything should return http code 200 (well, at least all 4XX errors), and give the error code in the JSON response, mixing existing 4XX errors and their own errors.

If all of your errors line up perfectly with HTTP, I guess this could work. But if you've got something that doesn't fit, or two things that would map to the same one, it gets weird.

And then you have http client libraries that do great things like only return response bodies if status 200, or only return http statuses they were aware of. It's not very RESTful to just return http 200 with an embedded application status, but it's easy and consistent, and I would not write an HTTP api otherwise, unless I had a good reason to follow some existing spec that used statuses.


It depends. I've seen bigco use this technique and their reason was that they wanted to separate problems with API and request from actual issue with their servers/platform. So if you get 200 it means they received request successfully but if there is issue with request they'll include error in response with 200. If it's anything else then there is a networking or delivery problem. It's not standard way to handle things but can be useful at times.


If the server got an issue with your request it can return 400 (Bad Request), which means that the server has received it successfully before acknowledging it as a bad one.

Even then, if the server returns 400 (Bad Request) the server can still attach a response body to that, in plain text/plain, application/octet-stream or even application/json, which could contain elaborate information.
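
A small Python sketch of that idea, framework-agnostic on purpose: the function just builds the status, headers and body a handler would send. The envelope shape (`error.code`, `error.message`) and the function name are assumptions for the example, not a standard.

```python
import json


def error_response(status: int, code: str, message: str):
    """Build (status, headers, body) for an error that keeps its real HTTP status."""
    body = json.dumps({"error": {"code": code, "message": message}}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Content-Length": str(len(body)),
    }
    return status, headers, body


# The status line says "bad request"; the body says exactly what was bad.
status, headers, body = error_response(400, "missing_field", "'email' is required")
```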


This is what the 400 and 500 ranges of HTTP status codes are for.

The former when the problem is on the client side, the latter for a problem on the server.

And you can still send a descriptive body for the error, if you so choose.


This is not true. Counterpoint, responses like 404 or 503 Service Unavailable do not come from the destination server. They do not indicate the intended server received your request.


404 and 500 should come from the destination server. You are correct that 503 would not.


Hmm are these assumptions valid? Can't a misconfigured load balancer cause a 404? Couldn't a bug in nginx, node or an app server produce a 500 response outside of your control?


I suppose a misconfigured load balancer or nginx instance could return absolutely anything. I think you'd have to be actually maliciously misprogramming it, though, to reach that level of dysfunction!


For example, suppose you want to distinguish between a missing/deleted resource /myuser/23123 and a completely invalid query /muser/23123.

Both of these are 404 (or 410 for permanent caching) responses according to HTTP, though they have very different reasons for "non-existence".


No, the "completely invalid query" is 400.

404 is only for "the request makes sense but that specific resource doesn't exist".


400 is a very generic error. It is often used to complain about problems with the request body.

Considering how brief the HTTP RFC is, your interpretation is I think as good as any, if very uncommon. (Not what Django, Rails, Flask, etc. do).


In this case it’s a 404 because the query URL pattern doesn’t exist - the most common cause of 404s.


Same thing, no? If I ask for /api/user/238885 and I get a 404 back, it's because the resource "user" with ID of 238885 doesn't exist. If I ask for /api/banana/wharrgarbl, it's because the resource "banana" with the ID of "wharrgarbl" doesn't exist.


How is this better than using different status codes depending on the type of problem?


>I had to work with an API where the company decided everything should return http code 200 (well, at least all 4XX errors), and give the error code in the JSON response, mixing existing 4XX errors and their own errors.

So here's the deal with this pattern...If you're returning a typed error response, something the client application should interpret, you want to be able to know which error responses will actually have that body and will not be a generic error like a 404 or a 503. If the response code is 200, you can be generally sure that the response came from the target host. Thus, the client knows they can parse an api level error from a 200 response and they should not attempt to parse non-2XX responses. I don't love the pattern but it's not completely pointless.

Does anyone know if the HTTP spec guarantees codes in 4XX range should only come from the intended host? It seems like 400 is a safe bet but I've never double checked myself.


The client should try to parse the response body only if it has the appropriate Content-Type header value. It should not assume that responses with various status codes have a particular body format.
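
In Python that check could look something like the sketch below. The helper names are made up, and accepting any `+json` suffix is a choice made for the example rather than a requirement:

```python
import json


def is_json_media_type(content_type: str) -> bool:
    """True for application/json and for structured types ending in +json."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == "application/json" or media_type.endswith("+json")


def parse_body(content_type, body: bytes):
    """Parse the body only when the server says it is JSON; otherwise return None."""
    if content_type and is_json_media_type(content_type):
        return json.loads(body.decode("utf-8"))
    return None
```

With this in place, an HTML error page from a proxy is simply not parsed, whatever status code it arrived with.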


Do you suggest having a content-type header specific to your app? Something like "application/my-app+json"? Will most tooling handle this correctly? In my experience the always 200 api style is a lot more common.


I've found that returning 200 with errors seems to be the sane way to do things, since sometimes it's difficult to tell whether a server error belongs in the http layer or graphql layer.


> When pointed out, the support answer was "we chose to give meaningful error messages instead of HTTP codes, that's why we respond with 200 in case there's an error in the request". Not the answer I was expecting.

I don't see the problem here. As a developer I'd much rather receive a standard json packet with information helping me figure out what went wrong.

I really don't understand your complaint here, I've designed and worked with both types of API's, and I find the standard json format to be far easier to deal with.


Totally agree that specific messaging is very convenient, but meaningful error messages and meaningful status codes aren't even remotely mutually exclusive. It's perfectly legitimate and even easy to send a 4xx or 5xx with a response body as JSON (or, for bonus points, with any other content type the client requests from possible server capabilities).

And in my experience, having an out-of-body/band general indicator that doesn't require you to parse message responses makes things WAY easier.

Well-designed APIs do both.


I'm curious, what tech stack are you working in where parsing json is difficult?

You want to talk about inconsistent? How about a 500 error may or may not result in the standard response format you're expecting because it may be coming from the server and it may be coming from the API.

I'd much rather my 500 errors be legitimate server problems.


While we're asking questions: what stack out there makes reading/writing HTTP headers hard? Because that's all it takes to work with response codes.

And yes, of course parsing JSON isn't difficult. Note that I said parsing messages -- the message field not the response body. And parsing that message field and checking conditionals to determine your client's behavior is something you'll have to write code for unless you're just relaying the error message back to the end user.

Now, if you have an HTTP error code, you already know something about why the error condition is happening before you look at any part of an error message field. For example, if it's a 4xx error, and you know that your client generated the associated request to the API using user data, then you can probably just pass the error message straight back to the user w/o parsing it and going through your own error message logic (although that depends on the quality of the API too).

> How about a 500 error may or may not result in the standard response format you're expecting

Well, in that case, relying on some message field you might have supposed would be in a specific JSON response isn't going to help you much either.

Might be better to have the HTTP error code and prepare your client to read responses based on multiple content types.


Two reasons why using 200 instead of 404 can make sense:

- No way to distinguish between the api client calling the wrong URL eg /mytypo/12345 instead of /myquery/12345. With 404, the client will not realise their mistake without debugging.

- Lots of 404 errors can trigger security sensors. Having to whitelist 404 in a security sensor can be a pain to convince the team that manages them. Doubly so if they are part of the other company rather than yours.

Aside from slightly less code to write, what benefit does 404 bring?


With a 200 your client won't know they've done anything wrong without debugging either. There is nothing preventing you from returning a detailed error message with your 404.
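
Rough Python sketch of how a client could tell the two cases apart when the 404 carries a body. The `code` field and the `record_not_found` / `unknown_route` values are hypothetical; the point is only that the application's own 404 can identify itself and a bare 404 cannot.

```python
import json


def explain_404(content_type: str, body: bytes) -> str:
    """Return the API's own error code for a 404, or 'unknown_route' if it has none."""
    if content_type.startswith("application/json"):
        try:
            payload = json.loads(body)
        except ValueError:
            payload = None
        if isinstance(payload, dict) and "code" in payload:
            # The application answered: the route exists, the record does not.
            return payload["code"]
    # No recognisable body: most likely a typo in the URL or a proxy's 404 page.
    return "unknown_route"
```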


If under your suggestion you have to look at the details of a 404 anyway to get a detailed error message, what is the point of the 404? Your client always has to do an explicit check for error/no-error. Why make them check two different locations? Just always look at the body: one location.


You don't have to. Plenty of times, a 404 is all I need to know. Why make the client check two different locations?


If the 404 is due to the client calling the wrong URL, rather than trying to access an entity/record that doesn’t exist, the client code will incorrectly assume the entity/record doesn’t exist when it actually does.

Eg an API has a /discountvoucher/ID, so your client can enter a discount voucher code and get info on the voucher. If your client code calls /voucher/ID instead, under the 404 approach you would incorrectly think the voucher doesn’t exist, when it does.

If using the other approach, you have this code in the body, you would know straight away that the URL itself is wrong because you’d have no JSON body.

So in the former approach the client code would be oblivious to the error until someone realises, maybe years down the track, that the code is calling the wrong URL. In the latter approach you would know immediately as the client will receive a body content it’s not expecting and throw a fit.

Yes you could return 404 as the http status code and embed the code into the body, but that brings us back to why do both? That opens up for lazy programmers to just check the first and not the second.


> If the 404 is due to the client calling the wrong URL, rather than trying to access an entity/record that doesn’t exist, the client code will incorrectly assume the entity/record doesn’t exist when it actually does.

There's the root of your error: the entity is the URL; the URL is the entity. If the client requests a URL which does not exist … that URL does not exist.

If the client requested a URL which does not fit the expected schema … that URL does not exist.

Once you embrace RESTfulness & HATEOAS, life gets so much simpler. Also, every time you return errors in a 200, God kills a kitten. If for nothing else, think of the kittens!


Because they can pull extra information out that can help them understand why they're receiving a 404. The lack of information can itself become a signal.


> -No way to distinguish between the api client calling the wrong URL eg /mytypo/12345 instead of /myquery/12345.

More often than not, this distinction matters less than one might think. In either case, you've got an upstream problem for the client: how it's determining the `mytypo` portion of the path or how it's fetching invalid identifiers. This is actually part of the point of the philosophy behind REST.

But if caring about fine distinctions is the point here, there's no appreciable improvement between using 404 for both "something's bad about the path" and "path's good, no resource here" and using 200 for both "path's good, no resource here" and "everything's fine!" Especially when, personally, I think the distinction between "I got back data!" and "I didn't get back data" is more important than "I didn't get back data for reason X" vs "I didn't get back data for reason Y."

Now, maybe you don't agree with that prioritization, or maybe you'd argue "Oh, I just handle did-vs-didn't-get-data situations both under 200 with a message somewhere in the response body." Cool. As my earlier comment points out, you can do that in the response body of a 404 too.

So of course it is quite possible to make distinctions between why a 404 error code happens, just the same as it is for what kind of 200 you're getting back.

And you can still go straight status header on top of that too: make your own 4xx for "resource not found even though URL is correct", maybe something like 434. Or if you're queasy about that sort of improvisation, use 400 with a status message in the body about an invalid id. Or if you're absolutely sure for some reason that it should be 2xx, consider 204 (though it sure seems like a bad id is a form of client error).

> Lots of 404 errors can trigger security sensors. Having to whitelist 404 in a security sensor can be a pain to convince the team that manages them.

I might well have a side-eye for a security team that sets its thresholds here too aggressively, but OTOH, a client that is generating a lot of 404 errors on legit URLs is either broken or it's getting fed bad ids by the API... or it is in fact malicious. In each case eyebrows should be raised. Hopefully any security layer is also sending back a useful status code and response body to the client.

> Aside from slightly less code to write, what benefit does 404 bring?

Like I said in my earlier comment, out-of-band broad status code can let you switch between error handling pathways well before you've spent time parsing a more detailed message in the body (which you may or may not get, depending on the error condition!), or even for specific error messages you have never thought of much less seen. Knowing what kind of error you're dealing with before you know the deep specifics is really useful across a variety of software situations.

And there's so much web tech and tooling that's set up to see 200 as "Everything is Fine, You Got What You Wanted!" You're working against the related infrastructure when you use 200 to indicate an error condition. You're working with it when you use HTTP status codes.


> And parsing that message field and checking conditionals to determine your client's behavior is something you'll have to write code for unless you're just relaying the error message back to the end user.

Oh noooo, you have to.... handle errors coming out of API's you consume. What an imposition... It must be terrible to be you, always having to write code to handle when things don't go exactly down the happy path.

> Might be better to have the HTTP error code and prepare your client to read responses based on multiple content types.

yes! Imagine seeing a 500 and knowing for sure that there's a server misconfiguration. Being able to trust http status codes: it's what's for dinner.


> Oh noooo, you have to.... handle errors coming out of API's you consume. What an imposition... It must be terrible to be you, always having to write code to handle when things don't go exactly down the happy path.

Do you really think this is a conversation about never wanting to handle errors, or is that sarcasm as a convenient way of getting out of actually thinking during the discussion?

Good specific HTTP status codes from the application layer help the client sort errors by type before they have to parse specifics. Or in cases where the client may not even have been prepared for the specifics.

Have you really never found a software situation where it's useful to know what the type of error is before you get into the details?

If that's the case, you definitely shouldn't be anywhere near API design.

> Imagine seeing a 500 and knowing for sure that there's a server misconfiguration.

Imagine knowing there's a whole range of 5xx status codes that allow for both that possibility and others.

> Being able to trust http status codes: it's what's for dinner.

Well, I'm glad we've gotten here, given that you seemed to start with "200 all the things, sort it out in the response body."


> Do you really think this is a conversation about never wanting to handle errors, or is that sarcasm as a convenient way of getting out of actually thinking during the discussion?

It was me making fun of someone trying to argue that you would have to write code for the non-happy path in scheme A, but not B.

> Good specific HTTP status codes from the application layer help the client sort errors by type before they have to parse specifics. Or in cases where the client may not even have been prepared for the specifics.

> Have you really never found a software situation where it's useful to know what the type of error is before you get into the details?

Yes, because you could never embed that sort of information into the JSON, that's not what it's for! It's for... well I guess if you're not using it for that sort of information I don't know what it's for.

> Imagine knowing there's a whole range of 5xx status codes that allow for both that possibility and others.

https://en.wikipedia.org/wiki/List_of_HTTP_status_codes#5xx_...

11 5xx status codes. I'm glad to know these cover all possibilities for the software you write, but you know what's even worse than anything we've discussed here?

15 different API's defining 512 to mean different things specific to their software. All in the name of software cleanliness, because parsing json is apparently icky and hard in brainfuck, their API language of choice I guess?


> because parsing json is apparently icky and hard in brainfuck, their API language of choice I guess?

It's quite clear that what I'm talking about here has nothing to do with the ease of having a piece of software parse JSON, but with the difficulties introduced by thumbing your nose at helpful conventions in a context where "rough consensus and running code" has been brilliant at enabling a spectacular network of interconnected clients and servers.

It's less clear why you'd insist on returning to characterizing these points as "oh, you think JSON parsing is hard", but some of what it could indicate isn't flattering. Whether you're content with that or rueful about it at some point might also say something, though with any luck I will no longer be among those evaluating it.

> 11 5xx status codes. I'm glad to know these cover all possibilities for the software you write

Not my claim. My claim is that existing status codes cover some broad categories, and by recognizing what your status conditions have in common with existing HTTP codes, you can do early sorting across condition categories, and work in cooperation/re-use with other pieces of code and infrastructure.

> but you know what's even worse than anything we've discussed here? 15 different API's defining 512 to mean different things specific to their software.

Which of course applies equally to your individualized in-band error codes... that sword you were swinging cuts both ways, right? Except, of course, that they're not really exactly equal situations: if you 200-plus-response-body all the things, a client developer has to learn not only what your custom error codes mean without the benefit of working from HTTP-related conventions/groupings that you apparently won't engage, but they have to figure out which property/s they're passed back under when there's already a perfectly good standard for these things in the header information.

Contrast that with the situation of a developer who is working on a client that gets a status code header 512 back from an API written by someone who groks and tries to work with status code groupings. Even if the client dev has no idea what the server means by 512 specifically, they already know that it's not related to a user input or even a client request problem, and more importantly, code already written to deal with other 5xx errors either for this client or other clients knows at least that much too. If it turns out general 5xx handlers aren't adequate, they can turn to docs and/or response body info to learn more about 512 condition specifics and augment the general case with specific handlers (however rare the case may be in which a client might need to do anything other than relay API status messages).
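
A short Python sketch of that fallback, with invented handler names and return values: look for a handler for the exact status first, then for its class, so an unfamiliar 512 still lands in the general 5xx path.

```python
def handle_not_found(status):
    return "treat as missing"


def handle_client_error(status):
    return "fix the request"


def handle_server_error(status):
    return "retry later"


def handle_unexpected(status):
    return "give up"


# Specific codes are keyed by int, whole classes by strings like "5xx".
HANDLERS = {
    404: handle_not_found,
    "4xx": handle_client_error,
    "5xx": handle_server_error,
}


def pick_handler(status, handlers=HANDLERS, default=handle_unexpected):
    """Prefer an exact match, then the status class, then the default."""
    specific = handlers.get(status)
    if specific is not None:
        return specific
    return handlers.get(f"{status // 100}xx", default)
```

If 512 later turns out to need special treatment, it gets its own entry in the table and nothing else changes.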

> Yes, because you could never embed that sort of information into the JSON, that's not what it's for

You certainly could. You could also put all the other header information into the response body, too... Content-Type, Cookie info, Authorization headers, CORS, etc. For some reason people don't. Hell, I'll bet even you don't. Perhaps you'll ask yourself why. Perhaps not.

> It was me making fun of someone trying to argue that you would have to write code for the non-happy path in scheme A, but not B.

This isn't even an accurate summary of your comment, let alone an adequate characterization of or response to mine.

Good luck with your own happier paths. If you're as much more correct than I am about this topic as your rhetoric (if not your logic) seems to imply you believe you are, then I'm sure your decisions will be their own reward.


Whether there's a bug in the underlying API code, or a bug in the web server, there's a risk that it "may or may not result in the standard response format you're expecting", and therefore... should be a 500 error.

Yes, using the wrong status codes is a problem, you're right, that's the entire point of the thread you're responding to.


that's flat out not true, it's a red herring.

I can't think of very many tech stacks where an error inside the API couldn't be trapped and responded with a standard json response.


>but meaningful error messages and meaningful status codes aren't even remotely mutually exclusive

They actually are. Things besides your app code can respond with 404 and 503 but not your error body. 4XX and 5XX bodies must be parsed more carefully and cannot be assumed to have a consistent structure with the same level of trust a 200 response would.

Http codes are nice but it makes api error parsing much more complex so there are some trade offs.


Monitoring tools logging request/response payloads are just going to look at status codes. Breaking that kind of stuff cos you wanted to do it your way sucks.


I guess I've never worked with a monitoring tool that didn't support deserializing json and looking for specific data.


A huge and borderline criminal conflation with HTTP in general and REST-y APIs.

The big fat elephant in all this is simple; None of our fucking opinions matter one god damn bit to the end client. Being originally designed to you know, consume hypertext over the hypertext transfer protocol, clients can and will do whatever the fuck they want with the status codes.


Well, speaking as a client developer, the API is far, far easier for me to understand and write logic against if its authors agree to send back HTTP statuses with their standardized meanings instead of making up their own scheme.


HTTP statuses have no "standardized meanings" at the application level.


You can return a 400 and still return a json payload in the http message body. Maybe that is what the parent is saying?


IIRC cross domain requests are often one of the causes for this style of API design. I remember in older browsers at least anything not in 2XX would be unreadable by the client, so they had no idea what actually went wrong on the frontend.


That's... extremely rare, extremely ancient, and even then was usually only a problem below a certain page length.


I have integrated against many weird B2B APIs, and those that always return 200 are actually pretty nice to work with. So even if it is a weird choice, as an API consumer I do not mind it at all; there are much worse things you can do.


I’ve heard the argument as: “HTTP errors for protocol level errors, 200 + json for application level errors”

And honestly it kinda makes sense when you think about it that way. “404, wrong url” and “404, id not found” should be different errors.


> “404, wrong url” and “404, id not found” should be different errors

I'd actually argue the opposite, though I can definitely see both sides.

To me, because two systems communicating RESTfully need not know anything about each other, responding to "please give me the resource at this URI" with "there is no resource at that URI" seems perfectly correct. I don't really need to know the specifics of how the remote machine is dealing with my request.

If more detail is required you can always include a message in the response body, which I think should be standard practice for handled errors anyway.


I experienced something similar with a currency value API. Every response was 200 OK with a field containing the actual HTTP error if there was one. I made an issue on their GitHub explaining why that made consumers' code messier than it should be and asked if they could fix it. They replied saying sure, but after several months they closed the issue without a fix and then refused to reply further, and then went and did a 360 degree rebrand of the company changing the pricing model and the API itself. I ripped out usage of their garbage API and replaced it with another almost immediately.


Ha, I love it when people make dumb excuses for awful decisions like that. /s

I much prefer colleagues who can say, woah that was dumb thanks for informing me of a better way. Making excuses doesn't help anyone unless one has a valid logical reason they did a certain thing, otherwise I DONT WANNA HEAR IT.


The people who write the API and the people who are tasked to provide support for it are usually not the same. If these are APIs between internal business units, my condolences, no excuse.


> I had to work with an API where the company decided everything should return http code 200 (well, at least all 4XX errors), and give the error code in the JSON response, mixing existing 4XX errors and their own errors.

I've done something like this in the past. There was a (horrible) reason; old versions of Android had terrible built-in HTTP stuff and tended to break on any 'unusual' HTTP responses (for instance, 204), especially where HTTP compression was enabled. By far the safest way to support clunky old Androids turned out to be to just return 200 for everything with the real code embedded elsewhere...


Ugh, i'd put a gateway in front of it that transformed the status codes to a 200-response.

Instead of building your api according to the default Android client (PS: look up BFF microservices)


It was a while ago, and we were very cost sensitive.

FWIW, I think the last Android phones with these client issues have probably died by now (it got mostly sorted out in 4.3 or so IIRC); this isn't a current concern.


Just pointing out an alternative that wouldn't break/uglify your api ;)


This isn't the ArcGIS Server API is it?


No this isn't. Without calling out names, the first one is a small localization platform (but seems to be a very prevalent issue in multiple APIs) and the second is an international hosting provider. Both are good enough for my case to overlook those issues, that are nevertheless infuriating.


Man that thing is a mess, isn't it....


I had that when I was dealing with the Withings API. Oh you got a 4XX error? Here is a 200 back with a json payload of {error: 400, message: "Some message"}.

Ugh it was frustrating.
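
For anyone who hasn't had the pleasure, this is roughly the wrapper you end up writing in Python around an API like that. The `error` / `message` field names follow the payload quoted above; `ApiError` and `unwrap` are names I made up, and treating every non-200 as a transport failure is an assumption about how such an API behaves.

```python
class ApiError(Exception):
    """Raised when a response reports failure, in the status line or in the body."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def unwrap(http_status: int, payload):
    """Return the payload only if neither HTTP nor the embedded envelope reports an error."""
    if http_status != 200:
        raise ApiError(http_status, "transport-level failure")
    if isinstance(payload, dict) and "error" in payload:
        # The "real" status is hidden inside an otherwise successful response.
        raise ApiError(payload["error"], payload.get("message", ""))
    return payload
```

Every call site has to go through something like `unwrap`, because the HTTP status alone no longer tells you whether the call worked.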


So I wasn't wrong. I fought someone over this and someone else was like 'looks okay to me'...


I think that 200 status code for everything is such a bad practice. I'm surprised at the number of people here that are OK with that.


Slack API does that too

(to their credit, the docs are very complete and clear, even if their API design is questionable)


> decided everything should return 200

Sounds like that decision came from someone who didn’t spend much of their career consuming APIs.


I spent quite a big part of my career consuming APIs and an API which always returns 200 is no big deal. The opposite which I have also encountered is much worse, returning successes with 4XX, because some libraries do not like that. Why does the opposite happen? Because sometimes it is not obvious if something is an error or just another return value.


But they are higher ups and should obviously be allowed to make ALL decisions, why listen to some developer who is beneath them. That dev has only spent the last 5 years working with APIs, databases, and web applications, what does he even know, probably nothing? /s


Sounds familiar... ugh


I'm missing a few in the list: 1) Have an API but only expose that via some weird-ass tool instead of a normal protocol. 2) Have a lot of documentation but make it so bad it's useless (looking at Telebib2 right now) 3) Only return generic errors which are always the same 4) Refer to standards so you don't have to document them, but then deviate slightly so all standard tools don't work 5) Make it ambiguous (best one I've seen so far: 30% of the required fields must be filled in, doesn't matter which 30%).

I could go on looking back the projects I've done.


>"APIs also permit customers to use a lot more of your product. If they have to click, click, click to use your product, they're going to use it only a little. If an API exists, they can automate their use of your product, which would let them use it a lot more. They could automate provisioning for their entire company. They could build entire new applications based on your API. Just think how much more of your product they would be able to consume with an API."

This so much. One of our main adtech vendors is lagging behind on the API front. We've been chasing them for a while on when they'll be up to date so that we can build on top and they get back to us on how their other customers don't really want an API.

Of course, no one (beyond us tech crowd) goes to bed thinking oh I need an API to interface with my DSP. But I'm sure they'd be delighted about what the API can enable them to do.

Here's some advice if you are building or improving a fairly sizable SaaS product: your UX sucks. Not that it is awful, but it is likely not optimised for what your customers are doing, simply because you keep adding features and now it's bloated.

And I don't mean that it is an awful thing; it is the nature of such products to keep growing, because of marketing, justifying salaries and, obviously, adding value.

Chances are, 80% of your customers are only using a good 20% of the features you offered and they'd rather have something that is tailored for efficiency to address their main pain point. So either you step up your game to offer that flexibility or you get your API game up to speed. Your competition might beat you to it and I know which product I'll choose.


#DJI

Re: hiding docs:

Luckily, this can be easily done by putting the documentation behind a login screen... consider making your documentation a PDF file...

DJI's release notes for the DJI-SDK are only available for DJI registered developers. As a zipfile - of a single pdf doc.


I've worked with plenty of APIs (> 50). The biggest headaches I came across are bureaucratic stuff rather than poor API design. For example, there are vendors (like ConstantContact) that require you to provide your credit card to get an account for API testing. You know what, they auto-bill your card after X number of days and require you to call them up to cancel your subscription. Because I'm on the other side of the world, I have to call them in the middle of the night just to cancel billing.


If you ever find yourself in that situation again: send them an email and explain you’ll contact your bank and issue a chargeback if they don’t cancel it without a phone call. The issue very quickly resolves itself; chargebacks are expensive.

The power is in your hands; use it. Don’t play their games.


Here's one from a vendor I currently integrate with - have multiple APIs (in this case, a SOAP one, a REST one returning JSON, and then some language SDKs that wrap the latter for convenience), and have their feature sets be an interesting Venn diagram with various intersections but no unions. Just today I found out that in the extensive integration we've written using the SDK (which wraps the REST API, remember), I will have to shell out and use SOAP because one (1) parameter on one (1) object in their extensive object hierarchy isn't supported by REST or the SDK, just by the SOAP protocol. Nice.


Same situation with private beta features, which "can not" be officially supported in the official SDK for the product.


Here are a few more:

* Aggressive rate limiting that makes it hard to use the API in real world situations. Especially if there is no documentation about what the limits are.

* Throwing errors with no explanation about what went wrong or how to correct it. Especially effective if given two very similar requests, one succeeds and the other one fails.

* Throwing errors randomly / when under load / when not under load / based on the phase of the moon.

* Having absolutely no example code anywhere in the documentation.

* Requiring hundreds of lines of code to even establish a connection to the API.

* Requiring a specific client library to access the API. Extra points if it's Windows only or requires a specific out of support version of Python 2.


Use of multiple different sets of allowed characters in identifiers for entities in your API is also a good way to show your customers that you hate them.

Extra bonus points for intermittently using case-insensitive but case-preserving identifiers.

The Microsoft Azure storage APIs are a prime example of this, with 4-5 different sets of identifier restrictions depending on what you're naming. (Including one that allows any valid C# identifier, for which the documentation conveniently refers to the ECMA specification of C#).

Also enabling some special characters like # when creating entities, while knowing that # will be interpreted differently when fetching entities. Such that entities with # in the key cannot be fetched or deleted, but only created.

Extra extra, bonus points for retaining # support in the interest of backwards compatibility :)
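A small sketch of why an unescaped # in a key breaks fetches: URL parsers treat everything after it as a fragment, which HTTP clients do not send to the server. The host, path layout and key here are made up, and whether any given service accepts the percent-encoded form is an assumption you would have to check against its docs.

```python
from urllib.parse import quote, urlsplit

key = "invoice#42"  # hypothetical entity key containing '#'

# Unescaped: the parser cuts the path at '#', so the server would only
# ever see /entities/invoice.
raw = urlsplit(f"https://api.example.com/entities/{key}")
print(raw.path)       # /entities/invoice
print(raw.fragment)   # 42

# Percent-encoded: the whole key stays inside the path.
escaped = urlsplit(f"https://api.example.com/entities/{quote(key, safe='')}")
print(escaped.path)   # /entities/invoice%2342
```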


My favorite was an API for a product you've heard of that couldn't be bothered to do datetime math correctly.

Their recommended solution: Make sure all of your intervals overlap by several minutes to avoid any problems. :-P
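For contrast, a minimal sketch of interval math that needs no overlap: half-open windows `[lo, hi)` in UTC, where each window starts exactly where the previous one ended. The function name and the window sizes are illustrative assumptions; nothing here is that product's API.

```python
from datetime import datetime, timedelta, timezone


def half_open_windows(start, end, step):
    """Yield (lo, hi) pairs covering [start, end) with no gaps and no overlaps."""
    lo = start
    while lo < end:
        hi = min(lo + step, end)
        yield lo, hi
        lo = hi  # next window begins exactly where this one ended


start = datetime(2019, 1, 1, tzinfo=timezone.utc)
end = start + timedelta(hours=1)
windows = list(half_open_windows(start, end, timedelta(minutes=25)))

# A timestamp on a boundary belongs to exactly one window: lo <= t < hi.
boundary = start + timedelta(minutes=25)
owners = [w for w in windows if w[0] <= boundary < w[1]]
print(len(windows), len(owners))
```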


I’d like to add ‘Avoid any permission model for API keys’.

Even with multiple keys, if they are all god-mode it’s difficult for a central operations team to give internal users API access without risking unauthorized changes or audit findings in regulated environments.
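A minimal sketch of the alternative, assuming a hypothetical scope model: each key carries an explicit set of scopes and every operation checks for the one it needs. The scope names and the `ApiKey` and `authorize` names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApiKey:
    key_id: str
    scopes: frozenset  # e.g. {"vm:read", "vm:create"}; scope names are made up


def authorize(key: ApiKey, required_scope: str) -> None:
    """Raise PermissionError unless the key carries the required scope."""
    if required_scope not in key.scopes:
        raise PermissionError(f"key {key.key_id} lacks scope {required_scope!r}")


read_only = ApiKey("team-analytics", frozenset({"vm:read"}))
authorize(read_only, "vm:read")  # allowed, returns None

try:
    authorize(read_only, "vm:delete")
except PermissionError as exc:
    print(exc)  # the denial names the key, which also helps auditing
```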


Personally I think an API should be obligatory. Every company should provide one, with respect for private data. Additionally, if a company does not have one, it should be prohibited from blocking automated means of processing its system and data.

There should be a pricing model to cover the basic expenses incurred by API usage.

Maybe you wonder why on earth automated processing of system data should be legally allowed?

We have lots of great services online. But we cannot move further and integrate them into even better ones, because of the lack of APIs and because of bot bans.

Without it, we won't be able to create the high-level services people expect. And currently, if these are provided at all, it will only be by companies that have a monopoly over vast parts of the online offering, because they can make more monopolistic deals with service providers.

Another problem is that service owners have an advantage over users, because they can process users' data to drive business decisions in their own favour. But users cannot process the same data for their own advantage. With more and more complicated systems, the user ends up dependent on the company's service and has no possibility of validation.

And this means even more trouble than we have today. So yes, open programmatic access as a human right.


Does "every company" include the thousands of local businesses running Woocommerce or Shopify?

Larger companies love regulations like this because they can afford it, and it keeps smaller competitors from ever growing to be a threat.

I'm not saying they should be able to do whatever with data - but for such providers, manually replying to an email is preferable to implementing a complicated technical solution they can't understand (and will often likely mis-configure, exposing your data) when they just want to sell hats.


That's fine. The second point was precisely that if the company doesn't want to provide an API, fine; but they then don't get to try and stop people from scraping it instead. And I agree with my sibling comment; Woocommerce or Shopify or whatever just build it in like everything else they provide.


Agree re: scraping, I was more stuck on "obligatory" so I guess I glossed over that.

I also agree that the big players would roll it into their core offering which would make life easier, but I'm wary of monocultures. A lot of people are already a bit annoyed at how much Wordpress is out there and this would just be another reason to skip a smaller (or bespoke) alternative and raise the barrier of entry for anyone thinking of making an alternative. But if we're agreeing that no one is forced to do it and scraping is OK, that's fine.


Clearly if this regulation existed companies like the ones you mention would include an API as part of their offering.


I'd weigh that up against the very real risk it is mis-configured by someone's nephew and exposes my private data to anyone who asks.

I think mandating API for certain industries like banking is a good idea, but I'm not so supportive of having it apply all the way down to my local hair salon. They have enough to worry about.


Of course, I think everyone here would agree it's always good to have an API. Too bad you're not going to succeed in pushing this "rule" through, especially the scraping part. Bot bans exist in part because it's rather easy to overload a system with bot calls which, since they hit the same pages as users, also affect normal users. That's especially taxing if the page is dynamic with server-side work, like some aspx or php page, which at best would just hog system resources and at worst could break altogether (most likely from running out of free RAM). I'm no expert though, so these specifics might be wrong.


LinkedIn would like to have a word with you.


LinkedIn is a worked example of why we need such a law.


I really liked Stripe's API versioning scheme and how well done it is.

Recently I've been writing code that interacts with the API of an Enterprise Product (fairly new, less than 3 years old) and it's driving me crazy. They version every single API endpoint separately. So endpoint A to create a user is at version 7 but endpoint B to assign a group to a user is at version 4.

WTF?


Stripe's API is well documented, but I don't understand why it uses form data instead of JSON. Seems weird/scary to have an API driven by completely unstructured text.


It is structured. The first page of the Stripe API links to the encoding standards. The main difference between it and JSON is JSON supports complex data structures while form encoding is flat key-values. In fact, with RFC7578, multipart/form-data encoding allows arbitrary Unicode strings, unlike JSON, which requires weird handling of some Unicode data.
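A quick sketch of the difference, using only the Python standard library. It shows the simpler `application/x-www-form-urlencoded` flavour rather than multipart, and the field names are illustrative, not taken from Stripe's documentation.

```python
import json
from urllib.parse import parse_qs, urlencode

# Form encoding: flat key-value pairs, non-ASCII text percent-encoded as UTF-8.
flat = {"amount": "2000", "currency": "usd", "description": "café"}
form_body = urlencode(flat)
print(form_body)  # amount=2000&currency=usd&description=caf%C3%A9

# Decoding gives the strings back (parse_qs wraps each value in a list,
# because a key may legally repeat).
decoded = parse_qs(form_body)

# JSON: the same kind of data, plus native numbers and nesting.
json_body = json.dumps({"amount": 2000, "card": {"exp_month": 12}})
print(json_body)
```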


What's unstructured about HTML form data format?


> Technique #8: Ignore the IaC revolution

Google Firebase has no API for creating projects. Therefore it is impossible to use Terraform or other IaC (Infrastructure as Code) tool to provision backends that send push notifications to Android apps. Manual steps are required. In this way, Google Cloud prevents its customers from using SRE best-practices.

https://github.com/terraform-providers/terraform-provider-go...

Google as an organization is simply unable to focus on users.


I only ever use internal APIs at work. I've never used anyone's external APIs and am wondering if there's a whole tech ecosystem I'm missing out on. What are some commercial use cases you all have had?


A great source to learn from is ProgrammableWeb: https://www.programmableweb.com/

But in general, it's a lot of the things on the web that you would expect: Stripe, PayPal, eBay, Amazon, Google Maps, etc. all have APIs into their services.


From an ops perspective, many *aaS providers simply don't have working features that we need, so we use their API and build our own solution.

From a dev perspective, there are industry standard APIs which a range of products support, and to design a new product that we want to sell, we have to use said APIs and work with different providers and customers. There are so many differing APIs to do the same thing that we have an internal team whose job is to write an internal API that internal customers use to interface with all the external APIs. This model scales well, as this API-of-APIs team can integrate new solutions transparently to the internal customers, so we don't have to rewrite products just because an external API/provider/customer/etc changed. (back in the day this was with straight databases and apps, and the intermediate layer was called "db apps", and was basically an early API)


Stock / financial data.


I thought the chosen example for idempotent requests was a bit funny as I don't think POST is necessarily idempotent and depending on your specific use case making it so may or may not be easy.


It requires passing in an id of some sort. Client generated ids are cool for a lot of things; I wouldn't use them for everything.


Completely agree with you.


The article doesn't mention POST. It says "the first time we call it the VM is created. The second time it is called the system detects that the VM already exists and simply returns without error". Makes more sense to assume it means PUT, defined as "The HTTP PUT request method creates a new resource or replaces a representation of the target resource with the request payload"


POST is explicitly not idempotent.


...and the given example of something that should be idempotent is posting to create a new VM instance. Which you certainly may do. It's just a funny choice of example since that's the one example where the behavior he describes is potentially expected.


POST isn't required to be idempotent, but to be clear it's not improper to make an API with idempotent POST endpoints.


Which is exactly why it's a fairly unusual choice for an example.


It would be weird if he used a method like GET which is normally idempotent. People already make those be idempotent, so they would miss that the point is that a good API has all idempotent endpoints, POST and not.


Can someone start an "API practices if you hate your developers" version?? I'm sure HN has some great stories to share about that.


This article is junk. There are many reasons for not offering an API to customers.

I run a small b2b app between two completely non-technical businesses. There is absolutely no need for me to have an API available to them, they’ve never asked for it and have no desire or ability to consume it.

Believe it or not, there is also an expense to offering an API! An API is a product offering like anything else, and products need customer support. API products need advanced support because they are technical in nature, you can’t use a call center for this type of support.

And I don’t really want random joes signing up for my API and doing god-knows-what with it. I’d much rather create a relationship with you so I know what you’re trying to do (so you don’t bring down my servers in the middle of the night or run up a huge bill because your use case is a little beyond what I can support).

The other points can just be summarized as “have a perfect API and perfect docs”. Ok. Thanks. Can you show me an example of this in the wild? Didn’t think so.

The better suggestion is to analyze your business and decide if offering an API furthers your business’s goals.


I didn't downvote you but I do strongly disagree with pretty much everything you've said.

> There is absolutely no need for me to have an API available to them, they’ve never asked for it and have no desire or ability to consume it.

Whenever I've used "small b2b apps," especially between non-technical businesses, if you don't have it I'm not asking for it. I know how hard it is to be a solo founder or part of a very small team so I'm either going to use your product as-is, or I'm not going to use it.

> API[s]...need customer support

There are plenty of small, niche APIs where there is out-dated documentation and not much else. I'm not saying that's a great experience, but let's not pretend it's illegal to have an API without on-demand live support.

> And I don’t really want random joes signing up for my API...

The reasons listed here can all be handled via code. There are very few reasons to prevent automated sign-ups.

> The better suggestion is to analyze your business and decide if offering an API furthers your business’s goals.

If you think it can't, you're just not being creative enough. Every single API with more than 2 users is being used in ways its designers hadn't originally thought of. Many brains are better than one brain and you can often get more (and better) ideas from what other people are doing with your tech than you ever could on your own.


I get what you’re saying but I think you’re not being pragmatic enough (and perhaps being a bit developer-centric, ignoring that there are millions of people out there who run businesses that have 0 need for offering a consumable API).

You’re asking for access to the heart and soul of a company and demanding to pay the same price as a customer who asks for much less. I’m having a hard time understanding why I should pursue you as a customer when I can get the same money out of someone who is way less demanding.


Actually if I want to do way more then I’m generally okay paying more for it. Nobody is asking you to do it for free but automation is something I look for frequently when a tool is core to my work. It’s why I avoid most GUI tools and systems unless the GUI is more than text fields and buttons. It’s simply too much work to repeatedly follow a script for a system like that when I have a very clear use pattern for it.


I would argue the primary benefit of having an API, even if not made public, is that it will encourage better architecture, design, and development, making the product more maintainable over the long run.

It also means you can scale if you ever do need to add a customer that wants to make use of an API.

Saying not to do an API because customers don't want it is similar to not using source control because you're the only developer.


An architecture is good if it is serving your business’ needs. My company makes $13 billion a year off of garbage architecture. If we completely re-engineered it to be perfect, we’d probably still make the same amount of money, it would just cost us hundreds of millions of dollars to do it (actually, we tried and wasted about $300mil).


Is having an unnecessary API really better architecture though? It adds quite a bit of complexity and there are other ways to address separation of concerns.


Depends on how big your team is. If it's just you and you feel like you don't need one, then don't. If it's 3 teams of 10, then an API is extremely helpful. It's a great way to frame conversations in something beyond personal preference.


Not sure why you're being downvoted. An API is a feature just like any other feature your company offers. If 1000 customers/clients ask for feature A and 0 ask for feature B, it's obvious you shouldn't waste your time on B. If B = "an API", that doesn't change the fact that it's a waste of time in this scenario.


That product methodology is sure to fill your roadmap with minor variations of button colors.


This article really made me think about the Google Ads API. Miserable to work with.


What would you change?


Everything, it's a horrible horrible API.


A seemingly common one in the move fast and break things world is to provide all information in the form of hello world tutorials and youtube videos. The source code is its own reference, amirite?


This seems to be written from the point of view that the "industry" is all enterprise SaaS software or something. I don't think I'd agree with calling any of these "industry best practices". Maybe "commercial business software best practices". #2, 3, and 4 aren't really applicable to a ton of software products. There's a lot of niche software that isn't intended for the general public.


How about "shutting down your API"?


Question: do you guys find a graphql-only API acceptable in 2019? For us to develop a (good) REST API layer on top of our existing solution is a non-trivial piece of work, and GQL offers us all we need on the front end. Customers seem ok with it, but we haven't got any large enterprises yet.


Ha! Anybody who has implemented a payment processor on their side of the fence to interact with Visa/Mastercard on the other side will have seizures while reading this article. Those APIs are definitely written with "hate your customer" practices in mind.


Ooh yeah. If you have ever integrated with Vantiv you will know what all this means. Heavy lifting is on the customer side. What a pain


I used to work at a company like that; almost all the techniques described in the article were in use.

Turns out it doesn’t piss off only the customers but the developers (of the API) as well.


The API that was dragged out of my company's software team ticks oh so many of these boxes...

It's eerie, almost as if the author tried to use our system and then decided to write this...


Interestingly, many of Amazon AWS's apis don't break these rules, and I don't really hate them.

Charges for use. Manual documentation. Idempotency. A weirdo protocol.


I guess that explains why the ACM Digital Library doesn't have an API.

Even worse, they seem to forbid any sort of programmatic access in their ToS.


API practice number one if you hate your users: Don't provide one, and then ban them from working around the gap:D


As a non programmer, I use Zapier constantly for all my API needs


"Technique #5: Use a terrible protocol

Debugging is boring. Wouldn't you rather appeal to customers who write bug-free code on the first try?

To really show disdain for your customers, use a proprietary protocol so that language support is limited to the client libraries you provide, preferably as binary blobs that are never updated. If you design it carefully, a proprietary protocol can be difficult to understand and impossible to debug, too.

Alternatively, you can use SOAP (Simple Object Access Protocol). According to Wikipedia, SOAP "can be bloated and overly verbose, making it bandwidth-hungry and slow. It is also based on XML, making it expensive to parse and manipulate—especially on mobile or embedded clients" (https://en.wikipedia.org/wiki/SOAPjr). Sounds like a win-win!"

I remember, in the early days of Amazon S3, trying to write Bourne shell scripts using shell built-ins and single-purpose UNIX utilities to form the HTTP and interact with the servers, instead of scripting languages with libraries like Perl, Python, Ruby, etc. This is how I interact with HTTP servers normally. I never have any problems keeping things simple and dependency-free.

To do this with S3, it felt nigh impossible. There were small errors in their documentation of how things actually worked. It felt like they were intentional just to trip me up. I know they were not.

The official recommendation back in those early days of AWS was to use one of the protocol options provided by Amazon, HTTP or SOAP. You would think, heh, I will avoid SOAP and keep it simple. I will just use HTTP.

The truth is both required using scripting languages with libraries. Amazon's own utilities were written in Java. I know developers have their reasons for making these choices, but as a user, that complexity really put me off.

From my perspective, this ACM article is right on point.

"When I see a top-down description of a system or language that has infinite libraries described by layers and layers, all I just see is a morass. I can't get a feel for it. I can't understand how the pieces fit; I can't understand something presented to me that's very complex. Maybe I do what I do because if I built anything more complicated, I couldn't understand it. I really must break it down into little pieces." - Ken Thompson http://genius.cat-v.org/ken-thompson/interviews/unix-and-bey...


That's a bit of a strawman. The cultural tradeoffs between "inspectable" JSON and "bloated" SOAP have to do with their most common consumption environment.

In my current life I work mostly with JSON and a typical workflow involves checking API documentation, querying the endpoints, and understanding the structure. I need to own the mapping between JSON and domain objects.

In a previous life I worked with .NET and C#, mostly SOAP APIs. A typical workflow involves right-clicking somewhere, pasting the link to the SOAP endpoint's WSDL file, and automagically getting a collection of strongly typed classes that I can manipulate directly and operate on as if they were domain objects.

The idea that when using SOAP one spends any time (manually) parsing and trying to make sense of XML is a misconception.

To put it in a modern parlance, one shouldn't compare SOAP to JSON. SOAP is JSON + Swagger, with the Swagger integration costing 0.


Some proprietary protocols are easier to implement than open ones. The other day I wrote against one API server that just accepts messages in JSON format over an SSL socket. It was maybe 3 or 4 lines of Python? (Not counting setting up the connection) and probably wouldn’t have been bad in C. For the shell you could use OpenSSL s_client and something to generate JSON (awk or echo would probably be enough since it was all flat dictionaries of strings.)

> There were small errors in their documentation of how things actually worked.

that sucks, but I’ve run into that with people speaking “http” and not just special protocols.
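For a sense of scale, a rough sketch of "JSON over a TLS socket" with the Python standard library. The framing (one JSON object per newline-terminated line) and the function names are my assumptions; the actual server's framing isn't described above.

```python
import json
import socket
import ssl


def encode_message(message: dict) -> bytes:
    """Serialize one message as a newline-terminated JSON line (assumed framing)."""
    return json.dumps(message).encode("utf-8") + b"\n"


def send_message(host: str, port: int, message: dict) -> None:
    """Open a verified TLS connection and send a single message."""
    context = ssl.create_default_context()  # verifies the server certificate
    with socket.create_connection((host, port)) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(encode_message(message))


print(encode_message({"cmd": "ping"}))  # b'{"cmd": "ping"}\n'
```

The connection setup is most of the code; the message part really is a line or two.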


A pretty bogus article. If you can charge for an API and people are paying for it, why would you not charge for it? If people aren't paying then think about making it free and making it a doorway into your general product.


I don't think they were arguing not to charge, I think they were arguing a) not to charge an obscene amount, and b) not to charge individually for every little thing someone might like to do (e.g. have a couple tiers but not full-blown a la carte pricing).


I think that if your product is something which is available (by browsing, for example), but you only offer a very expensive API - no stripped-down free use - you're both setting yourself up for web scrapers and giving away potential customers to the competition.


> An operation is idempotent if performing it multiple times yields the same result as performing it exactly once.

And then he casually offers an API that does different things on first and second call as the "good" example. If you have a "create a virtual machine" API it better create a fucking virtual machine. If I call the damn thing twice, I expect to have two VMs. If there is some sort of unique argument, like creating a named VM, I would expect the API to throw an error if the name is already taken, not to just return like everything is normal.

And this guy is being all snarky about API design?


I agree with the author re: returning success on retries; it lets you automate the retry process.

I work in mobile games; because someone might play in a tunnel or bad network area, I need to make sure that every request is retry-able.

To do that I generally include a GUID of some kind in the request; if the client says "create an entry for XXYY," there's a chance that the request will get to the server but the response will fail to reach the client.

If the client is able to retry the request (with the same GUID) and get a success response, then I can have the retries handled transparently in the communication layer; all the client code needs to know is "I made this request and it was a success," without any knowledge of how many tries it took.

If the second/third/etc request returned an error of some kind, I wouldn't have a good "success" response to hand back to the game code. (I'm assuming the "success" response contains some information that the game code needs.)
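Roughly what that looks like on the server side, as a toy in-memory sketch. The class and method names are made up, and a real server would persist the request IDs and expire them eventually.

```python
import uuid


class EntryService:
    """Toy server that remembers the response it gave for each request ID."""

    def __init__(self):
        self._responses = {}  # request_id -> response already sent
        self.entries = []

    def create_entry(self, request_id: str, name: str) -> dict:
        if request_id in self._responses:
            # A retry of a request we already handled: replay the original
            # success response instead of creating a duplicate or erroring.
            return self._responses[request_id]
        self.entries.append(name)
        response = {"status": "created", "entry_no": len(self.entries), "name": name}
        self._responses[request_id] = response
        return response


server = EntryService()
request_id = str(uuid.uuid4())  # generated once by the client, reused on retry
first = server.create_entry(request_id, "XXYY")
retry = server.create_entry(request_id, "XXYY")  # e.g. the first response got lost
print(first == retry, len(server.entries))
```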


The idempotent operation is usually "Ensure that a VM exists with this name and spec".

An advantage of this style is that if the client dies or times out during the (long) operation, it can retry and get the same answer instantly.


What happens if some other process has already created a VM with this name and spec? Under most realistic scenarios I would rather VM creation failed than silently clobber someone else's VM.
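One way to get both properties, sketched under assumptions: "ensure" succeeds when the caller's own VM with the same spec already exists, and fails loudly when the name is held by someone else or with a different spec. The `owner` field and all names here are hypothetical; the article doesn't specify any of this.

```python
class VmConflict(Exception):
    """The name is taken by a VM that is not the one the caller asked for."""


class VmRegistry:
    def __init__(self):
        self._vms = {}  # name -> (owner, spec)

    def ensure_vm(self, name: str, owner: str, spec: dict) -> str:
        existing = self._vms.get(name)
        if existing is None:
            self._vms[name] = (owner, dict(spec))
            return "created"
        if existing == (owner, spec):
            # Same caller, same spec: a retry. Report success, change nothing.
            return "already_exists"
        # Different owner or spec: never hand over or clobber that VM.
        raise VmConflict(f"{name!r} already exists with a different owner or spec")


registry = VmRegistry()
print(registry.ensure_vm("build-01", "alice", {"cpus": 2}))  # created
print(registry.ensure_vm("build-01", "alice", {"cpus": 2}))  # already_exists
```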


With an idempotent API, starting a VM and doing something with it can look like this:

    let new_id = generate_id();
    retry_with_backoff(() => api.make_vm(new_id));
    retry_with_backoff(() => api.do_thing_with_vm(new_id));
This works even if any individual API calls fail, or if the API call makes it to the API server but the response fails to make it to the client.

If the APIs aren't idempotent, then you would have to do this to get the same behavior:

    let new_id = generate_id();
    retry_with_backoff(() => {
      try {
        api.make_vm(new_id);
      } catch (e) {
        if (e.info && e.info.code === 'vm_already_exists') {
          return;
        }
        throw e;
      }
    });
    retry_with_backoff(() => {
      try {
        api.do_thing_with_vm(new_id);
      } catch (e) {
        if (e.info && e.info.code === 'thing_already_done') {
          return;
        }
        throw e;
      }
    });
This nonidempotent API is harder to use. Someone that doesn't know about these error codes or the fact that the API isn't idempotent will write code without the try-catch blocks that doesn't handle retries correctly. With the idempotent API, users fall into the pit of success where things just work without them having to know the details about each of the edge cases.

The nonidempotent API is exposing some extra data to the user, but it's not super useful. You basically always want to treat the vm_already_exists error identically to a success response. Maybe you also want to log some data about how many retries were necessary so you can figure out how spotty the network connection is, but there's no reason that couldn't work with the idempotent API either. The idempotent API could include a header about whether the action was already taken previously.

Consider how TCP connections are used by applications. Your application doesn't have to opt in to handling packets that were resent. The fact that some packets had to be resent is by default just an implementation detail. You have to opt in to get information about the resent packets; by default they're handled like regular successful packets. Idempotent APIs are about making handling retries work by default in a very similar way.
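For completeness, a Python sketch of what a `retry_with_backoff` helper like the one used above might look like, run against a toy idempotent API whose first response gets "lost". Which exceptions count as retryable, the delays, and the `FlakyIdempotentApi` class are all assumptions for illustration.

```python
import time


def retry_with_backoff(fn, attempts=5, base_delay=0.01, sleep=time.sleep):
    """Call fn until it succeeds, doubling the delay after each failure."""
    for attempt in range(attempts):
        try:
            return fn()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)


class FlakyIdempotentApi:
    """Toy API: the server does the work, but the first response is lost."""

    def __init__(self):
        self.vms = set()
        self.calls = 0

    def make_vm(self, vm_id):
        self.calls += 1
        self.vms.add(vm_id)  # adding the same id twice is harmless
        if self.calls == 1:
            raise ConnectionError("response lost in transit")
        return {"id": vm_id}


api = FlakyIdempotentApi()
result = retry_with_backoff(lambda: api.make_vm("vm-123"), sleep=lambda s: None)
print(result, api.calls, len(api.vms))
```

The caller only ever sees one success and one VM, regardless of how many attempts it took.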


Let's start simple: your example assumes that you generate the id yourself. In my experience a common API usage pattern would look more like

  try:
      vm_id = api.make_vm()
  except SomeError as e:
      log.error(e)
  else:
      res = api.do_thing_with_vm(vm_id)
and in your example, if we are generating ids ourselves, we still have to verify that we got the right VM. If your ids are provably unique, there is no reason to generate them, the API can take care of that, but if you want something like a named entity, you have a problem. What if the name is already taken? So your code would look more like

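    new_id = generate_id()
    try:
        try:
            vm = api.get_vm(new_id)
        except VM_DoesNotExist:
            # No VM with this id yet, so it is safe to create one.
            vm = api.make_vm(new_id)
    except SomeError as e:
        log.error(e)
    else:
        # Runs whether the VM was found or freshly created.
        api.do_thing_with_vm(new_id)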
because if the make_vm API simply returns a VM whether it was created or not, it is entirely possible that you are getting a VM that is busy doing something else for some other process.



