Ask HN: What are the not-so-obvious things to consider while API integration?
74 points by path89 on Nov 18, 2017 | 51 comments
We are in the process of selecting an external vendor to integrate with. So, right now we are reading the documentation of potential vendors and doing a feasibility study. Assuming all vendors provide similar capabilities in terms of costs, features, support, API response times, load testing results and the like, what are some practical things to consider before moving forward with one?

Two big questions:

* How do they handle deprecation?

At some point, they're going to realize that their API design is wrong, could be better, or could take advantage of new patterns, methods, or even customer usage patterns. As a result, they're going to have to restructure and potentially deprecate things. Do they handle it with transparency and honesty, or at least make sure your stuff still works?

At Twilio, the 2008 endpoints have been deprecated for 5+ years but still work and - to the best of my knowledge - there's no sunset announced. Yes, all the examples, docs, SDKs, etc. encourage you to use the new endpoints, but the old ones are there. Compare that to Facebook, Google, Twitter, etc., which turn APIs on and off and break things with little regard for end users. Which leads to the second question...

* What's their business model?

If they make more money as you use the platform more, that's great! In biz speak, your interests are aligned! They're going to make sure you get started as quickly as possible, minimize downtime, and make sure that your stuff works now and forever. Once again, that's the Twilio model.

If you scale from 1 to 10 to 1M requests/day and they make the same amount of money (or none in many cases), then they're going to be less concerned if you can't make 1, 5, or even 10% of those calls. Even worse, if the API itself is primarily used for market penetration purposes (once again: Twitter), they may treat you as a free R&D group and let you do the exploration and prove the market before they step in.

* Early Twilio employee, current Okta employee, wrote a book on API Design

I totally agree on the second point (business model alignment); however, I believe that in a lot of cases the number of requests doesn't mean your interests are aligned.


* A REST API needs 3 calls for $foobar. A GraphQL API only needs 1 call. The vendor will lose money if they optimize their API in a way that lets you make fewer calls.

* Some of your data does not frequently change. If you pay per request, you can save money by caching. However, that'll make your implementation more complex (esp. if the API is supposed to be used directly from mobile apps).
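The caching trade-off can be sketched in a few lines; `fetch_plans` below stands in for a hypothetical billable vendor call, wrapped with a simple TTL:

```python
import time

# Hypothetical vendor call; in practice this would be an HTTP request you pay for.
def fetch_plans():
    return ["basic", "pro"]

_cache = {}

def cached(key, fetch, ttl=300):
    """Return a cached value if it is younger than ttl seconds, else refetch."""
    now = time.time()
    hit = _cache.get(key)
    if hit is not None and now - hit[0] < ttl:
        return hit[1]
    value = fetch()
    _cache[key] = (now, value)
    return value

plans = cached("plans", fetch_plans, ttl=3600)  # at most 1 billable call per hour
```

The complexity the parent mentions shows up once you add invalidation, per-user keys, or a shared cache for mobile clients; the sketch only covers the happy path.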

IMO it's better if the API provider can charge based on business metrics. E.g. Stripe charges based on payment volume. It doesn't matter if you do 2 or 200 API calls to make a payment.

I started by saying "use the platform more" and went to API requests because that's the natural connection, but you're right that the more proper phrasing should probably be "more successful with the platform".

Also relevant to this platform -- if your business is more successful when your customers are more successful, and they tell you that, and together you can measure that... you might be in a good place.

With GraphQL you have a concept of query complexity; you would base costs on complexity, not number of API calls.

Since you didn't mention your book, probably because you didn't want to come off as spamming, I'll ask: what's the name of your book?

I think the book is http://TheAPIDesignBook.com

Yea man, click his profile.

Hey, that's awesome. I tell people about the Twilio API when they ask me about good APIs out there to learn from.

I have worked with a few vendors to integrate with their API. Some things that need to be considered:

1. Do they provide an easy way to create a sandbox/staging/test/dev environment? Good example: Stripe. Really bad example: PayPal. Why? Well, for starters, you need to create a completely separate account just to get a sandbox env in PayPal. Then there is no guarantee that the sandbox is identical to production in terms of ability to build and test.

2. What are their rate limits? Some APIs are well designed but have terrible rate limits.

3. Authentication mechanism and security. How do they authenticate API access? This can matter for building critical apps.

4. Do they use REST or not?

5. Do they have good documentation? Can you do a quick test using their documentation, or is it required to call their support?

6. Do they have good support?

7. Do they version their API changes, and are those changes backward compatible?

I am sure there are more but these are some good ones to start with.
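On the rate-limits point, the client side usually needs retry logic no matter how good the limits are. A minimal sketch, assuming the vendor signals limiting with HTTP 429 and an optional Retry-After value (not every vendor does):

```python
import time

def with_backoff(call, max_retries=5, base_delay=1.0):
    """Retry `call` when it reports rate limiting.

    `call` is assumed to return (status, retry_after, body). A 429 status
    triggers a retry, honoring Retry-After when present, otherwise using
    exponential backoff.
    """
    for attempt in range(max_retries):
        status, retry_after, body = call()
        if status != 429:
            return body
        delay = retry_after if retry_after is not None else base_delay * (2 ** attempt)
        time.sleep(delay)
    raise RuntimeError("rate limited after %d retries" % max_retries)
```

The `(status, retry_after, body)` shape is an assumption for the sketch; in real code you'd adapt it to whatever your HTTP library returns.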

Off-topic: Regarding Stripe's sandbox, I really dislike their web GUI. After creating a bunch of test customers, for instance, deleting them requires going into each and every single customer and clicking "Delete". There is no way to multi-select and batch-delete.

(Yes, it can be done programmatically, but that's not always feasible/practical, especially if I want to keep certain customer accounts alive.)

Similarly, after deleting a customer, it takes you back to the customer list, but the list isn't always up to date. Sometimes the front-end detects that updates have occurred in the list (i.e. something has been deleted), but it still requires a manual refresh.

Funny but I agree with you about Stripe's dashboard GUI. They could do better.

> Do they use REST or not

Meh. REST semantics really only work well when you're doing CRUD operations. They get clunky when something's modeled around actions rather than entities. That said, make sure their response codes and methods make sense.

Spinning this a slightly different direction: think about what their API operation model is, and how well it lines up with your current and (more importantly) expected future use cases?

If you expect that you might need to significantly change what you're doing in the future, having a simpler, composable model around resources that you can call as you need might save your ass down the line, compared to if they model things as specific RPC use cases and you find yourself needing a particular variation they don't support out of the box.

I agree, in the sense that REST is not as important as, say: is the API XML or JSON? I've integrated many B2B APIs, which tend to be varying degrees of XML, from plain and sane to rabbit-hole insane, or, worst of all, SOAP, which apparently is really still a thing people use and take seriously in a B2B API context.

Sigh, XML hard at work solving a problem we never had, creating such a bloated, horrible mess where JSON would have been simple and better.

I see both sides. The magic of SOAP is you can give a tool a WSDL and it will generate bindings in a language of your choice.

The downside is it has to be pretty complicated to enable the magic, so building it can be complicated, and debugging it is a nightmare. You could say the same about a lot of modern JS, and that's been somewhat solved by amazing tooling.

Yes, absolutely: this very magic is precisely what is so horrible and, imho, a rabbit hole of insanity. It is CORBA resurrected with XML.

In my data-oriented mind, the primary thing that matters is the data that is sent; the second is other semantics like idempotency.

Neither SOAP nor XML solves or helps: there might be no WSDL bindings for your language, or they are badly implemented (true in my experience).

XML makes people nest structures where they could have been flatter, SOAP even more so, and annotates everything without any need for it.

All this cost for no reason but to entertain the CORBA-like pipe dream, or should I say nightmare.

We just resorted to pulling data out of the XML with XPath, and more or less templating the response back; no WSDL and very simple.
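That XPath approach can be surprisingly small. A sketch using Python's standard library; the envelope and element names are invented for illustration, and real SOAP responses carry namespaces you'd have to register:

```python
import xml.etree.ElementTree as ET

# A made-up SOAP-ish response; real envelopes have namespaces and more nesting.
RESPONSE = """
<Envelope>
  <Body>
    <OrderResult>
      <OrderId>123</OrderId>
      <Status>SHIPPED</Status>
    </OrderResult>
  </Body>
</Envelope>
"""

root = ET.fromstring(RESPONSE)
# ElementTree supports a useful subset of XPath; no generated bindings needed.
order_id = root.findtext(".//OrderResult/OrderId")
status = root.findtext(".//OrderResult/Status")
```

You extract only the fields you actually use, and the rest of the envelope's ceremony is ignored.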

But why inflict this pain on your API users?

Swagger/OpenAPI does that for REST, to some extent.

Just want to add one thing about swagger:

Please think through your API if you are designing one; the tools will not think for you, and you can fool yourself instead...

Swagger will not magically solve anything. I have seen and experienced that you can create nonsensical but pretty, auto-documented APIs using Swagger in no time.

Great list. I'd also suggest researching what portion of their mutation API is idempotent. If they highlight those endpoints, they know what they're doing.

Network and system hiccups/outages cause non-trivial error rates with non-trivial load, so understanding how to recover from those errors is critical (to your ability to sleep through the night because your on-call pager stayed silent).

If the documentation says it is a REST API, you could be justified in expecting all POST requests to be non-idempotent, and all other requests to be idempotent.

But it would still be reassuring to see this stated explicitly!

That's only if all calls are isolated to simply modifying state of objects, and not causing events. For example, if doing a PUT on a user to change their email kicks off an email to verify the account, then it's not idempotent in my mind.

It could be argued this is a departure from the standard, but on the other hand allowing PUT to have side-effects might be more user-friendly than an extra POST to a special "email invitation" resource. Would you prefer to see this behavior (as long as it was documented)?

Come on: that’s easily fixable if the provider cares: just assign an ID to every mutation operation. Never repeat the operation if called again.
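The mutation-ID scheme described here is essentially an idempotency key. A server-side sketch; the in-memory store and the `charge` operation are illustrative, not any particular vendor's API:

```python
# Maps idempotency key -> result of the first successful call with that key.
# A real service would use durable storage with an expiry, not a dict.
_seen = {}

def charge(amount):
    # Stand-in for the real, non-idempotent side effect.
    return {"charged": amount}

def idempotent(key, operation, *args):
    """Run `operation` at most once per key; replay the stored result after that."""
    if key in _seen:
        return _seen[key]
    result = operation(*args)
    _seen[key] = result
    return result

# Retrying with the same key cannot double-charge:
first = idempotent("req-42", charge, 100)
again = idempotent("req-42", charge, 100)
```

A production version also has to handle the case where the first attempt is still in flight when the retry arrives, which is where most of the real work lives.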

In the real world, even bigger players such as AdWords would time out on our bulk requests to them when making >100M changes, without giving any errors back (the jobs simply tend to disappear, or never get submitted in the first place).

Truth is, you may or may not be discussing the appeal of REST; but the issue in this context is solved already, and it's only a matter of professionalism and, therefore, the resources required to build something that just works.

But in the real world we get stoned students writing the majority of the APIs, having read a few how-tos.

I don't disagree, the original question was how to vet an API though, and I think it's a valid thing to look for.

#4 (REST) should not be on this list.

There are (at least) three senses of "REST" in use in the development world, and none of them are important in an API.

The first is the original definition, laid out in Fielding's thesis. (Normally I call this "the correct definition".) This includes requiring HATEOAS, which is far from a necessary property in a good API, and arguably a bad feature because it increases payload sizes.

The next is the definition popularized by Rails [1]. This is never precisely outlined anywhere as far as I know, but most people seem to think it requires using URLs with particular patterns (to see an example, just do `rake routes` on any Rails project that doesn't use weird custom routing in routes.rb). Also most people seem to think it requires using PUT for overwriting updates and DELETE for deletions. All of these conventions are not-bad. But they're also not particularly good; they amount to a variable naming convention that happens to be pretty reasonable. Other systems of organization are also perfectly good! But anyway, this is not an important feature in an API.

The third definition you'll see sometimes is "our API uses HTTP, therefore it's REST, we're buzzword-compliant". Fuck your buzzword compliance, no that doesn't make your API RESTful. And in some cases, these bozos can't even use HTTP correctly (for example using GETs for side-effect-ful requests). This is pretty much indefensible, but it'll still happen. So if you DO care about one of the first two definitions, don't take it on faith that someone who says "our API is RESTful" knows what that even means.

Also, the first two definitions both work in terms of "representations of resources", and all three of the above apply to HTTP [2]. It's not at all clear that either of these are good features in an API. They both have the advantage that they make implementation more predictable, so that's often (usually? almost always?) a great idea. Yay! But resource-representation is a bit bloaty, and so is HTTP, so for some situations it's more appropriate to discard either or both of those design parameters. For example, suppose you're monitoring some terrifyingly dense IoT deployment, and you want your swarm of sensors to send updates ten times a second. Differential updates over Apache Thrift (or Google Protobuf) are certainly not over HTTP, and you have to squint pretty hard to see the differentials as resources. (Mind you, you'd probably also want to send some checkpoints less frequently, and you could characterize those as resources without too much difficulty.)

In summary: REST means different things to different people, and there's a lot of good sense in those communities (I mean, not the third-listed liars, but the other two). BUT, probably you don't actually care about RESTfulness per se, in that cute conventional URI/path patterns are unimportant, and HATEOAS is somewhere between nice-to-have and undesirable.


Footnote 1: I'm not claiming that authors of Rails (DHH et al) were confused about the definition of REST. As far as I can tell, when they said "make your routes RESTful", they weren't claiming that doing so necessitated using their cute method/URI convention, they were simply claiming that doing so was compatible, and that following some convention is a good idea, and here's a good convention. Yay! Well said! Unfortunately most people who wrote simplifying articles on the subject got it wrong, so now we have an industry full of semi-literates quoting non-existent scripture.

Footnote 2: Strictly speaking Fielding's thesis is about what he thinks HTTP did well, and is intended to be largely advice to designers of future protocols intended to compete with HTTP. That's why so much of the "architectural style" is about stuff that's built into HTTP, such that the designer of an HTTP API cannot fuck it up. API authors are not the primary intended audience of the thesis!

This may sound a little random, but one of the things that varies most between APIs is date and time formats. Many API implementations come up with their own. Looking at the date format tells you how much thought has gone into the API design. Hopefully it's one of the ISO 8601 options; otherwise I'd start looking more carefully at the documentation.
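Part of the appeal of ISO 8601 is that standard-library tooling parses it directly, while ad-hoc formats force a hand-written pattern. A quick Python illustration:

```python
from datetime import datetime

# A typical ISO 8601 timestamp from a well-behaved API: parses directly,
# and comes out timezone-aware with no guessing.
ts = datetime.fromisoformat("2017-11-18T09:30:00+00:00")
assert ts.tzinfo is not None

# Compare an ad-hoc vendor format, which needs a custom pattern per field
# and carries no timezone information at all.
adhoc = datetime.strptime("18/11/2017 09:30 AM", "%d/%m/%Y %I:%M %p")
```

(Note that `fromisoformat` accepts only a subset of ISO 8601 in older Python versions, e.g. a trailing `Z` isn't handled before 3.11, so check your runtime.)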

Related to that is how they handle currencies.

Look for what users have said about them, and their own published uptime/event history. Some big-name services are surprisingly unreliable.

Agree with codegeek about sandboxes. With some vendors it's hard to test your code because their sandbox doesn't look like their live service, or hard to come up with instructions in those cases where you need to ask your customers to interact with a vendor. Some vendors make our tests that integrate against them flaky because the sandbox is flaky :/

Look carefully for things that smell either like too little attention to detail or great attention to detail. For example, for a while a big payment gateway had, basically, a webhook for telling you about payments/payment failures on recurring transactions, but there was no retry mechanism, so if a connection was lost for any reason, so was your data. At the other extreme, it's a good sign if they go above and beyond in some areas, e.g. Twilio and Stripe have great documentation, their approaches to deprecation/compatibility are good, and Stripe offering the idempotency-key mechanism showed they'd spent time solving the whole not-charging-people-twice-because-of-an-inconveniently-timed-network-glitch thing.

This is probably a quirky personal preference, but I'm encouraged when vendors offer APIs for single-item and bulk operations and the bulk ops perform OK with a lot of items. We have some generally alright vendors whose APIs turn out to be painfully slow when we sync data for our largest customers back to our servers. This is a problem that would not necessarily have shown up from testing a high request rate; it's a matter of scaling well with amount of data _stored_.

Some services have "old" and "new" APIs that are not fully integrated, e.g. auth credentials needed and object IDs might differ between them. Sometimes you have to use both to get all of the functionality you want. This hasn't tended to correlate with a good developer experience. On the other hand needing to make your new API distinct is probably inevitable sometimes due to business history or other practicalities--I wouldn't make a decision just based on that.

Comprehensible pricing is a good sign, though of course that's more achievable some times than others. Something like AWS or GCP has accumulated a lot of cost/pricing details to sort through since there are a lot of types of resources they offer, and there aren't really shortcuts to get around that.

Does the API return a 200 status code and an empty response body for all calls to a given endpoint (even failed ones)? Hi Salesforce, great Dreamforce event you threw last week!

Last year I worked with a variant of this: status 200 always, with the actual status code and message in the message body. Whyyy.

Successful response? 200 and a JSON payload.

Error response? 200 and a human-readable string describing the error.

Yes, the idea is indeed that the 200 signals the request was received and the message body carries the actual response. Problem is, if you are going to send me the actual HTTP status code inside a 200 response with no additional information, you might as well send it directly. That works better with all the libraries as well. Now we had to write a layer of abstraction to pull out the actual message and code and forward that into our applications and third-party libs. Personally, I prefer APIs that adhere to the HTTP status codes as defined in the standard.
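That abstraction layer can be a thin translation function. A hedged sketch, assuming the vendor wraps the real status in a JSON envelope (the field names here are made up):

```python
import json

class VendorError(Exception):
    """Raised when the vendor's 200-wrapped envelope actually reports a failure."""
    def __init__(self, code, message):
        super().__init__("%s: %s" % (code, message))
        self.code = code

def unwrap(body):
    """Convert a 200-with-envelope response into a normal success or exception."""
    data = json.loads(body)
    code = data.get("statusCode", 200)
    if code >= 400:
        raise VendorError(code, data.get("message", ""))
    return data.get("payload")

result = unwrap('{"statusCode": 200, "payload": {"id": 7}}')
```

Downstream code then sees ordinary exceptions, so retry logic and third-party libraries behave as if the vendor had used real status codes.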

From memory Facebook popularized this approach way back when.

This can make sense when you're doing batch requests and allow partial failures. I wish there were a standardized way of making batch requests, maybe even HTTP-level support. (It isn't such a big deal with HTTP/2.)

> This can make sense when you're doing batch requests and allow partial failures.

Wouldn't that be a 206?

No, 206 is only used with Range requests, when you try to get only some of the bytes of a larger resource.

There is 207, I think created by WebDAV, where you give back an XML document detailing the status of sub-requests. I feel like that'd be confusing in many contexts as well, and a clearly defined custom response is probably easier to handle for the user.

Some really great points here, so won't rehash those. Some additional things to consider:

- Request size limitations (implicit and explicit, such as only allowing GET and being limited by what can be URL encoded)

- Response size limits, especially for reporting or data retrieval APIs.

- Concurrency limits in addition to rate limiting (e.g. Requests per second limit AND number of queued requests limit)

- SLAs have been talked about, but you always need to quadruple-check this and your contract outs based on number of incidents, etc.

- Consider upstream/downstream dependencies, for example Twilio itself is not likely to go down, but one of the regional telcos/carriers/rate centers are. What options do you have then? (Twilio allows SIP trunks from other carriers to be connected for example.)
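The concurrency-limit point can be enforced client-side with a semaphore, so in-flight requests never exceed the vendor's cap. A sketch; the cap of 5 is an arbitrary placeholder:

```python
import threading

# Never allow more than 5 in-flight requests, regardless of how many
# threads are making calls. Set this to the vendor's documented cap.
_slots = threading.Semaphore(5)

def call_vendor(request_fn, *args):
    """Wrap any outbound call so total concurrency stays under the cap.

    Callers block here instead of triggering the vendor's queue limit.
    """
    with _slots:
        return request_fn(*args)
```

This also gives you one place to add the queued-requests limit: swap the blocking `acquire` for `acquire(timeout=...)` and fail fast when the backlog is too deep.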

> Assuming all vendors provide similar capabilities in terms of costs, features, support, API response times, load testing results and the like

This is a terrible assumption. This is one of the things that differentiates vendors. No two vendors are the same, especially things like staging environments, capacity limits, etc.

But other things are:

1) Do they have a real-time status page?

2) How many outages have they had in the last year, and how severe were they?

3) Where is their support based? And is it 24x7? What if you have an issue at 2am on a Saturday morning? Do they have support?

4) How attentive are they to fixing bugs?

Experience with customers similar to you, e.g. your competitors. There are many factors about your field, size, etc. that the vendor may already be familiar with.

Don't forget to make your response parsing compatible with future versions. If the response is JSON, make sure that an added field or switching between a field being null/{}/absent doesn't change your behavior. These subtle things change often enough when vendors refactor their code, and you want your implementation to be robust.
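A sketch of that defensive parsing in Python; the field names are invented, and `dict.get` with defaults keeps added, null, or absent fields from changing behavior:

```python
import json

def parse_customer(body):
    """Read only the fields we use; tolerate extras, nulls, and absences."""
    data = json.loads(body)
    return {
        "id": data.get("id"),
        # Treat null, {}, and a missing key all as "no address".
        "address": data.get("address") or None,
        "name": data.get("name", ""),
    }

# A field the vendor added later is simply ignored:
customer = parse_customer('{"id": 1, "name": "Ada", "newFlag": true}')
```

The point is to normalize at the boundary, so a vendor refactor that swaps `null` for an empty object never propagates into your application logic.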

I did similar research for a project. One vendor looked better than all the others: good documentation, an extensive, versioned API... or so I thought. The documentation was old and incomplete, and the API breaks regularly without notice. So my advice: try to get info from someone who already uses that vendor. I was very shocked and disappointed after one month.

Look at if they have an SDK for the language you'll be using. How much of it is open-source? How hard would it be to extend it? How hard is it to roll your own? Were they careful in their resource cleanup? How many new dependencies does the SDK bring into your project, and how well-understood are they?

How they support backward compatibility

This is one of the biggest ones for me. Also known as "are you going to make me rewrite my damn application every time you think you had a good idea".

I just looked back at some old web projects from about 6-7 years ago that I'd mostly forgotten about, to see if anything was worth keeping. Every last one that called out to an API is now non-functional. Oh well.

1.) Can you test against it easily? E.g. how good are mocks and test version of API?

2.) Documentation quality. Can you read it and understand? Does it contain details and examples or rather fuzzy generalities?

3.) Do the company seem stable? How likely are they to disappear (or get into trouble) three years later?

Nothing else matters if the vendor stops operating or refuses to continue services/pricing that you'd been relying on.

If this API is mission-critical, write drop-in replacement drivers for other vendors and/or get a commitment from them in writing.

If the vendor you pick goes "tits up" with minimal notice, what is your Plan B, or Plan C?

How essential will any given vendor be to your biz model? If they are acquired, say by a competitor, where will that leave you?

How might any of these "relationships" affect your exit strategy? That is, what you find palatable and necessary your suitor might not.

See if they connect to Zapier. Good indicator of a dedication to connections, imo.

In addition to what has been mentioned already (esp. backwards-compatibility):

* Ability to customize / extend

I'm assuming your API is targeted at a business domain (vs. an infrastructure API like an email API). The API vendor will do their best to cover all major use cases within that business domain. But expect that your project has a minor use case, or is the first project to ever need feature X. The vendor most likely won't want to implement this feature. (This isn't necessarily bad; in fact it is good. If the vendor added any random feature to their API, it would evolve into a frankenstein that is a nightmare to work with.)

However, the vendor should have features and patterns/best practices for solving those cases. E.g. in addition to the API vendor's data model, you should be able to add your own data (like https://stripe.com/docs/api#metadata), and you should be able to react to events to add custom behaviour (e.g. https://stripe.com/docs/webhooks).
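For illustration, the metadata-plus-webhooks pattern looks roughly like this; the payload shapes below are invented for the sketch, not any specific vendor's schema:

```python
# Attach your own identifiers to the vendor's object when creating it, so
# later webhook events can be mapped back to your domain without a lookup table.
payment_request = {
    "amount": 1999,
    "currency": "eur",
    "metadata": {
        "order_id": "our-order-8831",
        "tenant": "acme",
    },
}

def handle_webhook(event):
    """Route a vendor event back to our own order via the metadata we attached."""
    metadata = event.get("data", {}).get("metadata", {})
    return metadata.get("order_id")
```

The custom-data hook plus the event hook together cover most "minor use cases" without the vendor having to implement your feature X.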

* Quality of documentation and support

A lacking documentation, especially in combination with bad support, can slow down development significantly. Some APIs also have a good community outside of the vendor (e.g. it's active on Stackoverflow), take that into consideration as well.

Also, are the POs and Devs active in support? If yes, this is a good sign IMO, because they're trying to stay in touch with their customers. And it'll also raise the quality level of the support, because - if you run into trouble - you can escalate to someone who really knows what is going on.

* Good SDKs in your language

Some API vendors offer great SDKs. They can be a huge timesaver, e.g. if they provide models for all their endpoints, provide additional helper methods on top of the API, and convert data types into language-native objects (e.g. Money isn't a native data type of JSON, but it is in most languages). It is also valuable if the SDKs (and docs) come with lots of examples in your language.

Some SDKs are okay-ish, especially if generated from Swagger/RAML. Some generated ones are a bit strange, and it may be better to work directly with the API if that lets you craft your models the way you need them.

Some SDKs are a nightmare. We had one that leaked threads and killed our servers after a couple hours... not fun!

* Development workflow

As a dev, how can you use the API for testing? On staging? Does this incur additional costs?

You may need to reproduce a bug from your production API. It's helpful if you can copy the production data into your dev project easily.

(I'm an API dev @ commercetools)

Is that correct English these days? I’m reading the headline thinking “while API integration...what?”
