APIs as infrastructure: future-proofing Stripe with versioning (stripe.com)
498 points by darwhy on Aug 15, 2017 | 50 comments



I work on a SOA team at a large healthcare enterprise. We currently write mostly SOAP APIs (yeah, I know, 2017 and all that...), and follow a pattern similar to what Stripe describes: whenever we do a version bump (which is extremely frequent, since a WSDL breaks a contract the moment you sneeze at it), we create an XSLT transform from the new version back to the old. So if you're calling version 2, but the current version is 10, there will be transforms back for 10->9, 9->8, 8->7... all the way back to 2. It works well enough.
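
A minimal sketch of that cascade, using plain Python dicts in place of XSLT over XML. The version numbers, field names, and the `DOWNGRADES` table are all hypothetical and only illustrate the shape of the chain:

```python
from typing import Callable, Dict

Payload = Dict[str, object]

def drop_field(name: str) -> Callable[[Payload], Payload]:
    """Build a transform that strips one field added in the newer version."""
    def transform(payload: Payload) -> Payload:
        return {k: v for k, v in payload.items() if k != name}
    return transform

# DOWNGRADES[n] converts a version-n payload into a version-(n-1) payload.
# Hypothetical history: v3 added "middle_name", v4 added "zip_code".
DOWNGRADES: Dict[int, Callable[[Payload], Payload]] = {
    4: drop_field("zip_code"),
    3: drop_field("middle_name"),
}

CURRENT_VERSION = 4

def render_for(payload: Payload, requested: int) -> Payload:
    """Walk the chain current -> current-1 -> ... -> requested."""
    for version in range(CURRENT_VERSION, requested, -1):
        payload = DOWNGRADES[version](payload)
    return payload

latest = {"name": "Ada", "middle_name": "K", "zip_code": "94107"}
print(render_for(latest, 2))
```

A caller on the current version skips the loop entirely, so only older callers pay for the transforms.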

We're in the early stages of deploying a new RESTful stack, and versioning is a hot topic (along with getting people out of the RPC mindset and into a resource-based paradigm). While version bumps should be much less common, we'll probably end up doing something similar to our cascading transformations. Essentially, the old version becomes a consumer of the new version, and as long as the new version continues to hold to its API contract, everything should work with minimal fuss. Of course, that's assuming that we don't change the behavior of a service in ways that aren't explicitly defined in the API contract...


> Essentially, the old version becomes a consumer of the new version

As a raw concept, I really like this idea. Let's say you're bumping from 2.5 to 2.6 and there's a breaking change in-between.

You replace the old api code with a new thin layer that consumes 2.6 and puts it in 2.5 format. This would be easy to write a series of tests for to make sure that 2.5 is still providing what it should, and it has the added benefit of giving your engineers the opportunity to consume the new api internally in a structured way, so that would be a good way to catch any bugs you might not otherwise find.

And of course, the 2.4 version already only consumes the 2.5 version, so it's turtles all the way down.

That's a great concept. You would have to still deprecate APIs or provide an incentive for users to update to the newest version, otherwise the oldest versions would create a long chain of requests that add unnecessary overhead. But otherwise I love it.


Don't you mean the other way round? You always want to be running the most recent 2.6 servers, but when someone sends version 2.5 requests you transform the 2.5 request to 2.6 and pass it along to your server (and vice versa with responses).

The problem, I guess, is that breaking changes come easily. It's fine if, say, we have example.com/homeaddress and I add a zip code field in 2.6. But a 2.5 request has no zip code, and if 2.6 makes that field mandatory then it's really 3.0, which just feels wrong. It's a breaking change, so OK. But then you get a deeper discussion over why a missing zip code should be a breaking change at all. Interesting.
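
To make the mandatory-field problem concrete, here is a rough sketch of the request direction. The handler, the `zip_code` field name, and the `default_zip` parameter are assumptions for illustration only:

```python
from typing import Optional

class MissingField(ValueError):
    """Raised when a request lacks a field the handler requires."""

def upgrade_request_2_5_to_2_6(request: dict, default_zip: Optional[str] = None) -> dict:
    """Reshape a 2.5 request into 2.6 form before the 2.6 server sees it."""
    upgraded = dict(request)
    upgraded.setdefault("zip_code", default_zip)
    return upgraded

def handle_v2_6(request: dict) -> dict:
    """Hypothetical 2.6 handler in which zip_code is mandatory."""
    if not request.get("zip_code"):
        raise MissingField("zip_code is required in 2.6")
    return {"status": "ok", "zip_code": request["zip_code"]}

old_request = {"street": "1 Main St"}

# With no sensible default, the upgraded request is still rejected,
# which is what makes a mandatory new field a breaking change.
try:
    handle_v2_6(upgrade_request_2_5_to_2_6(old_request))
    accepted = True
except MissingField:
    accepted = False
print(accepted)
```

The upgrade step can only paper over the gap if some default value is acceptable to the business, and that is a product decision rather than a transformation problem.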


Maybe we're talking about the same thing, but basically, the way I see it is this:

1. Request for 2.5 comes in

2. Version 2.5 calls v2.6, takes the data, and transforms it into 2.5.

3. Since a 2.5 response shouldn't have a zipcode field, it's dropped from 2.6 before the 2.5 response is returned.
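
Roughly, the three steps above might look like this. Both handler functions and their data are made up for the sketch:

```python
def get_address_v2_6(customer_id: str) -> dict:
    """Stand-in for the real 2.6 implementation (hypothetical data)."""
    return {"id": customer_id, "street": "1 Main St", "zipcode": "94107"}

def get_address_v2_5(customer_id: str) -> dict:
    """2.5 holds no business logic: it calls 2.6 and reshapes the result."""
    response = dict(get_address_v2_6(customer_id))
    response.pop("zipcode", None)  # 2.5's contract has no zipcode field
    return response

# A contract test pins down exactly what 2.5 clients are allowed to see.
assert set(get_address_v2_5("cus_1")) == {"id", "street"}
```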


This kind of implies always having 2.5 running as front-end code. I would rather have my latest code running (especially as the same applies to 2.4, 2.3, ... 1.3) and have that spin up the 2.5 munger code as needed. But yes, I think it's more or less the same thing.

Interestingly, looking at Stripe's changelog (https://stripe.com/docs/upgrades#api-changelog), they still ship breaking changes every month or two, so despite the effort, and with a smallish API, they break a lot of contracts.

The odd thing is that they keep a different API version for each customer, stored in the customer's settings, so I suspect your approach (spin up the API based on the incoming version) might work better. But boy, I think they may be in a lot of pain trying to maintain that backwards compatibility.
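
As I understand the article, a request can also override the account's pinned version with a `Stripe-Version` header. A sketch of how the incoming version might be resolved; the function name and the idea of passing the pinned version in as a plain string are my assumptions, not Stripe's code:

```python
from typing import Mapping, Optional

def resolve_api_version(headers: Mapping[str, str], account_pinned_version: str) -> str:
    """Prefer an explicit per-request version, else the account's pinned one."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    override: Optional[str] = normalised.get("stripe-version")
    return override or account_pinned_version

# Hypothetical dated versions, used only as example values.
print(resolve_api_version({}, "2015-10-16"))
print(resolve_api_version({"Stripe-Version": "2017-08-15"}, "2015-10-16"))
```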


If you can do an XSLT transformation to different versions I wonder how important the change was in the first place?


As TFA notes, replacing a boolean with an enumeration is a breaking change, but can be transformed back easily enough assuming the enumeration just increases the granularity of the original states.
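
A small sketch of that boolean-to-enumeration case. The `verified` and `verification_status` names and the enum members are invented for illustration:

```python
from enum import Enum

class VerificationStatus(Enum):
    """Hypothetical enum that replaced an older boolean `verified` field."""
    UNVERIFIED = "unverified"
    PENDING = "pending"
    VERIFIED = "verified"

def downgrade_status(response: dict) -> dict:
    """Collapse the finer-grained enum back into the original boolean."""
    old = {k: v for k, v in response.items() if k != "verification_status"}
    status = VerificationStatus(response["verification_status"])
    old["verified"] = status is VerificationStatus.VERIFIED
    return old

print(downgrade_status({"id": "acct_1", "verification_status": "pending"}))
```

This only works because every enum member maps cleanly onto one of the two original states. A member with no sensible boolean equivalent would make the downgrade lossy.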

And SOAP is usually schema-verified, so adding fields will generally be a breaking change.


Our most common case for this is adding a field to a data model: that will break a SOAP client. The XSLT just strips out the new field.


I worked for a place in the travel industry that was doing the same thing on our REST interface. That was a bit more relaxed since it's not the crazy WSDL so you could always add things to the interface without breaking, but it still allowed you to version cleanly. In something like 150 version releases (bi-weekly cadence, over the course of a few years), I believe there were 2 XSLTs in the version chain.

To the point of the XSLT being so trivial that maybe you didn't need the change, I recall one of our changes being around a field that got split into 2 fields. We certainly could have put special case code in to handle old version vs new version, but the great thing is, we didn't have to put that code in. It really keeps the interface a lot cleaner.
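
For illustration, the split-field downgrade might look like this in Python rather than XSLT. The `street`, `street1`, and `street2` names are hypothetical stand-ins, not the actual fields we changed:

```python
def downgrade_address(response: dict) -> dict:
    """Rejoin street1/street2 (new shape) into a single street field (old shape)."""
    old = {k: v for k, v in response.items() if k not in ("street1", "street2")}
    parts = [response.get("street1"), response.get("street2")]
    # Skip empty parts so a missing street2 does not leave a dangling comma.
    old["street"] = ", ".join(p for p in parts if p)
    return old

print(downgrade_address({"street1": "1 Main St", "street2": "Apt 4", "city": "SF"}))
```

The core handler only ever deals with the new two-field shape; the old shape exists solely in this one function.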

Also, XSLT is capable of some pretty advanced transformations, so I wouldn't dismiss it out of hand as trivial.

Regarding the documentation, we build all our docs off Javadoc and a few annotations, so docs were always current. As a bonus, when your Javadoc is used to generate public docs, you tend to treat it with more respect than Javadoc usually gets.


> there will be transforms back for 10->9, 9->8, 8->7... all the way back to 2. It works well enough.

We did something like this on a message bus: the channels had the version in the name, and an older-version service always asked for the same data on the next version's channel (the only other version it knew about), then transformed it before sending it on. This meant you never had to maintain anything except your newest version. It also meant you could never completely remove data, because versions going obsolete needed a way to find or derive it to create their older version.


Always excited to hear Stripe talking about versioning :D

For anyone else who's interested, they've written/talked about this a few times over the years, to fill out the picture:

- http://amberonrails.com/move-fast-dont-break-your-api/

- https://www.heavybit.com/library/video/move-fast-dont-break-...

- https://speakerdeck.com/apistrat/api-versioning-at-stripe

- https://brandur.org/api-upgrades

- https://news.ycombinator.com/item?id=13708927

It sounds like their YAML system has changed to be implemented in code instead, which maybe allows the transforms to be a bit more helpful/encapsulated. If anyone from Stripe is here, it would be awesome to know if that's true and why the switch?


I built a lot of the new system at Stripe: the old system would set a flag that was globally accessible inside of an API request. This meant that it was impossible to scope down which changes were relevant to which API methods or resources. Now we can statically introspect into changes.

Additionally, devs would need to specify what their changes were independent of where they made the change, which meant that our API reference (which can display warning flags next to changed fields) was missing changes. With the new system we can enforce that the change being made is properly documented, since we know that it's encapsulated inside of the change class itself.
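
To give a flavour of what "encapsulated inside of the change class" can buy you, here is a rough Python sketch. It is not our actual code: the class names, the `resource` attribute, the version string, and the registry are all hypothetical.

```python
from typing import Dict, List, Type

class VersionChange:
    """Base class: each subclass is one backwards-incompatible change."""
    description: str = ""
    resource: str = ""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Fail at class-definition time if a change is undocumented or unscoped.
        if not cls.description or not cls.resource:
            raise TypeError(f"{cls.__name__} must set description and resource")

    def apply(self, response: dict) -> dict:
        raise NotImplementedError

class CollapseRequestObject(VersionChange):
    description = "`request` is rendered as a string ID instead of a subobject."
    resource = "event"

    def apply(self, response: dict) -> dict:
        old = dict(response)
        old["request"] = response["request"]["id"]
        return old

# Hypothetical registry: changes introduced by each dated version.
CHANGES_BY_VERSION: Dict[str, List[Type[VersionChange]]] = {
    "2017-01-01": [CollapseRequestObject],
}

def changelog_for(resource: str) -> List[str]:
    """Static introspection: list documented changes touching one resource."""
    return [
        change.description
        for changes in CHANGES_BY_VERSION.values()
        for change in changes
        if change.resource == resource
    ]

print(changelog_for("event"))
```

Because the description and scope live on the class, a docs generator can read them without executing a request, and an undocumented change cannot even be defined.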


I'm working on designing an API now that is based on Stripe's previous post about how they do versioning.

In general, the concepts employed by Stripe really encourage better design choices. All changes, responses, request parameters, etc should be documented and then handled automatically by the system. We took this approach in our design, although we don't do it with an explicit "ChangeObject" like Stripe does; it's a great idea though.

Hoping to be able to put out a blog post once we start implementing the system and getting feedback on what works and doesn't work well.


I wrote a library that will assist anyone wanting to do something similar for versioning using Django Rest Framework: https://github.com/mrhwick/django-rest-framework-version-tra...


I think this is one area where GraphQL really excels. It essentially can handle all of this versioning for you (as clients specify EXACTLY what they want) - you just need to make sure that as you evolve your schema that existing fields are left as-is and you only add new fields (not an easy task, but no harder than what you have to do in Stripe's protocol).


New fields don't constitute the kind of backwards-incompatible changes mentioned in the blog post (adding fields is just as easy with RESTful APIs). GraphQL does help cut down on payload size (no extra fields for clients that don't need them), but if you're making major changes to the shape of your data, you'll still need to write some kind of compatibility layer.


I've done this with GraphQL (major changes to the shape of the data), and I still think it's easier than other approaches.

For example, in those cases where the response object was fundamentally different, I've added a whole new Query field (and Mutation field) at the top level, that returned the new object. Under the covers there was still a bunch of complexity needed to support both top-level query fields, but old clients continued to access the old query, and new clients went to the new query.

Also, two other things I really like: With GraphQL it's trivial to log which clients are accessing which fields (at some point you can just decide to kill old ones). Also, the @deprecated support in the GraphQL schema IDL integrates wonderfully with the GraphiQL tool, so anyone browsing your schema for the first time always sees the recommended latest version that they should use.


This is awesome. As someone who's built many APIs, I have always wondered how Stripe managed all those versions. I knew their code couldn't just be littered with if/thens.

This is a really smart way to do it.

One question is, over the years, wouldn't you add a lot of overhead to each request in transformation? Or do you have a policy where you expire versions that are more than 2 years old, etc? (skimmed through parts of the article so my apologies if you already answered this)


Assuming that Stripe's userbase grows over time and that their new API versions offer features that are useful to a significant portion of their existing userbase, I would expect the majority of requests at any time to be for a recent API version that needs few (or no) transformations. This is especially likely considering that Stripe provides official client libraries (which are presumably up to date) for a lot of languages.

Stripe probably tracks the relative usage of each API version. If they found a lot of their users were stuck on an old version, that would point to a bigger problem than just some per-request overhead.


> Stripe probably tracks the relative usage of each API version. If they found a lot of their users were stuck on an old version, that would point to a bigger problem than just some per-request overhead.

I'm not sure how true that is. In my experience, integrations with online payment services are mostly write-only code: you do it, you test it, and then no-one goes near it once it's in production unless there's some sort of known bug or security issue.

Literally the last thing I want to do with working, tested code integrating with the service that collects money for a business is make unnecessary changes that might break it. I know several businesses that use versions of APIs that are several years old with services like Stripe, because they have no need to change.


All your questions are answered in the article:

- They apply transformations to convert responses from the current version to the previous version, step by step.

- This means the overhead for old versions only applies to consumers of the old version.

- They support all versions going back 6 years.


To me this is one of the best things about the Stripe API. It's basically database migration files but for your API requests.

Does anyone know of packages that do this already? I have been contemplating creating one in PHP/Laravel for a long time but haven't had the time yet...


I’ve been looking for something like this in PHP too. Nothing yet though. Happy to work on something with you though!


I'm gonna hack on it a bit tonight before I go to bed. Here is the link: https://github.com/tomschlick/request-migrations


Hmm https://github.com/thephpleague/fractal

And its underlying Transformers object API...?


Does anyone know if there's a publicly-available Ruby gem for doing what's described in this blog post - i.e., cascading transformations with a nice DSL? If someone from Stripe reads this, I think you could count on some decent community interest for this framework if you ever consider open-sourcing it.


It's not a DSL, but take a look at VersionCake.

https://github.com/bwillis/versioncake

"Version Cake is an unobtrusive way to version APIs in your Rails or Rack apps."


Thank you to Stripe for the efforts they make in this area. If you're building critical infrastructure around someone else's system -- and it doesn't get much more critical than the way you collect money -- then you really want this kind of stability.

I wish other payment services treated their long-time clients with the same respect (looking straight at you, GoCardless).


I'd love to see more specifics on the documentation automation. Keeping docs straight sounds like the biggest challenge with a system like this.


That's a great system - I'm totally going to borrow that technique when it next makes sense.

Hey, @pc with all the spare time your team has accumulated by using this api model maybe you could put it to good use. Might I suggest it's time to divert most of your tech resources into creating the next Capture the Flag? Because those were just awesome!

I'm joking, in case it's not obvious (but I would absolutely love another Stripe CTF).


From a Stripe developer perspective this sounds like a really clean way to handle API versioning.

From the perspective of a consumer of Stripe's APIs, doesn't this make debugging or modifying legacy code a real pain? Let's say I'm using Stripe.js APIs from a few years ago; where do I go to find the docs for that version? Do I need to look at the API changelog and work backwards?


From the article:

"We also tailor our API reference documentation to specific users. It notices who is logged in and annotates fields based on their account API version."


Thanks for the clarification. I've used Stripe for side projects for over a year now and honestly didn't know about this until now (I've made almost no changes since I originally added the Stripe integration). When viewing the Stripe API docs there is no sign-in option, and only once I signed into my account did the docs show a banner at the top indicating that I am on an older version.

I opened the docs in another browser to compare the differences and I am indeed seeing my older version when signed in. It's all very behind the scenes...I would have gotten tripped up by this had I tried to make changes to my code, and I wouldn't be surprised if many others have run into this issue.

My preferences would be for there to be multiple versions of the docs and have your version be very clearly laid out to you rather than automagically updating docs based on your pinned version.


Versioning of APIs has existed since the dawn of time, or at least the early 90s. ONC RPCs had versioning built-in IIRC for all their XDR structs. We got away from that over the last few decades, and now people run into the same problems that were effectively solved almost 30 years ago, but long forgotten. The more things change, the more they stay the same!


I wish every company were this open about their strategies. Between this and Netflix, it's great to see the cutting edge in action.


Most excellent write-up. I used Stripe for inspiration when writing a post that wound up here titled 'Pragmatic API Versioning', since been renamed 'How to Version a Web API'. It felt like the most clever way to reconcile change and stability, when compared to other major APIs.

It was a delight to get a peek behind the curtain. :)


Whenever an idea about API versioning springs into mind it is always worthwhile to watch Rich Hickey's Spec-ulation talk, https://www.youtube.com/watch?v=oyLBGkS5ICk for an alternative view of the world.


GitHub also versions their API via headers, but uses the `Accept` header instead: https://developer.github.com/v3/media/


I consider this a misuse of the Accept header. They're defining a bunch of custom media types, which in and of itself is pretty weird, but then they're not actually using those media types as the Content-Type of the response. So I can send a request where I say 'Accept: application/vnd.github.full+json', but the response I get back is 'Content-Type: application/json; charset=utf-8', which means the server intentionally handed me a media type that I explicitly said I didn't accept.
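
To spell out the mismatch, here is a deliberately simplified matcher. It is a sketch only: it ignores q-values and media-type parameters, and `acceptable` is a hypothetical helper, not part of any HTTP library:

```python
def media_type(value: str) -> str:
    """Strip parameters such as '; charset=utf-8' and normalise case."""
    return value.split(";", 1)[0].strip().lower()

def acceptable(accept_header: str, content_type: str) -> bool:
    """True if content_type matches any media range listed in Accept."""
    ctype, _, csub = media_type(content_type).partition("/")
    for media_range in accept_header.split(","):
        rtype, _, rsub = media_type(media_range).partition("/")
        # "*" in either position of the range is a wildcard.
        if rtype in ("*", ctype) and rsub in ("*", csub):
            return True
    return False

print(acceptable("application/vnd.github.full+json", "application/json; charset=utf-8"))
```

Under this strict reading the response above does not satisfy the request, even though the body may well be a valid document of the vendor type.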

I don't understand why they didn't just use the X-GitHub-Media-Type header as a request header.


> They're defining a bunch of custom media types, which in and of itself is pretty weird,

No it isn't. If you're doing REST and your resources are not completely trivial (arguably, even if they are) you should be defining a custom media type for them (think HTML5).

> but then they're not actually using those media types as the Content-Type of the response.

Ok, that is very weird.


I disagree. If the response body is a valid application/vnd.github.full+json document, then they sent you what you asked for; they just labeled it differently (but still correctly).

Nothing in the spec says the Accept and Content-Type must match.


The spec may not mandate that Accept and Content-Type match, but it's still pretty strange. The client is saying "I accept this particular vendor-defined format, with an underlying structure of JSON" and the server says "ok here's a blob of JSON". Nothing in the server's response indicates that the response actually belongs to the specific vendor-defined media type, it could be handing back arbitrary JSON.

I get that from a practical standpoint the server needs to send back a JSON Content-Type, otherwise a lot of clients won't understand that it's JSON and won't decode it properly. But given this, shouldn't they have picked a different header to declare the API version?

What's the benefit of using the Accept header in this fashion? AFAICT there is no benefit at all over just using a custom header.


I'm surprised they don't run into CDN issues.

I have run into more than one issue with `Vary: Accept` being ignored, resulting in everyone getting the cached response from the first request in, regardless of header variance.

It was especially noticeable when we tried supporting both JSON and XML via the Accept header, with the version in the MIME type.
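
A toy reproduction of the failure mode, with an in-memory dict standing in for the CDN. The `origin` function, the URL, and both cache classes are hypothetical:

```python
from typing import Callable, Dict, Tuple

Origin = Callable[[str, str], str]

def origin(url: str, accept: str) -> str:
    """Hypothetical origin server that renders JSON or XML based on Accept."""
    return "<ok/>" if "xml" in accept else '{"ok": true}'

class UrlOnlyCache:
    """Keys on the URL alone, like a cache that ignores `Vary: Accept`."""
    def __init__(self, fetch: Origin) -> None:
        self.fetch = fetch
        self.store: Dict[str, str] = {}

    def get(self, url: str, accept: str) -> str:
        if url not in self.store:
            self.store[url] = self.fetch(url, accept)
        return self.store[url]

class VaryAcceptCache:
    """Keys on (URL, Accept), which is what `Vary: Accept` asks a cache to do."""
    def __init__(self, fetch: Origin) -> None:
        self.fetch = fetch
        self.store: Dict[Tuple[str, str], str] = {}

    def get(self, url: str, accept: str) -> str:
        key = (url, accept)
        if key not in self.store:
            self.store[key] = self.fetch(url, accept)
        return self.store[key]

broken = UrlOnlyCache(origin)
broken.get("/v1/things", "application/json")
print(broken.get("/v1/things", "application/xml"))  # still the JSON body
```

Putting the version in the media type makes this worse, since every client on a different version then depends on the cache honouring `Vary`.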


I don't think they run their API servers behind a CDN, that's not really a common thing at all.


If the API is transforming requests/responses essentially auto-magically behind the scenes, then how would a client know what version to upgrade to, in order to get new desired features?

Say it is splitting of street into street1 and street2 for an address element. How would I know what version to target to get this feature?


Considering the popularity of APIs, has anybody felt the need for an Accept-Version header?


In our experience, API versioning hits a roadblock in three cases:

1. Mandatory new field.

2. Field is split. For example, address field is now divided into street1 and street2.

3. Change in datatype.

In the above three cases, we had to force users to upgrade their versions.


I really like the rolling versioning approach. Makes a lot of sense.


Sounds like migration policy on iOS (Core Data)


Brilliant article. Whatever versioning scheme you use, you should at least start with _something_ at the outset of an application IMO.



