Ask HN: How do you version control your microservices?
75 points by supermatt on June 12, 2015 | 52 comments
When working on a project consisting of a large number of microservices, what solutions do you employ for controlling the composition of specific versions of each microservice?

I currently use git submodules to track the application as a whole, with commit refs for each "green" version of each microservice. This "master" repository is then tested with consumer-driven contracts for each of the referenced submodules, with subsequent "green" masters for deployment to staging.

This submodule approach requires a lot of discipline for small teams, and on more than one occasion we have encountered the usual submodule concerns. I'm concerned that this will only become more problematic as the team grows.

What are your thoughts for a replacement process?




    > I currently use git submodules to track the application 
    > as a whole . . .
This is a conceptual error. There is no such thing as "the application as a whole". There is only the currently-deployed set of services, and their versions.

You should have a build, test, and deploy pipeline (i.e. continuous deployment) which is triggered on any commit to master of any service. The "test" part should include system/integration tests for all services, deployed into a staging environment. If all tests pass, the service that triggered the commit can be rolled out to production. Ideally that rollout should happen automatically, should be phased in, and should be aborted and rolled back if production monitoring detects any problems.
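For illustration, a rough Python sketch of the "phased, automatically rolled back" part; deploy, error_rate, and rollback are placeholders for whatever your deployment and monitoring tooling actually exposes:

  import time

  def phased_rollout(deploy, error_rate, rollback,
                     phases=(5, 25, 50, 100),
                     error_budget=0.01, soak_seconds=300):
      """Shift traffic to the new version in phases; abort on bad metrics."""
      for percent in phases:
          deploy(percent)                  # route `percent`% of traffic to the new version
          time.sleep(soak_seconds)         # let production metrics accumulate
          if error_rate() > error_budget:  # production monitoring check
              rollback()                   # abort and revert to the previous version
              return False
      return True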


> This is a conceptual error. There is no such thing as "the application as a whole". There is only the currently-deployed set of services, and their versions.

I'm using Martin Fowler's description (whom I admire greatly): "In short, the microservice architectural style is an approach to developing a single application as a suite of small services" http://martinfowler.com/articles/microservices.html

My problem is knowing which microservice dependencies the application has.

> You should have a build, test, and deploy pipeline (i.e. continuous deployment) which is triggered on any commit to master of any service. The "test" part should include system/integration tests for all services, deployed into a staging environment. If all tests pass, the service that triggered the commit can be rolled out to production. Ideally that rollout should happen automatically, should be phased in, and should be aborted and rolled back if production monitoring detects any problems.

We have this, and it is what builds the manifest (as a git repo containing the submodules).

The problems occur when you have multiple commits to different services. If a build is marked "red" because it fails a "future integration" then it just means that it is failing at that point in time. It may, following a commit of a dependency, become valid. However, it would need to have its build manually retriggered in order to be classified as such.

This becomes cumbersome when you have a not insignificant number of services being committed to on a regular basis.


I admire Martin Fowler as well but he is so wrong here.

Microservices should be designed, developed and deployed as 100% autonomous entities. The whole point of them is that you can reuse individual services for new applications or integrate them with third party (e.g. different team in your company) services without affecting other applications. Otherwise, what is the point of doing this?

Using git submodules absolutely violates this and is a bad idea. I would have separate repos and deploy scripts and then set the version number of each service to the build number which guarantees traceability and makes it easier for change management.


The whole git submodule thing is solely a manifest of all the git refs for compliant microservices. Let's just pretend that it is a text file, and call it the service manifest (why the submodule hate!) :)

The microservices are indeed "deployed" independently, based on the version (ref) as indicated in the service manifest.

> I would have separate repos and deploy scripts and then set the version number of each service to the build number which guarantees traceability and makes it easier for change management.

They have separate repos, they have separate "versions" (we use the git ref, which is constant).

As for why I use submodules, this is (in pseudo-bash) what a service deployment looks like:

  # pull the latest manifest (the "master" repo of pinned submodule refs)
  git pull
  # check out the commit pinned for this service (--init covers a fresh host)
  git submodule update --init SERVICE_NAME
  cd SERVICE_NAME
  ./start.sh


I beg to differ.

Microservices that depend on other microservices obviously are not entirely autonomous. You have to version large API changes and integration-test the whole chain of dependent services on every change.

It's not unlike libraries, just linked via the network.


    > "In short, the microservice architectural style is an 
    > approach to developing a single application as a suite 
    > of small services"
Sure, it's a single application from the perspective of the user, and that's a great abstraction to keep in mind when you're talking about things like uptime. But from the perspective of the operator, it's purely an abstraction -- a name you give to the collection of things you're manipulating.

As such, I'm not sure building a single repo of git submodules makes much sense. Maybe for accountability reasons, but not for actual operations. The raison d'être of microservices is to decouple the lifecycle of each service totally. By recombining them into a tagged release repo, you're arguably subverting that intent.

    > The problems occur when you have multiple commits to 
    > different services. If a build is marked "red" because 
    > it fails a "future integration" then it just means that 
    > it is failing at that point in time. It may, following a 
    > commit of a dependency, become valid. However, it would 
    > need to have its build manually retriggered . . .
Yep. To solve this problem, your CD system needs to automatically trigger rebuilds (and re-tests) of dependent (downstream) services when a dependency (upstream) changes. The easiest way to do that, in my experience, is to have a manifest in every repo that specifies dependencies as a flat list, which the CD system can parse to build a dependency graph.
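For what it's worth, the manifest-plus-graph part needs very little machinery. A hypothetical sketch in Python (deps.txt and trigger_build are made-up names, not any particular CD system):

  # Build a reverse dependency graph from per-repo manifests and trigger
  # downstream builds. "deps.txt" and trigger_build() are hypothetical;
  # substitute whatever your repos and CD system actually use.
  from collections import defaultdict

  def load_manifests(repos):
      """repos: {service_name: path_to_checkout}. Each repo lists its
      upstream dependencies, one per line, in deps.txt."""
      deps = {}
      for name, path in repos.items():
          with open(f"{path}/deps.txt") as f:
              deps[name] = [line.strip() for line in f if line.strip()]
      return deps

  def reverse_graph(deps):
      downstream = defaultdict(set)
      for service, upstreams in deps.items():
          for upstream in upstreams:
              downstream[upstream].add(service)
      return downstream

  def on_commit(changed_service, deps, trigger_build):
      """Rebuild the changed service, then every service that depends on it."""
      downstream = reverse_graph(deps)
      seen, queue = set(), [changed_service]
      while queue:
          service = queue.pop(0)
          if service in seen:
              continue
          seen.add(service)
          trigger_build(service)
          queue.extend(downstream[service])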

Tangential point, but -- with this, we're just scratching the surface of the incidental complexity that comes when you move to a microservices architecture. This is the stuff that isn't in all of those exalting blog posts -- beware! :)


Something seems very wrong here.

You shouldn't have services rebuilding/redeploying other services. We don't insist that Facebook rebuilds their applications if my Facebook app changes. For any services you depend on, there should be either (a) an API client, (b) a stub version of that service's APIs, or (c) a deployed test service.

The architecture of microservices is pretty simple. It is a mini version of how the internet works.


On the Internet, if Facebook redeploys and breaks you, that's your problem, it's kind of fine for Facebook.

But in an organization or company, if Team-A redeploys and breaks Team-B, that's everyone's problem, it's not really fine.

So yeah, microservices are like a mini Internet -- to a point.


> The easiest way to do that, in my experience, is to have a manifest in every repo that specifies dependencies as a flat list, which the CD system can parse to build a dependency graph.

That's pretty much how we manage the contracts.

> To solve this problem, your CD system needs to automatically trigger rebuilds (and re-tests) of dependent (downstream) services when a dependency (upstream) changes

This is GOLD. I think you may have just solved the problem with something so obviously simple that it has made me feel like a fool... Many thanks!!!


From your description, it sounds like your pain points don't come from versioning your microservice code; they come from versioning the data models that those microservices either 'own' or pass around to each other. While your approach of tracking your microservices as a collection of submodules is novel, it also defeats the purpose of microservices -- you should be able to maintain and deploy them independently without having to be concerned with interoperability.

While it's possible to alleviate some of your pains with versioned APIs to track changes to your data models, you still run into conflicts with data you already have stored in schemaless DBs when those models change.

In a Node or frontend JS stack, I solve that problem with Vers [1]. In any other stack, the idea is fairly simple to replicate: Version your models _inside_ of your code by writing a short diff between it and the previous version every time it changes. Any time you pull data from a DB or accept it via an API, just slip in a call to update it to the latest version. Now your microservice only has to be concerned with the most up-to-date version of this data, and your API endpoints can use the same methods to downgrade any results back to what version that endpoint is using. And frankly that makes versioning your APIs far simpler, as now you move the versioning to the model layer (where all data manipulation really should be) and only need to version the actual API when you change how external services need to interact with it.
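For anyone not on Node, the same pattern is a few lines in any language. A rough Python analogue (this is not Vers itself; the field names and transforms are invented for illustration):

  UP = {}    # version -> function upgrading a record to version + 1
  DOWN = {}  # version -> function downgrading a record to version - 1

  def converter(table, version):
      def register(fn):
          table[version] = fn
          return fn
      return register

  # v1 -> v2: split "name" into "first"/"last" (invented example transform)
  @converter(UP, 1)
  def up_1_to_2(user):
      first, _, last = user.pop("name").partition(" ")
      return {**user, "first": first, "last": last, "_v": 2}

  @converter(DOWN, 2)
  def down_2_to_1(user):
      name = f'{user.pop("first")} {user.pop("last")}'.strip()
      return {**user, "name": name, "_v": 1}

  def to_version(record, target):
      """Walk a record up or down to the version this code expects."""
      while record["_v"] < target:
          record = UP[record["_v"]](record)
      while record["_v"] > target:
          record = DOWN[record["_v"]](record)
      return record

  to_version({"_v": 1, "name": "Ada Lovelace"}, 2)
  # -> {"_v": 2, "first": "Ada", "last": "Lovelace"}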

And now your other microservices can update to the new schema at their leisure. No more dependency chain driven by your models.

[1] https://github.com/TechnologyAdvice/Vers


> you should be able to maintain and deploy them independently without having to be concerned with interoperability.

How do you ensure that a service can be consumed? Or that an event is constructed with the correct type or parameters? Surely interoperability is the key for any SOA?

Vers looks interesting - I'll have a look at that! Thanks!


You're 100% correct -- interoperability is the key, but you ensure that by making sure everything at the interface layer (whether that's a REST API, a polled queue, a notification stream, etc) is versioned, and the other microservices using that interface include the version they're targeting.

If you include the version at the data level, any time it gets passed into a queue or message bus or REST endpoint, your microservice can seamlessly upgrade it to the latest version, which all of its own code has already been updated to use. If a response is required back to the service that originated the request, use your same versioning package (Vers if you go with that) to downgrade it back down to the version the external service is expecting.

If your interface layer is more complex, with responses that change independently of the data, that calls for a versioned API: either throwing /v1/* or /v2/* into your URLs, or accepting some header that declares the version. But even in this case, you can drastically simplify changes to the model itself by implementing model versioning behind the scenes.


> you ensure that by making sure everything at the interface layer (whether that's a REST API, a polled queue, a notification stream, etc) is versioned, and the other microservices using that interface include the version they're targeting.

How would you handle deprecation, for example, in the case of a complete service rewrite?


You answered that one yourself :) Deprecation means to publicly notify that some API endpoint has hit end-of-life, and that a better alternative is available. If you completely rewrite a service, it's your responsibility to implement the same interface that you had before on top of it. Then you deprecate it and also publish your new api or data schema. Once you get around to migrating the rest of your application's services away from the deprecated endpoints, the next version of the microservice in question can remove that old code entirely.

Imagine Twitter or AWS completely rewriting their backend -- if they were to announce to the public that at a specific time, their old API URLs would 404 and the new ones would go live, it would be a wreck. They'd support the old API through deprecated methods and tell users they have X months to migrate away, if they remove that layer at all. Stress-free SOA must employ that same level of discipline in order to stay stress-free.

--And, functionally, the much easier alternative here isn't to re-implement your old API on top of your ground-up rewritten shiny new service, it would be to reduce your old service to an API shell and proxy any requests to the new service in the new format. Far less work that way. Use more traditional API versioning for the much more common updates. Unless you're rewriting your services every other week, in which case you have an entirely different problem ;-)
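As a hedged sketch of that "API shell" idea in Python (the URL and field names are placeholders; a real shell would sit behind whatever framework the old service already uses):

  # The old service keeps its endpoint but just translates and forwards
  # requests to the new service. NEW_SERVICE_URL and the field names are
  # placeholders, not any real service.
  import json
  import urllib.request

  NEW_SERVICE_URL = "http://new-service.internal/v2/users"

  def handle_legacy_create_user(old_payload):
      # translate old request shape -> new request shape
      new_payload = {
          "first": old_payload["name"].split(" ")[0],
          "last": " ".join(old_payload["name"].split(" ")[1:]),
      }
      req = urllib.request.Request(
          NEW_SERVICE_URL,
          data=json.dumps(new_payload).encode(),
          headers={"Content-Type": "application/json"},
          method="POST",
      )
      with urllib.request.urlopen(req) as resp:
          new_response = json.load(resp)
      # translate new response shape -> old response shape
      return {"id": new_response["id"],
              "name": f'{new_response["first"]} {new_response["last"]}'}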


> You answered that one yourself :) Deprecation means to publicly notify that some API endpoint has hit end-of-life, and that a better alternative is available.

That's exactly what the contracts do. The build would fail integration because the client is expecting the older version. This does mean that we don't have multiple incompatible versions of the service in the field, which is a mixed blessing.

> If you completely rewrite a service, it's your responsibility to implement the same interface that you had before on top of it.

Duplicate effort for redundant functionality?

> Then you deprecate it and also publish your new api or data schema. Once you get around to migrating the rest of your application's services away from the deprecated endpoints, the next version of the microservice in question can remove that old code entirely.

Which is exactly what the contracts do, wait until the services have started using the new API before the build is approved. The ONLY difference is that we can remove dead code and apis immediately, rather than doing it in the proverbial "tomorrow".

> Imagine Twitter or AWS completely rewriting their backend -- if they were to announce to the public that at a specific time, their old API URLs would 404 and the new ones would go live, it would be a wreck

None of these services are public facing. The public facing services are exposed via a border-api (which does indeed have a versioned api).


Vers is a really interesting idea. Is there anything analogous to this in python?


I'd love to hear back if you find something like this. Up until Vers was released, I lazily boilerplated it into whichever of my services needed the functionality. Even so, the pattern has been such a boon to our development cycles that it seems strange others haven't independently come to the same approach.


IMO this makes more sense.


We deploy every microservice separately and their API has to remain stable, because that is how APIs should work: remain stable. If there is a breaking change, it has to be forward and backward compatible for a while. Each microservice is blue / green deployed, not all of them as one.

Also look into the `repo` tool by AOSP for managing many repositories.

At Clarify.io, we have about 60 repositories and 45 services we deploy.


Keeping the API stable definitely makes sense.

How do you define what is considered the current version for each microservice in your application?


Each team owns their own microservices. They deploy when ready. Usually this means master is what is deployed. If another team uses that API, then the owner of the service is responsible for maintaining that compatibility until they are able to get the other teams to update their usage of the API.


Sorry for any confusion, this is regarding continuous delivery pipelines, not manual release management.


Looks like I can't reply to you due to comment depth, so I'll reply here.

The reason for being able to push "breaking" changes into the codebase is for future consumption. We control the consumers, which verify that they are able to integrate.

When all consumers have had their contracts updated to be in sync with that service, then that service is no longer breaking - even though it hasn't been changed.

It's basically a way of preventing inter-service blockers.

I must admit, I thought these kind of pipelines would have been a lot more commonplace than they appear to be.


They are very complicated pipelines to model. It is simpler if you simply treat other consumers of your services as customers who can't change their code at every whim. They've invested time and money in to your API, and are going to be pissed if you change it all the time.

Obviously this isn't the exact scenario you're in at the moment, but if you're big enough -- this is exactly what it'll be like.

Implement backwards compatible changes and implement tracking on the users of the old features you want to disable. When your tracking shows nobody is using the old features, delete them.

That is how we manage it.


We use a continuous delivery pipeline. This isn't a manual process, but a team isn't going to (by policy) deploy a breaking change which breaks their API for other users of that API.


We use the Jenkins build number and lock down the deploy job.

It's simple and provides traceability as you can go into Jenkins and see the commits associated with that release.


You don't really control the composition that closely. The whole point of having a lot of microservices is that you can update and work on very isolated pieces of functionality independently, in a distributed fashion.

As other people have noted in here as well, you should always keep the interface backwards compatible; if needed, make a second version of the API or the messages, but you should never really have to deploy more than a couple of services that have actually changed their behavior. The ones just interacting with those services should experience the same interface, whether it is a couple of versions older or newer.

I'd recommend watching the muCon videos [1] or Udi Dahan's Advanced Distributed Systems Design course [2] for more in-depth reference material. If you're transforming a team of engineers, I can really advise joining the latter and afterwards ordering the videos so you can use them with your team as training material! This is less about microservices being micro and more about setting up a distributed service-oriented architecture.

[1] https://skillsmatter.com/conferences/6312-mucon#skillscasts

[2] http://www.udidahan.com/training/


Many thanks for the links. I'll definitely review them.

I suppose the synchronisation of consumer and service compatibility is the biggest concern.

So far, everybody is focusing solely on backwards compatibility, but not future compatibility, which is what the contracts are for.

With regards to the backwards compatibility - breaking changes happen! As long as the service remains functional, and remains compatible with its consumer contracts (which can also change), I shouldn't need to worry about deprecating APIs. Anyone keeping deprecated functionality around in an environment where they control both the services and the consumers is simply asking for problems.

I can't see how it can be possible to not control the composition of microservices. Surely that's exactly what my CI pipeline is doing? Composing a network of compatible services?


/api/v1 ---> /api/v2 if you have breaking changes. You should expect things to be loosely coupled and potentially be updated independently. Maintain backwards compatibility by versioning things like the above.


Having the version in the API URL is one way, but there are other methods, examined by Troy Hunt here:

http://www.troyhunt.com/2014/02/your-api-versioning-is-wrong...


Sounds like you manage this manually. Why not have a CI server (travis, jenkins, w/e) build each microservice separately (e.g. after a push to master) and then attempt an app release with all microservices?

You can also do parametric jobs in jenkins which could allow combining arbitrary microservice versions.

Or just version your APIs and declare explicitly what microservice uses what version of the API to communicate with the other service.


We use jenkins for testing the individual microservices, and the application integration as a whole.

The approach is similar to parametric builds in jenkins. The problem is the management of the parameters.

For example:

msA v1.0 & msB v1.0 are individually green, and pass all integration tests. Green build for app.

msB is updated to v1.1. Passes individual tests but fails integration with msA v1.0. Red build for app.

msA is updated to 1.1. Passes individually. Still fails integration with msB v1.1. Red build for app.

However. msA v1.1 is still compatible with msB v1.0, so we could have a green app build with a newer version of msA.

Automating this process is what is becoming cumbersome. We have many more services in dev at any one time.


This sounds like a constraint satisfaction problem. If you want to go all hardcore on this (and willing to pay the complexity price), then you can devise a system that can, on every new build of every new service, try to find the optimal combination of service versions that "work together nicely". Here are some data points that you'd need to implement this:

- What other services each service depends on

- What previously released versions of services worked with what versions of dependencies

- Ability to automatically test new versions of each service against its dependencies

Then you plug all that into a solver that runs on every build, and voila, you have your Microservice Dependency Constraint Satisfaction Manager Machine™. If you do go this route, it would make for a great open-source project.
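For what it's worth, a deliberately naive sketch of such a solver in Python; candidate_versions and compatible() stand in for your build metadata and integration-test results, and a real implementation would prune the search instead of brute-forcing it:

  from itertools import product

  def newest_green_composition(candidate_versions, compatible):
      """candidate_versions: {service: [versions, newest first]}
      compatible(assignment) -> bool, where assignment is {service: version}.
      Returns the first (newest-leaning) assignment that passes."""
      services = sorted(candidate_versions)
      for combo in product(*(candidate_versions[s] for s in services)):
          assignment = dict(zip(services, combo))
          if compatible(assignment):
              return assignment
      return None

  # Example with the msA/msB scenario from above:
  versions = {"msA": ["1.1", "1.0"], "msB": ["1.1", "1.0"]}
  known_good = {("1.0", "1.0"), ("1.1", "1.0")}  # (msA, msB) pairs that pass
  pick = newest_green_composition(
      versions, lambda a: (a["msA"], a["msB"]) in known_good)
  # -> {"msA": "1.1", "msB": "1.0"}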


Others already pointed a lot of things out. I will throw in my 2 cents.

Microservices should not be directly talking to each other. This couples them in a way that a small API change can be breaking. Instead use a messaging solution so that each service is passing messages and grabbing messages off a queue to do work. This is the easiest way to prevent coupling. It also allows you to version messages if need be and you can deploy a new service to consume those messages. We use JSON, so we can add to a message with no ill effect, and we are careful about removing any required attributes. So we haven't had a need to version messages, but the ability is there if we find it is needed at some point.
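As a rough illustration of the versioned-message idea (the schema and the field rename below are made up), a consumer can normalize whatever it pulls off the queue before doing any work:

  import json

  def send_email(recipient, subject):
      print(f"sending to {recipient}: {subject}")  # stand-in for the real ESP/SMTP call

  def upgrade(message):
      """Normalize any message to the newest schema before doing work."""
      if message["version"] == 1:
          body = dict(message["body"])
          body["recipient"] = body.pop("email")    # pretend v2 renamed "email" -> "recipient"
          message = {"type": message["type"], "version": 2, "body": body}
      return message

  def handle(raw):
      message = upgrade(json.loads(raw))
      if message["type"] == "send_email":
          send_email(message["body"]["recipient"], message["body"].get("subject", ""))

  handle(json.dumps({"type": "send_email", "version": 1,
                     "body": {"email": "a@example.com", "subject": "hi"}}))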

Adding messaging does increase complexity in some ways, but once you pass having a handful of services this is the easiest way to manage it.

As a side note. In our solution we have an API that the website and soon to come mobile app tie to. That API interfaces directly with some data schemas but in many places it simply adds messages to a queue for processing.


I don't get it, that's still an API.

All that will happen is that it won't be able to process the message, instead of not being able to serve the request.

This thread seems to be full of mad people determined to make simple concepts incredibly complex.


This.

We use microservices but we do allow services to call services.

All endpoint and data contracts exist separately to services and are used by both the service provider (server) and a consumer so all internal contracts are enforced.

That and a fuck ton of integration testing make it work.

And to be honest I really don't like the architecture. It's much much much better to serve hundreds to thousands of smaller instances of your application than one massive thing divided into lots of services.

This is a ghetto I wish we weren't in to be honest.


I have seen places where it makes sense to call directly to another service, and we violate the message rule sometimes ourselves. But in general I really try to avoid it: when we started we had a lot of it, along with the interdependencies that come with it. With that solution, build and deployment became worse than just having a monolithic application, and we were ready to kill each other.

I'd also say microservices are not for every problem; too many people are applying them to the wrong problems (but that is how we all generally learn limits). And a web/app API is generally not where I'd deploy microservices, but the backend services for that API would make sense. That is exactly what we built now that we are on version 3 of this architecture. The API for us is now a totally different beast than the microservices, and the whole solution has become much more manageable with way less churn and confusion. Not to mention deployments are way easier.


I also wonder about the real need to deploy mix and match versions. I like using a mono-repo, actually continuously integrating and then blue-green deploying any set of services that has passed my acceptance criteria. Roll the full set forward, or roll them all back, but I don't want to get into a situation where due to keeping some changes and rolling back others I end up with an untested combination of services deployed.

If I had to do that then I would definitely be looking at versioned message schemas, or contracts like Pact/Pacto, but it seems much simpler to just deploy a full set of tested and known to work together services at once than to version each one.

Someone else referred to this pattern as a distributed monolith but I disagree. I still gain the benefits of scaling the size and number of instances of my micro services independently, but remove a lot of the burden of independent versioning and keeping track of the version compatibility matrix.

For anyone interested in a simpler path I think there is a lot of useful information in the book Continuous Delivery by Jez Humble and David Farley.


To me there is a distinct difference between an API and a microservice. A microservice should be self contained and have all the data it needs to do its job either passed to it or within its own data context (database/cache etc). This is what makes it simple to scale, manage and keep isolated.

The easiest example of a microservice is an email service: all it does is pick up an email off a queue, send it out the door, and then send a message that it is complete, possibly with a return from the ESP or SMTP communication.

So keeping it super small, limited functionality makes it easy to debug and simple to scale and manage.

An API might have dozens of endpoints to serve its client(s) and many times cannot be async in nature as a client needs a response now. For example, logging into a website or application, I need to know if the user is authenticated now, not wait for a message to make its way through a series of services. So I don't consider an API and microservices to be the same, although I agree some functionality could be used in either one sometimes, so it can be a little confusing.

The only thing that I do agree they share is a contract, in the case of the API it is the endpoint and data, in the case of the service/message it is the message itself.


Email is one of the few parts of any system that always needs to use a queue, because of the need to be asynchronous and to occasionally delay/retry.

It's not a great example.

Your distinction might make sense to you, but it's technically wrong. An API means a specific thing. You're almost saying that a car door and a house door aren't both doors. A library has an API and a directly called microservice has an API and a message based microservice has an API. A 'contract', as you call it, is an API.


> This thread seems to be full of mad people determined to make simple concepts incredibly complex.

all microservice architectures look like that sometimes after a day of fighting them.

source: have ~30 microservices in a project at $WORK


How do you control the development of independent libraries so that there's no incompatibility problems? Really, people have been solving this exact problem for decades. The fact that you changed a word there for something more specific does not change the world.

The answer, of course, is that you version it. Not put in version control, but manually assign version numbers to it.

You try to make it possible to use several versions at the same time, but that's not always possible. If you have to use only one version, make sure not to make any incompatible changes in a single step: first you deprecate old functionality, then some time later you remove it. Sometimes that's impossible; it's natural, but it'll hurt anyway, so keep those times to a minimum.

Also, make sure you mark your versions differently for features added and incompatible changes, so that developers can express things like "I'll need an API newer enough for implementing feature X, but old enough so that feature Y is still there".
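That last point is basically semver. A minimal sketch of the "new enough but not too new" check, assuming plain major.minor numbers where a minor bump adds features and a major bump breaks compatibility:

  def parse(v):
      major, minor = v.split(".")
      return int(major), int(minor)

  def satisfies(available, required):
      """available has the features of `required` (>=) and has not made
      incompatible changes past it (same major version)."""
      av, req = parse(available), parse(required)
      return av >= req and av[0] == req[0]

  satisfies("2.4", "2.1")   # True: the 2.1 feature is there, nothing broken
  satisfies("3.0", "2.1")   # False: the major bump may have removed what we rely on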


Have you considered keeping all your code in one git repo? That allows you to know exactly how code dependencies fit together.

This approach is almost certainly not a robust, long-term solution, but it has served us well for a couple years, allowing us to evolve our APIs quickly without spending any of our early dev effort on internal versioning.

Whether it's appropriate for you comes down to your reason for using microservices in the first place.


We don't share any code between the microservices.


There is no shared code (aside from dependencies, which are managed independently) - the "master" repo is simply a list of git refs (by way of submodules), and a script to bring up the services.

Microservices must be compatible with each other. We can't simply bring up the latest version of "microservice A", because the consuming "microservice B" may not have been updated to account for API changes (which are enforced by testing contracts). That's what this master repo is for: to track which microservices work with each other.

Obviously, the master application is dependent on the microservices. Microservices are dependent on specific versions of other microservices, etc. That is the problem I am trying to solve.


You need to keep your interface contracts more stable, so you do not have to deploy all at once. You are building a distributed monolith if you have to deploy them all at once.


> We can't simply bring up the latest version of "microservice A"

This should not be the case. Build your services to be backwards compatible. Avoid making breaking changes, but when you do, make a new API version & maintain the previous one until it's no longer used.

i.e. /v1/blah

-- breaking change introduced --

/v1/blah - maintains backwards-compatible behavior

/v2/asdf - new behavior exposed as v2; update microservice B at some point in the future to use this
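A toy sketch of keeping both versions mounted side by side until nothing calls v1 anymore (paths and payload shapes here are invented; any web framework does this more cleanly):

  from http.server import BaseHTTPRequestHandler, HTTPServer
  import json

  def blah_v1():
      return {"name": "Ada Lovelace"}               # old, backwards-compatible shape

  def blah_v2():
      return {"first": "Ada", "last": "Lovelace"}   # new shape exposed as v2

  ROUTES = {"/v1/blah": blah_v1, "/v2/blah": blah_v2}

  class Handler(BaseHTTPRequestHandler):
      def do_GET(self):
          handler = ROUTES.get(self.path)
          self.send_response(200 if handler else 404)
          self.send_header("Content-Type", "application/json")
          self.end_headers()
          if handler:
              self.wfile.write(json.dumps(handler()).encode())

  # HTTPServer(("", 8080), Handler).serve_forever()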


We have contracts to ensure API compatibility. That means we CAN simply rip out the old API and put in a new one, but the integration will fail until the consumer contracts sync up.

As these are internal services (microservices) that only we consume, there is no need to keep old cruft around.



An interesting article, but it doesn't deal with the versioning of the microservices - only the orchestration of those versions.


Versioning interfaces (APIs) is much more important than versioning the software that implements them.

Consider Semver for your interfaces. This is really important.


First, there is a major difference between the component (e.g. the software package) and the connectors (e.g. APIs). It makes sense to talk about versioning at the component level, but it should rarely apply at the API level. See this excellent article for further clarification: https://www.mnot.net/blog/2011/10/25/web_api_versioning_smac....

Versioning an API is a decision by the API provider to let the consumer deal with forward and backward compatibility issues. I prefer focusing on the link relations or the media type, rather than the URI or some other versioning technique, because it is consistent with treating the link relations as the point of coupling for your API (in hypermedia-based APIs), which makes managing and reasoning about changes to your API less complicated.

Whenever possible, hypermedia-based media type designers should use the technique of extending to make modifications to a media type design. Extending a media type design means supporting compatibility. In other words, changes in the media type can be accomplished without causing crashes or misbehavior in existing client or server implementations. There are two forms of compatibility to consider when extending a media type: forward and backward.

Forward-compatible design changes are ones that are safe to add to the media type without adversely affecting previously existing implementations. Backward-compatible changes are ones that are safe to add to the media type design without adversely affecting future implementations.

In order to support both forward and backward compatibility, there are some general guidelines that should be followed when making changes to media type designs. 1) Existing design elements cannot be removed. 2) The meaning or processing of existing elements cannot be changed. 3) New design elements must be treated as optional.

In short, favor extending the media type or link relation and focus on compatibility. Versioning a media type or link relation is essentially creating a new variation on the original, a new media type. Versioning a media type means making changes to the media type that will likely cause existing implementations of the original media type to “break” or misbehave in some significant way. Designers should only resort to versioning when there is no possible way to extend the media type design in order to achieve the required feature or functionality goals. Versioning should be seen as a last resort.

Any change to the design of a media type that does not meet the requirements previously described is an indication that a new version of the media type is needed. Examples of such changes are: 1) a change that alters the meaning or functionality of an existing feature or element; 2) a change that causes an existing element to disappear or become disallowed; 3) a change that converts an optional element into a required element.

While versioning a media type should be seen as a last resort, there are times when it is necessary. The following guidelines can help when creating a new version of a media type.

1) It should be easy to identify new versions of a media type, for example:

  a) In the media type name:

    application/vnd.custom+xml
    application/vnd.custom-v2+xml

  b) In a media type parameter:

    application/custom+JSON;version=1
    application/custom+JSON;version=2

  c) In a separate header:

    * REQUEST *
    PUT /users/1 HTTP/1.1
    Host: www.example.org
    Content-Type: application/vnd.custom+xml
    Length: xxx
    Version: 2
    …

2) Implementations should reject unsupported versions.
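A tiny sketch of guideline 2 in Python (the media type and supported set are illustrative):

  SUPPORTED = {"application/vnd.custom+xml": {1, 2}}   # illustrative

  def negotiate(content_type, version_header=None):
      """Parse the version from the media type (or a Version header) and
      reject anything unsupported with 415 Unsupported Media Type."""
      media_type, _, params = content_type.partition(";")
      media_type = media_type.strip()
      version = 1                                       # default if unspecified
      for param in params.split(";"):
          if param.strip().startswith("version="):
              version = int(param.split("=", 1)[1])
      if version_header is not None:
          version = int(version_header)
      if version not in SUPPORTED.get(media_type, set()):
          return 415, None
      return 200, (media_type, version)

  negotiate("application/vnd.custom+xml;version=2")     # (200, ('application/vnd.custom+xml', 2))
  negotiate("application/vnd.custom+xml;version=3")     # (415, None)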

Hope this helps!



