Ask HN: What's your biggest struggle with Microservices?
78 points by mymmaster on July 25, 2017 | 73 comments
I'm working on a book about microservices and want to know what questions and topics you'd like to see covered.

What were your biggest hurdles when first adopting microservices? Was it hard for your team to determine service boundaries? Did you struggle with developing and managing them?

Convincing people that microservices are not a cure-all but just another design pattern.

You have to start out with a monolith, and only extract components into services (micro or not) if you realise along the way that they would work better that way. Until then, commonplace modularisation will serve you just fine.

Once you have more than one microservice running, infrastructure becomes a huge problem. There's no real turnkey solution yet for deploying and running internal / on-premises microservices. You basically have to build monitoring, orchestration, and logging infrastructure yourself.

One rule of thumb is "one team, one service". If you have multiple teams working on a service, then it might start making sense to migrate to multiple microservices.

That's a nice rule of thumb but it's radically different from the team I work on. We run half of one logical service (visible to users outside the team), but that logical service is implemented with a dozen or more microservices internally. We have user-facing services which need high priority, different flavors of batch jobs which need to be orchestrated and prioritized, and various other pieces of infrastructure.

The number of different services is close to the number of people on the team.

This is working very well for us, and it provides us with some welcome isolation when there are problems with one of the microservices. Maybe we can go into read-only mode or stop processing batch jobs for a while, depending on what services have problems.

But we also have good infrastructure support, which makes this a lot easier.

Yeah, I could totally see how, if you have very strict uptime requirements and you want to allow different pieces of the infrastructure to be able to go down at different times, it's an exception to the rule.

But for every team I see that has a good use case for microservices, and does the hard work of instrumentation and deployment, I see 8 teams that go with microservices because they think it's a magic bullet. Then they don't spend the time and effort necessary to get instrumentation and orchestration up and running. They don't aggregate logs, they don't spend the time to create defined contracts between the services, they don't make services robust to the failure of other services. They just complicate their debugging, deployment, uptime, and performance scenario without getting any of the benefits.

In this case, we don't have strict uptime requirements. But there are enough times where our integration tests don't catch some kind of error, and it's nice that the service doesn't have to go completely down for that.

It's also a lot easier to prioritize process scheduling than it is to prioritize thread scheduling.

I would make an entirely different case.

I had a system where the main running cost was MySQL. It turned out that I needed to provision a lot of MySQL capacity because there was one table that had a high rate of selects and updates.

The hot table did not use many MySQL features and could easily be handled by a key-value store with highly tuned data structures. That's a place where a "microservice" which has its own address space, if not machine, makes it possible to scale the parts of the system that need to be scaled without scaling the rest.

I see no problem in a "monolith" using different kinds of databases.

Sure, but peeling off something that has radically different scaling properties is easier if you put a "web service" in the way.

Still, based on what you're describing, it was MySQL that was the bottleneck in your case. In that case you scale the database, either by sharding or by creating separate ones for different use cases. That's irrelevant to whether you use microservices or not.

I don't understand why putting a web service in front of MySQL and a key-value store makes scaling easier. Would you mind explaining?

Simple. Most of the database is "large" in terms of data (say 50M rows) but that database gets maybe 10,000 updates a day. The read load is well-controlled with caching.

One table is small in terms of data but involves an interactive service that might generate 50 updates/sec at peak times.

With the "hot" service implemented on top of a key-value store, the database is the ultimate commodity, I have many choices such as in-memory with logging, off-heap storage, distributed key-value stores, etc.

The service is not "in front" of MySQL, but is on the side of it so far as the app is concerned.
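
A minimal sketch of the idea (all names hypothetical): the hot counter lives behind its own narrow service interface, so the backing store can be swapped from MySQL to an in-memory or distributed key-value store without touching the rest of the app.

```python
class HotCounterStore:
    """Hot-table service: small data, high update rate.

    The backing store is a plain dict here; in production it could be
    Redis, off-heap storage, or a distributed KV store. Callers only
    ever see this narrow interface, never the storage engine.
    """

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key, 0)

    def incr(self, key, delta=1):
        # 50 updates/sec at peak is trivial for an in-memory store,
        # but would force over-provisioning a shared MySQL instance.
        self._data[key] = self._data.get(key, 0) + delta
        return self._data[key]

store = HotCounterStore()
store.incr("page:home")
store.incr("page:home", 4)
print(store.get("page:home"))  # 5
```

The point is the interface, not the dict: anything that can serve `get`/`incr` can sit behind it, and only this one piece of the system needs to scale with the hot workload.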

Agree - the biggest gain microservices give you is that they allow many teams to make progress together. If you don't have many teams, the overhead may not be worth it.

That's a pretty good rule of thumb - nice

I agree with you that these things should probably start with monoliths and then migrate to microservices.

I'm lucky that the project that I'm working on has support to use JHipster (https://jhipster.github.io/) with microservices deployed to OpenShift (https://www.openshift.com/). I used MiniShift to test my deployments, metrics, monitoring, orchestration, and logging locally. This was mostly for a proof-of-concept.

We should upvote this to the top, it's a pretty good summary of the whole situation.

Our org made all the right decisions at the top, then invented their own microservice fabric stuff with way too much tech debt. The end result was a very bad (inefficient and painful) development experience with way more technical debt than the monolith we replaced.

There was so much fragmentation between each autonomous team that we were forced to standardize on a common framework (think of build/integration pipelines and common modules) which could almost never be changed or refactored.

The upside was that there was way more automated testing and flexibility in deployment, so overall it was a net gain.

Still, the actual ergonomics of working with the build/release system was so painful it made me leave to work somewhere where I could do actual innovation.

There was so much waste already 'baked into the system', and the other teams had already finished their work, so if you struggled with it, it was your problem.

Overall a net gain, but it made you leave?

That definitely does not sound like an overall net gain.

It was a net gain for them..

They were doing even worse before. They upgraded to a CD-style pipeline deploying to AWS instead of monolithic Java apps. It was better, but nobody realized there was deep and very real tech debt. The complacency around this was a big factor in causing me to seek new challenges.

The biggest pain points were always not the OSS stuff that was adopted, it was all the “NIH” innovations that were developed because the OSS stuff wasn’t quite right, stuff that “worked” but had no support or allowances for improvement.

Integration testing, dev environments, and data sharing.

When we started switching over to microservices, originally we were going to have some sort of standardized message format, which would include all of the info a service would need to do its job.

However, there were too many people building it too quickly, and it's since devolved into a spaghetti pile where almost every microservice has about 3-4 others on which it is dependent (and a handful on which those are dependent) - to query for some bit of data, set off an async task, etc. This obviously complicates the development environment significantly.

When I want to add a feature to a service it can take hours trying to bring up the correct dependencies, update their configuration, fix the errors that they're returning due to their dependencies being down, and set up the necessary databases and ssh tunnels.

Testing and debugging inter-service concerns. These, in my opinion, are inherently difficult to do when working with this architecture. In other words, a lot of ugly real-world complexity hides in the space between the neatly maintained gardens that are the insides of individual services, giving a somewhat false impression that the problem has been successfully chopped up into small, composable, and manageable pieces.

I've found (the hard way) the same thing.

Testing of individual microservices in isolation is insufficient. Interaction between services has to be tested as well (because real APIs never work exactly as documented/mocked). Proper integration tests aren't any easier than testing a monolith.
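
A tiny sketch of what that can look like in practice (all names and fields hypothetical): the consumer asserts against a real HTTP response rather than a mock, so shape mismatches between services surface at test time instead of in production.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Stand-in for the provider service; in a real integration test this
# would be the actual service running in a test environment.
class UsersHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = json.dumps({"id": 42, "name": "alice"}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):  # silence request logging
        pass

server = HTTPServer(("127.0.0.1", 0), UsersHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Contract check: the consumer asserts only on the fields it actually
# uses -- the kind of check that mocks silently get wrong.
url = f"http://127.0.0.1:{server.server_port}/users/42"
with urllib.request.urlopen(url) as resp:
    status = resp.status
    payload = json.load(resp)

assert status == 200
assert isinstance(payload["id"], int)
assert isinstance(payload["name"], str)
print("contract ok")
server.shutdown()
```

This is the cheapest rung on the ladder; full integration tests additionally exercise timeouts, retries, and version skew that no in-process check can reach.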

This is interesting. Can you provide an example - where it wouldn't have made sense to collapse two such interacting services into one? My understanding is that an important litmus test for carving up microservice boundaries is ensuring 'true' isolation/separation of concerns. Admittedly, I don't have enough practical experience to know if this is a misguided assumption.

Someone here mentioned a great rule of thumb that addresses most of the issues here: one team = one service.

This is kind of a weird rule, since a microservice is said to contain ~1k LOC. And having a full team (rather than one person) for 1k LOC is... well, how can you screw up so little code? The hard part isn't the teams. The hard part is knowing what a microservice should contain and still be useful.

My biggest struggle is people wanting to use them. I can't tell you how many junior devs I've run into who dream of every single function call becoming a distributed read/write on the message bus.

It's tough to fit this in a small comment, but I would say avoid microservices until you know they are the right solution. They are often the right solution when you have multiple teams with independent development schedules. They are often the right solution if you have daemons that have fundamentally different characteristics (web server vs message consumer).

They are often not the right solution when you decide you need another module in your app.

Distributed systems are moving parts. Moving parts break. You shouldn't have more than necessary. Deciding what "necessary" is is a pretty tough task.

We started with a monolith, which made determining service boundaries that much easier. What we struggled with the most was the concept of sharing data between services. We started off with a publish/subscribe method, which quickly became very strenuous to maintain. We ended up switching to plain service-to-service communication over HTTP/S. This allowed us to focus on the resiliency and performance of each microservice more than before, and so far it's been paying off great. Other things, like having an overall view of the health of the entire system and having a way to consume and analyze logs in one place, are absolutely detrimental to success in this type of architecture.

"Ended up switching to plain service-to-service communication over HTTP/S."

That statement assumes this picture:

  ServiceA -> Network (https) -> ServiceB
I'm curious, what happens to the message being sent from ServiceA if Network or ServiceB is down?

We have a fraction of inter-service communication still flowing through pub/sub via RabbitMQ. This is typically done in cases where we absolutely cannot lose a message. Most of the communication, however, is via HTTPS, with each microservice in the mix providing guaranteed high uptime, reasonable response time, and some way of handling various failure conditions (e.g. fallback to cache, exponential backoff, etc.). So while you do lose your message in this scenario, you'd need a very catastrophic network failure for it to have a wide-spread impact. We occasionally see blips, which are dealt with via retries. So this has worked out fairly well at our not-so-crazy scale. I personally believe there is no right or wrong here; it all comes down to what your team is comfortable with. I think pub/sub is a great approach if you're willing to sink some time into it. Synchronous communication also has some very interesting development by companies like Netflix (Ribbon?).

That's why I like Foxx microservices/arangoDb - but it's really more of something you need to do from the ground up


instrumental is the right word lol :)


For us, it was managing them, in all senses (development, logging, secure communication between them, etc.). Kubernetes (and Calico for networking) helped dramatically to address these problems. I finally really see the value in microservices; before we started using Kubernetes, all my attempts with a microservices approach made me hate them.

One of my biggest hurdles was trying to figure out if microservices were right for us (a small distributed team) and then communicating/selling the benefit of them to my team and bosses.

This involved creating a mutually agreed upon definition of what we actually meant by "microservices". I literally repurposed paragraphs out of Sam Newman's book to do this.

As a small team, I also didn't have examples to support my argument that microservices would benefit us.

What did you end up deciding? And what were you thinking the benefit might be?

My sense is that for a small team a microservices architecture might be more trouble than it's worth, but I can definitely see how for some kinds of applications - or even kinds of teams - it could be a good fit.

Integration testing. Unit tests are not enough. You need to test the interactions between microservices at runtime because there are a lot of things outside of the code that can break, such as version incompatibilities, caching problems, timeouts, etc.

Running microservices is like running a city: you need roads, highways, police departments, fire departments, and so on.

Without an abstraction layer and proper planning, different teams will start building microservices with lots of common functionality like security, authentication, logging, and transformations, which over time will increase the complexity and fragmentation of the entire system. Ideally you would want services to simply receive a network request, serve a response, and delegate all the complementary middleware execution somewhere else, like an API gateway.
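
The delegation idea can be illustrated with a toy middleware chain (a hypothetical sketch; real gateways like Kong do this at the proxy layer, not in application code):

```python
def require_auth(handler):
    """Gateway-level auth: services never see unauthenticated requests."""
    def wrapped(request):
        if request.get("token") != "secret":
            return {"status": 401, "body": "unauthorized"}
        return handler(request)
    return wrapped

def log_requests(handler):
    """Gateway-level logging, applied uniformly to every route."""
    def wrapped(request):
        print(f"-> {request['path']}")
        return handler(request)
    return wrapped

# The service itself stays trivial: receive a request, serve a response.
def users_service(request):
    return {"status": 200, "body": "user list"}

# The gateway stacks the shared middleware once, for every service,
# instead of each team reimplementing auth and logging.
gateway = {"/users": log_requests(require_auth(users_service))}

resp = gateway["/users"]({"path": "/users", "token": "secret"})
print(resp["status"])
```

Because the cross-cutting concerns live in one place, changing the auth scheme or log format touches the gateway, not every service.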

Traditionally API gateways in a pre-container world were centralized in front of an API, but modern API gateways can also run in a decentralized way alongside your server process, to reduce latency.

Then once you have everything in place, you end up with a bunch of microservices that your developers (maybe belonging to different departments) will have to use. You will need to fix discoverability, documentation and on-boarding. Some of these services may be also opened up to partners, so there's that too. Traditionally developer portals were only being used for external developers, but they are becoming a requirement for internal teams as well.

Finally, you need to carefully monitor and scale your services. Without a proper analytics and monitoring solution in place, it will be impossible to detect anomalies, scale each service horizontally and independently, or have any idea of the state of your infrastructure.

Generally speaking running microservices can be split in two parts:

1. Building the actual microservice.

2. Centralizing common functionality, documentation/on-boarding and analytics/monitoring.

Most of the developers or project managers I regularly speak to tend to forget the second part.

Disclaimer: I am a core committer at Kong[1]

[1] https://github.com/Mashape/kong

This. I really like the analogy of running a city. There are just so many concerns in and around building and running microservice-architected systems: security, logging/analytics, failure modes/recovery/detection/self-healing, message protocol versioning, data persistence, capacity scaling, and many more. Many of these are easy to defer or overlook early in the development stage, but will later become very hard, if not impossible, to incorporate.

Btw, we thank you and your team as we are Kong users as well. :)

Kong is an amazing piece of microservice architecture. I've been very happy with how it behaves.

The development environment becomes somewhere between extremely hard and impossible to set up. Resign yourself to running tests locally, with staging as the only place to test your services talking to each other. This is my experience at a medium-size public company in the Bay Area.

This is common, but it shouldn't be. If you have setup scripts to run each service, or if they're containerized, it's just a little additional work to get to where you can spin them up with a single command.

Docker has been a lifesaver in this realm. If you do the work to containerize each service (a not-insignificant upfront cost), getting all your services running on a local box is as simple as a `docker-compose up`.

Orchestration: Keeping track of what versions of what services need to be live, where services should find the correct versions of the other services they need to talk to, spinning up staging, production, and test versions of services, feature flagging services, etc.

On a technical note, distributed transactions are tough to get right.

Authentication / Authorization. If you end up with a round trip to your big iron anyway, the benefits get muted.

Microservices must come with the same mantra as bomb defusal:

1. Your first mistake is wanting to (be a bomb defuser) use microservices.

2. Your second mistake is (getting close to the bomb) starting to build a microservice infrastructure.

3. Your third and worst mistake is believing (you can defuse the bomb safely) you can build a microservice solution correctly.

The only reason to (be a bomb defuser) use microservices is that (you are part of the police or army and somebody must do this) you ALREADY HAVE SOLVED THE MAIN THINGS FOR YOUR APP and are now in the hostile territory where microservices truly make sense. You are Facebook-like.

Totally agreed. A few months ago we started building a microservices infrastructure. We had held off for years, until things started getting harder and harder to maintain within one app. This is really costly at the beginning if you want to do it correctly.

Convincing people that "stateless" really does mean "stateless", and if you hold state, you're going to have a bad day when your service is stopped.

Latency is another big one - the communication latency between any two points is minuscule, but throw together even 4 services, and suddenly the time required to encode, route, transmit, receive, and decode adds up quickly. If you want to keep round trips to under 100ms in your own code, each microservice you add could easily subtract 10ms from that budget.
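
The arithmetic is worth making explicit (the 10ms-per-hop figure is illustrative, not measured):

```python
# Rough latency budget: each service hop pays for encode + route +
# transmit + receive + decode on both legs of the call.
PER_HOP_MS = 10
BUDGET_MS = 100

def remaining_budget(services_in_path):
    """Milliseconds left for actual work after paying the hop tax."""
    return BUDGET_MS - PER_HOP_MS * services_in_path

print(remaining_budget(4))  # 60 -> only 60ms left for real computation
```

At 4 hops you have already spent 40% of a 100ms budget on plumbing, which is why deep synchronous call chains between microservices are usually the first thing to flatten.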

I'm privately skeptical about starting with microservices-first for new application development. I have seen a lot of literature about starting off with a monolith and then breaking that down as you see fit. This makes more sense to me than introducing a ton of complexity up front.

Or is it better to deal with that complexity up front and then adding new services/functionality will be easy as time goes on?

I'd recommend starting with a monolith, but pay extreme attention to properly separating modules and hiding them behind a small, well-defined API. Then it will become easier to separate these modules out as necessary.

Whether you work with a monolith or microservices, decoupling is the key, and the best way to ensure decoupling is using the proper abstractions and APIs.
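
A sketch of what "a small, well-defined API" might look like for one monolith module (module and function names are hypothetical):

```python
# billing module -- the entire public surface is these two functions.
# The rest of the monolith imports only these; everything else in the
# module stays private. Extracting billing into a service later means
# re-exposing this same small surface over HTTP, nothing more.

def create_invoice(customer_id, amount_cents):
    """Create an open invoice; callers never touch internal tables."""
    return {"customer_id": customer_id,
            "amount_cents": amount_cents,
            "status": "open"}

def mark_paid(invoice):
    # Returning a new dict (rather than mutating) keeps callers from
    # depending on shared in-memory state -- a habit that survives
    # the jump to a network boundary.
    return {**invoice, "status": "paid"}

invoice = create_invoice(customer_id=7, amount_cents=1250)
print(mark_paid(invoice)["status"])  # paid
```

The discipline, not the deployment unit, is what makes later extraction cheap: if callers only ever used the two functions, a service can replace the module behind the same contract.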

Services need to communicate with each other, and 100 things can go wrong in that. A lot of effort and fail-safe mechanisms have to be put in place for it to scale. Having said that, there are lots of solutions out there, but they do require a good amount of understanding and work.

Just some topics, can elaborate on those later, if you want:

- Development Setup (e.g. minikube)

- Circuit Breakers

- Managing Backpressure

- Service Discovery

- Encryption

- Monitoring / Metrics

- Domain Translation

- Load balancing / Fail over / retry

The biggest struggle is dealing with the dependency chain when you need to update multiple microservices in order to deliver a major new feature to customers. You have to get everyone to agree on changes to the service APIs, then change the lower level services, deploy them, then move up a level and repeat. This is especially problematic if different developers are responsible for each service; lots of opportunities for delay and misunderstandings. It seems counterintuitive, but if you need to move quickly then a single monolithic application can actually allow for more rapid changes.

One big challenge I've found is the balance between standardization vs autonomy. E.g. monitoring, log formats, deployment, service discovery, etc are things that can benefit from being standardized across all services, but implementing it can be tricky. Too much autonomy and you end up reinventing the wheel (inevitably with variations) - too much standardization (e.g. by providing centralized libraries) can make development slower and create dependency hell.

I haven't seen anyone mention developer skill yet. We tried microservices where I work, but the developers' skill in debugging async code proved to be problematic.

Long term ownership of services is always a problem.

In the enterprise you might end up with different gatekeepers, each maintaining a different service... getting changes or additions in is impossible.

Architecture, Architecture, Architecture.

The ease of developing a single service makes many engineers lose focus on the big picture. Without proper architecture and governance in place, larger microservice projects tend to fail pretty fast and in horrible ways.

I'm still not quite sure how to think about and code for eventual consistency.
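
One common coping strategy (of several) is last-write-wins convergence: each replica applies the same deterministic merge, so all replicas end up in the same state regardless of the order updates arrive in. A toy sketch, with hypothetical field names:

```python
def merge(local, remote):
    """Return whichever version of the record was written last.

    Both replicas running this same merge converge to the same value.
    Ties on the timestamp would need a deterministic tie-break in
    practice (e.g. comparing replica ids).
    """
    return local if local["ts"] >= remote["ts"] else remote

replica_a = {"email": "old@example.com", "ts": 100}
replica_b = {"email": "new@example.com", "ts": 130}

# Same result from either replica's point of view:
print(merge(replica_a, replica_b)["email"])  # new@example.com
print(merge(replica_b, replica_a)["email"])  # new@example.com
```

The mental shift is from "what is the current value?" to "what will all replicas eventually agree the value is?" - and designing reads and writes so stale intermediate states are tolerable.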

Segregation and service discovery. How to avoid multiple jumps between networks that separate Microservices. Also distributed transactions. Reliability is a huge problem with network separated Microservices.

Testing infrastructure: we have a bunch of teams that now need to run full Jenkins test suites that comprise many servers, so there is an explosion of servers × branches that makes managing all of that very difficult.

Doing microservices right, with proper data isolation, is a challenge. Once you do that, the next challenge awaits: a strategy and infrastructure to replicate the right data set to other domains.

From my experience, there are several issues that were mostly touched by the other commenters:

  * auth / authorization: easy to grasp, tough to implement, 
    opens a world of architectural wtfs
  * service configuration files / variables
  * understanding that more microservices can run on 
    the same host
  * system entry point, credential based, jumpbox
And the most important issue I keep seeing across the systems I work with is documentation and speed of onboarding new staff.

Happy to chat more. I am definitely not an expert in the domain, but for what it's worth I was technical lead (whatever that means to you) on an AWS Case Study project.

Well, I'd like more info...

What was the case-study? Can you link to it?

What has been your best resource to support/inform any of your arch design considerations?

Sure, the case study would be this https://aws.amazon.com/solutions/case-studies/rozetta-techno...

For architecture debates ( yes there were some :) ) the resources used would vary depending on the level of the issue at hand.

For inter-service communication we had a standardized API interface that would need to be exposed and most of the discussions would circle around how does the service solve a problem and what API does it expose to the outside world.

For general system workflow issues we would resort to diagrams or good old fashioned whiteboarding whose artifacts would also get converted into diagrams.

A general rule-of-thumb that I pushed was that every single service would be documented starting with the design process and ending with support info.

Hope that makes any sense :)

I've worked at at least one place that did microservices as a trendy tech salad. How do you write microservices without lots of lock-in?

Can you please do an intro that is an ELI5 on Microservices, how/why/when/where they are used etc...

Determining the difference between a service boundary and a library boundary.

New docs / APIs / terminology / versioning

Management/orchestration. It's really convenient to be able to deploy things in different docker containers, and isolation, and etc., but the actual orchestration seems to be unnecessarily painful. In no particular order:

* k8s is apparently ridiculously hard to set up if you want to run on something that isn't AWS / GCE / GAE / insert-big-cloud-provider-here.

* Many of these have questionable (IMO) default authentication choices. I can understand if it's running on an internal network or something, but if you can't set up an internal / private network to hide the management interface (ex Marathon, k8s dashboard, ...) inside, you're SOL it seems. Ex. Openshift Origin having a login UI, but the default allowing logging in with basically anything you want.

* Rancher seems nice, but between the internal DNS service randomly failing all DNS queries until a full restart and rancher-server's internal MySQL server writing to the disk at 500Mbps, it's... iffy. At least it's a nice UI/UX otherwise and supports lots of authentication options out of the box.

* Straight-up not working. For example, I tried running Openshift Origin on stock Debian and Ubuntu installations, and it couldn't even finish starting up before it crashed and burned. I did file an issue against openshift/origin about this, but so far it's been unresolved. Another example with Openshift, after figuring out how to work around it failing to even start up, it can't even create pods/containers. This is apparently a known bug that's been around for a while (~4 months) with seemingly no resolution.

* Weird docker version requirements. I can get that things like k8s would want to pin to LTS versions of docker. That's fine. But if I can't even import the signing key to install the LTS docker version because it doesn't even exist in the keyservers anymore, that sounds like an issue to me.

Maybe this would be different if I was using one of the Big Cloud (TM) providers. Who knows. But with my budget of "poor college student," OSS offerings are all I can do when my entire budget is consumed by server bills - and I can't cheap out here; a few TB of bandwidth and maxing up to 16 CPU cores isn't cheap :( - and it just seems like the "state of the art" is pretty terrible, at least from the UX perspective. I spend enough time staring at text and delving into the CLI and whatnot when I'm writing software. Why is it seemingly so hard to get even a simple UI to cover the basic functionality - create a master node, add slave nodes, deploy containers, and scale them up? Rancher seems to cover this use-case the best, but I've run into enough issues with it that I'm starting to seriously consider figuring out how to write my own orchestrator. I'd rather not, since it'd be a lot of work, but setting up these "standard" tools is a ridiculous Herculean task if you don't have a massive budget, it seems.

Ninja-edit: For a bit of perspective, I'm talking about a spare-time project that ended up gaining far more users than I ever expected. I'm the only person working on it, so trying to figure out all the development, ops, etc. on my own has been a huge struggle since so many of these tools just have a god-awful UX.

Handling common concerns in each microservice such as authentication and authorization, building resiliency in each µservice, managing each µservice configurations. This causes duplicated effort in each µservice. Need to look at service mesh (istio, linkerd).

Handling service discovery, API gateway in ever-changing kubernetes ecosystem.

Is the Building Microservices book not enough? I hope to see no overlap of ideas and topics in the new book.
