
What I Wish Small Startups Had Known Before Implementing A Microservices Architecture:

Know your data. Are you serving ~1000 requests per second at peak and have room to grow? You're not going to gain much efficiency by introducing engineering complexity, latency, and failure modes.

Best-case scenario, your business performs better than expected... does that mean you have a theoretical upper bound of 100k rps? Still not going to gain much.

There are so many well-known strategies for coping with scale that I think the main take-away here for non-Uber companies is to start up-front with some performance characteristics to design for. Set the upper bound on your response times to X ms, over-fill data in order to keep the bound on API queries to 1-2 requests, etc.

Know your data and the program will reveal itself is the rule of thumb I use.




The main benefit of microservices is not performance; it is decoupling concerns. As scale goes up, performance goes down and decoupling gives a better marginal return: it's often easier to work with abstract networking concerns across a system than it is to tease apart often implicit dependencies in a monolithic deployment.

Of course, this is context-sensitive and not everyone has an easily decouplable code base, so YMMV. But criticism that doesn't recognize the myriad purposes isn't very useful.


I think it's worth noting that it doesn't just split the code. It also splits the teams. Five engineers will coordinate automagically; they all know what everyone else is working on.

At perhaps 15, breaking up into groups of 5 (or whatever) lets small groups build their features without stomping on anyone else's work. It cuts not only coupling in the code, but coupling in what engineers are talking about.

There are two fairly obvious risks. The first: those teams probably should be reshuffled from time to time so that the code stays consistent across the organization. If that's done with care, specific projects can get just the right people. If it's done poorly, you sort of wander aimlessly from meaningless project to meaningless project.

The other one is overall performance and reliability. When something fans out to 40 different microservices, tracking down the slow component is a real pain. Figuring out who is best to fix that can be even worse.


> a better marginal return: it's often easier to work with abstract networking concerns across a system than it is to tease apart often implicit dependencies in a monolithic deployment.

Do the complexities introduced not have a cost, i.e. are those marginal returns offset by the choice in the first place? Call it "platform debt."


Oh they definitely do. It's just a different flavor and the debt scales differently. Ideally the debt amortizes across the services so you can solve a problem across multiple places. This is obviously really difficult to discuss or analyze without a specific scenario.

It's probably not worth discussing unless you're actively feeling either perf or coupling debt pressure.


I have no idea why people keep thinking microservices is all about scalability. Almost like they've never worked on a problem with them before.

Microservices is all about taking a big problem and breaking it down into smaller components, defining the contracts between the components (which is what an API is), testing the components in isolation and, most importantly, deploying and running the components independently.

It's the fact that you can make a change to, say, the ShippingService and, provided that you maintain the same API contract with the rest of your app, deploy it as frequently as you wish, knowing full well that you won't break anything else.
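
As a rough sketch of what that looks like from the caller's side (the ShippingService name comes from the comment itself; the endpoint path and the use of the JDK's built-in HTTP client are illustrative assumptions, not anyone's actual API):

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    // The caller depends only on the URL and the JSON shape -- that is the contract.
    // The ShippingService behind it can be redeployed as often as you like,
    // as long as the path and response shape stay stable.
    public class ShippingClient {
        private final HttpClient http = HttpClient.newHttpClient();

        public String quoteFor(String orderId) throws Exception {
            HttpRequest request = HttpRequest.newBuilder(
                    URI.create("http://shipping-service/quotes/" + orderId))
                .GET()
                .build();
            return http.send(request, HttpResponse.BodyHandlers.ofString()).body();
        }
    }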

It also aligns better with the trend towards smaller, container-based deployment methods.


You don't need to build a distributed system for that. Just build the ShippingServiceLibrary, let others import the jar file (or whatever) and maintain a stable interface.
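
In sketch form (the names here are hypothetical), that library is just a stable interface published as its own artifact:

    // Published as its own jar; callers compile against this interface and never
    // touch the implementation directly. Upgrading means bumping a dependency
    // version, not deploying and operating a separate network service.
    public interface ShippingService {
        /** Shipping cost in cents for an order going to the given destination ZIP. */
        long quoteShippingCents(String orderId, String destinationZip);
    }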

I wrote about this a couple of years ago in more detail, so I'll just reference that: https://www.chrisstucchio.com/blog/2014/microservices_for_th...

HN discussion: https://news.ycombinator.com/item?id=8227721


The point is that unless you use JVM hot reloading (not recommended in production), you will need to take down your whole app to upgrade that JAR. Now, what if your microservice were something minor like a blacklisted-word filter? Is adding a swear word worth a potential outage?


If you run multiple instances of your server and follow certain reasonable practices, you can take down old-version instances one by one and replace them with new-version instances, until you have upgraded the entire fleet.

Alternatively, you can deploy the new version to many machines, test it, then have your load balancer direct new traffic to the new instances until all the old instances are idle, and then take those down.


And with microservices you can pick whatever stack you feel is best for just that task.


Is hot reloading that big of a problem?


not for everyone


You certainly can do that. A little-discussed benefit (or curse) of a microservice is maintaining an API (as opposed to an ABI). Changes are slower and people put more thought into the interface as a discrete concern. I am curious whether an ideal team would work better with a lib--I think so, but I'm not sure!


I think people don't maintain discipline if it isn't forced. So it's a roundabout way of forcing people to maintain boundaries that wouldn't be needed much of the time if people had stricter development practices.


Exactly. Every single monolithic application I've seen has a utilities package that has classes shared amongst components.

With micro services you could have different versions of those utilities in use which would not be possible in a monolithic app.


> With micro services you could have different versions of those utilities in use which would not be possible in a monolithic app.

You have no idea...:)


> I have no idea why people keep thinking microservices is all about scalability.

It's an aspect. It's often the beginning of a micro-service migration story in talks I've heard.

> Microservices is all about taking a big problem and breaking it down into smaller components

... and putting the network between them. It's all well and good, but the tradeoffs are not obvious there either. Most engineers I know who claim to be experts in distributed systems don't even know how to formally model their systems. This is manageable at a certain small scale, but some failures take 35 or more steps to reveal themselves, and even the AWS team has realized that this architecture comes with the added burden of complex failure modes[0]. Obscenely complex failure modes that aren't detectable without formal models.

Even the presenter mentioned... why even HTTP? Why not a typed, serialized format on a robust message bus? Even then... race conditions in a highly distributed environment are terrible beasts.

[0] http://research.microsoft.com/en-us/um/people/lamport/tla/am...

You just have to know your data. The architecture comes after... and will change over time.


You seem to be repeating these weird myths that have no basis in reality.

You don't need to be an expert in distributed systems to use microservices. It's literally replacing a function call with an RPC call. That's it. If you want to make tracing easier, you tag the user request with an ID and pass it through your API calls, or use something like Zipkin. But needing formal verification in order to test your architecture? Bizarre. And I've worked on 2 of the world's top 5 ecommerce sites, which both use microservices.
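
A minimal sketch of that ID-tagging idea (the header name, service URL, and use of the JDK HTTP client are my own assumptions here, not Zipkin's actual propagation format):

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;
    import java.util.UUID;

    // Assign one ID to the user request at the edge, then forward it on every
    // downstream call so the full trace can be stitched back together later.
    public class TracedCall {
        private static final HttpClient HTTP = HttpClient.newHttpClient();

        static String callDownstream(String requestId, URI uri) throws Exception {
            HttpRequest request = HttpRequest.newBuilder(uri)
                .header("X-Request-Id", requestId) // same ID on every hop
                .GET()
                .build();
            return HTTP.send(request, HttpResponse.BodyHandlers.ofString()).body();
        }

        public static void main(String[] args) throws Exception {
            String requestId = UUID.randomUUID().toString();
            System.out.println(callDownstream(requestId, URI.create("http://inventory-service/items/42")));
        }
    }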

And HTTP is used for many reasons, namely that it is easy to inspect over the wire and is fully supported by all firewalls and API gateways, e.g. Apigee. And nothing is stopping you from using a typed, serialized format over HTTP. Likewise, nothing is stopping you from using a message bus with a microservices architecture.


> You seem to be repeating these weird myths that have no basis in reality.

No basis at all? I knew I was unhinged...

> You don't need to be an expert in distributed systems to use microservices.

True. Hooray for abstractions. You don't need to understand how the V8 engine allocates and garbage collects memory either... well until you do.

> It's literally replacing a function call with an RPC call.

You're not wrong.

Which is the point. Whether for architectural or performance reasons, I think you need to understand your domain and model your data first. For domains that map really well to the microservice architecture, you're not going to have many problems.

And formal specifications are overkill for many, many scenarios. That doesn't mean they're useless. They're just not useful, perhaps, for e-commerce sites.

But anywhere you have an RPC call that depends on external state, ordering, consensus... the point is that the tradeoffs are not always immediately apparent unless you know your data really well.

> And I've worked on 2 of the world's top 5 ecommerce sites which both use microservices.

And I've worked on public and private clouds! Cool.

The point was and still is the same whether performance or architecture... think about your data! The rest falls out from that.


> It's literally replacing a function call with an RPC call.

Until you want to make five function calls, in a transactional manner.

Architecting correct transaction semantics in a monolithic application is often much easier than doing so across five microservices.
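
For contrast, a sketch of the monolithic case (table names and values are made up): several writes share one database transaction, so a single failure rolls everything back.

    import java.sql.Connection;
    import java.sql.PreparedStatement;

    public class OrderPlacement {
        // In a monolith, reserving stock and recording the payment can live in
        // one transaction: they commit together or not at all.
        public void placeOrder(Connection db) throws Exception {
            db.setAutoCommit(false);
            try {
                try (PreparedStatement reserve = db.prepareStatement(
                        "UPDATE inventory SET stock = stock - 1 WHERE sku = ?")) {
                    reserve.setString(1, "SKU-123");
                    reserve.executeUpdate();
                }
                try (PreparedStatement charge = db.prepareStatement(
                        "INSERT INTO payments (order_id, amount_cents) VALUES (?, ?)")) {
                    charge.setString(1, "order-1");
                    charge.setLong(2, 4999);
                    charge.executeUpdate();
                }
                db.commit();
            } catch (Exception e) {
                db.rollback(); // one failure undoes everything
                throw e;
            }
        }
    }

Spread those writes across separate services and there is no shared transaction to lean on; you are into compensating actions, idempotent retries, and careful ordering instead.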


Yeah, that absolutely didn't compute for me either. If it really were the case that no thought was needed for transactions or correctness, I would have started to split my monolithic apps years ago. It's when your data model is complex enough to make consistency/correctness difficult without transactions that you have real trouble splitting things up.


RPC versus a plain procedure call introduces a greatly increased probability of random failure. If you're trying to do something transactional, you're in a different world of pain.


> [Scalability i]s an aspect. It's often the beginning of a micro-service migration story in talks I've heard.

It absolutely helps with people and org scalability. I haven't seen it help with technical load scalability (assuming you were already doing monolithic development relatively "right"; we ran a >$1BB company for the overwhelming majority on a single SQL server).


Because what you described is "encapsulation", and microservices is a very heavyweight and expensive way to get it.


I wonder if there's any writing out there about the evolutionary hierarchy of refactoring, like:

code > function/method > module > class > ... > microservice


Exactly. And we have decades of experience with monolithic apps where encapsulation is not adhered to.

So yes it is heavyweight. But it also works.


Well, granted. It will take decades for us to have decades of experience with microservices that break clear encapsulation.


I've never heard the term "over-fill data", and Google seems to have no idea of it in reference to software.


Ah... it's almost the reason for GraphQL.

Basically if you're building a hypermedia REST API you return an entity or collection of entities whose identifiers allow you to fetch them from the service like so:

    {"result": "ok!"
     "users": ["/users/123", "/users/234"]
    }
The client, if interested, can use those URLs to fetch the entities from the collection that it is interested in. This poses a problem for mobile clients where you want to minimize network traffic... so you over-fill your data collection by returning the full entity in the collection.

    {"result": "ok!"
     "users": [{"id": 123, "name": "Foo"}, {"id": 234, "name": "Bar"}]
    }
The tradeoff is that you have to fetch the data for every entity in the collection, the entities they own, etc., and ship one really large response. The client receives this giant string of bytes even if it is only interested in a subset of the properties in the collection.

GraphQL does away with this problem on the client side rather elegantly by allowing the client to query for the specific properties they are interested in. You don't end up shipping more data than is necessary to fulfill a query. Nice!

... but the tradeoff there is that you lose the domain representation in your URLs since there are none.
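
For example (reusing the field names from the response above), a GraphQL client that only cares about names can ask for exactly that and nothing more:

    {
      users {
        name
      }
    }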



