Introduction to Microservices (nginx.com)
137 points by fixel on July 9, 2015 | 83 comments

What scares me about microservices is the case where some operations must be transactional.

What if, in a given use case, multiple microservices are involved but the operations must be transactional: if one of the services fails, all previous operations must roll back. What are the recommended ways of implementing this kind of transactional behavior in a modern HTTP/REST microservices architecture?

I know the pattern is called "distributed transactions" and is often related to two-phase commit protocol. But there doesn't seem to be a lot of practical information available about this topic!

I found this recent presentation[1] that talks about it, but I'd like to learn more on the subject. Also, I'm looking for practical tutorials, not highly academic ones! I'd really love to see code samples, for instance.

Any links, suggestions?

[1] http://www.infoq.com/presentations/microservices-docker-cqrs

We've moved down this path from a massively complicated distributed transaction environment on top of MSMQ, SQL Server etc and you know what? With some careful design and thought about ordering operations and atomic service endpoints, we didn't need them at all after all.

Transactions can be cleanly replaced with reservations in most cases, i.e. "I'll reserve this stock for 10 minutes", after which point the reservation is invalid. So a typical flow for an order pipeline payment failure would be:

1. Client places order to order service.

2. Order service calls ERP service and places reservation on stuff for 10 minutes.

3. Order service calls payment service (which is sloooow and takes 2-3 mins for a callback) and issues payment.

4. Payment service fails or payment fails.

5. Order service correlation times out.

6. Order service calls notification service and tells buyer that their transaction timed out and cancels the order.

7. ERP service doesn't hear back from the order service and kills reservation.

etc etc.

At step (4) you have an option to just chuck the message back on the bus to try again after say 2 minutes. If everything times out, meh.
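
A minimal sketch of the reservation side of that flow, in Python. Everything here is an assumption for illustration: the `ReservationStore` class, its method names, and the in-memory dict are hypothetical stand-ins for the ERP service, and a real one would persist its state. The point is only that a hold expires by itself if nobody confirms it in time.

```python
import time


class ReservationStore:
    """Toy in-memory stand-in for the ERP service's reservation endpoint."""

    def __init__(self, stock, clock=time.monotonic):
        self.stock = dict(stock)  # item -> units neither held nor sold
        self.held = {}            # reservation id -> (item, qty, expires_at)
        self.clock = clock        # injectable so expiry can be tested
        self.next_id = 1

    def _expire(self):
        # Step 7: reservations nobody confirmed in time are released.
        now = self.clock()
        for rid, (item, qty, expires_at) in list(self.held.items()):
            if expires_at <= now:
                self.stock[item] += qty
                del self.held[rid]

    def reserve(self, item, qty, ttl_seconds):
        """Hold qty units for ttl_seconds; return a reservation id or None."""
        self._expire()
        if self.stock.get(item, 0) < qty:
            return None
        self.stock[item] -= qty
        rid = self.next_id
        self.next_id += 1
        self.held[rid] = (item, qty, self.clock() + ttl_seconds)
        return rid

    def confirm(self, rid):
        """Make a live reservation permanent; False if it already expired."""
        self._expire()
        return self.held.pop(rid, None) is not None

    def available(self, item):
        self._expire()
        return self.stock.get(item, 0)
```

The order service would call `reserve` before taking payment and `confirm` once the payment has been correlated. If the payment times out, nothing needs to be rolled back explicitly: the hold lapses and the stock becomes available again.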

Thanks for this! It seems like a very interesting pattern and I was completely unfamiliar with it prior to reading your comment. Looks like a search for "reservation pattern" gives lots of good places to start digging, but I'm wondering if you have any favorite resources on the subject. Is there a good treatment of it in some particular book? Or maybe presentations you've found particularly enlightening?

It's an old SOA pattern. Most of those patterns are still valid if you take away the WS-* cack and stuff.

What happens if the payment succeeds, but the subsequent call to the ERP fails? Wouldn't the reserved items be released?

The ERP call is atomic and happens first. All that says is "can I have 2kg of potatoes for 10 mins please?" If that returns a success code then you can process the payment.

Also if the ERP says "in ten days you can have that amount of potatoes" you can ask for a longer reservation and issue the payment later.

It's all about careful ordering and atomicity at the service level and determining what must be done synchronously and what can be done asynchronously.

I was interpreting the parent poster's question to mean:

1. reservation is placed.

2. Payment succeeds, but either success is not known, the process requesting payment crashes before the response, etc.

3. ???

Since we never got to telling ERP "hey, that reservation will be permanent because the payment succeeded", but the payment succeeded… what do you do? Does the reservation expire (but my potatoes!)? How do you even know that the payment succeeded, if perhaps a network connection goes dark and requires 2h to fix?

In this case it's not really any different than other distributed transaction systems... another process (potentially manually) has to review, and correct things...

What happens when your payment processor succeeds in processing the transaction, but you don't get the success code? You either retry, confirm, or correct. One would assume that, upon not getting confirmation that your reservation was made permanent, you would retry the commitment; if it was already committed, the ERP service can return the appropriate response.
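
A rough Python sketch of that retry idea, assuming the confirm call is idempotent. `ConfirmEndpoint`, `confirm_with_retry`, and the status strings are hypothetical names made up for this example, not any real ERP API.

```python
class ConfirmEndpoint:
    """Hypothetical ERP endpoint that makes a reservation permanent."""

    def __init__(self, live_reservations):
        self.live = set(live_reservations)
        self.committed = set()

    def confirm(self, reservation_id):
        if reservation_id in self.committed:
            # A retry of a call that did succeed: report it, change nothing.
            return "already_committed"
        if reservation_id in self.live:
            self.live.remove(reservation_id)
            self.committed.add(reservation_id)
            return "committed"
        # Expired or never existed: needs a refund or a manual review.
        return "unknown_or_expired"


def confirm_with_retry(send, reservation_id, attempts=3):
    """Call send() until a response arrives or the attempts run out.

    Retrying is only safe because confirm() is idempotent.
    """
    for _ in range(attempts):
        try:
            return send(reservation_id)
        except ConnectionError:
            continue
    return "gave_up"
```

If the first call commits but its response is lost, the retry gets `already_committed` and the caller can carry on. `gave_up` and `unknown_or_expired` are the cases that fall through to the review-and-correct process.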

I missed the confirmation step above. That would happen once the payment has been correlated.

At step 3 in your list above the payment would time out and a refund would be issued. Usually payments time out as well so you can usually reserve cash (pre-auth in banking terms). So we end up with stacks of reservations.

If something breaks you can retry within a reasonable limit or wait for everything to drop all the reservations.

The concept of Aggregates in Domain-Driven Design is based around the need for business invariants that must be maintained with transactional consistency in a system that is generally eventually consistent.

Overall, you have to learn to love eventual consistency, but small portions of the domain should absolutely be clustered together around transactional consistency needs that are absolutely necessary.

Check out "Implementing Domain Driven Design" by Vaughn Vernon; chapter 10 in particular talks about this.

Caitie McCaffrey gave a talk on this subject at GOTO Chicago 2015 [1], refreshing the 'Saga Pattern' for a distributed environment. Short summary, IIRC:

1) Use a linearizable data store to store transaction metadata

2) Each step must have a complementary rollback step

3) Rollback steps must be idempotent. Depending on the type of transaction, sometimes 'rollback' is too strong and you instead implement other types of recovery (e.g. 'roll-forward')

[1] http://gotocon.com/chicago-2015/presentation/Applying%20the%...
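
The three points above can be sketched in a few lines of Python. This is a simplified illustration, not the talk's implementation: `run_saga` is a hypothetical helper, the `log` list stands in for the linearizable metadata store, and only steps that completed are compensated.

```python
def run_saga(steps, log):
    """Run (name, action, compensate) steps in order.

    `log` stands in for the durable store of saga metadata; here it is
    just a list of (event, step_name) tuples.
    Returns True if every step succeeded, False if the saga was undone.
    """
    done = []
    for name, action, compensate in steps:
        log.append(("start", name))
        try:
            action()
        except Exception:
            log.append(("failed", name))
            # Undo the completed steps in reverse order. Each compensate
            # function is assumed to be idempotent, so it is safe to run
            # again if this process crashes partway through the undo.
            for prev_name, prev_compensate in reversed(done):
                prev_compensate()
                log.append(("compensated", prev_name))
            return False
        log.append(("end", name))
        done.append((name, compensate))
    return True
```

With steps such as book car, book hotel, book flight, a failure on the flight would run the hotel and car cancellations in that order, and the log would show what was started, finished, and compensated.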

Asked a similar question on Stackoverflow recently: http://stackoverflow.com/questions/30213456/transactions-acr...

Didn't get many useful replies. http://ws-rest.org/2014/sites/default/files/wsrest2014_submi... looked promising.

OASIS has got you covered: https://en.m.wikipedia.org/wiki/WS-Atomic_Transaction

Though it probably needs some JSON and more use of HTTP-specific features in order to be acceptable today.

Funny, I made a very similar comment recently:

A concrete example we've faced. A certain operation requires writing data to N flaky services. You successfully write to N-1 of them, but the Nth fails. Now what do you do? If these N things were just database writes to the same DB, transactions would save you, as you could just rollback. Without that, the answer has to be handled in code -- do you reverse the previous changes (if possible) by sending delete events, or leave the system in some sort of half-baked state and rectify things later via some other process? (I'm interested in hearing of other options...)

The answers I got were:

1) apologetic computing (Amazon)

2) consensus algorithms / paxos

The problem I see is that these may be non-trivial to implement and/or not fully understood or standardized.

3) Reservation pattern.

This one really works well.

Here is the example code for the talk: https://github.com/cer/event-sourcing-examples and a corresponding blog post with some links: http://plainoldobjects.com/2015/01/04/example-application-fo...

We have one instance where we update an account, that goes out to potentially 11 service calls which are not transactional. We are having to maintain state in our app because the micro services have split this up so much.

The alternative to Microservices would be to get better at building boundaries in your application before resorting to creating physical boundaries to interact with code.

There are a lot of approaches to this. I've explored these ideas with Obvious Architecture (http://retromocha.com/obvious/) and the talk I gave at MWRC 2015 on Message Oriented Programming (http://brianknapp.me/message-oriented-programming/).

I think the big lesson is that the Erlang stuff was WAY ahead of its time and it already solved a lot of the problems of large networked systems decades ago. Now that we are all building networked systems, we are relearning the same lessons telcos did a long time ago.

This. I know that quote has been overused already, but:

"Almost all the successful microservice stories have started with a monolith that got too big and was broken up.

Almost all the cases where I've heard of a system that was built as a microservice system from scratch, it has ended up in serious trouble."


Also good: https://rclayton.silvrback.com/failing-at-microservices

This makes some sense, intuitively. Until a system has some degree of maturity, you probably don't have a good enough picture of it to understand where service boundaries should be and to assign responsibilities appropriately.

There may be good ways of managing that (refactoring at the architectural level, essentially; microservices are in some ways like OOP on a different level), but the practices to do so probably haven't yet been developed.

It's a good quote, and a sentiment that I've seen echoed by CTOs and VP Engs who actually are running or are currently migrating their companies to microservices.

That said, I feel the quote should be: "As of 2015, almost all successful microservice stories..."

As the tooling and knowledge around microservices builds up over the next few years, I could imagine a world where starting with microservices makes sense. For a new company, the flexibility you get with microservices to try new tech and throw out failed experiments could result in much faster iterations, helping to nail the product-market fit.

Totally disagree. The advantage of monolith-first isn't that monoliths are easier than microservices (though they might be). The advantage is that early in a project's lifetime, you don't yet know the boundaries between services. Worse, guessing those boundaries incorrectly is more expensive than a monolith.

Breaking a monolith into services is difficult, but it's much harder to "rebalance" microservices once your product grows and you realize you have gotten the interface wrong. The monolith stage is important because it helps you figure out what the hell you're building. Establishing service boundaries overly early risks getting you "stuck" in the wrong architecture.

Reading the article, I'm not sure if that's an alternative so much as a prerequisite.

Looking at the monolithic architecture, it just took each feature within the monolith and created it as a microservice. Just because you have a monolith doesn't mean you can't have well thought out features and separations of concern.

Before coding, before deciding on architecture, I like to think in these terms. What features make the most sense together? Far apart? It should be a prerequisite of any project, regardless of architecture. If you're building a monolith, each one just goes in a different module or package rather than having its own service.

But then you lose a lot of the benefits of microservices, namely the scalability and reliability of being able to manage deployments separately.

But if your project is still small, with a small team you likely don't need that yet. However, if you're going to start with a monolith with the intent of going to microservices you had better have strong architectural discipline in the team. Otherwise, with no physical barrier to prevent it, developers are going to fall prey to the temptation of taking shortcuts and reaching across module boundaries, or making chatty APIs that can't be made distributed in a practical manner.

Even with a small project and small team, it's nice to be able to scale up just the part that is overloaded. Especially if you're on a small budget.

And I was working on the premise that you are already doing microservices, so presumably you are already taking the overhead hit.

I don't think it's an either or situation. If you have internal app services, it should be easy to break that/expose that as an external service.


Absolutely, if there is an additional cost to doing it. But if it's an either or, why not build it right the first time?


I don't think you understand what YAGNI means.

It sounds like YOU don't understand what YAGNI means to us developers, though.

In the context of our little conversation here, Microservices is not an either-or choice. There's quite a penalty you take to productivity/agility/cost with a Microservices architecture just like there was with SOA. It's not free, even if you believe it is "right". Take a look at this: http://martinfowler.com/bliki/MicroservicePremium.html

So, YAGNI certainly does apply here, and I do toss the acronym around lightly on purpose because that is the blunt response we programmers need to hear and give WAY more often. You ain't gonna need it!!!

Architecture astronauts are everywhere and they are mostly a-holes that create chaos for the rank-and-file. You want hell? Ok, go smash your head against the wall implementing another BDFL's pipe dream.

We developers are most to blame in this and we need to cut it out with all the fun meta-work we like to create for ourselves. Run a tight ship, be professional, deliver precisely the product that our customers ask for with no extra bells or whistles.

When you have Mt. Everest size workloads like Netflix has, and you need maximum isolation and monitoring and deployment flexibility then, yeah, you're in another league and Microservices is a really awesome approach. I'm guessing you're in my league though, so, I'm doing you a favor here, you can thank me later: YAGNI.

After reading about microservices, I feel like it's a great idea if you have a large team and a lot of resources, but that's not really explicitly stated (although it's nice this article features drawbacks). I feel like there's a ton of hype but nobody is saying "Don't do this if you don't have a 30-person team!"

Separating everything out into little APIs all with their own datastores that all talk to each other sounds great, but I would not want to do this on a three person team. Just give me an old fashioned monolithic API, a large database, and then I can spend 80% of my time programming and 20% on maintenance. One app is hard enough to run, why consciously choose to run 10 of them if you don't have the capacity?

I don't think microservices are a bad idea at all...I love the architecture. But the hype makes it hard to see that this architecture probably isn't for you unless you have the capacity for it.

If you're interested in our experience on starting on microservices from scratch with only 2 people, I gave a talk on it: http://www.infoq.com/presentations/queues-proxy-microservice...

I agree they're not for every team, but it definitely allowed us to move, grow and scale faster than other dev environments I've worked in.

Great talk. This may be a silly question; however, how do you handle authentication between the main app and the services? I'm in the middle of building a sizable application but have begun offloading some of the tasks to microservices, for many of the same reasons you gave. I currently use OAuth2 to authenticate to the main API, but don't really want to add additional overhead by having to authenticate to each service as it's needed.

Perhaps JSON Web Tokens between the main API and the microservice, where both have a shared secret?

Thanks, sounds really interesting. I'm bookmarking it for later (having a long day at work). Would love to hear how you achieved this with a smaller team.

Actually, I'd say the microservices architecture has helped in a complex project that I'm working on right now.

Consider the following that need to be done here.

- A main library that needs to load up a few gigs of data in memory

- A process that communicates with a queue of messages coming in

- A process that interfaces with mobile app (port x)

- A process that interfaces with a different kind of app (port y)

The goal is - every incoming message needs to go through to the main library and back to the app via the queue.

Monolith option - main.cc which contains all this, takes a while to start, can't queue up incoming messages till everything starts up and loads in memory, et al. Even using threads and whatnot.

Now with microservices,

- I can build a service that exposes my big-data-load library through a port. This can be loaded and restarted at will.

- Queue is running as a separate process. Messages queue when main lib is down and processed later.

- Server A and server B run separately

- A bug in one won't crash all the others

- I can manage each service independently (run them via supervisor or whatnot)

- Scaling it is easy - I can deploy each service behind load balancers, on different machines in the future without ever needing to change anything but the urls in a config file

- Monitoring - I have latencies for each service available via haproxy and the like.

My 2c.

That is a pretty accurate read on it I think. Fowler's been discussing similar data and findings from their research [1].

[1] http://martinfowler.com/articles/microservice-trade-offs.htm...

Microservices are great, but the underlying protocol they run on is important.

If you're building a REST interface to all your services, and something consumes them - they might be slower than a monolithic app unless you have something like a TCP or HTTP level keep-alive built in. Connections need to be long standing - otherwise the overhead of creating a new connection is pretty high.

Question here - what is a good way to make this long standing connection happen? Eg, if you use python urllib3 and nginx - can you keep these connections alive enough (with pings or whatever) that your latency is lower than bundling that service within the code itself as a library?

If you use keep alive and your server/load balancer doesn't have overly aggressive connection termination policies, I doubt re-connections every 60 seconds or so will be a major throughput hit. Regardless, you are always going to incur latency overhead by sending data over TCP relative to just sharing memory.
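
A small standard-library sketch that makes connection reuse visible. It swaps in `http.client` and a throwaway local server for the urllib3 and nginx setup mentioned above, so the handler class and helper function are assumptions for this demo only. The server replies with the client's source port: a kept-alive connection shows the same port every time, while reconnecting shows a new one per request.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class EchoPortHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets the server keep connections open

    def do_GET(self):
        # Reply with the client's source port so reuse is observable.
        body = str(self.client_address[1]).encode("ascii")
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def client_ports(port, requests, reuse):
    """Return the source port the server saw for each request."""
    seen = []
    conn = http.client.HTTPConnection("127.0.0.1", port)
    for _ in range(requests):
        if not reuse:
            conn.close()  # force a fresh TCP connection for this request
        conn.request("GET", "/")
        seen.append(conn.getresponse().read())
    conn.close()
    return seen


server = ThreadingHTTPServer(("127.0.0.1", 0), EchoPortHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

kept = client_ports(port, 5, reuse=True)    # one TCP connection, five requests
fresh = client_ports(port, 5, reuse=False)  # a new TCP connection each time

server.shutdown()
server.server_close()
```

Timing the two loops against a remote host would show the handshake cost. On localhost the difference is small, which is part of the point made above about where the latency actually comes from.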

What I am more worried about with microservices is the data serialization overhead. Transforming data to some encoding that is robust against version changes (say using protobuf) can be quite costly on both sender and receiver, especially in languages with relatively slow object creation (e.g. python). This is highly application specific, but I'd love to hear others' thoughts on this trade-off.

Using a binary serialized format is actually pretty good that way. For a high throughput API that I built, shifting to protobuf from json, and using binary formats made a lot of difference. The packet size is lower, you can use JVM based platforms to do analysis and analytics (esp when storing logs as proto) - those benefits outweigh the slowness of the object creation (which isn't THAT slow on cpython). At least on the proto 2.3.x versions that I use.
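
A rough illustration of the packet-size point, using Python's `struct` module as a stand-in for a binary format. This is not protobuf: a fixed `struct` layout has none of protobuf's tolerance for schema changes, and the record fields below are made up for the example.

```python
import json
import struct

# Hypothetical record; the field names and values are invented.
record = {"user_id": 123456, "lat": 37.7749, "lon": -122.4194, "ts": 1436400000}

as_json = json.dumps(record).encode("utf-8")

# Fixed little-endian layout: unsigned 32-bit id, two doubles,
# unsigned 32-bit timestamp. No field names travel on the wire.
LAYOUT = struct.Struct("<IddI")
as_binary = LAYOUT.pack(record["user_id"], record["lat"], record["lon"], record["ts"])

# Decoding needs the same layout on the receiving side.
decoded = dict(zip(("user_id", "lat", "lon", "ts"), LAYOUT.unpack(as_binary)))
```

Comparing `len(as_json)` with `len(as_binary)` shows the size gap for this one record. Whether the encode and decode cost matters is, as noted above, application specific and worth measuring rather than assuming.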

That said, to my point of keep alives - let's say you've got process A talking to process B on localhost, which is making web service calls to the internet. Every 5ms delay is hurting your total response time, especially when your connection pool is waiting on a dropped connection to be reinstated.

I think this is the first time I've really understood the differences between SOA and Microservices (and realized that my workplace's architecture is a flavor of Microservice).

"On the surface, the Microservice architecture pattern is similar to SOA. With both approaches, the architecture consists of a set of services. However, one way to think about the Microservice architecture pattern is that it’s SOA without the commercialization and perceived baggage of web service specifications (WS-) and an Enterprise Service Bus (ESB). Microservice-based applications favor simpler, lightweight protocols such as REST, rather than WS-. They also very much avoid using ESBs and instead implement ESB-like functionality in the microservices themselves. The Microservice architecture pattern also rejects other parts of SOA, such as the concept of a canonical schema."

So SOA implies the existence of some heavy enterprise tools like WSDL and SOAP or other RPC type systems. Microservices favor RESTful interfaces.

SOA predates the enterprise tooling, which followed on with the popularity of SOA as an architectural style.

If microservices catch on, expect five years from now we'll be talking nano-services, and how microservices imply a whole stack of enterprise services that will have grown up around the microservices architecture.

I guess I am in what the author termed the "naysayers" camp, in that I cannot think of the "monolithic" variation they depicted as a "service-oriented architecture" of any kind. A single chunk of stuff sporting a lot of different APIs and interfaces can't be thought of as implementing services, imo, unless "services" is synonymous with "API" and for my part that is too general a definition to be of much use. If you agree with my view then services already required a separately designed, implemented, and managed piece of code that implements a single cohesive set of functions, and all that's left for microservices to add is minimality, which, imo, should already be in the mix.

I feel that I'm a proponent of microservices that has only ever worked on monolithic apps.

> I cannot think of the "monolithic" variation they depicted as a "service-oriented architecture" of any kind.

I think you're on the right track here. Note that the article uses "monolithic application", not "monolithic services"; it really is the lack of services.

> unless "services" is synonymous with "API" and for my part that is too general a definition to be of much use

I agree that that is too general. To me, a microservice would of course expose an API, but whereas a monolithic application exposes the entire application (or, perhaps, the entire API for everything your company does…), a microservice is exposing a small facet of the overall application, and is only responsible for that facet.

I probably agree with you that "services" is probably what most of us are after; I think the "micro" may just be an attempt to re-emphasize that it shouldn't be one huge thing, and that perhaps you should split services off sooner, rather than later, as it only ever gets harder.

Then again, I've never worked in a microservices-oriented architecture.

This is a good read, but I'm wondering:

The Microservice architecture pattern significantly impacts the relationship between the application and the database. Rather than sharing a single database schema with other services, each service has its own database schema.

Is this a necessary prerequisite? One of the problems I'm dealing with now (and have been in the past) is the tyranny of multiple data stores. At any reasonable scale, this quickly leads to a lack of consistency, no matter how much you'd like to try.

It feels like most of the gain in a microservices architecture is from functional decomposition of code, with limited benefit from discarding the 'Canonical Schema' of SOA. I'd be interested to hear others' experiences with this, though.

The huge benefit that we see in our architecture (which I would call service-oriented, and not 'microservices' necessarily) is data separation.

Each of our services is a separate django app, and the database name is <consistent prefix>_<app name>. Originally, this meant we had 5-6 database schemas named <something>_friend, <something>_invite, <something>_news, etc., all on one database.

What ended up happening was some services rapidly outgrew the capacity of a single database server, such as our 'news' service, which handles chat services, private messages, and so on (and thus grows nonlinearly with community growth), unlike other services which grow linearly (like our 'identity' service). As a result, the 'news' database had to move to its own server. Thanks to this database schema separation, however, this was a trivial task. Dump the schema, restore the schema, change the DB host in the django config, and you're done.
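
Roughly what that looks like in settings code, as a sketch. The prefix, host names, and the `database_settings` helper are all hypothetical; `ENGINE`, `NAME`, and `HOST` are the standard keys of Django's `DATABASES` setting, and the MySQL backend is assumed here.

```python
PREFIX = "acme"  # hypothetical stand-in for the "<consistent prefix>"

# Services that have outgrown the shared server get their own host.
DB_HOSTS = {"news": "db-news.internal"}
DEFAULT_DB_HOST = "db-shared.internal"


def database_settings(app_name):
    """Build the DATABASES setting for one service's Django settings module."""
    return {
        "default": {
            "ENGINE": "django.db.backends.mysql",
            "NAME": f"{PREFIX}_{app_name}",
            "HOST": DB_HOSTS.get(app_name, DEFAULT_DB_HOST),
        }
    }
```

Each service's settings module would do something like `DATABASES = database_settings("news")`. Moving a service to its own server is then the dump, the restore, and a one-line change to `DB_HOSTS`.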

If we had our data intermingled in the same schema, it would have been far, far harder to do this.

Fundamentally, your 'microservices' style architecture should be designed in such a way that you could take any of your services, tar up the code, and e-mail it to someone else, and they could use it in their architecture. For obvious reasons this isn't actually feasible (e.g. service interdependencies), but conceptually you should be able to draw firm, hard lines down your stack showing where each service starts and ends; this includes frontend services (nginx/haproxy/varnish/whatever configs), code (including interface definitions/client libraries), data persistence (database schemas, MongoDB collections, etc), and caching (Redis/Memcached/etc. instances).

The more interdependencies you have, the more problems you'll encounter down the road. If you intermingle MySQL data then any maintenance is downtime, any slowdown slows everything, any tuning is across your entire dataset, etc.

It's a requirement in the sense that sharing a datastore would break the abstraction. Each service should be independent from the others, which necessitates a separate data store.

Consistency should be maintained at the application level if you want to build a robust service, because doing it in the database leads to a single point of failure (the database)

> Is this a necessary prerequisite?

For something to really be using a microservice architecture? Yes.

Of course, real world systems don't have to use pure architectural styles, though it's worth understanding why a named architectural style combines certain features before deciding to use some but not others.

> One of the problems I'm dealing with now (and have been in the past) is the tyranny of multiple data stores. At any reasonable scale, this quickly leads to a lack of consistency, no matter how much you'd like to try.

Honestly, I think if you have real inconsistency (rather than differences in data of similar form but different semantic meaning) with microservices with separate data stores, it means that you have designed your services improperly, such that they have overlapping responsibility.

I don't think it's a hard prerequisite (as there really aren't... too many of those, microservices can be built how you want them to be), but I think it's a good rule to follow.

If consistency is a necessary concern, and you have tightly coupled data, it's not a terrible idea to make the services a little bigger.

But also, if you have a service that depends on multiple other services to do work, I don't think it's so bad to get used to using the API for the other services (rather than trying to access their databases directly) -- despite the introduced latency overhead

If you have multiple "microservices", all operating on the same data store, it is difficult to guarantee the separation of concerns.

Conceptually, though, I don't think it is a requirement.

Separation of concerns or no, the problem with a single central datastore is that it will eventually become the bottleneck as you scale. This is really difficult to fix, not just technically but politically - as a central datastore grows, everything and everyone starts taking dependencies on it: reports, homegrown tools, documented troubleshooting strategies, etc. They become sacred cows of an organization.

Not only will there be resistance to the idea of splitting out that datastore, but major investment will be required to do it - implementing all of that disconnected messaging stuff you're going to need, reworking applications/services to communicate that way, and handling eventual consistency - which is a tough sell when the app works "perfectly fine" except for that scaling problem.

You could have separate schemas.

> The term microservice places excessive emphasis on service size. In fact, there are some developers who advocate for building extremely fine-grained 10-100 LOC services.

It's just a matter of time until someone writes a "nano-service" manifesto...

Let's go all out:

    Prefix      Symbol  Size                               Example
    yocto       y       1 bit                              Theoretical minimum
    zepto       z       1 byte (close enough to 10 bits)   Really small APL program
    atto        a       10 chars                           nc -l 8080
    femto       f       1 line (roughly 100 chars)         netcat piped into something else
    pico        p       10 lines                           tiny python service
    nano        n       100 lines                          small python service
    micro       μ       1000 lines                         typical "smallish" service
    milli       m       10,000 lines                       about as big as "microservices" would go these days, or a small monolithic app
    centi       c       100,000 lines                      decent-sized monolithic app
    deci        d       1 million lines                    large monolithic app
    none        n/a     10 million lines                   roughly OS-level app
    deca        da      100 million lines                  god help you beyond here
    hecto       h       1 billion lines                    
    kilo        k       10 billion lines                   
    mega        M       100 billion lines                  
    giga        G       1 trillion lines                   
    tera        T       10 trillion lines                  
    peta        P       100 trillion lines                 
    exa         E       1 quadrillion lines                
    zetta       Z       10 quadrillion lines               
    yotta       Y       100 quadrillion lines

    googol  NYSE:GOOG      ??? lines              Google

It's obviously Googol lines.

This is excellent. Upvoted, obviously, but I also just wanted to reply to say I laughed out loud and also thought, "Yeah, that does seem about right."

Perhaps the nanoservice manifesto would just be to use vanilla erlang. One process per module, each module less than 100 LOC. Would that be small enough?

I think Amazon just did, they call it Lambda.

I think some teams are going to discover that RPC is a better fit for some APIs. Will we see Thrift get more popular? A resurgence in WCF(!) or something new and super light? For asynchronous are we going to see more pub / sub? Is this a good fit for ZeroMQ? I think there's a lot more mileage in these discussions...

ISTM RPC is eventually a too-leaky abstraction. That is, if used in synchronous fashion, in the long run it will cause pain. If you use it asynchronously (which seems to stretch the definition of RPC), why not just use something that is naturally asynchronous?

I don't think you need to be pure-rpc or pure-rest... I think the parent really means to say that rpc-like calls are sometimes a better api than pure rest... especially given some business and security rules that are in effect regarding a given record.

I've considered using ZeroMQ Request/Response interfaces with a defined JSON/UTF8->GZ instead of REST layer... My testing worked pretty well, and it could even be used behind http requests (packaged).. with 0mq, you can setup layers of distribution for service points.

At one level or another micro services architecture trades complexity of an application as a whole for complexity in the system as a whole. In the end, most of the services being used in practice could handle the few ms of overhead that http/tcp rest services had over 0mq...

The hardest thing for me was simplifying things as much as possible, I worked really hard to avoid SPOF, that many tened to go to. In the end, instead of the likes of etcd, table storage with a caching client was sufficient... instead of complicated distribution patterns, for a very small load, having a couple instances of each service on each machine was easier.

It really comes down to what you really need, vs. what's simple enough to get the job done, and not lock you down. In the end, I love docker (and dokku-alt), but things like coreos, etcd, and fabric turned out to be overkill for the needs.

The article mentions using a different database for each service. Which means you cannot join tables between different databases and you lose the 'relational' aspect of an RDBMS. How is this problem solved by people using micro-services? For example, how do you relate trip data and driver data for reporting purposes?

One pattern I've seen is having an OLTP database per service and an ETL process to stream each service's data to a central warehouse or OLAP database that would satisfy reporting requirements.

> How is this problem solved by people using micro-services? For example, how do you relate trip data and driver data for reporting purposes?

There are a couple of different ways that are obvious:

1) The act of scheduling a trip requires the trip service to get information from the driver service related to the driver for the trip (the action might be triggered from either service). While information about the driver in the driver service might change, the information about the driver that was recorded with that trip is fixed. All the information necessary to answer queries about involved drivers that are within the scope of the trip service is stored in the datastore for that service. The same thing is generally true of all services.

2) For generalized reporting, information required to support that function is sent by various services to a separate reporting service, which aggregates historical data for reporting purposes. (Even non-microservice architectures often involve this, having transactional operational databases export data into an analytical database, with different schema and capabilities, for reporting purposes, rather than using one datastore for operational and reporting use.)
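A minimal sketch of option 2, assuming each service owns a SQLite database and a periodic export job copies rows into a separate reporting database. The table names, columns, and the `export_table` helper are invented for illustration, and a real pipeline would export incrementally rather than copying whole tables.

```python
import sqlite3

# Each service owns its own database (in-memory here for illustration).
driver_db = sqlite3.connect(":memory:")
driver_db.execute("CREATE TABLE drivers (id INTEGER PRIMARY KEY, name TEXT)")
driver_db.executemany("INSERT INTO drivers VALUES (?, ?)", [(1, "Ana"), (2, "Bo")])

trip_db = sqlite3.connect(":memory:")
trip_db.execute("CREATE TABLE trips (id INTEGER PRIMARY KEY, driver_id INTEGER, km REAL)")
trip_db.executemany(
    "INSERT INTO trips VALUES (?, ?, ?)",
    [(10, 1, 5.0), (11, 1, 7.5), (12, 2, 3.0)],
)

# The reporting database is the only place where both tables live together.
reporting_db = sqlite3.connect(":memory:")
reporting_db.execute("CREATE TABLE drivers (id INTEGER PRIMARY KEY, name TEXT)")
reporting_db.execute("CREATE TABLE trips (id INTEGER PRIMARY KEY, driver_id INTEGER, km REAL)")


def export_table(source, target, table, columns):
    """Hypothetical ETL step: copy every row of one table into the reporting DB."""
    cols = ", ".join(columns)
    marks = ", ".join("?" for _ in columns)
    rows = source.execute(f"SELECT {cols} FROM {table}").fetchall()
    target.executemany(f"INSERT OR REPLACE INTO {table} ({cols}) VALUES ({marks})", rows)
    target.commit()


export_table(driver_db, reporting_db, "drivers", ["id", "name"])
export_table(trip_db, reporting_db, "trips", ["id", "driver_id", "km"])

# The cross-service join happens only in the reporting database.
report = reporting_db.execute(
    "SELECT d.name, COUNT(t.id), SUM(t.km) "
    "FROM drivers d JOIN trips t ON t.driver_id = d.id "
    "GROUP BY d.id, d.name ORDER BY d.id"
).fetchall()
print(report)  # [('Ana', 2, 12.5), ('Bo', 1, 3.0)]
```

The operational services never query each other's tables; the join is paid for with some staleness in the reporting copy.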

One alternative is using "data virtualization" tools like Denodo Virtual DataPort, that create the illusion of a single relational RDBMS and let you perform joins between different databases (disclosure: I work at Denodo).

I wrote about it here: http://productivedetour.blogspot.com.es/2014/12/connecting-t...

Thanks for your answers. I see that data warehousing is an important aspect of microservice architecture. I wish it was highlighted more often on microservice discussions.

We're honored to be a partner [1] of Nginx with the OSS Microservice Management layer KONG [2]. Kong is built on top of Nginx and uses Cassandra for storing the config.

[1] https://www.nginx.com/blog/nginx-powers-kong-api-management-...

[2] https://github.com/mashape/kong

I think architects have to be very careful in deciding which components would qualify as a microservice and which ones should be lumped together. This article sums up my thought. http://www.infoq.com/news/2014/11/ucon-microservices-enterpr...

I've been hearing the Microservices buzz for a while. I recently tried to set up a new project as an ensemble of microservices but got stuck. There is a fair bit of "common" tooling like load balancers and events/log system. I'm about to throw in the towel since it seems too complicated to get going for a brand new project.

From my impression of the recent buzz about Microservices, and having spent the last year building them, Microservices shine as a refactoring approach for large/monolithic applications.

I'm glad to see this because I cofounded a company to solve exactly this problem. Link is in my profile if you're interested, we just launched.

Thx .. I'll look into it. I recently ran into this talk which confused the heck out of me: https://yow.eventer.com/yow-2014-1222/implementation-of-micr...

Notice his use of Kafka and 0mq to create an SOA for microservices. All of the stuff I had seen previously had the services communicate with each other via REST. So everything is synchronous. In the talk whose link I posted above, communication is asynchronous via messages. Is it reasonable to do both?

Also, how the heck does one deal with transactions across services?

Sure, it's possible/reasonable to do both. I work with asynchronous Microservices using RabbitMQ where PubSub shines in that many services can respond when the application publishes a message (1 service for email, 1 for push notifications, 1 for capturing data for analytics, etc). But it is fire and forget as far as the application is concerned.

Compare that to synchronous services where the application has to be aware of all of them and coordinate calling them all. It is easier to use synchronous when the application requires an output from the service to respond.
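A rough sketch of the fire-and-forget shape, using an in-process stand-in for a broker. This is not RabbitMQ's API: the `Broker` class, the topic name, and the handlers are all hypothetical, and delivery here is a plain function call rather than a queued message.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[dict], None]


class Broker:
    """Toy in-process stand-in for a message broker (illustration only)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> None:
        # Fire and forget: the publisher gets nothing back and does not
        # know how many services, if any, are listening.
        for handler in self._subscribers.get(topic, []):
            handler(message)


log: List[str] = []
broker = Broker()

# Three independent "services" react to the same event.
broker.subscribe("order.placed", lambda m: log.append(f"email to {m['user']}"))
broker.subscribe("order.placed", lambda m: log.append(f"push to {m['user']}"))
broker.subscribe("order.placed", lambda m: log.append(f"analytics for order {m['order_id']}"))

# The application publishes once and moves on.
broker.publish("order.placed", {"order_id": 42, "user": "alice"})
print(log)  # ['email to alice', 'push to alice', 'analytics for order 42']
```

Adding a fourth service means one more `subscribe` call; the publishing code does not change, which is the contrast with the synchronous case.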

Transactions across services are certainly not easy:

"Distributed transactions are notoriously difficult to implement and as a consequence microservice architectures emphasize transactionless coordination between services, with explicit recognition that consistency may only be eventual consistency and problems are dealt with by compensating operations." [1]

[1] http://martinfowler.com/articles/microservices.html
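One way to picture the "compensating operations" from that quote, as a minimal sketch: each step is paired with an undo action, and if a later step fails, the undo actions of the steps that already succeeded run in reverse order. The step names and the `run_with_compensation` helper are made up for illustration, and a real system would also have to handle a compensation that itself fails.

```python
from typing import Callable, List, Tuple

# Each step is a pair: (action, compensation that undoes the action).
Step = Tuple[Callable[[], None], Callable[[], None]]


def run_with_compensation(steps: List[Step]) -> bool:
    """Run steps in order; on failure, undo the completed ones in reverse."""
    completed: List[Callable[[], None]] = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)
    return True


events: List[str] = []


def reserve_stock() -> None:
    events.append("stock reserved")


def release_stock() -> None:
    events.append("stock released")


def charge_card() -> None:
    raise RuntimeError("payment declined")  # simulated failure


def refund_card() -> None:
    events.append("card refunded")


ok = run_with_compensation([(reserve_stock, release_stock), (charge_card, refund_card)])
print(ok, events)  # False ['stock reserved', 'stock released']
```

Note that the failed step's own compensation (`refund_card`) never runs, since the charge never happened; only work that actually completed gets undone.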

My main advice to anyone considering writing a microservice based architecture from scratch is to keep a really tight handle on code duplication and testing. Also take a deep look into Erlang, it's written with these types of systems in mind.

I’m having a really hard time understanding the pushback against a microservice architecture. Done properly, I can’t see the difference between building an app using a microservices architecture and a mashup. Am I being naive?

I think the big problem with the microservice movement is that it is easy to understand but difficult to implement. It reminds me of when OO was new on the block and people designed these really elaborate OO hierarchies only to have them break down over time. Then came "OO SUCKS" because there was a level of experience required to understand what level to take your modeling to.

The same thing is happening with microservices. Engineers are microservicing ALL THE THINGS at such a fine-grained level that it becomes a nightmare to maintain, orchestrate and manage. Therefore "microservices suck!".

Most of the time, you can model your domain into a few key areas, say 'customers/security', 'interface' and 'processing'. That's a good 3 service start. You may never need to go beyond that. However, as your needs grow or change, you can start to refine your model based on changing business needs or scale/performance/infrastructure issues.

In my experience it's a completely logical way to design a system and is really no different than making 'libraries' of code all housed under a single master application. The only real difference is the underlying communication infrastructure.

I think of it as a "where do you want complexity?" tradeoff in this whole debate. Monolithic codebases have complex code but easy deployment and monitoring and coordination (in the sense you deploy and manage one big thing). Microservices have simple code individually but have complex deployment, monitoring and interface coordination. As my team (40 people) moved toward microservices, we spent extra effort ensuring passive changes. Then we have to update and release the multiple services that need updates to gain some new feature. What used to be complex code updates is now complex packaging and intricate versioning. Less code changing...but in more places.

For several reasons, I think our move toward microservices is a good one. But in our case I have seen complexity move from code to coordination.

I think the pushback can be summarized by the fact that there are more moving parts with a microservice architecture. Harder to test, more things that can break, more coordination for deployments.

Are you referring to mashups in that the application is using external APIs? I'd say that's similar with the difference being you don't have to coordinate the deployment part. With mashups you still have the testing challenge and reliance on another system to be running that comes with a microservice architecture.

I guess I'm confused about why you would need to coordinate deployments. In my mind each microservice is completely decoupled from any other. There are a large number of tools to automate the process of testing and deployment. I think we as developers understand the need to code defensively and we understand how to build applications that have large amounts of asynchronous operations. I'm also not sure why people keep insisting on using messaging as a way to allow one microservice to communicate to another. Wouldn't exposing a RESTful interface over HTTP work well enough? The concept seems so simple and refreshing. What am I missing?

Interesting, I feel like I've seen much more literature about RESTful microservices than message oriented approaches. I've done it both ways and I'm of the opinion that RESTful services are more complicated for a couple reasons:

* All RESTful services must be up and running for application to be fully up and running
* application must have all knowledge of RESTful services it calls

With messaging (and PubSub), services don't need to be running at all times and you can add as many services as you'd like without the application needing to know. The application just says "hey, something happened" and services go to work.

I agree with you: in either case, it is important to code defensively and be aware of possible request version mismatches. Deployment coordination is probably a wash between the 2 approaches. I think testing is more of a challenge with message oriented too, as most integration testing tools are geared towards HTTP interactions.
