I'd be concerned not just about the overhead of managing lots of different (though simple) services, but also the fact that you give up a lot of convenient & useful features that you could get for free with a monolithic application, such as transactions. You either have to avoid needing transactions between microservices, or deal with the complexity of coordinating distributed transactions.
That seems like an awfully big hurdle to overcome for a new application, especially if you're a startup that needs to focus on delivering customer value as quickly as possible.
You can emulate transactions to a certain extent by marking things as "incomplete" (which may result in things being invisible in some contexts) until they can be either marked as complete across all services or aborted. Marking them complete cannot be atomic, but it can be "atomic enough". Versioning may be an option.
Sometimes you need "all or nothing" state management in a single microservice: you want to do "POST /object" multiple times and then cancel all of them on failure, for example. In one app we have a data import process which populates a microservice with a complex data model with many 1:n and n:m relationships. Instead of doing multiple REST calls (POST and so on), we build a "batch" object, which is essentially a JSON document. This batch lives in the microservice and acts like a persistent transaction. When we want to apply the changes, we tell the service to commit the batch. If the batch cannot be applied because it conflicts (i.e., changes have been made by someone else since the batch started), the entire batch is dropped and the import starts from scratch. It's not terribly elegant or efficient, but it's fairly simple.
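To make the batch idea concrete, here is a minimal in-memory sketch. It is not the actual service described above: `Store`, `Batch`, `BatchConflict` and `run_import` are hypothetical names, and it assumes conflicts are detected with a single store-wide version counter, which is only one of several ways to do it.

```python
from dataclasses import dataclass, field


class BatchConflict(Exception):
    """Raised when the store changed after the batch was opened."""


@dataclass
class Batch:
    # Version of the store at the moment the batch was started.
    base_version: int
    changes: dict = field(default_factory=dict)

    def add(self, key, value):
        # Staged only; nothing is visible in the store until commit.
        self.changes[key] = value


@dataclass
class Store:
    # Hypothetical stand-in for the microservice's data.
    objects: dict = field(default_factory=dict)
    version: int = 0  # bumped on every successful write

    def put(self, key, value):
        # A direct write made by "someone else".
        self.objects[key] = value
        self.version += 1

    def begin_batch(self):
        return Batch(base_version=self.version)

    def commit(self, batch):
        # All or nothing: reject the whole batch if anything changed
        # since it was started.
        if batch.base_version != self.version:
            raise BatchConflict("store changed since batch started")
        staged = dict(self.objects)
        staged.update(batch.changes)
        self.objects = staged
        self.version += 1


def run_import(store, rows, max_attempts=3):
    """Build a batch, commit it, and start from scratch on conflict."""
    for _ in range(max_attempts):
        batch = store.begin_batch()
        for key, value in rows:
            batch.add(key, value)
        try:
            store.commit(batch)
            return True
        except BatchConflict:
            continue  # drop the batch and rebuild it
    return False
```

A store-wide counter is the crudest possible conflict check: any concurrent write kills the batch. Per-object versions would reject fewer batches at the cost of more bookkeeping.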
Jeppe Cramon explains it pretty well in http://www.tigerteam.dk/2014/micro-services-its-not-only-the...
If you have a goods purchase service and a goods delivery service, you want to reserve the good (limited quantity) and reserve a time slot for transport (limited number of deliveries a day) before you charge the customer, and make the "commit".
And if you'd merge two separate companies into one uber-service, you've just created more problems than you've solved.
Starbucks is a good analogy (http://www.eaipatterns.com/ramblings/18_starbucks.html). If Starbucks were transactional, you would have to stand there with the money on the counter and wait until your drink was finished, then exchange at the exact same time. Even the mental image is ridiculous.
In practice, it makes much more sense to allow the system to enter certain inconsistent states and then remediate them. Out of an ingredient? Refund the money. Can't pay? Pull the drink out of the queue, or just write it off. This can involve some cost, but less cost than the throughput you'd lose by enforcing consistency.
For your specific example of a goods delivery service, most take-out restaurants fly in the face of these claims: you can pay the driver in cash when they arrive. They are willing to accept the risk that you made your order fraudulently, won't pay, etc. because it's still better for them to have your business.
If you reserved a time-slot or inventory for someone whose credit card is declined, so what? Just un-reserve it. It's not hard to UPDATE a row in a database. The system was in an inconsistent state for 10 seconds. Big whoop. (You might have turned down a legitimate customer, but only if they were consuming the last resource, and if demand is that high, another customer will buy the item.)
Notice that Ticketmaster and most airlines will hold your seats (i.e. leave the system in an inconsistent state) for up to 15 minutes while you enter your card details. Hell, Ticketmaster can easily reverse a transaction at any point up until the moment your ticket is scanned at the door. So could an airline, and they often do.
The classic example of account balances isn't necessarily the case either - banks can INSERT records of individual transactions. Since addition is commutative, race conditions are irrelevant. Balances are then calculated nightly offline by playing back the transaction records. I guess that's a form of locking, but even if it weren't, the balance isn't the source of truth - the transactions are, and the balance can be recomputed.
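A rough sketch of the "transactions are the source of truth" idea follows. It makes no claim about how any real bank does it: `Ledger` is a hypothetical name, and an in-memory list stands in for an insert-only table.

```python
from decimal import Decimal


class Ledger:
    """Append-only record of transactions; balances are derived, not stored."""

    def __init__(self):
        self._entries = []  # rows are only ever appended, never updated

    def record(self, account, amount):
        # The equivalent of an INSERT: no read-modify-write on a balance row.
        self._entries.append((account, Decimal(amount)))

    def balance(self, account):
        # Replay every entry for the account. Addition is commutative,
        # so the order in which entries arrived does not matter.
        return sum(
            (amount for acct, amount in self._entries if acct == account),
            Decimal("0"),
        )
```

Two writers that append in a different order end up with the same balance, which is why the race conditions stop mattering for the sum itself. It says nothing about enforcing a minimum balance, which is a separate problem.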
Banks also don't hide the complexities of transaction processing. Your "current balance" is different from your "available balance" and your most recent transactions show as pending anyway.
You don't need transactions as often as you think you do.
I don't know what practice you're referring to, but unlike commodity coffee drinks paid for in cash: 1) CC refunds are not free, and they're not cheap at all in volume if you intend to issue them casually during normal operation; 2) not all goods are standardized and available in large quantities.
Your examples are all over the place. Ticketmaster is an example of a reservation system similar to one I was trying to give an example of (two step commit). A resource is locked, the lock is held for a short period while collecting answers from the other subsystems (in this case, payment gateway), and then a final commit is issued (or a rollback is issued).
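The reserve, collect answers, then commit or roll back flow can be sketched roughly as below. Everything here is hypothetical: `SeatInventory`, `purchase` and the `charge` callback are illustration names, the 900-second hold just mirrors the 15 minutes mentioned earlier in the thread, and the clock is passed in as `now` to keep the example deterministic.

```python
class SeatInventory:
    """Holds a seat for a limited time while other subsystems answer."""

    def __init__(self, seats, hold_seconds=900):
        self.available = set(seats)
        self.holds = {}  # seat -> timestamp at which the hold expires
        self.sold = set()
        self.hold_seconds = hold_seconds

    def _expire(self, now):
        # Release any hold whose time has run out.
        for seat, expires in list(self.holds.items()):
            if expires <= now:
                del self.holds[seat]
                self.available.add(seat)

    def reserve(self, seat, now):
        # Step 1: lock the resource so nobody else sees it as available.
        self._expire(now)
        if seat not in self.available:
            return False
        self.available.remove(seat)
        self.holds[seat] = now + self.hold_seconds
        return True

    def commit(self, seat, now):
        # Step 2a: final commit, only valid while the hold is still alive.
        self._expire(now)
        if seat not in self.holds:
            return False
        del self.holds[seat]
        self.sold.add(seat)
        return True

    def rollback(self, seat):
        # Step 2b: give the seat back.
        if seat in self.holds:
            del self.holds[seat]
            self.available.add(seat)


def purchase(inventory, seat, charge, now):
    """charge is a callable standing in for the payment gateway."""
    if not inventory.reserve(seat, now):
        return "unavailable"
    if charge():
        inventory.commit(seat, now)
        return "sold"
    inventory.rollback(seat)
    return "declined"
```

In a real flow the charge can take long enough for the hold to expire, so a failed `commit` after a successful charge would need a refund or a void. The sketch leaves that case out.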
Airline overselling isn't done because it was some microservice design dogma about how great inconsistent state is, but because every seat costs the airline a fortune if left empty, and a certain % of passengers cancel or reschedule their tickets, and the airline is trying to arrive at an airplane with as few empty seats as possible. Having your tickets canceled is certainly not something that happens "often", thank god, but it does happen as a result of that tradeoff.
But if I reserve and buy my seats online for a cinema movie, and then I go with company and get handed my money back because "it's practical", I'll make a scene. And so no one implements cinema ticket reservation this way.
For bank overdrafting, it's a very special case - your money is a number in a computer, and the bank owns that computer. They make the rules... so they did. It's easy to mess around with numbers like that. Bank account overdrafting is probably the biggest exception of them all as no physical products and services are involved. No one's going to have their lawn un-mowed because the bank allowed your account to overdraft.
The only common thing between your examples is that they're driven by business concerns, not some ivory tower concern about service design. And this is why they're so different, and reserving resources is and will remain a common practice for many, as long as the business logic calls for it. There's nothing wrong with it.
In the Ticketmaster example, I would be able to get a Ticket resource assigned to me before paying for it. But the ticket would be "locked" (in the sense that I couldn't print the barcode) until the Payments service marked it as okay, which would indeed be an example of one service locking another's resource.
I thought you were claiming that the action needed to be performed in a single database transaction, but you don't need or want isolation here. You want the world to observe inconsistent state in this case (i.e. no one sees my seat as "available" while I'm fumbling around with my credit card number). In which case it's perfectly practical to implement transactionality at the application layer, between microservices, rather than at the database level. But that point was never in contention. I apologize.
Obviously there's not a dogmatic preference for inconsistent state, just an acknowledgement that ACID properties are not necessarily needed as often as some people think they are.
Also, obviously, you have to be careful with your failure modes. Charging someone without delivering shouldn't happen often but you don't need to design your entire infrastructure around making it impossible, given that refunds (and simply not capturing the charge you authorized) are both relatively easy to do, even if you want to avoid doing them all the time.
I really don't feel there's a need for us to separate "native" database transactions and app-level transactions. They're both implemented using the same underlying principles. But I've noticed people see a huge difference between them in blog posts, articles and conversations.
I think it reveals a kind of thinking that database transactions look like magic, while those we roll ourselves... we see all the ugly parts of the sausage factory there, and it no longer feels as "atomic" or magical as what databases expose as an encapsulated abstraction.
If you're creating objects independently in multiple microservices during signup, and one of those fails, you've then got to deal with rolling back any changes that have already completed with other services.
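One common way to handle that rollback is to pair every step with a compensating action and undo the completed steps in reverse order when a later one fails. A minimal sketch, assuming each service exposes some way to undo what it did (`run_signup` and the `(action, compensate)` pairs are hypothetical):

```python
def run_signup(steps):
    """Run (action, compensate) pairs in order.

    If an action raises, undo the steps that already completed,
    newest first, and report failure.
    """
    completed = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)
    return True
```

This is not atomic: other callers can observe the half-created state before the compensations run, and a compensation can itself fail, so in practice the undo calls need retries or some way to flag the record for cleanup.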
Maybe traditional web applications just aren't a good use case for microservices?
As I said in my original comment:
> That seems like an awfully big hurdle to overcome for a new application, especially if you're a startup that needs to focus on delivering customer value as quickly as possible.
I'm really curious to hear if/how others are able to leverage the benefits of microservices yet avoid the perceived complexity that goes along with it.
Edit: I was reminded that Pat Helland (from Amazon) wrote a famous paper "Life Beyond Distributed Transactions" which gives some great guidelines on how to build an SOA that avoids reliance on distributed transactions: http://www-db.cs.wisc.edu/cidr/cidr2007/papers/cidr07p15.pdf
I still think this is clearly quite a bit more complex and harder to develop than e.g. a single Rails app, so I'm still very curious to hear when this approach starts making sense.
As for your concerns... you know how every time some good idea pops up people have to ruin it by pushing it to ridiculous extremes? Case in point, microservices.
You don't have to make things so modular that you give up SQL, transactions, or anything. With experience you'll naturally start finding where the domain of each microservice falls, and coordinating between them won't be a problem.
I strongly disagree with the poster who said that having joins means it's not a microservice anymore. That's nonsense. A microservice is defined by what it does, not how it does it.
Even the simplest service might be managing several entities that are in some kind of relationship. If the entities in one service are not in a strong relationship with one another, it's a sign you can split them into two services. But if two microservices talk to each other so extensively that the service boundary is becoming a bottleneck, it's a sign that they should be one service.
Do not break down a service into several services just because it manages 2-3 entities. That's counterproductive, and it'll be the topic of DHH's upcoming blog post "Why microservices suck" sometime in 2017.
In other words it's best to break down services by Bounded Contexts.