Hacker News
Eric Brewer on Why Banks are BASE, Not ACID – Availability Is Revenue (highscalability.com)
134 points by abhijitr on May 2, 2013 | 48 comments

>If an ATM is disconnected from the network and when the partition eventually heals, the ATM sends a list of operations to the bank and the end balance will still be correct.

I don't think so. I support ATM client software for a large bank in the US and we certainly don't do this. This may be true for "remote" ATMs that are installed in convenience stores on POTS. I can't say I've ever actually heard of it though - the main problem with this idea is that cards cannot be authenticated without network access, and just spewing out money to every piece of plastic calling itself a card when your network connection has been dropped isn't really a recipe for success. Fraud is a real problem.

The ATM client software I support cannot do any transactions without a connection with its authorization system. That authorization system though, can stand-in for the various accounting systems and external networks up to pre-defined limits. So for example if for some reason we can't reach the checking account system we'll authorize up to $xxx total for the day on a stand-in basis. The transaction with the authorization system is definitely ACID; the ATM will not get a response code authorizing a withdrawal unless the transaction has been recorded in the authorization system. The account system may well be caught up later. The funny thing is, ACID is a property of individual database systems and it has absolutely nothing to do with a question of whether two separate ledgers are guaranteed to be changed together or not at all. That would be the job of a distributed transaction coordinator - and those really are not used very much in banking. Instead there is a protocol of credits and debits and a settlement process to work out the exceptions. Maybe this is what the article was trying to say up to a point but they sort of confused the issue between the point of view of the ATM and the accounting systems of record.
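For anyone who wants the stand-in behaviour spelled out, here is a rough Python sketch. Everything in it is an assumption made for illustration: the `Authorizer` class, its field names, and the $300 cap are hypothetical (the comment above only says "$xxx"), and a real authorization system would journal to durable storage rather than a list.

```python
from dataclasses import dataclass, field


class AccountSystemUnavailable(Exception):
    """Raised when the system of record cannot be reached."""


@dataclass
class Authorizer:
    stand_in_limit: int       # hypothetical daily stand-in cap, in dollars
    balances: dict            # plays the role of the remote account system
    account_system_up: bool = True
    journal: list = field(default_factory=list)         # durable log in a real system
    stand_in_today: dict = field(default_factory=dict)  # per-card stand-in total for the day

    def _debit_account(self, card, amount):
        """Ask the account system to debit; fails loudly if it is unreachable."""
        if not self.account_system_up:
            raise AccountSystemUnavailable()
        if self.balances.get(card, 0) < amount:
            return False
        self.balances[card] -= amount
        return True

    def authorize(self, card, amount):
        """Return True only after the approval has been journaled."""
        try:
            approved = self._debit_account(card, amount)
            mode = "online"
        except AccountSystemUnavailable:
            # Stand-in: no balance check, only the pre-defined daily cap.
            used = self.stand_in_today.get(card, 0)
            approved = used + amount <= self.stand_in_limit
            mode = "stand-in"
            if approved:
                self.stand_in_today[card] = used + amount
        if approved:
            # Record first, respond second: the ATM never sees an approval
            # that is missing from the authorization system's journal.
            self.journal.append((card, amount, mode))
        return approved


auth = Authorizer(stand_in_limit=300, balances={"card-1": 50})
auth.account_system_up = False
print(auth.authorize("card-1", 200))  # True: approved on stand-in despite the $50 balance
print(auth.authorize("card-1", 200))  # False: would exceed the stand-in cap
```

The stand-in entries in `journal` are what the account system would be "caught up" with later through settlement.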

> just spewing out money to every piece of plastic calling itself a card when your network connection has been dropped isn't really a recipe for success.

Well they do this in Australia... http://www.news.com.au/money/banking/computer-glitch-hits-cb...

> "People were running past me screaming 'Free money! Free money!'," Punchbowl Pharmacy manager Feriale Zakhia said of the people using a nearby ATM.

> "Everyone was so happy. They were running around with huge smiles."

> [A technical problem] forced the bank to put all of their ATM machines into offline mode. Customers had no access to their account balance but were still able to withdraw money - more than their accounts held.

> Those withdrawal limits are up to $2000 a day for holders of keycards and debit Mastercards.

> "No one has received free cash," Mr Fitzgerald said. "What they've done is overdrawn their accounts. We will be following those people up and recovering that money."

Leaving aside the fact that I wouldn't trust a newspaper for the technical details of something like this, nothing in the article contradicts what jeremyjh said.

In that case, the ATM was disconnected from the accounting system and allowed withdrawals up to a set limit ($2000), but it (probably, the article is unclear) was still connected to the authorization system.

It (probably) still checked your PIN, and checked whether your card had been cancelled, etc. It just didn't connect through to check your balance.

As Jeremy said: "if for some reason we can't reach the checking account system we'll authorize up to $xxx total for the day on a stand-in basis"

In this case some reason == "[A technical problem]" and xxx == $2000

Which in turn agrees with the point of the original article. ATMs use BASE, not ACID, as it's more profitable to be available.

Those humungous overdraft fees will definitely be profitable :)

BTW I've been scammed out of money by ATMs before - money was withdrawn from my account but some system jammed and I didn't get the money - and the bank was awfully uncooperative.

So far I've had more luck with the "money under the mattress" method than with banks - and I wasn't trapped in the "Corralito" or other bank-aided money-stealing schemes


I see that you are from Uruguay, were you in Argentina at the time of the Corralito? ITT Uruguay is more trustworthy in banking terms.

Uruguay had a smaller Corralito (and I was just starting out at the time, so I had no money in the bank).

Ecuador and Brazil also had their own versions. In the Uruguayan version, they didn't forcibly exchange the money, but they froze all bank assets for 3 years (losing out on interest, investment opportunities, exchange rates, etc...).

Uruguay is more trustworthy (especially with foreign investment) but it's not above such things.

Currently there's a big scare due to the huge exchange rate disparity with Argentina - which has an "official" exchange rate and a "real" exchange rate which is almost double the official one, and makes Uruguay non-competitive.

Edit: you're from Argentina, that's obviously not news for you :)

Reading this, it doesn't sound like you said anything different. Sure, it isn't the ATM itself that makes the decision, but the authorization system can still step in and allow a transaction that is not committed to the actual account's log.

I'm sure that, under the hood, there are a lot of ACIDic transactions going on, but, stepping back, it still looks pretty BASEic. When I hit "withdraw $200", there is no guarantee that my actual account has a transaction commit for that amount. Instead, there may just be a log message saying "SoftwareMaven withdrew $200".

Yes and I acknowledged this at the end of my comment I think.

I don't really think it is useful to try and use the terms ACID/BASE to refer to the aggregate process behavior of an entire industry's technology. Not that it is wrong, it just really doesn't mean anything at that point. Most data interchange that takes place between thousands of different parties is going to have similar characteristics. Maybe a market/exchange is a good counter-example but I can't think of many others.

That's smart, but what he's saying is plausible. First of all significant parts of the world use chip-and-pin cards that can authenticate the pin locally. Not that they are hack proof by any means, but considerably better than mere magnetic strips like we still have here in the US. Moreover, if the offline mode is ephemeral and unpredictable, then it's less prone to exploitation. Again, not immune and not the safest thing for an ATM vendor to support, but the overall fraud risk could conceivably be within their comfort zone.

An offline mode might be ephemeral and unpredictable as a natural occurrence, but should be pretty easy to create...

I think you're missing the forest for the trees here. The point of the article is not the specific semantics of ATMs, but rather the whole system of banking and how it accounts for CAP theorem.

> the main problem with this idea is that cards cannot be authenticated without network access, and just spewing out money to every piece of plastic calling itself a card when your network connection has been dropped isn't really a recipe for success. Fraud is a real problem.

I believe some of the first ATMs actually worked offline (the PIN was encoded on the magnetic stripe), but networked models came out a few years later. Of course, this was in the late 1960s, when card readers (and the expertise to use them) were far harder to obtain.

> Instead there is a protocol of credits and debits and a settlement process to work out the exceptions.

Exactly--the ecosystem as a whole is BASE, but the individual systems are ACID, and the "eventual consistency" aspect is implemented as first-class logic in applications/processes.

I think BASE is a good ecosystem-level principle, but when it gets into datastores, then each individual system doesn't know what its true opinion of the world is, much less how it can effectively coordinate with other similarly potentially confused systems.

With any significantly complex architecture, there are many ways of being off-line. You may be unable to check a balance, but able to validate a PIN, or unable to make a transfer, yet perfectly capable of doing every other transaction. In that situation, if you can validate the PIN without being able to check the balance and you know this type of card is issued to clients with a certain overdraft limit, it's safe to clear the transaction and just tell the backend it happened when all remote functions are back online.

Pretty much all POS systems have the ability to work in offline mode. The vendor can set the offline transaction amount to whatever they want, including disabling it. Typically, it is set to $75. If you go to a convenience store and they say "the system is slow today.. but it works", now is your chance to get away with < $75 worth of crime.

Source: I am a former software engineer for a credit card transaction system vendor.
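A minimal sketch of that offline floor-limit rule in Python, for illustration only. The function name, the return strings, and the choice to treat the limit as inclusive are all assumptions; only the configurable limit, the option to disable it, and the typical $75 value come from the comment above.

```python
from typing import Callable, Optional


def pos_decision(amount_cents: int,
                 authorize_online: Optional[Callable[[int], bool]],
                 offline_floor_cents: int = 7500) -> str:
    """Decide a sale. Pass authorize_online=None when the network is down.

    offline_floor_cents=0 disables offline approvals entirely.
    """
    if authorize_online is not None:
        # Normal path: the issuer (or processor) makes the call.
        return "approved" if authorize_online(amount_cents) else "declined"
    if offline_floor_cents > 0 and amount_cents <= offline_floor_cents:
        # Accepted without authorization; stored and forwarded later.
        return "approved-offline"
    return "declined"


print(pos_decision(5000, None))    # approved-offline
print(pos_decision(10000, None))   # declined
```

The vendor is effectively pricing the fraud risk: anything under the floor is cheap enough to eat if it bounces.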

It certainly appeared that way to me as well. Although I don't really use ATMs much today, it used to be a fairly regular occurrence to arrive at an ATM then have to find the "next closest ATM" due to the machine having connectivity problems.

Sorry, but Eric Brewer is wrong. He seems to be implying that there was some sort of intelligent design behind the software at banks that led them to choose BASE. This couldn't be further from the truth.

Banks are one software kludge after another in attempt to not rewrite something new or offer the consumer anything of value... while paying out the butthole to whatever vendor has his arm shoved so far up your ass you can't ever migrate from their platform without colon replacement surgery.

So while it's a nice thought "Hey look Banks/ATMs are BASE" this was by complete accident through years of incompetence and corporate bureaucracy, not by any sort of engineering choice.

That's not true. I've worked for a company that makes banking software and we did tons of migrations. The company itself changed its own software from RPG in as400 to windows forms and now it's all web with servers in Java and .Net (they use a middleware language which generates in every new platform). They even made a transitional install for one branch of one of the most important banks. The branch used my former company's solution for a few years until they could use their own software, which had to be adapted for the new market.

Banks used to work on paper and did fine; software migration, although it can take some years, is no problem for them.

Each bank is different. Some banks have shitty IT and other ones have dynamic IT that can adjust to fit changing realities.

The reality is that you and Eric are right.

Banks depend on techniques based on double book keeping accounting to reconcile accounts at end of day. Different data about transactions are stored in different places by different organizations and they compare books to make sure that balances are correct.

You cannot depend on every transaction to be recorded perfectly. You must have the ability to compare books and reconcile accounts. This is simply how the world works.

Trying to make everything perfect and depending on storing data in a central place with the assumption that it's always going to be consistent is too much of a liability. It doesn't work because the systems required by modern financial systems are incredibly complex and availability during markets is the highest priority. You ARE going to have faults and you ARE going to have problems. The ability to take hits gracefully and give yourself time later on to fix stuff after the fact must be built into your systems.
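To make "compare books and reconcile" concrete, here is a small Python sketch. The record shape (a transaction id plus an amount in cents) and the `reconcile` function are assumptions for illustration; real end-of-day reconciliation works on much richer records and has tolerance and dispute rules that are not modelled here.

```python
def reconcile(ours, theirs):
    """Compare two end-of-day books of (txn_id, amount_cents) records.

    Returns (missing_from_theirs, missing_from_ours, amount_mismatches),
    each a sorted list of transaction ids to be investigated.
    """
    ours_by_id = dict(ours)
    theirs_by_id = dict(theirs)
    missing_from_theirs = sorted(ours_by_id.keys() - theirs_by_id.keys())
    missing_from_ours = sorted(theirs_by_id.keys() - ours_by_id.keys())
    mismatched = sorted(
        txn for txn in ours_by_id.keys() & theirs_by_id.keys()
        if ours_by_id[txn] != theirs_by_id[txn]
    )
    return missing_from_theirs, missing_from_ours, mismatched


atm_log = [("t1", -2000), ("t2", -4000), ("t3", -6000)]
bank_book = [("t1", -2000), ("t2", -4500), ("t4", 10000)]
print(reconcile(atm_log, bank_book))  # (['t3'], ['t4'], ['t2'])
```

Nothing here prevents a fault; it only guarantees that a fault shows up as an exception someone has to work, which is the point being made above.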

Well given that his thesis is that banking has always worked this way, and since banking is older than computers, I don't think there was any implication here that it was an engineering choice.

... and why would they do that?

This is a little misleading. Transactions are used primarily to prevent inconsistent data, not global data consistency.

The ATM network is distributed and eventually consistent, and financial transactions in general are not real time.

Within an ATM or within a bank you can be damn sure transactions are used widely to prevent inconsistent data.

This is one of those cases where deciding on whether the system is ACID or not depends entirely on where you draw the system boundaries.

If you draw it at the boundary of the central general ledger, it's going to be ACID.

If you look at the way transactions pass through several intermediate systems (each of which is ACID) en route, I guess it could be called BASE.

I think the point is that people who say it is impossible to build a banking system without full transaction support for every action are ignoring the reality that transactions are not guaranteed to occur, but the actions themselves are. If, at a level of abstraction, the system can be said to be a BASE, then it is probably true that the underlying data stores are not required to be ACIDic. Whereas everything I've read has always said "you can't do banking without ACID".

Whether that is true or not is a much deeper discussion than three or four paragraphs on a blog.

> If, at a level of abstraction, the system can be said to be a BASE, then it is probably true that the underlying data stores are not required to be ACIDic.

I don't think this follows. While the system as a whole might be eventually consistent (where consistency is defined as: what's in the General Ledger), it doesn't follow that you relax constraints on the parts.

The individual components are generally ACID and the steps to move data are ACID as well. The only thing that prevents the whole system from being ACID is that transactions don't go immediately from POS/ATMs/card clearance/cheque clearance into the General Ledger, but instead must go through a series of intermediate transactions. But those intermediate transactions must, themselves, be ACID.

That's why I talked about how the boundaries matter. The final central accounts are ACID, the subsystems are ACID and the data transfers are ACID.

edit: though to contradict myself, I expect that there will be counterexamples in different banks where some stores or steps will not be strictly ACID, but will have been "good enough" or with sufficiently-acceptable workarounds that they haven't been upgraded. I don't think this fatally breaks my argument, but YMMV.
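A sketch of what one ACID hop in such a chain might look like, using Python's built-in sqlite3 module. The table names, columns, and `post_next` function are hypothetical, and the big simplification is that both the pending queue and the ledger live in one database; in a real bank they would be separate systems, with each hop being its own local transaction.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a pending queue and a ledger."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pending (id INTEGER PRIMARY KEY, account TEXT, amount INTEGER)")
    conn.execute("CREATE TABLE ledger (txn_id INTEGER PRIMARY KEY, account TEXT, amount INTEGER)")
    return conn


def post_next(conn):
    """Move the oldest pending record into the ledger; return its id or None.

    The insert and the delete commit together or not at all.
    """
    with conn:  # commits on success, rolls back if anything raises
        row = conn.execute(
            "SELECT id, account, amount FROM pending ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        txn_id, account, amount = row
        conn.execute(
            "INSERT INTO ledger (txn_id, account, amount) VALUES (?, ?, ?)",
            (txn_id, account, amount),
        )
        conn.execute("DELETE FROM pending WHERE id = ?", (txn_id,))
        return txn_id


conn = make_db()
conn.execute("INSERT INTO pending VALUES (1, 'acct-1', -200)")
conn.commit()
print(post_next(conn))  # 1
print(post_next(conn))  # None: nothing left to post
```

Between hops the ledger lags the ATM, so the whole chain is only eventually consistent, but a record is never half-moved.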

I think "if the system as a whole doesn't require ACID, maybe the pieces don't" is correct and useful, but it still requires looking at the pieces and seeing if that's the case. In this case, I think that the system is relying on the ACIDity of some components to ensure Eventual consistency - it's conceivable that an alternate method might not, but one would have to be proposed and evaluated.

I think this is reasonable; but then the problem becomes that while consistency may be eventual there are nevertheless fixed deadlines to meet. Soft realtime consistency isn't good enough when the annual report has to be printed.

Personally, I feel that ACID is an abstraction achievable only within single, non-distributed systems. Not a very compelling insight, I hear you say.

Well no, but ACID is a tremendously advantageous state of affairs and I feel it should be surrendered only begrudgingly. I think it is better to repair it than to abandon it wholesale at the first sign of mild inconvenience.

Even though it is, in a physical sense, untrue, it is a useful untruth. Newtonian physics is wrong. It's also what we use to build bridges.

If you pick subsets of the transaction, then Cassandra is ACID in part because it guarantees that a write leaves an entry in a node's log. A banking system only guarantees a log of all increments and decrements, not agreement of the current balance throughout the system. So, as a whole, the system is definitely BASE and not ACID.
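The "log of increments and decrements" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: entries are (id, amount) pairs, ids are globally unique, and the `balance` and `merge` helpers are hypothetical names, not anything from a real banking system.

```python
def balance(entries):
    """A balance is derived by summing the log, not stored as a fact."""
    return sum(amount for _entry_id, amount in entries)


def merge(log_a, log_b):
    """Union two logs by entry id (ids are assumed globally unique)."""
    merged = dict(log_a)
    merged.update(log_b)
    return sorted(merged.items())


# Two parts of the system have each seen a different subset of the entries.
branch = [("e1", 500), ("e2", -200)]
atm = [("e1", 500), ("e3", -100)]

print(balance(branch), balance(atm))  # 300 400: they disagree right now
print(balance(merge(branch, atm)))    # 200: they agree once the logs are exchanged
```

Because the merge is a union keyed by id, the order in which the logs are exchanged does not matter, which is what lets the balances converge later.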

So we're back to a boundary argument, again. Partisans will draw the boundaries as it suits their argument.

Though, I do need to emphasise a point:

> subsets of the transaction

Each step in moving the information from the ATM to the General Ledger is itself a transaction. There are no "subsets of a transaction". If it's divisible, then it's not a transaction (this is the atomicity requirement of ACID).

Yes, each step is a transaction, but the whole is not consistent at any one moment. Availability was chosen over consistency all the way up.

We see this pattern all over the place. The primary example is any place that accepts checks as payment. Availability is immediate, but consistency is not necessarily there.

You see it with a business making a PO. Lots of places will still take credit cards which are only processed daily (ok, this is becoming rare where there is cell service.)

Yes. I'd restate my original argument but I'd seem more than usually repetitious :D

I'd tend to agree in a general sense - yeah when faced with failure in a WAN environment, it makes sense to attempt to continue when at all possible.

However, banks are one of the biggest purchasers of ACID systems... They still bet heavily on Oracle, and when they want their accounting systems to run, and balance transactions, they don't rely on "BASE" systems. Banks are also heavily dependent on business cycle and batch processing. Daily batches are common in credit card processing (end of business day settlement for example), and also in general bank systems.

Also eventual consistency means different things. Some systems have an "eventually inconsistent" property to them (eg: Cassandra/Dynamo), and I'm pretty sure banks would NOT be ok with that.

I respect Brewer, he is a smart guy, but he is extrapolating too much from one small fact: that ATMs will sometimes dispense cash (of what amounts? $200? $1000? or maybe just $20?) when remote communications are interrupted or broken.

I don't think Brewer is stating an opinion, rather a fact. Nor is it very surprising. I'm sure banks have huge IT operations where parts rely on ACID or BASE depending on need. But any massive distributed system can't really expect ACID to work very well.

I'm pretty sure "eventual consistency" doesn't mean "eventual inconsistency" under any reasonable interpretation.

I don't think anyone would claim a massively distributed system should be fully consistent or ACID.

As for the eventual inconsistency remark, this is from the original authors of Cassandra at Facebook. They did not extend the use of Cassandra because node flaps and packet losses caused nodes to be inconsistent and have old data. Bringing that data back up to date was very difficult, since the anti-entropy algorithms were too expensive to run frequently. They also found that the R=W=2 was too costly in terms of performance, and well you know the rest :-)

Do you have the reference?

Citation needed.

I've never heard of financial systems being eventually consistent, certainly not ATMs, which need to know your exact balance and how much money you've taken out already today.


I studied ATMs at Uni, and they have been like that for over 26 years.

As someone else on the thread pointed out, they do authenticate you, and record the transaction in an ACI(Durable!) database before they give you money. But the system clearly has a degraded mode when the exact balance is unavailable.

Also why do you need a balance to process a deposit?

In most cases when processing a withdrawal, ATMs or POS terminals don't know your exact balance and how much money you've taken out already today. A withdrawal tends to involve a request for $X which gets a yes/no response from the bank or from the chip of the card if the terminal is not online connected - the bank or chipcard logic will check the balance and daily limits, not the ATM.

You can think of the (rare nowadays, but possible) offline, paper-based card transactions (physical imprint of the card number + signature) as a very strong example of eventual consistency - your card will be billed for that amount sometime later when the documents are processed and all the balances will be correct, but for many days the "online visible" card balance as known to the bank will be different from the "real" legal/accounting balance of that card.

I don't know if you've ever travelled but very often withdrawals from an ATM will be processed without the balance of the user being known at that time.

It's why your bank account can go into negative territory.

A nit, but the remark of "auditing == everything is written down twice == double-entry accounting" is cute, but doesn't seem applicable to availability.

Double-entry is more of an internal (financial) system implementation detail, and so an orthogonal concern to intra-system audits.

(It's not like one side of the entry is in one bank's IT system, and the other side of the entry is in the other bank's IT system.)

(...speculating further, I really doubt the OLTP systems of banks are double-entry anyway.)

Funny, but "It's not like one side of the entry is in one bank's IT system, and the other side of the entry is in the other bank's IT system." is actually false in interbank deals such as correspondent accounts, money market or forex deals.

If a deal involves two banks, then the authoritative "other side" of the entry will be in the other bank's IT system. Of course, you'll maintain some records of what the entry should be in your opinion, but they won't always match as you don't have full info and you'll reconcile with data you get from the other bank's IT system by (for example) SWIFT network.

And I've seen only double-entry OLTPs for the core system that includes general ledger. Maybe not all of them are built that way, but I haven't seen such examples.

The article is not entirely wrong but misleading. The two important aspects are authorization and limits.

Authorization: An ATM does not issue money without authorization which is done by some 'central authority', not the ATM.

Limits: In some corner cases you may be able to exceed your (daily, weekly, monthly) limits. But, as the article points out, this doesn't imply financial inconsistency.

As someone who's had to dispute NSF charges for reordered transactions, I question the author's claim that ATM operations commute. Specifically, assume a $0 beginning balance. Then "withdraw $20 then deposit $200" might yield an error, $0 cash, and a $200 balance where "deposit $200 then withdraw $20" would net $20 cash and a $180 balance.

Or, as in my disputed case, $20 cash and a $140 balance, because the deposited check hadn't actually cleared by close-of-business, so end-of-day processing assessed a $40 negative balance fee despite the fact that the deposit "eventually" cleared. The first manager I discussed this with had the audacity to claim it was my fault for not somehow recognizing that the portion of the deposited funds the ATM made available for immediate withdrawal by design were not, in fact, available for immediate withdrawal.
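The ordering example from the first paragraph of this comment is easy to reproduce. A minimal Python sketch, assuming a toy model where a withdrawal is either refused outright or allowed to overdraw (the `apply_ops` function and its `allow_overdraft` flag are made up for illustration; fees and check holds are not modelled):

```python
def apply_ops(balance, ops, allow_overdraft=False):
    """Apply ('deposit' | 'withdraw', amount) operations in order.

    Returns (final_balance, cash_dispensed).
    """
    cash = 0
    for kind, amount in ops:
        if kind == "deposit":
            balance += amount
        elif kind == "withdraw":
            if allow_overdraft or amount <= balance:
                balance -= amount
                cash += amount
            # otherwise the withdrawal is declined and nothing changes
        else:
            raise ValueError(f"unknown operation: {kind}")
    return balance, cash


w_then_d = [("withdraw", 20), ("deposit", 200)]
d_then_w = [("deposit", 200), ("withdraw", 20)]

print(apply_ops(0, w_then_d))  # (200, 0)
print(apply_ops(0, d_then_w))  # (180, 20)
print(apply_ops(0, w_then_d, allow_overdraft=True))  # (180, 20)
```

So the operations only commute if a withdrawal is never refused, that is, if the system is willing to let the account go negative and sort it out afterwards. Once a balance check (or a fee triggered by a negative balance) enters the picture, order matters again.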

This seems like a pretty bad strategy to sell nosql etc. What are we supposed to believe, that every ATM downloads account numbers, pin hashes and balances for every account in their network?

Banks have chosen consistency over availability regularly as they've been able to rebuild their systems over the decades. 40 years ago, if you had a credit card the place just took it, copied it down and trusted you and the bank were good for it. Try to get anyone to take your credit card if the network is down now.

IMO banking culture and standards probably were direct motivators of many of the ways traditional systems were designed. They were some of the earliest adopters of IT. It may not be accurate to say banks are ACID. It may be more accurate to say ACID is banking.

This is an example of why analogies should never be used (except perhaps if trying to explain a concept, not something tangible).

ATM software has nothing to do with the backend storage of the data. My point is that I bet when the data is actually written to durable storage in the backend, the set of data being written will be wrapped in an ACID transaction.

I think it's funny that the photo in the article is of two guys stealing an ATM (note the pantyhose masks).

Banks can have availability over consistency because we have enough laws and protections in place to reverse any fraudulent transactions.

With BitCoin, such an architecture could be abused rather badly.
