Accounting for Computer Scientists (kleppmann.com)
520 points by martinkl 2180 days ago | 75 comments

Two observations.

First, coming from computer science, introductory accounting - the bookkeeping mechanics - is quite easy. I took a 101 class and was surrounded by future Lords of The Universe who complained about how "hard" it was to add and subtract numbers according to a small system of rules.

Accounting is a system of metrics. The system under measurement is your business. Its purpose is to give an accurate readout of business performance. The nice thing is that accountants produce lots of metrics and these can be used to probe the behaviour of different parts of the system.


> One thing to watch out for: profit doesn’t say anything about your bank account ... That’s why it’s possible for a company to be profitable but still run out of money!

This is why accountants produce a third document, the cashflow statement. It shows how cash is coming in and going out. This is different from the P&L statement, which deals with revenues and expenses, both of which may include future events as opposed to actual cash changing hands in the period covered by the report.

If there's one thing you absolutely must learn from accounting, it is that positive cashflow and profits are not the same thing. But if you run a business without both of them, that business is doomed.

I agree with observations and comments.

Well, you can live without profits for some time, but if you run out of working cash, you're stuffed. This is where loans and factoring come in, and it is why even profitable businesses need loans when their cashflow is bad.

One easy way to try to avoid this is to get incoming payments to arrive faster and outgoings to leave slower. Even if the net total is the same in the end, it allows more flexibility with cash flow.

Accounting is obvious to me, but we probably haven't encountered the hard problems yet.

Most of the hard problems are to do with depreciation, and the legalities of how you classify them.

For example, Police departments have dogs. Dogs are assets. If all government departments are told to use accrual (rather than cashflow), they need to depreciate assets.

So the accountant needs to figure out how dogs depreciate. Should you assume that a dog has a useful life of 10 years, and loses 10% of its initial value every year? Or that it exponentially decays, as it gets older and less able to sniff bad guys? Or should it only be 5 years? What do you do if a dog ends up working long past its expiry date?
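To make the two options concrete, here is a minimal Python sketch comparing a straight-line schedule with an exponentially declining one. The purchase price, the 10-year life and the 20% yearly decline are made-up illustration values, not figures from any real police budget or accounting standard.

```python
from fractions import Fraction

def straight_line(cost, life_years):
    """Book value at the end of each year, losing an equal share of the initial cost."""
    step = Fraction(cost, life_years)
    return [cost - step * year for year in range(1, life_years + 1)]

def declining_balance(cost, rate, life_years):
    """Book value at the end of each year, losing a fixed fraction of the remaining value."""
    values = []
    value = Fraction(cost)
    for _ in range(life_years):
        value -= value * rate
        values.append(value)
    return values

DOG_COST = 10_000        # hypothetical purchase price
LIFE = 10                # hypothetical useful life in years
RATE = Fraction(1, 5)    # hypothetical 20% yearly decline

linear = straight_line(DOG_COST, LIFE)
decaying = declining_balance(DOG_COST, RATE, LIFE)

for year, (a, b) in enumerate(zip(linear, decaying), start=1):
    print(f"year {year:2d}: straight-line {float(a):8.2f}   declining {float(b):8.2f}")
```

Under these assumptions the straight-line dog is worth exactly zero on the books after year 10, even if it keeps working, while the declining schedule never quite reaches zero. Which of the two is "right" is the classification question, not a programming one.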

It can be tricky to come up with sensible numbers. Especially when accountants don't have domain knowledge, and people who do don't like speaking to them.

> Most of the hard problems are to do with depreciation, and the legalities of how you classify them.

I would generalise that to say that all the hard problems in accounting are ones of classification.

Does sale X fall in 2010 or 2011?

Do we book the inventory when we order it, when it is delivered to our warehouse, when it is delivered to our shop, or all three? And if we use some combo, how do we combine them in LIFO calculations?

We've just donated $100,000 to Fashionable Cause, a charitable foundation founded by Irish rockstar Nobo and in exchange he will wear our platform shoes exclusively. Do we book this under philanthropy or marketing? If philanthropy, do we consider it part of the overhead cost for the shoes sold?

And so on and so on.

Most of the big bucks in accounting come from coming up with clever excuses as to why something should be categorised in a certain way.

Yep. When you're out of cash, the game is up, no matter what the P&L says.

A different way to look at it is this: cashflow tells you the health of the company right now, P&L in the long run. You need to watch both.

For example, you've booked $100 million of sales for the period. According to the P&L you're profitable by $20 million. A bill arrives from a supplier for $10 million.

But supposing you only had $5 million cash on hand. Unless those customers begin paying you money for the booked sales, you are in trouble. Time to take corrective action.

Likewise, consider a business where customers pay on purchase (most retail industries). Your cashflow might be excellent in that you have plenty moving across your books. However, costs are rising and you are making a loss on the P&L. Eventually this will deplete your cash and you will be out of business. Time to take corrective action.
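The first scenario can be put into a few lines of Python. This is only a sketch: the sales, profit, cash and supplier-bill figures come from the example above, the expenses figure is what those numbers imply, and the "customers pay 5" case is an invented what-if.

```python
# All figures in millions of dollars.
booked_sales = 100
expenses = booked_sales - 20   # implied by 100 of sales and 20 of profit
cash_on_hand = 5
supplier_bill = 10

def profit(sales, costs):
    """P&L view: counts sales when booked, whether or not cash has arrived."""
    return sales - costs

def cash_after_paying(cash, collected, bill):
    """Cashflow view: only money actually received can pay the bill."""
    return cash + collected - bill

pnl = profit(booked_sales, expenses)
nothing_collected = cash_after_paying(cash_on_hand, 0, supplier_bill)
some_collected = cash_after_paying(cash_on_hand, 5, supplier_bill)  # hypothetical: customers pay 5

print(pnl)                 # 20: profitable on paper
print(nothing_collected)   # -5: cannot pay the supplier
print(some_collected)      # 0: just scrapes through
```

The same profit figure sits next to two very different cash positions, which is the reason both reports need watching.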

Used properly, P&Ls and cashflows are a monitoring tool, a kind of standard Nagios/munin/zenoss for your business.

"User-friendly" accounting software (such as QuickBooks) tends to obscure the fundamental simplicity of double-entry accounting.

If you want to actually understand your books, use something simple and powerful like John Wiegley's ledger:


I wish I could vote this up twice.

Ledger is awesome and strips away all the crap that gets in the way of actually seeing the data.

There are also haskell and python implementations out there if that's more up your hacking alley.

The haskell implementation has a web command that presents a nice html page with summaries of every account along with the ability to filter by date or account. It is a nice way of quickly analyzing the current state of affairs.

My favourite feature is this: like the article, ledger completely ignores the concept of debits and credits. Negative numbers FTW!
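One way to read "negative numbers instead of debits and credits" is sketched below in Python. This is not Ledger's file format or its code; the account names and amounts are invented, and the only rule illustrated is that the signed postings of a transaction sum to zero.

```python
from decimal import Decimal

# Each transaction is a list of (account, signed amount) postings.
# Account names and amounts are invented for illustration.
transactions = [
    [("Assets:Bank", Decimal("1000.00")), ("Income:Sales", Decimal("-1000.00"))],
    [("Expenses:Rent", Decimal("400.00")), ("Assets:Bank", Decimal("-400.00"))],
]

def is_balanced(postings):
    """A transaction is valid when its signed amounts sum to exactly zero."""
    return sum(amount for _, amount in postings) == 0

def balances(txns):
    """Account balance = sum of every signed posting that touches the account."""
    totals = {}
    for postings in txns:
        for account, amount in postings:
            totals[account] = totals.get(account, Decimal("0")) + amount
    return totals

assert all(is_balanced(t) for t in transactions)
for account, total in sorted(balances(transactions).items()):
    print(f"{account:15s} {total:>10}")
```

With this representation there is no separate debit or credit column; the sign carries that information, so a correction and an opposite entry look the same.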

I find this strange. Last time I had some interaction with a ledger, negative credit and positive debit were different beasts. Granted, their contribution towards the balance was the same; nevertheless they were different things.

In particular a correction of an erroneous debit entry was stored as a negative debit, not positive credit. Representing both with the same number seems a bit suspicious to me.

Thanks zdw! Your comment made me try out ledger and I'm loving it after a day or two. Everything else I tried before seemed to eventually "get in the way" of how I'd like to do things.

Only downside is that the documentation is a bit lacking for version 3.0. I'm going to be using this for my business in parallel with what I have and I'll switch over if I continue to like it. There is probably a business case here for turning this into a commercial product with more importing / reporting.

Anyone know of a good Ruby clone?

I wrote a Ruby gem for accounting stuff a while ago: https://github.com/ept/invoicing --- it's unmaintained at the moment, but it does a lot of things right, so anyone interested in taking it over would be most welcome to do so.

I recently became aware of ledger when Wiegley was interviewed on the floss podcast:


Worth a listen both from a general open source development perspective as well as for an overview of the system itself.

We've been extremely happy with SQL Ledger - http://www.sql-ledger.com/

It's open-source and it's PostgreSQL-backed, so it's easy to automate various bookkeeping phenomena by attaching triggers to SQL Ledger table changes that propagate elsewhere, or vice versa. Its interface is very web 1.0-ish, and in virtue of that, very fast and responsive.

It's definitely rather low-level in the sense that it's easy to shoot yourself in the foot with it. It's like programming in C instead of C# or Java. It requires understanding double-entry accounting and GAAP at a fairly erudite fundamental level, because unlike QuickBooks, there are no glossy splash screens or inline thought balloons to teach you Accounting 101. It's also lacking in many high-level features such as a payroll module, and has fairly rudimentary stubs for many others. It is, however, quite good at managing parts, assemblies and inventory if tangible goods are your thing. It's a little more problematic for a service company; outside of AR and AP, I tend to think of it as little more than just a general ledger. That may not be giving it enough credit, though; it does have quite a lot of features.

Nevertheless, it is a great tool for a startup comfortable with using home-grown open-source solutions.

Been using ledger for over a year now and it was exactly what I was looking for. I have used various things in the past, but they never imported everything or didn't work for some edge case, and I eventually stopped using them. Being able to quickly write scripts on top of it for import, analysis, graphs, price and budget prediction etc. makes me a happy camper. Not to mention that, being text, the files get saved in git for even more possible analysis :)

If you use Ledger, check out my command-line version of Mint for Ledger called Reckon.


By coincidence, did you know that the company that distributes Quicken / QuickBooks in Australia is called Reckon?

How does ledger compare to gnucash?

Quite different, basically no gui. I like ledger better. I have my personal and two different business accounts using ledger. I use git to VC the data files. I've done quite a bit of scripting to make the system more convenient (automate billing, tracking inventory, converting between cash-based and accrual reporting).

Ledger is really quite simple. The core of it could be implemented pretty easily (witness the number of "ledger" clones). I have my own somewhat compatible ledger program written in Python. It's crude otherwise I would release code.

> It's crude otherwise I would release code.

Everything starts out crude--release it anyway! Seriously.

I have to say, I hated the blog post, but being pointed to this, plus the Reckon code somebody else posted, has made it worth reading and criticizing.

It's always enlightening when someone manages to explain a somewhat complicated subject in terms of an abstraction that the audience understands. This is a great example of that.

And it works the other way too.

When you've read this article and understood it you will know how to explain graph theory to an accountant - "see it's actually not that hard, it's just like bookkeeping"

If you have a formal systems bent, as I do, you might enjoy "Algebraic Models for Accounting Systems" (http://www.amazon.com/Algebraic-Accounting-Systems-Salvador-...).

"This book describes the construction of algebraic models which represent the operations of the double entry accounting system. It gives a novel, comprehensive, proof based treatment of the topic, using such concepts from abstract algebra as automata, digraphs, monoids and quotient structures."

Think of it as a primer for building yourself an exceedingly awesome and utterly-unnecessary Haskell-based QuickBooks.

OK, that sounds pretty amazing, but is it as good as it sounds? For example, Leon Sterling wrote a 400 page book that included a crappy unrealistic example of how to write a Tamagotchi program. http://www.amazon.com/dp/0262013118 Funny, considering Luca Cardelli's paper about biologists fixing Tamagotchis.

Having recently taken an accounting class, this is amazing.

The course I took was chock full of "because that's the way it is" explanations of terms and practices, which without fail left me feeling confused and unsatisfied. I think most people here are like me and really need deep explanations of the lower level concepts in order to be able to apply higher level concepts. Accounting coursework is absolutely horrible at this.

Agreed. It always surprises me how a subject that is very regimented in its rules and regulations can easily skip over defining all of those foundational principles (my small forays into learning accounting consist of learning through deciphering examples rather than being able to see something as clear as this page). It may be that the subject is liable to the problem of "everyday" words that carry deeper, technical meanings in the context of the subject - as in, everyone already knows what sales and accounts receivable are, right? Great, now let's move on....

Accounting courses are horrific, but in fairness they try to get you up to speed on the basics before diving into the detail.

A typical CS 101 course is often like this -- you learn about operators, etc. without understanding how things actually work until later.

I had really great accounting professors, who explained WHY things were done.

As with anything else, assume a normal distribution of teaching talent, and hope you end up being taught by somebody on the right side of the curve.

"Because that's the way it is" describes my entire mathematics education. It didn't work for me, and I can't help but think there has to be a better way.

If you’re a real accountant reading this, please forgive my simplifications; if you spot any mistakes, please let me know.

There is one pretty important section of the P&L / Balance Sheet that's missing . . . taxes.

On that note, I am hosting a tax workshop on 3/15 @ Hacker Dojo in Mtn. View (very close to YC's office)


Please make a recording available more widely afterwards, for those of us who can't attend. I'd pay the same amount you're suggesting as an in-person donation for this, if it comes to that.

Any plans on livestreaming/recording this and making it available afterward?

It would be pretty easy to include as another node in the graph though.

David Friedman wrote a very short post 5 years ago on accounting:

"I have been teaching a new course that includes two weeks explaining accounting to law students. To do so, I first had to understand it myself. I think I now do, and in the hope that the information might be useful to others ... ."


I am not an accountant. That said ...

There are a few key pieces of functionality missing from this description, and what (I think) is a really important insight that wasn't really emphasised.

Key pieces missing: the hierarchical chart of accounts, and grouped transactions. (You want to be able to find each 'transaction' with Dell, even though one given transaction might include $4000 in depreciable assets, $1000 in extended warranties and other services, shipping, tax... each of which could be a different 'edge' on the graph itself.)

And the insight is that it's the edges that count; nodes don't matter -- if you're storing "account" objects with a "balance" property, it ought only to be for caching.
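A small Python sketch of that insight, assuming a made-up `Edge` record and invented accounts: balances are always derivable from the edges, and a stored balance is only a cache that gets invalidated when a new edge arrives.

```python
from collections import namedtuple

# One edge of the graph: `amount` moves from `source` to `target`.
Edge = namedtuple("Edge", ["source", "target", "amount"])

# Hypothetical edges, amounts in whole dollars.
edges = [
    Edge("capital", "bank", 5000),
    Edge("bank", "furniture", 1200),
    Edge("sales", "bank", 300),
]

def balance(account, edges):
    """Derive a balance from the edges: inflows minus outflows."""
    inflow = sum(e.amount for e in edges if e.target == account)
    outflow = sum(e.amount for e in edges if e.source == account)
    return inflow - outflow

class BalanceCache:
    """Keeps balances only as a cache that is rebuilt from the edges on demand."""

    def __init__(self, edges):
        self._edges = list(edges)
        self._cache = {}

    def get(self, account):
        if account not in self._cache:
            self._cache[account] = balance(account, self._edges)
        return self._cache[account]

    def add_edge(self, edge):
        self._edges.append(edge)
        # Drop only the entries that the new edge makes stale.
        self._cache.pop(edge.source, None)
        self._cache.pop(edge.target, None)

cache = BalanceCache(edges)
first = cache.get("bank")
cache.add_edge(Edge("bank", "rent", 100))
second = cache.get("bank")
print(first, second)
```

If the cache is ever lost or suspected to be wrong, it can be thrown away and recomputed, because the edges are the record.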

As someone who both 1) checks the comments to see if I should click a link I think might be dodgy, and 2) thinks a lot of links look dodgy, including this one at first, let me give a hearty recommendation to this article. Well worth reading from start to finish.

For a programmer I'd rather explain it in database terms. It's a single table T, which basically contains the edges of Kleppmann's graphs.

  amount  source  target  (metadata like date ...)

For each account X you get the left and right side with simple SQL queries:

  select * from T where source=X
  select * from T where target=X

All the other mumbo-jumbo about double-entry bookkeeping is implicitly baked in. For example, "The double-entry bookkeeping system ensures that the financial transaction has equal and opposite effects in two different accounts." Of course, each entry subtracts amount from the source and adds it to target.

While this representation is easy to implement (see ledger, I suppose), it does not lead to pretty graph pictures.
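The single-table idea can be tried out with Python's built-in `sqlite3` module. This is a sketch under assumptions: the table name `T` and the columns follow the comment above, while the rows, the `date` column type and the `account_balance` helper are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table T (amount integer, source text, target text, date text)")
conn.executemany(
    "insert into T values (?, ?, ?, ?)",
    [
        (5000, "capital", "bank", "2011-01-03"),
        (1200, "bank", "furniture", "2011-01-10"),
        (300, "sales", "bank", "2011-01-15"),
    ],
)

def account_balance(conn, account):
    """Amount received as target minus amount sent as source."""
    (received,) = conn.execute(
        "select coalesce(sum(amount), 0) from T where target = ?", (account,)
    ).fetchone()
    (sent,) = conn.execute(
        "select coalesce(sum(amount), 0) from T where source = ?", (account,)
    ).fetchone()
    return received - sent

all_accounts = ["capital", "bank", "furniture", "sales"]
for name in all_accounts:
    print(name, account_balance(conn, name))

# Every row adds to one account what it takes from another,
# so the balances across all accounts sum to zero.
print(sum(account_balance(conn, name) for name in all_accounts))
```

The zero total at the end is the "equal and opposite effects" property falling out of the table layout rather than being enforced separately.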

This analogy seems a bit off, and I don't see how it simplifies things.

I guess this is similar to the "Monad tutorial" problem, where the author forces a (typically way off) analogy onto the reader.

One concrete problem with "accounting as graphs" is that a transaction typically involves more than two legs, while Graph edges are ALWAYS associated with exactly two nodes. You can emulate this, of course, and the author hints as much in his description of complex "deals", but it raises the question whether graphs are the best analogy.
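One possible emulation, sketched in Python: keep every edge strictly two-node, and tag the legs that belong together with a shared transaction id. The field names, accounts and amounts here are hypothetical, and this handles the simple case of one paying account with several receiving accounts.

```python
from collections import defaultdict

# Each edge still joins exactly two nodes; `txn` ties the legs together.
# Accounts, amounts and transaction ids are hypothetical.
edges = [
    {"txn": 1, "source": "bank", "target": "computers", "amount": 4000},
    {"txn": 1, "source": "bank", "target": "warranties", "amount": 1000},
    {"txn": 1, "source": "bank", "target": "shipping", "amount": 50},
    {"txn": 2, "source": "sales", "target": "bank", "amount": 300},
]

def legs_by_transaction(edges):
    """Group two-node edges back into the multi-leg transactions they came from."""
    grouped = defaultdict(list)
    for edge in edges:
        grouped[edge["txn"]].append(edge)
    return dict(grouped)

def transaction_total(legs):
    """Total amount moved by one transaction across all of its legs."""
    return sum(leg["amount"] for leg in legs)

grouped = legs_by_transaction(edges)
for txn_id, legs in grouped.items():
    print(txn_id, len(legs), transaction_total(legs))
```

Whether this counts as "still a graph" or as a hypergraph in disguise is exactly the question the analogy raises.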

Loved this post. Starts to convince people that Accounting really can be beautiful, in many of the same ways that software can be beautiful. It does tend to get overrun with terminology, edge cases, and other necessary issues given the consequences of ambiguity, but at the lowest level, accounting is just telling a story. Wonderful post!

OH MY GOD! I finally understand why Sales is a liability on all the balance sheets I translate! !!!

It's not actually a liability; it's an account with a credit balance. Sales flows into retained earnings which is an equity account, which has a credit balance per the accounting equation of Assets (Debit balances) = Liabilities + Equity (Credit balances).

I think the rules may differ in Europe (the way things are classified) - but the insight I had was that something that seems clearly to be an asset to my naive view (sales) can actually be a sort of accounting fiction to make the balances come up right, and thus be shown as a liability. Without checking my notes on a specific translation I did about 15 months ago, I don't even remember if it was sales or not. (I have computers so I don't have to remember things, obviously, ha.)

> I think the rules may differ in Europe (the way things are classified)

Nope. Sales is NOT a liability, and there is no accounting fiction. Sales are also not an asset. They are an income. The money earned from the sale is the asset. I think you may be confusing ledger credits with liabilities.

Not all credits are increases in liabilities, but all increases in liabilities are credits.

Good background on Wikipedia: [ http://en.wikipedia.org/wiki/Debits_and_credits ] [ http://en.wikipedia.org/wiki/Double-entry_accounting_system ] [ http://en.wikipedia.org/wiki/Accounting_equation ]

Ah - finally I found the balance sheets I was thinking of (some Deutsche Bank stuff from 2009). Turns out I was misremembering; it was equity capital that falls under liabilities, which is counterintuitive to me. Capital is, after all, something I have - but I suppose since it's an investment, it represents something I owe.

And I see capital is shown in the balance sheet in the original post as well - but what struck me as counterintuitive in the original post is representing sales as, effectively, a cash flow towards the customers. If not a fiction, surely you have to grant it's an abstraction.

Equity is on the same side of the balance sheet as liabilities because of the accounting equation: Assets = Capital + Liabilities.

See this tutorial as well: http://www.principlesofaccounting.com/chapter%201.htm

> OH MY GOD! I finally understand why Sales is a liability

And expenses are an asset? Your comment got 9 upvotes, which would indicate either that I am missing your sarcasm, or that, notwithstanding the article, many hackers still don't get accounting.

You won't see Sales on a Balance Sheet. It lives on the Income Statement.

I found this very enlightening to read. I wonder how many accounting apps actually store their data as he describes in this article, to aid their calculations.

There are some further complications that make it impossible to use that exact representation without change; for instance a single transaction may involve more than two nodes; you could buy furniture and computers in one transaction and something in the system needs to represent this was one purchase, even if it is further broken apart after that. But with suitable modifications it seems like this ought to work. Whether or not it is actually done this way, I don't know... and to be honest, I doubt it, filed under "I don't need all that theory crap, I've got problems to solve!". Programmers are often quite happy to write much more complicated and redundant programs if it means not having to learn about graph theory. Not that they think of it that way. And I find it unlikely anybody started a real accounting project with "First, let's learn all about accounting and strip it down to the minimal conceptual core", rather than "Our accountants say this is how it works, so let's start putting together the matching class hierarchy", which will not produce a clean graph structure where one did not previously exist.

(And yes, theory can't be perfectly followed either, all systems will generally need some amount of dirty things, but the correct balance IMHO is definitely not 100% "pragmatic".)

Accounting programs already employ such directed, weighted hypergraphs. They are just not storing pretty pictures, but the incidence matrix.

The programmers probably don't think of their tables as an incidence matrix though, and end up coding lots of special cases, instead of re-using general graph algorithms.
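For readers who have not met the term, here is a tiny Python sketch of an incidence matrix for a directed, weighted hypergraph. The accounts and amounts are hypothetical, and this is not a claim about how any real accounting package lays out its tables.

```python
accounts = ["bank", "computers", "furniture", "capital"]

# One column per transaction; each entry is the signed amount that
# transaction contributes to the account in that row. Values are hypothetical.
# Column t1 touches three accounts at once, which is what makes it a hyperedge.
#     t0     t1
incidence = [
    [5000, -3000],   # bank
    [0, 2000],       # computers
    [0, 1000],       # furniture
    [-5000, 0],      # capital
]

def column_sums(matrix):
    """Each transaction (column) must net to zero across all accounts."""
    return [sum(col) for col in zip(*matrix)]

def row_sums(matrix):
    """Each account's balance is the sum of its row."""
    return [sum(row) for row in matrix]

assert column_sums(incidence) == [0, 0]
account_balances = dict(zip(accounts, row_sums(incidence)))
print(account_balances)
```

Seen this way, "does the ledger balance" is a check on column sums, and "what is in the bank" is a row sum.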

off the top of my head, none. :-)

This is very easy to model in a relational db and enterprise software that does this has many more features than described here.

Accounting gets interesting when you get derivatives and the Fed involved. You own a volatile asset, so the price can go up and down every day. However, you don't necessarily know how much said asset is worth. So, you have no P&L. You can possibly borrow against said asset to cover cash flow, but if the asset declines in value, your creditors can demand payment.

Furthermore, your balance sheet can look great. However, if your debtors go bankrupt, a solid balance sheet can quickly deteriorate. If your customers pay late, and you can't borrow money, the delicate balance can collapse.

I think it'd be worth mentioning this is accrual basis accounting. In cash basis you wouldn't write down things which haven't already happened, like the anticipated payment from customer 2.

Tony Bowden described an accounting system built on Semantic Mediawiki in 2006.


Interesting. One of the side effects of having the zero sum rule for all transactions means that graphs can be superimposed on each other and also get a valid balance-sheet graph. Cool! Are loops possible? What do they mean?

Loops represent the economy. Other than new currency from the Federal Reserve or Treasury, or physical dollar media being destroyed, it's a closed-loop system.

The “new currency” exception is very significant, because under our fractional-reserve banking system, banks have the power to create money.

If you deposit $100 in a bank, the bank can immediately loan out $90 of it to borrowers (who may put that money in their own banks, etc. etc.), thus creating money from nowhere. I.e., if those borrowers can’t pay back their loan and the bank writes it off, your $100 is still safe.

Money creation by banks has nothing to do with people "failing to pay back loans".

It works like this: banks have an account with their money, an account with your money, an account with Bob's money, and an account with everyone's money. Bank: 100$. You: 100$. Bob: 0$. Bank's Vault: 200$.

If you loan 100$ to Bob, the accounts now look like: Bank: 100$. You: 0$. Bob: 100$. Bank's Vault: 200$.

If the bank loans the money to Bob, who deposits the loan back into the bank: Bank: 100$. You: 100$. Bob: 100$. <- Magic! Bank's Vault: 200$.

However, the bank is really adding a new account, Loans: -100$. Note: (Loans: -100$ + Bank: 100$ + You: 100$ + Bob: 100$ = Bank's Vault: 200$.) By law they can only give a negative value of up to ~90% of the Bank's Vault's value to Loans. Granted, worldwide there is more than one bank, but because people don't keep a lot of hard cash on hand, loans just end up being deposited in another bank in the system, which can then loan the money back to your bank as needed, etc.
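The bookkeeping in this example can be checked with a short Python sketch. The account names and figures are the ones used above; the starting `Loans` balance of zero, the helper functions, and treating "~90%" as exactly nine tenths are assumptions made for illustration.

```python
VAULT = 200  # cash physically in the vault, unchanged by the loan

# Before any loan is made (a zero Loans account is assumed).
before = {"Bank": 100, "You": 100, "Bob": 0, "Loans": 0}

def bank_lends(accounts, borrower, amount):
    """The bank credits the borrower's account and records a negative Loans entry."""
    after = dict(accounts)
    after[borrower] += amount
    after["Loans"] -= amount
    return after

def total(accounts):
    """Sum of all accounts, including the negative Loans account."""
    return sum(accounts.values())

def within_limit(accounts, vault):
    """Loans may be at most nine tenths of the vault (the ~90% figure above)."""
    return -accounts["Loans"] * 10 <= vault * 9

after = bank_lends(before, "Bob", 100)
deposits_visible = sum(v for k, v in after.items() if k != "Loans")

print(total(before), total(after))   # both still equal the vault
print(deposits_visible)              # what the account holders see
print(within_limit(after, VAULT))
```

The vault still holds 200 and the books still sum to 200, yet the positive accounts now add up to 300; the difference is the loan.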

PS: Now when banks fail, the Fed can actually make huge amounts of money from thin air, but that's a separate process.

I thought fractional reserve meant that the bank can now loan out $900, which is the creating money part. (perhaps you had a typo?)

No, they only loan out $90. But that is creating money. Where did that $90 come from? It didn't come out of your account; you still have $100. Imagine how pissed you'd be if you went to make a withdrawal and the bank said, "no, you can only have $10 right now, we lent the rest to somebody and they haven't paid it back yet."

The $900 figure comes from the assumption that the $90 they lend out eventually lands in somebody else's account, and they then also lend out 90% of that $90, and then recurse that all the way down till there's nothing left to lend out again:

perl -e'$s = 100; while($s > 0.1){$t = $t + $s * 0.9;$s = $s * 0.9;} print $t'


(edited: I have no idea why HN dropped out the asterisks in that code...)
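The same recursion can be written out bank by bank in Python, which also shows where the reserves end up. This is a sketch: the 10% reserve ratio is an assumption matching the "$90 out of $100" figure above, and 200 rounds is an arbitrary cut-off standing in for "all the way down".

```python
from fractions import Fraction

def expand(initial_cash, reserve_ratio, rounds):
    """Follow one cash deposit through `rounds` banks.

    Each bank keeps `reserve_ratio` of what it receives as reserves and
    lends out the rest, which becomes the next bank's deposit.
    Returns a list of (deposit, reserve_kept, amount_lent), one per bank.
    """
    rows = []
    deposit = Fraction(initial_cash)
    for _ in range(rounds):
        reserve = deposit * reserve_ratio
        lent = deposit - reserve
        rows.append((deposit, reserve, lent))
        deposit = lent
    return rows

rows = expand(100, Fraction(1, 10), 200)
total_deposits = sum(d for d, _, _ in rows)
total_reserves = sum(r for _, r, _ in rows)
total_lent = sum(l for _, _, l in rows)

print(float(total_deposits))   # approaches 1000
print(float(total_reserves))   # approaches 100, the original cash
print(float(total_lent))       # approaches 900
```

The original $100 of cash ends up spread across the banks as reserves, while the deposits built on top of it approach ten times that amount.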

Asterisks without whitespace (or a leading indent) mean italics: http://news.ycombinator.com/formatdoc

A common misconception, but OP is right. A single bank can't create money this way, but a network of banks in a fractional reserve system can. Wikipedia has a particularly clear explanation of how it works:

"When cash is deposited with a bank, only a fraction is retained as reserves, and the remainder can be loaned out (or spent by the bank to buy securities). The money lent or spent in this way is subsequently deposited with another bank and increases the cash reserves of that second bank, allowing that second bank to keep a fraction of the new deposit and lend or spend the remainder. Thus the excess cash travels from bank to bank to bank creating new deposits as it goes. Although no individual bank does anything other than lend part of what is deposited with it, the practice of fractional reserve banking in a multi-bank system expands the money supply (cash and demand deposits) to a large multiple of the cash reserves in all banks."

There's a good table and graph illustrating the process down the page a bit:


Banks do not loan deposits. And yes, wikipedia is wrong. Ask a bank accountant.

Anyone willing to recommend a good book that introduces graph theory?

The Wikipedia page on graphs contains much more material than is needed to understand the article.

A good effort but all the sign flipping gets confusing

substituting programmer jargon for accounting jargon is the answer?

Hard to believe this article didn't contain the word "invariant."

Nicely done, thanks. Much clearer than QuickBooks or Quicken.

Someone should write one of these for aspiring MPs. I'm pretty certain most modern Western governments have forgotten the basics.

There should be an app for this. A dashboard even.

( I mean one that actually displays and lets you edit your books as a graph)

Was a fun read. But you don't have to get complicated.

Accounting is just taking notes. No matter how you write them.

Check out accountingcoach.com
