Hacker News
How you can track your personal finances using Python (sgoel.dev)
140 points by siddhant 48 days ago | 58 comments

There are so many ways to do personal finances, but I found most of them to be either too tedious or to require doing stupid things like letting third parties log into your bank account.

For me, I simply had a spreadsheet with rows for each month and columns for each account: chequing, mortgage, RRSP, RESP, etc. I’d spend 10 mins a month logging in and typing out each balance. I’d chart the trend over time and that’s all I needed to detect any changes in habit or ugly trajectories. 90% of the signal I needed for 10% of the effort.

If you don’t need to micromanage all your flavours of spending or other details, consider keeping it dead simple.

I agree that this works at the income level of most of the HN audience. If you're living paycheck to paycheck, you may have more need to closely track exactly where your money is going (of course, you'll also likely have the least spare time to spend on such tracking). At the income level of most tech workers/the HN audience, however, I agree with your system. My spreadsheet is similar to what you describe: savings/checking accounts, investment accounts, and the balances of all credit cards/mortgage/loans, which together give me a single "net worth" number. So long as it's trending significantly positive, I don't care about the details.
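
A minimal sketch of this kind of balance sheet in Python, for readers who prefer a script to a spreadsheet. The account names and figures are made up for illustration; the only assumption is that liabilities are stored as negative numbers, so net worth is simply the sum of each month's row.

```python
# Hypothetical month-end balances; liabilities are entered as negative numbers.
balances = {
    "2021-06": {"checking": 4200.0, "investments": 51000.0, "mortgage": -180000.0},
    "2021-07": {"checking": 3900.0, "investments": 52500.0, "mortgage": -179400.0},
    "2021-08": {"checking": 4600.0, "investments": 53100.0, "mortgage": -178800.0},
}


def net_worth_by_month(balances):
    """Return {month: sum of all account balances}, ordered by month."""
    return {month: sum(accounts.values()) for month, accounts in sorted(balances.items())}


def monthly_changes(net_worth):
    """Return {month: change in net worth since the previous month}."""
    months = list(net_worth)
    return {b: net_worth[b] - net_worth[a] for a, b in zip(months, months[1:])}


net_worth = net_worth_by_month(balances)
changes = monthly_changes(net_worth)
for month, value in net_worth.items():
    # The first month has no predecessor, so its change is shown as zero.
    print(month, f"{value:,.2f}", f"{changes.get(month, 0.0):+,.2f}")
```

A consistently positive change column is the "trending positive" signal; a sudden dip is the cue to open the account and skim the transactions.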

Yup. And at the per-theme level if I notice an anomaly I just check it. “Oh my chequing took a $1500 dip in its usual pattern… let’s skim the account… ah yea there it is, we bought that sectional for the patio.”

>So long as it's trending significantly positive, I don't care about the details.

I find that once I normalize against the price movements of my goals, the significantly positive trend becomes much less significant. Since I am relatively young with likely a couple decades or more work left in me, I like to benchmark against a low cost broad market equity index ETF.
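
One way to read "benchmark against an index" is to express net worth in units of the index rather than in currency. The sketch below assumes that reading; the dates, net worth figures, and ETF prices are all invented for illustration.

```python
def in_index_units(net_worth, index_price):
    """Express each period's net worth as the number of index shares it could buy."""
    return {period: net_worth[period] / index_price[period] for period in net_worth}


# Hypothetical figures: net worth grew 10% while the index grew 25%.
net_worth = {"2021-01": 100000.0, "2022-01": 110000.0}
index_price = {"2021-01": 100.0, "2022-01": 125.0}

units = in_index_units(net_worth, index_price)
nominal_growth = net_worth["2022-01"] / net_worth["2021-01"] - 1
relative_growth = units["2022-01"] / units["2021-01"] - 1

print(f"nominal: {nominal_growth:+.1%}, relative to index: {relative_growth:+.1%}")
```

With these numbers the nominal trend is positive, while the same net worth buys fewer index shares than a year earlier, which is the effect the comment describes.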

Yeah, I wrote my own open source app to do basic personal finance accounting 12 years ago after getting annoyed at the proprietary OS X solutions (iBank, in particular), and at KMyMoney / GnuCash not supporting split transactions at the time (they do now), and I have been using it ever since. It's not double-entry (I recognise the benefits of that, but don't like it), and it's a manual process, but it works for me, and I can graph stuff easily to provide some indication of where money is going.

I ported it to Qt for Linux last year and still use it, but to be honest, I don't really know why I track my finances that religiously, although it has helped me a few times with accounting / tax purposes given I'm a contractor now.

I suspect one of the reasons there are so many different solutions is everyone wants to do things differently, and integration with the banks for automated transaction import/download is quite complicated.

Try GnuCash - https://gnucash.org/

I'm not really sure about the Python aspect of this article; however, I've been using Beancount and Fava for nearly 3 years now and it's one of my favourite pieces of software. Outside of installing it, I've never really had to think about Python.

All of my financial history is contained in a text file processed by beancount and long may it continue!

One thing I do wonder about, though, is how well it scales. I'm a single man, so tracking everything isn't too much effort, but if I had a partner it would suddenly become at least 50% more things to track.

You didn't have to write any Python to import transactions from your bank?

I have some custom code to convert CSV files downloaded from my bank to Beancount's text format, but it is all .NET code since that is what I know. The text format is easy to work with.

That's actually a good point.

You only need to write Python code if you're using the Beancount importers framework and intend to use bean-identify and/or bean-extract.
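
For the do-it-yourself route, a converter of this kind can be plain string formatting. The following is a minimal Python sketch, not the Beancount importers framework: the CSV column names (`date`, `payee`, `description`, `amount`), the sample rows, and the `Expenses:Uncategorized` placeholder account are all assumptions, and a real bank export will differ.

```python
import csv
import io
from decimal import Decimal

# Hypothetical bank export; real column names and formats vary by bank.
SAMPLE_CSV = """date,payee,description,amount
2021-01-01,Landlord,Thanks for the rent,-1000.00
2021-01-03,Grocer,Weekly shop,-54.30
"""


def row_to_entry(row, bank_account="Assets:MyBank:Checking",
                 other_account="Expenses:Uncategorized", currency="EUR"):
    """Format one CSV row as a two-posting transaction in Beancount-style text."""
    amount = Decimal(row["amount"])
    # Payee and description are assumed to contain no double quotes.
    return (
        f'{row["date"]} * "{row["payee"]}" "{row["description"]}"\n'
        f"    {bank_account}  {amount:.2f} {currency}\n"
        f"    {other_account}  {-amount:.2f} {currency}\n"
    )


def csv_to_beancount(text):
    """Convert a whole CSV export into entries separated by blank lines."""
    return "\n".join(row_to_entry(row) for row in csv.DictReader(io.StringIO(text)))


entries = csv_to_beancount(SAMPLE_CSV)
print(entries)
```

The output still needs matching `open` directives for both accounts, and the placeholder account would be replaced by hand or by a categorisation rule.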

Shameless plug of my open source finance system. https://github.com/darcys22/godbledger

I love these command line self-hosted accounting software packages. But double-entry bookkeeping was invented using ledger books with ruled tables. I feel the plain text data formats are a regression compared to a SQL database. A general ledger just works so well with columns that you can sum.

It also saves you from needing a custom tool like bean-query to replicate SQLite-ish queries, because the data could have been in a database from the start.
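
To illustrate the "columns that you can sum" point, here is a small sketch using Python's built-in `sqlite3` module. The single `postings` table is a hypothetical schema invented for this example (it is not GoDBLedger's schema), and amounts are stored as integer cents to avoid floating-point rounding.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per posting, amounts in integer cents.
conn.execute(
    "CREATE TABLE postings (txn_id INTEGER, date TEXT, account TEXT, amount_cents INTEGER)"
)
conn.executemany(
    "INSERT INTO postings VALUES (?, ?, ?, ?)",
    [
        (1, "2021-01-01", "Assets:MyBank:Checking", -100000),
        (1, "2021-01-01", "Expenses:Rent", 100000),
        (2, "2021-01-03", "Assets:MyBank:Checking", -5430),
        (2, "2021-01-03", "Expenses:Groceries", 5430),
    ],
)

# Account balances are a plain GROUP BY over the amount column.
balances = dict(
    conn.execute(
        "SELECT account, SUM(amount_cents) FROM postings GROUP BY account ORDER BY account"
    )
)

# Double-entry check: every transaction's postings must sum to zero.
unbalanced = conn.execute(
    "SELECT txn_id FROM postings GROUP BY txn_id HAVING SUM(amount_cents) != 0"
).fetchall()

print(balances)
print("unbalanced transactions:", unbalanced)
```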

I struggled with the accounting concepts until I stumbled on this blog post from Martin Kleppmann: https://martin.kleppmann.com/2011/03/07/accounting-for-compu.... After that I finally 'got' the accounting graph, and I don't care too much about the medium or the software anymore (it's a Google Sheet in my case).

Martin's blog post is awesome! It helped clear up a bunch of stuff for me too.

And on a somewhat unrelated note: his book on building data-intensive applications is another goldmine (on a different topic).

I used HLedger for years and it was great. I thought I could do better with a paid solution. YNAB was awful and slow. So I switched to Lunch Money. It's amazingly fast and intuitive. Only it's not a good fit for me. I'm trying to shoehorn my budgeting style into it and it's not working.

I use the envelope style. Every dollar is assigned a bucket when it comes in. All spending is done from a bucket and balances always persist. I don't like monthly budget targets. I prefer to put money in a bucket when the money comes in.
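
A minimal sketch of the envelope model as described: income is split across buckets when it arrives, spending draws from one bucket, and balances carry over with no monthly reset. The class and bucket names are hypothetical, and refusing to overspend a bucket is an assumption of this sketch rather than something the comment states.

```python
from decimal import Decimal


class Envelopes:
    """Envelope budget: balances persist until they are spent."""

    def __init__(self):
        self.balances = {}

    def fund(self, income, allocations):
        """Assign incoming money to buckets; every dollar must be allocated."""
        if sum(allocations.values(), Decimal("0")) != income:
            raise ValueError("allocations must add up to the income received")
        for name, amount in allocations.items():
            self.balances[name] = self.balances.get(name, Decimal("0")) + amount

    def spend(self, name, amount):
        """Spend from one bucket; overspending is rejected (an assumption here)."""
        if amount > self.balances.get(name, Decimal("0")):
            raise ValueError(f"not enough money in bucket {name!r}")
        self.balances[name] -= amount


budget = Envelopes()
split = {"rent": Decimal("1000"), "groceries": Decimal("400"), "savings": Decimal("600")}
budget.fund(Decimal("2000"), split)
budget.spend("groceries", Decimal("150"))
budget.fund(Decimal("2000"), split)  # next paycheck: leftover groceries money carries over
print(budget.balances)
```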

I might move back to plain text accounting as it is so flexible. I will miss the amazing interface to Lunch Money. I also like supporting a small business like Lunch Money.

I still use Hledger along with a small phalanx of accessory scripts, mostly written in Python, that do everything from parsing CSVs to calling smart contract read functions to categorize crypto transactions for tax purposes.

Many thanks to Simon Michael for not only that wonderful program but also being so responsive.

I might go back to HLedger soon. I guess I can always donate monthly to the project to scratch my small business support itch.

I also had a good collection of scripts to manage my finances. I ran it all on a VPS so I could access it from anywhere. I had an ssh client on my phone to make changes on the go. Not as ergonomic as an app, but it worked.

Agreed Simon is an amazing project owner. He and his documentation are the reasons I like HLedger over the alternatives.

I'll admit that GoDBLedger is also tempting.

I think it's https://onefinance.com that provides separate routable account numbers and credit card numbers per envelope/"pocket". A testimonial popped up about two weeks ago: https://news.ycombinator.com/item?id=28247788#28248539.

I'm currently on a trial of Actual Budget (https://actualbudget.com/). It's very similar to the old desktop version of YNAB. It has desktop and mobile apps (web app in beta), and does syncing. I haven't used the mobile app yet, but the desktop app is fast and entering transactions is fast. Only costs $4/month.

Thanks I'll have a look. I found something similar in the past https://www.budgetwithbuckets.com/en/index.html

I thought I had seen every envelope budget software. Haha! Why aren't you using this one?

I haven't tried it yet. I'm still trying to make Lunch Money work. They just implemented roll over budgeting. I want to see if I can get that going. TBH the main problem is a lack of time dedicated to figuring it all out.

I'm slowly working on a SaaS app to combine in-browser real-time collaborative text editing with plain text accounting, using the hledger format. Just curious if that is something you, or others, would be interested in.

I'm interested, but I don't want to go the route of text editing. It seems odd, but for me the pain point of plain text accounting is the text editing. I like a GUI to enter my transactions. I want the fewest clicks possible to enter my transactions. I want budgeting to be as fast as possible.

I’m personally interested in something similar for the beancount format.

> We take the output of the previous step, pipe everything over to our .beancount file, and "balance" transactions.

> Recall that the flow of money in double-entry accounting is represented using transactions involving at least two accounts. When you download CSVs from your bank, each line in that CSV represents money that's either incoming or outgoing. That's only one leg of a transaction (credit or debit). It's up to us to provide the other leg.

> This act is called "balancing".

Balance (accounting) https://en.wikipedia.org/wiki/Balance_(accounting)
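
The "other leg" idea from the quoted passage can be sketched in a few lines of Python. Postings are represented here as hypothetical `(account, amount, currency)` tuples; this is an illustration of the concept, not how Beancount represents postings internally.

```python
from decimal import Decimal


def add_other_leg(leg, counter_account):
    """Given the single leg from a bank CSV line, supply the opposite leg."""
    _account, amount, currency = leg
    return [leg, (counter_account, -amount, currency)]


def is_balanced(postings):
    """A transaction balances when its postings sum to zero in every currency."""
    totals = {}
    for _account, amount, currency in postings:
        totals[currency] = totals.get(currency, Decimal("0")) + amount
    return all(total == 0 for total in totals.values())


# One CSV line only tells us that 1000.00 EUR left the checking account.
bank_leg = ("Assets:MyBank:Checking", Decimal("-1000.00"), "EUR")
transaction = add_other_leg(bank_leg, "Expenses:Rent")

print(is_balanced([bank_leg]))   # the bank's view alone does not balance
print(is_balanced(transaction))  # with the other leg supplied, it does
```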

Are unique record IDs necessary for this [financial] application? FWICS, https://plaintextaccounting.org/ just throws away the (probably per-institution) transaction IDs; like a non-reflexive logic that eschews Law of identity? Just grep and wc?

> What does the ledger look like?

> I wrote earlier that one of the main things that Beancount provides is a language specification for defining financial transactions in a plain-text format.

> What does this format look like? Here's a quick example:

  option "title" "Alice"
  option "operating_currency" "EUR"

  ; Accounts
  2021-01-01 open Assets:MyBank:Checking
  2021-01-01 open Expenses:Rent

  2021-01-01 * "Landlord" "Thanks for the rent"
      Assets:MyBank:Checking     -1000.00 EUR
      Expenses:Rent               1000.00 EUR

What does the `*` do?

The star is just a one-character field for a "flag". I don't think there are any defined semantics for the field, but by convention 'star' means something like 'no special flags on this transaction'.

People use other flags to indicate, for instance, whether a transaction has been reconciled, or cleared their bank, or whatnot.
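
Because the flag sits in a fixed position on the transaction's first line, it is easy to pull out with a script. The sketch below is a plain regular expression over ledger text, not the Beancount parser; the second transaction, flagged with `!`, is a made-up example of a non-star flag.

```python
import re
from collections import Counter

# Header line of a transaction: date, a one-character flag, then a quoted string.
HEADER = re.compile(r'^(\d{4}-\d{2}-\d{2})\s+(\S)\s+"')

LEDGER = '''\
2021-01-01 open Assets:MyBank:Checking

2021-01-01 * "Landlord" "Thanks for the rent"
    Assets:MyBank:Checking     -1000.00 EUR
    Expenses:Rent               1000.00 EUR

2021-01-03 ! "Grocer" "Amount to double-check"
    Assets:MyBank:Checking       -54.30 EUR
    Expenses:Groceries            54.30 EUR
'''


def count_flags(text):
    """Count transactions per flag character; other directives are ignored."""
    flags = Counter()
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            flags[match.group(2)] += 1
    return flags


flag_counts = count_flags(LEDGER)
print(dict(flag_counts))
```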

It's funny that a register application is basically a trivial database, but it requires so much subtle UI/UX that it is really hard to make something usable. I've been using Quicken for 30+ years and I've tried other software, including my own and open source, and nothing really comes close to the ease of what they've built.

<shameless plug> If you're using Python, and you want to import transaction/balance data without handing over your passwords to some third party, you can try


</shameless plug>

Looks cool! This will definitely be of interest to folks using Beancount.

I just added a link to it on https://awesome-beancount.com/#tools.

I find Google Sheets pretty hard to beat - you can get stock pricing data via the GOOGLEFINANCE function, and I have a simple script that does imap -> sheets for bank and credit card balances. I'm able to keep an up to date view of my net worth without resorting to Mint/Plaid et al. And you can generate whatever charts and graphs you like fairly easily - have access on your phone, etc.

For budgeting, I import csv from my bank/credit cards into google sheets and tag transactions with ~20 different categories - I do this every month or two and it doesn't take that long. It gives me a backwards view of my spending that I can use to influence future spending. I don't run a hard budget really.
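
The tagging step can also be scripted. A rough Python sketch, assuming simple keyword rules: the rule table, category names, and sample transactions are all invented, and anything that matches no rule falls into "Uncategorized" for manual review.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical keyword -> category rules; a real list would be built up over time.
RULES = {"grocer": "Groceries", "uber": "Transport", "netflix": "Subscriptions"}


def categorize(description):
    """Return the category of the first matching keyword, else 'Uncategorized'."""
    lowered = description.lower()
    for keyword, category in RULES.items():
        if keyword in lowered:
            return category
    return "Uncategorized"


def totals_by_category(transactions):
    """Sum spending per category for (description, amount) pairs."""
    totals = defaultdict(Decimal)
    for description, amount in transactions:
        totals[categorize(description)] += Decimal(amount)
    return dict(totals)


transactions = [
    ("CITY GROCER #12", "54.30"),
    ("UBER TRIP", "18.00"),
    ("NETFLIX.COM", "12.99"),
    ("CITY GROCER #12", "20.70"),
    ("HARDWARE STORE", "40.00"),
]
totals = totals_by_category(transactions)
for category, total in sorted(totals.items()):
    print(f"{category:15} {total:>8}")
```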

hledger is another open-source CLI tool for accounting that works amazingly well for me. You can also add transactions by editing plain text, and there is a basic web UI on top of it.

And hledger is based on the original https://www.ledger-cli.org/ which is also a great option for CLI accounting.

Regarding "1. Download transactions from your bank" - to avoid having to deal with PDFs, it's possible to use account aggregators to fetch your own bank data in Python-friendly formats like JSON.

We have this feature at Nordigen (I'm the cofounder), where we allow developers to download their bank statements in JSON format from 1,000+ European banks (free): https://nordigen.com/en/blog/download-your-bank-statement-js...

For a data broker that deals with highly sensitive data, your privacy policy is very broad and unspecific (and as such not actually GDPR compliant). I would not trust you with my data that you will (amongst other purposes) use for:

"in order to derive statistical models for future data enrichment purposes", "to improve the performance and accuracy of the Services", "to improve Your experience while using the Services" and "to send Users and website visitors relevant informational content regarding Services and personalized offers they subscribed to".

"Nordigen may also provide personal data to companies that process personal data on behalf of Nordigen such as marketing service providers." - wonderful, thank you for doing that. not.

GDPR requires you to be more specific in terms of "purpose", to actually list your data processing partners, what purpose the data sharing with that particular partner serves, and which data exactly is shared with which partner.

This is great. Looking forward to a few of your "in development" entries of the Croatian banks!

Coming real soon. If you sign up to the platform, you can get updates on our coverage as we're ramping up in Croatia.

If you want an open source self-hosted budgeting solution, Firefly is the most commonly recommended (https://www.firefly-iii.org/). If you are looking for something supported/less DIY and with features beyond budgeting, check out Homechart (https://about.homechart.app).

The author of Firefly claims it does double entry, but that's just not true. I quickly gave up when I realized that I could not input a refund on my credit card account. It doesn't allow positive transactions. The only workaround, suggested by the author himself, was to edit the original transaction. Firefly is probably useful for many people, but to me it was too opinionated and not natural to use. I moved to GnuCash.

That's still double entry actually, because it refers to the DB structure ;)

I'm thinking about a PF app concept where you enter your own transactions manually, but you don't have to keep track of everything down to the cent.

At the end of every month, banks send out statements, so importing all transactions isn't necessary to keep track of one's overall financial standing.

Banks do make mistakes, but so rarely that it's not worth importing all transactions on the off chance that you'll catch one. Retailers can make mistakes too, like double charging you, but again those are rare and you can usually find them with a quick look at your statements.

Keeping track of individual transactions is important for figuring out where the money is going in general, but this doesn't require keeping track of 'every' transaction. Just the ones that make up the biggest percentage of the total money spent. They are usually easy to spot on the bank/credit card statements.

One doesn't even have to be perfectly accurate to gain from inspecting statements either. You spent $834.56 for new tires? You can enter that as $850 and it's fine.

Sure, there might be cases where the total is split more or less evenly across several small transactions and that can be handled similarly to the above.

Knowing where the money is going isn't terribly useful, actually, unless there is a lot of frivolous spending. It usually comes down to "you are not making enough money" or "the lifestyle you desire is too expensive for your current income". No app can fix either of these, other than by pointing out the obvious problem. Most personal finance blogs end up leaning towards "how to earn more" because that's more important than keeping track of your transactions or budgeting.

This is the philosophy behind the FOSS app I developed for my own system. The way it works is that I track the big stuff and ignore the little stuff. Periodically (typically every month but it doesn't have to be) I enter the current balances of my bank accounts.

I have a report that shows how much money I spent per period that is unaccounted for; that's money that got spent on "miscellaneous". As long as that number is beneath some threshold that I consider acceptable, I let it be. If it's high, I might look back and try to determine where my money went and make adjustments.

The guiding principle behind this approach is resilience to laziness. It's not an all or nothing system, instead it gives me whatever I give it. If I am busy and don't enter as many transactions, no worries - I'll just have more in the miscellaneous category. If I don't track my balances every month, the period will be longer. But the system always "balances".

The system also serves as a log for bigger purchases. When did I buy that fridge? How much did I pay? What is the exact model number of that piece of equipment in case I need to buy it again? I can search my transaction log.
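
The "unaccounted for" figure can be derived from two balance snapshots plus whatever transactions were entered in between. The sketch below is an illustration of that arithmetic with invented numbers, not the commenter's app: the function name and the convention of signed amounts (income positive, spending negative) are assumptions.

```python
from decimal import Decimal


def unaccounted_spending(opening, closing, tracked):
    """Money that left the accounts without a matching tracked transaction.

    opening/closing: total account balances at the start and end of the period.
    tracked: signed amounts entered by hand (+ income, - spending).
    """
    expected_closing = opening + sum(tracked, Decimal("0"))
    return expected_closing - closing


# Hypothetical period: only the paycheck, rent, and new tires were entered.
tracked = [Decimal("3000.00"), Decimal("-1200.00"), Decimal("-834.56")]
misc = unaccounted_spending(Decimal("5000.00"), Decimal("5600.00"), tracked)
print("miscellaneous:", misc)
```

Entering fewer transactions only makes the miscellaneous figure larger; the period still reconciles against the balances, which is the "resilience to laziness" property described above.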

Awesome! Could you provide a link for it?

I'd rather not link my HN profile to GitHub, which uses my real name. I went to your profile to see how I could send it to you directly but... there's no email listed on your site or GitHub and I can't message you on Twitter (I guess that's a setting?)

Anyway, while it's FOSS, it's not really packaged for easy reuse. You'd need to set up a server and configure it. It might be good as inspiration or as a starting point for your own project though, so if you're still interested I can log in to LinkedIn later and message you there.

Ah yes, that's true, thank you for trying. I didn't expose email on purpose. You are right LinkedIn would be the only way for now if you didn't want to post it publicly.

I don't think double-entry bookkeeping is such common knowledge. Anyway, there are discussions and links trying to teach "accounting for developers" on https://news.ycombinator.com/item?id=23964513

I personally find the topic difficult, and it comes with a lot of nonsensical jargon.

I built https://mygraph.ca as I was tired of Mint failing to sync with Canadian banks (and later dropped Mint anyway because of the security/privacy concerns).

It’s not targeting technical users but it would be cool to add a SQL integration like Stripe Sigma, etc at some point.

Hey this is pretty cool, I posted an app too and see some opportunities for collaboration here (especially since I'm starting to explore data visualization capabilities). Can you email me? My contact info is in my profile.

This is really cool but I feel like it glosses over married accounting, which is a longer session of yelling things like "Honey! What's this PayPal for $117 3 weeks ago?!"

a REST API such as: https://plaid.com/

Plaid has serious privacy and security concerns, so I would be careful:

- https://news.ycombinator.com/item?id=28200076

- https://news.ycombinator.com/item?id=28389576

From https://news.ycombinator.com/item?id=28203393 :

> No, your personal data is not sold or rented or given away or bartered to parties that are not Plaid, your bank, or the connected app. We talk about all of this in our privacy policy, including ways that data could be used — for example, with data processors/service providers (like AWS which hosts our services) for the purposes of running Plaid’s services or for a user’s connected app to provide their services.

>> I saw that. Thank you for your patience and persistence in responding to so many pointed questions.

>> For any interested, here is a link to relevant section of the referenced privacy policy: https://plaid.com/legal/#consumers

>> I am also impressed by the Legal Changelog on the same page that clearly lays out a log of changes made to privacy & other published legal documents.

The comments in those threads were more negative than positive, and the fact that Plaid paid a $58 million settlement for allegedly sharing personal banking data with third parties without consent is telling enough. I am not going to give my banking usernames and passwords to Plaid in plaintext, when their employee is arguing on HN over what the word "sold" means:


Are you making claims without evidence? Settling is not admission of guilt.

Banks should implement read-only OAuth APIs, so that users are not required to store their u/p/sqa answers.

From "Canada calls screen scraping ‘unsecure,’ sets Open Banking target for 2023" https://news.ycombinator.com/item?id=28229957 :

> AFAIU, there are still zero (0) consumer banking APIs with Read-Only e.g. OAuth APIs in the US as well?

Looks like there may be less than 3 so far.

> Banks could save themselves CPU, RAM, bandwidth, and liability by implementing read-only API tokens and methods that need only return JSON - instead of HTML or worse, monthly PDF tables for a fee - possibly similar to the Plaid API: https://plaid.com/docs/api/

> There is competition in consumer/retail banking, but still the only way to do e.g. budget and fraud analysis with third party apps is to give away all authentication factors: u/p/sqa; and TBH that's unacceptable.

> Traditional and distributed ledger service providers might also consider W3C ILP: Interledger Protocol (in starting their move to quantum-resistant ledgers by 2022 in order to have a 5 year refresh cycle before QC is a real risk by 2027, optimistically, for science) when reviewing the entropy of username+password_hash+security_question_answer strings in comparison to the entropy of cryptoasset account public key hash strings: https://interledger.org/developer-tools/get-started/overview...

> Are you making claims without evidence?

No. Plaid did agree to pay the $58 million, and the lawsuit was for alleged data sharing with third parties without user consent. I don't care if they admit guilt or not. They agreed to pay $58 million to end the lawsuit, and that does not engender trust. Shifting the blame to banks doesn't make Plaid any more reputable.

Providing usernames and passwords of sensitive accounts to a third party is a privacy and security risk, and Plaid has not earned enough trust from me to justify the risk I would need to assume to use their services.

How did their policies change before and after said settlement?

From https://my.plaid.com/help/360043065354-does-plaid-have-acces... :

> Does Plaid have access to my credentials?

> The type of connection Plaid has to your financial institution determines whether or not we have access to the login credentials for your financial account: your username and password.

> In many cases, when you link a financial institution to an app via Plaid, you provide your login credentials to us and we securely store them. We use those credentials to access and obtain information from your financial institution in order to provide that information, at your direction, to the apps and services you want to use. For more information on how we use your data, please refer to our End User Privacy Policy.

> In other cases, after you request that we link your financial institution to an app or service you want to use, you will be prompted to provide your login credentials directly to your financial institution––not to Plaid––and, upon successful authentication, your financial institution will then return your data to Plaid. In these cases, Plaid does not access or store your account credentials. Instead, your financial institution provides Plaid with a type of security identifier, which permits Plaid to securely reconnect to your financial institution at regularly scheduled intervals to keep your apps and services up-to-date.

> Regardless of which type of connection is made, we do not share your credentials with the apps or services you’ve linked to your financial institution via Plaid. You can read more about how Plaid handles data here.

What do you think this should say instead?

Do you think they use the same key to securely store all accounts, like ACH? Or no key, like the bank ledger that you're downloading a window of as CSV through hopefully a read-only SQL account, hopefully with data encrypted at rest and in motion.

When you download a CSV or a OFX to a local file, is the data then still encrypted at rest?

Again, US Banks can eliminate the need for {Plaid, Mint, } as the account data access middlemen by providing a read-only OAuth API. Because banks do not have a way to allow users to grant read-only access to their account ledgers, the only solution is to securely store the u/p/sqa. If you write a script to fetch your data and call it from cron, how can you decrypt the account credentials after an unattended reboot? When must a human enter key material to decrypt the stored u/p/sqa?

Here, we realize that banks should really have people who do infosec audits, and who comprehend symmetric and asymmetric cryptography, to point out these sorts of vulnerabilities and risks. And if they had kept current with the times, we would have a very different banking and finance information system architecture with fewer single points of failure.

I'm not interested in what Plaid puts in a help page, since Plaid's $58 million settlement is for alleged data sharing with third parties without consent, meaning that Plaid is accused of not properly communicating the alleged data sharing to its users or obtaining permission.

And Plaid's terms of service (https://plaid.com/legal/#how-we-use-your-information) contains vague catch-alls such as:

> We share your End User Information for a number of business purposes:

> With our data processors and other service providers, partners, or contractors in connection with the services they perform for us or developers

Sure, it would be great if banks offered different authentication systems, but that has nothing to do with my lack of trust for Plaid. A different authentication system wouldn't eliminate the data sharing concerns I have with Plaid.

Wow! Great work on an alternative.

Sounds interesting, but it’s still a long way to beat the best finances tracking system, good ol’ Excel sheets.
