Off by 1 (Day)

Jan 28, 2019 | Beancount, Personal Finance, Practices, Python

One of the most common bugs when writing software is the classic "off by 1" error. In this post, I'll talk about a similar bug I found in some code I maintain, and how I fixed it.

Backstory

I maintain beancount-dkb, which is a Python package that provides helper classes for converting DKB CSV exports to the Beancount format. In Beancount's terminology, these "helper" classes are called "importers".

If you're not familiar with Beancount, it's a plain-text accounting tool which lets you keep track of all your finances using plain text files. The idea is that you maintain all your bank transactions in one text file, and then use the tools that Beancount provides to run reports over all that data. The transactions in this file follow the Double Entry Accounting method, and are written in a DSL strictly specified by Beancount.

The way this works in practice is that every few weeks, you download transactions from your bank (often this is a simple CSV export), and run them through an importer to convert them into the data format that Beancount expects. You then append the resulting data to a .beancount file you maintain which contains all your transactions, going all the way back to the stone age. Finally, you use the suite of tools that Beancount provides to run all sorts of analysis on your financial data.

It's actually much less complicated than it sounds.

I've used it to import my financial history from the last three years and the whole process has been quite smooth, except for one hiccup: balance assertions. And that's what this post is about.

Balance Assertions

Here's a short code snippet that represents the history of a single bank account, written in the Beancount DSL. The bank account is named Assets:DKB and starts out with an opening balance of €100.

For simplicity, the history here consists of a single "going to the supermarket" transaction.

;; -*- mode: beancount -*-

; Date format - YYYY-MM-DD

option "title" "Max Mustermann"
option "operating_currency" "EUR"

2019-01-01 open Assets:DKB
2019-01-01 open Equity:Opening-Balances
2019-01-01 open Expenses:Supermarket

2019-01-15 * "Initialize Assets:DKB"
    Assets:DKB                           100.00 EUR
    Equity:Opening-Balances

2019-01-15 * "Going to the supermarket"
    Assets:DKB                          -30.00 EUR
    Expenses:Supermarket                 30.00 EUR

2019-01-16 balance Assets:DKB            70.00 EUR

The transaction with the description "Going to the supermarket" shows that the owner went to some supermarket and spent €30, which means that the account has €70 left at the end.

The interesting bit here is the last line.

2019-01-16 balance Assets:DKB            70.00 EUR

This line instructs Beancount to assert that the balance of the given account is the given amount at the beginning of the given date. And in case that's not true, Beancount should refuse to process things any further because there's obviously something wrong with the data.
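To make the "beginning of the given date" part concrete, here's a minimal sketch in plain Python. It does not use Beancount itself; the `postings` list and the `balance_at_start_of` helper are made up for illustration and simply mirror the example above. The point is that only postings dated strictly before the assertion date count towards the asserted balance.

```python
from datetime import date
from decimal import Decimal

# Postings to Assets:DKB from the example above, as (date, amount) pairs.
# This is a hand-written stand-in for what Beancount tracks internally.
postings = [
    (date(2019, 1, 15), Decimal("100.00")),  # "Initialize Assets:DKB"
    (date(2019, 1, 15), Decimal("-30.00")),  # "Going to the supermarket"
]


def balance_at_start_of(day, postings):
    """Sum all postings dated strictly before `day` (hypothetical helper)."""
    return sum((amount for d, amount in postings if d < day), Decimal("0"))


# An assertion dated 2019-01-16 sees everything booked on 2019-01-15.
print(balance_at_start_of(date(2019, 1, 16), postings))

# An assertion dated 2019-01-15 would see none of that day's postings.
print(balance_at_start_of(date(2019, 1, 15), postings))
```

So an assertion of 70.00 EUR holds on 2019-01-16, while the same assertion dated 2019-01-15 would fail, because at the start of that day the account was still empty.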

Such assertions are not strictly necessary, but having them gives you the peace of mind that the data you're working with is correct. Bad data can creep in through something like duplicate transactions. Often, when you have two accounts and you transfer money from one account to the other, the same transaction is going to show up in both account summaries. For Beancount they are two different transactions, but practically speaking they are two legs of the same transaction. Left unmerged, these would result in wrong numbers on both accounts. This is where balance assertions come in handy.

The Bug

When I first started out with Beancount, there were no balance assertions in my data. I knew the concept, but the initial versions of beancount-dkb I released didn't output any balance directives.

After a few months of regular Beancount usage and realizing how useful these assertions can be, I decided to implement support in beancount-dkb. This was not too much work since the documentation is pretty clear on how to do this.

So something like this,

"01.01.2019";"";"";"";"Tagessaldo";"";"";"100,00";

... becomes this

2019-01-01 balance Assets:DKB            100.00 EUR
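A rough sketch of that conversion, using only the standard library, could look like the following. This is not the actual beancount-dkb code: the function names are hypothetical, and the column positions (date first, amount eighth) and the German number format ("." for thousands, "," for decimals) are assumptions based on the sample line above.

```python
import csv
from datetime import datetime
from decimal import Decimal


def parse_tagessaldo(line):
    """Parse one "Tagessaldo" CSV line into (date, amount).

    Assumes the date is in column 0 and the amount in column 7,
    as in the sample line shown above.
    """
    row = next(csv.reader([line], delimiter=";", quotechar='"'))
    day = datetime.strptime(row[0], "%d.%m.%Y").date()
    # Assumed German number format: "1.234,56" -> "1234.56"
    amount = Decimal(row[7].replace(".", "").replace(",", "."))
    return day, amount


def format_balance(day, amount, account="Assets:DKB", currency="EUR"):
    """Render a Beancount balance directive as a single line of text."""
    return f"{day.isoformat()} balance {account} {amount} {currency}"


line = '"01.01.2019";"";"";"";"Tagessaldo";"";"";"100,00";'
day, amount = parse_tagessaldo(line)
print(format_balance(day, amount))
# prints: 2019-01-01 balance Assets:DKB 100.00 EUR
```

Note that this sketch copies the date over unchanged, exactly like the example above.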

What turned out to be more work was testing the whole thing.

After the initial implementation, I noticed the numbers just weren't adding up. The test cases were fine, but the test cases were made-up data anyway. The output on the actual data just didn't add up. After a few hours of trying to figure things out, I found this little gem in the documentation.

Note that a balance assertion, like all other non-transaction directives, applies at the beginning of its date (i.e., midnight at the start of day). Just imagine that the balance check occurs right after midnight on that day.

The difference is subtle, but can easily lead to numbers not adding up.

What's happening here is that Beancount is expecting the balance amount to be valid at the beginning of the day, while the balance values from the DKB output correspond to the amount at the end of the day. Note that the DKB behavior is not documented anywhere (and if it is, I couldn't find the relevant docs), but from all the data I saw, this makes the most sense.

Conclusion

The fix in this case was easy. Just set the date of the balance directive to 1 day after what's in the CSV. This shifts the time of the assertion to midnight at the start of the next day, which in turn makes the numbers all look good.
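In code, the idea boils down to a one-line date shift. The snippet below is a minimal sketch rather than the actual patch; `balance_directive_date` is a name made up for this example.

```python
from datetime import date, timedelta


def balance_directive_date(csv_date):
    """Map an end-of-day balance date from the CSV to a Beancount date.

    The CSV balance is (as far as I can tell) valid at the end of
    `csv_date`, while Beancount checks at the start of the directive's
    date, so the directive has to be dated one day later.
    """
    return csv_date + timedelta(days=1)


print(balance_directive_date(date(2019, 1, 15)))  # prints: 2019-01-16
print(balance_directive_date(date(2019, 1, 31)))  # prints: 2019-02-01
```

Using `timedelta` instead of bumping the day number by hand means month and year boundaries are handled correctly.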

Admittedly, this bug wasn't too hairy and the fix wasn't that tricky either. But in cases like this where things are rather subtle, it can take you anywhere from a few minutes to a few hours to find a fix. For me, it was somewhere in between.

The latest release of beancount-dkb includes this patch. I have (more or less completely) rewritten my Beancount data using the latest code with the correct balance assertions and so far things have been smooth. Apologies if an older version affected your data!