Escape from QuickBooks (with data in hand)
This article is not a review of GnuCash as a business accounting application — that will come later. But GnuCash seemed like a reasonable place to start. But pity the poor explorer who goes searching for information on how to move from QuickBooks to GnuCash and stumbles into this text in the GnuCash FAQ:
Such discouragement from the authors of the application itself is certainly enough to send the hunt for a new accounting package elsewhere. Interestingly, it seems that most people who have looked at this problem have concluded that the proprietary QuickBooks format makes the whole task impossible. But it is possible to get most of a business's data out of QuickBooks in a relatively useful machine-readable form, especially if your use of QuickBooks is relatively simple. Here is how I did it, starting with QuickBooks Pro 2015.
Liberating the data
It would be nice if one could select an option to extract the entire contents of a QuickBooks company file in some sort of open format. Failing that, one has to do it in two steps, and the result is not quite as complete as one would like.
QuickBooks stores much of its information in "lists", and those can be extracted in the IIF format. IIF is also said to be proprietary, but it's text-based and relatively easy to parse. Even so, some time spent searching for an available Python module for reading IIF was in vain, strangely; there must certainly be many of them out there. Oh well, when in doubt, reinvent a new wheel and go forward.
Anyway, when faced with the QuickBooks main screen, one can follow the menus through File → Utilities → Export → Lists to IIF Files and get a dialog with the set of lists that can be written out. The lists of the highest likely interest (and the ones that my importer tool can deal with) are the lists of customers and vendors, and the chart of accounts. There are a number of others available, including the "other names" list, but the purpose of that list seems to have a place to put a name when QuickBooks is absolutely determined to add it to a list, and no other list is applicable. The employee list will be of interest to some, undoubtedly, and perhaps the payment terms list.
The other piece of the puzzle is the set of transactions stored in the general ledger — the data that one uses an accounting system to track. That can be had by pulling up the Edit → Find dialog, selecting "Advanced", and entering the date range of interest; the result of the search can be exported to a CSV file with the transaction data. Be careful, though: QuickBooks 2015, at least, will silently cap the number of lines written to this file to 32768, so if you have a lot of data to export, it will need to be done in multiple steps.
That gets most of what most people are likely to want, with a couple of exceptions. One is the initial balances for the accounts. The chart of accounts is exported with balance information, but those are the final balances and unlikely to be of any real use. There does not appear to be an easy way to avoid copying down and entering initial balances by hand. The other missing piece is bills and invoices. The relevant entries in accounts payable and accounts receivable will be there, but the surrounding metadata will not. If that data is needed, it will need to be moved by hand — a somewhat painful prospect if one has a lot of outstanding items.
Feeding it to GnuCash
This data, once extracted, can eventually be fed to just about any sort of
accounting system, but the job is likely to be different for each.
GnuCash, happily, offers a Python-based interface (Python 2 only) that
can be used to manipulate its database. Unhappily, one of the most
accurate and complete parts of the documentation on the Python bindings can
be found on the
GnuCash wiki: "Python bindings have been recently added to
gnucash. There is still very little documentation and probably few people
would know how to use it.
" Developers wanting to use the Python
bindings are left scrounging the web looking for examples to crib from.
It is worth noting that there is another GnuCash interface called piecash. It has the advantages of being better documented and being ported to Python 3. On the other hand, piecash only works with GnuCash files stored in a relational database — a GnuCash mode that appears to be poorly developed and maintained and, as a result, generally not enabled by distributors. The stock GnuCash bindings, instead, will work with both database-backed and native XML files.
What follows is an overview of two programs I wrote using the GnuCash Python bindings. qb_iif_to_gc is a simple importer for the list data extracted above; it needs to be run first to set up a new GnuCash file. Then qb_trans_to_gc can be run to import the transaction data. All of this code can be pulled from the repository listed at the end of this article.
One starts, of course, by installing the Python bindings, which are usually packaged separately from the application. The gnucash module provides access to basic functionality, while gnucash.gnucash_business has some of the more business-oriented features. Beyond that, it's not unusual to need to go beyond what has been provided in those files; in that case, gnucash.gnucash_core_c has a much more extensive — but lower-level — interface generated directly with the SWIG tool.
The first order of business is to open a GnuCash session to work with a file:
session = gnucash.Session(file, ignore_lock=False, is_new=False,
force_new=False)
The file parameter is, of course, the name of the GnuCash file to work with. It uses the URI notation, so the way to specify a local XML file would be with something like xml://path/to/file.gnucash. If the file exists, only the file name needs to be provided and things will work. When creating a new, empty file (by passing is_new=True), though, the full URI must be provided. The ignore_lock parameter can be used to open a file that is locked elsewhere — most helpful when debugging a script that may not have gotten around to properly closing the file the last time it exited.
When the job is done, changes to the session should be written out and the session closed with calls like:
session.save()
session.end()
Beyond that, the only real use for the session within a script is to extract session.book, which is the "book" containing the actual accounting data. One other bit of useful data to grab at the outset is the default currency to use with accounts; my scripts do:
book = session.book
ctable = book.get_table()
dollars = ctable.lookup('CURRENCY', 'USD')
LWN is based in the US, so we can happily pretend that everybody just uses dollars and be done with it. If your company manages accounts in more than one currency, a little more attention will need to be paid here.
Accounts
Before much of anything else, it is necessary to set up a chart of accounts. Depending on how one is doing the migration, one might wish to create a new chart by hand or to simply import the chart from QuickBooks. One other small twist is that the GnuCash way of organizing accounts groups them by type in a hierarchy; all expense accounts start with Expenses/, for example. QuickBooks does not do things that way. There appears to be nothing in GnuCash that requires this organization, so following it is a matter of choice. The qb_iff_to_gc tool will, if run with -r, reparent accounts to fit them within the default GnuCash model.
One creates an account in GnuCash by instantiating an Account object, setting its parameters, and slotting it into the hierarchy. The root of the hierarchy can be had by calling the book's get_root_account() method. So a simple example of creating a new income account would look something like this:
acct = gnucash.Account(book)
acct.BeginEdit()
acct.SetName('Advertising Income')
acct.SetType(gnucash.ACCT_TYPE_INCOME)
acct.SetCommodity(dollars)
book.get_root_account().append(acct)
acct.CommitEdit()
The BeginEdit() and CommitEdit() methods are common to most GnuCash objects and, as one might expect, they are used to bracket a set of changes to that object. There is also a RollbackEdit() method should one change one's mind. One thing that jumps out from the example code found around the net is that almost nobody bothers with these calls except when dealing with Transaction objects. The code works just fine without them, but I feel that they have been provided for a reason and it is probably safer to call them.
The IIF file exported by QuickBooks provides the name and account type, so filling in this information is relatively simple. But the above example creates a top-level account. When creating a hierarchy, there's a bit more to do. LWN's chart of accounts includes one that is exported by QuickBooks as "Professional Fees:Freelance Authors"; in GnuCash that would need to be stored as Professional Fees/Freelance Authors, or even Expenses/Professional Fees/Freelance Authors. Creating that requires walking the hierarchy. The necessary code looks something like this:
def find_parent(name, root):
sname = name.split(':')
parent = root
for acct in sname[:-1]:
parent = parent.lookup_by_name(acct)
if not parent:
print 'Failed to find container account', acct
return root, sname[-1]
return parent, sname[-1]
The parent account returned by this function is the one on which append() should be called to add the new account to the hierarchy. Note that this function assumes that the upper-level accounts already exist. When importing a chart of accounts exported by QuickBooks without changes, that will always be true. If account names are being remapped (a feature built into qb_iff_to_gc), though, that condition may not hold and intermediate accounts may need to be created while walking the tree.
Vendors and customers
An established business is likely to have a long list of vendors and customers; fortunately, those can be imported automatically. Doing the import requires dealing with the GnuCash business interface, which looks a little different from what we have seen so far. A vendor with a given name would be created with a call like:
vendor = gnucash_business.Vendor(book = book,
id = book.VendorNextID(),
currency = dollars,
name = name)
The explicit call to book.VendorNextID() is needed to get an ID number to associate with the vendor; the rest should be relatively self-explanatory. Setting the vendor's contact information requires a few extra calls, though, to set up an address object:
addr = vendor.GetAddr()
addr.BeginEdit()
addr.SetName(ventry['PRINTAS'] or name)
addr.SetAddr1(ventry['ADDR1'])
addr.SetAddr2(ventry['ADDR2'])
addr.SetAddr3(ventry['ADDR3'])
addr.SetPhone(ventry['PHONE1'])
addr.CommitEdit()
Here, ventry is the vendor entry from the QuickBooks IIF file; most of the information copies over in a fairly straightforward way. There is one little glitch, though: sometimes vendors have a tax ID number that must be used for sending (for example) 1099 forms or reports to the government. GnuCash doesn't seem to have a way of storing that number directly, which is a bit of a shortcoming, unfortunately. There is, though, a way of storing notes with a vendor, so I chose to stash the tax ID there. The high-level interface provides no access to the notes, though, so one has to resort to the low-level interface:
if ventry['TAXID']:
inst = vendor.get_instance()
gnucash_core_c.gncVendorSetNotes(inst, ventry['TAXID'])
Let's just say that a certain amount of digging was required to figure that out.
The interface for customer data is almost identical:
cust = gnucash_business.Customer(book = book,
id = book.CustomerNextID(),
currency = dollars,
name = name)
Once again, a new ID number must be explicitly generated. Contact information for the customer is set by calling its GetAddr() method as is done for vendors above.
And that is the core of the qb_iif_to_gc program; interested readers can read the full script or grab the repository. There is more information that can be exported an imported in this manner; in particular, some will certainly have a need to import employee information. We don't store that information in QuickBooks, though, so I haven't implemented an import function for that data.
Transactions
Once the chart of accounts is in place, it's time to populate the ledger with transaction data; that is the task of the qb_trans_to_gc tool. Importing this data is actually fairly straightforward, but a couple of terms need to be defined first. A "transaction" describes a complete financial operation, such as LWN buying a case of beer using a debit card. A transaction is made up of two or more "splits", each of which describes an entry in a single ledger. In this case, one split is entered into the ledger for the bank account, decreasing its balance by the cost of the beer. The other split increases the balance of the appropriate expense account ("Office Supplies" in this case). The amounts in the splits differ only in sign in this case — the amount paid for the beer equals (the absolute value of) the amount taken from the bank account.
A more complicated transaction can have more splits. Paying an employee may involve a big debit from a bank account, with corresponding splits for the money paid to the employee, taxes paid to the government, withholdings for employee benefits, etc. An LWN subscription purchased with PayPal generate three splits: the funds come in via "Subscription Income", then out to the PayPal "bank" account and the expense account for PayPal fees. Regardless of the number of splits, the amounts involved must add up to zero in the end.
QuickBooks dumps each split into the CSV file as a separate line, with nothing to mark when one transaction stops and the next begins. It does, however, include the in-transaction balance in each line, so a balance of zero is a reasonably good end-of-transaction indicator. It is not foolproof; one could easily construct a transaction with a balance of zero partway through but, in the real world, that tends not to happen. So, by keying on that zero balance, it's possible to assemble a transaction from the splits provided by QuickBooks.
A transaction is, unsurprisingly, represented by a Transaction object, created and initialized with code like the following (where entry) is the first line from the CSV file describing the transaction):
trans = gnucash.Transaction(book)
trans.BeginEdit()
trans.SetCurrency(dollars)
trans.SetDescription(entry['Name'])
if entry['Num']:
trans.SetNum(entry['Num'])
trans.SetDate(day, month, year)
The description field for a transaction is often the counterparty to the transaction — the beer store, for example. Those of us old enough to remember writing checks will also remember that they have a number associated with them; if a number is present, SetNum() can be used to add it here.
Note that nothing done so far describes any money changing hands; that's all done in the splits. Before getting there, though, we should look at how GnuCash represents numbers like currency amounts. These amounts are often given as decimal values, but anybody who has taken a numerical analysis class understands the danger of storing them as floating-point numbers. GnuCash doesn't do that; instead, these numbers are stored as scaled integers in the GncNumeric class. Here is the utility function I bashed together to create such values from the (text) floating-point values output by QuickBooks:
SCALE = 1000
def GCVal(value):
dollars, cents = map(int, value.split('.'))
ival = dollars*SCALE
if value[0] != '-':
ival += cents*(SCALE/100)
else:
ival -= cents*(SCALE/100)
return gnucash.GncNumeric(ival, SCALE)
With that in place, creating a split is a matter of instantiating a Split object, filling in the relevant information, and attaching it to the transaction:
split = gnucash.Split(book)
split.SetValue(GCVal(entry['Amount']))
split.SetAccount(LookupAccount(entry['Account']))
split.SetMemo(entry['Memo'])
split.SetParent(trans)
Note that, unusually, splits do not have BeginEdit() and CommitEdit() methods.
One final complication is that, in an existing company file, most of the transactions are likely to be marked "reconciled"; that is information that should not be lost. GnuCash stores this information in the splits rather than in the transaction object. Unfortunately, this functionality isn't available in the high-level interface, so setting the flag (to either "y" or "n") requires going low:
gnucash_core_c.xaccSplitSetReconcile(split.get_instance(), reconciled)
Once all of the splits are in place for a transaction, a call to CommitEdit() completes the job.
Next steps
These two scripts will yield a mostly complete GnuCash file, with the omissions described above, of course. A good confidence-building measure at this point is to generate some reports in GnuCash and verify that they match what QuickBooks says. In my case, an obvious next step is to toss the scripts used with the painful QuickBooks data import process and write new ones to push company data directly into the GnuCash file. With the tools described above, that should not be that hard to do.
These scripts are available under the GPL; they can be cloned from the
repository at git://git.lwn.net/qb-escape.git. As the
process of investigating accounting systems continues, this repository will
likely accumulate more import scripts. Stay tuned.
