
Leading items

Welcome to the LWN.net Weekly Edition for August 15, 2019

This edition contains the following feature content:

  • Hardening the "file" utility for Debian: an attempt to sandbox file with seccomp() runs into trouble.
  • Grand Schemozzle: Spectre continues to haunt: the SWAPGS speculative-execution vulnerability and its mitigation.
  • Long-term get_user_pages() and truncate(): solved at last?: layout leases may finally resolve a longstanding conflict.
  • Corner cases and exception types: Python's walrus operator and the fate of TargetScopeError.
  • Akaunting: a web-based accounting system: a look at a free-software QuickBooks alternative.

This week's edition also includes these inner pages:

  • Brief items: Brief news items from throughout the community.
  • Announcements: Newsletters, conferences, security updates, patches, and more.

Please enjoy this week's edition, and, as always, thank you for supporting LWN.net.

Comments (none posted)

Hardening the "file" utility for Debian

By Jake Edge
August 14, 2019

The file command would seem to be an ideal candidate for sandboxing; it routinely handles untrusted input. But an effort to add seccomp() filtering to file for Debian has run aground. The upstream file project has added support for sandboxing via seccomp() but it does not play well with other parts of the Debian world, package building in particular. This situation provides further evidence that seccomp() filtering is brittle and difficult to use.

The discussion began with a post to the debian-devel mailing list where Christoph Biedl announced that he had enabled the file sandbox feature for the unstable repository. He was asking that other Debian developers keep an eye out for problems. He noted that the feature has some drawbacks:

This however comes with a price: Some features are no longer available. For example, inspecting the content of compressed files (disabled by default, command-line parameters -z and -Z) is now supported for a few compressions only: gzip (and friends, see libz), bzip2, lzma, xz. Decompressing other formats requires invocation of external programs which will lead to a program abort (SIGSYS).

In addition, he had already encountered problems with file running in environments with non-standard libraries that were loaded using the LD_PRELOAD environment variable. Those libraries can (and do) make system calls that the regular file binary does not make; the system calls were disallowed by the seccomp() filter.

Building a Debian package often uses fakeroot to run commands in a way that makes it appear that they have root privileges for filesystem operations—without actually granting any extra privileges. That is done so that tarballs and the like can be created containing files with owners other than the user ID running the Debian packaging tools. Fakeroot maintains a mapping of the "changes" made to owners, groups, and permissions for files so that it can report those to other tools that access them. It does so by interposing a library ahead of the GNU C library (glibc) to intercept file operations.

In order to do its job, fakeroot spawns a daemon (faked) that is used to maintain the state of the changes that programs make inside of the fakeroot. The libfakeroot library that is loaded with LD_PRELOAD will then communicate to the daemon via either System V (sysv) interprocess communication (IPC) calls or by using TCP/IP. Biedl referred to a bug report in his message, where Helmut Grohne had reported a problem with running file inside a fakeroot. The msgget() system call was the cause in that case; Biedl changed the Debian file whitelist to specifically allow that call before his announcement:

Also, when running in LD_PRELOAD environments, that extra library may use blacklisted syscalls. One example is fakeroot which caused breakage in debhelper (#931985, already fixed). In both cases you should see a log message in the kernel log then.

There is a workaround for such situations which is disabling seccomp, command line parameter --no-sandbox.

As it turns out, though, his fix was specific to the sysv IPC mechanism; in order to make it work with TCP/IP, more whitelisting of system calls will be needed, as Grohne pointed out. Furthermore, blocking mechanisms like IPC and networking is just what the filter should be doing; those are the kinds of calls you don't want to make if file is compromised, he said. Instead of playing whack-a-mole with system calls, he suggested checking for the presence of LD_PRELOAD libraries and turning off the sandbox for those cases.
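Grohne's suggested check is simple to sketch. The following few lines use Python for illustration (file itself is written in C) and the helper name is invented:

```python
import os

def sandbox_allowed(environ=os.environ) -> bool:
    """Hypothetical helper: skip the seccomp() sandbox whenever an
    LD_PRELOAD library is present, since such a library may make
    system calls that the whitelist never anticipated."""
    return "LD_PRELOAD" not in environ

# With no preloaded library, the sandbox is safe to enable.
print(sandbox_allowed({}))                                # True
# A preloaded libfakeroot would trip the filter, so disable it.
print(sandbox_allowed({"LD_PRELOAD": "libfakeroot.so"}))  # False
```

The check is deliberately coarse: it trades some security (any preloaded library disables the sandbox) for not breaking legitimate LD_PRELOAD users like fakeroot.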

That idea did not sit entirely well with Biedl, who was concerned with "silently disabling this security feature in a production system". He thought that perhaps disabling the filter for build environments might be a way forward. Meanwhile, on debian-devel, several people thanked Biedl for enabling the filter, seeing it as a good step toward helping to secure the system. Russ Allbery said:

I would love to see more places where seccomp is at least present, if optional and off by default, since it provides an option to use the program more securely and accept that it breaks a lot of features. A great example would be ghostscript -- I would love to be able to prevent it from executing programs, writing to arbitrary files, and many other things that, strictly speaking, are part of the PostScript spec and therefore upstream wants to support in the more general case. Everyone who cares about this already has to pass in -dSAFER, so we're already dealing with the complexity cost of this being optional.

But Biedl eventually had to deliver some bad news in the thread. He disabled the system-call filtering in file because of the problems it caused:

Several issues popped up in the last days as a result of that change, and in spite of some band-aiding the current implementation of seccomp in the file program creates way more trouble than I am willing to ignore. So, sadly, I've reverted seccomp support for the time being to avoid further disruption of the bullseye development.

However, he did point out that Grohne had suggested some ideas for ways to make the sandboxing of file more workable. In the bug report, Grohne said:

The issue at hand is that file sets up its sandbox early and thus has to allow a ton of system calls (including open and write) even in the full sandbox. You can easily append to ~/.bashrc and escape the next time a user logs in. I'm arguing that this sandbox is somewhat useless, because it is way too weak. If file were opening the input file and the magic database prior to applying the sandbox, it could support a much stricter sandbox. In principle, it could do with just write (for communicating the result) and _exit. You can implement that using prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT, 0, 0, 0), which is available for more than 10 years now. The code for loading (not parsing) the input file and the magic database is relatively harmless and confining it is what breaks fakeroot. The parsing code doesn't need syscalls, so it should be unaffected by most LD_PRELOAD hacks.

Of course, getting there is essentially rewriting the seccomp feature in file. You cannot easily bolt it onto file in the way it currently is.
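The ordering Grohne describes—do all the privileged work first, then sandbox, then parse—can be sketched roughly as follows. Python is used only for illustration and the names are invented; note that the prctl() call is merely indicated in a comment, since the Python runtime itself needs far more system calls than SECCOMP_MODE_STRICT permits (real file would do this in C):

```python
import os

def identify(path: str) -> str:
    # Phase 1: privileged setup. Open the input file (and, in the real
    # file(1), the magic database) *before* entering the sandbox.
    fd = os.open(path, os.O_RDONLY)
    header = os.read(fd, 4)
    os.close(fd)

    # Phase 2: enter the sandbox. In C this would be:
    #     prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT, 0, 0, 0);
    # after which only read(), write(), _exit(), and sigreturn() are
    # permitted; any other system call kills the process. (Not called
    # here: the Python interpreter cannot survive strict mode.)

    # Phase 3: pure parsing, which needs no system calls at all and is
    # therefore unaffected by the sandbox or by LD_PRELOAD hacks.
    if header[:2] == b"\x1f\x8b":
        return "gzip compressed data"
    if header == b"%PDF":
        return "PDF document"
    return "data"
```

The point of the structure is that the risky parsing code in phase 3 runs with almost no privileges, while the calls that fakeroot and friends need to intercept all happen in phase 1, before the sandbox exists.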

That is something that will need to be worked out with the upstream project and Biedl said that he plans to do so. There were several suggestions on how to approach the problem in the mailing list thread as well. Colin Watson commiserated with Biedl, reporting on the problems he encountered when adding seccomp() filtering:

I ran into a ton of annoying problems like that when I added seccomp filtering to man-db (the idea there being to limit what damage might be done by potential bugs in groff and friends). The worst difficulties are from third-party programs that some people have installed: there are a couple of apparently fairly popular Linux antivirus tools that work by installing an LD_PRELOAD wrapper that talks to a private daemon using a Unix-domain socket and/or a System V message queue; there's a VPN that does something similar even though it really has no business existing at this level or interfering with processes that have nothing to do with networking; and there's the "snoopy" package in Debian that logs messages to /dev/log on execve.

At the moment my compromise solution is to reluctantly open up the minimum possible set of syscalls I could find that stopped people sending me bug reports that were in fact caused by something injected from outside my software, and to limit most of that to only those cases where I've detected the relevant LD_PRELOAD wrappers as being present.

The fragility of the seccomp() solution extends to glibc and kernel versions, as Vincent Bernat pointed out. Those kinds of problems could be detected through automated testing, Philipp Kern suggested. Biedl said that it is something he is working on.

In file, we have a strong candidate for hardening, as it parses and handles file data that often has unknown origins—textbook untrusted input, in other words. But actually using seccomp() filtering to reduce its attack surface has not been successful for Debian. In truth, hardening programs that are often used in conjunction with LD_PRELOAD is always going to be difficult to impossible. But even just changing the version of glibc (which can potentially change the system calls it makes) or which kernel the tool is running on can invalidate the carefully crafted whitelist.

The OpenBSD pledge() system call provides a different path. Developers can specify which system calls are allowed, but only in broad categories like stdio (file operations, mostly), inet (IPv4 and IPv6 calls), or proc (process calls, such as fork(), but not including execve(), which is governed by the exec category). By not tying the filtering directly to individual system calls, some of the problems that Linux seccomp() users have encountered can be avoided. It also doesn't hurt that the OpenBSD user space is released in lockstep with the kernel.

For its file utility, OpenBSD systematically reduces the privileges that the tool has with multiple pledge() calls. It starts by disallowing all but a handful of categories after processing the command-line arguments. It then forks a process that executes the child() function, which reduces privileges further, eventually to only have stdio and recvfd. The child reads messages from the parent, each of which includes a file descriptor for a file to be tested. In that way, the code that is most at risk for compromise is only able to perform fairly minimal operations.

For Linux, it may well be that seccomp() filtering just isn't suitable for retrofitting onto existing projects. Completely separating the "worrisome" code (file-format parsing for file, for example) from the unavoidable code (e.g. opening files) may provide a path forward, but it probably also means that the existing code will have to be rewritten, or at least heavily reworked. The calls that LD_PRELOAD libraries target for interception will likely be in that unavoidable part. Perhaps hardened subprocesses could then even simply use the older, simpler SECCOMP_MODE_STRICT mode, as suggested by Grohne. That seems preferable to playing a never-ending game of whack-a-mole.

Comments (46 posted)

Grand Schemozzle: Spectre continues to haunt

By Jonathan Corbet
August 8, 2019
The Spectre v1 hardware vulnerability is often characterized as allowing array bounds checks to be bypassed via speculative execution. While that is true, it is not the full extent of the shenanigans allowed by this particular class of vulnerabilities. For a demonstration of that fact, one need look no further than the "SWAPGS vulnerability" known as CVE-2019-1125 to the wider world or as "Grand Schemozzle" to the select group of developers who addressed it in the Linux kernel.

Segments are mostly an architectural relic from the earliest days of x86; to a great extent, they did not survive into the 64-bit era. That said, a few segments still exist for specific tasks; these include FS and GS. The most common use for GS in current Linux systems is for thread-local or CPU-local storage; in the kernel, the GS segment points into the per-CPU data area. User space is allowed to make its own use of GS; the arch_prctl() system call can be used to change its value.

As one might expect, the kernel needs to take care to use its own GS pointer rather than something that user space came up with. The x86 architecture obligingly provides an instruction, SWAPGS, to make that relatively easy. On entry into the kernel, a SWAPGS instruction will exchange the current GS segment pointer with a known value (which is kept in a model-specific register); executing SWAPGS again before returning to user space will restore the user-space value. Some carefully placed SWAPGS instructions will thus prevent the kernel from ever running with anything other than its own GS pointer. Or so one would think.

There is a slight catch, in that not every entry into kernel code originates from user space. Running SWAPGS if the system is already running in kernel mode will not lead to good things, so the actual code in the kernel in most cases is the assembly equivalent of:

    if (!in_kernel_mode())
    	SWAPGS

That, of course, is where things can go wrong. If that code is executed speculatively, the processor may make the wrong decision about whether to execute SWAPGS and run with the wrong GS segment pointer. This test can be incorrectly speculated either way. If the CPU is speculatively executing an entry from user space, it may decide to forego SWAPGS and run with the user-space GS value. If, instead, the system was already running in kernel mode, the CPU might again speculate incorrectly and execute SWAPGS when it shouldn't, causing it to run (speculatively) with a user-space GS value. Either way, subsequent per-CPU data references would be redirected speculatively to an address under user-space control; that enables data exfiltration by way of the usual side-channel techniques.

That looks like a wide-open channel into kernel data structures, but there are some limitations. Only Intel processors will execute SWAPGS speculatively, so the already-in-kernel-mode case is limited to those processors. When entering from user mode, though, the lack of a needed SWAPGS instruction can obviously be speculated on any processor.

The other roadblock for attackers is that, while arch_prctl() can be used by unprivileged code to set the GS pointer, it limits that pointer to user-space addresses. That does not entirely head off exploitation, but it makes it harder: an attacker must find kernel code that loads a value via GS, then uses that value as a pointer that is dereferenced in turn. As Josh Poimboeuf notes in the mitigation patch merged into the mainline:

It's difficult to audit for this gadget in all the handlers, so while there are no known instances of it, it's entirely possible that it exists somewhere (or could be introduced in the future). Without tooling to analyze all such code paths, consider it vulnerable.

The use of supervisor mode access prevention will block this attack — but only on processors that are not vulnerable to the Meltdown problem, so that is only so helpful.

It is also worth noting that there is a longstanding effort to add support for the FSGSBASE instructions (RDFSBASE, WRGSBASE, and friends), which allow direct (and uncontrolled) setting of the FS and GS base addresses from user space. There are a number of performance advantages to allowing this, so the pressure to merge the patches is likely to continue even though they would make exploiting the SWAPGS vulnerability easier.

The mitigation applied in the kernel is relatively straightforward: serializing (LFENCE) instructions are placed in the code paths that decide to (or not to) execute SWAPGS. This, of course, will slow execution down, which is why the pull request for these fixes described them as coming from "the performance deterioration department". On systems where these attacks are not a concern, the new barriers can be disabled (along with all other Spectre v1 defenses) with the nospectre_v1 command-line option.

The Spectre vulnerabilities were so-named because it was assumed that they would haunt us for a long time. The combination of speculative execution and side channels leads to a huge variety of possible attacks and an equally large difficulty in proving that no such attacks are possible in any given body of code. As a result, the pattern we see here — slowing down the system to defend against attacks that may or may not be practical — is likely to be with us for some time yet.

Comments (18 posted)

Long-term get_user_pages() and truncate(): solved at last?

By Jonathan Corbet
August 13, 2019
Technologies like RDMA benefit from the ability to map file-backed pages into memory. This benefit extends to persistent-memory devices, where the backing store for the file can be mapped directly without the need to go through the kernel's page cache. There is a fundamental conflict, though, between mapping a file's backing store directly and letting the filesystem code modify that file's on-disk layout, especially when the mapping is held in place for a long time (as RDMA is wont to do). The problem seems intractable, but there may yet be a solution in the form of this patch set (marked "V1,000,002") from Ira Weiny.

The problems raised by the intersection of mapping a file (via get_user_pages()), persistent memory, and layout changes by the filesystem were the topic of a contentious session at the 2019 Linux Storage, Filesystem, and Memory-Management Summit. The core question can be reduced to this: what should happen if one process calls truncate() while another has an active get_user_pages() mapping that pins some or all of that file's pages? If the filesystem actually truncates the file while leaving the pages mapped, data corruption will certainly ensue. The options discussed in the session were to either fail the truncate() call or to revoke the mapping, causing the process that mapped the pages to receive a SIGBUS signal if it tries to access them afterward. There were passionate proponents for both options, and no conclusion was reached.

Weiny's new patch set resolves the question by causing an operation like truncate() to fail if long-term mappings exist on the file in question. But it also requires user space to jump through some hoops before such mappings can be created in the first place. This approach comes from the conclusion that, in the real world, there is no rational use case where somebody might want to truncate a file that has been pinned into place for use with RDMA, so there is no reason to make that operation work. There is ample reason, though, for preventing filesystem corruption and for informing an application that gets into such a situation that it has done something wrong.

Layout leases

The bulk of the patch set, though, is a recognition that the creation of long-term kernel mappings of file-backed pages on persistent storage is not something that should be done lightly. Any user-space process that wants to set up such a mapping first has to convince the kernel that the associated issues have been thought through. That involves extending the existing lease mechanism in the kernel.

The F_SETLEASE command for fcntl() allows a process to obtain a "lease" on a file. Leases will not prevent other processes from carrying out most operations, but they will lead to the leaseholder receiving a notification (in the form of a signal) when another process is about to make a change to a file. The leaseholder then has a fixed amount of time to perform any necessary cleanup and remove its lease before the change is allowed to proceed. Leases, thus, do not guarantee exclusive access to a file, but they do give the leaseholder a window in which to prepare for and cope with any independent changes.
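The lease API described above is reachable from Python's fcntl module, which makes it easy to experiment with. This minimal sketch takes and releases a read lease on a scratch file; leases are Linux-specific, and a read lease is only granted when nobody holds the file open for writing:

```python
import fcntl
import os
import tempfile

# Create a scratch file and close the write descriptor first: the
# kernel refuses a read lease while any write descriptor is open.
wfd, path = tempfile.mkstemp()
os.close(wfd)

fd = os.open(path, os.O_RDONLY)
try:
    # Take the lease. The kernel will now notify us (SIGIO by default)
    # if another process tries to open the file for writing or truncate
    # it, giving us a window to clean up before the operation proceeds.
    fcntl.fcntl(fd, fcntl.F_SETLEASE, fcntl.F_RDLCK)
    print(fcntl.fcntl(fd, fcntl.F_GETLEASE) == fcntl.F_RDLCK)  # True

    # Dropping the lease lets other operations proceed unimpeded.
    fcntl.fcntl(fd, fcntl.F_SETLEASE, fcntl.F_UNLCK)
finally:
    os.close(fd)
    os.unlink(path)
```

Weiny's patches build on exactly this mechanism, adding the (currently kernel-internal) layout-lease type to what F_SETLEASE can request.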

Inside the kernel, there is a concept of a "layout lease" that is not exposed to user space. Layout leases are used with pNFS to arbitrate access with respect to operations that can change the on-disk layout of a file. The first step in Weiny's patch set is to make the F_LAYOUT lease type available to user space as well. A process that is mapping a file on persistent storage into kernel memory could use this type of lease to do the right thing when an independent operation changes the layout of a file. The patch set modifies the XFS and ext4 filesystems to break layout leases before performing a layout-changing operation in order to support this mode of operation.

For the intended RDMA use case, though, the "right thing" — unmapping the file and letting the operation proceed — is not really an option. The memory in question is, almost certainly, under the control of a network interface for DMA operations, and unmapping it would be disruptive at best. So the second patch in the series adds a new F_EXCLUSIVE flag to indicate a lease that cannot be broken. A holder of such a lease need not worry about handling signals or coping with independent changes; any operation that would cause that to happen — truncate(), for example — will instead fail with an ETXTBSY error.

The final step is to change get_user_pages() to require the existence of an exclusive layout lease before any pages can be mapped with the FOLL_LONGTERM flag (which indicates that the mapping is expected to be held for a long period of time). Applications that need to create such mappings will thus need to be changed to obtain the exclusive lease first. That might be seen as an ABI change except for the fact that, until the final patch in this series, such mappings have not been allowed at all.

It's worth noting that the application need not continue to hold the lease after mapping the file; indeed, it need not keep the file open at all. While an exclusive layout lease does exist, no attempt to change the layout of the file will succeed. In the absence of the lease, layout changes will still fail if they involve any pages that have been pinned with get_user_pages(). So it may still be possible to call truncate() on a file with long-term pinned pages, but only as long as there is no layout lease in place and the operation does not affect those pages. This behavior fits the intended use case: registration of a file on persistent-memory storage with the kernel for RDMA operation.

Accounting and the future

The remaining patches in this series are concerned with keeping track of what is going on. Should this code be merged, it is surely only a matter of time until some user complains about being unable to truncate a file and wonders just what is blocking the operation. Some new files added to /proc will provide information about which files have pages pinned and who is responsible for them.

Given the heat that has surrounded this problem for years, the response to this patch set has been relatively muted. Dave Chinner has expressed some reservations regarding the semantics of the newly user-visible layout leases, but he seems to be more concerned with nailing down the exact behavior than with the idea as a whole. Nobody has argued that the overall idea will not work. So it seems that the proposed solution might just be good enough to eventually find its way upstream.

Comments (5 posted)

Corner cases and exception types

By Jake Edge
August 13, 2019

Some unanticipated corner cases with Python's new "walrus" operator—described in our Python 3.8 overview—have cropped up recently. The problematic uses of the operator will be turned into errors before the final release, but just what exception should be raised came into question. It seems that the exception specified in the PEP for the operator may not really be the best choice, as a recent discussion hashed out.

PEP 572 ("Assignment Expressions") describes the walrus operator (though not by that name, which came later). It allows making assignments as part of another statement, like an if or while statement—or in a list comprehension. It is this latter use where the corner cases recently arose. The following is how the walrus operator is meant to be used in lists or list comprehensions (from the PEP):

# Reuse a value that's expensive to compute
[y := f(x), y**2, y**3]

# Share a subexpression between a comprehension filter clause and its output
filtered_data = [y for x in data if (y := f(x)) is not None]

In those comprehensions, y is assigned once but used multiple times.

But a bug report from Nick Coghlan pointed out some oddities:

>>> [i := 10 for i in range(5)]
[10, 10, 10, 10, 10]
>>> i
10

Normally, you would not expect the iteration variable (i) to leak out of the comprehension, but here it is assigned with the walrus operator; it is confusing, but is arguably logical. Another example seems plainly wrong:

>>> [False and (i := 10) for i in range(5)]
[False, False, False, False, False]
>>> i
4

Because of the short-circuit evaluation of the boolean expression, the "i := 10" is never even executed. But it still causes the iteration variable to leak out of the comprehension. As Coghlan pointed out, a non-executing walrus assignment deeply nested in a comprehension leaks the iteration variable as well.

He has submitted a pull request that makes some changes to the PEP (including slipping a "walrus operator" reference in) to outlaw these specific instances and others like them. The original PEP had specified that problems found in the parsing of the walrus operator would raise a TargetScopeError exception, which would be a subclass of SyntaxError. But that was questioned by Barry A. Warsaw in the bug:

I know the PEP defines TargetScopeError as a subclass of SyntaxError, but it doesn't really explain why a subclass is necessary. I think seeing "TargetScopeError" will be a head scratcher. Why not just SyntaxError without introducing a new exception?

Guido van Rossum agreed that perhaps it would be better to simply raise SyntaxError, though he was concerned that it would require a full PEP review. Coghlan explained the reasoning:

I believe our main motivation for separating it out was the fact that even though TargetScopeError is a compile-time error, the affected code is syntactically fine - there are just issues with unambiguously inferring the intended read/write location for the affected target names. (Subclassing SyntaxError is then a pragmatic concession to the fact that "SyntaxError" also de facto means "CompilationError")

Searching for "Python TargetScopeError" will also get folks to relevant information far more quickly than searching for "Python SyntaxError" will.

The discussion soon moved to the python-dev mailing list at the behest of Van Rossum. Warsaw started the thread by posting a summary of the debate. In his view, any PEP change would be relatively minor, so it makes sense to nail it all down before the 3.8 release, which is slated for October. In the bug report, Serhiy Storchaka had mentioned the IndentationError and TabError subclasses of SyntaxError as possible reasons to continue with TargetScopeError, but those "feel different" to Warsaw because their names are self-explanatory.

Tim Peters, who co-authored the PEP with Van Rossum and Chris Angelico, didn't really care what the exception is called, but was not convinced that SyntaxError is any better, really:

Whereas SyntaxError would give no clue whatsoever, and nothing useful to search for. In contrast, a search for TargetScopeError would presumably find a precisely relevant explanation as the top hit (indeed, it already does today).

I don't care because it's unlikely an error most people will make at all - or, for those too clever for their own good, make more than once ;-)

Most who posted in the thread were either in favor of switching to SyntaxError or ambivalent about such a change, with Steven D'Aprano being the main exception (so to speak). He was concerned with the idea of emitting a syntax error for something that was not syntactically incorrect. "There's a problem with the *semantics* not the syntax." But Warsaw was not convinced that the distinction is useful: "What you wrote is disallowed, so you have to change your code (i.e. syntax) to avoid the error." He also noted that PEP 572 specifies TargetScopeError as a subclass of SyntaxError, so even under the current definition the distinction is not being made.
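That subclassing point is easy to demonstrate. On Python 3.8 and later, where these cases are indeed rejected with a plain SyntaxError, a handler written for SyntaxError would have caught a hypothetical subclass anyway:

```python
# On Python 3.8+, rebinding a comprehension iteration variable with
# the walrus operator is rejected at compile time as a SyntaxError.
try:
    compile("[i := 10 for i in range(5)]", "<demo>", "eval")
    raised = None
except SyntaxError as exc:
    raised = type(exc).__name__

print(raised)  # SyntaxError

# Had TargetScopeError survived as a SyntaxError subclass (the name
# and class here are hypothetical, following the original PEP text),
# an existing "except SyntaxError" handler would still catch it:
class TargetScopeError(SyntaxError):
    """Hypothetical subclass, as PEP 572 originally specified."""

try:
    raise TargetScopeError("assignment expression cannot rebind 'i'")
except SyntaxError as exc:
    print(type(exc).__name__)  # TargetScopeError
```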

Eric V. Smith concurred with Warsaw; in particular, Smith noted that he could not see a reason that a programmer would catch and handle SyntaxError and TargetScopeError separately. He suggested (as did others in the thread) that the text emitted for that error make it clear to the user where the problem lies. For example, Kyle Stanley suggested something like:

SyntaxError: Invalid scope defined by walrus operator: 'i := 10 for i in range(5)'

In a long post, Coghlan agreed with the overall consensus that a simple SyntaxError is the right path forward for 3.8. He does think that there is value in a new exception (perhaps with a better name, such as AssignmentScopeError) that would cover more than just walrus operator errors. But that is something that can wait to be considered for 3.9.

With three steering council members in favor (Coghlan, Warsaw, and Van Rossum) and two of the three PEP authors in agreement (Peters switched to being in favor after Coghlan's message), Van Rossum said that TargetScopeError should be removed. As it turns out, Angelico is also in favor and another council member, Brett Cannon, was on board as well. Coghlan's pull request for the PEP was updated to reflect that; it will presumably be merged along with the code changes.

This episode is another example of the Python development process in action. The long beta cycle for releases helps flush out problems, as with the escape sequences issue we looked at last week. In this case, the corner cases were found by MicroPython developers who were adding the walrus operator to their version of the language, so the diversity of language implementations helps find issues as well. Finding those problems in MicroPython ensured that the PEP would be fixed; the discussion around them allowed the exception issue to be worked out. It certainly shows the advantages of having an active community that is actually using and testing beta releases well in advance of the final release.

Comments (11 posted)

Akaunting: a web-based accounting system

By Jonathan Corbet
August 9, 2019

One of these years, LWN will have a new accounting system based on free software. That transition has not yet happened, though, despite the expenditure of a fair amount of energy on researching alternatives. Your editor recently became aware of a system called Akaunting, so a look seemed worthwhile. This tool may have the features that some users want, but it seems clear that your editor's quest is not done yet.

As an aside, additional motivation for this effort came in the form of an essentially forced upgrade to QuickBooks 2019 — something that QuickBooks users have learned to expect and dread. There appear to be no new features of interest in this release, but it does offer a newly crippled data import mechanism and routine corruption of its database. If your editor didn't know better, he might just conclude that proprietary software is buggy, unreliable, and unfixable.

A perfect choice

Akaunting bills itself as the "perfect choice" for those looking for a QuickBooks alternative. The page hits a lot of the right points: free software, maintaining control over your accounting data, etc. Just the sort of thing your editor was in the mood to hear.

The Akaunting project is relatively young, having announced its existence in September 2017. The system is written in PHP and JavaScript; the code is licensed under GPLv3. Akaunting is able to use MySQL, PostgreSQL, or SQLite to store the actual data. It is, as one might expect given the implementation languages, designed to run as a web application; one can install it on a handy machine, but Akaunting (the company) also offers to host accounts free of charge on its own servers. The company promises "we cover it, for free, forever" — a pretty big promise for a free-software startup with a minimal track record.

Akaunting is an open-core offering; the basic functionality is available in the free version, and various "apps" can be added to increase that functionality. The site talks about "buying" apps, but the prices are quoted in dollars per year, which might be a bit confusing. The site claims that all apps must be GPLv3-licensed, and the apps in the store appear to adhere to that. One must pay to get the source initially (at least, as long as nobody posts it elsewhere); after that, ongoing payment is only required to get support and updates. There are 48 apps listed in the Akaunting app store as of this writing.

As often seems to be the case with company-owned free software, pointers to the source repository are rather hard to find. A look at the Akaunting repository shows about 1,200 non-merge commits over a period of about two years. The number of developers is quite small, as might be expected. There appears to be no contributor license agreement associated with this project, so any outside contributors would retain the copyrights for their code; it's not clear whether that was intended or not. Ominously, the last commit in this repository happened in April; an Akaunting employee responded to a forum query two months ago with a promise that "new versions will be available soon", but that has not yet happened.

One of the key attributes that anybody should consider before adopting a new accounting system is the strength of the community that works on it. Akaunting raises some red flags in that regard. Work seems to be done exclusively within the company, and there is no evident effort to build a wider community around it. Using an instance hosted on the company's servers appears too risky to be considered seriously; hosting it locally eliminates the concern that one's entire accounting system could vanish overnight, but the risk of ending up with an unsupported system remains.

Test-driving Akaunting

Your editor chose to experiment with the company-hosted version of Akaunting in the hope that it would eliminate the pain of installing a large PHP application and show off the software as the company wants it to be seen. One immediate observation was that the system was generally slow to respond; that might be the result of an overloaded server rather than slow software. Either way, it is not a hugely encouraging introduction to a new system.

The basic company setup is easy enough; the system doesn't ask for much more than a company name. The user is then dropped into the "dashboard" screen, shown in the screenshot to the right. Adding bank accounts would be a logical next step; the screen to do so is straightforward enough but somewhat limited. There is no distinction between types of accounts, for example; a savings account is the same as a credit-card account. There is no provision for asset or liability accounts.

This is about the point where one runs into a significant show-stopper: Akaunting is a single-entry accounting system, with all of the limitations that implies. Double-entry accounting is available as an app ($69/year), which is perhaps better than nothing. But a proper accounting system should have double-entry wired deeply into its design; bolting it on as an app seems unlikely to work well. Since this functionality is a paid add-on with no public repository, it is unlikely to see any community support even if a community grows around Akaunting in general. While Akaunting with the double-entry app is still free software, it begins to have some of the characteristics of the proprietary variety.
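For readers who have not run into the distinction: the core invariant of double-entry bookkeeping is easy to illustrate. The sketch below is plain Python with no relation to Akaunting's code or its double-entry app; it simply shows why the technique needs to be wired into a system's core rather than bolted on — every transaction must balance at posting time, and the whole ledger can be checked at any moment.

```python
# Minimal double-entry ledger sketch (illustrative only, not Akaunting code).
# Each transaction is a set of (account, amount) pairs that must net to zero;
# positive amounts are debits, negative amounts are credits.
from collections import defaultdict


class Ledger:
    def __init__(self):
        self.balances = defaultdict(int)  # account name -> balance in cents

    def post(self, entries):
        """Post one transaction; reject it if debits and credits don't balance."""
        if sum(amount for _, amount in entries) != 0:
            raise ValueError("unbalanced transaction")
        for account, amount in entries:
            self.balances[account] += amount


ledger = Ledger()
# Receive a $500.00 customer payment: debit the bank account,
# credit accounts receivable.
ledger.post([("Assets:Bank", 50000), ("Assets:Receivables", -50000)])

# The books always balance; a single-entry system records only the bank
# side of such a transaction, so this check is impossible there.
assert sum(ledger.balances.values()) == 0
```

A balance sheet is just a report over these balances, which is why its absence from Akaunting's base system (noted below in the discussion of reports) follows directly from the single-entry design.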

Importing one's data might well come next. Akaunting does feature an "import" button for most of the sorts of records — customers, vendors, transactions — that one might want to feed into the system. The only format accepted, though, is an XLS spreadsheet; the system will helpfully provide a "sample spreadsheet" as a way of documenting the expected layout of the data. Limited support for other file formats, such as one might want to use to import data from a bank, appears to be available (for a fee, naturally) from the app store. There is also an API that one could, in theory, use to write an importer, but the documentation is rather limited.
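Writing such an importer would presumably mean posting records over HTTP. The sketch below is a generic illustration only: the endpoint path, payload fields, and bearer-token authentication are all assumptions made for the example, not Akaunting's documented API, which (as noted) is thinly documented.

```python
# Hypothetical sketch of scripting an import against an accounting web API.
# The "/api/transactions" path, the payload fields, and the Bearer auth
# scheme are illustrative assumptions, not Akaunting's actual interface.
import json
import urllib.request


def build_request(base_url, token, txn):
    """Build (but do not send) a POST request for one transaction record."""
    return urllib.request.Request(
        f"{base_url}/api/transactions",  # hypothetical endpoint
        data=json.dumps(txn).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


req = build_request(
    "https://akaunting.example.com",  # placeholder host
    "SECRET-TOKEN",
    {"account": "Assets:Bank", "amount": 50000, "description": "Payment"},
)
# urllib.request.urlopen(req) would then submit the record.
```

An importer along these lines could loop over rows from a bank's CSV export, building one request per transaction; whether the real API accepts such records is exactly the sort of detail the limited documentation leaves unclear.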

Support for the entry of invoices, bills, and cash payments works well enough. The use of drop-down menus for selecting customers and vendors will get unwieldy quickly, though, for companies with large numbers of either. There is support for recurring payments and sending email to slow-paying customers. The ability to attach arbitrary files seems like a useful feature not found in all accounting systems. There is no support for check printing (or the printing of any sort of form, for that matter). The basic set of expected reports is there, but there is little ability to customize them. Reports that one would expect from a double-entry system, such as a balance sheet, are missing.

One could go on, but the conclusion seems clear enough by this point. Akaunting has been put together as a visually pleasing web application, but its accounting capabilities fall short of what even a small company like LWN needs and its development future seems uncertain at best. Perhaps, with time, Akaunting will become a more fully featured system with an active development community. Your editor hopes so: one can only wish well to a company with an apparent commitment to free software that is trying to improve the state of the art in this area. But, for now, Akaunting is not a candidate to be LWN's next accounting system.

Comments (13 posted)

Page editor: Jonathan Corbet
Next page: Brief items>>


Copyright © 2019, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds