Leading items

PyPy: the other new compiler project

By Jonathan Corbet
May 19, 2010
We have recently seen a lot of attention paid to projects like LLVM. Even though the GNU Compiler Collection is developing at a rapid pace, there are people in the community who are interested in seeing different approaches taken, preferably with a newer code base. LLVM is not where all the action is, though. For the last few years (since 2003, actually), a relatively stealthy project called PyPy has been trying to shake up the compiler landscape in its own way.

On the face of it, PyPy looks like an academic experiment: it is an implementation of the Python 2.5 interpreter which is, itself, written in Python. One might thus expect it to be more elegant in its code than the standard, C-implemented interpreter (usually called CPython), but rather slower in its execution. If one runs PyPy under CPython, the result is indeed somewhat slow, but that is not how things are meant to be done. When running in its native mode, PyPy can be surprising.

PyPy is actually written in a subset of Python called RPython ("restricted Python"). Many of the features and data types of Python are available, but there are rules. Variables are restricted to data of one type. Only built-in types can be used in for loops. There is no creation of classes or functions at run time, and the generator feature is not supported. And so on. The result is a version of the language which, while still clearly Python, looks a bit more like C.
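As a rough illustration of the flavor of code these rules lead to, here is a small sketch in ordinary Python that tries to stay within the restrictions listed above: each variable holds a single type, there are no generators, and nothing is defined at run time. The function name is made up for this example, and the code has not been checked against the actual RPython translator, so treat it as an approximation of the style rather than verified RPython.

```python
def count_primes(limit):
    """Count the primes below limit using trial division."""
    count = 0            # always an int
    candidate = 2        # always an int
    while candidate < limit:
        is_prime = True  # always a bool
        divisor = 2
        while divisor * divisor <= candidate:
            if candidate % divisor == 0:
                is_prime = False
                break
            divisor += 1
        if is_prime:
            count += 1
        candidate += 1
    return count


print(count_primes(30))
```

The explicit while loops and fixed-type variables are what give such code its C-like look.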

Running the RPython-based interpreter in CPython is supported; it is fully functional, if a bit slow. Running in this mode can be good for debugging. But the production version of PyPy is created in a rather different way: the PyPy hackers have created a multi-step compiler which is able to translate an RPython program into a lower-level language. That language might be C, in which case the result can be compiled and linked in the usual way. But the target language is not fixed; the translator is able to output code for the .NET or Java virtual machines as well. That means that the PyPy interpreter can be easily targeted to whatever runtime environment works best.

The result works. It currently implements all of the features of Python 2.5, with very few exceptions. There are some behavioral differences due to, for example, the use of a different garbage-collection algorithm; PyPy can be slower to call destructors than CPython is. Python extensions written in C can be used, though one gets the sense that this feature is still stabilizing. PyPy is able to run complex applications like Django and Twisted. On the other hand, for now, it only runs on 32-bit x86 systems, it is described as "memory-hungry," and Python 3 support seems to be a relatively distant goal.

Beyond that, it's fast. PyPy includes a built-in just-in-time compiler (JIT); it is, in a sense, a platform for the creation of JITs for various targets. The result is an interpreter which, much of the time, is significantly faster than CPython. For the curious, the PyPy Speed Center contains lots of benchmark results, presented in a slick, JavaScript-heavy interface. PyPy does not always beat CPython, but it often does so convincingly, and speed appears to be a top priority for the PyPy developers. It may well be that the speed of PyPy will eventually prove compelling enough that, as Alex Gaynor suggests, many of us will be using PyPy routinely instead of CPython in the near future.

There are some other interesting features as well. There is a stackless Python mode which supports microthreaded, highly-concurrent applications. There is a sandboxed mode which intercepts all external library calls and hands them over to a separate policy daemon for authorization. And so on.

What really catches your editor's eye, though, is the concept of PyPy as a generalized compiler for the creation of JITs for high-level languages. The translation process is flexible, to the point that it can easily accommodate stackless mode, interesting optimizations, or experimentation with different language features. The object model can be (and has been) tweaked to support tainting and tracing features. And the system as a whole is not limited to the creation of JIT compilers for Python; projects are underway to implement a number of other languages, including Prolog, Smalltalk, and JavaScript.

It could easily be argued that PyPy incorporates much of the sort of innovation which many people have said never happens with free software. And it is all quite well documented. This is a project which is not afraid of ambitious goals, and which appears to be able to achieve those goals; it will be interesting to watch over the next few years.


Looking forward to GnuCash 2.4

By Jonathan Corbet
May 18, 2010
Long-time LWN readers will be aware that your editor has, for some time, been looking for a free accounting system which would handle the needs of a small business like, well, LWN. So it is with some interest that your editor has followed the progress of the GnuCash 2.3 development series. GnuCash is currently in the "string freeze" stage: development is seen as being close enough to done that no more changes to any visible text are allowed. So perhaps it's time to look at what the GnuCash developers have done this time around.

The actual list of new features is surprisingly short; much of the work in this development cycle has been aimed at internal improvements. Near the top of the list is the ability to use WebKit, as an alternative to GtkHTML, to render graphs. Your editor will confess that he has not tested out this feature; HTML rendering is a tiny part of what GnuCash does, so the availability of an alternative rendering engine is not particularly exciting.

The other headline feature is more interesting, though: the GnuCash developers have added a database backend which is capable of interfacing with Sqlite, MySQL, or PostgreSQL. The XML-based native file format is adequate for personal finance uses (though it is quite bulky), but it does not scale well to the numbers of transactions seen in a typical business setting, and it is not well suited to a multi-user, integrated workflow. GnuCash had a database backend in the 1.6 days, but it was never well supported and it did not work with the GnuCash business features. The database backend in 2.4 has been redone from the beginning, and it supports all of the program's capabilities.

Those wanting to use this feature need to be prepared for a bit of a rough start, though; the related documentation is, one might say, sparse. Actually, this documentation does not exist at all. What your editor figured out is this: users should create an empty database (default name is "gnucash") to hold the accounts. The relevant libdbi drivers must be installed on the system. Then, opening GnuCash and selecting "save to" will yield a dialog with a pulldown at the top; the default value in that pulldown will be "XML". If GnuCash sees any database drivers, the associated databases will show up as options on that menu; selecting "postgres" will allow the accounts to be saved into the selected PostgreSQL database.

A couple of related notes: GnuCash looks in /usr/lib/dbd for the database drivers. Fedora x86-64 puts them in /usr/lib64/dbd. The result is that GnuCash behaves as if the database option simply did not exist. Your editor retains a longstanding grudge against whoever came up with the /usr/lib64 idea; it seems to break every application which it comes into contact with at least once. The other thing to bear in mind is this: if you give GnuCash a password for access to the database, it will store that password - in plain text - in your .gconf directory.
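For those wondering whether they have been bitten by this problem, a quick check of the two directories mentioned above might look like the following sketch. The function name is hypothetical, the paths are simply the ones named in this article, and other distributions may well put the drivers somewhere else entirely.

```python
import os

# Paths taken from the article; other systems may use different locations.
CANDIDATE_DIRS = ["/usr/lib/dbd", "/usr/lib64/dbd"]


def existing_driver_dirs(candidates=CANDIDATE_DIRS):
    """Return the candidate driver directories that exist on this system."""
    return [path for path in candidates if os.path.isdir(path)]


found = existing_driver_dirs()
if found:
    print("driver directories present:", ", ".join(found))
else:
    print("no driver directory found in the usual places")
```

If only /usr/lib64/dbd shows up, the missing database options in the pulldown are probably explained.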

The database-backed mode looks and works almost identically to the file-backed mode. The biggest visible difference is probably the lack of a "save" button; when working with a database, all changes are committed immediately. The program will also populate the working directory with files named like translog.20100517132703.log. Leaving log files lying around is a habit GnuCash has had for a long time, but, in the absence of an accounts file, GnuCash has to improvise when it comes to the naming and location of the log files.
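The numeric part of that file name looks like a year-month-day-hour-minute-second timestamp, though that is an assumption drawn from the one example above rather than from any GnuCash documentation. Under that assumption, a script tidying up old logs could recover the time with something like this sketch (the function name is invented for the example):

```python
from datetime import datetime


def parse_translog_name(filename):
    """Extract the assumed YYYYMMDDHHMMSS timestamp from a translog name."""
    prefix, stamp, suffix = filename.split(".")
    if prefix != "translog" or suffix != "log":
        raise ValueError("not a translog file name: %r" % filename)
    return datetime.strptime(stamp, "%Y%m%d%H%M%S")


print(parse_translog_name("translog.20100517132703.log"))
```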

Working from a database should offer some performance advantages when doing searches through large accounts, but that's not the real reason your editor is interested in the feature. What is far more compelling is (1) interoperability with other business processes, and (2) multi-user access to the accounting database. Unfortunately, GnuCash 2.4 does not seem to properly support either of those features.

Even a small business like LWN generates thousands of transactions over the course of a year. Unsurprisingly, it can be quite difficult to find anybody who is interested in typing all of those transactions into the accounting system - especially when "the computer" already knows all about them. In the current system, getting this information into the accounting database is done through a bunch of glue scripts and the always hair-raising QuickBooks import process. Certainly it would be nicer to just store that data directly into the accounting database.

This sort of direct storage is certainly achievable with GnuCash, but it won't be easy. The database schema is surprisingly hard to find; there is a version of it available in an old mailing list posting, but that's about it. Said schema is heavily reliant on GnuCash's "GUID" type - a 32-character ASCII representation of a 16-byte internal object identifier. So even storing a simple transaction requires generating new GUIDs and digging around to find the other GUIDs necessary to tie the transaction in properly. It is, in other words, messy, error-prone, and just crying for a nice API that could be called at a higher level. One assumes that said API will exist in library form someday, but it is not there now.
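To give a sense of what such a value looks like, the sketch below produces a 32-character ASCII string from a 16-byte random identifier using Python's standard uuid module. This only matches the format described above; how GnuCash generates its own GUIDs is not covered here, and the function name is made up for the example.

```python
import uuid


def new_guid():
    """Return 16 random bytes rendered as 32 lowercase hex characters."""
    return uuid.uuid4().hex


guid = new_guid()
print(guid, len(guid))

# Converting back shows the 16-byte identifier behind the string.
raw = bytes.fromhex(guid)
print(len(raw))
```

Generating the identifier is the easy part; the harder work is looking up the existing GUIDs that a new transaction must refer to.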

Parallel access is not there either; the FAQ is clear on the subject:

Even the dbi (SQL) backend which will appear in the upcoming 2.4 release is currently not designed for true multi-user access. Trying to work with multiple users at the same time in the same database can cause data loss. You can however share the database with different users, provided you ensure serialized, exclusive access.

This makes storing the accounts in a centrally-located database actively dangerous. If GnuCash cannot have two clients running safely with the same database, it should go out of its way to prevent that from happening (as it does with the file backend). But there are no checks for concurrent access in the current 2.3.12 development release.

While GnuCash does have some support for dealing with various tax regimes, it's still not where it needs to be to displace other applications. Among other things, it simply has no provision for storing some of the required information. LWN, being a US-based business, needs to track which of its authors need to receive 1099 forms at the end of the year and produce those forms at the right time. The production side could perhaps be made to work relatively easily: GnuCash has a scripting engine which enables the creation of custom reports - as long as the user is not afraid of programming in Scheme. But, without the ability to track who needs those forms (and their tax numbers), the report generator simply lacks the information it needs.

Time for one final grumble. In current stable GnuCash, the QIF importer (useful for importing credit card transactions, for example) is fast, efficient, and flexible. In 2.3, importing has turned into a slow, click-intensive, error-prone process. Users are supposed to classify transactions without seeing the amount of each, for some strange reason. In general, GnuCash is moving forward in this release, but importing transactions has regressed.

Import dialogs notwithstanding, the general appearance of GnuCash has changed little in this cycle. It remains the solid, workhorse financial management program that your editor has used for years. For those who would like to see a bigger change, there is something on the horizon, though it doesn't look like it will be ready for the 2.4 release. The Cutecash project has set itself the goal of taking the GnuCash engine and putting a Qt-based interface onto it. It will be interesting to see where the developers go with this project.

Meanwhile, lest your editor seem just a little too grumpy, let it be said that GnuCash seems to be headed in the right direction. It has been more than good enough for most personal finance management for some time, and it is slowly developing the features it will need to compete in the business area. One of these years it should start displacing proprietary accounting tools in a serious way.


Canonical Goes It Alone with Unity

May 14, 2010

This article was contributed by Joe 'Zonker' Brockmeier.

With the Lucid Lynx release safely shipped, the Ubuntu developer community gathered in Brussels, Belgium the week of May 10 to prepare for the 10.10 release scheduled for October. Two of the focal points for the release will be a new netbook interface called "Unity," and "Ubuntu Light" — a stripped-down version of Ubuntu intended to ship on systems running Microsoft Windows as an instant-on alternative.

Mark Shuttleworth announced the new interface design and Ubuntu Light concept on Monday, May 10th via his blog and keynote at the Ubuntu Developer Summit (UDS). Ubuntu already has a Netbook Remix that's customized for small screens, but the new design is meant to focus less on access to all applications and more on rapid access to the most-used programs. Shuttleworth says that Canonical has been spending time analyzing what users use most and identifying things that are not needed in lightweight configurations. He also says that the focus is to get the user to the Web as quickly as possible:

Instant-on products are generally used in a stateless fashion. These are "get me to the web asap" environments, with no need of heavy local file management. If there is content there, it would be best to think of it as "cloud like" and synchronize it with the local Windows environment, with cloud services and other devices. They are also not environments where people would naturally expect to use a wide range of applications: the web is the key, and there may be a few complementary capabilities like media playback, messaging, games, and the ability to connect to local devices like printers and cameras and pluggable media.

We also learned something interesting from users. It's not about how fast you appear to boot. It's about how fast you actually deliver a working web browser and Internet connection. It's about how fast you have a running system that is responsive to the needs of the user.

(Emphasis in the original).

How fast can you get to a working Net connection? It looks like users will have to buy a new machine to find out. Ubuntu users who get the distribution as a download, as opposed to purchasing Ubuntu via OEM hardware, will not have access to Ubuntu Light. According to Shuttleworth's post, the company won't provide a general-purpose download due to "the requirement to customize the Light versions for specific hardware." While customization may provide an edge, it doesn't seem to be a blocker for other distributions that provide "instant on" versions, such as Mandriva InstantOn, so it's a bit disappointing to learn Canonical won't be providing a Light edition for general distribution. Presumably they will be providing the code for the improvements where required, but it may not be trivial for the community to piece a Light version together.


Ubuntu Unity, however, is already available in its somewhat unfinished state. Users on Ubuntu 10.04 only need to add the canonical-dx-team/une Personal Package Archive (PPA), install the unity package, and log out. Choose the Unity UNE (Ubuntu Netbook Edition) Session option and log back in.

The Unity interface is stable enough, though not yet feature-complete. Currently the interface consists mostly of the launcher and panel. Unity's plan also includes "Dash," which would display applications and files as an overlay. It is a sort of super-menu that's displayed over the current windows; it would replace the GNOME menus or Netbook Remix side panel. The idea is to maximize vertical space, which is at a premium on netbooks, and make the interface "finger friendly." That is to say that users should be able to navigate the interface via a touchscreen as well as using a mouse. This suggests that Canonical is targeting not just netbooks, but also tablets.

When Unity is launched, it has a set of default applications like Firefox, Rhythmbox, and the Software Center in the panel. Because the Dash is not yet ready, the current Unity default includes an Applications link, which provides access to /usr/share/applications. The Unity interface doesn't seem to include a "Run" dialog or utility. It's also unclear what the plans are for keyboard access, and whether most of the interface will be navigable using keyboard shortcuts.

Overall, the Unity interface is responsive and easy to use. Users familiar with a dock metaphor will take to Unity pretty quickly. Applications can be removed from the launcher by dragging them off of it or by right-clicking and selecting "Remove from Launcher." Applications can be added by selecting "Keep in Launcher." The Ubuntu logo in the upper-left corner tiles the open windows, showing a smaller view of all windows, enabling users to choose between them. Unlike the Netbook Remix, the title bar is currently not merged with the main title bar of the Unity interface, so there is wasted space resulting from the title bar plus Unity bar.

One interesting question raised by Unity and Canonical's push for running applications full-screen is how the company plans to ensure that the applications run well in the full-screen mode. Applications like Chromium, which has been tapped as the default browser for 10.10, handle full-screen mode well enough. But applications like Empathy, the default IM client (and presumably one of the most desirable instant-on applications) do not have a full-screen mode. Is Canonical going to work with upstream to develop this feature? Apparently not, according to this comment from Canonical developer Neil Patel. In response to questions about single-window Empathy, Patel responds "Not that I know of, we hope to use Empathy as its default in Ubuntu. Maybe we can get some community to help to make it netbook friendly."

Another question is how the Unity effort meshes with GNOME Shell, and whether Ubuntu's path is taking it too far from upstream. GNOME Shell is coming in GNOME 3.0, which is planned for release in September. It appears that GNOME Shell will not be making an appearance in Ubuntu's netbook offerings, though Shuttleworth noted that GNOME Shell technologies, like the Clutter libraries and the Mutter window manager, are used. Shuttleworth says that the "design seed of Unity was in place before GNOME Shell," and that the company decided to use that design for instant-on rather than GNOME Shell. GNOME Shell will be available in the standard release of Ubuntu for 10.10, but not as the default. Ubuntu is also diverging from standard GNOME with its Windows Indicators, which were offered upstream but not accepted.

Unity and Ubuntu Light also seem to mark the end of Canonical's Moblin/MeeGo efforts. The company has confirmed that it isn't planning another netbook edition of the Ubuntu Moblin Remix. This potentially puts Canonical in competition with the MeeGo effort and Google's Android/Chrome OS, and in the position of maintaining much more of the desktop environment and back-end than many other distributions. Novell discovered several years ago that innovating ahead of upstream GNOME was not particularly sustainable or effective. Whether Canonical has learned from the mistakes of others or is poised to repeat them is an open question.

It would be good to see Canonical find mainstream success for Linux with the Unity interface and Ubuntu Light. Whether this is the solution that will win over the market remains to be seen, but Canonical does seem to be pursuing the netbook market with a bit more enthusiasm than any other vendor. The only concern is that the company seems increasingly out of step with the rest of the community in doing so.


Page editor: Jonathan Corbet

Copyright © 2010, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds