LWN.net Weekly Edition for May 17, 2012

LGM: GIMP's new release, new über-core, and future

By Nathan Willis
May 16, 2012

GIMP took center stage at Libre Graphics Meeting (LGM) 2012 in Vienna. Day three featured a block of GIMP-related talks that saw the official release of the new stable 2.8 version of the application, plus a look at three new developments that will impact the future of the raster editor in the coming months. The biggest change is that the 2.9 development series has already been ported to the generic graphics library (GEGL) engine — the ease of which reportedly surprised even the developers — but there were interesting revelations about GPU-accelerated image processing and a new take on text handling, too.

FLOSS historians may recall that the very first LGM evolved out of a GIMP developers' summit. Still, in the past few years other application projects have grabbed the spotlight — Blender, Krita, Inkscape, and Scribus, for starters. Part of the reason has been GIMP's slower development cycle over the past few releases; as a long-established project it has a sizable codebase to maintain, and considerable effort in recent years has gone into building GEGL, the next-generation image processing library designed to bring high bit-depth image support, non-destructive operations, and other such niceties. Planning that transition while simultaneously implementing feature requests on the existing core added up to some multi-year waits between major revisions. The previous stable release, 2.6, landed in October 2008.

Thus it was a minor surprise when GIMP maintainer Michael Natterer and GEGL maintainer Øyvind Kolås announced the official release of GIMP 2.8 during their talk. Technically, news of the release had leaked out the day before when the files appeared on the official FTP site, but the announcement was still unexpected good news. Feature-wise, most of the new work in GIMP 2.8 was in place when we covered the 2.7.3 development build in November 2011. The headlines include the option of docking all of the tool palettes onto the image editor to function as a single window (with tabs for keeping multiple files open at once), layer groups, on-canvas text editing tools, a new "cage transform" tool, and numerous enhancements to the layout and manipulation of tool settings.

The GEGL has landed

Upstaging the release of 2.8 was Natterer and Kolås's session showcasing their recent work porting the GIMP development tree to GEGL. GEGL has been slated to replace the legacy GIMP core for close to a decade, and although recent GIMP releases have included an option to activate GEGL for specific functions (such as color transformation), as recently as December 2011 the official plan was still to integrate other work (such as Google Summer of Code projects) into GIMP 2.10, and make the transition to GEGL in GIMP 3.0. The GEGL transplant had been long-planned, the maintainers said, but it took both of them being in the same room at the same time to jump-start it. As they reported, when that happened almost accidentally in March, the port took off, and is now more than 90% complete.

[Kolås and Natterer]

As Natterer explained it, Kolås was in town for a week-long hacking session, and the two decided to attempt some GEGL porting just to verify their planned approach. Kolås added a GEGL feature that used GIMP's existing image-tile storage as its back-end. Natterer patched in the new feature, and it immediately worked so well that he began cutting out other bits of legacy code to replace it with GEGL buffer manipulation. The more legacy code he ported to GEGL, the more fifteen-year-old layers of abstraction in the existing code base simply "collapsed away." In addition to simplifying and shrinking the code, he said, migrating tile manipulation to GEGL buffers made it possible to replace many image filters (such as blurs) with primitive GEGL operations. The two continued to work, and roughly a month later, merged the result into the GIMP trunk. The 2.9 code not only replaces GIMP's core process with a GEGL engine, but it provides a separate GEGL engine for GIMP plugins. In 2.10, the legacy plugin API will be officially deprecated.

In addition to telling the tale of its development, the two also demonstrated the GEGL-backed GIMP 2.9 live. First, they showed how compressing the range of an 8-bit-per-channel image resulted in serious color-banding. Then, they switched to 16-bit-per-channel mode, re-compressed the same image, and restored it to its original look without any discernible banding or quality loss. Another demonstration illustrated how 16-bit-per-channel mode allowed similarly higher-quality painting and gradients. Those examples are easily visible, but in practice the benefits of high bit-depth editing are not so immediate; rather, the errors accumulate over several steps — but they do accumulate, with every operation done in 8-bit mode.
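The arithmetic behind that demonstration can be reproduced in a few lines of SQL. This is only a sketch and not GIMP or GEGL code: it assumes a 256-step gray ramp, a compression to one quarter of the range, and plain rounding whenever a value is stored. It counts how many distinct levels remain after the ramp is compressed and stretched back.

```sql
-- 8 bits per channel: values 0..255. Compress to a quarter of the range,
-- round to an integer (as 8-bit storage must), then stretch back.
SELECT count(DISTINCT round(round(v * 0.25) / 0.25)) AS levels_8bit
FROM generate_series(0, 255) AS v;          -- 65 of the 256 levels survive

-- 16 bits per channel: the same ramp is stored as 0..65535 (v * 257).
-- The rounding error is now far smaller than one 8-bit step.
SELECT count(DISTINCT round(round(v * 257 * 0.25) / 0.25 / 257)) AS levels_16bit
FROM generate_series(0, 255) AS v;          -- all 256 levels survive
```

The lost levels in the first query are what shows up on screen as banding. A single operation rarely loses this much, but each 8-bit step adds its own rounding loss.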

There are still other benefits to the new image pipeline, however. Because GEGL operations are defined on abstract buffers, adding support for an entirely different image format is a matter of writing a new format for babl, the underlying pixel transformation layer. During the GEGL hack-a-thon, Kolås wrote such a back-end for indexed color images (such as you find in the GIF format). Natterer had originally planned to drop support for indexed images, but with the babl format defined, they work just as well as any other format in the GEGL-ified GIMP. GEGL also allows GIMP to use all sorts of painting and filter operations on indexed images (such as smudging and blurring) that are typically not possible on indexed color.

The GEGL transition is still not complete; in particular UI elements frequently mismatch when in high-bit-depth mode. For example, the levels and histogram sliders still show 0-255 as the range, rather than 0-100% (or some other bit-depth-neutral label). But the brave can already experiment with GIMP 2.9. More work remains to adapt GIMP's filters and plugins to GEGL, some of which will be tackled by 2012 GSoC students. But in addition to its practical usage for the GIMP project, Natterer pointed out that the ease and success of the GIMP port demonstrates that GEGL is stable, fast, and useful for "arbitrarily complex" applications.

Kolås went into more depth on the inner workings of GEGL, which has the potential to serve as the graphics engine for other editors (and indeed there is some work underway on GEGL usage in the MyPaint project). It could also support vector and path operations, or be used in other application classes altogether, such as map transformations. Because GIMP's plugin engine is separate from the main application core, it, too, could be usable by other applications, allowing them to call GIMP plugins without requiring them to adopt GEGL itself.

The modern sounds of the GIMP

The GEGL integration talk joined two other presentations about GIMP development. Peter Sikking and Kate Price of Man Machine Interface Works (MMIworks) presented a glimpse at their still-in-progress work to revamp the editor's text handling capabilities. MMIworks is a user interaction firm that methodically investigates UI challenges and develops solutions; their past work with GIMP includes the new scaling and transformation tools and the single-window-mode implementation.

[Peter Sikking]

MMIworks set out to re-design the text manipulation tools for GIMP hoping to unify the sometimes conflicting uses of text in GIMP images: some text elements are part of the design, while some are used as annotations; some text is high-resolution typography destined for print, while other text is small and used for icon creation. In addition, GIMP users frequently apply multiple filters and effects to text, but want it to remain editable, which complicates the implementation.

Further complicating the task is one feature many users want but that is not yet implemented in GIMP: the ability to warp text along a path. Exactly where that warping takes place is another variable: at the top of the line, the bottom of the line, at the midpoint, or perhaps at the x-height. In addition, they said, there is no reason that text-warping could not be applied to the margins or the alignment of a paragraph. Throw in the fact that there are four possible writing directions, and text manipulation amounts to a substantial challenge.

The two did not have a final solution to present, but they did provide hints as to the direction the GIMP is heading. While experimenting with how to re-implement text manipulation from scratch, they decided that many of the parameters associated with text manipulation had features in common with vector or path editing tools. As a result, GIMP's eventual text-manipulation tool will be integrated with its path tools. As they explained it, whether the artist starts with a line of text and warps it to a path, or starts with a path and attaches text to it, the tools should offer the same control.

Victor Oliveira presented his work porting GEGL and GIMP functionality to the OpenCL framework. OpenCL is a cross-platform language for writing code blocks (called kernels) that can be executed on either CPUs or on GPUs (as well as other resources, such as FPGAs). The specification is managed by the Khronos Group (most widely known for OpenGL), and there are Linux implementations for all of the major GPU architectures as well as the Mesa software library.

Previous work has already ported specific GEGL functions to OpenCL; Oliveira ported more functions, including colorspace conversions. He also implemented device-to-device data transfer, wrote a filter API, and benchmarked OpenCL performance. The resulting work is already available: it was merged from the opencl-ops branch in GEGL 0.20 and GIMP 2.8. Oliveira estimated that it encompassed around 15,000 lines of code.

For systems without a GPU-based OpenCL implementation, the OpenCL core also "automatically" makes GEGL multi-threaded, so there are still performance increases on multi-core CPUs. Some of the biggest gains, he said, come in OpenCL-based color transformations, which were historically a bottleneck. For any GPU-based system, however, the speed-up is quite significant. OpenCL is particularly fast with floating-point arithmetic, which is at the heart of GEGL's high-bit-depth operations. Thus, the more GEGL is integrated with GIMP (such as in the 2.9 series), the faster GIMP will become on multi-core and GPU-accelerated machines.

There are still more operations that could be ported to OpenCL, he said, but the next big step is to get more users to test the OpenCL code on a wider variety of CPU and GPU combinations. Intel, AMD, and Nvidia each have their own implementation, plus the open source Mesa implementation, which adds up to a lot of possible set-ups. Oliveira has already done some caching work to overcome the latency caused by transferring pixels from the CPU to the GPU and back; he suggested that adding an OpenGL output node to GEGL would further speed things up by allowing the image to be rendered directly on the graphics hardware rather than making yet another round trip.

GIMP interpretation

Together, the release of GIMP 2.8, the sudden port of GIMP 2.9 to GEGL, and the immediate availability of GPU-accelerated operations add up to more GIMP news this year than in the previous few LGMs combined. Granted, true GIMP fanatics have probably been running development builds for many months, but for everyone who depends on distribution-provided packages, GIMP has taken a big jump forward.

Looking to the future, MMIworks always produces solid tools, and the rebuilt text-manipulation tools suggest that more original features are still in the works; warping text along the x-height line or paragraph-alignment line is certainly a new idea. Taken together with GPU-accelerated operations, GIMP is moving forward on the user-facing tools and on the back-end simultaneously. But the most significant news is perhaps Natterer's comments about how much smaller and easier to maintain the GEGL-based GIMP is. As an aside during his talk, he mentioned that the simplified core is easier for new developers to dig into and understand. For an open source project, that is one of the best features one could wish for.

[Thanks to the Libre Graphics Meeting for assistance with travel to Vienna.]

Tasting the Ice Cream Sandwich

By Jonathan Corbet
May 15, 2012
Owners of Android handsets can be forgiven for feeling frustration over how long it took to get an update from the 2.3 "gingerbread" release. Google's flat-out effort to improve tablet support led to a 3.0 ("honeycomb") release that was not deemed suitable for handset use—or for open-source release. It was only with the 4.0 "Ice Cream Sandwich" (ICS) cycle that all that new code became available for handsets—sort of. Six months after the 4.0 release, your editor finally got his hands on a device that can run it; what follows is a review of sorts.

The availability of the 4.0 release has not been as wide as a lot of users might have hoped for. Upgrades for existing handsets have been slow in coming. And for many handsets, including your editor's trusty Nexus One, there will never be an official 4.0 system. Even worse, the CyanogenMod developers have also decided that they will not be working on porting the upcoming 4.0-based CM9 release to the Nexus One. Their reasoning is understandable—in short, the Nexus One lacks the memory to run this release in a pleasing way—but it is still somewhat sad. Once upon a time, the Linux community prided itself on continuing to support older hardware long after its manufacturers had forgotten about it. As Linux moves into the consumer electronics world, the ability and the desire to support last year's devices are both falling by the wayside.

Interestingly, it is not just the core system that leaves older devices behind. Users of older Android releases who search for this week's hot application often get a surprising result: nothing. If an application does not run under a given Android version, the "Google Play" program simply will not show it at all. (Perhaps more disconcertingly, viewing an application's page with a desktop browser yields the message that one's handset—not currently in use—is not supported). The end result is that users of last year's hardware can find themselves in a situation where parts of the application landscape simply seem to disappear over time—updates stop happening, and new applications may never become available.

The inability to run an application for LWN's payroll service, combined with the availability of an unlocked version of the Galaxy Nexus from Google and the simple desire for a new toy, drove the acquisition of a 4.0-capable device. There is one thing about the Galaxy Nexus which immediately stands out to a Nexus One (or Nokia N9, your editor's other device) user: its size. The Galaxy Nexus could almost be considered to be an extra-small tablet; it is large enough to be an uncomfortable fit in a pants pocket.

That size brings some advantages, of course, starting with the larger screen with its 1280x720 resolution. The phone features five-band 3G connectivity, a dual-core processor, a front-facing camera, and even a built-in barometer. The extra processing power and memory are immediately evident when using the handset; it is far more responsive than any Android handset your editor has used previously. Given all that, it may just prove possible to get used to hauling a larger handset around.

Google's version of the Galaxy Nexus is fully unlocked, meaning it is a simple matter of a single "fastboot" command to unlock the bootloader, which is the key to installing a different operating system on the device. There is one little surprise worth knowing about: on this phone, unlocking the bootloader will wipe the device. Anybody who wants to do the kinds of things enabled by an unlocked bootloader is presumably prepared to cope with an amnesiac handset, but this behavior is still a good thing to know about.

The ICS experience

To users of previous Android versions, the Ice Cream Sandwich release can be a little jarring at first. Some things just aren't where one expects them to be anymore. Certain ingrained behaviors (holding down the home key to get a list of running applications, for example) don't work in the same way anymore; in this case, the application list has been moved to its own dedicated key. The application directory now scrolls sideways instead of upward. One can no longer place widgets or contact icons on the background by holding a finger there; one must, instead, notice the little tab in the application directory and use that. The search and menu buttons are long gone. In the menu case, the button has been replaced by an icon that may appear at the bottom of an application's screen, except when it appears at the top instead; playing "find the menu" can be one of the more awkward parts of the ICS experience. That notwithstanding, the interface mostly works well once one gets used to the new ways of doing things.

One simultaneously good and bad feature of Android phones is the way they upload so much information to the Google mothership. The good side becomes immediately evident when one moves into a new handset; an awful lot of things Just Work like they did on the previous one. Contacts and calendar events are there, applications magically install themselves, and so on. Your editor was a little surprised to observe that Android handsets now pass wireless network passwords up to Google as well; the new handset associated itself with the local network without even asking. Searching through the menus turns up a mention of WiFi passwords in the "backup" option; they are stored with the list of installed applications and other bits of miscellaneous information. There is no apparent way to turn off the backing-up of these passwords, which might well be regarded as sensitive information, without turning off the backup feature entirely.

One other surprise that has clearly hit a number of Galaxy Nexus owners is that the handset cannot function as a USB mass storage device when plugged into a computer. Instead, it wants to talk the Media Transfer Protocol (MTP), which gives it better control over what is shared with the host. Alas, MTP is not particularly well supported in Linux; there is a FUSE-based mtpfs module, but it failed to function properly on your editor's system. The best approach seems to be to use an application that has libmtp support built into it; Nautilus, for example, is able to move files to and from the phone with relatively little trouble.

There is, as it turns out, a whole series of applications out there aimed at making it easier to move files back and forth. Most of them set up some sort of web server on the device that can then be accessed from elsewhere on the net; some have fairly slick JavaScript-based browser interfaces. These applications also must be given full access to the entire device, and one must trust that they will let only the intended user into the device. One of them demanded the ability to access location data, which was a bit disconcerting: it certainly does not need that information to carry out its intended task. Linux-based users may be most at home with an application like SSHDroid, which runs an SSH server accessible in the usual ways.

[Data usage screen]

There are some other nice 4.0 features worth a quick note. It includes a reasonably advanced mechanism for controlling and limiting wireless data use that can, among other things, monitor and clamp the usage of specific applications. Internet telephony with SIP is a native Android feature now, but, in a move clearly intended to mollify carriers, the handset will not do SIP calls unless a WiFi network is available. Android can now use dm-crypt to encrypt all the storage on the device; an encrypted phone requires a password at power-on or it will not be able to function. Those curious about the details of how whole-phone encryption works on Android can find some information on this page.

One other thing one notices quickly with the 4.0 release is the presence of a number of user interface features that, previously, were only available with CyanogenMod. The ability to tweak the color of the notification LED, more home screens, the configurable "favorite applications" bar at the bottom of the home screen, and the ability to go straight to an application from the lock screen—though the latter is limited to the camera on official Android—are examples. CyanogenMod may not have any sort of special path into official Android, but it seems clear that Google's developers are paying attention to what CyanogenMod is doing. That is not how a typical open source system might work, but it's far better than nothing.

On the other hand, other CyanogenMod features are still very much missing. Your editor misses the configurable "power bar" widget, for example. CyanogenMod allows the application directory to be displayed more densely, even on the Nexus One's smaller screen. The CyanogenMod camera application is superior to what Android offers, though, it must be admitted, the new panorama mode in the 4.0 camera application is kind of fun. And, of course, Android just does not offer the sort of configurability provided by CyanogenMod.

The good news is that, of course, there is a version of CyanogenMod 9 in the works for the Galaxy Nexus. Experimenting with the CM9 nightly builds has not yet begun in the LWN laboratories; it seemed worthwhile to get a good sense for stock Android 4.0 first. But the truth of the matter is that one does not truly appreciate a shiny new gadget until one has attempted to brick it. So stay tuned for a look at CM9 on this device sometime in the near future.

In the meantime, it is clear that the development of the Android platform continues at a fast pace. It has become visibly slicker and more capable over a relatively short period of time. For better or for worse, Android represents a highly successful combination of fully free software, corporate-controlled open source, and fully proprietary code. The result may not be quite the 100% free device that we would like, but it has led to a series of nicely shiny toys with a lot of hackability, which is not an entirely bad result.

Highlights from the PostgreSQL 9.2 beta

May 14, 2012

This article was contributed by Josh Berkus

The PostgreSQL project has just released a beta of its next major version, 9.2. As usual with its annual release, this version includes many new features, most of which are targeted at improving database performance. The developers have been hard at work improving response times, increasing multicore scalability, and providing for more efficient queries on large data. They also found time to include some other major features, so let's explore a few of the things 9.2 beta has to offer.

JSON support

If the new non-relational databases (or "NoSQL") have proved anything, it's that many application developers want to store JSON objects in a database. With version 9.2, that database can be PostgreSQL.

The JSON support in PostgreSQL 9.2 isn't complete JSON database functionality, but it goes a long way toward that. First, there's a validating JSON data type, so that you can create tables with a specific JSON field:

               Table "public.users"
        Column    |  Type   | Modifiers
    --------------+---------+-----------
     user_id      | integer | not null
     user_name    | text    |
     user_profile | json    |

Better still, there are multiple JSON conversion functions, including row_to_json() and array_to_json(), which allow you to get the results of a query in JSON format.

 
    select row_to_json(names)
    from ( select schemaname, relname from pg_stat_user_tables )
    as names;

                    row_to_json
    -------------------------------------------
     {"schemaname":"public","relname":"users"}

This means that applications can now send queries to PostgreSQL, get back results in JSON format, and immediately act on those results without further conversion. Unfortunately, it is not yet possible to send your query as a JSON object or JavaScript code, but that's likely to come in some future version of PostgreSQL.
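As a further illustration, the two conversion functions can be combined to return a whole result set as one JSON array, and the validating behavior of the json type can be seen with a simple cast. This is a minimal sketch against the same statistics view used above; the literal values are made up for the example.

```sql
-- The json type validates its input: this cast succeeds,
-- while malformed input such as '{"name": ' would be rejected with an error.
SELECT '{"name": "alice", "langs": ["en", "de"]}'::json;

-- Convert each row to a JSON object, collect the objects with array_agg(),
-- and serialize the collection as a single JSON array.
SELECT array_to_json(array_agg(row_to_json(t)))
FROM ( SELECT schemaname, relname FROM pg_stat_user_tables ) AS t;
```

Returning one array rather than one row per object lets a client parse the whole answer in a single step.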

To make the JSON support really useful, though, you need two optional components, or "extensions", to PostgreSQL: hstore and PL/v8. Hstore is a data type that stores indexed key-value data and ships with PostgreSQL. PL/v8 is a stored procedure language based on Google's V8 JavaScript engine, sponsored by Heroku and NTT.

Hstore allows you to store flattened JSON objects as a fully indexed dictionary or hash. PL/v8 lets you write fast-executing JavaScript code which can run inside the database to do all kinds of things with your JSON data. One such use is to create expression indexes on specific JSON elements and save them, giving you stored search indexes much like CouchDB's "views".
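For readers who have not used hstore, a short sketch of the key-value side follows. The table name, column names, and values are hypothetical and chosen only for illustration; the PL/v8 side is not shown here.

```sql
-- hstore ships in contrib and is enabled per database.
CREATE EXTENSION hstore;

-- Hypothetical table: one flat key/value profile per user.
CREATE TABLE user_attrs (
    user_id INTEGER PRIMARY KEY,
    attrs   HSTORE
);

INSERT INTO user_attrs
VALUES ( 1, 'lang => en, theme => dark' );

-- A GIN index covers every key and value in the column.
CREATE INDEX user_attrs_gin ON user_attrs USING gin (attrs);

-- Fetch one value by key, and search by key/value containment.
SELECT attrs -> 'theme' FROM user_attrs WHERE user_id = 1;
SELECT user_id FROM user_attrs WHERE attrs @> 'lang => en';
```

The containment operator in the last query is the kind of search that the GIN index is able to support.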

The PostgreSQL project chose to implement its own JSON formatting rather than utilizing any external library, reducing external dependencies and improving portability.

Range types

Anyone who's ever written a calendaring application can tell you that there's no such thing as a "point in time". Time comes in blocks, and pretty much everything you want to do with times and dates involves spanning minutes, hours, or days of time. For a long time, the only way relational databases had to represent spans of time was as two endpoints in different fields, an unsatisfactory and error-prone method.

In 9.2, contributor Jeff Davis introduces "range types" which allow the representation of any one-dimensional linear range, including time, real numbers, alphabetical indexes, or even points on a line. Such ranges can be compared, checked for overlap, and even included in unique indexes to prevent conflicts. PostgreSQL is the first major relational database system to have this concept.

To give you a concrete example, imagine you're writing a conference scheduling application. You want to make sure that no room can be scheduled for two speakers at the same time. Your table might look something like this:

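    -- Using "=" on a TEXT column in a GiST index requires the btree_gist
    -- extension (CREATE EXTENSION btree_gist;).
    CREATE TABLE room_reservations (
        room_no TEXT NOT NULL,
        speaker TEXT NOT NULL,
        talk TEXT NOT NULL,
        booking_period TSTZRANGE,
        -- Reject rows with the same room and an overlapping booking period.
        EXCLUDE USING gist (room_no WITH =, booking_period WITH &&)
    );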

That odd "EXCLUDE USING gist" clause says not to let anyone insert a record for the same room at overlapping times. It utilizes two existing PostgreSQL features, GiST indexes and exclusion constraints. This substitutes for dozens of lines of application code in order to enforce the same constraint. Then you can insert records like this:

    INSERT INTO room_reservations
    VALUES ( 'F104', 'Jeff Davis', 'Range Types Revisited', 
	    '[ 2012-09-16 10:00:00, 2012-09-16 11:00:00 )' ); 

As you can see, PostgreSQL's range types support mathematical closed and open bracket notation, helping you define ranges which do or don't overlap with adjacent ranges.
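The effect of the bracket choice is easy to check directly with the overlap (&&) and containment (@>) operators. The following sketch uses the tstzrange() constructor with made-up times, independent of any table:

```sql
-- '[' and ']' include the endpoint; '(' and ')' exclude it.
-- Back-to-back bookings written as half-open ranges do not overlap:
SELECT tstzrange('2012-09-16 10:00', '2012-09-16 11:00', '[)')
    && tstzrange('2012-09-16 11:00', '2012-09-16 12:00', '[)');   -- false

-- With closed ranges the shared endpoint 11:00 counts as an overlap:
SELECT tstzrange('2012-09-16 10:00', '2012-09-16 11:00', '[]')
    && tstzrange('2012-09-16 11:00', '2012-09-16 12:00', '[]');   -- true

-- Containment test: is a given instant inside the booking?
SELECT tstzrange('2012-09-16 10:00', '2012-09-16 11:00', '[)')
    @> '2012-09-16 10:30'::timestamptz;                           -- true
```

Half-open ranges are therefore the natural fit for schedules, where one talk ends at the moment the next one begins.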

Scalability to 64 cores

Thanks to its Multiversion Concurrency Control (MVCC) architecture, PostgreSQL does not need to hold any locks for reading data, just for writing data. This should, in theory, allow for near-infinite multicore scalability of a read-only workload. But in reality, PostgreSQL 9.1 only scaled to around 24 cores before throughput per core fell off sharply. This really irritated PostgreSQL contributors Noah Misch and Robert Haas, so they decided to do something about it.

The main reason was that PostgreSQL was actually holding locks for each read. The biggest of these was a unitary lock on the table to make sure that it didn't get dropped while the read query was still running. When you have a lot of very short queries doing reads against the same table, contention on this table lock becomes a major bottleneck. Through a combination of repartitioning the lock memory space, and reducing the time required to get a lock, they largely eliminated that bottleneck.

Other contributors, such as lead developer Tom Lane, made other optimizations to the read-only workload, by, for example, reducing memory copying for cached query plans. The University of California at Berkeley donated the use of a high-memory 64-core server for testing. The results of all of these optimizations are gratifying and dramatic.

PostgreSQL now scales smoothly to 64 cores and over 350,000 queries per second, compared to topping out at 24 cores and 75,000 queries per second on PostgreSQL 9.1. Throughput is better even at low numbers of cores. Note that this is on an extreme workload: all primary-key lookups on a few large tables which fit in memory. While it may seem obscure, that workload describes many common web applications (such as Rails applications), as well as the kind of workload many of the new key-value databases are designed to handle.

Index-only access

One of the features other database systems have, which PostgreSQL has lacked, is the ability to look up data only in an index without checking the underlying table. In the databases which support it (such as MySQL and Oracle) this is a tremendously useful performance optimization, sometimes called "covering indexes".

The reason why it's useful is that for very large tables an index on one or two columns could be 1/100th the size of the table, and is often cached in memory even when the table is not. Thus if you can touch only the index, you can avoid IO, making your query return twenty times faster. It's even more useful if the table in question is only going to be used to join two other tables on their primary keys.

However, the same MVCC which makes read queries scale so well on PostgreSQL made index-only access difficult. Even if the data you wanted was in the index, you had to check the base table for concurrency information. But then contributor Heikki Linnakangas created a highly cacheable bitmap called the Visibility Map, so that the query executor only has to check the base table for data pages which have been recently updated.

This means that, in 9.2, you'll be able to get your query answered just by the index in many or most cases, speeding up queries against large tables. Yes, this also means an end to most "COUNT(*) is slow on PostgreSQL" issues.

The caveat with this feature is that the table or tables in question need to be fairly up-to-date in database maintenance ("VACUUM"), which limits the ability to use the optimization on tables with a lot of "churn". Regardless, for many common use cases and for data warehouses, index-only access will be an order-of-magnitude performance improvement.
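One way to see whether a query qualifies is to look at its plan. The sketch below assumes a hypothetical page_hits table and index; whether the planner actually picks an index-only scan depends on table size, statistics, and how recently the table was vacuumed, so no plan output is shown.

```sql
-- Hypothetical table and index, used only for illustration.
CREATE TABLE page_hits (
    page_id  INTEGER NOT NULL,
    hit_time TIMESTAMPTZ NOT NULL,
    referrer TEXT
);
CREATE INDEX page_hits_page_id_idx ON page_hits (page_id);

-- VACUUM updates the visibility map that index-only scans depend on;
-- ANALYZE refreshes the planner's statistics.
VACUUM ANALYZE page_hits;

-- The query touches only the indexed column, so the planner may choose
-- an "Index Only Scan" node instead of visiting the table.
EXPLAIN SELECT count(*) FROM page_hits WHERE page_id = 42;
```

A query that also selected referrer could not be answered from this index alone, since that column is not part of it.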

Cascading replication

Of course, these days horizontal scalability is a lot more popular than vertical scalability. The PostgreSQL developers, particularly Jun Ishiduka, Fujii Masao, and Simon Riggs, have continued to improve the binary replication introduced in PostgreSQL 9.0. Version 9.2 now contains support for cascading replication, which I will explain by example:

Imagine that you have three load-balancing PostgreSQL servers in Amazon US East, and another cluster of three replicated PostgreSQL servers in Amazon US West for failover in case of another AWS region-wide outage. If you want to use streaming replication, each server in US West needs to replicate directly from the master in US East, driving your transfer costs through the roof.

What you really want is the ability to replicate to database server 1 in US West, and have the two other servers in US West replicate from that server. With PostgreSQL 9.2, you can.

Configuration is fairly simple if you already have PostgreSQL 9.1 replication set up. Simply tell the cascading replica to accept replication connections by setting the max_wal_senders parameter. Then connect to it from the downstream replicas.
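The settings themselves live in configuration files: max_wal_senders and hot_standby in the cascading replica's postgresql.conf, a replication entry in its pg_hba.conf, and primary_conninfo in each downstream replica's recovery.conf. Once those are in place, a few queries run on the cascading replica can confirm the setup. This is a sketch of such a check, not an official procedure.

```sql
-- On the cascading replica: confirm that it is itself a standby...
SELECT pg_is_in_recovery();

-- ...and that it is allowed to serve replication connections.
SHOW max_wal_senders;
SHOW hot_standby;

-- Downstream replicas that have connected appear in this view.
SELECT client_addr, state FROM pg_stat_replication;
```

If the last query returns no rows, the downstream replicas have not connected, and their primary_conninfo settings are the first thing to check.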

Other features

This isn't everything in the PostgreSQL 9.2 beta. There are performance enhancements for writes such as group commit, a new class of index called SP-GiST, reductions in CPU power consumption, multiple enhancements to modifying database schema at runtime, and even several new database monitoring commands. You can read about the new features in the PostgreSQL 9.2 beta release notes and the beta documentation.

Some features planned for 9.2 didn't make it into this release due to issues found during development and testing. These include checksums on data pages to detect faulty storage hardware, federated databases, regular expression indexing, and "command triggers" which can launch an action based on arbitrary database events. Hopefully we'll see all of these in PostgreSQL 9.3 next year.

The PostgreSQL project hopes you'll download and test the beta to help identify and fix bugs in version 9.2. If the project holds true to its timeline for the last couple of years, the final release of version 9.2 should come some time in September.

Page editor: Jonathan Corbet

Security

A ".secure" top-level domain

By Nathan Willis
May 16, 2012

On May 10, Ars Technica reported on a new proposal to create a .secure top-level domain (TLD) that would enforce deployment of encryption and security protocols on all sites and scrutinize all registrants to verify their identity. The idea is being floated by the company that wants to be the registrar and manager of the TLD, however, and regardless of how noble its intentions may be, it still needs to make a strong case for how a domain-bound solution improves on existing security practices.

The initial proposal and initial reactions

The proposal comes from Alex Stamos of research firm iSec Partners, and would appoint Artemis Internet as the gatekeeper of .secure. Artemis would require registered domains to encrypt all web and email traffic (except for HTTP redirects funneling connections towards the appropriate TLS-encrypted site), use DNSSEC, and deploy DomainKeys Identified Mail (DKIM) for spam prevention. In addition, Artemis would employ a rigorous screening process to verify registrants' identities (including reviewing articles of incorporation and human interviews), and routinely conduct security scans of registered sites. The venture has $9.6 million (US) in funding provided by Artemis' parent company NCC Group, a UK-based IT security firm.

The Artemis site says ".secure is not a category or destination. It is an expression of the user's intent to use the Internet safely," but so far offers few additional implementation details. After a number of Ars Technica readers expressed their skepticism in the comment thread, though, Stamos responded to several questions on the site (although, ironically, we have no way to verify that the poster is Stamos).

Several commenters expressed doubt that the lofty goal of enforcing strong encryption and strictly-verified identities was practical on a large scale, while others argued that the proposed rules for registrants offer nothing that secure sites cannot already do today — thus making the special domain unnecessary. To the latter charge, Stamos replied that the goal was to automatically make secure choices whenever the user entered a .secure URL, rather than rely on the user to verify the security of a connection upon visiting a site:

The basic goal of .Secure is to invert the security user experience when using the Internet. Right now you type somesite.com, and then you are expected to look for user interface clues and even dive into details (depending on your expected adversary) to determine whether you got there safely. And if things go slightly wrong, you are prompted to make a decision based upon your understanding of discrete mathematics and the X.509v3 spec.

A reader who goes by VideoGameTech argued that even with a strict security policy enforced on the actual .secure sites, users would still be vulnerable to attacks like DNS hijacking and URL-obfuscation through email spam. Stamos responded that requiring DNSSEC would protect users against DNS hijacking, and that other measures would make spam attacks less likely, saying:

We continue to recommend that legitimate emails like this do not contain clickable links, and that mail/spam providers continue to improve their detection of URL/text mismatch. That being said, we'll be attacking the spam problem in the From: field aggressively with required DKIM and SMTP TLS with a real certificate.

Policy

Stamos also added that Artemis was drafting a proposal called the Domain Policy Framework (DPF), which would allow a domain to specify a security policy to be stored in DNS TXT records that browsers and other client applications could securely retrieve over DNSSEC. He later posted an initial specification of DPF to his blog. The specification itself is hosted at Scribd.com; a PDF version can be downloaded from the Scribd page.

The proposed DPF specification enumerates 15 separate security entries, broken up into "network and identity," "email," and "WWW" sections. They cover basic requirements like whether a domain requires DNSSEC and/or TLS, details like expected email signature algorithms, and general identity information. Artemis has also started a project it calls the Domain Policy Working Group to create DPF, although at the moment it does not list any other members. The group's site does state that its intention is to submit DPF to the Internet Engineering Task Force (IETF) RFC process, and to create supporting standards.

Yet another CA?

Some commenters at Ars Technica and on our own story felt that the proposal boiled down to little more than Artemis vowing to be a certificate authority (CA) that actually acted responsibly, which so many CAs seem to have trouble doing. It would also appear that the .secure domain would require that only the Artemis CA be allowed to sign site certificates for the TLD, or else a subverted CA could start issuing bogus .secure certificates. That, of course, means that subverting Artemis itself could lead to the same problem. On the other hand, it is at least true that running an entire TLD securely would be more visible to the end user than would running a flawless CA.

Consider, for comparison's sake, the Extended Validation (EV) SSL certificate concept. EV certificates were supposed to bolster security by inflicting a rigorous, fraud-proof identity verification process on all applicants. The trouble was that, as the Wall Street Journal reported, most users failed to notice the difference in their browser's UI. Barring any other improvements, if Artemis managed to cultivate a good reputation for running .secure, it would be easier for a casual user to spot the TLD than to dig into the certificate chain-of-trust for another site.

Speaking of previous enhanced-security efforts, the .secure TLD was floated as an idea in 2011 as well, under greatly differing circumstances. Back then it was the US government proposing a heavily-fortified subset of the Internet to live in .secure. That proposal had different components, including ongoing monitoring with intrusion detection and prevention software. But Chris Palmer of the Electronic Frontier Foundation (EFF) responded by critiquing the core notion that better identity verification would provide security for users:

But what does “authentication” mean on the internet? People often (implicitly) take it to mean something like “Through this web browser I am talking to the true Wells Fargo Bank, and Wells knows that I am the true Chris Palmer.” However, when one computer presents credentials (such as a username and password pair or a cryptographic certificate) to another, the link between software data structure and real-world entity (like a person or a business) is weak. It is no stronger than the person’s or business’ ability to ensure that the computers on both ends are operating correctly and are not compromised, and that the channel between them is secure against network threats. From painful experience we know that our operating systems suffer from numerous design and implementation flaws, and that malware and system hacks are all-too-prevalent. [...] For economic reasons (such as Metcalfe’s Law and economies of scale), networks tend to converge, not diverge. (We will probably use the same computers (and wires!) to connect to the real internet as to the .secure internet.)

Echoing that sentiment, Ars Technica reader Mark Havel asked whether Artemis' .secure proposal would come with any guarantees that security patches would be quickly and routinely applied to all servers on .secure sites. Others pointed out that for end users, the only difference between a .secure site and a site in another TLD would be if Artemis offered liability protection against attacks. As reader Fuzzmz said, "Yes, I can visit a .secure domain with a bit more peace of mind, but if something goes haywire then I'm left in the same place as I was if it happened on a .com domain."

There is also the question of whether a hardened .secure domain would attract a larger share of attackers than would the same sites hosted at .com or other TLDs. Some commenters suggested that cracking a .secure site would be an attractive achievement for fame-seekers. On a separate note, if the .secure TLD did successfully become associated with higher security in end users' minds, it would consequently make a more appealing target.

In short, there is not much in the .secure proposal from Artemis that a server could not implement today (DPF, which is not necessarily limited to .secure, depends on browser adoption). But even with a secure connection to a fully-patched server, it is up to the human being in front of the screen to pick out all manner of security attacks, from phishing spam to URL obfuscation, and that human being still relies heavily on the browser to expose and communicate threats in an understandable manner. Without question, the principles that the Artemis site hails as critical (verification, security, and enforcement) are vital to creating a secure web and email environment. But the company still needs to make a stronger case for how tying those principles to a particular TLD improves things over the status quo for the human/browser combination, not to mention why that responsibility should rest with a private, commercial enterprise.

Brief items

Security quotes of the week

Consider disabling SELinux and auditing. We recommend to leave SELinux on, for security reasons, but truth be told you can save 100ms of your boot if you disable it. Use selinux=0 on the kernel cmdline.
-- Lennart Poettering

I operate a ~10k botnet using a ZeuS software I modified myself, including IRC, DDoS and bitcoin mining (13GH/s - 20GH/s atm). Everything operating tru TOR hidden service so no feds will take my servers down. (Don't worry, traffic intensive stuff is not tru TOR and the bots work as relays too, enchancing your TOR experience!)
-- "throwaway236236" in a reddit "Ask me anything"

When I got to the orthopedist’s office a few days later, I gave the receptionist the CD, which she promptly read into the medical records computer and returned to me. It occurred to me that the risk taken in reading a CD or other media from an unknown source is pretty substantial, something we’ve known in the security world for decades but has not filtered well into other fields. On the other hand, every time I’m on a conference program committee I open PDFs from people I may never have heard of, so it’s not as if I’m immune from this risk myself.

When I got home, I read the CD on my Mac laptop, and discovered that it has an autorun.INF file to start the application that reads the x-ray data files. I don’t know whether the doctor’s office disables AutoRun on their computers; undoubtedly some doctors do and others don’t.

And even if the doctors’ computers have disabled AutoRun and don’t use the software on the CD to view the test results, how secure are they against data-driven attacks, such as we saw a number of years ago against JPEG files in browsers?

-- Jeremy Epstein

My own private Internet: .secure TLD floated as bad-guy-free zone (Ars Technica)

Dan Goodin at Ars Technica reports on iSec Partners, a company proposing to make .secure into a heavily-vetted high security domain. "Sites that wanted to be a part of this exclusive domain would have to undergo rigorous screening to verify their identity. Physical addresses, trademark registrations, articles of incorporation, and other legal documents would be reviewed by human beings. Upon approval, applicants would receive two-factor authentication hardware to register online. They would also be required to meet a minimum set of security practices, including end-to-end encryption of virtually all Web and e-mail traffic."

New vulnerabilities

bind-dyndb-ldap: denial of service

Package(s):bind-dyndb-ldap CVE #(s):CVE-2012-2134
Created:May 16, 2012 Updated:May 23, 2012
Description: From the Red Hat bugzilla:

A denial of service flaw was found in the way the bind-dyndb-ldap, a dynamic LDAP back-end plug-in for BIND providing LDAP database back-end capabilities, performed LDAP connection errors handling / attempted to recover, when an error during a LDAP search happened for a particular DNS query. When the Berkeley Internet Name Domain (BIND) server was patched to support dynamic loading of database back-ends, and the LDAP database back-end was enabled, a remote attacker could use this flaw to cause denial of service (named process hang) via DNS query for zone served by bind-dyndb-ldap.

Alerts:
Fedora FEDORA-2012-6722 2012-05-15
Fedora FEDORA-2012-6759 2012-05-15
Red Hat RHSA-2012:0683-01 2012-05-21
CentOS CESA-2012:0683 2012-05-21
Scientific Linux SL-bind-20120522 2012-05-22
Oracle ELSA-2012-0683 2012-05-22

chromium: multiple vulnerabilities

Package(s):chromium CVE #(s):CVE-2011-3078 CVE-2011-3079 CVE-2011-3080 CVE-2011-3081 CVE-2012-1521
Created:May 14, 2012 Updated:May 16, 2012
Description: From the openSUSE advisory:

Chromium version 20.0.1128 fixes several security issues:

  • CVE-2011-3078: Use after free in floats handling.
  • CVE-2012-1521: Use after free in xml parser.
  • CVE-2011-3079: IPC validation failure.
  • CVE-2011-3080: Race condition in sandbox IPC.
  • CVE-2011-3081: Use after free in floats handling.
Alerts:
openSUSE openSUSE-SU-2012:0613-1 2012-05-14
Gentoo 201205-01 2012-05-15

connman: code execution

Package(s):connman CVE #(s):CVE-2012-2320 CVE-2012-2321 CVE-2012-2322
Created:May 16, 2012 Updated:May 16, 2012
Description: From the Gentoo advisory:

Multiple vulnerabilities have been found in ConnMan:

  • Errors in inet.c and rtnl.c prevent ConnMan from checking the origin of netlink messages (CVE-2012-2320).
  • ConnMan does not properly check for shell escapes when requesting a hostname via DHCP (CVE-2012-2321).
  • An infinite loop error exists in client.c (CVE-2012-2322).
A remote attacker could execute arbitrary code with the privileges of the process or cause a Denial of Service condition.
Alerts:
Gentoo 201205-02 2012-05-15

coreutils: command injection

Package(s):coreutils CVE #(s):CVE-2005-4890
Created:May 15, 2012 Updated:May 16, 2012
Description: From the openSUSE advisory:

when running "su -c" to execute commands as different user the target user could inject command back into the calling user's terminal via the TIOCSTI ioctl.

Alerts:
openSUSE openSUSE-SU-2012:0621-1 2012-05-15

ffmpeg: multiple vulnerabilities

Package(s):ffmpeg CVE #(s):CVE-2011-3929 CVE-2011-3936 CVE-2011-3940 CVE-2011-3947 CVE-2012-0853 CVE-2012-0947
Created:May 14, 2012 Updated:May 16, 2012
Description: From the Debian advisory:

Several vulnerabilities have been discovered in FFmpeg, a multimedia player, server and encoder. Multiple input validations in the decoders/ demuxers for Westwood Studios VQA, Apple MJPEG-B, Theora, Matroska, Vorbis, Sony ATRAC3, DV, NSV, files could lead to the execution of arbitrary code.

Alerts:
Debian DSA-2471-1 2012-05-13
Mandriva MDVSA-2012:076 2012-05-15

gridengine: privilege escalation

Package(s):gridengine CVE #(s):CVE-2012-0208
Created:May 16, 2012 Updated:May 16, 2012
Description: From the Debian advisory:

Dave Love discovered that users who are allowed to submit jobs to a Grid Engine installation can escalate their privileges to root because the environment is not properly sanitized before creating processes.

Alerts:
Debian DSA-2472-1 2012-05-16

grub2: insecure permissions in bootloader configuration

Package(s):grub2 CVE #(s):CVE-2012-2314
Created:May 10, 2012 Updated:May 16, 2012
Description:

From the Red Hat bugzilla entry:

A security flaw was found in the way bootloader configuration module of Anaconda, a graphical system installer, stored password hashes when performing write of password configuration file (0755 permissions were used instead of 0700 ones). A local users could use this flaw to obtain password hashes and conduct brute force password guessing attacks (possibly leading to password circumvention, machine reboot or use of custom kernel or initrd command line parameters).
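
The difference between the two permission modes named in the advisory can be checked with a short sketch. It assumes a POSIX system; the file name and the `readable_by_others` helper are hypothetical, and the file holds placeholder content rather than a real password hash.

```python
import os
import stat
import tempfile


def readable_by_others(path):
    """Return True if the group or other users have read permission on path."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return bool(mode & (stat.S_IRGRP | stat.S_IROTH))


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "password-config")  # hypothetical file name
    with open(path, "w") as f:
        f.write("placeholder\n")

    os.chmod(path, 0o755)  # the mode the advisory says was used
    loose = readable_by_others(path)

    os.chmod(path, 0o700)  # the mode the advisory says was intended
    tight = readable_by_others(path)

print(loose, tight)  # True False
```

With 0755, any local account can read the file; with 0700, only the owner can.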

Alerts:
Fedora FEDORA-2012-7579 2012-05-10

kernel: multiple vulnerabilities

Package(s):kernel CVE #(s):CVE-2012-1601 CVE-2012-2133
Created:May 10, 2012 Updated:May 22, 2012
Description:

From the Debian advisory:

CVE-2012-1601: Michael Ellerman reported an issue in the KVM subsystem. Local users could cause a denial of service (NULL pointer dereference) by creating VCPUs before a call to KVM_CREATE_IRQCHIP.

CVE-2012-2133: Steve Grubb reported in an issue in fcaps, a filesystem-based capabilities system. Personality flags set using this mechanism, such as the disabling of address space randomization, may persist across suid calls.

Alerts:
Debian DSA-2469-1 2012-05-10
SUSE SUSE-SU-2012:0616-1 2012-05-14
Red Hat RHSA-2012:0571-01 2012-05-15
CentOS CESA-2012:0571 2012-05-16
Ubuntu USN-1445-1 2012-05-17
Scientific Linux SL-kern-20120518 2012-05-18
Red Hat RHSA-2012:0676-01 2012-05-21
CentOS CESA-2012:0676 2012-05-21
Ubuntu USN-1448-1 2012-05-21
Oracle ELSA-2012-2013 2012-05-21
Oracle ELSA-2012-2013 2012-05-21
Oracle ELSA-2012-0571 2012-05-21
Scientific Linux SL-kvm-20120522 2012-05-22
Oracle ELSA-2012-0676 2012-05-22

kernel: unfiltered netdev rio_ioctl access by users

Package(s):kernel CVE #(s):CVE-2012-2313
Created:May 14, 2012 Updated:May 16, 2012
Description: From the Red Hat bugzilla:

The dl2k driver's rio_ioctl call has a few issues:

  • No permissions checking
  • Implements SIOCGMIIREG and SIOCGMIIREG using the SIOCDEVPRIVATE numbers
  • Has a few ioctls that may have been used for debugging at one point but have no place in the kernel proper.
Alerts:
Fedora FEDORA-2012-7538 2012-05-13
Fedora FEDORA-2012-7594 2012-05-15

libjakarta-poi-java: denial of service

Package(s):libjakarta-poi-java CVE #(s):CVE-2012-0213
Created:May 10, 2012 Updated:May 21, 2012
Description:

From the Debian advisory:

It was discovered that Apache POI, a Java implementation of the Microsoft Office file formats, would allocate arbitrary amounts of memory when processing crafted documents. This could impact the stability of the Java virtual machine.

Alerts:
Debian DSA-2468-1 2012-05-09
Fedora FEDORA-2012-7683 2012-05-19
Fedora FEDORA-2012-7686 2012-05-19

libsoup: insecure SSL handling

Package(s):epiphany, libsoup CVE #(s):CVE-2012-2132
Created:May 10, 2012 Updated:May 16, 2012
Description:

From the openSUSE advisory:

libsoup considered all ssl connections as trusted even if no CA certificates were configured.

Alerts:
openSUSE openSUSE-SU-2012:0609-1 2012-05-10

mysql-cluster: multiple unspecified vulnerabilities

Package(s):mysql-cluster CVE #(s):CVE-2009-5026 CVE-2012-0583 CVE-2012-1688 CVE-2012-1690 CVE-2012-1696 CVE-2012-1697 CVE-2012-1703
Created:May 14, 2012 Updated:May 16, 2012
Description: From the CVE entries:

Unspecified vulnerability in the MySQL Server component in Oracle MySQL 5.1.60 and earlier, and 5.5.19 and earlier, allows remote authenticated users to affect availability, related to MyISAM. (CVE-2012-0583)

Unspecified vulnerability in the MySQL Server component in Oracle MySQL 5.1.61 and earlier, and 5.5.21 and earlier, allows remote authenticated users to affect availability, related to Server DML. (CVE-2012-1688)

Unspecified vulnerability in the MySQL Server component in Oracle MySQL 5.1.61 and earlier, and 5.5.21 and earlier, allows remote authenticated users to affect availability via unknown vectors related to Server Optimizer. (CVE-2012-1690)

Unspecified vulnerability in the MySQL Server component in Oracle MySQL 5.5.19 and earlier allows remote authenticated users to affect availability via unknown vectors related to Server Optimizer. (CVE-2012-1696)

Unspecified vulnerability in the MySQL Server component in Oracle MySQL 5.5.21 and earlier allows remote authenticated users to affect availability via unknown vectors related to Partition. (CVE-2012-1697)

Unspecified vulnerability in the MySQL Server component in Oracle MySQL 5.1.61 and earlier, and 5.5.21 and earlier, allows remote authenticated users to affect availability via unknown vectors related to Server Optimizer. (CVE-2012-1703)

unknown (CVE-2009-5026)

Alerts:
openSUSE openSUSE-SU-2012:0617-1 2012-05-14
openSUSE openSUSE-SU-2012:0618-1 2012-05-14
openSUSE openSUSE-SU-2012:0619-1 2012-05-14

openssl: denial of service

Package(s):openssl CVE #(s):CVE-2012-2333
Created:May 11, 2012 Updated:May 18, 2012
Description:

From the Mandriva advisory:

A flaw in the OpenSSL handling of CBC mode ciphersuites in DTLS can be exploited in a denial of service attack on both clients and servers (CVE-2012-2333).

Alerts:
Mandriva MDVSA-2012:073 2012-05-11
Debian DSA-2475-1 2012-05-17
Ubuntu USN-1451-1 2012-05-24

postgresql-pgpool: multiple vulnerabilities

Package(s):postgresql-pgpool CVE #(s):CVE-2009-1669 CVE-2008-4811 CVE-2008-4810 CVE-2008-1066
Created:May 14, 2012 Updated:May 16, 2012
Description: From the Red Hat bugzilla:

Silvio Cesare reported that pgpoolAdmin includes an embedded copy of the Smarty PHP template engine that is vulnerable to a number of security-related issues. The version of Smarty bundled in pgpoolAdmin 2.2 is 2.6.13, while the current version of Smarty is 2.6.25. This would make the embedded version of Smarty, and thus pgpoolAdmin, vulnerable to a number of issues.

Alerts:
Fedora FEDORA-2012-7124 2012-05-13

roundcubemail: multiple vulnerabilities

Package(s):roundcubemail CVE #(s):CVE-2011-1491 CVE-2011-1492 CVE-2011-2937 CVE-2011-4078
Created:May 10, 2012 Updated:May 16, 2012
Description:

From the Mandriva advisory:

The login form in Roundcube Webmail before 0.5.1 does not properly handle a correctly authenticated but unintended login attempt, which makes it easier for remote authenticated users to obtain sensitive information by arranging for a victim to login to the attacker's account and then compose an e-mail message, related to a login CSRF issue (CVE-2011-1491).

steps/utils/modcss.inc in Roundcube Webmail before 0.5.1 does not properly verify that a request is an expected request for an external Cascading Style Sheets (CSS) stylesheet, which allows remote authenticated users to trigger arbitrary outbound TCP connections from the server, and possibly obtain sensitive information, via a crafted request (CVE-2011-1492).

Cross-site scripting (XSS) vulnerability in the UI messages functionality in Roundcube Webmail before 0.5.4 allows remote attackers to inject arbitrary web script or HTML via the _mbox parameter to the default URI (CVE-2011-2937).

include/iniset.php in Roundcube Webmail 0.5.4 and earlier, when PHP 5.3.7 or 5.3.8 is used, allows remote attackers to trigger a GET request for an arbitrary URL, and cause a denial of service (resource consumption and inbox outage), via a Subject header containing only a URL, a related issue to CVE-2011-3379 (CVE-2011-4078).

Alerts:
Mandriva MDVSA-2012:072 2012-05-10

taglib: denial of service

Package(s):taglib CVE #(s):CVE-2012-2396
Created:May 14, 2012 Updated:May 16, 2012
Description: From the CVE entry:

VideoLAN VLC media player 2.0.1 allows remote attackers to cause a denial of service (divide-by-zero error and application crash) via a crafted MP4 file.

Alerts:
openSUSE openSUSE-SU-2012:0615-1 2012-05-14

wordpress: multiple vulnerabilities

Package(s):wordpress CVE #(s):CVE-2012-2399 CVE-2012-2400 CVE-2012-2401 CVE-2012-2402 CVE-2012-2403 CVE-2012-2404
Created:May 11, 2012 Updated:May 16, 2012
Description:

From the Fedora advisory:

Bug #815384 - CVE-2012-2399 wordpress (X < 3.3.2): Unspecified vulnerability in SWFUpload https://bugzilla.redhat.com/show_bug.cgi?id=815384

Bug #815387 - CVE-2012-2400 wordpress (X < v3.3.2): Unspecified vulnerability in the SWFObject https://bugzilla.redhat.com/show_bug.cgi?id=815387

Bug #815388 - CVE-2012-2401 wordpress (X < v3.3.2): Plupload - Same origin policy bypass via crafted SWF content https://bugzilla.redhat.com/show_bug.cgi?id=815388

Bug #815389 - CVE-2012-2402 wordpress (X < v3.3.2): Remote authenticated site administrators able to deactivate network-wide plugins under certain circumstances https://bugzilla.redhat.com/show_bug.cgi?id=815389

Bug #815391 - CVE-2012-2403 wordpress (X < v3.3.2): XSS when making URLs clickable https://bugzilla.redhat.com/show_bug.cgi?id=815391

Bug #815392 - CVE-2012-2404 wordpress (X < v3.3.2): XSS in redirects after posting comments, and when filtering URLs https://bugzilla.redhat.com/show_bug.cgi?id=815392

Alerts:
Fedora FEDORA-2012-6511 2012-05-11
Fedora FEDORA-2012-6542 2012-05-11
Debian DSA-2470-1 2012-05-11

wordpress: multiple vulnerabilities

Package(s):wordpress CVE #(s):CVE-2011-3122 CVE-2011-3125 CVE-2011-3126 CVE-2011-3127 CVE-2011-3128 CVE-2011-3129 CVE-2011-3130 CVE-2011-4956 CVE-2011-4957
Created:May 14, 2012 Updated:May 16, 2012
Description: From the CVE entries:

WordPress 3.1 before 3.1.3 and 3.2 before Beta 2 treats unattached attachments as published, which might allow remote attackers to obtain sensitive data via vectors related to wp-includes/post.php. (CVE-2011-3128)

The file upload functionality WordPress 3.1 before 3.1.3 and 3.2 before Beta 2, when running "on hosts with dangerous security settings," has unknown impact and attack vectors, possibly related to dangerous filenames. (CVE-2011-3129)

wp-includes/taxonomy.php in WordPress 3.1 before 3.1.3 and 3.2 before Beta 2 has unknown impact and attack vectors related to "Taxonomy query hardening," possibly involving SQL injection. (CVE-2011-3130)

unknown (CVE-2011-4956)

unknown (CVE-2011-4957)

Unspecified vulnerability in WordPress 3.1 before 3.1.3 and 3.2 before Beta 2 has unknown impact and attack vectors related to "Media security." (CVE-2011-3122)

Unspecified vulnerability in WordPress 3.1 before 3.1.3 and 3.2 before Beta 2 has unknown impact and attack vectors related to "Various security hardening." (CVE-2011-3125)

WordPress 3.1 before 3.1.3 and 3.2 before Beta 2 allows remote attackers to determine usernames of non-authors via canonical redirects. (CVE-2011-3126)

WordPress 3.1 before 3.1.3 and 3.2 before Beta 2 does not prevent rendering for (1) admin or (2) login pages inside a frame in a third-party HTML document, which makes it easier for remote attackers to conduct clickjacking attacks via a crafted web site. (CVE-2011-3127)

Alerts:
Debian DSA-2470-1 2012-05-11

Page editor: Jake Edge

Kernel development

Brief items

Kernel release status

The current development kernel is 3.4-rc7, released on May 12. Linus says: "This is almost certainly the last -rc in this series - things really have calmed down, and I even considered just cutting 3.4 this weekend, but felt that another week wouldn't hurt." Expect a 3.4 final release in the near future.

Stable updates: 3.3.6 was released on May 12, as was 3.2.17. The 2.6.34.12 update is in the review process as of this writing; it can be expected on or after May 17.

Quotes of the week

Also, when Van [Jacobson] says something, you can be fairly sure its right, and if it's not, then you didn't understand what Van said.
Eric Dumazet (thanks to Dave Täht)

The thrust of the argument seems to be that by establishing good habits from the very beginning you can avoid the need for change. That may well be true, but it isn't particularly "user friendly". We should make things simple and safe so that people don't *need* to carefully form good habits.
Neil Brown

The end of the token ring era?

By Jonathan Corbet
May 16, 2012
Paul Gortmaker was recently doing some cleanup work when he found the token ring networking code getting in the way. Which led him to wonder: was anybody still using that code? He concluded that the answer was "no":

A search on the internet for users tends to show that even the die hard enthusiasts who cared to poke at MCA/TR just for hobby sake have pretty much all given up somewhere in the 2003-2005 "pre-git" timeframe, and never really moved off their 2.4.x kernels.

In response, he put together a patch to remove the token ring subsystem altogether. The patch was presented as a demonstration, without a lot of hope that it would be applied in the near future. Paul's real goal was to get comments and see if he could build a consensus for the removal of the code at some more distant time.

Thus far, there has been one objection. But, that notwithstanding, David Miller has accepted the patch and fast-tracked it directly into the net-next repository. Barring some sort of reversion prior to the merge window, it looks like the 3.5 kernel will be missing support for token ring networking.

Kernel development news

User and group mount options for ext filesystems

By Jake Edge
May 16, 2012

When transporting files between systems using USB sticks or other removable media, one can run into an annoying problem: the UIDs or GIDs of the files on the media don't match those on the system. In most situations, those kinds of devices have a VFAT filesystem that avoids the problem entirely by not storing UID/GID information. But if a user wants to use a "real" filesystem on the device, one of the ext* family for example, it might be useful to specify the local owner of the files. Ludwig Nussel's patch set would do just that for ext2, ext3, and ext4 filesystems.

The patch comes from some work Nussel did "years ago", he said, when re-introducing it. It simply adds two new mount options for ext filesystems. Following in the footsteps of the VFAT filesystem, the patch would add uid= and gid= options that would treat all files in the filesystem as being owned by that UID/GID combination. When a filesystem is mounted using these options, files retain their ownership on disk, but they appear to be owned by the specified user and group. Existing files cannot have their ownership changed, but new files will be created with the user and group given at mount time. If a different UID/GID combination is desired for new files—to match the UID/GID on the device for example—they can be added to the mount option:

    uid=m:n
    gid=x:y
which would make the files appear to be owned by m.x and would create new files as n.y.
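
The option syntax described above can be modeled in a few lines of Python. This is only a sketch of the semantics as the article describes them, not the kernel patch itself; `IdOverride`, `parse_override`, and `parse_mount_options` are hypothetical names, and the assumption that a single value applies to both existing and new files follows from the description of the plain uid= and gid= forms.

```python
from typing import NamedTuple


class IdOverride(NamedTuple):
    shown: int    # ID that existing files appear to have
    created: int  # ID recorded on disk for newly created files


def parse_override(value: str) -> IdOverride:
    """Parse 'm' or 'm:n' as described for the proposed uid=/gid= options."""
    shown, sep, created = value.partition(":")
    shown_id = int(shown)
    # Without ':n', new files get the same ID that existing files appear to have.
    return IdOverride(shown_id, int(created) if sep else shown_id)


def parse_mount_options(options: str) -> dict:
    """Pick the uid= and gid= overrides out of a comma-separated option string."""
    result = {}
    for opt in options.split(","):
        key, sep, value = opt.partition("=")
        if sep and key in ("uid", "gid"):
            result[key] = parse_override(value)
    return result


opts = parse_mount_options("rw,uid=1000:500,gid=100:50")
print(opts["uid"])  # IdOverride(shown=1000, created=500)
print(opts["gid"])  # IdOverride(shown=100, created=50)
```

In the article's notation, this example string has m=1000, n=500, x=100, and y=50: files appear to be owned by 1000.100, and new files are created as 500.50.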

One of the first questions to greet Nussel's patch was about putting the code into specific filesystems, rather than the VFS layer. While the VFS seems like the right place, Ted Ts'o points out that there is no easy way to do it all there:

The problem is that there will need to be at least some support in the individual file system, since there isn't a good place for the VFS to intercept the internal file system iget() function to patch in the override uid/gid values.

So the question at this point is whether it's cleaner to have the functionality split between the VFS and the file system layers (i.e., with the options parsing and storing the override uid/gid values in the super_block structure) or keeping it all in the file system layer, and accepting the duplication of code across multiple file systems.

Ts'o leaned toward the first approach in that message, but later reluctantly accepted the code duplication. From what he could see, there wasn't enough of a win to put it into the VFS.

There was a little more discussion when Nussel resent the patch on May 10. First off, Jan Kara and Ts'o both wanted to see the patch split into three parts (one for each of ext2, 3, and 4), which Nussel did and posted the next day. But, Roland Eggner and Boaz Harrosh were both concerned about the underlying idea of the patch. Circumventing the access restrictions on the files via a mount option is not a sensible way to address what is, really, an administrative problem, they said.

Eggner described how he "solves" the problem for systems he administers by essentially creating and using a static list of UIDs and GIDs. His position is: "If UIDs differ on machines FORESEEN for file exchange, this is an administrator error, not a kernel deficit." Furthermore, exchanging files with unexpected systems requires root privileges, he said, so there is no need for the mount option override.

Like Eggner, Harrosh is concerned about security issues with the proposed change. He also doesn't see anything particularly special about the ext filesystems in terms of removable media, noting that VFAT is the dominant choice. Beyond that, he questions the definition of "removable media", and notes that the problem is common in the NFS world: "we constantly encounter multiple domain uid/gid views, and it does not mean we blow a hole in POSIX security rules."

But Neil Brown sees things a little differently. He notes that VFAT suffers from limitations including a 4GB file size limit and an inability to handle some special characters in file names. That aside, when someone has physical access to a device, it is essentially "removable" in some sense, so that someone may want to easily access the data:

[...] if I "own" a filesystem - whether because I hold the physical non-encrypted devices or because I know the encryption key - then I want to be able to leverage that "ownership" to full access rights to the contents of the filesystem. By typing in a key or plugging in a device I want to get full "root" access to the filesystem on the device. Not giving that to me is just getting in my way.

When users insert a VFAT-formatted USB stick or disk, suitably configured systems will give full access to the user by using the VFAT uid/gid options. Nussel's patches essentially just give that same power for ext-formatted devices. While it could certainly lead to problems, those problems are already latent, as Brown pointed out:

You cannot prevent data destruction on such devices if you lose physical control, and the only workable data privacy option is encryption. Trying to pretend that file permission bits mean anything is extremely naive.

While Harrosh is concerned that automounters will start using the options, Brown believes that makes sense for removable devices. In the patch, Nussel mentions that it could be done statically in /etc/fstab or be handled dynamically through udev rules. The alternative suggested by Harrosh is that root can mount the device and then chmod (or chown, presumably) the files appropriately. That seems like a pointless exercise that will just have to be repeated, potentially every time the device is plugged into a new system. Eggner's method is certainly workable, at some level, but makes things more difficult and less "user friendly", Brown said.

In the end, it is a convenience feature. Anyone with physical access to an unencrypted removable device already has the tools available to read the data on it or to put malware onto it. It's a little hard to see how making it easier for legitimate owners of removable USB storage to access their data somehow opens the floodgates for attackers of various sorts. Those of a malicious bent can find any number of ways (live CD, their own Linux system, ...) to access the device as root if they wish.

It is unclear how prevalent ext-formatted removable devices are, so there may be an argument against adding the feature on those grounds. On the other hand, making the ext family work better may encourage people to use those filesystems more often for removable media. The patches do duplicate code in the three separate filesystems, but the total number of lines changed is only around 100 for each. Moving some of that into the VFS (like parsing the mount options and storing the flags in the superblock) might reduce that a bit, but it's not much code overall. Administrators who are worried about the feature will be able to avoid it entirely, though they may need to keep an eye on their distribution's udev rules. Given that it brings the same convenience as VFAT to ext-formatted devices, it seems like a feature worth having.

Various tweaks to printk()

By Jonathan Corbet
May 16, 2012
For the most part, the logging reliability patches covered here in April have been quietly stabilizing and appear to be set for merging for 3.5. But printk() is a heavily-used function, so there are a lot of people with strong opinions on how it should work. Thus the discussion on how printk() can be improved has stretched out for some time. The result, so far, is a better understanding of how continuation lines should be handled and, possibly, a new format for timestamps.

Messages are sent to the system log with printk(), but that function has an interesting bit of historical behavior: like printf() in user space, printk() can be used to send partial lines to the log. Multiple printk() calls can be used to produce a single line in the log stream, piece by piece. The patches for 3.5 make printk() much more record-oriented internally, but the API does not change. So there is a bit of an impedance mismatch between a record-oriented logging system and its stream-oriented API. That mismatch has been there since the beginning, but it has become more clear over time.

The mixed nature of kernel logging leads to a bit of an ambiguity, because any message can be either of two things: (1) a new message to be logged or (2) a continuation of a previous log message. The kernel decides which of the two situations holds by remembering whether the previous log message ended with a newline or not. If there was no trailing newline, a new message will be appended to the previous line.

This approach works much of the time, but it is not without its hazards. In particular, there is nothing that guarantees that two successive printk() calls will be executed one right after the other. Even on a uniprocessor system, interrupt handlers can emit messages between two printk() calls that are supposed to produce a single line of output. Adding more processors to the system clearly makes the situation worse; there is only one log buffer containing messages from all processors, so it is easy for one processor to jump into the middle of a sequence of printk() calls being executed on another. What happens then is not especially pretty: messages get mashed together and corrupted. The result is a log that is harder for humans to read, and which can totally confuse automated log-processing tools.
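
A toy model makes the hazard concrete. The sketch below is Python rather than kernel code, and the NaiveLog class and the sample messages are invented for illustration; it implements only the rule described above, where a fragment is appended to the previous line whenever that line did not end with a newline.

```python
class NaiveLog:
    """Toy log buffer that merges purely on 'did the last fragment end a line?'."""

    def __init__(self):
        self.lines = []
        self.open_line = False  # True if the last fragment lacked a newline

    def printk(self, text):
        body = text.rstrip("\n")
        if self.open_line:
            self.lines[-1] += body   # treated as a continuation
        else:
            self.lines.append(body)  # treated as a new message
        self.open_line = not text.endswith("\n")


log = NaiveLog()
log.printk("eth0: link ")             # a driver starts a line...
log.printk("irq 9: nobody cared\n")   # ...an interrupt handler logs in between...
log.printk("up\n")                    # ...and the driver finishes too late
for line in log.lines:
    print(line)
```

The interrupt handler's message ends up glued to the driver's partial line, and the driver's final fragment is left on a line of its own.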

This patch set was supposed to be about increasing logging reliability, so that sort of message corruption is not welcome. The original plan devised by developer Kay Sievers was to require an explicit KERN_CONT "log level" marker for continuations. In this scheme, every printk() call will generate a new log line unless merging has been explicitly requested with the KERN_CONT "log level." There is a little problem in that most continuation lines are not so-marked in current kernels, leading to lines being split up; Kay's plan was to audit the kernel and fix all of those calls to work properly in the new scheme.

Linus didn't like that idea, saying that things work well as they are now; to him, adding all those KERN_CONT markers just represented unnecessary noise. After some back-and-forth, Kay came around to Linus's point of view, but he still wanted to avoid the corruption of messages whenever possible. The result was a new patch that tries to explicitly remember partial printk() calls and associate them with a specific process. Lines passed to printk() will be merged only if they both come from the same process and only if the second line is clearly not the start of a new log message. The end result is not perfect: if two processors try to output partial lines at the same time, at least one of them will be split. But there will be no more joining of unrelated messages, and that seems like a good thing.
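
The revised rule can be sketched in the same toy style. This is a Python model written for illustration, not the patch itself; the OwnerTrackingLog class is hypothetical, and the assumption that an explicit log level is what marks "the start of a new log message" is a simplification made here.

```python
class OwnerTrackingLog:
    """Toy log buffer that only merges fragments coming from the same task."""

    def __init__(self):
        self.lines = []
        self.open_owner = None  # task that left the most recent line unfinished

    def printk(self, owner, text, level=None):
        body = text.rstrip("\n")
        # Merge only if the previous line is unfinished, belongs to this task,
        # and this fragment does not carry its own log level (assumed marker
        # for the start of a new message).
        can_merge = (self.open_owner is not None
                     and owner == self.open_owner
                     and level is None)
        if can_merge:
            self.lines[-1] += body
        else:
            self.lines.append(body)
        self.open_owner = None if text.endswith("\n") else owner


log = OwnerTrackingLog()
log.printk(1, "eth0: link ")
log.printk(2, "irq 9: nobody cared\n", level=4)
log.printk(1, "up\n")
for line in log.lines:
    print(line)
```

In this model the interrupted line is split into two pieces, which matches the imperfection described above, but the unrelated message is never merged into it.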

A different branch of the same discussion got into the formatting of timestamps, which will always be present in the new scheme. In current kernels, that timestamp comes in the form of seconds and microseconds since the system booted. But what developers often really want to see is some combination of the absolute time of an event and the relative time from previous events. After some discussion with Sasha Levin, Linus requested a format that looks like this:

    [May12 11:27] foo
    [May12 11:28] bar
    [  +5.077527] zoot
    [ +10.235225] foo
    [  +0.002971] bar
    [May12 11:29] zoot
    [  +0.003081] foo

In other words, events that are relatively far apart in time would be marked with the absolute time with one-minute precision. When things happen more closely in time, the elapsed time between successive events would be printed instead. For any driver developer trying to figure out the relative timing of device-related events, this kind of output format would help to save a lot of mental arithmetic.
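
Since no patch has been posted, the exact rule for switching between the two forms is not known. The Python sketch below is one possible reading: it prints the absolute time for the first event and for any event that follows a gap of at least threshold seconds, and the elapsed time otherwise. The format_log() function and the 30-second default are assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Month abbreviations are spelled out so the output does not depend on the locale.
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def format_log(events, threshold=30.0):
    """Format (datetime, text) pairs, sorted by time, one line per event."""
    lines = []
    previous = None
    for when, text in events:
        gap = None if previous is None else (when - previous).total_seconds()
        if gap is None or gap >= threshold:
            # Absolute time with one-minute precision.
            stamp = "%s%02d %02d:%02d" % (MONTHS[when.month - 1], when.day,
                                          when.hour, when.minute)
        else:
            # Seconds since the previous event, padded to the same width.
            stamp = "%+11.6f" % gap
        lines.append("[%s] %s" % (stamp, text))
        previous = when
    return lines


start = datetime(2012, 5, 12, 11, 27, 0)
events = [
    (start, "foo"),
    (start + timedelta(seconds=60), "bar"),
    (start + timedelta(seconds=65, microseconds=77527), "zoot"),
]
for line in format_log(events):
    print(line)
```

Both forms are eleven characters wide inside the brackets, so the message text stays aligned regardless of which one is printed.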

The patches to produce this format have not yet been posted, so it is looking likely that we will not see it in the 3.5 kernel. The rest of the logging work should be there for 3.5, though, taking Linux one small step closer to the sort of structured and reliable logging that many users and developers would like to see.

A bcache update

By Jonathan Corbet
May 14, 2012
Flash-based solid-state storage devices (SSDs) have a lot to recommend them; in particular, they can be quite fast even when faced with highly non-sequential I/O patterns. But SSDs are also relatively small and expensive; for that reason, for all their virtues, they will not be fully replacing rotating storage devices for a long time. It would be nice to have a storage device that provided the best features of both SSDs and rotating devices—the speed of flash combined with the cheap storage capacity of traditional drives. Such a device could simultaneously reduce the performance pain that comes with rotating storage and the financial pain associated with solid-state storage.

The classic computer science response to such a problem is to add another level of indirection in the form of another layer of caching. In this case, a large array of drives could be hidden behind a much smaller SSD-based cache that provides quick access to frequently-accessed data and turns random access patterns into something closer to sequential access. Hybrid drives and high-end storage arrays have provided this kind of feature for some time, but Linux does not currently have the ability to construct such two-level drives from independent components. That situation could change, though, if the bcache patch set finds its way into the mainline.

LWN last looked at bcache almost two years ago. Since then, the project has been relatively quiet, but development has continued. With the current v13 patch set, bcache creator Kent Overstreet says:

Bcache is solid, production ready code. There are still bugs being found that affect specific configurations, but there haven't been any major issues found in awhile - it's well past time I started working on getting it into mainline.

The idea behind bcache is relatively straightforward: given an SSD and one or more storage devices, bcache will interpose the SSD between the kernel and those devices, using the SSD to speed I/O operations to and from the underlying "backing store" devices. If a read request can be satisfied from the SSD, the backing store need not be involved at all. Depending on its configuration, bcache can also buffer write operations; in this mode, it serves as a sort of extended I/O scheduler, reordering operations so that they can be sent to the backing device in a more seek-friendly manner. Once one gets into the details, though, the problem starts to become more complex than one might imagine.

Consider the buffering and reordering of write operations, for example. Some users may be uncomfortable with anything that delays the arrival of data on the backing device; for such situations, bcache can be run in a write-through caching mode. When write-through behavior is selected, no write operation is considered to be complete until it has made it to the backing device. Clearly, in this case, the SSD cache is not going to improve write performance at all, though it may still improve performance overall if that data is read while it remains in the cache.

If, instead, writeback caching is enabled, bcache will mark the completion of writes once they make it to the SSD. It can then flush those dirty blocks out to the backing device at its leisure. Writeback caching can allow the system to coalesce multiple writes to the same blocks and to achieve better on-disk locality when the writes are eventually flushed out; both of those should improve performance. Obviously, writeback caching also carries the risk of losing data if the system is struck by a meteorite before the writeback operation is complete. Bcache includes a fair amount of code meant to address this concern; the SSD contains an index as well as the cached data, so dirty blocks can be located and written back after the system comes back up. Providing meteorite-proof drives is beyond the scope of the bcache patch set, though.
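
The difference between the two modes can be shown with a small model. The Python sketch below is an illustration only and says nothing about how bcache is actually implemented; the CachedDevice class, its dictionaries standing in for devices, and the sorted flush order are all assumptions made for the example.

```python
class CachedDevice:
    """Toy two-level device: a fast cache in front of a slow backing store."""

    def __init__(self, writeback=False):
        self.writeback = writeback
        self.cache = {}          # block number -> data held on the "SSD"
        self.backing = {}        # block number -> data on the slow device
        self.dirty = set()       # cached blocks not yet on the backing device
        self.backing_writes = 0  # how often the slow device was written

    def write(self, block, data):
        self.cache[block] = data
        if self.writeback:
            self.dirty.add(block)             # complete once it is in the cache
        else:
            self._write_backing(block, data)  # write-through: go to disk now

    def read(self, block):
        if block not in self.cache:           # miss: fetch from the slow device
            self.cache[block] = self.backing[block]
        return self.cache[block]

    def flush(self):
        # Write dirty blocks in ascending order, a stand-in for better locality.
        for block in sorted(self.dirty):
            self._write_backing(block, self.cache[block])
        self.dirty.clear()

    def _write_backing(self, block, data):
        self.backing[block] = data
        self.backing_writes += 1


through = CachedDevice(writeback=False)
back = CachedDevice(writeback=True)
for dev in (through, back):
    dev.write(7, "a")
    dev.write(7, "b")  # overwrites the same block
    dev.write(3, "c")
back.flush()
print(through.backing_writes, back.backing_writes)
```

The two writes to block 7 are coalesced in writeback mode, so the backing device sees fewer writes. Anything still in the dirty set when the meteorite arrives exists only in the cache, which is why the on-SSD index matters.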

Of course, maintaining this index on the SSD has some performance costs of its own, especially since bcache takes pains to only write full erase blocks at a time. One write operation from the kernel can turn into several operations at the SSD level to ensure that the on-SSD data structures are consistent at all times. To mitigate this cost, bcache provides an optional journaling layer that can speed up operations at the SSD level.

Another interesting problem that comes with writeback caching is the implementation of barrier operations. Filesystems use barriers (implemented as synchronous "force to media" operations in contemporary kernels) to ensure that the on-disk filesystem structure is consistent at all times. If bcache does not recognize and implement those barriers, it runs the risk of wrecking the filesystem's careful ordering of operations and corrupting things on the backing device. Unfortunately, bcache does indeed lack such support at the moment, leading to a strong recommendation to mount filesystems with barriers disabled for now.

Multi-layer solutions like bcache must face another hazard: what happens if somebody accesses the underlying backing device directly, routing around bcache? Such access could result in filesystem corruption. Bcache handles this possibility by requiring exclusive access to the backing device. That device is formatted with a special marker, and its leading blocks are hidden when accessing the device by way of bcache. Thus, the beginning of the device under bcache is not the same as the beginning when the device is accessed directly. That means that a filesystem created through bcache will not be recognized by the filesystem code if an attempt is made to mount the backing device directly. Simple attempts to shoot one's own feet should be defeated by this mechanism; as always, there is little point in doing more to protect those who are really determined to injure themselves.
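
The effect of hiding the leading blocks is easy to see in a sketch. The Python fragment below is purely illustrative; the HEADER_SECTORS value, the list standing in for a device, and the helper names are all hypothetical, and the real size and layout of the marker area are not given here.

```python
HEADER_SECTORS = 16  # hypothetical size of the hidden marker area


def write_via_cache(raw, sector, data):
    """Write to the device as seen through the caching layer."""
    raw[sector + HEADER_SECTORS] = data


def read_via_cache(raw, sector):
    """Read from the device as seen through the caching layer."""
    return raw[sector + HEADER_SECTORS]


raw = [None] * 64        # the backing device, accessed directly
raw[0] = "cache-marker"  # the special marker at the start of the device

write_via_cache(raw, 2, "superblock")
print(read_via_cache(raw, 2))  # found where the filesystem expects it
print(raw[2])                  # direct access finds nothing at that sector
```

A filesystem that looks for its superblock at a fixed offset will find it through the caching layer but not on the raw device, so a direct mount attempt fails instead of silently corrupting data.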

There seems to be a reasonable level of consensus that bcache would be useful functionality to add to the kernel. There are some obstacles to overcome before this code can be merged, though. One of those is that bcache adds its own management interface involving a set of dedicated tools and a complex sysfs structure. There is resistance to adding another API for block device management, so Kent has been encouraged to integrate bcache into the device mapper code. Nobody seems to be working on that project at the moment, but Dan Williams has posted a set of patches integrating bcache into the MD RAID layer. With these patches, a simple mdadm command is sufficient to set up an array with SSD caching added on top. Once that code gets into shape, presumably the user-space interface concerns will be somewhat lessened.

A harder problem to get around may be the simple fact that the bcache patch set is large, adding over 15,000 lines of code to the kernel. Included therein is a fair amount of tricky data structure work such as a complex btree implementation and "closures," being "asynchronous refcounty things based on workqueues." The complexity of the code will make it hard to review, but, given the potential for trouble when adding a new stage to the block I/O path, developers will want this code to be well reviewed indeed. Getting enough eyeballs directed toward this code could be a challenge, but the benefit, in the form of faster storage devices, could well be worth the trouble.

Page editor: Jonathan Corbet

Distributions

Stable distributions and unstable software

By Jake Edge
May 16, 2012

Some projects evolve quickly, with rapid release cycles that often leave older major versions behind. That may work just fine for users who are getting the code directly from the project, but it can be problematic for users getting the code from distributions. The problem becomes more acute when security updates are wrapped up inside releases for new features and other bug fixes. The tension between stability and the latest and greatest version was discussed in a recent debian-devel thread regarding WordPress, but the problem goes beyond just Debian—or WordPress.

The discussion started from a bug report filed by Bernd Zeimetz entitled "wordpress: no sane way for security updates in stable releases". He was reacting to a recent wordpress security update that upgraded Debian's wordpress package (based on 3.0.5) to the latest upstream version (3.3.2) because "specific fixes are usually not identified", which makes it difficult or impossible to backport the fixes. The update announcement goes on to warn users that compatibility (especially for plugins or themes that have been installed) may be impacted by the update.

That's not generally the experience that Debian users expect. As Zeimetz put it:

Being forced to upgrade to a new major version by a stable security support is nothing we should force our users to. Debian stable is known for (usually) painfree updates and bugfixes only, not for shipping completely new versions with a forced migration.

His suggestion was to leave WordPress out of the upcoming "Wheezy" (7.0) release "until upstream handles such issues in a sane way". It's not the first time that idea has been raised. Back in 2007, Moritz Muehlenhoff argued that "Etch" (Debian 4.0) should not ship WordPress due to its security track record. That suggestion was overridden by a vote of the technical committee. So far, at least, it doesn't seem like Zeimetz's bug (which was closed by Muehlenhoff) is headed toward the technical committee, but it did bring up some interesting discussion.

The general consensus seemed to be that WordPress (and other web-oriented applications and frameworks) just move too fast to fit in well with the Debian stable model. Each new release of WordPress likely has some security fixes, Russell Coker said, that are undocumented, so the safest approach is to always update to the newest release. That led Jon Dowland to wonder what value Debian is providing by packaging WordPress if there are no stability guarantees. Several people suggested that it does provide for an easy way to install and upgrade the package, though it is a bit unclear how many people actually do things that way.

In the thread, several users said that they install directly from upstream, rather than using the packages, for a number of reasons. There are numerous plugins and themes for WordPress, many of which are not packaged for Debian for licensing or other reasons, and that typically require the latest version to function. In addition, the Debian package is not really targeted at multi-blog installations. For example, Russ Allbery described the reasons that Stanford University installs from upstream; others concurred with that assessment.

Other distributions have essentially been forced down the same path that the recent Debian update took. Fedora, for example, also updated to the latest WordPress in order to fix a number of security problems. Fedora users are probably more used to living on (or close to) the bleeding edge than Debian stable users are. But maintaining a package that upstream has left far behind for 2-3 years, as Debian tries to do, is likely to be difficult.

Evidently, WordPress doesn't have a lot of interest in declaring a stable release to maintain over that kind of time frame. That's not a surprise, nor a knock on WordPress, as the web moves very quickly and the project can make its own decisions about how to support its users. That said, it would certainly help distributions and others to give better information about security fixes so that backports could potentially be made. While the WordPress security track record may have gotten better over the years (that depends on whom you listen to), some of the same problems that we wrote about in 2009 persist.

The problem is not limited to WordPress, of course, as there are lots of projects, particularly in the web space, that are rapidly updating and leaving their older major versions behind. Firefox is another example of a project that generally forces distributions to upgrade to the latest version due to its rapid release cycle (though the extended support release may blunt the impact for some distributions). Other content management systems, web browsers, frameworks, and so on, have had similar situations that required a major version upgrade for security fixes.

It is still an open question how Linux distributions should handle packaging these kinds of projects. One possible solution for Debian is just to document the problem as is done for browsers, which was suggested by Martin Bagge. Essentially, Debian alerts users that some browsers may not get updates because of the lack of a long-term maintenance branch.

This is yet another example of the difficulty in maintaining a stable base using an ever-shifting array of parts. Distributions are dependent on the upstream projects, but those projects may have an entirely different focus. For distributions like Fedora that turn over every year or so, it's less of an issue, but distributions like Debian (or Ubuntu LTS) are going to have to carefully decide which packages they can maintain—and how they maintain them—over the long haul.

In the future, it may make sense to explore other options. Perhaps distributions could concentrate on the core "plumbing" of the system (libraries, desktops, development tools, utilities, etc.) while providing a means for users to easily install applications (especially fast moving ones) from upstream. That is the model that Google's Play store follows for Android, and Ubuntu is experimenting with that to some extent in its Ubuntu Software Center. With cooperation of the upstream projects, some kind of middle ground might be found between using the package manager and installing upstream code with an entirely different mechanism. There are lots of things to like about the Linux distribution model, but that doesn't mean that there is no room for improvement.

Brief items

Distribution quotes of the week

-kgd, longtime RedHat-er torn between a distro that I get along with and a distro with at least three kitchen sinks included
-- Kris Deugau

Recipe to stop biggerism: Stop upgrading everything.
-- Chris Murphy

I have a Fedora sticker in front of *acebook*.

All I need is for folks to come up and ask me whats the deal with that? Next thing ya know, I am telling them about Linux, Fedora, and the Beefy Miracle. Sense of humor is key here. If life gives you lemons, make lemonade. Then maybe put some booze in it, and share it with others. At this point, you and your new friends can talk about the bastard that thought it was a good idea to give you lemons.

-- Mark Terranova

Debian Administrator's Handbook published -- and Freed

The Debian Administrator's Handbook by longtime Debian developers Raphaël Hertzog and Roland Mas has been published in a wide variety of formats. Due to a successful "liberation" fundraising campaign, it is also freely available. "This translation into English of the fifth edition of the French “Cahier de l'Admin Debian” (published by Eyrolles) has been crowdfunded, and the results are just released. The funding campaign was so successful that the book is even published under not one but two free licenses (GPL-2+ and CC-BY-SA-3). It is available as paperback, in several electronic formats for easy consumption, and even browsable online from the website. And of course, it's also been made available to Debian users in the "debian-handbook" package."

Red Hat Celebrates 10 Years of Red Hat Enterprise Linux

It's been 10 years since Red Hat first released Red Hat Enterprise Linux. Here's a press release.

Newsletters and articles of interest

Notes from the Ubuntu Developer Summit (The H)

The H rounds up reports from the Ubuntu Developer Summit that is currently being held in Oakland, California. Chris Kenyon, Canonical's Vice President of OEM Services, reports that Ubuntu shipped pre-installed on 8-10 million computers last year and predicted that it would ship on 18 million next year (which would be 5% of the market, he said). Also: "Ubuntu developers are planning to fork the GNOME Control Center to create their own Ubuntu Control Center package. Other than GNOME Shell, it is planned that the installation CD for Ubuntu 12.10 "Quantal Quetzal" will include almost all core components of GNOME 3.6, including Clutter. Up to now, Clutter has been missing from the default install which had forced the Ubuntu developers to include Totem 3.0 instead of 3.4 because the newer version depends on Clutter."

Page editor: Rebecca Sobol

Development

LGM: Unusual typography

By Nathan Willis
May 16, 2012

Libre Graphics Meeting (LGM) always has a healthy serving of typography and font talks on the menu. This year in Vienna, there were several sessions on the program that showcased new and unusual approaches to getting text on screen. One was an OpenGL-based renderer that offers noticeably smoother text in video and animation, while another was a toolkit for web developers to seamlessly integrate advanced type design features into modern web pages — including multi-color fonts.

Fonts and the GPU

Google's Behdad Esfahbod is the author and maintainer of HarfBuzz, the OpenType layout engine in use by Gecko and many other free software projects. HarfBuzz takes a sequence of Unicode characters and returns a set of properly-arranged glyphs that corresponds to them in the current language and script. For simple alphabets, the process of selecting the right output is rarely more complex than looking for pre-composed versions of any accented characters, then deciding how to compose any missing ones out of base glyphs and accents. But for complex scripts like Arabic, the shaping, positioning, and cursive joining process is tricky enough to require a complete shaping engine.

[Behdad Esfahbod]

Esfahbod gave a talk about HarfBuzz, which is a well-established project, but he also presented his personal side-project GLyphy, which uses OpenGL to render smooth text in video and animation. He wrote GLyphy in an attempt to circumvent a common rendering problem visible with text rendered in animations: the stuttering of outlines whenever a piece of text moves on screen. The root of the problem, he explained, was that most such text was re-rendered for each frame using the traditional rasterization process designed for still images (on screen or in print). When the text moves in every frame, the anti-aliasing algorithm picks a different set of pixels to shade, and the sharp outlines of the glyphs make visible jumps.

Esfahbod's approach was to ditch the algorithm most frequently used to rasterize the glyphs: coverage-based anti-aliasing. In this approach, the program calculates the portion of each pixel covered by the outline, and uses that percentage to shade each pixel. The fact that the algorithm is pixel-oriented is what causes the jitter when translating or rotating text in an animation. GLyphy instead uses a signed-distance-field (SDF) algorithm to shade the glyphs, which is independent of scaling and other transformations. SDF calculates the distance from each pixel to the nearest point on the outline; together those values form an anti-aliased outline of the glyph — but one that looks good at high resolutions.
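
The core idea can be illustrated on the CPU with the simplest possible outline, a circle. The Python sketch below is not GLyphy's code, which runs on the GPU against arc approximations of real glyph outlines; the function names and the one-pixel linear ramp used to turn a distance into an opacity are assumptions chosen for illustration.

```python
import math


def signed_distance_to_circle(x, y, cx, cy, radius):
    """Negative inside the circle, zero on the outline, positive outside."""
    return math.hypot(x - cx, y - cy) - radius


def coverage_from_distance(distance, pixel_size=1.0):
    """Map a signed distance to an opacity in [0, 1].

    The ramp is one pixel wide and centered on the outline, so a sample
    exactly on the outline is half covered.
    """
    alpha = 0.5 - distance / pixel_size
    return max(0.0, min(1.0, alpha))


# Shade a few samples along the x axis for a circle of radius 5 at the origin.
for x in (0.0, 4.0, 4.75, 5.0, 5.25, 6.0):
    d = signed_distance_to_circle(x, 0.0, 0.0, 0.0, 5.0)
    print(x, coverage_from_distance(d))
```

Because the opacity depends only on the distance to the outline, moving or rotating the shape changes the shading smoothly instead of snapping to a new set of pixels.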

SDF is math-intensive, so GLyphy leverages OpenGL to speed up the process. It also decomposes the Bézier curves used in most outline fonts into arc segments, because arcs are easier to compute distances against. Unlike other OpenGL renderers, GLyphy sends the glyph outlines to the GPU in vector format rather than copying pre-rendered samples into textures. The result, as Esfahbod demonstrated, is very smooth text animations. Getting decent performance is not quite as simple, he said: GLyphy hits lots of driver bugs, and the specific font and text directly affect both rendering speed and memory use. Consequently, making good use of GLyphy requires some knowledge of the content and the graphics hardware, so it will probably never be a general-purpose text renderer. For now, it remains experimental, but other projects may certainly find things to learn from it.

Web typography gone wild

Ana Carvalho and Ricardo Lafuente of Manufactura Independente presented a session on web typography with open fonts and open source software. Manufactura Independente handled the recent redesign of the Open Font Library web site, which they demonstrated during their discussion of other online resources. But in addition to fonts themselves, their talk focused on a set of jQuery-based JavaScript libraries that implement new and (for now) unusual text rendering options for the web.

[Carvalho and Lafuente]

Covered in the talk were FitText.js, which auto-scales text to fit a given box element (regardless of the screen orientation or size) and Lettering.js, a library that automatically splits typographic elements into individual CSS classes so that they can be manipulated precisely. Both are open source products from design firm Paravel. Also discussed were Kerning.js, which offers CSS rules for altering the size, kerning, weight, slant, and color of text elements — by the word, individual character, or by matching letter patterns — and Ligature.js which allows you to replace specific letter sequences with ligatures, beyond the basic fl, fi, ae, and AE ligatures that the developers say can be reliably rendered in vanilla HTML.

It is certainly possible already to use CSS to manually tweak individual page elements, right down to wrapping individual characters in <span> tags, but web designers will appreciate the simplicity of using a library instead.
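
The manual approach, wrapping every character in its own <span> so that CSS can target it, can be sketched in a few lines of Python that generate the markup. This is only an illustration of the idea, not the code of any of the libraries mentioned; the function name and the "char" class prefix are assumptions chosen for the example.

```python
from html import escape


def wrap_characters(text, prefix="char"):
    """Wrap each character of text in a numbered <span> element.

    Hypothetical helper: the class names (char1, char2, ...) are an
    assumption made for illustration.
    """
    spans = []
    for index, char in enumerate(text, start=1):
        # Escape the character so that "&" or "<" cannot break the markup.
        spans.append(f'<span class="{prefix}{index}">{escape(char)}</span>')
    return "".join(spans)


print(wrap_characters("LGM"))
# prints <span class="char1">L</span><span class="char2">G</span><span class="char3">M</span>
```

A stylesheet rule such as `.char2 { color: red; }` would then affect only the second letter. Doing this by hand for every headline is the tedium that a library removes.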

[Colorfont example]

The team also presented its own library, Colorfont.js, which implements fonts sporting multiple colors in each glyph, using CSS. The process is specific to the typeface used — you start with a base font and create an "overlay" version separately. The overlay font should match up to the glyphs of the original exactly, but only include the portions of each letter to be shown in the second color. Colorfont.js then allows you to mark up text with the colorfont style class, which it will render with the two versions of the font superimposed.

Here again, a designer could use other means to accomplish the same end result (such as generating a PNG image), but using Colorfont allows the two-tone text to remain actual text — visible to search engines, mouse-selectable, and everything else. Depending on the size of the image it replaces, it is likely to save load time as well, plus fall back gracefully on image-less browsers or those without JavaScript.
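
The superimposition idea can be sketched as markup that draws the same text twice, once in the base font and once in the overlay font, with the second copy positioned exactly over the first. The Python below generates such markup; it is a guess at one way to get the effect with plain CSS, not a description of how Colorfont.js is implemented. The function names, the inline styles, and the use of aria-hidden on the overlay copy are all assumptions, and font names are assumed to contain no quote characters.

```python
from html import escape


def layer_style(font, color, extra=""):
    """Build an inline CSS declaration for one layer of the text."""
    return f"font-family: '{font}'; color: {color};{extra}"


def two_tone_markup(text, base_font, overlay_font, base_color, overlay_color):
    """Return HTML that draws text twice: base font below, overlay on top.

    Hypothetical sketch; not the markup produced by any real library.
    """
    safe = escape(text)
    base = layer_style(base_font, base_color)
    # The overlay is absolutely positioned so it sits exactly over the base.
    overlay = layer_style(overlay_font, overlay_color,
                          " position: absolute; left: 0; top: 0;")
    return (
        '<span style="position: relative; display: inline-block;">'
        f'<span style="{base}">{safe}</span>'
        f'<span style="{overlay}" aria-hidden="true">{safe}</span>'
        "</span>"
    )
```

Because both layers are real text, the content stays selectable and indexable, which is the advantage over a pre-rendered image that the presenters emphasized.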

Carvalho and Lafuente also led a design workshop in which participants created an overlay font to match an existing base font. I was unable to attend, as it was scheduled at the same time as a workshop I was proctoring. However, I was able to talk to several of the attendees after the fact and see their results. The hands-on experience certainly makes the library more useful to the attendees.

Final word

Having worked with font design and development for the past two years, I am keenly aware of how easy it is to take that work for granted. Users and developers both count on fonts appearing on screen as desired with little intervention, so it can be quite a reality check to dig into the glyph shaping and rendering process and see how many steps are involved. Esfahbod's work with GLyphy may remain an experimental exercise for the foreseeable future, but as more and more processing gets offloaded from the CPU onto graphics hardware, it is also possible that someday OpenGL rendering will be commonplace, in which case we will be glad that open source software is ready.

Ironically, the jQuery font libraries showcased by Carvalho and Lafuente at the same event are really backward-looking, in the sense that they restore a level of control and precision to typography that was commonplace centuries ago, but lost in the transition to digital fonts and web publishing. After all, illuminated manuscripts are some of the oldest multi-color typography projects in history; it is quite entertaining to see JavaScript and CSS used to reproduce them today.

[Thanks to the Libre Graphics Meeting for assistance with travel to Vienna.]

Brief items

Quotes of the week

I couldn't have told you the first thing about Java before this problem. I have done, and still do, a significant amount of programming in other languages. I've written blocks of code like rangeCheck a hundred times before. I could do it, you could do it. The idea that someone would copy that when they could do it themselves just as fast, it was an accident. There's no way you could say that was speeding them along to the marketplace. You're one of the best lawyers in America, how could you even make that kind of argument?
-- Judge Alsup (Oracle v. Google) has a clue

Janet Jackson's wardrobe malfunction was merely notorious, whereas Adolph Hitler was an infamous tyrant. See what I mean? [...]

So that's why I switched it. I didn't feel that the Unicode bug deserved to be labeled as evil incarnate, a vile abomination deserving utter reprobation. I didn't want to be seen to make a moral judgment.

-- Tom Christiansen

ConnMan 1.0 released

Marcel Holtmann has announced the 1.0 release of ConnMan, an internet connection manager targeted at embedded systems. It is designed to be small, require few resources, and handle a wide array of connection types (wired, wireless, cellular) with IPv4 and IPv6 support, along with proxy handling, tethering support, statistics gathering, and more. "The ConnMan 1.0 release means that we from now on provide a stable D-Bus API. From now on, the API will only be extended and we will keep backward compatibility going forward. [...] ConnMan is a monolithic system that includes many features right out of the box without requirement of any external daemons or programs." (Thanks to Gustavo Padovan and Daniel Wagner.)

Kdenlive 0.9 released

Version 0.9 of the Kdenlive video editor has been released. Improvements in this release include the ability to align multiple video tracks using the audio stream, a rewritten effects subsystem, improved importing of online media, and a number of usability enhancements.

Lotus Symphony code for OpenOffice coming soon

IBM has announced that the paperwork has been signed and that the contribution of the Lotus Symphony code to OpenOffice will happen shortly. "The successful delivery of Apache OpenOffice 3.4 has enabled us to finalize our grant with the Apache Software Foundation and initiate this new phase of effort within the community. This is about envisioning a future for Apache OpenOffice that builds on the best code we can offer together with the best developers who have mastered it." For those wondering about what this code offers, there is a Symphony Contribution wiki page describing the most interesting features.

Full Story (comments: 70)

OrientDB 1.0 released

The OrientDB "NoSQL graph-document database management system" project has produced its 1.0 release. New features include a new multi-master replication scheme, a new object database interface, an undo mechanism, server-side scripting, and more.

PowerTOP v2.0 Release

The PowerTOP power consumption monitor has released version 2.0. "Version 2.0 has several new key features and updates. The first big change is the use of a hardened library called libparseevents, for accessing the kernel "perf" infrastructure. With this enhancement, we are able to provide much more accurate data, and be more flexible with any future kernel development. There has been a great deal of work done in the area of CPU data measurement and diagnostics. Full accurate support was added for CPU idle, frequency, and power traces, along with expanded frequency state reporting for CPUs with more than 10 states. With these additions, PowerTOP v2.0 now gives a clearer picture of how programs affect CPU utilization, and the impact on important power-saving sleep states." In addition, version 2.0 adds the ability to export HTML and CSV reports, has a re-designed, tab-oriented UI, and more.

PulseAudio 2.0 released

Version 2.0 of the PulseAudio sound system is out. New features include support for multiple sample rates, jack detection, a number of VOIP support improvements, a virtual surround module, and more; see the release notes for details.

Full Story (comments: 10)

pypy-stm: a parallel Python

PyPy developer Armin Rigo has announced "the very first Python 2.7 interpreter to run existing multithreaded programs on multiple cores." He also advises that it is "horribly slow," so it is mostly of interest to people wanting to learn about the technology and help make it better. This documentation file contains some good information on how the software transactional memory system used in pypy-stm works and what can be done with it.

Full Story (comments: 1)

tig 1.0 released

"Tig" is a curses-based front end to the git version control system; the 1.0 release is now available. It includes a long list of new features and customization options; click below for the details.

Full Story (comments: none)

Newsletters and articles

Development newsletters from the last week

Open Source Robotics Foundation incorporated (The H)

The H has a story about the launch of the Open Source Robotics Foundation (OSRF). "The mission of the non-profit organisation is to support the development, distribution, and adoption of open source software for use in robotics research, education, and product development." Spearheading the OSRF is Willow Garage, whose Robot Operating System (ROS) we covered in January 2012.

Open source Java moving to Linux, AIX on PowerPC (IT World Canada)

IT World Canada is reporting that a team from IBM and SAP is working to bring support for PowerPC processors to OpenJDK, on Linux and on IBM's AIX. "'This reference implementation can then be used by IBM and SAP to provide their commercially licensed Java offerings in much the same way in which Oracle offers its Oracle JDK product based on OpenJDK,' [Volker] Simonis said. 'The big advantage for the open source community is that everybody (i.e. Linux distributors like Debian, Red Hat, or Ubuntu) will be able to build and provide free and state-of-the-art versions of Java based on the new OpenJDK platform ports. And of course they are highly welcome to engage in the project as well.'" OpenJDK would replace IBM's proprietary JDK as the leading Java implementation on PowerPC. The project was first proposed May 7 on the OpenJDK discussion list.

Page editor: Jonathan Corbet

Announcements

Brief items

2012 Linux Foundation T-shirt Contest

The Linux Foundation has announced the 3rd annual T-shirt design contest. "This year’s theme is “Inspired by Linux.” The deadline for submissions is June 8, 2012 at 11:55 p.m. PT."

New Books

The Artist's Guide to GIMP, 2nd Edition--New from No Starch Press

No Starch Press has released "The Artist's Guide to GIMP, 2nd Edition" by Michael J. Hammel.

Full Story (comments: none)

Rails Recipes: Rails 3 Edition--New from Pragmatic Bookshelf

Pragmatic Bookshelf has released "Rails Recipes, Rails 3 Edition" by Chad Fowler.

Full Story (comments: none)

The Rails View--New from Pragmatic Bookshelf

Pragmatic Bookshelf has released "The Rails View" by Bruce Williams and John Athayde.

Full Story (comments: none)

Calls for Presentations

Tracing Micro-conference at LPC2012

There will be a tracing micro-conference taking place during LinuxCon, San Diego (August 28-30, 2012). "Our intent is to gather people involved in development and users of tracing tools and trace analysis tools to allow discussing the latest developments, allow users to express their needs, and generally ensure that developers and end users understand each other." The call for presentations ends soon.

Full Story (comments: none)

hack.lu 2012 - Call for Papers

Hack.lu takes place October 23-25, 2012 in the Grand-Duchy of Luxembourg. The call for papers deadline is July 15.

Full Story (comments: none)

Call for Papers - PostgreSQL Conference Europe 2012

The PostgreSQL Conference Europe 2012 takes place October 23-26, 2012 in Prague, Czech Republic. The call for papers deadline is August 15.

Full Story (comments: none)

PyCon DE 2012 - Call for Papers

PyCon DE will be held in Leipzig, Germany, October 29-November 3. "The conference language will be German. However, talks in English by non-native German speakers will be accepted." A call for tutorials is also out.

Full Story (comments: none)

EHSM 2012 CFP

The Exceptionally Hard & Soft Meeting "exploring the frontiers of open source and DIY" will take place December 28-30, 2012 in Berlin, Germany. The call for presentations deadline is November 21.

Upcoming Events

Events: May 17, 2012 to July 16, 2012

The following event listing is taken from the LWN.net Calendar.

Date(s) | Event | Location
May 13 - May 18 | C++ Now! | Aspen, CO, USA
May 17 - May 18 | PostgreSQL Conference for Users and Developers | Ottawa, Canada
May 22 - May 24 | Military Open Source Software - Atlantic Coast | Charleston, SC, USA
May 23 - May 25 | Croatian Linux Users' Convention | Zagreb, Croatia
May 23 - May 26 | LinuxTag | Berlin, Germany
May 25 - May 26 | Flossie 2012 | London, UK
May 28 - June 1 | Linaro Connect Q2.12 | Gold Coast, Hong Kong
May 29 - May 30 | International conference NoSQL matters 2012 | Cologne, Germany
June 1 - June 3 | Wikipedia & MediaWiki hackathon & workshops | Berlin, Germany
June 6 - June 8 | LinuxCon Japan | Yokohama, Japan
June 6 - June 10 | Taiwan Mini DebConf 2012 | Hualien, Taiwan
June 7 - June 10 | Linux Vacation / Eastern Europe 2012 | Grodno, Belarus
June 8 - June 10 | Southeast LinuxFest | Charlotte, North Carolina, USA
June 9 - June 10 | GNOME.Asia | Hong Kong, China
June 11 - June 15 | YAPC North America | Madison, Wisconsin, USA
June 11 - June 16 | Programming Language Design and Implementation | Beijing, China
June 13 - June 15 | 2012 USENIX Annual Technical Conference | Boston, MA, USA
June 14 - June 17 | FUDCon LATAM 2012 Margarita | Margarita, Venezuela
June 15 - June 16 | Devaamo summit | Tampere, Finland
June 16 | Debrpm Linux Packaging Workshop in the Netherlands | The Hague, Netherlands
June 26 - June 29 | Open Source Bridge: The conference for open source citizens | Portland, Oregon, USA
June 26 - June 29 | PostgreSQL Conference | Denver, CO, USA
June 26 - July 2 | GNOME & Mono Festival of Love 2012 | Boston, MA, USA
June 30 - July 6 | Akademy (KDE conference) 2012 | Tallinn, Estonia
July 1 - July 7 | DebCamp 2012 | Managua, Nicaragua
July 2 - July 8 | EuroPython 2012 | Florence, Italy
July 7 - July 12 | Libre Software Meeting / Rencontres Mondiales du Logiciel Libre | Geneva, Switzerland
July 8 - July 10 | SouthEast LinuxFest | Charlotte, NC, USA
July 8 - July 14 | DebConf12 | Managua, Nicaragua
July 9 - July 11 | GNU Tools Cauldron 2012 | Prague, Czech Republic
July 10 - July 15 | Wikimania | Washington, DC, USA
July 11 - July 13 | Linux Symposium | Ottawa, Canada

If your event does not appear here, please tell us about it.

Page editor: Rebecca Sobol

Copyright © 2012, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds