digiKam nearing its 2.0 release

July 15, 2011

This article was contributed by Nathan Willis

DigiKam has always stood out among the Linux photography tools because it incorporates the features of what are often two separate tools: the photo manager and the raw image editor. When the KIPI plug-in API is added into that list, the feature set can grow quite large. DigiKam announced the first release candidate for version 2.0.0 recently and, as one might expect, a host of new features dominates the new builds.

[digiKam editor]

DigiKam is a KDE program, but apart from dependencies on core KDE and Qt libraries, it operates in stand-alone mode, so it is perfectly usable in GNOME or other desktop environments. For example, the application's photo collection management features take center stage: it manages large image collections and a wealth of metadata about each entry. But it maintains this information in its own application-specific database (user-configurable for either MySQL or SQLite, and with a built-in migration tool lest one decide to change), rather than relying on Tracker, Zeitgeist, or another external indexer. This approach also allows digiKam to keep an eye on multiple discrete directory locations, rather than requiring you to move your library to a central location (or copying it to a separate location for you) as many of its competitors do.

The database-backed collections framework gives digiKam powerful search capabilities. It is aware of IPTC, EXIF, XMP, and Makernote metadata tags, plus user-defined tags, labels, and ratings, geolocation information, filesystem data (file size, modification time, etc.), and more. The search tools enable you to drill down into large collections with compound queries. DigiKam even allows you to create multiple "metadata templates" that are pre-filled with frequently used information. Since 21st Century Laziness is often the primary reason users do not make use of the metadata formats and ontologies available to them, this helps keep the collection organized.
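To make the idea of a compound query concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table layout, column names, and sample rows are all hypothetical and chosen only for illustration; they are not digiKam's actual database schema.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a toy (hypothetical) schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE images (name TEXT, rating INTEGER, camera TEXT)"
    )
    conn.execute("CREATE TABLE image_tags (name TEXT, tag TEXT)")
    conn.executemany(
        "INSERT INTO images VALUES (?, ?, ?)",
        [
            ("beach.jpg", 4, "Canon EOS 5D"),
            ("city.jpg", 2, "Canon EOS 5D"),
            ("hills.jpg", 5, "Nikon D90"),
        ],
    )
    conn.executemany(
        "INSERT INTO image_tags VALUES (?, ?)",
        [
            ("beach.jpg", "vacation"),
            ("city.jpg", "vacation"),
            ("hills.jpg", "vacation"),
        ],
    )
    return conn


def find_images(conn, tag, min_rating, camera):
    """Combine three criteria (tag, rating, camera) in a single query."""
    rows = conn.execute(
        "SELECT i.name FROM images AS i "
        "JOIN image_tags AS t ON t.name = i.name "
        "WHERE t.tag = ? AND i.rating >= ? AND i.camera = ? "
        "ORDER BY i.name",
        (tag, min_rating, camera),
    )
    return [row[0] for row in rows]
```

Calling find_images(build_demo_db(), "vacation", 3, "Canon EOS 5D") narrows three tagged images down to the one that also satisfies the rating and camera conditions.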

Much of digiKam's functionality is implemented through KIPI plug-ins. The KIPI API is shared with other KDE-based image programs, such as Gwenview and KPhotoAlbum. The official digiKam packages use the plug-ins to implement many of the export and display functions, plus auxiliary functions such as DNG conversion. Often, new functionality is first implemented as a plug-in, such as support for editing a new type of metadata.

The 2.0.0 release candidate code can be downloaded from the project's SourceForge page. There, a source code bundle and a 32-bit Windows installer are available. Linux users wishing to test binary packages will need to find a distribution-specific build provided by a downstream maintainer — digiKam maintains a list of known packages, but does not currently release its own. FreeBSD and Mac OS X builds are also available from third parties.

Image organization

On the image management side of the application, there are a half-dozen or so new features in this release, several of which are the result of Google Summer of Code projects integrated into the main code base during a sprint this spring. The first is XMP sidecar support. XMP sidecars are metadata files that are associated with image formats that cannot store metadata internally. The sidecar files typically retain the base of the original filename, but use the .xmp extension.
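The naming convention described above can be sketched in a few lines of Python. The function name and the append option are assumptions made for illustration; the option covers tools that instead add .xmp after the full filename, and which convention a given application follows should be checked against its documentation.

```python
from pathlib import Path


def sidecar_path(image_path, append=False):
    """Return a plausible XMP sidecar path for an image (hypothetical helper).

    append=False: replace the image extension  (IMG_0001.CR2 -> IMG_0001.xmp)
    append=True:  keep it and add .xmp         (IMG_0001.CR2 -> IMG_0001.CR2.xmp)
    """
    path = Path(image_path)
    if append:
        return path.with_name(path.name + ".xmp")
    return path.with_suffix(".xmp")
```

Either way, the sidecar sits next to the image, so moving or copying a photo means moving both files together.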

As mentioned earlier, digiKam supports its own local metadata, such as user-assigned tags and ratings. The 2.0.0 series adds a pair of new label types: "color labels" and "pick labels." The pick labels appear as red, yellow, and green flags, and their meaning is described in tooltips as "rejected," "pending," and "accepted," respectively. Color labels are visible as a colored highlight around the thumbnail in the image browser. The colors available include the six basic primary and secondary colors, plus black, white, and gray, and there is no pre-defined semantic meaning assigned to any of them.

Considering that digiKam already gives users a wealth of other ways to sort and mark up collections (tags, star ratings, albums), it might seem odd to add more. But I think it is helpful to have multiple, orthogonal ways to mark up a collection, simply to sift through it on multiple factors — particularly when the sorting process may involve transitory issues not suitable for the assignment of a persistent tag. Consider trying to find the "best" image to accompany a particular blog post. Star ratings might reflect overall picture quality, which would leave the color labels open to use in some other part of the decision (such as illustrating different parts of the story). Thus sorted, the picks might come in handy for another user (e.g., an editor) to select among the alternatives. Attempting to do the same thing with star ratings alone or with tags would get confusing.

In addition to the new sorting dimensions, 2.0.0-RC introduces keyboard shortcuts for assigning common tags, and it allows the user to select and "group" images in the thumbnail browser. Groups of images seem to operate much like a multi-item selection, in the sense that the user can apply changes to the entire group simultaneously, but they do not disappear with a stray mouse click. The tag-assigning keyboard shortcuts are entirely user-configurable, provided that one does not choose a key combination also captured by the window manager or another system component.

The so-called "reverse geocoding" feature is also new. This allows the user to look up human-readable place names to associate with latitude and longitude coordinates typically assigned automatically by GPS tagging software. The upshot is simply metadata that is easier to browse and easier to search.
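In its simplest form, reverse geocoding amounts to finding the named place closest to a coordinate pair. The sketch below does this with the haversine great-circle distance over a tiny hard-coded table. The table and the function names are hypothetical; a real tool would consult a much larger geographic name database or an online service.

```python
import math

# Hypothetical gazetteer: (name, latitude, longitude) in decimal degrees.
PLACES = [
    ("Paris, France", 48.8566, 2.3522),
    ("Berlin, Germany", 52.5200, 13.4050),
    ("Prague, Czech Republic", 50.0755, 14.4378),
]


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def reverse_geocode(lat, lon, places=PLACES):
    """Return the name of the known place nearest to (lat, lon)."""
    nearest = min(places, key=lambda p: haversine_km(lat, lon, p[1], p[2]))
    return nearest[0]
```

A photo tagged by a GPS logger at latitude 48.86, longitude 2.29 would come back labeled "Paris, France", which is far easier to search for than the raw numbers.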

Technical and editing changes

[digiKam face tagging]

Sorting is not the only area of improvement in this release, however. Several new technical features make their debut as well, starting with face recognition (yes, the facial recognition data can be searched on, but it constitutes a substantially new feature in its own right). Users can add "face tags" in two ways: either by drawing rectangles on faces in individual images, or by allowing digiKam to scan the entire image collection and automatically mark what it determines to be faces.

At the moment, the documentation of the feature is scant, but the workflow seems to involve marking as many faces as you can stand to manually, adding a name for each. The names are converted to "People tags" in the general tag database. Upon a blind scan-and-identify run, digiKam will compare the unknown faces to the already tagged-and-labeled specimens. Obviously, the higher the percentage of your suspects you tag, the more easily digiKam will recognize them in the future.

In the image editing arena, this release of digiKam uses an updated version of the LibRaw library (0.13.5), which adds a few noteworthy features of its own. LibRaw began as an attempt to massage Dave Coffin's dcraw utility into an API-stable shared library usable by other applications. This release, however, also imports several new advanced raw decoding options originally found in the RawTherapee application. Owners of Sigma DSLRs will also be happy to learn that the more recent version of LibRaw includes support for their cameras' Foveon sensors. The Foveon uses a three-layer light sensor that captures RGB data at the same grid location, as opposed to the matrix of single-color detectors found in most other cameras. As a result, entirely different decoding mechanisms are required. Canon is also reportedly working on a three-layer sensor, so LibRaw and digiKam support for the decoding algorithms is important news.
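The difference between the two sensor designs can be illustrated with a toy model. The sketch below assumes the common RGGB arrangement for the single-color detector matrix; that layout and all of the function names are assumptions for illustration and are not taken from LibRaw.

```python
def bayer_channel(row, col):
    """Channel recorded at (row, col) under an assumed RGGB mosaic layout."""
    if row % 2 == 0:
        return "R" if col % 2 == 0 else "G"
    return "G" if col % 2 == 0 else "B"


def mosaic_capture(scene):
    """Single-color detectors: each site keeps only one of its three channels."""
    captured = []
    for r, line in enumerate(scene):
        out = []
        for c, (red, green, blue) in enumerate(line):
            channel = bayer_channel(r, c)
            value = {"R": red, "G": green, "B": blue}[channel]
            out.append((channel, value))
        captured.append(out)
    return captured


def layered_capture(scene):
    """Three-layer sensor: every site keeps all three channels."""
    return [[tuple(pixel) for pixel in line] for line in scene]
```

With the mosaic, the two missing channels at each site must be interpolated from neighboring sites during decoding; with the layered capture there is nothing to interpolate, which is one reason the two kinds of raw file need different decoding paths.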

DigiKam has also added support for file versioning in the editor component. As with all raw photo editors, the editing process is non-destructive to the original image, but most applications do not easily allow the user to save multiple versions of the "edit list" file. DigiKam's editor component allows you to view the version history as a flat list (similar to the history pane in GIMP), or as a tree that preserves individual branches created when you roll back and make different edits. DigiKam's editing capabilities lie somewhere in between the color- and exposure-adjustment-only functions found in a typical raw converter and those of a full-blown raster editor. There are a few filter effects and simple touch-up tools (such as a red-eye corrector), but it also allows you to open any image in an external editor application from the right-click context menu.
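The relationship between the flat list and the tree view of a version history can be sketched with a small data structure. The class and method names here are hypothetical and say nothing about how digiKam stores versions internally.

```python
class Version:
    """One saved state of an image; a hypothetical node in a version tree."""

    def __init__(self, label, parent=None):
        self.label = label
        self.parent = parent
        self.children = []

    def edit(self, label):
        """Apply an edit to this version, producing a new child version."""
        child = Version(label, parent=self)
        self.children.append(child)
        return child

    def history(self):
        """Flat list of labels from the original image down to this version."""
        labels = []
        node = self
        while node is not None:
            labels.append(node.label)
            node = node.parent
        return labels[::-1]


original = Version("original")
cropped = original.edit("crop")
sharpened = cropped.edit("sharpen")
# Rolling back to the cropped version and editing again starts a new branch.
monochrome = cropped.edit("black and white")
```

Here sharpened.history() and monochrome.history() are the two flat lists, while cropped.children holds both branches, so neither line of edits is lost when the user rolls back.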

Finally, digiKam has always supported easy export of images to devices and other applications, and this release adds two: the Czech web service RajCe, and MediaWiki. The MediaWiki exporter is compatible with Wikimedia properties (including Wikipedia) that require authenticated user accounts to upload content.


By and large, the new additions to digiKam are welcome. Most, such as pick and color labels, keyboard shortcuts, or reverse geocoding, are designed to make searching and managing your images a simpler and more intuitive task. A few of the new features, however, I still find difficult to use.

The face recognition process, for example, is awkward. Drawing rectangles over people's faces is simple enough, but the pop-up window that appears once you do so is unhelpful: it pre-fills the top line with "Unknown," which you might expect to leave the newly-marked face in a blank state, but instead creates a People Tag named "Unknown." The other two buttons on the pop-up window are "Confirm" and "Remove" — but Confirm appears only to remove the unknown face from the set of faces to be scanned by the recognition software. Add to that the fact that by default all of the face tags are invisible to the eye, and you have a confusing user experience. Perhaps the documentation will improve on the situation.

[digiKam tabbed interface]

Speaking of awkwardness, I have always disliked vertical side-tabs in GUIs, in any application. At best they are difficult-to-read labels, and at worst they make it unclear which portions of the UI belong to the "tab" and which do not. That problem goes double for interfaces that feature a set of vertical tabs on the left hand side and a separate set on the right.

For horizontally-written languages, vertical tabs give you sideways text labels (running in two different directions depending on whether they are stacked on the left- or right-hand edge), and applications that use them invariably also use normal horizontal menus and toolbars across the top of the window, introducing ambiguity as to which edge of the pane controls its contents. DigiKam inflicts this on you, plus it provides no text labels for the un-selected tabs, forcing you to hover the cursor over them to discern the meaning of the cruelly-tiny icons. I suppose the only good thing about this UI design is that it is a clear sign that the application is filled to the brim with features. Still, I wouldn't shed any tears if it went away.

Apart from the interface woes, though, I found the digiKam 2.0.0 release candidate remarkably stable and fast. As always, it scores points for managing sizable collections of images and for providing a myriad of ways to arrange and edit content as the situation dictates. The final release of 2.0.0 is slated for "late July," so it should be a short wait for what looks to be a great update.

Comments (4 posted)

Brief items

Quotes of the week

But what brings us here today is a gentle reminder that when you write code this bad, you can actually kill people.

I'm leaving the DB-dump images in the following quote as a reminder of just how insane this code was. Think of these as skulls on sticks at the edge of the wasteland, saying "Never pass this way again".

-- Jamie Zawinski

It annoys me to no end that I even feel I need to write these down, but apparently I do. So, here are a couple of simple rules to guide your behaviour around here:

  1. Do not call other developers nor users idiots nor other derogatory terms on the mailing list.

  2. Do not use Twitter or other public broadcasting systems to call other developers or users idiots or other derogatory terms.

That's all, pretty simple actually. Most well-adjusted people would not stand up in a crowd of people and start calling people around them idiots. Just because there is a monitor and a network cable separating you from the crowd doesn't make it ok, and I am tired of it.

-- Rasmus Lerdorf

People *state* that it would be good to have more Parrot developers on Windows, but they really would like those developers to be *somebody else*...

Our mantra: "Parrot is a virtual machine aimed at all dynamic languages." The reality: "Parrot is a virtual machine aimed at all dynamic languages, provided you're on Linux."

-- James E Keenan

Reality showed us that the balance of maintaining the code of other OSs in the main repository is much more work than the work the few lines of useful code the few people of the other OSs contribute. We tried with many projects in the past, and decided against it. The needed abstractions are just hard to manage and get into our way all the time.

The only really thinkable solution for the niche OSs is to port the needed Linux interfaces to their kernels. But I guess that will never happen, and so systemd, udev, ... will probably never happen for them.

-- Kay Sievers

Comments (4 posted)

Parrot 3.6.0 "Pájaros del Caribe" Released

Version 3.6.0 of the Parrot multi-language virtual machine is available. This release cleans up some code and fixes bugs.

Full Story (comments: 2)

IBM to contribute Symphony to OpenOffice

IBM's Rob Weir, noting that "we at IBM have not been exemplary community members when it came to," has announced that IBM will be contributing its "Symphony" fork to the new Apache-based OpenOffice project. "First, we're going to contribute the standalone version of Lotus Symphony to the Apache project, under the Apache 2.0 license. We'll also work with project members to prioritize which pieces make sense to integrate into OpenOffice. For example, we've already done a lot of work with replacing GPL/LPGL dependencies. Using the Symphony code could help accelerate that work and get us to an AOOo release faster."

Full Story (comments: 36)

Telex: a new anticensorship system

The Freedom to Tinker site carries an announcement for Telex, a new approach to the circumvention of censorship of the net by national governments. "As the connection travels over the Internet en route to the non-blacklisted site, it passes through routers at various ISPs in the core of the network. We envision that some of these ISPs would deploy equipment we call Telex stations. These devices hold a private key that lets them recognize tagged connections from Telex clients and decrypt these HTTPS connections. The stations then divert the connections to anti-censorship services, such as proxy servers or Tor entry points, which clients can use to access blocked sites. This creates an encrypted tunnel between the Telex user and Telex station at the ISP, redirecting connections to any site on the Internet." There is a proof-of-concept implementation available on the Telex site.

Comments (23 posted)

Tilt: 3D web page visualization

Tilt is a Firefox extension that creates a 3D display of a page's document object model. "Unlike other developer tools or inspectors, Tilt allows for instant analysis of the relationship between various parts of a webpage in a graphical way, but also making it easy for someone to see obscured or out-of-page elements. Moreover, besides the 3D stacks, various information is available on request, regarding each node's type, id, class, or other attributes if available, providing a way to inspect (and edit) the inner HTML and other properties." There's a video available for those who want to see the eye candy in action without actually installing the extension.

Comments (9 posted)

VirtualBox 4.1 released

Version 4.1 of the VirtualBox virtualization system from Oracle is out. New features include a virtual machine cloning mechanism, support for guests with up to 1TB of RAM, better remote access, and more; see the press release and the changelog for more information.

Full Story (comments: 1)

Newsletters and articles

Development newsletters from the last week

Comments (none posted)

Blizzard: Goals for multi-process Firefox

Christopher Blizzard describes the goals for - and motivations behind - the move toward multi-process Firefox. "Physical pages of memory are allocated at the operating system layer and handed to user processes, at the process level, as virtual pages. The best way to return those to the operating system is to exit the process. It's a pretty high-level granularity for recycling memory, for very long-running browser sessions it's the only way to get predictable memory behaviour. This is why content processes offer a better model for returning memory to the operating system over time."

Comments (24 posted)

Markham: Mozilla's competitive advantages

Gervase Markham has put up a brief posting on Mozilla's advantages as he sees them, and on how those advantages could be better used. "I'm sure a large proportion of Chrome users are ex-Firefox users. In the time before Chrome, when we were the new shiny, we missed an opportunity to educate them about why Mozilla is different, why the open web is important, and why having the coolest, fastest, slickest browser around is a great thing but it's not the most important thing. So when something they perceived as cooler, faster and slicker came long, they left us for precisely the same reason they arrived. We didn't tell them why they should stay."

Comments (36 posted)

Maximum Calculus with Maxima (Linux Journal)

Linux Journal examines using the Maxima computer algebra system for doing calculus. "Putting all these techniques together, you can solve a differential equation for a given variable—for example, solve dy/dx = f(x) for y. You can do this by doing all the required algebra and calculus, but you don't really need to. Maxima has the very powerful function, ode2, which can do it in one step."

Comments (4 posted)

Screencasting Stars of the Linux World

Nathan Willis looks at some tools for capturing full-motion video and audio of your desktop. "The leading screen recorders at present are recordMyDesktop and Istanbul. The feature sets are roughly comparable: both record to Theora-encoded video with Vorbis audio, both allow you to select just the portion of the screen you are interested in recording, and work in multiple desktop environments."

Comments (11 posted)

Page editor: Jonathan Corbet

Copyright © 2011, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds