Development
Recovering data with PhotoRec and TestDisk 7
Data recovery is a topic that few people enjoy talking about, but it is a fact of life. Storage hardware fails, devices get damaged, and accidents happen. For many years, Christophe Grenier's TestDisk and PhotoRec utilities have been among the most reliable tools available for the task. Version 7.0 of both programs (which are developed and released together) was released on April 18 after a lengthy development and testing cycle, bringing support for more filesystem options and for recovering new file types.
Both TestDisk and PhotoRec operate by performing low-level scans of a hard disk's sectors, but for different purposes. TestDisk searches for lost partition and boot-sector information in order to help users recover a corrupted or damaged partition, partition table, or boot sector. PhotoRec searches partitions or entire disks for file headers that match the patterns from a user-selected list of file types.
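The header-matching idea can be sketched in a few lines of Python. This is only an illustration, not PhotoRec's implementation: the tiny signature table, the 512-byte sector size, and the find_headers() function are all assumptions made for the example, and the real program knows far more formats and performs additional validation.

```python
# A few well-known file signatures ("magic numbers"). This table is
# illustrative only; it is not PhotoRec's signature list.
SIGNATURES = {
    "gif": b"GIF89a",
    "jpg": b"\xff\xd8\xff",
    "pdf": b"%PDF-",
    "png": b"\x89PNG\r\n\x1a\n",
}

SECTOR_SIZE = 512  # assumed sector size for the example


def find_headers(image, wanted=None, sector_size=SECTOR_SIZE):
    """Return (offset, file_type) pairs for every sector of `image`
    that begins with the signature of a user-selected file type."""
    selected = sorted(wanted) if wanted is not None else sorted(SIGNATURES)
    hits = []
    for offset in range(0, len(image), sector_size):
        for file_type in selected:
            # bytes.startswith() accepts a start offset, so no slicing is needed.
            if image.startswith(SIGNATURES[file_type], offset):
                hits.append((offset, file_type))
                break
    return hits


if __name__ == "__main__":
    # Build a fake four-sector "disk" with two headers planted on sector boundaries.
    disk = bytearray(4 * SECTOR_SIZE)
    disk[512:520] = SIGNATURES["png"]
    disk[1536:1539] = SIGNATURES["jpg"]
    print(find_headers(bytes(disk)))  # [(512, 'png'), (1536, 'jpg')]
```

Checking only sector boundaries keeps the scan cheap, on the assumption that files start at the beginning of a sector; a header sitting at some other offset would be missed by this sketch.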
As the name suggests, PhotoRec was initially designed to recover lost or deleted image files from digital camera flash storage. Now, however, it can recognize more than 260 file types—although the emphasis remains on media formats, rather than (say) executables, and it is skewed toward actively developed applications. TestDisk can read and restore a wide variety of partition table formats and filesystems, including newer formats like Btrfs, and it supports mdadm RAID arrays, logical volume manager (LVM) drives, and all major flavors of full-disk encryption.
Version 7.0 was a long-in-development release. The previous stable release, 6.14, was made in July 2013. Although both programs are packaged by many desktop Linux distributions, the tools' usefulness in forensic and data-recovery tasks also makes them a popular choice for inclusion with smaller "rescue" distributions designed to boot from removable media. Downloads are available for 32-bit and 64-bit Intel systems, as is source code.
Due to the lengthy development cycle for 7.0, a number of these rescue distributions have been shipping pre-release builds of 7.0 for some time. For the most part, these pre-release builds support the same feature set as the 7.0 final release. A lot of the most recent work leading up to the release was devoted to fixing bugs—security bugs in particular.
The project highlighted three separate security reviews performed in the lead-up to the 7.0 release. One was a static-analysis scan conducted by Coverity. The second was a fuzz test performed with the american fuzzy lop (afl) fuzzer. The third was a review done by Security Assessment that turned up an exploitable buffer overflow.
Apart from the security issues fixed as a result of these reviews, the changes to TestDisk 7.0 include improved support for the exFAT filesystem (which is commonly used on flash storage media, particularly on cards over 32GB in size) and improvements to ext4 support. In particular, ext4 support now includes correct handling of 64-bit block numbers and 64KB block sizes and reading the ext4 journal. ExFAT support now includes proper handling of non-ASCII file and directory names.
PhotoRec gained support for recognizing 14 new file types, including several from free-software applications. These include Krita's native format as well as the OpenRaster format produced by Krita, MyPaint, and several other image-editing programs, the Magic Lantern Video raw-video format created by the Magic Lantern open-source camera firmware, openNURBS 3D geometry files, the Digital Imaging and Communications in Medicine (DICOM) format used in medical and scientific visualizations, and the Web Open Font Format (WOFF).
In addition, many of the file-detection functions in PhotoRec have seen improvements, such as the ability to detect the file size of a number of different formats (including MPEG, Flash video, GIF, and RealAudio). PhotoRec also includes a brute-force mode that can identify individual file fragments, potentially rescuing files that could not be located by file-header identification alone. This is a processor-hungry operation, and the new release includes some significant brute-force-mode speedups in comparison to previous releases.
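To give a sense of what detecting a file's size from its format involves, here is a minimal Python sketch that walks the block structure of a GIF until it reaches the trailer byte. It follows the published GIF block layout, but it is not taken from PhotoRec; the function names gif_length() and _skip_sub_blocks() are invented for the example, and a real recovery tool would need to cope with damaged or fragmented data.

```python
def _skip_sub_blocks(data, pos):
    """Skip a chain of length-prefixed sub-blocks; return the position just
    past the zero-length terminator, or None if the data runs out first."""
    while pos < len(data):
        size = data[pos]
        pos += 1
        if size == 0:
            return pos
        pos += size
    return None


def gif_length(data, start=0):
    """Return the length in bytes of the GIF beginning at `start`, or None
    if the structure is invalid or truncated."""
    pos = start
    if data[pos:pos + 6] not in (b"GIF87a", b"GIF89a"):
        return None
    pos += 6
    if pos + 7 > len(data):  # logical screen descriptor is 7 bytes
        return None
    flags = data[pos + 4]
    pos += 7
    if flags & 0x80:  # global color table: 3 * 2^(N+1) bytes
        pos += 3 * (2 << (flags & 0x07))
    while pos < len(data):
        block = data[pos]
        pos += 1
        if block == 0x3B:  # trailer: end of file
            return pos - start
        if block == 0x21:  # extension: one label byte, then sub-blocks
            pos += 1
        elif block == 0x2C:  # image descriptor: 9 more bytes
            if pos + 9 > len(data):
                return None
            flags = data[pos + 8]
            pos += 9
            if flags & 0x80:  # local color table
                pos += 3 * (2 << (flags & 0x07))
            pos += 1  # LZW minimum code size byte
        else:
            return None
        pos = _skip_sub_blocks(data, pos)
        if pos is None:
            return None
    return None


if __name__ == "__main__":
    # A hand-built 1x1-pixel GIF, 35 bytes long.
    gif = (
        b"GIF89a"
        + b"\x01\x00\x01\x00\x80\x00\x00"              # screen descriptor
        + b"\x00\x00\x00\xff\xff\xff"                  # two-entry color table
        + b"\x2c\x00\x00\x00\x00\x01\x00\x01\x00\x00"  # image descriptor
        + b"\x02"                                      # LZW minimum code size
        + b"\x02\x4c\x01\x00"                          # one data sub-block + terminator
        + b"\x3b"                                      # trailer
    )
    print(gif_length(gif + b"unrelated bytes that follow on disk"))  # 35
```

Knowing where a file ends lets a recovery tool write out exactly the bytes that belong to it, rather than guessing at a cutoff or appending whatever happens to follow on the disk.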
TestDisk has the potential to damage partitions if used incorrectly, although the same risk is found in any application that can be used to modify partition tables. Nevertheless, the application takes a number of steps to prevent accidental loss of information. It is interactive and menu-based, with most destructive operations requiring a confirmation to commit, and it can be run against saved disk images as well as against live disks. PhotoRec does not carry the same risk, since it requires the user to specify a separate directory for storing any recovered files.
There is a wide array of problems that can be solved between the two programs. The most common are, arguably, how to recover accidentally deleted files from a storage device and how to rescue files from a corrupted disk. I had an opportunity to do the latter recently, and found PhotoRec 7.0 to be remarkably fast at locating lost files.
Naturally, the speed of the storage device helps matters, but there are other factors that make using PhotoRec a good experience. Few alternatives present as much rescued data to the user in as short an amount of time. PhotoRec provides a running log of what it is locating while it scans through a disk, which gives you a rough idea of whether or not it is successfully reading the storage medium (and whether or not the files you are expecting to find are still there). It also defaults to running lightweight scans at the beginning, only progressing to the brute-force option if the user insists. The result is less time spent waiting for results and, potentially, less time wasted on an unrecoverable medium.
TestDisk is a bit more complicated to work with, as using it involves some detective work and some knowledge about the partitioning scheme used on the disk. For an older disk (which, naturally, is more likely to experience failures), the knowledge could be ancient history from the user's point of view. There are certainly valid alternatives to TestDisk for such disk-recovery tasks (such as gdisk), but most of them tend to expect more detailed knowledge of the appropriate recovery procedure. Where TestDisk has an advantage is in guiding users along the path to recovery one step at a time: beginning with quick and optimistic scans, then proceeding to more involved searches only where necessary.
The 7.0 release of both tools appears to be a solid upgrade. They have kept pace with newer filesystems (on Linux and other platforms) and with newer file-format developments as well. For a data-recovery utility, that is about the best that one can expect. Reviving lost storage is never a task one looks forward to, but at least it can be a reliable and comparatively painless operation.
Brief items
Quotes of the week
Sadly, these are often just empty words. "Patches welcome" can be a seemingly-polite way of saying "your problem is not important to me. Go solve it yourself and I’ll accept your solution." And telling a user to go file a bug can be equally dismissive, especially if the bug-filing process is unpleasant.
KDE Ships Plasma 5.3
KDE has announced the release of Plasma 5.3. This release features improved power management, better Bluetooth capabilities, improved Plasma widgets, a tech preview of the Plasma Media Center, big steps towards Wayland support, and more.
WordPress 4.2 released
Version 4.2 of the WordPress blogging and content-management system has been released. New features include "Press This," a bookmark-able shortcut to create a new post from the current browser page, support for additional Unicode character sets (including Chinese, Japanese, and Korean scripts as well as Emoji), and improved accessibility with screen-reader support in the administration tools.
GNU Mailman 3.0 released
GNU Mailman 3.0 has been released. "Over seven years in development, Mailman 3 represents a major new version, redesigned as a suite of cooperating components which can be used to mix and match however you want. The core engine is now backed by a relational database and exposes its functionality to other components via an administrative REST+JSON API. Our new web user interface, Postorius is Django-based, as is our new archiver HyperKitty. The core requires Python 3.4 while Postorius and HyperKitty require Python 2.7." LWN looked at Mailman 3.0 in March, and at HyperKitty in April 2014.
Synfig Studio 1.0 released
Version 1.0 of the 2D animation suite Synfig Studio has been released. Among the notable changes are an updated GUI using GTK+ 3, reworked sound support, advanced image-deformation tools, and improvements to the "bones" system used to rig up animated characters for easier movement. Libre Graphics World has posted a look at the new release that also reviews the project's progress over the past few years.
Newsletters and articles
Development newsletters from the past week
- What's cooking in git.git (April 27)
- LLVM Weekly (April 27)
- OCaml Weekly News (April 28)
- OpenStack Community Weekly Newsletter (April 24)
- Perl Weekly (April 27)
- PostgreSQL Weekly News (April 26)
- Python Weekly (April 23)
- Ruby Weekly (April 23)
- This Week in Rust (April 27)
- Wikimedia Tech News (April 27)
Rust Once, Run Everywhere
The Rust blog has posted a guide to using Rust's foreign function interface (FFI) with C code. Highlighted in particular are Rust's safe abstractions, which are said to impose no costs. "Most features in Rust tie into its core concept of ownership, and the FFI is no exception. When binding a C library in Rust you not only have the benefit of zero overhead, but you are also able to make it safer than C can! Bindings can leverage the ownership and borrowing principles in Rust to codify comments typically found in a C header about how its API should be used."
Page editor: Nathan Willis