By Jake Edge
June 6, 2012
Greg Kroah-Hartman is on something of a mission: reducing the grumpiness
factor among kernel developers, and maintainers in particular. His keynote
at LinuxCon Japan was meant to help the audience understand what the
maintainers do, and how contributors' actions can sometimes result in grumpy
maintainers. But, if contributors can follow the rules and make things
easier on him, there are a number of things that he will promise to do on
their behalf.
He called the Linux kernel the "largest software development project
ever" and noted that its development pace is
"unprecedented". From 3.0 to 3.4, some 2833 developers from
at least 373 companies contributed. In that year (from May 2011 to May
2012), the
kernel had a change rate of 5.79 changes per hour. But the rate keeps
increasing and if you look at just the 3.4 cycle, the rate is 7.21 changes
per hour. That is, of course, just patches that are accepted into the
mainline, so it
doesn't count those patches that are rejected.
Developers typically send their changes to the maintainer of the file
that is being changed. Those maintainers, who number around 700,
feed those changes up to the 130 subsystem maintainers. From there, the
patches make their way into linux-next, then to Linus Torvalds, and,
eventually, the mainline, provided they get accepted at each step along the way.
So, in order to see why some patches might not get accepted, he looked at
those that he received in the last two weeks, which coincided with the 3.5
merge window. The merge window is a time when he really shouldn't be
getting many patches. He should have received them all earlier in the
cycle
so that he could potentially
pass them on to Torvalds during the merge window. But, he said, he
got 487 patches in that two-week period, many with a wide variety of
problems, and some of those from core kernel developers who should know better.
Broken patches
With that, he launched into a description of some of the broken patches he got.
One patch was labeled "patch 48/48" (i.e. the last patch in a set of 48)
but all of the other pieces were
missing. He
also got a patch series with no order specified, which means that he would
have to guess at the order and undoubtedly get it wrong. The alternative
is to ignore the patch entirely. He also got a ten-patch set that was
missing patch two in the series.
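A submitter could catch an incomplete series before sending it. The following is a minimal Python sketch of such a check, assuming the subjects carry the "[PATCH n/m]" numbering that git format-patch generates. The function name missing_patches() is hypothetical and not part of any kernel tooling.

```python
import re

# Matches "[PATCH 02/10]" as well as variants such as "[PATCH v2 2/10]".
SUBJECT_RE = re.compile(r"\[PATCH[^\]]*?\b(\d+)/(\d+)\]")


def missing_patches(subjects):
    """Return the patch numbers that are absent from a numbered series."""
    seen = set()
    total = None
    for subject in subjects:
        match = SUBJECT_RE.search(subject)
        if not match:
            continue  # unnumbered patch: nothing to check
        number, count = int(match.group(1)), int(match.group(2))
        seen.add(number)
        total = count if total is None else max(total, count)
    if total is None:
        return []
    return [n for n in range(1, total + 1) if n not in seen]
```

Given only a "[PATCH 48/48]" subject, the function reports patches 1 through 47 as missing. For a ten-patch set without its second patch, it reports just patch 2.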
Another patch came in an email with a signature claiming that it was
confidential. He actually sees that one a lot, he said, and there is
nothing he can do with those kinds of patches. Linux development is done
in the open and you can't send a confidential email to mailing lists or get
a confidential patch merged. Obviously, it is boilerplate that gets added
somewhere in the email process, but it has to be removed before the patch
can be used.
There are also malformed patches that end up in his inbox, including those
with tabs converted to spaces. Microsoft Exchange does that, he said, so
if that's a problem in your environment, do what IBM, Microsoft, and others
do: put a Linux box in the corner for the developers to use to send their
mail. Sometimes the leading spaces have been stripped off the diff or the
diff is not in unified format. Linux developers have gotten good at raw
editing diff format, he said, which is scary in itself, but they shouldn't
have to do
that.
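The mangling described above can often be detected mechanically. The sketch below is a heuristic, not a replacement for actually applying the patch. It assumes that kernel code is indented with tabs, so a hunk line starting with eight spaces is treated as suspicious. The function name diff_problems() and that threshold are illustrative choices.

```python
import re

# Unified-diff hunk header, e.g. "@@ -10,7 +10,8 @@"; a missing count means 1.
HUNK_RE = re.compile(r"^@@ -\d+(?:,(\d+))? \+\d+(?:,(\d+))? @@")


def diff_problems(patch_text):
    """Return a list of human-readable problems found in a patch."""
    problems = []
    old_left = new_left = 0  # lines still expected in the current hunk
    saw_hunk = False
    for lineno, line in enumerate(patch_text.splitlines(), start=1):
        if old_left > 0 or new_left > 0:
            tag = line[:1]
            if tag in (" ", ""):  # context line (possibly empty)
                old_left -= 1
                new_left -= 1
            elif tag == "-":
                old_left -= 1
            elif tag == "+":
                new_left -= 1
            elif tag == "\\":  # "\ No newline at end of file"
                continue
            else:
                problems.append(f"line {lineno}: missing ' ', '+' or '-' prefix")
                old_left -= 1  # treat it as a damaged context line
                new_left -= 1
                continue
            if line[1:].startswith(" " * 8):
                problems.append(
                    f"line {lineno}: space indentation, tabs may have been converted"
                )
            continue
        match = HUNK_RE.match(line)
        if match:
            saw_hunk = True
            old_left = int(match.group(1) or 1)
            new_left = int(match.group(2) or 1)
    if not saw_hunk:
        problems.append("no unified-diff hunk header found")
    return problems
```

An empty result does not prove that a patch is good, but a non-empty one is a strong hint that the mail path has damaged it.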
Patches are also created in the wrong directory, like down in a driver
directory for example. He got a patch created in
/usr/src/linux-2.6.32 and noted that there were multiple things
wrong with that, including the age of the source tree and that it implied
it was being built by root. The latter is very dangerous as there was a
bug in the Linux build process at one point that would delete the entire
root filesystem if it was run as root. None of the core developers noticed
because
they don't build as root. Suggestions that the bug be left in as a
deterrent were ignored, but things like that can happen.
In addition, patches came in that were made against a different tree than
any he would expect. He got a patch made against the SCSI development
tree, for reasons unknown because it had nothing to do with SCSI.
Then there are those that don't have the right coding style. In one case,
the coding style was wrong and the developer acknowledged that but wanted
him to take the patch anyway. That gives the impression of "we don't
care, take our code anyway", he said. There are tools to help find
and fix those kinds of problems, so there is no excuse: "send it in the
right coding style".
Something he sees much more than he should are patches that don't even
compile. The submitter clearly hasn't even built the patch, he said. Or
there are
patch sets that break the build in 3/6 but then fix it in 6/6. He even got
a patch that broke the build in 5/8 but contained a note that sometime in
the future the submitter would send changes to fix it. Another patch had
obviously wrong
kernel-doc in
it that would cause failures building the documentation, so it was clear
that the contributor had never even tried to run the kernel-doc extraction
tool.
One of the patches he got "had nothing to do with me". It was
an x86 core kernel patch, which is not an area of the kernel he has ever
dealt with.
But the patch was sent only to him. "I get odd patches" a lot,
he said.
The last patch he mentioned was 450K in size, with 4500 lines added.
Somebody suggested that it be broken up, but in the meantime several
maintainers actually reviewed it, so the submitter didn't really learn from
that mistake.
All of this occurred during a "calm two weeks", he said.
These are examples of what maintainers deal with on a weekly basis, and they
explain why maintainers can be grumpy. That said, he did note that this is the
"best job I've ever had", but that's not to say it couldn't be
improved.
If someone sends him a patch and he accepts it, that means he may have to
maintain it and fix bugs in it down the road. So it's in his
self-interest to ignore the patch, which is an interesting dynamic, he said.
The way around that is to "give me no excuse to reject your
patch"; it is as simple as that, really.
Rules
Kroah-Hartman then laid out the rules that contributors need to follow in
order to avoid the kinds of problems he described. Use
checkpatch.pl, he said, because he will run it on your patch and it
is a waste of his time to have to forward the results back when it fails.
Send the patch to the right people and there is even a script available (get_maintainer.pl) to
list the proper people and mailing lists where a patch should be sent.
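Both scripts live in the scripts/ directory of the kernel tree and accept a patch file as an argument. A rough Python wrapper for running them before sending a patch might look like the following. The helper names are hypothetical, and the sketch assumes that a non-zero exit status from checkpatch.pl means it found something to complain about.

```python
import subprocess
from pathlib import Path


def precheck_commands(patch):
    """Build the two commands, relative to the top of a kernel tree."""
    return [
        ["scripts/checkpatch.pl", str(patch)],
        ["scripts/get_maintainer.pl", str(patch)],
    ]


def run_prechecks(kernel_tree, patch):
    """Run both scripts; return True if checkpatch.pl had no complaints."""
    clean = True
    # Resolve the patch path first, since the scripts run from the tree root.
    for command in precheck_commands(Path(patch).resolve()):
        result = subprocess.run(
            command, cwd=kernel_tree, capture_output=True, text=True
        )
        print(result.stdout, end="")
        if command[0].endswith("checkpatch.pl") and result.returncode != 0:
            clean = False
    return clean
```

Calling run_prechecks() with the path to a kernel tree and a patch file prints the style report followed by the list of maintainers and mailing lists for that patch.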
Send the patch with a proper subject that is "short, sweet, and
descriptive" because it is going to be in the kernel changelog. It
should not be something like "fix bugs in driver 1/10". In
addition, the changelog comment should clearly say what the patch does, but
also why it is needed.
Make small changes in patches. You don't replace the scheduler in one
patch, he said, you do it over five years. Small patches make it easier
for reviewers and easier for maintainers to accept. In a ten-patch series,
he might accept the first three, which means that the submitter just needs
to continue working on the last seven. The best thing to do is to make the
patch "obviously correct", which makes it easy for a
maintainer to accept it.
Echoing the problems he listed earlier, he said that patches should say
what tree they are based on. In addition, the order of the patches is
important, as is not breaking the build. The latter "seems like it
would be obvious" but he has seen too many patches that fail that
test. To the extent that you can, make sure that the patch works. It is
fine to submit patches for hardware that you don't have access to, but you
should test on any hardware that you do have.
Review comments should not be ignored, he said. It is simply
common courtesy if he takes time to review the code that those comments
should be acted upon or responded to. It's fine to disagree with review
comments, but submitters need to say why they disagree. If a patch gets
resent, it should be accompanied by a reason for doing so. When
reviewers' comments are ignored, they are unlikely to review code the next
time.
Maintainer's role
When you follow those rules there are certain things you can expect from
him, Kroah-Hartman said, and that you should expect from the other
maintainers as well. That statement may make other maintainers
mad,
he joked, but it is reasonable to expect certain things. For his part, he
will review patches within one or two weeks. Other maintainers do an even
better job than that, he said, specifically pointing to David Miller as one
who often reviews code within 48 hours of its submission.
If you don't get a response to a patch within a week, it is fine to ask him
what the status is.
He can't promise that he will always give constructive criticism, but he will
always give "semi-constructive criticism". Sometimes he is
tired or grumpy, so he can't quite get to the full "constructive" level.
He will also keep submitters informed of the status of their patch. He has
scripts that will help him do so, and let the submitter know when the patch
gets merged into
his tree or accepted into the mainline. That is unlike some other
maintainers, he said, where he has submitted patches that just drop into a
"big black hole" before eventually popping up in the mainline
three months later.
He ended by putting up a quote from
Torvalds ("Publicly making fun of people is half the fun of open
source programming. ...") that was made as a comment on one of
Kroah-Hartman's Google+ postings. The post
was a rant about a driver that had been submitted, which even contained
comments suggesting that it should not be submitted upstream. He felt bad about
publicly posting that at first, but Torvalds's comment made him rethink that.
Because kernel development is done in the open, we are taking
"personal pride in the work we do". As the code comment indicated,
the driver developer didn't think it should be submitted because
they realized the code was not in the proper shape to do so. It is that
pride in the work that "makes Linux the best engineering project
ever", he said. Sometimes public mocking is part of the process and
can actually help instill that pride more widely.
[ The author would like to thank the Linux Foundation for assistance with his travel to Yokohama. ]
By Nathan Willis
June 6, 2012
Lars Wirzenius's new backup tool Obnam was just declared 1.0. There
is no shortage of backup options these days, and in some ways
Wirzenius's decision to scratch his own itch with the project is par
for the course. But the program does offer a different feature set
than many of its competitors.
For starters, Obnam makes only "snapshot" backups — that is,
every backup looks like a complete snapshot of the system: there are
not separate "full" and "incremental" backup options. That obviates
the need to separately configure full and incremental backups on
different schedules, and it similarly simplifies the restoration
process. Any snapshot can be restored, without "walking" a chain of
deltas from a full backup starting position. In his 1.0 release
announcement, Wirzenius argues that full-plus-incremental backups make
sense for tape drives, where sequential access favors adding deltas
with incremental changes after an initial full backup, but that
hard-disk backups make the incremental delta approach pointless.
But the sneaky part is that under the hood, Obnam's snapshots are all
incremental, at least in the sense that each snapshot only records
changes since the last. The difference is that they are stored in copy-on-write (COW)
b-trees like those Btrfs uses
for filesystems. Any snapshot can be reconstructed from the b-tree,
and individual snapshots can be removed by deleting their node and
re-attaching the sub-trees. To make the COW b-tree approach
space-efficient, it uses pervasive automatic data de-duplication. The
same chunk of data on disk is re-used — both across multiple
files and over multiple snapshot generations. In addition to saving
space by not duplicating files that have not changed between
snapshots, moving or renaming large files does not result in duplicate
copies of the bits. By default, Obnam uses one-megabyte chunks,
although this setting is adjustable in Obnam's configuration file.
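The de-duplication idea can be illustrated with a small content-addressed chunk store. This is a sketch of the general technique, not Obnam's implementation: identifying chunks by their SHA-256 digest and keeping them in a dictionary are assumptions made for the example, and only the one-megabyte default comes from the description above.

```python
import hashlib

CHUNK_SIZE = 1024 * 1024  # one megabyte, the default mentioned above


class ChunkStore:
    """Stores each distinct chunk once, keyed by a digest of its content."""

    def __init__(self, chunk_size=CHUNK_SIZE):
        self.chunk_size = chunk_size
        self.chunks = {}

    def put(self, data):
        """Split data into chunks, store the new ones, return the chunk IDs."""
        ids = []
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            chunk_id = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(chunk_id, chunk)  # no-op if already stored
            ids.append(chunk_id)
        return ids

    def get(self, ids):
        """Reassemble file content from a list of chunk IDs."""
        return b"".join(self.chunks[chunk_id] for chunk_id in ids)
```

In this model, a snapshot is just a mapping from file names to lists of chunk IDs. A renamed file, or an unchanged file in a later snapshot, yields the same IDs and adds nothing to the store.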
Obnam sports other features of practical value, such as built-in GnuPG
encryption, which Wirzenius cited as a weakness in most rsync-based
backup tools. It also works with local disks or over the network,
including NFS, SMB, and SFTP. Wirzenius admits that the latter
protocol is slow, but notes that SCP (which should be faster) lacks support
for tracking information like file removals, which Obnam depends on.
In network backup setups, Obnam supports both push (client-initiated)
and pull (server-initiated) backup sessions.
Storing and retrieving
Installation requires several of Wirzenius's other code projects,
including his B-tree library larch and terminal status-update library
ttystatus, plus paramiko, a third-party SSH2 library. Most are packaged for
Debian (Wirzenius packages his own
projects for Debian), but not all of them are available in downstream
derivatives like Ubuntu. He provides an Apt repository for the
necessary packages; instructions and a link to the repository's
signing key are provided on his Obnam tutorial page.
The tutorial goes into further detail about Obnam's data
de-duplication with practical examples. You can create a new backup
with
obnam backup ~/projectfoo
and subsequently back up a parent directory with
obnam backup ~
Rather than re-save the files from projectfoo, the new backup will point
to the copy already on disk. Each backup created with Obnam is specific to
a directory; you can exclude specific subdirectories with the --exclude=
flag, but you cannot back up several directories in a single command.
The tutorial also explains that Obnam automatically saves checkpoints
every 100MB while creating a new backup. This is valuable because the
initial snapshot is always akin to a full backup in other tools, and
can be large enough to introduce failures. Checkpoints are
not guaranteed to preserve the entire data set as are regular
snapshots; they only allow an interrupted backup to resume without
starting over from scratch.
Obnam's basic usage is straightforward; the same obnam backup ~ command
that is used to start a new backup in the above example is used verbatim to
perform the subsequent snapshots. You store snapshots on a remote
repository by appending --repository=URL, specify a filesystem storage
location with --output=PATH, and specify a GnuPG encryption key with
--encrypt-with=KEYID.
You can restore a directory from a snapshot with
obnam restore --to=/mnt/recovery-volume ~
(which will restore the most recent snapshot of your home directory to
/mnt/recovery-volume). You can optionally restore just a
file or a subdirectory from the snapshot with
obnam restore ~/importantfiles --to=/mnt/recovery-volume ~
You can also select a specific intermediate snapshot by appending a
--generation=N flag to the restore command; you can get a list of the
available snapshots by running obnam generations. The obnam verify command
checks snapshot data against the files on disk, and obnam fsck checks the
internal consistency of the b-tree.
Forgetfulness
The only really confusing part of working with Obnam is the snapshot
retention process. You can tell the program to immediately delete
older snapshots by running
obnam forget --keep=7d
(which will keep the most recent seven days' worth of snapshots), or
some variation. The wrinkle is that the 7d attribute will keep only one
backup per day for those seven days, even if you run Obnam hourly. To keep
seven days' worth of hourly snapshots, you would need to specify
--keep=168h.
You can set a snapshot retention policy in your configuration file
that uses these rules in combination. You can retain hourly, daily,
weekly, monthly, and yearly snapshots by providing a comma-separated
list. For example, 12h,7d,3m will keep the last 12
hourly snapshots, the last seven daily snapshots, and the last three
monthly snapshots. The potential for miscounting sets in when the numbers
start to converge (such as the last 48 hourly snapshots and the last two
daily snapshots); Wirzenius recommends that you try your retention policy
on the command line with the --pretend option to simulate the results
before deploying it in the real world.
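To get a feel for how such rules interact, here is a simplified Python model of a retention policy. It is not Obnam's code: keeping the latest snapshot in each period is an assumption, as are the "w" and "y" unit letters, so the amount of overlap between rules may differ from what Obnam itself does.

```python
from datetime import datetime, timedelta

# strftime patterns identifying the period a snapshot belongs to.
# The "w" and "y" letters are assumed here; the text only shows h, d and m.
PERIOD_FORMATS = {
    "h": "%Y-%m-%d %H",
    "d": "%Y-%m-%d",
    "w": "%G-%V",
    "m": "%Y-%m",
    "y": "%Y",
}


def parse_policy(policy):
    """Turn "12h,7d,3m" into [(12, "h"), (7, "d"), (3, "m")]."""
    rules = []
    for item in policy.split(","):
        item = item.strip()
        rules.append((int(item[:-1]), item[-1]))
    return rules


def snapshots_to_keep(snapshots, policy):
    """Return the set of snapshot times matched by at least one rule."""
    keep = set()
    ordered = sorted(snapshots)
    for count, unit in parse_policy(policy):
        pattern = PERIOD_FORMATS[unit]
        latest_in_period = {}
        for snapshot in ordered:
            # Later snapshots overwrite earlier ones from the same period.
            latest_in_period[snapshot.strftime(pattern)] = snapshot
        if count > 0:
            keep.update(list(latest_in_period.values())[-count:])
    return keep
```

With ten days of hourly snapshots as input, a "7d" policy keeps seven snapshots in this model, while "168h" keeps 168, which is the difference described above.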
In an email, Wirzenius elaborated a bit on those tricky
multi-factor retention policies. Each retention rule (e.g., hour,
day, or month) is examined separately by Obnam, he said, and a
snapshot is kept if it matches any of the rules. So a 48h,2d
policy would match 48 hourly snapshots, then match two additional
daily snapshots, for 50 total.
As of the 1.0 release, there are a few areas that need improvement,
such as managing multiple clients storing snapshots on one repository;
Wirzenius says that further thought is required before implementing a
real "server mode." For example, two or more machines can run Obnam
and push their backups to the same remote repository, and they will be
tagged with the hostname of origin. Obnam can also be run from the
repository machine to "pull" backups from the two remote sources, but
in that case a client name must be specified for each one with the
--client-name= flag in order for Obnam to keep their metadata
separate.
In practice, my interest in backup utilities stems largely from how
rarely I make good backups on a regular basis (i.e., paranoia). I may
be atypical in that way, but the primary reasons I have abandoned most
of the backup utilities I have test driven in the past are the overhead
in keeping track of full and incremental backup schedules and the lack
of good tools for rotating old backups out without manual
intervention. Obnam scores on both of those metrics. If you have a
complicated setup with multiple machines, you may find quirks (such as
the client name issue or the speed of SFTP) working against you, but
Wirzenius is still at work on the code — and he seems quite
happy to take bug reports and questions.
By Jonathan Corbet
June 5, 2012
The UEFI secure boot mechanism has been the source of a great deal of
concern in the free software community, and for good reason: it could
easily be a mechanism by which we lose control over our own systems.
Recently, Red Hat's Matthew Garrett
described how the Fedora
distribution planned to handle secure boot in the Fedora 18 release.
That posting has inspired a great deal of concern and criticism, though,
arguably, about the wrong things.
On a system with secure boot enabled, the hardware will refuse to run any
system that has not been signed by a key it recognizes. Secure boot is
meant to be a way to thwart boot-time malware by ensuring that only trusted
(and unmodified) software gains control of the system. It is not
effective as a digital rights management (DRM) mechanism; if you can gain
control of the system, it is relatively easy to fool an operating system
into thinking that secure boot is in effect when it is not. Providing the
degree of control needed for effective DRM requires a trusted platform
module (or similar) and associated software.
Secure boot does
offer some hope of preventing a system from booting if its bootloader or
kernel has been compromised by malware, though, as the "Flame" malware
shows, there are limits to how much one can rely on signatures to keep
systems secure. Secure boot could also, unfortunately, be effective in
preventing booting if the user has tried to install an operating system of
his or her choice.
The Windows 8 logo requirements specify that secure boot must be enabled.
After some pushback, the requirements have been amended to also say that it should be possible
for the owner of a system to disable secure boot or install new keys. It
does not say that these actions need to be easy to carry out,
though. Given that changing secure boot is a firmware-level operation,
users wanting to make changes will be subjecting themselves to the very
best sort of user experience that can be created by BIOS developers. It
would be entirely unsurprising, for example, if users were forced to
hand-enter new keys as long hex strings, an unpleasant and error-prone
process.
Fedora's plan
Developers in the Fedora camp have evidently come to the conclusion that
they do not want to force their users to endure such an experience to be
able to install Fedora on their systems. So Fedora has chosen to take a
different approach. Availing themselves of the Microsoft developer
program, they will purchase a Microsoft-signed key for $99, then use that
key to sign a minimal bootloader. UEFI-enabled hardware will then consent
to boot that bootloader, which will immediately turn around and boot a
special version of the GRUB2 bootloader which, in turn, will boot the Fedora
kernel. A Fedora system set up in this mode should boot on a system with
secure boot enabled with no changes required.
The appeal of this solution is clear: Fedora will "just work" on UEFI
systems without forcing (possibly highly non-technical) users to make scary
firmware-level changes. But there is a down side as well. The signed
bootloader must ensure that it only runs GRUB2 if the GRUB2 binary has been
signed by Fedora (using its own key at this point), and GRUB2 will only boot
kernels that have been signed by Fedora. GRUB2 will need to be locked down,
and the kernel too; the kernel will, for example, only be able to load
modules that bear Fedora's signature. Given that, Red Hat's persistent
attempts to get signed module enforcement into the kernel despite some interesting resistance make more sense.
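The chain of checks described above can be modeled in a few lines of Python. This is a conceptual sketch only: real secure boot uses public-key signatures, while the example substitutes HMAC with a shared key so that it can stay within the standard library. The function names and the way the keys are passed in are likewise assumptions.

```python
import hashlib
import hmac


def sign(key, image):
    """Stand-in for a real signature: an HMAC over the image."""
    return hmac.new(key, image, hashlib.sha256).digest()


def verify(key, image, signature):
    return hmac.compare_digest(sign(key, image), signature)


def boot_chain(stages, firmware_key, fedora_key):
    """Verify each (name, image, signature) stage before handing off to it.

    The firmware checks the first stage against the key it trusts; every
    later stage is checked against Fedora's own key.
    """
    booted = []
    for index, (name, image, signature) in enumerate(stages):
        key = firmware_key if index == 0 else fedora_key
        if not verify(key, image, signature):
            raise RuntimeError(f"refusing to run unsigned or modified {name}")
        booted.append(name)
    return booted
```

A chain consisting of the minimal bootloader, GRUB2, and the kernel boots only if every stage verifies. Replacing any one image without re-signing it stops the boot at that point, which is why a locally built kernel would not run in this mode.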
Much of the coverage of this plan in the mainstream media bore headlines
like "Red Hat to pay Microsoft for the right to run Linux." Such headlines
are not strictly true; the payment ($99 total) evidently goes to Verisign,
and what is really being paid for is the ability to boot Linux with a
minimum of UEFI-caused user inconvenience. The payment for a
Microsoft-signed key raises eyebrows, but it is evidently seen as the best
response to a bad situation.
And perhaps that is just what it is. But it also raises a number of
interesting questions.
A good idea?
For example: what guarantees exist that a Microsoft-signed key will
continue to be available in the future for a reasonable price? If secure
boot takes over, and the only universally-recognized keys are those signed
by Microsoft, then Microsoft will have a monopoly on the right to boot an
operating system on future hardware. Corporations are, in general, not
known for a principled refusal to exploit that kind of position, and this
corporation, in particular, is well known indeed for the opposite sort of
behavior. One can only assume that the price of such keys would increase
in this situation.
Microsoft will also have the right to revoke keys if they can be said to be
a threat to the promises given by the secure boot mechanism. That is why
Fedora must be careful to limit anything that enables direct access to the
hardware; should somebody be able to get such access, the signed Fedora
system could be used to attack Windows systems that have secure boot
enabled. In theory, all it would take is a kernel security hole to enable
this sort of attack; that could then cause the Fedora key to be revoked. A
quick check shows about 20 kernel security updates issued by Fedora since
the beginning of this year, with multiple vulnerabilities fixed in most.
That could lead to a lot of key churn, especially if, as Alan Cox suggests, every kernel hole will require that
its certificate be revoked.
Depending on what software is run on a specific system (if it dual-boots
Windows and Linux, for example), a revoked key could
find its way into the system's "forbidden signatures" database. That would
immediately disable the booting of the signed Fedora image, essentially
crippling the machine. The amount of joy resulting from such an outcome
can be expected to be small.
Some developers have argued that Fedora's
plan is a violation of the GNU General Public License, or, at least, of the
Fedora project's own guidelines, despite Fedora's efforts to ensure that
users retain as much freedom as possible. GPL enforcement actions in this
case seem unlikely; there's no shortage of much more severely locked-down
Linux systems out there, and they have not been the target of such actions
thus far. But there is a definite risk of damage to the Fedora project's
image as users discover that they cannot easily install their own kernels,
add third-party modules, or run tools like SystemTap.
Finally, there is the risk that Fedora's plan will legitimize the UEFI
secure boot mechanism. For now secure boot can be disabled on x86 systems;
what if
Microsoft, in the future, points to Fedora 18 as an example of how
everybody is able to work within the secure boot system and tries to make
secure boot mandatory? Thus, some argue, Fedora is giving aid and comfort
to those who would most like to take control of our systems away from us.
Why bother?
Given all of this, one might well wonder why Fedora is pursuing this path.
Fedora users are not generally known to clamor for locked-down systems that
they cannot easily tweak. Without any inside information whatsoever, your
editor suggests that there are two entirely plausible reasons for Fedora's
attempt to work with secure boot:
- The Fedora project, like many free software projects, would like to
have a wider base of users. It fears that, in the absence of a "just
works" experience on upcoming hardware, it will lose users to other
distributions that might be more willing to make that effort. Some of
those users may be lost to Linux altogether.
- The plan starts with a disclaimer that it is not representative in any
way of Red Hat's intentions for its enterprise distribution. But it seems
clear that there could be actual customer demand for a version of RHEL
that runs in the secure boot environment. If one embraces the sort of
restrictions that come with enterprise support, the additional rules
imposed by secure boot will have a minimal impact, while the apparent
benefits are clear. Fedora's role is, among other things, to test out
technologies that might go into RHEL; in this case, Fedora's users get
to stumble into the secure boot land mines so RHEL users don't have
to.
So Fedora's decision to take this approach is not all that surprising. The
project has concluded that it is better to restrict user freedom in certain
settings to make users' lives easier in other ways; as Matthew Garrett put it:
[T]here's no way to rationally say that the loss of
freedom in terms of users not being able to produce their own
signed bootloader or kernel for free is more or less significant
than the benefit of having an operating system that users can
install without firmware reconfiguration.
For those who do think that the loss of freedom inherent in the Fedora
scheme is unacceptable, the time between the present and when
Windows 8 hardware
starts shipping would be an ideal opportunity to demonstrate better
alternatives. But it's not clear what those would be.
Alternatives?
One could simply ignore secure boot, requiring users to disable it before
they can install Linux on their machines. That imposes a potentially scary
or difficult task on those users; by the specification, secure boot cannot
be disabled by the software directly. There may also be resistance from
users who see a switch saying "turn off security" and don't want to flip
it. This approach will work fine for hard-core Linux users and developers,
but seems certain to lose other kinds of users.
An alternative would be to attempt to gain more control of the situation at
the hardware level. An example can be seen in Google, which has made a
point of ensuring that unlockable Android handsets exist and are available
at a reasonable price. Hardware designed to run ChromeOS also, by design,
comes with an easily-toggled physical switch that turns off the boot-time
checks for users wanting to install their own software. The level of
interest in "jailbreaks" for locked-down handsets shows that a lot of users
do see value in having full control over the hardware they own. Open (and
"open source") hardware has a following; it may be that the only real way
to remain in control is to work to ensure that this kind of hardware
continues to exist and has a growing market share. There should be a
business opportunity here; projects like the Vivaldi tablet show that some people see
that opportunity and are trying to pursue it.
In the absence of open hardware, we will continue to be at the mercy of
others whose interests are unlikely to be the same as ours (for just about
any value of "ours"). That will leave us in a position where attempts to
cope like what we're seeing with Fedora seem like the best options
available. That does not seem like the path to freedom; it is not why we
have spent decades developing free operating systems. Fedora's secure boot
plan may be an effective workaround, but it leaves the real bug unfixed.
Page editor: Jonathan Corbet