LWN Weekly Edition

Security
kernel: multiple vulnerabilities
kernel: privilege escalation
kernel: information disclosure
krb5: UDP ping-pong flaw in kpasswd
libvirt: denial of service
mediawiki: multiple vulnerabilities
openstack-keystone: delayed token invalidation
openstack-keystone: insecure signing directory
openstack-nova: denial of service
openswan: code execution
openvpn: possible plaintext recovery
ruby: object taint bypassing
thunderbird: multiple vulnerabilities
tomcat: information disclosure
Page editor: Jake Edge

Kernel development

Brief items

Kernel release status

The current development kernel is 3.10-rc2, released on May 20. Linus says: "For being an -rc2, it's not unreasonably sized, but I did take a few pulls that I wouldn't have taken later in the rc series. So it's not exactly small either. We've got arch updates (PPC, MIPS, PA-RISC), we've got driver fixes (net, gpu, target, xen), and we've got filesystem updates (btrfs, ext4 and ceph - rbd)."

Stable updates: 3.9.3, 3.4.46, and 3.0.79 were released on May 19; 3.6.11.4 came out on May 20.
Ktap 0.1 releasedA new kernel tracing tool called "ktap" has made its first release. "KTAP have different design principles from Linux mainstream dynamic tracing language in that it's based on bytecode, so it doesn't depend upon GCC, doesn't require compiling a kernel module, safe to use in production environment, fulfilling the embedded ecosystem's tracing needs." It's in an early state; the project is looking for testers and contributors.
Merging zswap

As reported in our Linux Storage, Filesystem, and Memory Management Summit coverage, the decision was made to merge the zswap compressed swap cache subsystem while holding off on the rather more complex "zcache" subsystem. But conference decisions can often run into difficulties during the implementation process; that has proved to be the case here.

Zswap developer Seth Jennings duly submitted the code for consideration for the 3.11 development cycle. He quickly ran into opposition from zcache developer Dan Magenheimer; Dan had agreed with the merging of zswap in principle, but he expressed concerns that zswap may perform poorly in some situations. According to Dan, it would be better to fix these problems before merging the code:
I think the real challenge of zswap (or zcache) and the value to
distros and end users requires us to get this right BEFORE users
start filing bugs about performance weirdness. After which most
users and distros will simply default to 0% (i.e. turn zswap off)
because zswap unpredictably sometimes sucks.
The discussion went around in circles the way that in-kernel compression discussions often do. In the end, though, the consensus among memory management developers (but not Dan) was probably best summarized by Mel Gorman:
I think there is a lot of ugly in there and potential for weird
performance bugs. I ran out of beans complaining about different
parts during the review but fixing it out of tree or in staging
like it's been happening to date has clearly not worked out at all.
So the end result is likely to be that zswap will be merged for 3.11, but with a number of warnings attached to it. Then, with luck, the increased visibility of the code will motivate developers to prepare patches and improve the code to a point where it is production-ready.
Kernel development news

Ktap — yet another kernel tracer

Once upon a time, usable tracing tools for Linux were few and far between. Now, instead, there is a wealth of choices, including the in-kernel ftrace facility, SystemTap, and the LTTng suite; Oracle also has a port of DTrace for its distribution, available to its paying customers. On May 21, another alternative showed up in the form of the ktap 0.1 release. Ktap does not offer any major features that are not available from the other tracing tools, but there may still be a place for it in the tracing ecosystem.

Ktap appears to be strongly oriented toward the needs of embedded users; that has affected a number of the design decisions that have been made. At the top of the list was the decision to embed a byte-code interpreter into the kernel and compile tracing scripts for that interpreter. That is a big difference from SystemTap, which, in its current implementation, compiles a tracing script into a separate module that must be loaded into the kernel. This difference matters because an embedded target often will not have a full compiler toolchain installed on it; even if the tools are available, compiling and linking a module can be a slow process. Compiling a ktap script, instead, requires a simple utility to produce byte code for the ktap kernel module.

That compiler implements a language that is based on Lua. It is C-like, but it is dynamically typed, has a dictionary-like "table" type, and lacks arrays and pointers. There is a simple function definition mechanism which can be used like this:
function eventfun (e) {
    printf("%d %d\t%s\t%s", cpu(), pid(), execname(), e.tostring())
}
The resulting function will, when called, output the current CPU number, process ID, executing program name, and the string representation of the passed-in event e. There is a probe-placement function, so ktap could arrange to call the above function on system call entry with:
kdebug.probe("tp:syscalls", eventfun)
A quick run on your editor's system produced a bunch of output like:
3 2745 Xorg sys_setitimer(which: 0, value: 7fff05967ec0, ovalue: 0)
3 2745 Xorg sys_setitimer -> 0x0
2 27467 as sys_mmap(addr: 0, len: 81000, prot: 3, flags: 22, fd: ffffffff, off: 0)
2 27467 as sys_mmap -> 0x2aaaab67c000
2 3402 gnome-shell sys_mmap(addr: 0, len: 97b, prot: 1, flags: 2, fd: 21, off: 0)
2 3402 gnome-shell sys_mmap -> 0x7f4ec4bfb000
There are various utility functions for generating timer requests, creating histograms, and so on. So, for example, this script:
hist = {}

function eventfun (e) {
    if (e.sc_is_enter) {
        inplace_inc(hist, e.name)
    }
}

kdebug.probe("tp:syscalls", eventfun)
kdebug.probe_end(function () {
    histogram(hist)
})
is sufficient to generate a histogram of system calls over the period of time from when it starts until when the user interrupts it. Your editor ran it with a kernel build running and got output looking like this:
value ------------- Distribution ------------- count
sys_enter_open |@@@@@@@@ 587779
sys_enter_close |@@@@ 343728
sys_enter_newfstat |@@@@ 331459
sys_enter_read |@@@ 283217
sys_enter_mmap |@@@ 243458
sys_enter_ioctl |@@ 219364
sys_enter_munmap |@@ 165006
sys_enter_write |@ 128003
sys_enter_poll |@ 77311
sys_enter_recvfrom | 52898
The syntax for setting probe points closely matches that used by perf; probes can be set on specific functions or tracepoints, for example. It is possible to hook into the perf events mechanism to get other types of hardware or software events, and memory breakpoints are supported. The (sparse) documentation packaged with the code also suggests that ktap is able to set user-space probes, but none of the example scripts packaged with the tool demonstrate that capability. Ktap scripts can manipulate the return value of probed functions within the kernel. There does not currently appear to be a way to manipulate kernel-space data directly, but that could presumably be added (along with lots of other features) in the future.

What's there now is a proof of concept as much as anything; it is a quick way to get some data out of the kernel but does not offer a whole lot that is not available using the existing ftrace interface. For those who want to play with it, the first step is a simple:
git clone https://github.com/ktap/ktap.git
From there, building the code and running the sample scripts is a matter of a few minutes of relatively painless work. There is the ktapvm module, which must, naturally, be loaded into the kernel. That module creates a special virtual file (ktap/ktapvm under the debugfs root) that is used by the ktap binary to load and run compiled scripts.

Ktap in its current form is limited, without a lot of exciting new functionality. Even so, it seems to have generated a certain amount of interest in the development community. Getting started with most tracing tools usually seems to involve a fair amount of up-front learning; ktap, perhaps, is a more approachable solution for a number of users. The whole thing is about 10,000 lines of code; it shouldn't be hard for others to run with and extend. If developers start to take the bait, interesting things could happen with this project.
Low-latency Ethernet device polling

Linux is generally considered to have one of the most fully featured and fast networking stacks available. But there are always users who are not happy with what's available and who want to replace it with something more closely tuned for their specific needs. One such group consists of people with extreme low latency requirements, where each incoming packet must be responded to as quickly as possible. High-frequency trading systems fall into this category, but there are others as well. This class of user is sometimes tempted to short out the kernel's networking stack altogether in favor of a purely user-space (or purely hardware-based) implementation, but that has problems of its own. A relatively small patch to the networking subsystem might just be able to remove that temptation for at least some of these users.

Network interfaces, like most reasonable peripheral devices, are capable of interrupting the CPU whenever a packet arrives. But even a moderately busy interface can handle hundreds or thousands of packets per second; per-packet interrupts would quickly overwhelm the processor with interrupt-handling work, leaving little time for getting useful tasks done. So most interface drivers will disable the per-packet interrupt when the traffic level is high enough and, with cooperation from the core networking stack, occasionally poll the device for new packets. There are a number of advantages to doing things this way: vast numbers of interrupts can be avoided, incoming packets can be more efficiently processed in batches, and, if packets must be dropped in response to load, they can be discarded in the interface before they ever hit the network stack. Polling is thus a win for almost all situations where there is any significant amount of traffic at all.

Extreme low-latency users see things differently, though. The time between a packet's arrival and the next poll is just the sort of latency that they are trying to avoid.
Re-enabling interrupts is not a workable solution, though; interrupts, too, are a source of latency. Thus the drive for user-space solutions where an application can simply poll the interface for new packets whenever it is prepared to handle new messages. Eliezer Tamir has posted an alternative solution in the form of the low-latency Ethernet device polling patch set. With this patch, an application can enable polling for new packets directly in the device driver, with the result that those packets will quickly find their way into the network stack. The patch adds a new member to the net_device_ops structure:
int (*ndo_ll_poll)(struct napi_struct *dev);
This function should cause the driver to check the interface for new packets and flush them into the network stack if they exist; it should not block. The return value is the number of packets it pushed into the stack, or zero if no packets were available. Other return values include LL_FLUSH_BUSY, indicating that ongoing activity prevented the processing of packets (the inability to take a lock would be an example) or LL_FLUSH_FAILED, indicating some sort of error. The latter value will cause polling to stop; LL_FLUSH_BUSY, instead, appears to be entirely ignored.

Within the networking stack, the ndo_ll_poll() function will be called whenever polling the interface seems like the right thing to do. One obvious case is in response to the poll() system call. Sockets marked as non-blocking will only poll once; otherwise polling will continue until some packets destined for the relevant socket find their way into the networking stack, up until the maximum time controlled by the ip_low_latency_poll sysctl knob. The default value for that knob is zero (meaning that the interface will only be polled once), but the "recommended value" is 50µs. The end result is that, if unprocessed packets exist when poll() is called (or arrive shortly thereafter), they will be flushed into the stack and made available immediately, with no need to wait for the stack itself to get around to polling the interface.

Another patch in the series adds another call site in the TCP code. If a read() is issued on an established TCP connection and no data is ready for return to user space, the driver will be polled to see if some data can be pushed into the system. So there is no need for a separate poll() call to get polling on a TCP socket. This patch set makes polling easy to use by applications; once it is configured into the kernel, no application changes are needed at all.
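That polling behavior (one attempt for a non-blocking socket, otherwise a busy-wait bounded by the sysctl knob) can be sketched as a small user-space simulation. Everything here is a hypothetical illustration, not kernel code; ll_poll() and the driver_poll callback merely stand in for the stack's call into ndo_ll_poll():

```python
import time

LL_POLL_BUDGET_US = 50  # stands in for the ip_low_latency_poll knob's "recommended value"

def ll_poll(driver_poll, nonblocking, budget_us=LL_POLL_BUDGET_US):
    """Simulated busy-poll loop: keep asking the driver for packets until
    it flushes some, the socket is non-blocking (one attempt only), or
    the time budget expires."""
    deadline = time.monotonic() + budget_us / 1_000_000
    while True:
        n = driver_poll()  # plays the role of ndo_ll_poll()
        if n > 0 or nonblocking or time.monotonic() >= deadline:
            return n  # number of packets flushed into the stack

# A blocking socket keeps polling until packets show up:
arrivals = iter([0, 0, 3])
assert ll_poll(lambda: next(arrivals), nonblocking=False, budget_us=1_000_000) == 3

# A non-blocking socket polls exactly once:
arrivals = iter([0, 5])
assert ll_poll(lambda: next(arrivals), nonblocking=True) == 0
```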
On the other hand, the lack of application control means that every poll() or TCP read() will go into the polling code and, potentially, busy-wait for as long as the ip_low_latency_poll knob allows. It is not hard to imagine that, on many latency-sensitive systems, the hard response-time requirements really only apply to some connections, while others have no such requirements. Polling on those less-stringent sockets could, conceivably, create new latency problems on the sockets that the user really cares about. So, while no reviewer has called for it yet, it would not be surprising to see the addition of a setsockopt() operation to enable or disable polling for specific sockets before this code is merged. It almost certainly will be merged at some point; networking maintainer Dave Miller responded to an earlier posting with "I just wanted to say that I like this work a lot." There are still details to be worked out and, presumably, a few more rounds of review to be done, so low-latency sockets may not be ready for the 3.11 merge window. But it would be surprising if this work took much longer than that to get into the mainline kernel.
An unexpected perf feature

Local privilege escalations seem to be regularly found in the Linux kernel these days, but they usually aren't quite so old—more than two years since the release of 2.6.37—or backported into even earlier kernels. But CVE-2013-2094 is just that kind of bug, with a now-public exploit that apparently dates back to 2010. It (ab)uses the perf_event_open() system call, and the bug was backported to the 2.6.32 kernel used by Red Hat Enterprise Linux (and its clones: CentOS, Oracle, and Scientific Linux). While local privilege escalations are generally considered less worrisome on systems without untrusted users, it is easy to forget that UIDs used by network-exposed services should also qualify as untrusted—compromising a service, then using a local privilege escalation, leads directly to root.

The bug was found by Tommi Rantala when running the Trinity fuzz tester and was fixed in mid-April. At that time, it was not recognized as a security problem; the release of an exploit in mid-May certainly changed that. The exploit is dated 2010 and contains some possibly "not safe for work" strings. Its author expressed surprise that it wasn't seen as a security problem when it was fixed. That alone is an indication (if one was needed) that people in various colored hats are scrutinizing kernel commits—often in ways that the kernel developers are not.

The bug itself was introduced in 2010, and made its first appearance in the 2.6.37 kernel in January 2011. It treated the 64-bit perf event ID differently in an initialization routine (perf_swevent_init(), where the ID was sanity checked) and in the cleanup routine (sw_perf_event_destroy()). In the former, it was treated as a signed 32-bit integer; in the latter, as an unsigned 64-bit integer. The difference may not seem hugely significant, but, as it turns out, it can be used to effect a full compromise of the system by privilege escalation to root.
The key piece of the puzzle is that the event ID is used as an array index in the kernel. It is a value that is controlled by user space, as it is passed in via the struct perf_event_attr argument to perf_event_open(). Because it is sanity checked as an int, the upper 32 bits of event_id can be anything the attacker wants, so long as the lower 32 bits are considered valid. Because event_id is used as a signed value, the test:
if (event_id >= PERF_COUNT_SW_MAX)
    return -ENOENT;

doesn't exclude negative IDs, so anything with bit 31 set (i.e. 0x80000000) will be considered valid.
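The effect of that check is easy to reproduce in user space. This sketch mimics the flawed comparison; the PERF_COUNT_SW_MAX value here is illustrative rather than the kernel's actual constant:

```python
import ctypes

PERF_COUNT_SW_MAX = 9  # illustrative bound; the real value comes from the kernel

def swevent_init_check(config):
    """Mimic perf_swevent_init(): examine the 64-bit attr.config value
    as a signed 32-bit event_id, the way the buggy kernel code did."""
    event_id = ctypes.c_int32(config & 0xFFFFFFFF).value
    if event_id >= PERF_COUNT_SW_MAX:
        return None  # the kernel returns -ENOENT here
    return event_id  # accepted; later used as an array index

assert swevent_init_check(3) == 3        # a sane ID passes
assert swevent_init_check(100) is None   # an out-of-range ID is rejected
assert swevent_init_check(0x80000000) == -(2**31)  # bit 31 set: negative, slips through
```

An ID that survives the check with bit 31 set becomes a large negative offset when later used to index perf_swevent_enabled, which is exactly the lever the exploit pulls.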
The exploit code itself is rather terse, obfuscated, and hard to follow, but Brad Spengler has provided a detailed description of the exploit on Reddit. Essentially, it uses a negative value for the event ID to cause the kernel to change user-space memory. The exploit uses mmap() to map an area of user-space memory that will be targeted when the negative event ID is passed. It sets the mapped area to zeroes, then calls perf_event_open(), immediately followed by a close() on the returned file descriptor. That triggers:
static_key_slow_dec(&perf_swevent_enabled[event_id]);

in the sw_perf_event_destroy() function. The code then looks for non-zero values in the mapped area, which can be used (along with the event ID value and the size of the array elements) to calculate the base address of the perf_swevent_enabled array.
But that value is just a steppingstone toward the real goal. The exploit gets the base address of the interrupt descriptor table (IDT) by using the sidt assembly language instruction. From that, it targets the overflow interrupt vector (0x4), using the increment in perf_swevent_init():
static_key_slow_inc(&perf_swevent_enabled[event_id]);

By setting event_id appropriately, it can turn the address of the overflow interrupt handler into a user-space address.
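The pointer arithmetic behind those two steps is straightforward. Every number below is hypothetical (element size, event IDs, and addresses all vary by kernel build); the point is only to show how the exploit recovers the array base and then chooses the ID whose increment lands on the IDT entry:

```python
# All values are made up for illustration; they vary by kernel build.
elem_size = 8                      # sizeof(perf_swevent_enabled[0])
probe_id = -0x18000                # negative event ID used in the probing phase
touched = 0x2AAAAAAAB000           # address of the non-zero byte found in the mmap()ed area

# sw_perf_event_destroy() decremented base + probe_id * elem_size, so:
base = touched - probe_id * elem_size

# With the base known, choose the ID whose increment in perf_swevent_init()
# lands on the IDT's overflow-vector entry (found via sidt in the real exploit):
idt_overflow_entry = 0xFFFFFFFF81C00040  # hypothetical IDT entry address
target_id = (idt_overflow_entry - base) // elem_size
assert base + target_id * elem_size == idt_overflow_entry
```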
The exploit arranges to mmap() the range of memory where the clobbered interrupt handler will point and fills it with a NOP sled followed by shellcode that accomplishes its real task: finding the UID/GIDs and capabilities in the credentials of the current process so that it can modify them to be UID and GID 0 with full capabilities. At that point, in what almost feels like an afterthought, it spawns a shell—a root shell.

Depending on a number of architecture- or kernel-build-specific features (not least x86 assembly) makes the exploit itself rather fragile. It also contains bugs, according to Spengler. It doesn't work on 32-bit x86 systems because it uses a hard-coded system call number (298) passed to syscall(), which is different (336) for 32-bit x86 kernels. It also won't work on Ubuntu systems because the size of the perf_swevent_enabled array elements is different. The following will thwart the existing exploit:
echo 2 > /proc/sys/kernel/perf_event_paranoid
But a minor change to the flags passed to perf_event_open() will still allow the privilege escalation. None of these measures is a real defense against the vulnerability, though they do defend against this specific exploit. Spengler's analysis has more details, both of the existing exploit as well as ways to change it to work around its fragility.
The code uses syscall(), presumably because perf_event_open() is not (yet?) available in the GNU C library, but calling syscall() directly could also serve to evade any argument checks done in a library wrapper. Any sanity checking done by the library must also be done in the kernel, because using syscall() bypasses the library entirely. Kernels configured without support for perf events (i.e. CONFIG_PERF_EVENTS not set) are unaffected by the bug, as they lack the system call entirely.

There are several kernel hardening techniques that would help to avoid this kind of bug leading to system compromise. The grsecurity UDEREF mechanism would prevent the kernel from dereferencing the user-space addresses, so that the perf_swevent_enabled base address could not be calculated. The PaX/grsecurity KERNEXEC technique would prevent the user-space shellcode from executing. While these techniques can inhibit this kind of bug from allowing privilege escalation, they impose costs (e.g. performance) that have made them unattractive to the mainline developers. Suitably configured kernels on hardware that supports it would be protected by supervisor mode access prevention (SMAP) and supervisor mode execution protection (SMEP); the former would prevent access to the user-space addresses much like UDEREF, while the latter would prevent execution of user-space code as does KERNEXEC.

This is a fairly nasty hole in the kernel, in part because it has existed for so long (and apparently been known by some, at least, for most of that time). Local privilege escalations tend to be somewhat downplayed because they require an untrusted local user, but web applications (in particular) can often provide just such a user. Dave Jones's Trinity has clearly shown its worth over the last few years, though he was not terribly pleased about how long it took for fuzzing to find this bug.
Jones suspects there may be "more fruit on that branch somewhere", so more and better fuzzing of the perf system calls (and the kernel as a whole) is indicated. In addition, the exploit author at least suggests that he has more exploits waiting in the wings (not necessarily in the perf subsystem); it is quite likely that others do as well. Finding and fixing these security holes is an important task; auditing the commit stream to help ensure that these kinds of problems aren't introduced in the first place would be quite useful. One hopes that companies using Linux find a way to fund more work in this area.
Patches and updates Kernel trees
Core kernel code
Development tools
Device drivers
Filesystems and block I/O
Memory management
Networking
Architecture-specific
Security-related
Virtualization and containers
Page editor: Jonathan Corbet

Distributions

Empty symlinks and full POSIX compliance

Symbolic links are a mechanism to make one pathname be an alias for another. One would think that there would be little value in an empty symbolic link — one where the destination pathname is the empty string — but that doesn't keep people from trying to create such links, an act that the Linux virtual filesystem layer does not allow. It turns out that its refusal to allow the creation of empty symbolic links puts Linux out of compliance with the POSIX standard. The real question, though, might be: how much does that really matter?
Pointing to the void

It turns out that many Unix-based systems will happily allow a command like:
ln -s "" link-to-nothing
On a Linux system, though, that command will fail with a "No such file or directory" error. This is, as was pointed out in a bug report last January, a somewhat confusing message. If the empty string is replaced by the name of a nonexistent file, no such error results. In other words, for most cases, the lack of an existing file is not a concern. So it seems strange that Linux would gripe about "no such file" in the empty string case.

As part of the ensuing discussion, it turned out that the POSIX standard was not consistent with how empty symbolic links (that already exist in the filesystem) are handled in existing systems. Solaris systems will, when such a link is dereferenced, treat it as a link to the current directory; essentially, an empty link is treated as if it were "." instead. BSD systems respond differently: they take the position that no such file can exist and duly return the "no such file" error. Neither of those responses was compliant with POSIX, a problem which was only fixed in early May, when the standard was updated to allow either the Solaris or the BSD behavior. The result is a standard that explicitly says one cannot know how the system will resolve an empty symbolic link; it might work, or it might not.

How Linux handles an attempt to resolve an empty symbolic link that already exists within a filesystem is not well defined. Some of the work of link resolution is pushed down into filesystem-specific code, so the behavior may depend on which filesystem type is in use. It is hard to test because, as mentioned above, Linux does not allow the creation of empty symbolic links, so they can only come by way of a filesystem from another system. But, in general, an attempt to resolve an empty symbolic link can be expected to return a "No such file or directory" response. The refusal to create an empty symbolic link, as it turns out, is contrary to how POSIX thinks the symlink() system call should work.
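The creation-time refusal is easy to verify from any language that wraps symlink(); this quick Python check (run on a Linux system) shows the error coming back:

```python
import errno
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "link-to-nothing")
    try:
        os.symlink("", path)  # empty target string, as in the ln -s example
        result = "created"
    except OSError as e:
        result = errno.errorcode[e.errno]

print(result)  # on Linux: ENOENT, i.e. "No such file or directory"
```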
The standard text says explicitly that the target string "shall be treated only as a character string and shall not be validated as a pathname." Empty strings are valid character strings, and the implementation is not allowed to care that they cannot be the name of a real file, so, by the standard, the creation of such a symbolic link should be allowed.

Back in January, Pádraig Brady posted a patch enabling the creation of empty symbolic links. The patch did not generate much interest at that time. He followed up in May after the standard had been updated; this time Al Viro expressed his feelings on the matter:
Functionality in question is utterly pointless, seeing that
semantics of such symlinks is OS-dependent anyway *and* that
blanket refusal to traverse such beasts is a legitimate option.
What's the point in allowing to create them in the first place?
And that is pretty much where the discussion stopped.
Linux and POSIX compliance

That said, it would not be entirely surprising if such a patch were to make it into the kernel at some point. The cost of enabling the creation of empty symbolic links is essentially zero, and adding that capability would bring Linux a little closer to POSIX compliance. But true POSIX compliance, which is a function of both the kernel and the low-level libraries that sit above it, still seems like a distant goal for Linux distributions as a whole.

As a Unix-like system, Linux is not that far removed from compliance with the POSIX standard. Linux developers normally try to adhere to such standards when it makes sense to do so, but they generally feel no need to apply changes that, in their opinion, do not make technical sense just because a standard document calls for it. The reaction to the creation of empty symbolic links is a case in point. The value of closer adherence to POSIX is not seen as being high enough to justify the addition of a "feature" that seems nonsensical.

The value question is an interesting one. Getting certified to the point where one can use the POSIX trademark is a matter of passing the verification test suite, applying for certification, and handing over a relatively small amount of money as described in the fee schedule [PDF]. An enterprise Linux distributor wishing to claim POSIX compliance could almost certainly attain this certification in a relatively short period of time with an investment that would be far smaller than was required for, say, Common Criteria security certification. Carrying some non-mainline patches to the kernel and C library would likely be necessary, but enterprise distributors have generally shown little reluctance to do that when it suits their interests. But there are no POSIX-certified Linux distributions on the market now. As far as your editor can tell, the only time a distribution has achieved that certification was when Linux-FT claimed it in 1995.
That work was (or was not, depending on which side of the argument you listen to) acquired by Caldera shortly thereafter; Caldera, too, intended to achieve POSIX certification. That certification does not appear to have happened, and Caldera, of course, followed its own unhappy path to its doom. Since then, corporate interest in POSIX certification for Linux has been subdued, to say the least. One can only conclude that the commercial value of a 100% certified POSIX-compliant distribution is not enough to justify even a relatively small level of effort. If distributors were losing business due to the lack of certification, they would be doing something about it. But, it seems, "almost POSIX" is good enough for users, especially in an era where Linux is the preferred platform for many applications.

POSIX still has its place; it sets the expectations for the low-level interface provided by the operating system and helps to ensure compatibility. But, increasingly, most current development work is outside of the scope of POSIX. The standard cannot hope to keep up with the changes being made to Linux at the kernel level and above. We live in a fast-changing world where, in many cases, "what does Linux do?" is the real standard. The developers who are busily pushing Linux forward have little time or patience for working toward complete POSIX compatibility when the interesting problems are elsewhere, so a fully POSIX-compliant distribution seems unlikely to show up in the near future.
Brief items

Distribution quotes of the week
I admit to everything. I am merely an artificial creature, designed by
Lennart and sent here by the GNOME cabal, to end Debian as it is and
turn it into a useless system that is not the UNIX way™.
-- Josselin Mouette
When I finally got my new fangled EGA screen, it was *configurable*: you
could use a button to choose to have it in orange, green, or with actual
colours. Back in the good old days when you were actually given a
choice.
-- Enrico Zini
Nowadays, if I type my login name all uppercase, I don't even get an uppercase "PASSWORD:" prompt anymore :(
Arrows move the cursor, enter follows links, '/' searches. And don't
dare touch anything else because nobody knows what could happen!
-- Michał Górny
Debian GNU/Hurd 2013 released

While it is not an official Debian release, the Debian GNU/Hurd team has announced the release of Debian GNU/Hurd 2013. GNU Hurd is a Unix-style kernel based on the Mach microkernel and Debian GNU/Hurd makes much of the Debian system available atop that kernel.
Debian GNU/Hurd is currently available for the i386 architecture with more than 10.000 software packages available (more than 75% of the Debian archive, and more to come!).
Please make sure to read the configuration information, the FAQ, and the translator primer to get a grasp of the great features of GNU/Hurd. Due to the very small number of developers, our progress of the project has not been as fast as other successful operating systems, but we believe to have reached a very decent state, even with our limited resources.
Mageia 3 released

The much-delayed Mageia 3 release is out. "We dedicate this release to the memory of Eugeni Dodonov, our friend, our colleague and a great inspiration to those he left behind. We miss his brilliance, his courtesy and his dedication." Changes include an RPM upgrade, the 3.8 kernel, availability of GRUB2 (but GRUB is still the default bootloader), and more. See the release notes for lots of details.
NetBSD 6.1

The NetBSD Project has announced NetBSD 6.1, the first feature update of the NetBSD 6 release branch. "It represents a selected subset of fixes deemed important for security or stability reasons, as well as new features and enhancements." See the changelog for details.
Pidora 18

Pidora is a Fedora remix for the Raspberry Pi. Pidora 18 has been released. "It is based on a brand new build of Fedora for the ARMv6 architecture with greater speed and includes packages from the Fedora 18 package set."
Distribution News

Debian GNU/Linux "Bits from Debian" editors delegation

DPL Lucas Nussbaum has appointed Ana Beatriz Guerrero López and Francesca Ciceri as Bits from Debian editors. "The Bits from Debian Editors oversee and maintain the official blog of the Debian project, "Bits from Debian"".
Newsletters and articles of interest
Distribution newsletters
OpenMandriva Picks Name, Releases Alpha (OStatic)
OStatic covers recent news from OpenMandriva. "While the rest of Linuxdom was reading of the Debian 7.0 and Mageia 3 releases, the OpenMandriva gang have been hard at it trying to get their new distribution some attention. The OpenMandriva name was made official and an alpha was released into the wild."
Tails 0.18 can install packages on the fly (The H)
The H takes a look at Tails, The Amnesiac Incognito Live System. "Existing Tails users are also strongly urged to upgrade to Tails 0.18; the team lists security vulnerabilities discovered in the previous 0.17.2 release to reinforce that recommendation. Another new feature in the Debian-based distribution is support of obfs3 bridges when using the Tor network. Obfuscation bridges make it harder for the ISP to know a user is using the Tor network by disguising the protocol in use; obfs3 is the latest version of the protocol replacing obfs2 which no longer works in China."
Page editor: Rebecca Sobol
Development
An "enum" for Python 3
Designing an enumeration type (i.e. "enum") for a language may seem like a straightforward exercise, but the recently "completed" discussions over Python's PEP 435 show that it has a few wrinkles. The discussion spanned several long threads in two mailing lists (python-ideas, python-dev) going back to January in this particular iteration, but the idea is far older than that. A different approach was suggested in PEP 354, which was proposed in 2005 but rejected at that time, largely due to lack of widespread interest. A 2010 discussion also led nowhere (at least in terms of the standard library), but the most recent discussions finally bore fruit: Guido van Rossum accepted PEP 435 on May 9. The basic idea is to have a class that implements an enum, which, in Python, might look a lot like:
from enum import Enum

class Color(Enum):
    red = 1
    green = 2
    blue = 3
That would allow using Color.green (and the others) as a constant,
effectively.
Not only would Color.blue have a value, but it would also have a
name ('blue') and an order (based on the declaration order). Enums can
also be iterated over, so that:
for color in Color:
    print(color, color.name, color.value)
gives:
Color.red red 1
Color.green green 2
Color.blue blue 3
Along the way, there were several different enum proposals made. Ethan Furman offered one that incorporated multiple types of enum, including ones for bit flags, string-valued enums, and automatically numbered sequences. Alex Stewart came up with a different syntax for defining enums to avoid the requirement to specify each numeric value. Neither made it to the PEP stage, though pieces of both were adopted into the first draft of PEP 435, which was authored by Eli Bendersky and Barry Warsaw.

There are a couple of fairly obvious motivations for adding enums, which were laid out in the PEP. An immutable set of related, constant values is a useful construct. Making them their own type, rather than just using sequences of some other basic type (like integer or string) means that error checking can be done (i.e. no duplicates) and that nonsensical operations can raise errors (e.g. Color.blue * 42). Finally, it is convenient to be able to declare enum members once but to still be able to get a string representation of the member name (i.e. without some kind of overt assignment like: green.name='green'). Some of the use cases mentioned early in the discussion of the PEP are for values like stdin and stdout, the flags for socket() or seek() calls, HTTP error codes, opcodes from the dis (Python bytecode disassembly) module, and so forth.

One of the questions that was immediately raised about the original version of the PEP was its insistence that "Enums are not integers!", so ordered comparisons like:
Color.red < Color.green
would raise an exception, though equality tests would not:
print(Color.green == 2)
True
To some, that seemed to run directly counter to the whole idea of an enum
type, but allowing ordered comparisons has some unexpected consequences as Warsaw
described. Two different enums could be compared with potentially
nonsensical results:
print(MyAnimal.cat == YourAnimal.dog)
True
In general, the belief is that "named integers" is a small subset of the
use cases for enums, and that most uses do not need ordered comparisons.
But, the final accepted PEP does have an IntEnum variant
that provides the ordering desired by some. IntEnum members are also a
subclass of int, so they can be used to replace user-facing
constants in the
standard library that are already treated as integers (e.g. HTTP error codes,
socket() and seek() flags, etc.).
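The practical difference between the two classes can be sketched with the enum module as it eventually shipped in the standard library (the HttpStatus class here is an illustrative example, not taken from the PEP):

```python
from enum import Enum, IntEnum

class Color(Enum):
    red = 1
    green = 2

class HttpStatus(IntEnum):
    ok = 200
    not_found = 404

# Plain Enum members are not integers: they do not compare
# equal to ints, and ordered comparisons raise TypeError.
assert Color.red != 1
try:
    Color.red < Color.green
except TypeError:
    print("plain Enum members are unordered")

# IntEnum members really are ints, so they can stand in for
# integer constants in existing APIs and can be sorted.
assert HttpStatus.ok == 200
assert isinstance(HttpStatus.ok, int)
print(sorted([HttpStatus.not_found, HttpStatus.ok]))
```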
A second revision of the PEP was posted in April, after lengthy discussion both in python-dev and python-ideas. Furman offered up another proposal, this time as an unnumbered PEP with four separate classes for different types of enums. Two different views of enums arose in the discussion, as Furman summarized:
There seems to be two basic camps: those that think an enum
should be valueless, and have nothing to do with an integer besides using
it to select the appropriate enumerator [...] and those for whom the
integer is an integral part
of the enumeration, whether for sorting, comparing, selecting an index, or
whatever.
The critical aspect of using or not using an integer as the base type is: what happens when an enumerator from one class is compared to an enumerator from another class? If the base type is int and they both have the same value, they'll be equal -- so much for type safety; if the base type is object, they won't be equal, but then you lose your easy to use int aspect, your sorting, etc. Worse, if you have the base type be an int, but check for enumeration membership such that Color.red == 1 == Fruit.apple, but Color.red != Fruit.apple, you open a big can of worms because you just broke equality transitivity (or whatever it's called). We don't want that.

Furman's proposal looked overly complex to Bendersky and others commenting on a fairly short python-ideas thread. Meanwhile in python-dev, another monster thread was spinning up. The first objection to the revised PEP concerned its raising of NotImplementedError for ordered comparisons of enum members. That was quickly dispatched with a recognition that TypeError made far more sense. Other issues, such as the ordered comparison issue that was handled with IntEnum in the final version, did not resolve quite as quickly.

One question, originally raised by Antoine Pitrou, concerned the type of the enum members. The early PEP revisions considered Color.red to not be an instance of the Color class, and Warsaw strongly defended that view. At some level, that makes sense (since the members are actually attributes of the class), but it is confusing in other ways. In a sub-thread, Van Rossum, Warsaw, and others looked at the pros and cons of the types of enum members, as well as implementation details of various options. In the end, Van Rossum made some pronouncements on various features, including the question of member type, so:
isinstance(Color.blue, Color)
True
is now an official part of the specification.
As Python's benevolent dictator for life (BDFL), which is Van Rossum's only-semi-joking title, he can put an end to arguments and/or "bikeshedding" about language features. In the same thread, he made some further pronouncements (along with a plea for a halt to the bikeshedding). It is a privilege that he exercises infrequently, but it is clearly useful to the project to have someone in that role. Much like Linus Torvalds for the kernel, it can be quite helpful to have someone who can stop a seemingly endless thread. Van Rossum's edicts came after Furman summarized the outstanding issues (after a summary request from Georg Brandl). That is a fairly common occurrence in long-running Python threads: someone will try to boil down the differences into a concise list of outstanding issues. Another nice feature of Python discussions is their tone, which is generally respectful and flame-free. Participants certainly disagree, sometimes strenuously, but the tone is refreshingly different from many other projects' mailing lists. Not everyone is happy with the end result for enums, however. Andrew Cooke is particularly sad about the outcome. He points out that several expected behaviors for enums are not present in PEP 435:
class Color(Enum):
    red = 1
    green = 1
is not an error; Color.green is an alias for Color.red
(a dubious "feature", he noted with a bit of venom).
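The aliasing behavior, along with the @unique class decorator that the enum module (as it shipped in Python 3.4) offers for code that wants the stricter behavior, can be seen in a short sketch:

```python
from enum import Enum, unique

class Color(Enum):
    red = 1
    green = 1   # not an error: green silently becomes an alias

# Alias members are the very same object as the canonical member,
# and lookups report the canonical name.
assert Color.green is Color.red
assert Color.green.name == 'red'

# The @unique decorator turns duplicate values into a ValueError.
try:
    @unique
    class Strict(Enum):
        red = 1
        green = 1
except ValueError:
    print("duplicate values rejected by @unique")
```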
In addition, there is a way to avoid having to assign values for each enum
member (auto-numbering, essentially), but its syntax is clunky:
Color = Enum('Color', 'red green blue')
Beyond having to repeat the class name as a string (which violates the
"don't repeat yourself" (DRY) principle), it starts the numbering from one,
rather than zero. Nick Coghlan responded
to Cooke's complaints by more or less agreeing with the criticism. There
is still room for improvement in Python enums, but PEP 435 represents a
solid step forward, according to Coghlan.
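For reference, a sketch of the functional API's behavior, plus one workaround for the start-at-one numbering that works with the module as shipped (passing explicit name/value pairs):

```python
from enum import Enum

# The functional API auto-numbers members starting at 1.
Color = Enum('Color', 'red green blue')
assert Color.red.value == 1
assert [m.name for m in Color] == ['red', 'green', 'blue']

# Numbering from zero requires spelling out (name, value) pairs.
Color0 = Enum('Color0', [('red', 0), ('green', 1), ('blue', 2)])
assert Color0.red.value == 0
```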
It is instructive to watch the design of a language feature play out in public, as it does for Python (and other languages). Enums are something that the developers will have to live with for a long time, so it is not surprising that there would be lots of participation and examination of the feature from many different angles. While PEP 435 probably didn't completely satisfy anyone's full set of requirements, there is still room for more features, both in the standard library and elsewhere, as Coghlan pointed out. The story of enums in Python likely does not end here.
Python and implicit string concatenation
In a posting with a title hearkening back to a famous letter of old ("Implicit string literal concatenation considered harmful?"), Guido van Rossum opened up a python-ideas discussion about a possible feature deprecation. In particular, he had been recently bitten by a hard-to-find bug in his code because of Python's implicit string concatenation. He wondered if it was perhaps time to consider deprecating the feature, which was added for "flimsy" reasons, he said. As the discussion shows, however, getting rid of dubious language features is a tricky task at best. The problem stems from Python's behavior when faced with two adjacent string literals: it concatenates the two strings. Van Rossum ran into difficulties and got an argument count exception because he forgot a comma. He wrote:
foo('a' 'b')
when he really wanted:
foo('a', 'b')
The former passes one argument (the string "ab"), while the latter passes
two. This implicit concatenation can cause similar (but potentially even
harder to spot)
problems in things like lists:
[ 'a', 'b',
'c', 'd'
'e', 'f' ]
which creates a five-element list with "de" as the fourth element. The
reason Van Rossum added the feature to Python is, by his own admission, questionable: "I copied it from C but the reason why it's
needed there doesn't really apply to Python, as it is mostly useful
inside macros". Beyond that, the string concatenation operator
("+") is evaluated at compile time, so there is no runtime penalty to
constructing long strings that way. Given all of that, should the feature
be dumped in some future version of Python?
There were some who quickly jumped on the deprecation bandwagon. It is, evidently, a common problem—one that is rather irritating to hit. But other commenters noted some problems in blindly requiring the + operator. Perhaps the biggest problem is that the "%" interpolation operator has higher precedence than +, which means:
print("long string %d " +
"another long string %s" % (2, 'foo'))
would not work at all using the "+" operator, but would work fine
when omitting
it. In the thread,
it was mentioned that the % operator may be slated for eventual
deprecation itself, but the same problem
exists with its replacement:
the format() string method. So, a mechanical substitution that adds + won't work in all cases, and additional parentheses will be required.
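The precedence trap can be demonstrated directly (the string contents here are made up for illustration):

```python
# With implicit concatenation, the literals are joined first,
# so % formats the combined string.
s1 = "count: %d " "name: %s" % (2, 'foo')
assert s1 == "count: 2 name: foo"

# With +, % binds tighter: only the second literal is formatted,
# and the extra argument makes it fail with a TypeError.
try:
    s2 = "count: %d " + "name: %s" % (2, 'foo')
except TypeError:
    print("the + version needs extra parentheses")

# Parenthesizing restores the intended meaning.
s3 = ("count: %d " + "name: %s") % (2, 'foo')
assert s3 == s1
```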
Another alternative, using triple-quoted strings, has its own set of problems, mostly related to handling indentation inside those strings. Adding some kind of "dedent" (i.e. un-indent) operation or syntax into the language might help. Currently:
'''This string
   will have "extra"
   spaces'''
will result in a string with multiple spaces before "will" and "spaces".
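The standard library's textwrap.dedent() already offers a partial remedy, though it requires every line to share the leading whitespace (hence the backslash-newline idiom in this sketch):

```python
import textwrap

# The indented continuation lines keep their spaces:
raw = '''This string
         will have "extra"
         spaces'''
assert '   will have' in raw

# textwrap.dedent() strips the common leading whitespace; the
# trailing backslash keeps the first line indented like the rest.
cleaned = textwrap.dedent('''\
    This string
    will not have extra
    spaces''')
assert cleaned == 'This string\nwill not have extra\nspaces'
```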
There was also a suggestion to add a new "..." operator that would indicate a continued string, but Stephen J. Turnbull (and others) didn't think that justified adding a new operator (the ellipsis), especially given that the symbol is already used for custom list slice operations. Additional suggestions included a new string modifier (prefacing the string literal with "m" for example) to indicate a continued string:
a = [m'abc'
     'def']
Another "farfetched" idea was to allow compile-time string
processors to be specified for string literals:
!merge!'''\
abc
def'''
which would run the "merge" processor on the string, creating "abcdef".
There was no end to the suggested alternatives to implicit concatenation—there's a reason the mailing list is called python-ideas after all—but participants were fairly evenly split on whether to deprecate the feature. There were enough complaints about doing so that it seems unlikely that concatenation by juxtaposition will be deprecated any time soon. As Van Rossum noted, though, it is a feature that would almost certainly never pass muster if it were proposed today. Furthermore:
I do realize that this will break a lot of code, and that's the only
reason why we may end up punting on this, possibly until Python 4, or
forever. But I don't think the feature is defensible from a language
usability POV. It's just about backward compatibility at this point.
Van Rossum's lament should be a helpful reminder for language designers. It is very difficult to get rid of a feature once it has been added. That is especially true for a low-level syntax element like literal strings—just ask the developer of make about the tab requirement for makefiles. For Python, concatenation by juxtaposition seems likely to be around for a long time to come—perhaps forever.
Brief items
Quotes of the week
So
the current state of the art is just to copy & paste ScreenInit and
friends from another driver, because the documentation wouldn't
actually be any shorter than the hundreds of lines of code.
— Daniel Stone (thanks to Arthur Huillet)
Ask a programmer to review 10 lines of code, he'll find 10 issues. Ask him to do 500 lines and he'll say it looks good.
— Giray Özil (from a retweet by Randall Arnold)
QEMU 1.5.0 released
Version 1.5.0 of the QEMU hardware emulator is out. "This release was developed in a little more than 90 days by over 130 unique authors averaging 20 commits a day. This represents a year-to-year growth of over 38 percent making it the most active release in QEMU history." Some of the new features include KVM-on-ARM support, a native GTK+ user interface, and lots of hardware support and performance improvements. See the change log for lots of details.
Perl 5.18.0 released
The Perl 5.18.0 release is out. "Perl v5.18.0 represents approximately 12 months of development since Perl v5.16.0 and contains approximately 400,000 lines of changes across 2,100 files from 113 authors." See this perldelta page for details on what has changed.
New Python releases
Several point releases of Python are now available. Benjamin Peterson announced the release of 2.7.5 on May 15, and Georg Brandl announced 3.2.5 and 3.3.2 on May 16. The primary focus of the releases is regression fixes, so users are encouraged to upgrade.
Newsletters and articles
Development newsletters from the past week
Blender dives into 3D printing industry (Libre Graphics World)
Libre Graphics World looks at the use of Blender in 3D printing; the recent 2.67 release includes a "3D printing toolbox." "While Blender cannot help with making actual devices easier to use, it definitely could improve designing printable objects. And that's exactly what happened last week, when Blender 2.67 was released."
Page editor: Nathan Willis
Announcements
Brief items
Ten years of Groklaw
Groklaw is celebrating its tenth anniversary. "Thank you for sticking to the job for ten years without giving out, and for funding the necessary activities that make Groklaw Groklaw. We made a difference in this old world. It's an achievement we can tell our grandchildren about some day. Not everyone can say that, but we actually made a difference. And nobody can take that away from us."
Sony opens up the Xperia Tablet Z
Sony has announced the availability of an Android Open Source Project distribution for its Xperia Tablet Z device. "For all you developers out there, of course this means you can now access the software and contribute to this project. And this is all before the tablet is even available in the US. A special thanks to our Sony Mobile team for helping us create the package early and a huge thanks to the Android developer community for all your support. We can’t wait to see what you’ll do with the code." Source is available on GitHub.
EFF: Vermont Is Mad as Hell at Patent Trolls
The Electronic Frontier Foundation has sent out a release about how the US state of Vermont is going on the offensive against patent trolls. "Not content to strike back against a single troll, Vermont is also poised to pass a bill dealing with the problem as a whole. The Vermont House and Senate recently passed a bill to combat 'bad faith assertions of patent infringement'. And the latest word is that Vermont's governor is about to sign it into law."
Google Code to deprecate downloads
Google has announced that it will be phasing out the file download feature for projects hosted on Google Code. "Downloads were implemented by Project Hosting on Google Code to enable open source projects to make their files available for public download. Unfortunately, downloads have become a source of abuse with a significant increase in incidents recently. Due to this increasing misuse of the service and a desire to keep our community safe and secure, we are deprecating downloads."
Articles of interest
How Google plans to rule the computing world through Chrome (GigaOM)
GigaOM asserts that Google will be taking over the desktop (regardless of the underlying operating system) with its Chrome browser. "For many Chrome is just a browser. For others who use a Chromebox or Chromebook, like myself, it’s my full-time operating system. The general consensus is that Chrome OS, the platform used on these devices, can only browse the web and run either extensions and web apps; something any browser can do. Simply put, the general consensus is wrong and the signs are everywhere."
Calls for Presentations
DebConf 13: Call for Papers
DebConf 13 will take place near Vaumarcus, Switzerland August 11-18. The call for papers and presentations is open until June 1, 2013. "This year submission of a formal written paper for the conference proceedings is again optional, though encouraged. Providing a written paper in advance means that interested people can attend your session ready with ideas for discussion, and especially helps those who find it hard to follow rapid English speech."
EuroPython 2014/2015 Conference Team - Call for Proposals
The EuroPython Society is looking for volunteers to organize EuroPython conferences in 2014 and 2015. This call for proposals closes June 14, 2013.
Call for Papers - PostgreSQL Conference Europe 2013
PostgreSQL Conference Europe 2013 will be held on October 29-November 1 in Dublin, Ireland. The call for proposals closes July 22.
Upcoming Events
GNU Hackers Meeting 2013 in Paris, France
The GNU Hackers Meeting will take place August 22-25, 2013 in Paris, France. "As we are still in the process of defining the program, we welcome your talk proposals. We are particularly interested in new GNU programs and recent developments in existing packages, plus any related topic. In our experience the audience tends to be technically competent, so feel free to propose very technical topics as well; we will try to schedule all such talks together in the same morning or afternoon session, for the public’s sake."
Events: May 23, 2013 to July 22, 2013
The following event listing is taken from the LWN.net Calendar.
If your event does not appear here, please tell us about it.
Page editor: Rebecca Sobol
Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds