LWN.net Weekly Edition for February 14, 2013
FOSDEM: Richard Fontana on copyleft-next
In July 2012, Richard Fontana started the GPL.next project to experiment with modifications to version 3 of the GNU General Public License (GPLv3). The name was quickly changed to the more neutral "copyleft-next" and the license has evolved into a "radically different text" compared to the GPLv3 since project inception. Fontana gave a talk in the FOSDEM legal devroom on February 3 that presented the current status of the project and his reasons for exploring new ideas about copyleft licensing.
Fontana explained that he initially described the project as a fork of the GPLv3 but admitted that it "sounded more negative than I intended". He actually co-authored the GPLv3, LGPLv3, and AGPLv3 licenses together with Richard Stallman and Eben Moglen during his time at the Software Freedom Law Center. Fontana, who is now Red Hat's open-source licensing counsel, stressed that copyleft-next is his personal project and not related to his work for SFLC, FSF, or Red Hat, although "these experiences had a personal influence".
The complexity of the GPLv3
Allison Randal's 2007 essay "GPLv3,
Clarity and Simplicity" is a powerful critique of the GPLv3 and was
deeply influential on his thinking, Fontana said. The essay argued that everyone
"should be enabled to comprehend the terms of the license".
Based on the (then) near-finished draft of the GPLv3, Randal observed that
it's unlikely that clarity and simplicity had been a priority during the
drafting process.
Fontana feels that the complexity of the GPL had a side effect of creating an "atmosphere of unnecessary inscrutability and hyper-legalism" surrounding the GPL. Additionally, he perceives that legal interpretation of the license is lacking. Richard Stallman has withdrawn from active license interpretation and Brett Smith, for a long time FSF's "greatest legal authority" according to Fontana, left his position as FSF's License Compliance Engineer in May 2012. He wonders whether the complexity of the GPL, together with FSF's withdrawal from an active interpretive role, has contributed to a shift to non-copyleft licenses. He also believes that developer preference for licensing minimalism is rising.
Another reason for the creation of copyleft-next is Fontana's desire to experiment with new ideas and forms of licensing. He pointed out that every license (proprietary or free) is imperfect and could benefit from improvements. He feels strongly that license reform should not be monopolized. Due to concerns about license proliferation, the OSI has discouraged the creation of new licenses, effectively creating a monopoly for the stewards of existing OSI-approved licenses. Fontana downplayed concerns of license proliferation, partly because GPL-compatible licenses should also be compatible with copyleft-next and because copyleft-next offers one-way compatibility with the GPL. Finally, he views copyleft-next as a "gradual, painless successor to GPLv2/GPLv3".
Fontana also expressed his disappointment in the way open source licenses have historically been developed. While the drafting process for GPLv3 was very advanced and transparent compared to other efforts, it seems insufficiently transparent to him by present-day standards. He pointed to the Project Harmony contributor agreements as another example of a non-transparent process since it employed the Chatham House Rule during parts of the drafting process.
Contribution norms
Unsurprisingly, copyleft-next's development process is very different and follows the "contemporary methodology of community projects". The license is hosted on Gitorious, and there is a public mailing list and IRC channel—a bug tracker will be added in the near future. Fontana acts as the sabd(nnfl)—the self-appointed benevolent dictator (not necessarily for life).
The project has participation guidelines (informally known as the Harvey Birdman Rule, after a US cartoon series featuring lawyers). The norms reflect Fontana's intention to involve developers and other community members in the development process. They encourage transparency in license drafting and aim to "prevent the undue influence of interest groups far removed from individual software developers" (in other words, lawyers).
The guidelines disallow closed mailing lists as well as substantive private conversations about the development of the project. The latter can be remedied by posting a summary to the public mailing list. Fontana is true to his word and posted summaries of discussions he had at FOSDEM. Finally, the Harvey Birdman Rule forbids contributions in the form of word-processing documents and dictates that mailing list replies using top-posting shall be ignored.
The copyleft-next license
The copyleft-next license is a strong copyleft license. The word "strong" refers to the scope of the license. The Mozilla Public License (MPL), for example, is a weak copyleft license in this sense since its copyleft only applies to individual files. While modifications to a file are covered by MPL's copyleft provisions, code under the MPL may be distributed as part of a larger proprietary piece of software. The GPL and copyleft-next, on the other hand, have a much broader scope and make it difficult to make proprietary enhancements of free software.
Copyleft-next was initially developed by taking the GPLv3 text and removing parts from it. For each provision, Fontana asked whether the incremental complexity associated with the provision is necessary and worthwhile. For many provisions, he concluded they weren't—this includes provisions in the GPLv3 that no other open source license has needed, obscure clauses, and text that should be moved to a FAQ. The GPL has a lot of historical baggage, and Fontana believes that the reduction in complexity of copyleft-next has led to a license that developers and lawyers alike can read and understand. Those readers interested in verifying this claim can find the current draft on Gitorious.
In order to show the drastic reduction in complexity, Fontana compared the word and line counts of several popular open source licenses. The word counts were as follows:
License                Words
copyleft-next 0.1.0     1423
Apache License 2.0      1581
GPLv1                   2063
MPL 2.0                 2435
GPLv2                   2968
GPLv3                   5644
For comparison, the MIT license consists of 162 words and the BSD 3-clause license has 212 words.
Copyleft-next has a number of interesting features. It offers outbound compatibility with the GPLv2 (or higher) and AGPLv3 (or higher), meaning that code covered by copyleft-next can be distributed under these licenses. This allows for experimentation in copyleft-next, Fontana explained. The license also simplifies compliance: when the source code is not shipped with a physical product, distributors do not have to give a written offer to supply the source code on CD or a similar medium. They can simply point to a URL where the source code can be found for two years. Like the GPLv3, copyleft-next allows license violations to be remedied within a certain time period (although compared to GPLv3 the provision has been simplified). In contrast to GPLv3, the current draft of copyleft-next doesn't contain an anti-Tivoization clause.
The copyleft-next license also takes a stance against certain practices detested by many community members. The license includes a proprietary-relicensing "poison pill": if the copyright holders offer proprietary relicensing, the copyleft requirements evaporate—the project effectively becomes a permissively licensed one, meaning that no single entity has a monopoly on offering proprietary versions. This provision was inspired by the Qt/KDE treaty, which says that the KDE Free Qt Foundation can release Qt under a BSD-style license if Qt is no longer offered under the LGPL 2.1. Furthermore, copyleft-next has an anti-badgeware provision: it explicitly excludes logos from the requirement to preserve author attributions.
While copyleft-next started as an exercise to simplify the GPLv3, it has incorporated ideas and concepts from other licenses in the meantime. For example, several provisions, such as the one explicitly excluding trademark grants, were inspired by or directly borrowed from MPL 2.0.
Fontana made the first release of copyleft-next, 0.1.0, just before FOSDEM and released version 0.1.1 in the interim. He mentioned during the talk that he is thinking of creating an Affero flavor of copyleft-next as well. He would like to see more participation from community members. The mailing list provides a good way to get started and the commit logs explain the rationale of changes in great detail.
LCA: The X-men speak
Linux.conf.au 2013 in Canberra provided an interesting window into the world of display server development with a pair of talks about the X Window System and one about its planned successor Wayland (a talk which will be the subject of its own article shortly). First, Keith Packard discussed coming improvements to compositing and rendering. He was followed by David Airlie, who talked about recent changes and upcoming new features for the Resize, Rotate and Reflect Extension (RandR), particularly to cope with multiple-GPU laptops. Each talk was entertaining enough in its own right, but they worked even better together as the speakers interjected their own comments into one another's Q&A period (or, from time to time, during the talks themselves).
Capacitance: sworn enemy of the X server
Packard kicked things off by framing recent work on the X server as
a battle against capacitance—more specifically, the excess power
consumption that adds up every time there is an extra copy operation
that could be avoided. Compositing application window contents and
window manager decorations together is the initial capacitance sink,
he said, since historically it required either copying an
application's content from one scanout buffer to another, or
repainting an entirely new buffer then doing a page-flip between the
back (off-screen) buffer and the front (on-screen) buffer. Either
option requires significant memory manipulation, which has steered the
direction of subsequent development, including DRI2, the rendering
infrastructure currently used by the X server.
But DRI2 has its share of other problems needing attention, he
said. For example, the Graphics Execution Manager (GEM) assigns its
own internal names called global GEM handles to the graphics memory it
allocates. These handles are simply integers, not references to
objects (such as file descriptors) that the kernel can
manage. Consequently, the kernel does not know which applications are
using any particular handle; it instead relies on every application to
"remember to forget the name" of each handle when it is
finished with it. But if one application discards the handle while
another application still thinks it is in use, the second application
will suddenly get whatever random data happens to get placed in the
graphics memory next—presumably by some unrelated application.
GEM handles have other drawbacks, including the fact that they bypass
the normal kernel security mechanisms (in fact, since the handles are
simple integers, they are hypothetically guessable). They are also
specific to GEM, rather than using general kernel infrastructure like
DMA-BUFs.
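The hazard Packard described can be illustrated with a toy model. The sketch below is not the GEM or DMA-BUF API; every class and method name in it is hypothetical, and "graphics memory" is reduced to a small array of strings. It assumes only what the talk described: a global name is a plain integer that nobody tracks, whereas a managed reference knows how many users it still has.

```java
// Toy model only: these classes are hypothetical and are not the GEM or DMA-BUF API.
class GlobalNameTable {
    private final String[] memory = new String[4]; // pretend graphics memory slots

    // Store contents in the first free slot; the slot index doubles as the global name.
    int allocate(String contents) {
        for (int name = 0; name < memory.length; name++) {
            if (memory[name] == null) {
                memory[name] = contents;
                return name;
            }
        }
        throw new IllegalStateException("out of slots");
    }

    // Anyone who knows (or guesses) the integer can read the slot.
    String read(int name) {
        return memory[name];
    }

    // Nothing records who else is still using the name.
    void forget(int name) {
        memory[name] = null;
    }
}

// A tracked reference: the buffer stays valid until the last user closes it.
class TrackedBuffer {
    private final String contents;
    private int users = 1;

    TrackedBuffer(String contents) {
        this.contents = contents;
    }

    TrackedBuffer share() {
        users++;
        return this;
    }

    void close() {
        users--;
    }

    String read() {
        if (users <= 0) {
            throw new IllegalStateException("buffer released");
        }
        return contents;
    }
}

class HandleDemo {
    // App A forgets the name while app B still holds the integer; the slot is then reused.
    static String staleRead() {
        GlobalNameTable table = new GlobalNameTable();
        int name = table.allocate("frame from app A");
        table.forget(name);
        table.allocate("data from unrelated app C");
        return table.read(name); // app B now sees app C's data
    }

    // With a tracked reference, A closing its share does not invalidate B's.
    static String trackedRead() {
        TrackedBuffer a = new TrackedBuffer("frame from app A");
        TrackedBuffer b = a.share();
        a.close();
        return b.read();
    }

    public static void main(String[] args) {
        System.out.println(staleRead());
        System.out.println(trackedRead());
    }
}
```

In the first case the second reader silently receives another application's data; in the second, the data stays valid for as long as any holder remains.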
DRI2 also relies on the X server to allocate all buffers, so applications must first request an allocation, then wait for the X server to return one. The extra round trip is a problem on its own, but server allocation of buffers also breaks resizing windows, since the X server immediately allocates a new, empty back buffer. The application does not find out about the new allocation until it receives and processes the (asynchronous) event message from the server, however, so whatever frame the application was drawing can simply get lost.
The plan is to fix these problems in DRI2's successor, which Packard referred to in slides as "DRI3000" because, he said, it sounded futuristic. This DRI framework will allow clients, not the X server, to allocate buffers, will use DMA-BUF objects instead of global GEM handles, and will incorporate several strategies to reduce the number of copy operations. For example, as long as the client application is allocating its own buffer, it can allocate a little excess space around the edges so that the window manager can draw window decorations around the outside. Since most of the time the window decorations are not animated, they can be reused from one frame to the next. Compositing the window and decoration will thus be faster than in the current model, which copies the application content on every frame just to draw the window decorations around it. Under the new scheme, if the client knows that the application state has not changed, it does not need to trigger a buffer swap.
Moving buffer management out of the X server and into the client has other benefits as well. Since the clients allocate the buffers they use, they can also assign stable names to the buffers (rather than the global GEM handles currently assigned by the server), and they can be smarter about reusing those buffers, such as by marking the freshness of each in the EGL_buffer_age extension. If the X server has just performed a swap, it can report back that the previous front buffer is now idle and available. But if the server has just performed a blit (copying only a small region of updated pixels), it could instead report back that the just-used back buffer is idle.
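One way a client might use buffer freshness is sketched below. The talk did not go into this level of detail, so the sketch assumes the usual buffer-age convention: an age of 0 means the buffer's contents are undefined, and an age of n means the buffer holds the frame drawn n frames ago. The DamageTracker class and its string "regions" are invented for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical helper: decides what to repaint into a reused buffer.
class DamageTracker {
    // Damage of previous frames, most recent first.
    private final Deque<String> history = new ArrayDeque<>();

    // Returns the regions to repaint for a buffer of the given age,
    // then records the current frame's damage for later frames.
    List<String> regionsToRepaint(int bufferAge, String currentDamage) {
        List<String> regions = new ArrayList<>();
        int pastFramesNeeded = bufferAge - 1;
        if (bufferAge <= 0 || pastFramesNeeded > history.size()) {
            // Unknown contents, or not enough history: repaint everything.
            regions.add("FULL");
        } else {
            // The buffer is missing this frame's changes plus those of the
            // (age - 1) frames drawn since it was last used.
            regions.add(currentDamage);
            int taken = 0;
            for (String past : history) {
                if (taken++ >= pastFramesNeeded) {
                    break;
                }
                regions.add(past);
            }
        }
        history.addFirst(currentDamage);
        return regions;
    }
}
```

A buffer of age 1 therefore needs only the current frame's changes, while an older buffer needs progressively more, which is why knowing which buffer has just become idle matters.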
There are also copy operations to be trimmed out in other ways,
such as by aligning windows with GPU memory page boundaries. This
trick is currently only doable on Intel graphics hardware, Packard
said, but results in about a 50% improvement gain from the status quo
to the hypothetical upper limit. He already has much of the
DRI-replacement work functioning ("at least on my machine") and is
targeting X server 1.15 for its release. The
page-swapping tricks are not as close to completion; a new kernel
ioctl() has been written to allow exchanging chunks of GPU
pages, but the page-alignment code is not yet implemented.
New tricks for new hardware
Airlie's talk focused more on supporting multiple displays and
multiple graphics cards. This was not an issue in the early days, of
course, when the typical system had one graphics card tied to one
display; a single Screen (as defined by the X Protocol) was
sufficient. The next step up was simply to run a separate Screen for
the second graphics card and the second display—although, on the
down side, running two separate screens meant it was not possible to
move windows from one display to the other. Similar was "Zaphod"
mode, a configuration in which one graphics card was used to drive two
displays on two separate Screens. The trick was that Zaphod mode used
two copies of the GPU driver, with one attached to each screen. Here
again, two Screens meant that it was not possible to move windows
between displays.
Things started getting more interesting with Xinerama, however. Xinerama mode introduced a "fake" Screen wrapped around the two real Screens. Although this approach allowed users to move windows between their displays, it did this at the high cost of keeping two copies of every window and pixmap, one for each real Screen. The fake Screen approach had other weaknesses, such as the fact that it maintained a strict mapping to objects on the real, internal Screens—which made hot-plugging (in which the real objects might appear and disappear instantly) impossible.
Thankfully, he said, RandR 1.2 changed this, giving us for the first time the ability to drive two displays with one graphics card, using one Screen. "It was like ... sanity", he concluded, giving people what they had long wanted for multiple-monitor setups (including temporarily connecting an external projector for presentations). But the sanity did not last, he continued, because vendors started manufacturing new hardware that made his life difficult. First, multi-seat/multi-head systems came out of the woodwork, such as USB-to-HDMI dongles and laptop docking stations. Second, laptops began appearing with multiple GPUs, which "come in every possible way to mess up having two GPUs". He has one laptop, for example, which has the display-detection lines connected to both GPUs ... even though only one of the GPUs could actually output video to the connected display.
RandR 1.4 solves hot-plugging of displays, work which required adding support for udev, USB, and other standard kernel interfaces to the X server, which up until then had used its own methods for bus probing and other key features. RandR 1.4's approach to USB hot-plugging worked by having the main GPU render everything, then having the USB GPU simply copy the buffers out, performing its own compression or other tricks to display the content on the USB-attached display. RandR also allowed the X server to offload drawing part of the screen (such as a game) in one window on the main display.
The future of RandR includes several new functions, such as "simple" GPU switching. This is the relatively straightforward-sounding action of switching display rendering from running on one GPU to the other. Some laptops have a hardware switch for this function, he said, while others do it in software. Another new feature is what Airlie calls "Shatter," which splits up rendering of a single screen between multiple GPUs.
Airlie said he has considered several approaches to getting to this future, but at the moment Shatter seems to require adding a layer of abstraction he called an "impedance" layer between X server objects and GPU objects. The impedance layer tracks protocol objects and damage events and converts them into GPU objects. "It's quite messy", he said, describing the impedance layer as a combination of the X server's old Composite Wrapper layer and the Xinerama layer "munged" together. Nevertheless, he said, it is preferable to the other approach he explored, which would rely on pushing X protocol objects down to the GPU layer. At the moment, he said, he has gotten the impedance layer to work, but there are some practical problems, including the fact that so few people do X development that there are only one or two people who would be qualified to review the work. He is likely to take some time off to try and write a test suite to aid further development.
Marking the spot
It is sometimes tempting to think of X as a crusty old relic, and indeed both Packard and Airlie poked fun at the display server system and its quirks more than once. But what both talks made clear was that even if the core protocol is to be replaced, that does not reduce window management, compositing, or rendering to a trivial problem. The constantly changing landscape of graphics hardware and the ever-increasing expectations of users will certainly see to that.
LCA: The ways of Wayland
Collabora's Daniel Stone presented the final piece of the linux.conf.au 2013 display server triptych, which started with a pair of talks from Keith Packard and David Airlie. Stone explained the concepts behind Wayland and how it relates to X11 because, as he put it, "everything you read on the Internet about it will be wrong."
The Dark Ages
Stone, who said that he was "tricked into" working on X about ten years ago, reviewed X11's history, starting with the initial assumption of single-keyboard, single-mouse systems with graphics hardware focused on drawing rectangles, blitting images, and basic window management. But then, he continued, hardware got complicated (from multiple input devices to multiple GPUs), rendering got complicated (with OpenGL and hardware-accelerated video decoding), and window management got awful (with multiple desktop environments, new window types, and non-rectangular windows). As time passed, things slowly got out of hand for X; what was originally a well-defined mechanism swelled to incorporate dozens of protocol extensions and thousands of pages of specifications, although on the latter point, Packard chimed in to joke that the X developers never wrote anything that could be called specifications.
The root of the trouble, Stone said, was that, thanks to politics and an excessive commitment to maintaining backward compatibility even with ancient toolkits, no one was allowed to touch the core protocol or the X server core, even as the needs of the window system evolved and diverged. For one thing, the XFree86 project, where much of the development took place, was not itself the X Consortium. For another, "no one was the X Consortium; they weren't doing anything." As a result, more and more layers got wrapped around the X server, working around deficiencies rather than fixing them. Eventually, the X server evolved into an operating system: it could run video BIOSes, manage system power, perform I/O port and PCI device management, and load multiple binary formats. But in spite of all these features, he continued, it was "the dumbest OS you've ever seen". For example, it could generate a configuration file for you, but it was not smart enough to just use the correct configuration.
Light at the end of the tunnel
Things did improve, he said. When the X.Org Foundation was formed, the project gained a cool domain name, but it also undertook some overdue development tasks, such as modularizing the X server. The initial effort may have been too modular, he noted, splitting into 345 git modules, but for the most part it was a positive. Using autotools, the X server was actually buildable. Modularization allowed X developers to excise old and unused code; Stone said the pre-refactoring xserver 1.0.2 release contained 879,403 lines, compared to 562,678 lines today.
But soon they began adding new features again, repeating the pile-of-extensions model. According to his calculations, today X includes a new drawing model (XRender), four input stacks (core X11, XInput 1.0, 2.0, and 2.2), five display management extensions (core X11, Xinerama, and the three generations of RandR that Airlie spoke about), and four buffer management models (core X11, DRI, MIT-SHM, and DRI2). At that point, the developers had fundamentally changed how X did everything, and as users wanted more and more features, those features got pushed out of X and into the client side (theming, fonts, subwindows, etc.), or to the window manager (e.g., special effects).
That situation leaves the X server itself with very little to do. Client applications draw everything locally, and the X server hands the drawing to the window manager to render it. The window manager hands back the rendered screen, and the X server "does what it's told" and puts it on the display. Essentially, he said, the X server is nothing but a "terrible, terrible, terrible" inter-process communication (IPC) bus. It is not introspectable, and it adds considerable (and variable) overhead.
Wayland, he said, simply cuts out all of the middleman steps that the X server currently consumes CPU cycles performing. Client applications draw locally, they tell the display server what they have drawn, and the server decides what to put onto the display and where. Commenters in the "Internet peanut gallery" sometimes argue that X is "the Unix way," he said. But Wayland fits the "do one thing, do it well" paradigm far better. "What one thing is X doing, and what is it doing well?"
The Wayland forward
Stone then turned his attention to providing a more in-depth description of how Wayland works. The first important idea is that in Wayland, every frame is regarded as "perfect." That is, the client application draws it in a completed form, as opposed to X, where different rectangles, pixmaps, and text can all be sent separately by the client, which can result in inconsistent on-screen behavior. DRI2 almost (but not quite) fixed this, but it had limitations (chiefly that it had to adhere to the core X11 protocol).
Wayland is also "descriptive" and not "prescriptive," he said. For example, in X, auxiliary features like pop-up windows and screensavers are treated exactly like application windows: they grab keyboard and window input and must be positioned precisely on screen. Unpleasant side effects result, such as being unable to use the volume keys when a screensaver is active, and being unable to trigger the screensaver when a menu is open on the screen. With Wayland, in contrast, the application tells the server that a frame is a pop-up and lets the compositor decide how to handle it. Yes, he said, it is possible that someone would write a bad compositor that would mishandle such a pop-up, but that is true today as well. Window managers are also complex today; the solution is to not run the bad ones.
Wayland also uses an event-driven model, which simplifies (among other things) listening for input devices. Rather than asking the server for a list of initial input devices which must be parsed (and is treated separately from subsequent device notifications), clients simply register for device notifications, and the Wayland server sends the same type of message for existing devices as it does for any subsequent hot-plugging events. Wayland also provides "proper object lifetimes", which eliminates X11's fatal-by-default and hard-to-work-around BadDevice errors. Finally, it side-steps the problem that can occur when a toolkit (such as GTK+ or Clutter) and an application support different versions of the XInput extension. In X, the server only gets one report from the application about which version is supported; whether that equals the toolkit's or the application's version is random. In Wayland, each component registers and listens for events separately.
Go Weston
Stone capped off the session with a discussion about Weston, the reference implementation of a Wayland server, its state of readiness, and some further work still in the pipeline. Weston is reference code, he explained. Thus it has plugin-based "shells" for common desktop features like docks and panels, and it supports existing X application clients. It offers a variety of output and rendering choices, including fbdev and Pixman, which he pointed out to refute the misconception that Wayland requires OpenGL. It also supports hardware video overlays, which he said will be of higher quality than the X implementation.
The GNOME compositor Mutter has an out-of-date port to Wayland, he continued, making it in essence a hybrid X/Wayland compositor, as is Weston. GNOME Shell used to run on Mutter's Wayland implementation, he said, or at least "someone demoed it once in July ... so it's ready for the enterprise". In fact, Stone is supposed to bring the GNOME Shell code up to date, but he has not yet had time. There are implementations for GTK+, Clutter, and Qt all in upstream git, and there is a GStreamer waylandvideosink element, although it needs further work. In reply to a question from the audience, Stone also commented that Weston's touchpad driver is still incomplete, lacking support for acceleration and scrolling.
Last but clearly not least, Stone addressed the state of Wayland support for remoting. X11's lousy implementation of IPC, he said, in which it acts as a middleman between the client and compositor, hits its worst-case performance when being run over the Internet. Furthermore, the two rendering modes every application uses (SHM and DRI2) do not work over the network anyway. The hypothetical "best" way to implement remoting support, he explained, would be for the client application to talk to the local compositor only, and have that compositor speak to the remote compositor, employing image compression to save bandwidth. That, he said, is precisely what VNC does, and it is indeed better than X11's remote support. Consequently, Wayland developer Kristian Høgsberg has been experimenting with implementing this VNC-like remoting support in Weston, which has its own branch interested parties can test. "We think it's going to be better at remoting than X", Stone said, or at least it cannot be worse than X.
For end users, it will still be a while before Wayland is usable on Linux desktops outside of experimental circumstances. The protocol was declared 1.0 in October 2012, as was Weston, but Weston is still a reference implementation (lacking features, as Stone described in his talk). It may be a very long time before applications are ported from X11 to Wayland, but by providing a feature-by-feature comparison of Wayland's benefits over X, Stone has crafted a good sales pitch for both application developers and end users.
Security
Recent Java vulnerabilities
Since August 2012, there has been increasing buzz about security holes in Oracle's Java implementation. The hubbub reached such proportions that US National Public Radio (NPR) stations were heard repeating recommendations (originating from CERT) that people disable all Java plugins on their systems. The noise started when Security Explorations (SE), a one-person Polish company run by Adam Gowdiak, went public about security vulnerabilities after malware was detected that exploited two issues SE had reported to Oracle in April 2012.
On February 1, 2013, Oracle released a new version of Java that fixed most of the issues that SE uncovered, with the exception of one, identified as "issue #51." This article describes the history of this process, where the security vulnerabilities were detected, the different kinds of vulnerabilities involved, and how all of these relate to OpenJDK.
SE has been examining security issues in Java, and detecting points of attack, since 2002. It has worked closely and, for the most part, amicably with Sun and Oracle, and has exercised what has been called "responsible disclosure," (i.e. notifying companies about their vulnerabilities, and refusing to release details about known holes, until the companies have had time to fix them). This relationship was strained during 2012 after Oracle failed to address all of the issues that SE had reported to it in the patches to Java over a six-month period. After the malware attack of August 2012, SE went public, presenting at Devoxx [PDF], and releasing a technical report [PDF] in November 2012. These disclosures claim that, as far back as 2005, SE reported to Sun on many of the weaknesses that led to the current issues. Both disclosures detail the specific issues detected by SE and ways to exploit them to effect a complete security compromise of a Java installation.
Around the same time, SE stepped up its research, finding not only 31 issues in Oracle's Java, but 17 in IBM's version, and 2 in Apple's. Of these, 17 Oracle bugs could result in a full compromise of the Java security sandbox, which is the means by which Java isolates potentially untrustworthy software. Since OpenJDK uses the same code base as Oracle's Java, those issues were present in OpenJDK as well. SE's November technical report lists 50 known issues in total. Several more were reported to Oracle by SE since that report, but are not yet public. After Oracle's February update of Java only one issue, issue #51, remains unresolved.
Most of the issues discovered relate to the Java Reflection API. This is a powerful tool that provides for dynamic loading of classes, as well as access to their members, and is what makes component architectures, like Java Beans, possible. However, there is inherent risk in the very nature of allowing access across unknown classes.
The kinds of access allowed include:
- Obtaining an object of a given class, given the name of the class, via forName().
- Obtaining the methods of a class using getMethods().
- Invoking a method in another class via the method invoke(), which allows the caller to provide the arguments to the called methods.
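The three operations listed above can be sketched in a few lines of ordinary, unprivileged Java. This is an illustration of normal Reflection API use, not of any exploit. The class ReflectionBasics and its helper methods are invented for this example, and it uses getMethod(), the single-method sibling of getMethods(), to pick the method to call.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; only the java.lang and java.lang.reflect calls are real API.
class ReflectionBasics {
    // Look up a class by name, find a public no-argument method, and invoke it on target.
    static Object callNoArg(String className, String methodName, Object target) {
        try {
            Class<?> cls = Class.forName(className);  // class name -> Class object
            Method method = cls.getMethod(methodName); // public method taking no parameters
            return method.invoke(target);              // dynamic call; primitives come back boxed
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Collect the names of all public methods of a class, including inherited ones.
    static List<String> publicMethodNames(String className) {
        try {
            List<String> names = new ArrayList<>();
            for (Method method : Class.forName(className).getMethods()) {
                names.add(method.getName());
            }
            return names;
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(callNoArg("java.lang.String", "length", "hello"));
        System.out.println(publicMethodNames("java.lang.String").contains("toUpperCase"));
    }
}
```

Note that both the class and the method are chosen by strings at run time; that flexibility is what component architectures rely on, and it is also why the parameters of such calls are security-sensitive when privileged code makes them.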
There are Field and Method classes that correspond to the underlying fields and methods, as well as a Constructor class that allows you to create new instances of classes. These all inherit from the java.lang.reflect.AccessibleObject class, which has a private field called "override". If override is true then operations and accesses are allowed to the caller regardless of the caller's privileges.
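The effect of that flag can be seen from ordinary code, because the public setAccessible() method of AccessibleObject is the supported way to request that access checks be skipped. The sketch below is a minimal illustration with invented classes (Vault and OverrideDemo); it shows the legitimate mechanism only, and assumes no security manager is installed to veto the request.

```java
import java.lang.reflect.Field;

// Hypothetical class with a field that normal code outside it cannot touch.
class Vault {
    private String secret = "s3cret";
}

// Hypothetical demo: reads Vault.secret with and without suppressing access checks.
class OverrideDemo {
    static boolean canReadSecret(boolean suppressChecks) {
        try {
            Field field = Vault.class.getDeclaredField("secret");
            if (suppressChecks) {
                // Requests that Java-language access checks be skipped for this Field object.
                field.setAccessible(true);
            }
            return "s3cret".equals(field.get(new Vault()));
        } catch (IllegalAccessException e) {
            // Thrown when the access check runs and the caller is not allowed in.
            return false;
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(canReadSecret(false));
        System.out.println(canReadSecret(true));
    }
}
```

Without the call, the read is refused; with it, the private field is readable. Sandboxed code is normally not permitted to make that request, which is why any path that flips the flag on its behalf matters.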
In its research, SE found numerous places where combinations of misuses of forName() and invoke(), along with improper access to the override field allowed systems to become vulnerable. In addition, there is a type field of the Field object that represents the type of an underlying object. In the technical report, Gowdiak imagined a scenario where:
SE further asserts that one can impersonate trusted callers via controlling the parameters of Reflection API calls made by system classes.
In Java 7, Oracle added another level of security, via indirection, called a "lookup class." What SE found was that the lookup classes themselves were vulnerable. The security check is conducted in the MethodHandles.Lookup class prior to any method handle creation. This check allows for access to arbitrary members (methods, constructors, and fields) of restricted classes if the lookup object and a target class are from the same class loader namespace. Also, by default, a lookup object instance uses the caller of the MethodHandles.lookup() method as its lookup class. Therefore, a security breach can be effected by calling this method from system code to create a lookup object with a system class.
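The way a lookup object carries the privileges of the class that created it can be shown with a small, harmless sketch. LookupDemo, hidden(), ownLookup(), and callHiddenWith() are names made up for this example; the MethodHandles calls are the standard Java 7 API. This is not one of SE's exploits, only an illustration of the underlying rule.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class LookupDemo {
    // A private method that only LookupDemo itself should be able to reach.
    private static String hidden() {
        return "private result";
    }

    // The lookup class of the returned object is LookupDemo, because
    // LookupDemo is the caller of MethodHandles.lookup().
    public static MethodHandles.Lookup ownLookup() {
        return MethodHandles.lookup();
    }

    // Try to reach hidden() using whatever lookup object was handed in.
    public static String callHiddenWith(MethodHandles.Lookup lookup) {
        try {
            MethodHandle handle = lookup.findStatic(
                    LookupDemo.class, "hidden", MethodType.methodType(String.class));
            return (String) handle.invokeExact();
        } catch (IllegalAccessException e) {
            return "denied";
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(callHiddenWith(MethodHandles.publicLookup()));
        System.out.println(callHiddenWith(ownLookup()));
    }
}
```

The access check happens when the handle is created, against the lookup class, rather than when the handle is invoked. Whoever ends up holding a lookup object created inside a privileged class therefore inherits that class's view of the world.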
In SE's technical report there are numerous examples of all the exploitation vectors that they used to compromise the Java security sandbox. All were combinations of the weaknesses described above, since no one weakness by itself was sufficient to escape the sandbox. There are a number of consequences of these exploits, including: an attacker could define a class and cause it to be loaded into a privileged class loader namespace; security checking could be completely turned off (via calling setSecurityManager() with a null argument); permissions of an unsafe object could be changed at will; malicious classes could inherit from privileged classes and redefine trusted methods with malicious ones; or any combination of those.
The relationship of OpenJDK to Oracle's Java Standard Edition (SE) is
complex. OpenJDK is
the reference implementation for Oracle's Java SE. However, bug fixes do
not automatically propagate from one to the other (in either direction),
since the projects are developed independently. That said, one week after Oracle
released its fixes to Java, OpenJDK 7 was updated to
reflect all of the fixes. OpenJDK users will want to upgrade at the
first opportunity.
In its report, SE noted that it searched for holes in Java precisely because Java's security is so good. A more timely response from Oracle might have been desirable. However, at this point, nine months from when Oracle learned of the deficiencies, it issued a release that fixed all of the bugs detailed in SE's technical report, as well as several that were only identified in the last couple months.
[The author wishes to thank the many contributors to the Fedora project's Java developers list, who provided valuable information on the upgrades to OpenJDK and their relationship to Oracle's releases. A special shout out goes to Omair Majid, who provided links to the information as well.]
A survey on vulnerability and update information in LWN
Here at LWN, we are considering making some changes to how we handle security advisories from distributors and the vulnerabilities to which they refer. Before doing anything rash, though, we'd like to ask you, our readers, what you think. If you have a moment, please have a look at this article containing a discussion of the situation and a quick survey on how useful our update and vulnerability information is now. The answers we get will guide us in any changes that we may decide to make.
Brief items
Security quotes of the week
Emont: Video decoding in a sandbox
Guillaume Emont describes his work using the Chromium sandbox mechanism to make video decoding in GStreamer more secure. "The way setuid-sandbox works is rather straightforward: there is a sandboxme command that needs to be installed setuid root. You run sandboxme my_command and then from inside my_command, you first set up the file descriptors that you will need (being careful not to put there anything that could allow to escape the sandbox, more on that later), and then you call the provided chrootme() function, which will tell the sandboxme process to restrict the privileges that my_command has (e.g. it can still read and write on the fds that it has open, but it cannot open new ones)."
New vulnerabilities
android-tools: temporary file vulnerability
| Package(s): | android-tools | CVE #(s): | CVE-2012-5564 |
| Created: | February 10, 2013 | Updated: | February 13, 2013 |
| Description: | The adb tool creates a log file under /tmp with a static name, making it vulnerable to symbolic link attacks. |
curl: code execution
| Package(s): | curl | CVE #(s): | CVE-2013-0249 |
| Created: | February 8, 2013 | Updated: | February 25, 2013 |
| Description: | From the cURL advisory: libcurl is vulnerable to a buffer overflow vulnerability when communicating with one of the protocols POP3, SMTP or IMAP. When negotiating SASL DIGEST-MD5 authentication, the function Curl_sasl_create_digest_md5_message() uses the data provided from the server without doing the proper length checks and that data is then appended to a local fixed-size buffer on the stack. This vulnerability can be exploited by someone who is in control of a server that a libcurl based program is accessing with POP3, SMTP or IMAP. For applications that accept user provided URLs, it is also thinkable that a malicious user would feed an application with a URL to a server hosting code targetting this flaw. This vulnerability can be used for remote code execution (RCE) on vulnerable systems. |
dnsmasq: access restriction bypass
| Package(s): | dnsmasq | CVE #(s): | CVE-2013-0198 |
| Created: | February 7, 2013 | Updated: | February 18, 2013 |
| Description: | From the Mageia advisory: This update completes the fix for CVE-2012-3411 provided with dnsmasq-2.63. It was found that after the upstream patch for CVE-2012-3411 issue was applied, dnsmasq still: - replied to remote TCP-protocol based DNS queries (UDP protocol ones were corrected, but TCP ones not) from prohibited networks, when the --bind-dynamic option was used, - when --except-interface lo option was used dnsmasq didn't answer local or remote UDP DNS queries, but still allowed TCP protocol based DNS queries, - when --except-interface lo option was not used local / remote TCP DNS queries were also still answered by dnsmasq. |
drupal: multiple vulnerabilities
| Package(s): | drupal | CVE #(s): | |
| Created: | February 7, 2013 | Updated: | February 13, 2013 |
| Description: | From the Mageia bug report: Multiple vulnerabilities were fixed in the supported Drupal core versions 7(DRUPAL-SA-CORE-2013-001). * A reflected cross-site scripting vulnerability (XSS) was identified in certain Drupal JavaScript functions that pass unexpected user input into jQuery causing it to insert HTML into the page when the intended behavior is to select DOM elements. Multiple core and contributed modules are affected by this issue. * A vulnerability was identified that exposes the title or, in some cases, the content of nodes that the user should not have access to. * Drupal core provides the ability to have private files, including images. A vulnerability was identified in which derivative images (which Drupal automatically creates from these images based on "image styles" and which may differ, for example, in size or saturation) did not always receive the same protection. Under some circumstances, this would allow users to access image derivatives for images they should not be able to view. |
gnome-screensaver: unauthorized session access
| Package(s): | gnome-screensaver | CVE #(s): | CVE-2013-1050 |
| Created: | February 12, 2013 | Updated: | February 13, 2013 |
| Description: | From the Ubuntu advisory: It was discovered that gnome-screensaver did not start automatically after logging in. This may result in the screen not being automatically locked after the inactivity timeout is reached, permitting an attacker with physical access to gain access to an unlocked session. |
gnutls: plaintext recovery
| Package(s): | gnutls | CVE #(s): | CVE-2013-1619 |
| Created: | February 13, 2013 | Updated: | September 3, 2013 |
| Description: | From the CVE entry: The TLS implementation in GnuTLS before 2.12.23, 3.0.x before 3.0.28, and 3.1.x before 3.1.7 does not properly consider timing side-channel attacks on a noncompliant MAC check operation during the processing of malformed CBC padding, which allows remote attackers to conduct distinguishing attacks and plaintext-recovery attacks via statistical analysis of timing data for crafted packets, a related issue to CVE-2013-0169. |
gnutls: denial of service
| Package(s): | gnutls | CVE #(s): | CVE-2012-1663 |
| Created: | February 12, 2013 | Updated: | February 13, 2013 |
| Description: | From the CVE entry: Double free vulnerability in libgnutls in GnuTLS before 3.0.14 allows remote attackers to cause a denial of service (application crash) or possibly have unspecified other impact via a crafted certificate list. |
ircd-hybrid: denial of service
| Package(s): | ircd-hybrid | CVE #(s): | CVE-2013-0238 |
| Created: | February 8, 2013 | Updated: | April 10, 2013 |
| Description: | From the Debian advisory: Bob Nomnomnom reported a Denial of Service vulnerability in IRCD-Hybrid, an Internet Relay Chat server. A remote attacker may use an error in the masks validation and crash the server. |
kernel: denial of service
| Package(s): | kernel | CVE #(s): | CVE-2013-0231 |
| Created: | February 8, 2013 | Updated: | June 14, 2013 |
| Description: | From the Xen advisory: Xen's PCI backend drivers in Linux allow a guest with assigned PCI device(s) to cause a DoS through a flood of kernel messages, potentially affecting other domains in the system. |
kernel: privilege escalation
| Package(s): | kernel | CVE #(s): | CVE-2013-0268 |
| Created: | February 10, 2013 | Updated: | July 12, 2013 |
| Description: | The kernel's MSR register driver relied only upon filesystem-level access checks to restrict users who could write registers. As a result, the root user could access registers even if the capabilities that would ordinarily restrict such activity (CAP_SYS_RAWIO) had been dropped. The consequences are severe (execution of arbitrary code in kernel mode), but exploitation requires a process already running as root. |
mariadb: password brute-force vulnerability
| Package(s): | mariadb | CVE #(s): | CVE-2012-5627 |
| Created: | February 10, 2013 | Updated: | February 13, 2013 |
| Description: | The mariadb COM_CHANGE_USER operation fails to abort the session when an incorrect password is supplied, enabling many passwords to be tried in quick succession. |
mysql/mariadb: information disclosure
| Package(s): | mariadb mysql | CVE #(s): | CVE-2012-5615 |
| Created: | February 10, 2013 | Updated: | August 20, 2015 |
| Description: | The mysql / mariadb server provides different authentication error messages depending on whether the provided user name exists or not. |
openssh: denial of service
| Package(s): | openssh | CVE #(s): | CVE-2010-5107 |
| Created: | February 13, 2013 | Updated: | February 25, 2016 |
| Description: | From the Red Hat bugzilla: A denial of service flaw was found in the way default server configuration of OpenSSH, a open source implementation of SSH protocol versions 1 and 2, performed management of its connection slot. A remote attacker could use this flaw to cause connection slot exhaustion on the server. |
openssl: multiple vulnerabilities
| Package(s): | openssl | CVE #(s): | CVE-2013-0166 CVE-2013-0169 |
| Created: | February 8, 2013 | Updated: | May 15, 2013 |
| Description: | From the OpenSSL advisory: SSL, TLS and DTLS Plaintext Recovery Attack (CVE-2013-0169) Nadhem Alfardan and Kenny Paterson have discovered a weakness in the handling of CBC ciphersuites in SSL, TLS and DTLS. Their attack exploits timing differences arising during MAC processing. Details of this attack can be found at: http://www.isg.rhul.ac.uk/tls/ TLS 1.1 and 1.2 AES-NI crash (CVE-2012-2686) A flaw in the OpenSSL handling of CBC ciphersuites in TLS 1.1 and TLS 1.2 on AES-NI supporting platforms can be exploited in a DoS attack. |
postgresql: information disclosure/denial of service
| Package(s): | postgresql | CVE #(s): | CVE-2013-0255 |
| Created: | February 11, 2013 | Updated: | February 21, 2013 |
| Description: | From the Red Hat bugzilla: An array index error, leading to out of heap-based buffer bounds read flaw was found in the way PostgreSQL, an advanced Object-Relational database management system (DBMS), performed retrieval of textual form of error message representation when processing certain enumeration types. An unprivileged database user could issue a specially-crafted SQL query that, when processed by the server component of the PostgreSQL service, would lead to denial of service (daemon crash) or disclosure (of certain portions of) server memory. |
qt: information disclosure
| Package(s): | qt | CVE #(s): | CVE-2013-0254 |
| Created: | February 13, 2013 | Updated: | March 22, 2013 |
| Description: | From the Red Hat bugzilla: A security flaw was found in the way QSharedMemory class implementation of the Qt toolkit created shared memory segments (they were created with world-readable and world-writeable permissions). A local attacker could use this flaw to read or alter content of particular shared memory segment, possibly leading to their ability to obtain sensitive information or influence behaviour of shared memory segment reader process. |
rails: protection bypass/code execution
| Package(s): | rails | CVE #(s): | CVE-2013-0276 CVE-2013-0277 |
| Created: | February 13, 2013 | Updated: | March 15, 2013 |
| Description: | From the CVE entries: ActiveRecord in Ruby on Rails 3.2.x before 3.2.12, 3.1.x before 3.1.11, and 2.3.x before 2.3.17 allows remote attackers to bypass the attr_protected protection mechanism and modify protected model attributes via a crafted request. (CVE-2013-0276) Active Record in Ruby on Rails 3.x before 3.1.0 and 2.3.x before 2.3.17 allows remote attackers to cause a denial of service or execute arbitrary code via crafted serialized attributes that cause the +serialize+ helper to deserialize arbitrary YAML. (CVE-2013-0277) |
sssd: file modification and denial of service
| Package(s): | sssd | CVE #(s): | CVE-2013-0220 CVE-2013-0219 |
| Created: | February 10, 2013 | Updated: | October 11, 2013 |
| Description: | The system security services daemon suffers from two vulnerabilities: |
vlc: two code execution flaws
| Package(s): | vlc | CVE #(s): | |
| Created: | February 7, 2013 | Updated: | February 13, 2013 |
| Description: | From the Videolan advisories [1, 2]: Summary : Buffer overflows in freetype renderer and HTML subtitle parser When parsing a specially crafted file, a buffer overflow might occur. If successful, a malicious third party could trigger an invalid memory access, leading to a crash of VLC or arbitratry code execution. Summary : Buffer Overflow in ASF Demuxer When parsing a specially crafted ASF movie, a buffer overflow might occur. If successful, a malicious third party could trigger an invalid memory access, leading to a crash of VLC media player's process. In some cases attackers might exploit this issue to execute arbitrary code within the context of the application but this information is not confirmed. |
wireshark: multiple vulnerabilities
| Package(s): | wireshark | CVE #(s): | CVE-2013-1572 CVE-2013-1573 CVE-2013-1574 CVE-2013-1575 CVE-2013-1576 CVE-2013-1577 CVE-2013-1578 CVE-2013-1579 CVE-2013-1580 CVE-2013-1581 CVE-2013-1582 CVE-2013-1583 CVE-2013-1584 CVE-2013-1585 CVE-2013-1586 CVE-2013-1587 CVE-2013-1588 CVE-2013-1589 CVE-2013-1590 |
| Created: | February 12, 2013 | Updated: | March 8, 2013 |
| Description: | From the openSUSE advisory: wireshark 1.8.5 fixes bugs and security issues. Vulnerabilities fixed: |
wordpress: cross-site scripting and request forgery
| Package(s): | wordpress | CVE #(s): | CVE-2013-0235 CVE-2013-0236 CVE-2013-0237 |
| Created: | February 10, 2013 | Updated: | July 2, 2013 |
| Description: | The wordpress publishing system suffers from two cross-site scripting vulnerabilities and one server-side request forgery vulnerability that might be exploitable to compromise a site. See the wordpress 3.5.1 release announcement for more information. |
Page editor: Jake Edge
Kernel development
Brief items
Kernel release status
The current development kernel is 3.8-rc7, released on February 8. Linus says: "Anyway, here it is. Mostly driver updates (usb, networking, radeon, regulator, sound) with a random smattering of other stuff (btrfs, networking, so on. And most everything is pretty small."
Stable updates: the 3.7.7, 3.4.30 and 3.0.63 updates were released on February 11; 3.5.7.5 was released on February 8.
The 3.7.8, 3.4.31, and 3.0.64 updates are in the review process as of this writing; they can be expected on or after February 14.
Quotes of the week
We are not going away, we are here to stay. We cannot be silenced or stopped anymore, and we are becoming harder and harder to ignore.
It is only a matter of time before we produce an open source graphics driver stack which rivals your binary in performance. And that time is measured in weeks and months now. The requests from your own customers, for support for this open source stack, will only grow louder and louder.
So please, stop fighting us. Embrace us. Work with us. Your customers and shareholders will love you for it.
Kroah-Hartman: AF_BUS, D-Bus, and the Linux kernel
Greg Kroah-Hartman writes about plans to get D-Bus functionality into the kernel (a topic last covered here in July, 2012). "Our goal (and I use 'goal' in a very rough term, I have 8 pages of scribbled notes describing what we want to try to implement here), is to provide a reliable multicast and point-to-point messaging system for the kernel, that will work quickly and securely. On top of this kernel feature, we will try to provide a 'libdbus' interface that allows existing D-Bus users to work without ever knowing the D-Bus daemon was replaced on their system."
Kernel development news
Some 3.8 development statistics
The release of 3.8-rc7 suggests that the 3.8 development cycle is nearing its close. This has been a busy cycle indeed, with, as of this writing, just over 12,300 non-merge changesets finding their way into the mainline. That makes 3.8 the most active development cycle ever, edging out 2.6.25 and its mere 12,243 changesets. Like it or not, the time for the traditional statistics article has come around; this time, though, your editor has tried looking at things in a different way.
But, before getting to that, here are the usual numbers. As of this writing, some 1,253 developers have contributed code to the 3.8 kernel. The most active of those were:
Most active 3.8 developers
By changesets
| H Hartley Sweeten | 426 | 3.5% |
| Bill Pemberton | 381 | 3.1% |
| Philipp Reisner | 238 | 1.9% |
| Andreas Gruenbacher | 210 | 1.7% |
| Lars Ellenberg | 146 | 1.2% |
| Mark Brown | 143 | 1.2% |
| Sachin Kamat | 135 | 1.1% |
| Al Viro | 127 | 1.0% |
| Tomi Valkeinen | 115 | 0.9% |
| Wei Yongjun | 114 | 0.9% |
| Axel Lin | 112 | 0.9% |
| Johannes Berg | 104 | 0.8% |
| Kevin McKinney | 103 | 0.8% |
| YAMANE Toshiaki | 101 | 0.8% |
| Ben Skeggs | 100 | 0.8% |
| Paulo Zanoni | 100 | 0.8% |
| Ian Abbott | 98 | 0.8% |
| Mauro Carvalho Chehab | 91 | 0.7% |
| Andrei Emeltchenko | 84 | 0.7% |
| Daniel Vetter | 82 | 0.7% |
By changed lines
| Greg Kroah-Hartman | 42448 | 5.8% |
| Sreekanth Reddy | 30415 | 4.2% |
| H Hartley Sweeten | 22581 | 3.1% |
| Naresh Kumar Inna | 19378 | 2.7% |
| Larry Finger | 16798 | 2.3% |
| Paul Walmsley | 16720 | 2.3% |
| Jaegeuk Kim | 13470 | 1.9% |
| Rajendra Nayak | 10398 | 1.4% |
| David Howells | 9946 | 1.4% |
| Wei WANG | 9775 | 1.3% |
| Ben Skeggs | 9395 | 1.3% |
| Jussi Kivilinna | 8784 | 1.2% |
| Philipp Reisner | 8596 | 1.2% |
| Eunchul Kim | 8533 | 1.2% |
| Bill Pemberton | 8293 | 1.1% |
| Nobuhiro Iwamatsu | 7795 | 1.1% |
| Peter Hurley | 7671 | 1.1% |
| Laxman Dewangan | 6898 | 0.9% |
| Lars-Peter Clausen | 6537 | 0.9% |
| Lars Ellenberg | 6320 | 0.9% |
H. Hartley Sweeten's position at the top of the changeset list should be
unsurprising by now; he continues the seemingly endless task of cleaning up
the Comedi data acquisition drivers. Bill Pemberton has been working to
rid the kernel of the __devinit markings (and variants),
reflecting the fact that we all live in a hotplug world now. Philipp
Reisner, Andreas Gruenbacher, and Lars Ellenberg all contributed long lists
of changes to the DRBD distributed block
driver; the resulting code dump caused block maintainer Jens Axboe to promise Linus that "Following that, it was both made
perfectly clear that there is going to be no more over-the-wall pulls and
how the situation on individual pulls can be improved."
On the lines-changed side, Greg Kroah-Hartman worked on the __devinit removal, but also removed over 37,000 lines of code from the staging tree. Sreekanth Reddy made a number of additions to the mpt3sas SCSI driver, Naresh Kumar Inna contributed the Chelsio FCoE offload driver, and Larry Finger added the rtl8723ae wireless driver.
Some 205 employers (that we know about) supported development on the 3.8 kernel. The most active of these were:
Most active 3.8 employers
By changesets
| (None) | 1580 | 12.8% |
| Red Hat | 1112 | 9.0% |
| Intel | 1076 | 8.7% |
| (Unknown) | 917 | 7.4% |
| LINBIT | 595 | 4.8% |
| Linaro | 572 | 4.6% |
| Texas Instruments | 492 | 4.0% |
| Vision Engraving Systems | 426 | 3.5% |
| Samsung | 410 | 3.3% |
| SUSE | 310 | 2.5% |
| IBM | 287 | 2.3% |
| | 254 | 2.1% |
| Broadcom | 190 | 1.5% |
| (Consultant) | 171 | 1.4% |
| Wolfson Microelectronics | 161 | 1.3% |
| Freescale | 129 | 1.0% |
| Free Electrons | 128 | 1.0% |
| Parallels | 123 | 1.0% |
| NVidia | 121 | 1.0% |
| NetApp | 121 | 1.0% |
By lines changed
| (None) | 79954 | 11.0% |
| Red Hat | 60515 | 8.3% |
| Intel | 46326 | 6.4% |
| Linux Foundation | 43190 | 5.9% |
| (Unknown) | 41097 | 5.7% |
| Samsung | 36596 | 5.0% |
| (Consultant) | 33175 | 4.6% |
| LSI Logic | 30415 | 4.2% |
| Linaro | 29030 | 4.0% |
| Vision Engraving Systems | 26074 | 3.6% |
| LINBIT | 22487 | 3.1% |
| Chelsio | 21534 | 3.0% |
| Texas Instruments | 21276 | 2.9% |
| IBM | 14233 | 2.0% |
| Broadcom | 12236 | 1.7% |
| Renesas Electronics | 11570 | 1.6% |
| NVidia | 10369 | 1.4% |
| Realsil Microelectronics | 9797 | 1.3% |
| Qualcomm | 9345 | 1.3% |
| SUSE | 9139 | 1.3% |
Red Hat remains in its traditional position at the top of the list — but not by much. Perhaps more significant is that some companies that have long shown up in the top 20 have fallen off the list this time; those companies include AMD and Oracle. Meanwhile, we continue to see an increasingly strong showing from companies in the mobile and embedded area.
What are they working on?
Many of the companies in the above list have obvious objectives for their work in the kernel; LINBIT, for example, is a business built around DRBD, and Wolfson Microelectronics is in the business of selling a lot of audio hardware. But if companies just focused on driver work, there would be nobody left to do the core kernel work; thus, a look at what parts of the kernel any specific company is working on will say something about how broad its objectives are. To that end, your editor set out to hack on the gitdm tool to focus on one company at a time. So, for example, from the 3.3 kernel onward (essentially, from the beginning of 2012 to the present), Red Hat's changes clustered in these areas:
Red Hat
| % | Subsystem | Notes |
| 34% | drivers/ | 9% gpu, 6% media, 6% net, 3% md |
| 20% | fs/ | 3% xfs, 3% nfsd, 2% cifs, 2% gfs2, 1% btrfs, 1% ext4 |
| 14% | include/ | |
| 8% | net/ | |
| 8% | tools/ | |
| 7% | arch/x86/ | |
| 7% | kernel/ | |
| 2% | mm/ | |
(Patches touching more than one subsystem are counted in each, so the percentages can add up to over 100%.)
Red Hat puts a lot of effort into making drivers work, but also has a strong interest in the filesystem subtree. The large proportion of patches going into tools/ reflects Red Hat's continued development of the perf tool.
Intel's focus during the same time period is somewhat different:
Intel
| % | Subsystem | Notes |
| 66% | drivers/ | 22% net, 17% gpu, 4% scsi, 3% acpi, 3% usb |
| 17% | net/ | 7% bluetooth, 5% mac80211, 3% nfc |
| 13% | include/ | |
| 7% | arch/x86 | |
| 3% | fs/ | |
Intel is a hardware company, so the bulk of its effort is focused on making its products work well in the Linux kernel. Improving memory management or general-purpose filesystems is mostly left for others.
Google's presence in the kernel development community has grown considerably in the last few years. In this case, the pattern of development is different yet again:
| % | Subsystem | Notes |
| 27% | drivers/ | 4% net, 4% pci, 3% staging, 3% input, 3% gpu |
| 22% | net/ | 11% ipv4, 5% core, 5% ipv6 |
| 21% | include/ | |
| 11% | mm/ | |
| 10% | fs/ | 6% ext4, 1% proc |
| 8% | kernel/ | |
| 6% | arch/arm | |
| 5% | arch/x86 | |
| 4% | Documentation/ | |
Google has an obvious interest in making the Internet work better, and much of its work in the kernel is aimed toward that goal. But the company also wants Android to work better (thus more driver work, ARM architecture work) and better scalability in general, leading to a lot of core kernel work. Much of Google's work is visible to the outside world in one way or another, so it is nice to see that the company has been reasonably diligent about keeping the relevant documentation current.
While we are on the subject of ARM, what about Linaro? This consortium is very much about hardware enablement, so it would not be surprising to see a focus on the ARM architecture subsystem. And, indeed, that's how it looks:
Linaro
| % | Subsystem | Notes |
| 47% | drivers/ | 5% pinctrl, 4% clk, 4% mmc, 4% mfd, 3% gpu, 3% media |
| 36% | arch/arm | |
| 12% | include/ | |
| 9% | kernel/ | |
| 6% | sound/ | |
| 5% | Documentation/ | |
| 2% | fs/ | 1.5% pstore |
Almost everything Linaro does is focused on making the hardware work better; even much of the work on the core kernel is dedicated to timekeeping. And while lots of work in Documentation/ is always welcome, in this case, it mostly consists of device tree snippets.
Finally, what about the largest group of all — developers who are working on their own time? Here is where those developers put their energies:
Unaffiliated developers
| % | Subsystem | Notes |
| 68% | drivers/ | 13% staging, 12% net, 10% gpu, 8% media, 6% usb, 2% hid |
| 14% | arch/ | 5% arm, 2% mips, 2% x86, 2% sparc |
| 8% | include/ | |
| 6% | net/ | 2% batman-adv |
| 3% | fs/ | |
| 2% | Documentation/ | |
| 2% | sound/ | |
| 1% | kernel/ | |
Volunteer developers, it seems, share a strong interest in making their own hardware work; they are also the source of many of the patches going into the staging tree. That suggests that, in a time when much of the kernel is becoming more complex and less approachable, the staging tree is providing a way for new developers to get into the kernel and learn the ropes in a relatively low-pressure setting. The continued health of the community depends on a steady flow of new developers, so providing an easy path for developers to get into kernel development can only be a good thing.
And, certainly, from the information found here, one should be able to conclude that the development community remains in good health overall. We are about to complete our busiest development cycle ever with no real signs of strain. For the time being, things seem to be functioning quite well.
Rationalizing CPU hotplugging
One of the leading sources of code churn in the 3.8 development cycle was the removal of the __devinit family of macros. These macros marked code and data that were only needed during device initialization and which, thus, could be disposed of once initialization was complete. These macros are being removed for a simple reason: hardware has become so dynamic that initialization is never complete; something new can always show up, and there is no longer any point in building a kernel that cannot cope with transient devices. Even in this world, though, CPUs are generally seen as being static. But CPUs, too, can come and go, and that is motivating changes in how the kernel manages them.
Hotplugging is a familiar concept when one thinks about keyboards, printers, or storage devices, but it is a bit less so for CPUs: USB-attached add-on processors are still relatively rare in the market. Even so, the kernel has had support for CPU hotplug for some time; the original version of Documentation/cpu-hotplug.txt was added in 2006 for the 2.6.16 kernel. That document mentioned a couple of use cases for this feature: high-end NUMA hardware that truly has runtime-pluggable processors, and the ability to disable a faulty CPU in a high-reliability system. Other uses have since come along, including system suspend operations (where all CPUs but one are "unplugged" prior to suspending the system) and virtualization, where virtual CPUs can be given to (or taken from) guests at will.
So CPU hotplug is a useful feature, but the current implementation in the kernel is not well loved; in a recent patch set intended to improve the situation, Thomas Gleixner remarked that "the current CPU hotplug implementation has become an increasing nightmare full of races and undocumented behaviour." CPU hotplug shows a lot of the signs of a feature that has evolved significantly over time without high-level oversight; among other things, the sequence of steps followed for an unplug operation is not the reverse of the steps to plug in a new CPU. But much of the trouble associated with CPU hotplug is blamed on its extensive use of notifiers.
The kernel's notifier mechanism is a way for kernel code to request a callback when an event of interest happens. Notifiers are, in a sense, general-purpose hooks that anybody in the kernel can use, and, it seems, just about anybody does. There have been a lot of complaints about notifiers, as is typified by this comment from Linus in response to Thomas's patch set:
Notifiers also make the code hard to understand because there is no easy way to know what will happen when a notifier chain (which is a run-time construct) is invoked: there could be an arbitrary set of notifiers in the chain, in any order. The ordering requirements of specific notifiers can add some fun challenges of their own.
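The ordering problem can be seen in a small userspace model. The sketch below is not the kernel's notifier API; the structure, the function names (chain_register(), chain_call()), and the two demo callbacks are all hypothetical. It only illustrates how the order in which callbacks run depends on what happens to be registered at run time, and with which priority, rather than on anything visible at the call site.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace model of a notifier chain; not the kernel API. */
struct notifier {
    int (*call)(unsigned long event, void *data);
    int priority;               /* higher priority runs first */
    struct notifier *next;
};

/* Insert nb into the chain, keeping it sorted by descending priority. */
void chain_register(struct notifier **head, struct notifier *nb)
{
    assert(nb != NULL && nb->call != NULL);
    while (*head != NULL && (*head)->priority >= nb->priority)
        head = &(*head)->next;
    nb->next = *head;
    *head = nb;
}

/* Invoke every callback in chain order; stop at the first non-zero return. */
int chain_call(struct notifier *head, unsigned long event, void *data)
{
    for (struct notifier *nb = head; nb != NULL; nb = nb->next) {
        int ret = nb->call(event, data);
        if (ret != 0)
            return ret;
    }
    return 0;
}

/* Demo callbacks append a tag to a trace so the run order is visible. */
static char trace[8];
static size_t trace_len;

static int record(char tag)
{
    if (trace_len < sizeof(trace) - 1)
        trace[trace_len++] = tag;
    return 0;
}

static int sched_cb(unsigned long event, void *data)
{
    (void)event; (void)data;
    return record('S');
}

static int timer_cb(unsigned long event, void *data)
{
    (void)event; (void)data;
    return record('T');
}

/* The timer callback is registered first, yet the scheduler one runs first. */
const char *demo_order(void)
{
    struct notifier timer_nb = { .call = timer_cb, .priority = 0 };
    struct notifier sched_nb = { .call = sched_cb, .priority = 10 };
    struct notifier *chain = NULL;

    memset(trace, 0, sizeof(trace));
    trace_len = 0;
    chain_register(&chain, &timer_nb);
    chain_register(&chain, &sched_nb);
    chain_call(chain, 0, NULL);
    return trace;
}
```

Nothing in demo_order() other than the priority values says which callback runs first; in a large code base those values are scattered across many files, which is the readability problem described above.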
The process of unplugging a CPU requires a surprisingly long list of actions. The scheduler must be informed so it can migrate processes off the affected CPU and shut down the relevant run queue. Per-CPU kernel threads need to be told to exit or "park" themselves. CPU frequency governors need to be told to stop worrying about that processor. Almost anything with per-CPU variables will need to make arrangements for one CPU to go away. Timers running on the outgoing CPU need to be relocated. The read-copy-update subsystem must be told to stop tracking the CPU and to ensure that any RCU callbacks for that CPU get taken care of. Every architecture has its own low-level details to take care of. The perf events subsystem has an impressive set of requirements of its own. And so on; this list is nowhere near comprehensive.
All of these actions are currently accomplished by way of a set of notifier callbacks which, with luck, get called in the right order. Meanwhile, plugging in a new CPU requires an analogous set of operations, but those are handled in an asymmetric manner with a different set of callbacks. The end result is that the mechanism is fragile and that few people have any real understanding of all the steps needed to plug or unplug a CPU.
Thomas's objective is not to rewrite all those notifier functions or fundamentally change what is done to implement a CPU hotplug operation — at least, not yet. Instead, he is focused on imposing some order on the whole process so that it can be understood by looking at the code. To that end, he has replaced the current set of notifier chains with a linear sequence of states to be worked through when bringing up or shutting down a CPU. There is a single array of cpuhp_step structures, one per state:
struct cpuhp_step {
    int (*startup)(unsigned int cpu);    /* called as a CPU comes online */
    int (*teardown)(unsigned int cpu);   /* called as a CPU goes offline */
};
The startup() function will be called when passing through the state as a new CPU is brought online, while teardown() is called when things are moving in the other direction. Many states only have one function or the other in the current implementation; the eventual goal is to make the process more symmetrical. In the initial patch set, the set of states is:
State startup teardown
CPUHP_CREATE_THREADS ✔
CPUHP_PERF_X86_UNCORE_PREP ✔ ✔
CPUHP_PERF_X86_PREPARE ✔ ✔
CPUHP_PERF_BFIN ✔
CPUHP_PERF_POWER ✔
CPUHP_PERF_SUPERH ✔
CPUHP_PERF_PREPARE ✔ ✔
CPUHP_SCHED_MIGRATE_PREP ✔ ✔
CPUHP_WORKQUEUE_PREP ✔
CPUHP_RCUTREE_PREPARE ✔ ✔
CPUHP_HRTIMERS_PREPARE ✔ ✔
CPUHP_TIMERS_PREPARE ✔ ✔
CPUHP_PROFILE_PREPARE ✔ ✔
CPUHP_X2APIC_PREPARE ✔ ✔
CPUHP_SMPCFD_PREPARE ✔ ✔
CPUHP_SMPCFD_PREPARE ✔
CPUHP_SLAB_PREPARE ✔ ✔
CPUHP_NOTIFY_PREPARE ✔
CPUHP_NOTIFY_DEAD ✔
CPUHP_CPUFREQ_DEAD ✔
CPUHP_SCHED_DEAD ✔
CPUHP_CLOCKEVENTS_DEAD ✔
CPUHP_BRINGUP_CPU ✔
CPUHP_AP_OFFLINE Application processor states
CPUHP_AP_SCHED_STARTING ✔
CPUHP_AP_PERF_X86_UNCORE_STARTING ✔
CPUHP_AP_PERF_X86_AMD_IBS_STARTING ✔ ✔
CPUHP_AP_PERF_X86_STARTING ✔ ✔
CPUHP_AP_PERF_ARM_STARTING ✔
CPUHP_AP_ARM_VFP_STARTING ✔ ✔
CPUHP_AP_ARM64_TIMER_STARTING ✔ ✔
CPUHP_AP_KVM_STARTING ✔ ✔
CPUHP_AP_X86_TBOOT_DYING ✔
CPUHP_AP_S390_VTIME_DYING ✔
CPUHP_AP_CLOCKEVENTS_DYING ✔
CPUHP_AP_RCUTREE_DYING ✔
CPUHP_AP_SCHED_NOHZ_DYING ✔
CPUHP_AP_SCHED_MIGRATE_DYING ✔
CPUHP_AP_MAZ End marker for AP states
CPUHP_TEARDOWN_CPU ✔
CPUHP_PERCPU_THREADS ✔ ✔
CPUHP_SCHED_ONLINE ✔ ✔
CPUHP_PERF_ONLINE ✔ ✔
CPUHP_SCHED_MIGRATE_ONLINE ✔
CPUHP_WORKQUEUE_ONLINE ✔ ✔
CPUHP_CPUFREQ_ONLINE ✔ ✔
CPUHP_RCUTREE_ONLINE ✔ ✔
CPUHP_NOTIFY_ONLINE ✔
CPUHP_PROFILE_ONLINE ✔
CPUHP_SLAB_ONLINE ✔ ✔
CPUHP_NOTIFY_DOWN_PREPARE ✔
CPUHP_PERF_X86_UNCORE_ONLINE ✔ ✔
CPUHP_PERF_X86_ONLINE ✔
CPUHP_PERF_S390_ONLINE ✔ ✔
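The idea of walking a linear array of states can be sketched in userspace C. This is a minimal model, not the code from the patch set: cpu_up(), cpu_down(), the three demo states, and the rollback-on-failure behavior are assumptions made for illustration. Only the cpuhp_step layout is taken from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct cpuhp_step {
    int (*startup)(unsigned int cpu);
    int (*teardown)(unsigned int cpu);
};

/*
 * Walk the states upward, calling startup() where one exists. If a step
 * fails, undo the steps already passed by calling teardown() in reverse
 * order (an assumption of this sketch).
 */
int cpu_up(const struct cpuhp_step *steps, size_t nsteps, unsigned int cpu)
{
    assert(steps != NULL);
    for (size_t i = 0; i < nsteps; i++) {
        if (steps[i].startup == NULL)
            continue;
        int ret = steps[i].startup(cpu);
        if (ret != 0) {
            while (i-- > 0)
                if (steps[i].teardown != NULL)
                    steps[i].teardown(cpu);
            return ret;
        }
    }
    return 0;
}

/* Walk the states downward; teardown() return values are ignored here. */
void cpu_down(const struct cpuhp_step *steps, size_t nsteps, unsigned int cpu)
{
    assert(steps != NULL);
    for (size_t i = nsteps; i-- > 0; )
        if (steps[i].teardown != NULL)
            steps[i].teardown(cpu);
}

/* Hypothetical demo states A, B, C that record a trace of what ran. */
static char trace[16];
static size_t trace_len;
static int c_should_fail;

static int record(char tag)
{
    if (trace_len < sizeof(trace) - 1)
        trace[trace_len++] = tag;
    return 0;
}

static int a_up(unsigned int cpu)   { (void)cpu; return record('A'); }
static int a_down(unsigned int cpu) { (void)cpu; return record('a'); }
static int b_up(unsigned int cpu)   { (void)cpu; return record('B'); }
static int c_up(unsigned int cpu)
{
    (void)cpu;
    return c_should_fail ? -1 : record('C');
}
static int c_down(unsigned int cpu) { (void)cpu; return record('c'); }

/* State B has no teardown, like many states in the initial patch set. */
static const struct cpuhp_step demo_steps[] = {
    { a_up, a_down },
    { b_up, NULL },
    { c_up, c_down },
};

/* Bring a CPU up and, if that worked, take it down again. */
const char *demo_hotplug(int fail_at_c)
{
    memset(trace, 0, sizeof(trace));
    trace_len = 0;
    c_should_fail = fail_at_c;
    if (cpu_up(demo_steps, 3, 1) == 0)
        cpu_down(demo_steps, 3, 1);
    return trace;
}
```

Unlike a notifier chain, the order here is fixed by the position of each entry in the array, so it can be read directly from the source.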
Looking at that list, one begins to see why the current CPU hotplug mechanism is hard to understand. Things are messy enough that Thomas is not really trying to change anything fundamental in how CPU hotplug works; most of the existing notifier callbacks are still there, they are just invoked in a different way. The purpose of the exercise, Thomas said, was:
Once some high-level order has been brought to the CPU hotplug mechanism, one can think about trying to clean things up. The eventual goal is to have a much smaller set of externally visible states; for drivers and filesystems, there will only be "prepare" and "enable" states available, with no ordering between subsystems. Also, notably, drivers and filesystems will not be allowed to cause a hotplug operation (in either direction) to fail. When the process is complete, the hotplug subsystem should be much more predictable, with a lot more of the details hidden from the rest of the kernel.
That is all work for a future series, though; the first step is to get the infrastructure set up. Chances are that will require at least one more iteration of Thomas's "Episode 1" patch set, meaning that it is unlikely to be 3.9 material. Starting around 3.10, though, we may well see significant changes to how CPU hotplugging is handled; the result should be more comprehensible and reliable code.
The zswap compressed swap cache
Swapping is one of the biggest threats to performance. The latency gap between RAM and swap, even on a fast SSD, can be four orders of magnitude. The throughput gap is two orders of magnitude. In addition to the speed gap, storage on which a swap area resides is becoming more shared and virtualized, which can cause additional I/O latency and nondeterministic workload performance. The zswap subsystem exists to mitigate these undesirable effects of swapping through a reduction in I/O activity.
Zswap is a lightweight, write-behind compressed cache for swap pages. It takes pages that are in the process of being swapped out and attempts to compress them into a dynamically allocated RAM-based memory pool. If this process is successful, the writeback to the swap device is deferred and, in many cases, avoided completely. This results in a significant I/O reduction and performance gains for systems that are swapping.
Zswap basics
Zswap intercepts pages in the middle of swap writeback and caches them using the frontswap API. Frontswap has been in the kernel since v3.5 and has been covered by LWN before. It allows a backend driver, like zswap, to intercept both swap page writeback and the page faults for swapped out pages. Zswap also makes use of the "zsmalloc" allocator (discussed below) for compressed page storage.
Zswap seeks to be as simple as possible in its structure and operation. There are two primary data structures. The first is the zswap_entry structure, which contains information about a single compressed page stored in zswap:
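struct zswap_entry {
    struct rb_node rbnode;      /* node in the per-device red-black tree */
    int refcount;               /* guards against freeing while in use */
    pgoff_t offset;             /* swap offset; the tree is indexed by it */
    unsigned long handle;       /* zsmalloc allocation */
    unsigned int length;        /* size of the compressed data */
    /* ... */
};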
The second is the zswap_tree structure which contains a red-black tree of zswap entries indexed by the offset value:
struct zswap_tree {
    struct rb_root rbroot;
    struct list_head lru;
    spinlock_t lock;
    struct zs_pool *pool;
};
At the highest level, there is an array of zswap_tree structures indexed by the swap device number.
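The two-level lookup (device number first, then offset) can be modeled in userspace. The sketch below swaps in a plain unbalanced binary search tree for the kernel's red-black tree and leaves out locking entirely; MAX_SWAPFILES, struct entry, entry_insert(), and entry_search() are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SWAPFILES 4         /* hypothetical number of swap devices */

/* Simplified stand-in for zswap_entry: an unbalanced BST node. */
struct entry {
    unsigned long offset;       /* key: offset within the swap device */
    unsigned int length;        /* compressed size */
    struct entry *left, *right;
};

struct tree {
    struct entry *root;
};

/* One tree per swap device, indexed by the device number. */
static struct tree trees[MAX_SWAPFILES];

/* Insert e into the tree for device 'type'; -1 if the offset exists. */
int entry_insert(unsigned int type, struct entry *e)
{
    assert(type < MAX_SWAPFILES);
    struct entry **link = &trees[type].root;

    while (*link != NULL) {
        if (e->offset < (*link)->offset)
            link = &(*link)->left;
        else if (e->offset > (*link)->offset)
            link = &(*link)->right;
        else
            return -1;
    }
    e->left = e->right = NULL;
    *link = e;
    return 0;
}

/* Find the entry for (type, offset), or NULL if it is not cached. */
struct entry *entry_search(unsigned int type, unsigned long offset)
{
    assert(type < MAX_SWAPFILES);
    struct entry *e = trees[type].root;

    while (e != NULL && e->offset != offset)
        e = offset < e->offset ? e->left : e->right;
    return e;
}

/* Cache two pages on device 1, then look one up by offset. */
unsigned int demo_lookup_length(unsigned long offset)
{
    static struct entry e1, e2;

    trees[1].root = NULL;
    e1 = (struct entry){ .offset = 42, .length = 1800 };
    e2 = (struct entry){ .offset = 7, .length = 900 };
    entry_insert(1, &e1);
    entry_insert(1, &e2);

    struct entry *hit = entry_search(1, offset);
    return hit != NULL ? hit->length : 0;
}
```

The device number and offset are exactly the two pieces of information carried by a swap entry, which is why no other index is needed.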
There is a single lock per zswap_tree to protect the tree structure during lookups and modifications. The higher-level swap code provides certain protections that simplify the zswap implementation by not having to design for concurrent store, load, and invalidate operations on the same swap entry. While this single-lock design might seem like a likely source for contention, actual execution demonstrates that the swap path is largely bottlenecked by other locks at higher levels, such as the anon_vma mutex or swap_lock. In comparison, the zswap_tree lock is very lightly contended. Writeback support, covered in the next section, also led to this single-lock design.
For page compression, zswap uses compressor modules provided by the kernel's cryptographic API. This allows users to select the compressor dynamically at boot time, and gives easy access to hardware compression accelerators or any other future compression engines.
A zswap store operation occurs when a page is selected for swapping by the reclaim system and frontswap intercepts the page in swap_writepage(). The operation begins by compressing the page into a per-CPU temporary buffer. Compressing into the temporary buffer is required because the compressed size, and thus the size of the permanent allocation needed to hold it, isn't known until the compression is actually done. Once the compressed size is known, an object is allocated and the temporary buffer is copied into the object. Lastly, a zswap_entry structure is allocated, populated, and inserted into the tree for that swap device.
If the store fails for any reason, most likely because of an object allocation failure, zswap returns an error which is propagated up through frontswap into swap_writepage(). The page is then swapped out to the swap device as usual.
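The store path (compress into a scratch buffer, then allocate exactly what is needed, then fall back on failure) can be sketched as follows. This is a userspace illustration under several assumptions: a toy run-length encoder stands in for the real compressor, malloc() stands in for zsmalloc, a static buffer stands in for the per-CPU buffer, and all names (rle_compress(), store_page(), DEMO_PAGE_SIZE) are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_PAGE_SIZE 4096     /* assumed page size for the sketch */

struct stored {
    unsigned char *data;        /* stand-in for the zsmalloc handle */
    size_t length;              /* compressed length */
};

/*
 * Toy run-length encoder: emits (count, byte) pairs. Returns the
 * compressed size, or 0 if the output does not fit in 'cap' bytes.
 */
size_t rle_compress(const unsigned char *src, size_t n,
                    unsigned char *dst, size_t cap)
{
    size_t out = 0;

    for (size_t i = 0; i < n; ) {
        unsigned char byte = src[i];
        size_t run = 1;

        while (i + run < n && src[i + run] == byte && run < 255)
            run++;
        if (out + 2 > cap)
            return 0;
        dst[out++] = (unsigned char)run;
        dst[out++] = byte;
        i += run;
    }
    return out;
}

/*
 * Returns 0 on success. On -1 the caller would write the page to the
 * swap device as usual.
 */
int store_page(const unsigned char *page, struct stored *out)
{
    static unsigned char tmp[DEMO_PAGE_SIZE];   /* scratch buffer */

    assert(page != NULL && out != NULL);
    /* The final size is unknown until compression has finished. */
    size_t len = rle_compress(page, DEMO_PAGE_SIZE, tmp, sizeof(tmp));
    if (len == 0)
        return -1;              /* did not fit: reject the page */

    unsigned char *obj = malloc(len);   /* exact-size allocation */
    if (obj == NULL)
        return -1;              /* allocation failure: reject the page */
    memcpy(obj, tmp, len);
    out->data = obj;
    out->length = len;
    return 0;
}

/* Store a page of zeros (compressible) or a counting pattern (not). */
long demo_store(int compressible)
{
    unsigned char page[DEMO_PAGE_SIZE];
    struct stored obj;

    for (size_t i = 0; i < DEMO_PAGE_SIZE; i++)
        page[i] = compressible ? 0 : (unsigned char)(i & 0xff);
    if (store_page(page, &obj) != 0)
        return -1;

    long len = (long)obj.length;
    free(obj.data);
    return len;
}
```

With this toy encoder a zero-filled page shrinks to 34 bytes, while the counting pattern expands and is rejected, which corresponds to the error path that sends the page to the swap device.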
A load operation occurs when a program page faults on a page table entry (PTE) that contains a swap entry and is intercepted by frontswap in swap_readpage(). The swap entry contains the device and offset information needed to look up the zswap entry in the appropriate tree. Once the entry is located, the data is decompressed directly into the page allocated by the page fault code. The entry is not removed from the tree during a load; it remains up-to-date until the entry is invalidated.
An invalidate operation occurs when the reference count for a particular swap offset becomes zero in swap_entry_free(). In this case, the zswap entry is removed from the appropriate tree, and the entry and the zsmalloc allocation that it references are freed.
To be preemption-friendly, zswap never disables interrupts. Preemption is disabled only during compression, while accessing the per-CPU temporary buffer page, and during decompression, while accessing a mapped zsmalloc allocation.
Zswap writeback
To operate optimally as a cache, zswap should hold the most recently used pages. With frontswap, there is, unfortunately, a real potential for an inverse least recently used (LRU) condition in which the cache fills with older pages, and newer pages are forced out to the slower swap device. To address this, zswap is designed with "resumed" writeback in mind.
As background, the process for swapping pages follows these steps:
- First, an anonymous memory page is selected for swapping and a slot is allocated in the swap device.
- Next, the page is unmapped from all processes using that page. The PTEs referencing that page are filled with the swap entry that consists of the swap type and offset where the page can be found.
- Lastly, the page is scheduled for writeback to the swap device.
When frontswap_store() in swap_writepage() is successful, the writeback step is not performed. However, the slot in the swap device has been allocated and is still reserved for the page even though the page only resides in the frontswap backend. Resumed writeback in zswap forces pages out of the compressed cache into their previously reserved swap slots in the swap device. Currently, the policy is basic and forces pages out from the cache in two cases: (1) when the cache has reached its maximum size according to the max_pool_percent sysfs tunable, or (2) when zswap is unable to allocate new space for the compressed pool.
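The two-case policy is simple enough to write down as a predicate. The sketch below assumes that max_pool_percent is measured against the total number of RAM pages, which the text does not state; should_writeback() and its parameters are hypothetical names, not zswap's.

```c
#include <assert.h>

/*
 * Decide whether pages should be forced out of the compressed cache.
 * Assumption: the percentage limit is relative to total RAM pages.
 */
int should_writeback(unsigned long pool_pages,
                     unsigned long total_ram_pages,
                     unsigned int max_pool_percent,
                     int alloc_failed)
{
    assert(max_pool_percent <= 100);

    /* Case 2: the pool could not allocate new space. */
    if (alloc_failed)
        return 1;

    /* Case 1: the pool has reached its configured maximum size. */
    return pool_pages * 100 >= total_ram_pages * max_pool_percent;
}
```

For example, with 100 pages of RAM and a 20% limit, the predicate becomes true once the pool holds 20 pages, or earlier if an allocation fails.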
During resumed writeback, zswap decompresses the page, adds it back to the swap cache, and schedules writeback into the swap slot that was previously reserved. By splitting swap_writepage() into two functions after frontswap_store() is called, zswap can resume writeback from the point where the initial writeback terminated in frontswap. The new function is called __swap_writepage().
Freeing zswap entries becomes more complex with writeback. Without writeback, pages would only be freed during invalidate operations (zswap_frontswap_invalidate_page()). With writeback, pages can also be freed in zswap_writeback_pages(). These invalidate and writeback functions can run concurrently for the same zswap entry. To guarantee that entries are not freed while being accessed by another thread, a reference count field (called refcount) is used in the zswap_entry structure.
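The reference-counting rule (whoever drops the last reference frees the entry) can be sketched in userspace. Note the substitution: this example uses C11 atomics so that it is safe on its own, whereas the structure shown earlier has a plain int refcount. The names entry_new(), entry_get(), and entry_put(), and the freed flag used for the demo, are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct entry {
    atomic_int refcount;
    int *freed_flag;            /* demo hook: set when the entry is freed */
};

/* Create an entry holding one reference (the one owned by the tree). */
struct entry *entry_new(int *freed_flag)
{
    struct entry *e = malloc(sizeof(*e));

    if (e == NULL)
        return NULL;
    atomic_init(&e->refcount, 1);
    e->freed_flag = freed_flag;
    return e;
}

/* Take an additional reference before using the entry. */
void entry_get(struct entry *e)
{
    atomic_fetch_add(&e->refcount, 1);
}

/* Drop a reference; returns 1 if this call freed the entry. */
int entry_put(struct entry *e)
{
    int old = atomic_fetch_sub(&e->refcount, 1);

    assert(old > 0);
    if (old == 1) {
        *e->freed_flag = 1;
        free(e);
        return 1;
    }
    return 0;
}

/* Invalidate arrives while writeback still holds a reference. */
int demo_invalidate_during_writeback(void)
{
    int freed = 0;
    struct entry *e = entry_new(&freed);

    if (e == NULL)
        return -1;
    entry_get(e);               /* writeback starts using the entry */
    entry_put(e);               /* invalidate drops the tree's reference */
    int freed_early = freed;    /* must still be 0 here */
    entry_put(e);               /* writeback finishes; entry is freed */
    return freed_early == 0 && freed == 1;
}
```

The demo runs the two paths one after the other in a single thread; the point is only that the entry outlives the invalidate call for as long as writeback still holds its reference.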
Zsmalloc rationale
One really can't talk about zswap without mentioning zsmalloc, the allocator it uses for compressed page storage, which currently resides in the Linux Staging tree.
Zsmalloc is a slab-based allocator used by zswap; it provides more reliable allocation of large objects in a memory constrained environment than does the kernel slab allocator. Zsmalloc has already been discussed on LWN, so this section will focus more on the need for zsmalloc in the presence of the kernel slab allocator.
The objects that zswap stores are compressed pages. The default compressor is lzo1x-1, which is known for speed, but not so much for high compression. As a result, zswap objects can frequently be large relative to typical slab objects (>1/8th PAGE_SIZE). This is a problem for the kernel slab allocator under memory pressure.
The kernel slab allocator requires high-order page allocations to back slabs for large objects. For example, on a system with a 4K page size, the kmalloc-512 cache has slabs that are backed by two contiguous pages. kmalloc-2048 requires eight contiguous pages per slab. These high-order page allocations are very likely to fail when the system is under memory pressure.
Zsmalloc addresses this problem by allowing the pages backing a slab (or “size class” in zsmalloc terms) to be both non-contiguous and variable in number. They are variable in number because zsmalloc allows a slab to be composed of fewer than the target number of backing pages. A set of non-contiguous pages backing a slab is stitched together using fields of struct page to create a “zspage”. This allows zsmalloc to service large object allocations, up to PAGE_SIZE, without requiring high-order page allocations.
Additionally, the kernel slab allocator does not allow objects that are less than a page in size to span a page boundary. This means that if an object is PAGE_SIZE/2 + 1 bytes in size, it effectively uses an entire page, resulting in ~50% waste. Hence there are no kmalloc() cache sizes between PAGE_SIZE/2 and PAGE_SIZE. Zswap frequently needs allocations in this range, however. Using the kernel slab allocator causes the memory savings achieved through compression to be lost in fragmentation.
In order to satisfy these larger allocations while not wasting an entire page, zsmalloc allows objects to span page boundaries at the cost of having to map the allocations before accessing them. This mapping is needed because the object might be contained in two non-contiguous pages. For example, in a zsmalloc size class for objects that are 2/3 of PAGE_SIZE, three objects could be stored in a zspage with two non-contiguous backing pages with no waste. The object stored in the second of the three object positions in the zspage would be split between two different pages.
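The arithmetic behind both the slab waste and the zspage layout can be checked with a few helper functions. This is a sketch with hypothetical names; it assumes a 4096-byte page in the examples, so "2/3 of PAGE_SIZE" is rounded down to 2730 bytes and the "no waste" case leaves 2 bytes over.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes left unused per page when objects may not cross a page boundary. */
size_t slab_waste_per_page(size_t page_size, size_t obj_size)
{
    assert(obj_size > 0 && obj_size <= page_size);
    return page_size - (page_size / obj_size) * obj_size;
}

/* Objects that fit in a zspage when they may cross page boundaries. */
size_t zspage_objects(size_t page_size, size_t npages, size_t obj_size)
{
    assert(obj_size > 0);
    return (page_size * npages) / obj_size;
}

/* Bytes left unused in the whole zspage. */
size_t zspage_waste(size_t page_size, size_t npages, size_t obj_size)
{
    return page_size * npages
           - zspage_objects(page_size, npages, obj_size) * obj_size;
}

/* 1 if the object at 'index' starts in one page and ends in another. */
int object_spans_pages(size_t page_size, size_t obj_size, size_t index)
{
    assert(obj_size > 0);
    size_t first = index * obj_size;
    size_t last = first + obj_size - 1;

    return first / page_size != last / page_size;
}
```

With 4096-byte pages, slab_waste_per_page(4096, 2049) is 2047, the roughly 50% loss described above. A two-page zspage holds three 2730-byte objects with 2 bytes to spare, and only the object at index 1 (the second position) crosses the page boundary, which is why it must be mapped before use.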
Zsmalloc is a good fit for zswap. Zswap was evaluated using the kernel slab allocator and these issues did have a significant impact on the frontswap_store() success rate. This was due to kmalloc() allocation failures and a need to reject pages that compressed to sizes greater than PAGE_SIZE/2.
Performance
In order to produce a performance comparison, kernel builds were conducted with an increasing number of threads per run in a constant and constrained amount of memory. The results indicate a runtime reduction of 53% and an I/O reduction of 76% with zswap compared to normal swapping. The testing system was configured with:
- Gentoo running v3.7-rc7
- Quad-core i5-2500 @ 3.3GHz
- 512MB DDR3 1600MHz (limited with mem=512m on boot)
- Filesystem and swap on 80GB HDD (about 58MB/s with hdparm -t)
The table below summarizes the test runs.
       Baseline                            zswap                               Change
N      pswpin  pswpout   majflt  I/O sum   pswpin  pswpout   majflt  I/O sum     %I/O       MB
8           1      335      291      627        0        0      249      249     -60%        1
12       3688    14315     5290    23293      123      860     5954     6937     -70%       64
16      12711    46179    16803    75693     2936     7390    46092    56418     -25%       75
20      42178   133781    49898   225857     9460    28382    92951   130793     -42%      371
24      96079   357280   105242   558601     7719    18484   109309   135512     -76%     1653
The 'N' column indicates the maximum number of concurrent threads for the kernel build (make -jN) for each run. The next four columns are the statistics for the baseline run without zswap, followed by the same for the zswap run. The I/O sum column for each run is a sum of pswpin (pages swapped in), pswpout (pages swapped out), and majflt (major page faults). The difference between the baseline and zswap runs is shown both in relative terms, as a percentage of I/O reduction, and in absolute terms, as a reduction of X megabytes of I/O related to swapping activity.
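The derived columns can be recomputed from the raw counters. The sketch below assumes 4096-byte pages (the test system is x86) and rounding to the nearest integer; both are assumptions that happen to reproduce the published figures. The function names are hypothetical.

```c
#include <assert.h>

#define DEMO_PAGE_SIZE 4096UL   /* assumed page size of the test system */

/* The "I/O sum" column: swap-ins, swap-outs, and major faults. */
unsigned long io_sum(unsigned long pswpin, unsigned long pswpout,
                     unsigned long majflt)
{
    return pswpin + pswpout + majflt;
}

/* The "%I/O" column as a positive reduction, rounded to nearest. */
unsigned long io_reduction_percent(unsigned long base, unsigned long zswap)
{
    assert(base > 0 && zswap <= base);
    return ((base - zswap) * 100 + base / 2) / base;
}

/* The "MB" column: pages of I/O avoided, converted to megabytes. */
unsigned long io_saved_mb(unsigned long base, unsigned long zswap)
{
    assert(zswap <= base);
    return ((base - zswap) * DEMO_PAGE_SIZE + (1UL << 19)) >> 20;
}
```

For the N=24 run, the baseline sum is 96079 + 357280 + 105242 = 558601 and the zswap sum is 135512; the difference of 423089 pages is a 76% reduction, or about 1653MB.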
A compressed swap cache reduces the efficiency of the page reclaim process: for any store operation, the cache may allocate some pages to hold the compressed page. This reduction in efficiency puts additional shrinking pressure on the page cache, causing an increase in major page faults where pages must be re-read from disk. In order to have a complete picture of the I/O impact, the major page faults must be considered in the sum of I/O.
The next table shows the total runtime of the kernel builds:
Runtime (in seconds)
N     base   zswap   %change
8      107     107        0%
12     128     110      -14%
16     191     179       -6%
20     371     240      -35%
24     570     267      -53%
The runtime impact of swap activity is decreased when comparing runs with the same number of threads. The rate of degradation is reduced for increasingly constrained runs when comparing baseline and zswap.
The measurements of average CPU utilization during the builds are:
%CPU utilization (out of 400% on 4 cpus)
N     base   zswap   %change
8      317     319        1%
12     267     311       16%
16     179     191        7%
20      94     143       52%
24      60     128      113%
The CPU utilization table shows that with zswap, the kernel build is able to make more productive use of the CPUs, as is expected from the runtime results.
Additional performance testing was performed using SPECjbb. Metrics regarding the performance improvements and I/O reductions that can be achieved using zswap on both x86 and Power7+ (with and without hardware compression acceleration), can be found on this page.
Conclusion
Zswap is a compressed swap cache, able to evict pages from the compressed cache, on an LRU basis, to the backing swap device when the compressed pool reaches its size limit or the pool is unable to obtain additional pages from the buddy allocator. Its approach trades CPU cycles for reduced swap I/O. This trade-off can result in a significant performance improvement, as reads from and writes to the compressed cache are almost always faster than reading from a swap device, which incurs the latency of an asynchronous block I/O read.
Page editor: Jonathan Corbet
Distributions
Apache OpenOffice in Fedora
One of the features approved by the Fedora Engineering Steering Committee (FESCo) at its February 6 meeting may come as a bit of a surprise to some: adding Apache OpenOffice (AOO) to the distribution. There was a fair amount of confusion in the fedora-devel mailing list discussion of the feature, mostly surrounding which office suite would be the default—AOO or LibreOffice (LO)—and there were also calls to disallow packaging AOO altogether. Those calls were ignored. But, since both suites have descended from the same parent, OpenOffice.org (OOo), there are some things to be worked out so both can coexist on one system. Beyond that, though, some AOO project members are not happy that LO has "squatted" on its program names—to the point where the issue of trademarks has been raised.
Adding Apache OpenOffice (AOO) to Fedora 19 was proposed by Andrea Pescetti, whose history with OpenOffice goes back to 2003. As with all feature proposals, Fedora product manager Jaroslav Reznik posted the proposal to the mailing list for discussion. That resulted in a fairly long thread—no surprise for fedora-devel.
There were calls to effectively ban AOO from Fedora, but that didn't really seem like a majority opinion, no matter how loudly expressed. Those arguments tended to center around hostility toward Oracle, which "shepherded" OpenOffice.org for a time after acquiring it with Sun. That hostility has bled over to the AOO project after the code was donated to Apache. Some were also concerned about users getting confused about which office suite to install or that adding another large (>1G) package would be a burden on mirrors.
In the end, though, there is nothing about AOO that violates Fedora's packaging guidelines, and giving users a choice of office suites is certainly in line with the distribution's mission (Adam Jackson's famous "Linux is not about choice" message notwithstanding). Beyond just OpenOffice.org descendants, Fedora already offers a number of other office suites (e.g. Calligra) or components (e.g. Gnumeric, AbiWord). As Martin Sourada put it:
At the FESCo meeting, there was essentially no question about approving AOO; the discussion was about technical issues in making LO and AOO coexist. The main problem is that both projects share program names (e.g. soffice, oowriter, ...) that originally came from OOo. If both packages are installed, obviously only one can own those names. FESCo decided to ask the two projects (or really the Fedora packagers of each project) to cooperate, but pointedly said that LO did not have to make any changes to accommodate AOO.
Pescetti announced the FESCo decision on the AOO development mailing list, which resulted in numerous congratulatory messages. The AOO project would clearly like to get into Linux distributions, where its predecessor OOo has almost completely been replaced by LO. Pescetti noted the clashing binary name issue in his announcement, which led to some unhappiness in the thread.
From the perspective of some in the AOO camp, that project is the "real" successor to OOo, and should thus be the inheritor of the names of the binary programs. But, Linux distributions switched to LO as the successor during the several years when there were no OOo releases and AOO either didn't exist or was still incubating. When Oracle donated the code to Apache, it also donated the OpenOffice.org trademark, which led Rob Weir to note:
sudo yum install openoffice.org
That is not part of the problem, though, as the package name for LO in Fedora is not "openoffice.org". The names of the binaries used to invoke the office suite are a different story, though. It is not at all clear that an upstream project gets to decide what the binaries used by a particular distribution are called, trademarked or not. There has been no claim that things like soffice or oocalc are trademarked (and it's not at all clear they could be), but some in the AOO project believe they "belong" to AOO. Jürgen Schmidt described it this way:
Some magic UNO bootstrap code used by UNO client applications used the soffice alias for example. Changing it would break potential client applications.
The other aliases like oowriter are obvious where they come from, why should we change them?
It is important to come back in distros but we should not [easily] give up what belongs to OpenOffice.
Weir is concerned about users getting confused, noting that the project has already heard from some that were confused by getting "something else" when they thought they were installing AOO. He called that "classic trademark infringement". Later in the thread, he made it clear that he is talking about AOO vs. LO confusion, rather than some other form of trademark confusion:
Exactly how that confusion has come about (by running oocalc and getting LibreOffice Calc or by installing some package with an ambiguous name, for example) is not described. There is a largely unused openoffice.org alias in the Fedora LO package (pointing to libreoffice), but Pescetti does not think that getting rid of that will be a problem. Beyond that, it's not really clear what trademark infringement disagreements AOO could have with LO (or more precisely in this case, Fedora's packaging of LO). As Pescetti pointed out, even if there are any trademark issues, they should not take precedence over actually packaging AOO for Fedora:
Given a historical perspective, one can understand both sides of the "who gets the binary names" argument. But, other than some possible (mis)use of openoffice.org, it's a bit hard to see a trademark issue in play here. In addition, Debian's "iceweasel" (which is its version of the Firefox web browser) can be invoked by typing "firefox" at the command line—seemingly without any trademark complaints from Mozilla.
Rather than muttering darkly about trademarks, working with LO and the Linux distributions to find an amicable solution on binary names for both projects would seem the proper course. There has been talk of prefixing "lo" or "aoo" for things like oowriter, but the trickiest to solve is likely to be the binary name with the oldest provenance: soffice, which hearkens back to the original StarOffice, grandparent of both AOO and LO.
Brief items
Distribution quote of the week
ROSA Desktop Fresh 2012 GNOME
ROSA community members have released a GNOME variant of ROSA Desktop Fresh 2012.
Webconverger 17.0
Webconverger runs on live media (CD/USB) to provide a secure, dedicated web browser for kiosks and thin clients. The 17.0 release has been rebased from Debian squeeze (6.0) to Debian wheezy (7.0) and the new installer now supports USB installation.
Distribution News
Debian GNU/Linux
bits from the DPL: January 2013
DPL Stefano Zacchiroli notes that this is his last report before the start of the DPL election process. He will not run for another term. Other topics in these bits include the search for volunteer admins for Google Summer of Code, maintaining an authoritative list of DFSG-free licenses, trademark policy, cloud images, and more.
Debian bugs #800000 and #1000000 contest
Step right up and place your bets. The bug #700000 mark was turned on February 7, so when will bugs #800000 and #1000000 be reported?
Newsletters and articles of interest
Distribution newsletters
- DistroWatch Weekly, Issue 494 (February 11)
- SprezzOS news (February 7)
- Ubuntu Weekly Newsletter, Issue 303 (February 10)
Chakra Linux 2013.02 delivers KDE 4.10 (The H)
The H takes a quick look at Chakra Linux 2013.02. "The latest release of Chakra Linux brings the recently released KDE 4.10 to the users of the Arch Linux based distribution. Chakra Linux 2013.02, code-named "Benz", also includes updates to the distribution's own tools such as its installation assistant and its theme. Chakra was originally aimed at providing a live CD that allowed for easy uptake by new users but still maintained the powerful roots and extensive package selection of Arch. The distribution can be installed and provides a modern Linux desktop; although it is still based on Arch Linux, it now uses its own repositories."
Page editor: Rebecca Sobol
Development
LCA: Chrome OS and open firmware
At linux.conf.au, Google's Duncan Laurie presented a talk (slides) about the company's recent work writing open source firmware for its Chrome OS-based laptops. Although there are still some instances of closed firmware on the current crop of Chrome OS devices, the project is clearly making progress on that front—with potential benefits for other free software systems as well.
Laurie, who outlined his previous work on open source firmware projects at Cobalt Networks and Sun, started off by emphasizing that Chrome OS is not a general-purpose operating system. Instead, it is targeted at specific devices, over which Google has control—although, he added, there are people who have built Chromium OS (the open source version of Chrome OS, analogous to the Chromium browser) distributions for other hardware. Google's control over the devices means imposing requirements on the hardware supplier that other Linux distributions generally do not mandate, including support for security features like Chrome OS's verified boot process and inclusion of a Trusted Platform Module (TPM).
But as long as the company is dictating to the hardware supplier, he said, it thought it would be interesting to see if it could be disruptive and open up parts of the firmware payload that are usually closed. The first two generations of Chrome OS devices shipped with an EFI firmware stack that Google licensed from an existing firmware vendor. But Google has no freedom to release any of that code under an open source license, including the code it wrote to integrate its security features.
Opening up the firmware code offers several advantages to the company, including the ability to devote more resources to the firmware project if the project is in danger of missing its ship date, as well as simply having more eyes scrutinize the code. Developing firmware includes tackling a number of hard problems on modern Intel and ARM systems, he said, and working on it regularly uncovers new bugs, from the hardware level on up. Controlling the firmware also gives Google the opportunity to deliver consistency across multiple architectures, and to work on reducing power consumption and increasing performance.
BIOS and booting
Laurie then provided a detailed tour through the Chrome OS firmware stack, highlighting which pieces are open, which are not but could become open, and which have no real chance of ever being opened. His talk focused on the x86 architecture, because it is the platform he and the project as a whole have spent the most time on.
At the core of the firmware stack is coreboot, the open source BIOS replacement. Coreboot's founder and its current maintainer both work on the Chrome OS firmware team, a fact that Laurie said made coreboot "a pretty easy choice for us."
Coreboot's overall structure is quite similar to EFI, he said, since these days there is a fairly well-established set of stages involved in getting a system up and running. One key difference is that coreboot does not include a bootloader; it has a more general "payload" instead, which allows more flexibility in what to do when bringing up the board. Other interesting differences include the fact that coreboot's first stage ("bootblock") is written partly in assembly, but partly in C that is compiled with romcc, a special compiler that uses only processor registers for variable storage.
Chrome OS uses the U-Boot bootloader as its coreboot payload. This is a bit unusual for x86, he said, since U-Boot is most commonly known as an ARM bootloader, but the project chose it because Google had already done the work to integrate its verified boot process into U-Boot. Nevertheless, U-Boot is a bit more complicated than Chrome OS really requires, so the project is keeping the door open to other bootloaders in future releases.
Closed bits and open bits
Coreboot also executes some closed-source binaries, the first of which is the Intel reference code binary required to get memory up and running. This code is provided by Intel to licensed BIOS vendors. It is normally supplied as an EFI module, Laurie said, but Chrome OS wrote wrappers around it to enable its use by coreboot. The project has put binary blobs of those packages on coreboot.org for Sandy Bridge and Ivy Bridge, but the distribution terms prevent it from doing so for other generations of hardware. Google has been working with Intel to find a way for the chip vendor to distribute these newer coreboot binaries, however. The effort is called the Intel Firmware Support Package, and is documented online.
Another blob of closed code provided by Intel is the firmware for the Management Engine, a microcontroller handling various features like clock signal generation. Laurie said the Management Engine makes life more difficult for Chrome OS, since it is quite large (from 1.5 to 5 MB) and difficult to configure and debug. Configuration of the blob is mandatory, and can be done only with an application that Intel provides exclusively for Windows machines.
Chrome OS does not normally need to initialize video hardware, since the Intel i915 graphics driver in the Linux kernel can bring up and initialize graphics on its own. However, Chrome OS does need video capabilities for recovery mode and developer mode, so the project currently includes the Intel-provided binary blob for this purpose. But it is exploring another option: extracting the initialization code from the kernel driver to use separately.
The Embedded Controller (EC) is another microcontroller, found in virtually all notebooks, that is responsible for a variety of platform tasks like fan control, battery charging (which, after all, must work even when the computer is powered down), and lid and power-button control. On x86, the EC usually has both a firmware interface and an ACPI interface. Laurie described the EC as the component "closest to his heart"; the project wrote its own EC code after failing to design hardware that would work without an embedded controller at all. The code is open, and is now shipping in the Samsung ARM Chromebook. EC firmware is a tempting (if largely unknown) security target, he said, so Chrome OS has incorporated it into its verified boot process.
Firmware features
Speaking of verified boot, Laurie explained that the feature has a similar design to UEFI Secure Boot, but that its "root of trust" is read-only firmware. Replacing this read-only firmware requires physical access to the machine, but Google has provided documentation on how to do this for all Chrome OS products. In operation, the read-only firmware verifies the signature of the updatable "read-write" firmware, which in turn verifies the signature of the kernel. The read-only portion of the firmware contains the initial coreboot stages and U-Boot; the read-write firmware would contain any updates and fixes, but is primarily used to verify and boot the kernel.
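The two-step chain of trust can be illustrated with a short Python sketch. This is a toy model and not the Chrome OS implementation: the key names, the sign() and verify() helpers, and verify_chain() are all hypothetical, and an HMAC stands in for the public-key signatures that a real verified boot design would use, purely so that the example needs nothing beyond the standard library.

```python
import hashlib
import hmac

# Hypothetical key material. A real design would embed only a public key in
# the read-only firmware; HMAC is a stand-in for a signature scheme here.
ROOT_KEY = b"root-key-in-read-only-firmware"
KERNEL_KEY = b"kernel-key-carried-by-rw-firmware"


def sign(key: bytes, blob: bytes) -> bytes:
    """Return an HMAC-SHA256 tag standing in for a signature over blob."""
    return hmac.new(key, blob, hashlib.sha256).digest()


def verify(key: bytes, blob: bytes, tag: bytes) -> bool:
    """Check a tag against blob using a constant-time comparison."""
    return hmac.compare_digest(sign(key, blob), tag)


def verify_chain(rw_firmware: bytes, rw_tag: bytes,
                 kernel: bytes, kernel_tag: bytes) -> str:
    """Walk the chain: read-only firmware, read-write firmware, kernel."""
    # Step 1: the read-only firmware checks the read-write firmware.
    if not verify(ROOT_KEY, rw_firmware, rw_tag):
        return "recovery"
    # Step 2: the read-write firmware checks the kernel.
    if not verify(KERNEL_KEY, kernel, kernel_tag):
        return "recovery"
    return "boot"


rw = b"read-write firmware image"
kernel = b"kernel image"
rw_tag = sign(ROOT_KEY, rw)
kernel_tag = sign(KERNEL_KEY, kernel)

print(verify_chain(rw, rw_tag, kernel, kernel_tag))                # boot
print(verify_chain(rw, rw_tag, kernel + b"tampered", kernel_tag))  # recovery
```

The point of the structure is that a failure at either step stops the boot before the unverified stage runs; returning "recovery" here is an assumption about what a failed check would lead to.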
Chrome OS also includes a "recovery mode" that can be used to restore a
system that has been compromised: recovery mode is stored in read-only firmware
and will only boot a signed image from a USB stick, not from the
local disk. Recovery mode can be triggered if the verified boot
process detects a compromise, or can be initiated by the user (on
current Chromebooks, via a hardware switch which ensures the user is
physically present). Similar methods allow the user to access
"developer mode," which is a jailbreak mode built into every
Chromebook device. "Chrome OS is very locked down by default,
but we really don't want to lock people out from doing
interesting things with the hardware they own."
Chrome OS firmware also features a persistent log of system events—because, as Laurie explained, firmware is often the first place people point fingers when they encounter a bug. The log is based on the SMBIOS System Event log, but it has a kernel sysfs interface as well—which makes it possible to debug some kernel problems through the firmware log. The firmware also saves all console output generated during the boot process to a memory buffer, which is then exported at /sys/firmware/log.
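As a rough illustration, the exported boot console could be read like any other sysfs file. The sketch below assumes that the /sys/firmware/log path mentioned above exists and is readable on the running system; the read_firmware_log() helper and its path parameter are hypothetical names introduced here, not part of any Chrome OS tool.

```python
from pathlib import Path
from typing import List


def read_firmware_log(path: str = "/sys/firmware/log") -> List[str]:
    """Return the firmware console lines, or an empty list if unavailable."""
    try:
        # Firmware output is not guaranteed to be valid UTF-8, so replace
        # undecodable bytes instead of raising an error.
        text = Path(path).read_text(encoding="utf-8", errors="replace")
    except OSError:
        # The file is absent on systems whose firmware does not export it,
        # and may be unreadable without sufficient privileges.
        return []
    return text.splitlines()


if __name__ == "__main__":
    for line in read_firmware_log():
        print(line)
```

On a machine without this sysfs entry the function simply returns an empty list, so the script prints nothing rather than failing.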
Laurie provided a brief overview of the ARM boot process as it differs from the x86 equivalent. Although the vendor-provided binaries appear in different places, they are present on the ARM devices as well. Nevertheless, the boot process is quite similar, ending up with U-Boot booting the signature-verified kernel.
Ultimately, he concluded, it is important to have as much open firmware as possible, since the different layers interact so much even in seemingly simple scenarios. For example, opening a laptop lid cycles through a number of policy decisions (such as whether the user is logged in or not) and hardware events (such as laptop lid switch signals, routed through the Embedded Controller and relayed to the kernel through ACPI events). Most free software fans would likely agree with Laurie's sentiment, although the full inventory of firmware modules involved on a modern PC might come as a surprise to some.
Brief items
Quotes of the week
In fact, most new languages seem to be *regressing* here. Both Perl and Python were already fairly bad at this, Java bailed on the problem and shoved it under the carpet of a version of static linking, and now Go is even worse than they are and is explicitly embracing static linking. Sigh.
LibreOffice 4.0 released
The Document Foundation has announced the release of version 4.0 of the LibreOffice free office suite. As might be guessed, there is a whole pile of new features in the release, including content and document management system integration, better DOCX support, performance improvements, Android-based remote control for Impress, and more. There has also been a great deal of change to the code base: "The resulting code base is rather different from the original one, as several million lines of code have been added and removed, by adding new features, solving bugs and regressions, adopting state of the art C++ constructs, replacing tools, getting rid of deprecated methods and obsoleted libraries, and translating twenty five thousand lines of comments from German to English. All of this makes the code easier to understand and more rewarding to be involved with for the stream of new members of our community."
NumPy 1.7.0 available
Version 1.7.0 of the NumPy library is now available. This release includes a random sample generator, improvements to vectorize, and a where parameter for ufuncs, which "allows the use of boolean arrays to choose where a computation should be done".
New release of xrandr, version 1.4.0
Aaron Plattner has announced the availability of xrandr 1.4.0. This is the command line tool for accessing the X11 Resize, Rotate, and Reflect (RandR) extension. New features include support for RandR 1.4's provider objects, new scaling options, and the ability to use the Border property to "configure different border adjustments for different edges of the screen."
ODB C++ ORM 2.2.0 released
ODB 2.2.0, a major new release of the object-relational mapping (ORM) system for C++, is available. This version adds several new features, starting with the ability to connect to multiple database systems from the same application. Other features include Qt5 support, automatically-derived SQL name transformations, and "prepared queries," which are described as "a thin wrapper around the underlying database system's prepared statements functionality. Prepared queries provide a way to perform potentially expensive query preparation tasks only once and then execute the query multiple times."
SystemTap release 2.1
Version 2.1 of the system diagnostic tool SystemTap has been released. Highlights include customizable aggregate array sorting, optional time limit suppression, and experimental compiled regex support. Emacs and Vim editor modes for SystemTap are now bundled as well.
Newsletters and articles
Development newsletters from the past week
- Caml Weekly News (February 12)
- What's cooking in git.git (February 9)
- What's cooking in git.git (February 12)
- Haskell Weekly News (February 2)
- Openstack Community Weekly Newsletter (February 8)
- Perl Weekly (February 11)
- PostgreSQL Weekly News (February 11)
- Ruby Weekly (February 7)
Gräßlin: Client Side Window Decorations and Wayland
KWin hacker Martin Gräßlin discusses client side decorations (CSD) and Wayland on his blog. He notes that while Weston—the reference Wayland compositor—requires CSD, nothing in the Wayland protocol does. "I had a talk with Andy from Qt Wayland fame about the CSD implementation and he explained [to] me that inside Qt the CSD code gives some overhead and that they have a flag to turn them off. Which is great. And we in KWin already have server side decorations and will need to keep them around for legacy X applications. What's the point then to use CSD in Qt if we already have the decorations and can give the application a better performance? Well none and that's why I plan to use server side decoration in KWin on Wayland."
OpenPlans: EveryBlock and OpenBlock (and something new)
The OpenPlans blog reports on the abrupt shutdown of EveryBlock, a popular "hyperlocal" news site run by NBC (and which was initially based on open source code). "What we lost today was a powerful (closed) engine for gathering data from many different sources and making sense of it," OpenPlans says, adding that it hopes the EveryBlock shutdown will reignite interest in the open source fork of the original codebase, OpenBlock. Others have commented on the sudden shutdown as well, including Mozilla OpenNews, which said the site "exemplified new approaches" to journalism.
Chromatic: Goodnight, Parrot
Perl developer Chromatic has posted a post-mortem of sorts for the Parrot virtual machine. "Because volunteer time and interest and skills are not fungible, some of the people working Parrot had goals very different from mine. I wanted a useful and usable Perl 6 which allowed me to use (for example) PyGame and NLTK from Python and (if it had existed at the time) a fast CSS traversal engine from a JavaScript implementation. Other people wanted other things which had nothing to do with Perl 6. I won't speak for anyone else, but I suspect that the combination of a deliberate distancing of Parrot from Perl 6, including separate repositories, the arm's length of a six month deprecation policy, and an attempt to broaden Parrot's focus beyond just Rakudo created rifts that have only widened by now."
Pitt: umockdev: record and mock hardware for debugging and testing
Martin Pitt introduces umockdev, a device simulation library. "The umockdev-run program builds a sandbox using libumockdev, can load *.umockdev and *.ioctl files into it, and run a program in that sandbox. I. e. it is a CLI interface to libumockdev, which is useful in the 'debug a failure with a particular device' use case if you get the text dumps from a bug report."
LibreOffice 4.0: First Take (ZDNet)
ZDNet reviews the LibreOffice 4.0 release. "The Document Foundation (the organization behind LibreOffice) calls version 4 a milestone release. It's hard to agree, though — unless the milestone is more like the starting line. On the surface this looks like a welcome point release that improves compatibility, although bringing the Android remote presentation control to Windows will make it more significant. However, after all this time we were hoping for a much more major update."
Page editor: Nathan Willis
Announcements
Brief items
Announcing Google Summer of Code 2013
Google Summer of Code will return for its ninth year. According to the schedule, mentoring organizations will need to submit applications from March 18-29, 2013. The student application period opens April 22 and closes May 3.
OIN Exceeds 500th Licensee Milestone
Open Invention Network (OIN) has announced that the OIN licensee community exceeds 500 licensees. "While numerous defensive patent management entities have formed to protect small groups of companies, OIN is distinguished as a resource available to all of the technology industry. Established companies like Cisco, HP and Google as well as emerging entities like Twitter and Facebook benefit from leverage against patent aggression and access to OIN's shared IP resources. OIN's patents cover areas that include big data and analytics, cloud computing, smart mobile devices and mobile broadband, and social networking, among others."
Articles of interest
FSFE: I love Free Software Day
The Free Software Foundation Europe is asking free software users to show their appreciation on February 14, "I love Free Software Day". ""Every day, we use Free Software and often take it for granted. We write bug reports, tell others how they should improve their software, or ask them for new features - and often we are not shy about criticising. So, to let the people in Free Software receive a positive feedback at least once a year, there is the 'I love Free Software day'." says Matthias Kirschner, who initiated the FSFE's #ilovefs campaign."
Garrett: The Samsung laptop issue is not fixed
Matthew Garrett has posted a brief warning for Samsung laptop owners on his blog: "The recent Linux kernel commits avoid one mechanism by which Samsung laptops can be bricked, but the information we now have indicates that there are other ways of triggering this. It also seems likely that it's possible for a userspace application to cause the same problem under Windows. We're still trying to figure out the full details, but until then you're safest ensuring that you're using BIOS mode on Samsung laptops no matter which operating system you're running."
Calls for Presentations
Akademy 2013 Call for Presentations
The call for presentations and open registration have been announced for Akademy 2013. The conference will take place July 13-19 in Bilbao, Spain. The CfP closes March 15.
PyCon Singapore 2013 Call for Proposals
The call for proposals for PyCon Singapore is open until April 1. PyCon SG will take place June 13-15 in Singapore.
Upcoming Events
Events: February 14, 2013 to April 15, 2013
The following event listing is taken from the LWN.net Calendar.
| Date(s) | Event | Location |
|---|---|---|
| February 15-17 | Linux Vacation / Eastern Europe 2013 Winter Edition | Minsk, Belarus |
| February 18-19 | Android Builders Summit | San Francisco, CA, USA |
| February 20-22 | Embedded Linux Conference | San Francisco, CA, USA |
| February 22-24 | Southern California Linux Expo | Los Angeles, CA, USA |
| February 22-24 | FOSSMeet 2013 | Calicut, India |
| February 22-24 | Mini DebConf at FOSSMeet 2013 | Calicut, India |
| February 23-24 | DevConf.cz 2013 | Brno, Czech Republic |
| February 25-March 1 | ConFoo | Montreal, Canada |
| February 26-March 1 | GUUG Spring Conference 2013 | Frankfurt, Germany |
| February 26-28 | ApacheCon NA 2013 | Portland, Oregon, USA |
| February 26-28 | O’Reilly Strata Conference | Santa Clara, CA, USA |
| March 4-8 | LCA13: Linaro Connect Asia | Hong Kong, China |
| March 6-8 | Magnolia Amplify 2013 | Miami, FL, USA |
| March 9-10 | Open Source Days 2013 | Copenhagen, DK |
| March 13-21 | PyCon 2013 | Santa Clara, CA, US |
| March 15-17 | German Perl Workshop | Berlin, Germany |
| March 15-16 | Open Source Conference | Szczecin, Poland |
| March 16-17 | Chemnitzer Linux-Tage 2013 | Chemnitz, Germany |
| March 19-21 | FLOSS UK Large Installation Systems Administration | Newcastle-upon-Tyne, UK |
| March 20-22 | Open Source Think Tank | Calistoga, CA, USA |
| March 23 | Augsburger Linux-Infotag 2013 | Augsburg, Germany |
| March 23-24 | LibrePlanet 2013: Commit Change | Cambridge, MA, USA |
| March 25 | Ignite LocationTech Boston | Boston, MA, USA |
| March 30 | NYC Open Tech Conference | Queens, NY, USA |
| March 30 | Emacsconf | London, UK |
| April 1-5 | Scientific Software Engineering Conference | Boulder, CO, USA |
| April 4-5 | Distro Recipes | Paris, France |
| April 4-7 | OsmoDevCon 2013 | Berlin, Germany |
| April 6-7 | international Openmobility conference 2013 | Bratislava, Slovakia |
| April 8-9 | Write The Docs | Portland, OR, USA |
| April 8 | The CentOS Dojo 2013 | Antwerp, Belgium |
| April 10-13 | Libre Graphics Meeting | Madrid, Spain |
| April 10-13 | Evergreen ILS 2013 | Vancouver, Canada |
| April 14 | OpenShift Origin Community Day | Portland, OR, USA |
If your event does not appear here, please tell us about it.
Page editor: Rebecca Sobol
