By Jake Edge
June 27, 2012
Bdale Garbee has been involved in free software, particularly Debian, for a
long time, but he's
also been involved in the "corporate side" of open source at HP for quite some
time as well. That
gives him a good perspective on the interface between companies and free
software projects. In a bit of a reprise of a talk that goes back to 2003
or 2004,
he spoke at LinuxCon Japan on "The Business of Contribution"—how
companies and free software projects can benefit from each other.
There are multiple ways that someone can become involved in the free and
open source software (FOSS) ecosystem. They range from those who take the
code and use it in some way all the way up to those that lead and manage a
project or subsystem. Garbee said that he hoped to convince attendees to
move their way up to higher levels of engagement in FOSS.
In general FOSS projects have certain things in common. They are developed
and supported by the community, there is no single company in charge, and
there is a wide range of contributions that come from people with a variety
of interests, abilities, and motivations. The users of the software have
flexibility in how they acquire support; they can become a developer or pay
someone to do development for them. This puts them in control of their own
destiny, which stands in contrast to license arrangements where users just
get binaries.
The key to FOSS licensing is that "people we don't even know are empowered
to build things we can't even imagine", Garbee said. Everyone can take
advantage of those newly built things, both individuals and companies.
Business and community
Something that is probably immediately obvious, he said, is that the
expectations are somewhat different for FOSS developers and
companies. It is a bit difficult to characterize what FOSS developers are
looking for because each project and developer is different, but there are
some common elements.
Developers often want to scratch an itch and
solve a problem that they are having. They also are generally not working
to a particular schedule, something that is particularly true with Debian
for example, he said. FOSS developers are looking for the fun that comes
from a collaborative development project. Lastly, FOSS development is a
means to develop and maintain a personal reputation, which is something
that can be hard for companies to understand when hiring them.
On the other side of the coin, companies need to make money. In fact, in
the US, companies are legally required to make decisions based on their
revenue generation possibilities. Companies also must grow and expand to
provide the financial return that investors expect. That means that
companies want to be able to differentiate their products from their
competitors' products. Corporate reputation is also important, but there
are so many ways to lower a company's reputation, and seemingly few ways to
increase it, which makes doing new things risky, he said.
But there is good news in that there is something that the community and
companies can agree on: positive user experience. For the community, that
will increase the number of people who want to use and benefit from what it
has made. For companies, a great user
experience will help motivate customers to want to give them money.
Licenses and business models
FOSS licenses have their basis in copyright law, which differs somewhat in
various jurisdictions, but there is a lot of commonality. The choice of which
license to use for a given project is made by the founder(s) and those that
join the project later are explicitly accepting those terms for their
contributions. That "very special and unique right" to set the license is
an important decision as it can have a "really profound impact" on the
nature of the community that forms. In addition, a particular choice of
license can sometimes cause new projects to form using different license terms.
There are lots of different open source software licenses
available—60 or so—which
is a bit unfortunate. But, they break down into two basic types:
reciprocal or permissive. Reciprocal licenses require that one pass on the
same rights to the code to those you distribute it to, while permissive
licenses do not require that. The reality is that each license type has
its place, Garbee said. It is important for project founders to carefully
consider the license they choose, while it is equally important for
contributors and users to understand the license that is being used.
It is vital to recognize that "open source" is not a business model, it
is, instead, a process for development. One can use open source to enable
the sale of hardware—this is what HP does—by ensuring that
Linux runs well on that hardware. That is one possible business model, but
there are others.
Selling services, like training or support, is a common example of a
business model around FOSS. One could also give the source away, but
charge for access to the binaries. That may sound strange, but it is more
or less what the enterprise Linux distributions do, he said. Another model
is to open up the core of some large software system, perhaps a framework
and a few sample modules, and build a business around selling add-ons.
That model can work, but it sometimes becomes something of an arms race as
FOSS developers create and contribute add-ons that compete with the ones
that are sold.
Working with FOSS gives a company the "opportunity to harness the
creativity of a very diverse community". There is no company on earth, or
even a collection of companies, that can bring to bear the talents that
exist in a community like that surrounding the Linux kernel. Finding a way
to work with FOSS communities can be an enormous boon for a company.
Deciding to contribute
There are multiple benefits to a company for participating in open source.
For one, the software has "zero marginal cost", which means that a second
(or third ...)
copy costs nothing extra. The web provides a low-cost collaboration and
distribution mechanism, which also reduces costs. Most of the tools are also
available as free software, so building a product is less costly. There is
an enormous amount of existing code in the form of libraries and tools that
can be used to build a product more cheaply and with a faster time to market.
There are multiple levels of engagement with FOSS, starting with taking the
code and running it. Those who do so are valid members of the FOSS
community. Going beyond that and monitoring the development or user
mailing lists makes you a bit more engaged. The next step might be to find
and fix a bug or to answer a question on the list. Each step up gives a
person more influence in the direction of the project.
There are more and less efficient ways to use open source in a product, he
said. It is common that one or more OSS components will require some change
before they can be used, and that is "completely legally OK". If there is a
reciprocal license, the changes must be distributed, but that's not a big
deal. The problem occurs when it is time to put out a new version of the
product. If the changes have not gone upstream,
they will need to be ported to any new version of that upstream project.
On the other hand, if the changes have been contributed upstream—and
shepherded through the process of getting them merged—the features
and bug fixes needed will be there in the next versions. That means that
the work that was done once won't have to be repeated each time a new
version of the product (which may require a new version of the upstream
project) is built. That doesn't mean there is no work that needs to be
done, as the new version may have new bugs that need to be addressed, but
it is much more efficient to get changes upstream.
For the kernel, things like device drivers will be kept up to date as the
internals of the kernel change over time. The kernel is a good example
because it is such a fast moving and dynamic body of code, but the
situation exists in other projects as well.
Choosing to use FOSS is a "first step to making the best use of resources",
Garbee said. That allows a company to focus on differentiating its
product. But maximizing community involvement helps to grow the body of
technology that's available to be used in future products, without having
to reinvest over and over again.
Using FOSS does not mean that a company
has to give everything away for free. But it doesn't mean that the
company gets everything it wants for free either. There will always be a
small piece that the company will need to invest in, but in spite of that,
working with FOSS projects is an "incredibly powerful model".
Getting more involved with a project can bring other advantages too. One
can help influence the project's direction and learn about enhancements
that others are working on. Learning about security vulnerabilities and other problems in
the code earlier will also be useful. It will give insight into who is
contributing to the project, which may lead to interesting information on
what competitors are up to. Because of that, Garbee is surprised by the
email he sometimes gets from people working at other companies who didn't
expect some announcement that HP has made. Had they been paying attention
to the projects that HP was working on, the announcement would probably
have been less surprising.
One way to experiment with FOSS is to work with partners on a project
without really committing the company to doing the work, but that will not
build "FOSS
equity". The technical accomplishments and the reputation that the project builds
will accrue to the partner rather than to the company. A better way
is to have company employees
who work directly on FOSS projects. The expertise that they gain can be
reapplied to other projects, and the company gets to help set the direction
for the project. In addition, the company can take credit for the work it
is doing.
One important thing to recognize, however, is that reputation is built by
individuals. Getting code accepted involves a
combination of technical prowess and reputation within the community. In
that case, the reputation that matters is that of the individual. Linus
Torvalds has said that he doesn't really know the corporate
affiliation of most kernel developers. In order for the company to be
successful in the community, it must allow individuals to gain reputation
in that community. It is important for managers to understand this, he
said, and to make it a part of the career development of the employee.
Garbee noted that he once was talking to a peer at a competitor who asked
about how he should review a particular employee. That person had been tasked
with getting some code upstream, but ran into a competing proposal. The
employee then made sure that all of the company's needs were met in the
alternative, which eventually was merged. The peer was wondering
whether the employee should be rewarded for that, and Garbee's response was
"give the guy a raise". He recognized that the alternative proposal was
better, dealt with his ego, and ensured that the company's requirements
were met. He achieved the desired result rather than getting his own code
upstream, and that fully met the objective. That is something that many
managers have a hard time understanding, he said.
Does this really work?
Garbee then described HP's experiences with FOSS as something of a
justification for what he had been saying. In the last ten years or so, HP
has accomplished a great deal by trying to engage with FOSS projects "as
best we can". That is one of the things that has led HP to be the world's
largest IT company by many
measures, he said.
One of the more significant recent decisions that HP made with regard to
FOSS was for the HP Public Cloud, which is based on OpenStack and Ubuntu
LTS. The company has made a "significant commitment" of HP
resources to OpenStack. It is now one of the largest technical
contributors to the project and hosts its build farm. In addition, HP is
helping with the transition of OpenStack to a "foundation model", similar
to that of Eclipse or Apache.
HP has recently completed an analysis of all of its products and found that
the majority contain at least some FOSS. Any of those that do have FOSS
components are increasing that percentage rapidly. It is the most
efficient use of HP's software engineering effort, he said. All across HP,
there is
more and more direct engagement with FOSS projects, which is helping to
make the company more successful.
As a consequence of the talk, Garbee said that he hoped the audience would
find that becoming more directly engaged with FOSS projects made sense for
them and their companies. He ended with a famous quote from Margaret Mead:
Never doubt that a small group of thoughtful,
committed citizens can change the world.
Indeed, it is the only thing that ever has.
That is, he said, a strong statement of how things work in open source. If
someday you look around and wonder why it is that something isn't fixed,
remember that you are someone.
Garbee's talk was clearly aimed at the mostly Asian audience, and tried to
assist those present with arguments to make to their management about the
benefits of working more closely with FOSS projects. The arguments are
useful for anyone trying to get their company more involved in FOSS
projects, of course, but Asian companies have traditionally been less
engaged with
upstream projects. He also slipped some
rocket slides into the talk, which, beyond bringing some of his hobby into
view, also gave him an opportunity to show a more personal connection to
open source.
One of those slides showed him carrying a rather large rocket to the
launching pad (shown at right). That rocket was subsequently lost after reaching Mach 1.3
and attaining 5000m above the ground because of a
malfunctioning closed-source altimeter. That led him and Keith Packard into
a collaboration to create open hardware and software for rocket telemetry
and control, which ultimately helps him "not lose rockets". So, not only
is open source good for companies, it's good for rockets—and other hobbies
too.
By Jonathan Corbet
June 27, 2012
Proprietary kernel modules have always had an unclear status in the kernel
development community. Most (but not all) developers seem to agree that it is
possible to make a module that is not a derived product of the kernel
itself (and, thus, not subject to the requirements of the GPL); the Andrew
filesystem, ported from Unix in the early days, has been cited as an example
of this kind of module. Most developers also seem to agree that many other
types of modules cannot be written as an independent work. There is far
less agreement on where the line should be drawn, but there is a consensus
around the idea that modules should not lie about their licensing status to
gain closer access to kernel internals. The story of a company that was
recently caught shipping that sort of dishonest module shows the kind of
reaction that can be expected—and why closed-source modules are a bad idea
in general.
Loadable kernel modules can only access kernel symbols that have been
explicitly exported for that use. Two types of declarations are used to
export these symbols. EXPORT_SYMBOL_GPL() is an indication that
any module using the symbol is reaching so deeply into the kernel that it
is, by necessity, a derived product of the kernel. In contrast,
EXPORT_SYMBOL() is used for symbols that only might
indicate a derived-product status; it is also used for symbols that were
already exported before the GPL-only mechanism was added.
The kernel's module loader includes an enforcement mechanism that prevents
any module without a GPL-compatible license from accessing explicitly
GPL-only symbols. To function, this mechanism clearly must know what the
license for a specific module is. That is the weak point of the entire
scheme: the kernel uses the honor system. Loadable modules include a
special declaration that tells the kernel which license applies; the module
loader looks at that declaration and decides whether the module will have
access to GPL-only symbols or not. If the module lies about its license,
the kernel will normally trust it.
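The honor-system check described above can be modeled in a few lines of Python. This is a userspace sketch, not kernel code: the class and function names are invented for illustration, the list of accepted license strings is reproduced from memory and should be treated as an assumption, and the two symbol names stand in for one GPL-only export and one ordinary export.

```python
from dataclasses import dataclass

# License strings treated as GPL-compatible. Assumption: this list is a
# recollection of the kernel's check, not a quotation from its source.
GPL_COMPATIBLE = {
    "GPL",
    "GPL v2",
    "GPL and additional rights",
    "Dual BSD/GPL",
    "Dual MIT/GPL",
    "Dual MPL/GPL",
}


@dataclass(frozen=True)
class ExportedSymbol:
    name: str
    gpl_only: bool  # True models EXPORT_SYMBOL_GPL(), False models EXPORT_SYMBOL()


@dataclass(frozen=True)
class Module:
    name: str
    declared_license: str  # whatever the module claims; never verified
    needed_symbols: tuple


# Illustrative export table; the symbol names are examples only.
EXPORTS = {
    "printk": ExportedSymbol("printk", gpl_only=False),
    "sysfs_create_bin_file": ExportedSymbol("sysfs_create_bin_file", gpl_only=True),
}


def can_load(module, exports):
    """Return (ok, reason), trusting the module's own license declaration."""
    gpl_ok = module.declared_license in GPL_COMPATIBLE
    for sym_name in module.needed_symbols:
        sym = exports.get(sym_name)
        if sym is None:
            return False, f"unknown symbol {sym_name}"
        if sym.gpl_only and not gpl_ok:
            return False, f"{sym_name} is GPL-only"
    return True, "ok"
```

In this model, a module that honestly declares a proprietary license is refused access to the GPL-only symbol, while the same module declaring "GPL" loads without complaint, since nothing verifies the claim.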
That is where CloudLinux comes in.
This company offers a CentOS-like distribution aimed at the needs of cloud
computing providers. Essentially, CloudLinux has taken a RHEL clone,
added its own container mechanism, and created a business around providing
updates and support. The container mechanism is at least partially
implemented by the company's "lve" kernel module; lve includes an access
control and resource limit mechanism that, seemingly, is implemented
independently of the kernel's control group subsystem. It can place limits
on CPU usage, I/O bandwidth usage, and memory consumption, among other
things; all this is managed through an
administrator-friendly XML file.
Matthew Garrett recently became aware that, while the lve module claims to
be GPL-licensed, there is not any source available for it; he responded
with a patch causing the kernel to
recognize and blacklist this module. Since lve uses GPL-only symbols, this
change will prevent it from loading, essentially breaking it. This change
has yet to be pushed into the mainline, but it has been accepted by the modules maintainer (Rusty
Russell) and marked for stable updates as well. Unless something changes,
it will soon become impossible to load the lve module into any current
kernels.
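The effect of such a blacklist can be sketched as a by-name override applied before the declared license is consulted. The Python below is a simplified model, not the actual patch: the function and variable names are hypothetical, the list of trusted license strings is abbreviated, and only the module name "lve" comes from the story above.

```python
# Modules whose license claims are not believed, whatever they declare.
# "lve" is taken from the article; the set itself is a modeling choice.
DISTRUSTED_MODULES = {"lve"}

# Abbreviated, illustrative list of license strings accepted as GPL-compatible.
TRUSTED_LICENSES = {"GPL", "GPL v2", "Dual BSD/GPL"}


def effective_gpl_status(module_name, declared_license):
    """Decide whether a module may use GPL-only symbols.

    A blacklisted module is treated as proprietary even if it declares
    a GPL-compatible license.
    """
    if module_name in DISTRUSTED_MODULES:
        return False
    return declared_license in TRUSTED_LICENSES
```

Because lve depends on GPL-only symbols, being treated as proprietary is enough to keep it from loading; no separate "refuse to load" rule is needed in this model.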
The immediate effect of this change on CloudLinux is likely to be limited.
Chances are that few users download the binary lve module and load it into
their own kernels; almost all lve users are probably running the entire
CloudLinux distribution. CloudLinux could easily patch its own kernel to
remove the check for the lve module and restore the previous
functionality. But such a move would almost certainly be noticed, and
there is a distinct possibility that one or more kernel developers would
react in an unpleasant way. It would not be an advisable course of action.
What CloudLinux did instead was to apologize and plead for time:
We planned to close source the module, and we will do it later
on. Yet, it looks like one of our developers missed the point --
and did things incorrectly.
Please, give us two-three weeks to straighten things out. By the
end of three weeks I plan to have source RPMs with the GPLed
version of the modules available in our source repositories.
Later on we will have new module that is not GPL released.
This response was not entirely well received; it is not at all clear why
CloudLinux is unable to immediately ship the source corresponding to the
binary module it is distributing. That said, the community will
probably limit itself to grumbling as long as CloudLinux follows
through and comes back into compliance with the kernel's licensing. Nobody
likes a delay in a source release, but short delays are usually tolerated,
grudgingly.
So the licensing part of this episode is, probably, resolved, but
it is also interesting to look at why the lve module needs GPL-only symbols
in the first place.
The specific symbols it needs relate to the creation of binary attributes
(essentially unstructured sysfs files) in the sysfs virtual filesystem.
Much of the internal sysfs interface is GPL-only, and the functions for the
management of binary attributes (which are generally discouraged) are
certainly so. It turns out that lve creates binary attributes for a reason that would certainly never get
through a proper review:
We do a "hack", which is not a pretty one, populating /sys with
.htaccess files. This is really needed only by shared hosters,
where one of the end users on the server, could be a hacker and
could create symlinks that would later be followed by apache to
read privileged information.
As a comprehensive security solution, this implementation leaves just a
little bit to be desired.
CloudLinux is trying to reduce the amount of damage that could be done
should an attacker successfully compromise a daemon running on the system.
Kernel developers have been working for years to
mitigate just this sort of threat; the results of that work take the form
of security modules, namespaces, and more. Had CloudLinux done its work in
the open, it would certainly have been directed toward solutions that have
support from others and years of experience in actual use. Solutions, in
other words, that might actually deliver some additional security.
Instead, they got a one-off "not pretty hack" that could never get upstream.
If CloudLinux follows through on its plan to close the module entirely, it
will have to cease use of GPL-only symbols and, thus, will not be able to
implement this particular mechanism. That may be enough to convince kernel
developers to leave it alone. But it should leave others wondering what
other surprises may be lurking inside—surprises that cannot be
investigated, since the source for the module is not available.
There may well be a viable business in the creation of a well-supported
distribution aimed at the needs of cloud providers. With solid security
support and some top-tier user-space management tools, a company like
CloudLinux could well produce an offering that others are willing to pay
for. But the most robust and trustworthy server distribution will come
from using (and improving) the work that many developers have done for
years to create a useful containers implementation. It is hard to imagine
that a proprietary solution put together by a small company can work as
well. Compared to that, the risk of licensing troubles seems like a case
of adding insult to injury.
By Nathan Willis
June 26, 2012
Those of us who type in Latin characters may easily
overlook what it takes to get text into windows or command lines in
other writing systems. Entry of characters not found on one's keyboard
requires the use of an input method (IM)
which turns multiple keystrokes into characters. There are plenty of capable projects, but they
often lack deep integration into the desktop environment or widget
toolkit. In April, GNOME developer Rui Matos proposed
a feature for the upcoming GNOME 3.6 release that would integrate the
IBus framework into the
core GNOME desktop, tackling this precise challenge. IBus is a
framework that allows the user to select — and switch between — multiple IMs. The plan spawned considerable debate, not
only on the merits of IBus, but on the wisdom of tightly integrating a
single component into the desktop environment. Complicating matters
is the divide between the bulk of the GNOME developer community and
those users who depend on input methods, primarily from the
Chinese-Japanese-Korean (CJK) language communities.
At the heart of the issue is how CJK users input text. In alphabetic
or syllabic writing systems (including European, North Indic, and Middle
Eastern), there is a known mapping between the pronunciation of a
word and its written representation. In the logographic writing
systems of CJK, where the strokes of the character do not represent sounds or other subdivisions of the word, users make use of an
IM instead. There are phonetic IMs such as Pinyin (in which the
user types in the romanization of a word on the keyboard and the IM
substitutes the correct character, or logogram), shape-based IMs like Cangjie (which
decompose the logogram in a strict order), and many hybrids. In most
cases, good dictionaries or tables are required, plus word-prediction,
spell-checking, and other add-on features to cope with homophones and other
tricky bits. Mobile phone users got a taste of the IM experience
through T9
and other numeric-keypad predictive text systems, which have largely
been replaced.
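To make the phonetic case concrete, here is a toy sketch in Python of what such an IM does at its core: look up a romanized syllable and let the user pick among the logograms that share that sound. The table and the function names are invented for this example; a real IM relies on large dictionaries and word prediction rather than a two-entry table.

```python
# Toy candidate table: romanized syllable -> logograms that share the sound.
# The entries are a tiny hand-picked sample, not real IM data.
CANDIDATES = {
    "ma": ["妈", "马", "吗"],
    "zhong": ["中", "种", "重"],
}


def lookup(romanized):
    """Return the candidate logograms for a romanized syllable."""
    return list(CANDIDATES.get(romanized.lower(), []))


def commit(romanized, choice):
    """Return the logogram the user selected.

    `choice` is 1-based, matching the numbered entries that a candidate
    pop-up typically shows.
    """
    options = lookup(romanized)
    if not 1 <= choice <= len(options):
        raise ValueError(f"no candidate {choice} for {romanized!r}")
    return options[choice - 1]
```

Even this tiny table shows the homophone problem: typing "ma" is ambiguous among several characters, so the IM cannot finish the job without either a user choice or some prediction from context.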
For everyday typing, the challenge is greater because no one input
method is inherently superior: if you do not know how to pronounce an
unfamiliar word, a phonetic method is no use, and switching to
a shape-based method makes sense. On the other hand, typing a word
that you hear but have not seen written down demands a phonetic
method. Likewise, regional accents can have very different
pronunciations for the same logogram, and there is often more than one
way to decompose a logogram by shape — not to mention the
problem of writing reforms like Simplified Chinese, which is not
simplified uniformly between mainland Chinese hanzi and Japanese
kanji. Thus, almost no IM can be relied upon to the
exclusion of all others. Throw in open source developers' penchant
for reinventing the wheel to scratch their own itches, and the result
is multiple IM frameworks, each of which
can load individual IMs as the user chooses.
Supporting all of these frameworks is a challenge, one that Matos
(who works mostly on gnome-control-center, GNOME Shell, and other
desktop components) felt detracted from GNOME's ability to provide a
consistent desktop experience and left CJK users a step behind those
users whose writing system works out-of-the-box. As he explained
on the desktop-devel list, IM framework proliferation reduces the odds
that any of the frameworks will get enough testing to be robust, and
it greatly complicates bug triage for GNOME. Plus, many of the
popular IM frameworks also attempt to be cross-desktop, which makes
integrating them seamlessly into GNOME difficult. Their visual
appearance does not match, they conflict with core GNOME settings like
XKB configuration and key bindings, and they cannot be automatically
started (relying either on user interaction or shell scripts to
launch). The fix he proposed is to take one IM framework, tie it in
more directly to GNOME (including adding or extending GNOME APIs where
needed), and make sure that it works for the majority of users.
Specifically, Matos said, GNOME has three requirements to properly
integrate an IM framework with the desktop: a GTK+ module, a D-Bus API
that can enumerate, activate, and configure installed IMs, and a D-Bus
API on GNOME's side that the framework can use to draw predictive-text
pop-up windows. Right now, he said, only IBus meets all of the
requirements. Owen Taylor added
some additional requirements to integrate the framework with GNOME
accessibility and configuration standards, and a quality set of
existing IMs. The IBus team expressed a willingness to work with
GNOME, which was also critical.
Feedback from the CJK community
However, the plan elicited strong reactions from some members of the
CJK user community. Marguerite Su replied
that few if any CJK users like IBus, with most preferring rival
framework Fcitx out of the existing
options available on
desktop Linux. Others chimed in with criticism of IBus. The
complaints included several overlapping issues, including
customization features, IM engine quality, and speed. Su elaborated
on the feature disparity, saying that some users want to customize
shortcuts, rearrange the word-completion suggestions, or fetch new
dictionary words from Internet servers. But trading IBus for Fcitx is
not the simple solution. IBus works well for Simplified Chinese, but not
Traditional Chinese, while Fcitx is focused largely on Chinese and has less robust
support for other languages, including Japanese (although it has plans
to improve its Japanese support). Chinese users are greater in
number, which might argue in favor of Fcitx, but surely the best
solution would be to find a framework that serves everyone. Support
for writing systems unrelated to the CJK family (such as Tibetan or
Thai) is weaker in general across the IM frameworks.
But over the course of the sometimes-heated list discussion, the IBus critics
proposed not simply adopting Fcitx outright, but keeping IM frameworks
a user-controlled, pluggable option. For example, Liang Suilong, who argued
that IBus is laggy, still advocated building an "IM framework
framework" for GNOME, thus allowing users to choose the framework they
preferred. But that idea was not well-received by GNOME developers.
Tomas Frydrych summed
up the two sides of the divide:
Rather a long discussion over IBus, but it seems
to more or less boil down to two voices and this:
Gnome developers: we want tighter IM integration and simpler UI in the
name of better UX, and are looking at IBus as the underlying technology,
Users: IBus has poor support for CJK input and a history of not
addressing these problems.
As others pointed out on the list, the latter point was not
accurate; IBus does support CJK, but not all users are happy with its
IM implementations or auxiliary features. Complicating matters more is
the cultural divide between the groups. Weng Xuetian pointed
out that the core GNOME developers are not IM users (most speak
European languages), which makes them unqualified to select the "best"
IM framework to then be used by a large number of GNOME users who were
not able to participate in the discussion.
Frameworks, not engines
But the groundswell of IBus criticism was not the final word. Tommy He said
that the debate was too focused on CJK input, to the exclusion of
other writing systems, and moreover that much of the lobbying in favor
of Fcitx focused on the quality of the various IM engines, rather than
the frameworks themselves. After all, if Fcitx can add a new and
improved Japanese IM, then IBus could add one for Traditional Chinese.
Admittedly, the framework-versus-IM debate is a difficult one to pin
down. On the one hand, Fcitx proponents enjoy its more flexible
plugin system, which allows user-defined macros and other daily-use
features. But that same flexibility could make it more difficult to
integrate with GNOME's existing language settings and preferences,
when providing a better first-run experience is explicitly one of the
goals. As Owen Taylor observed,
Fcitx generates its configuration GUI on-the-fly, which is a technique
GNOME finds problematic and tries to avoid.
On May 14, Taylor attempted to re-center
the discussion to the framework question, arguing
that GNOME needs to pick one framework, then make it work "as
well as we can possibly make it for all users. Multiple partially
working frameworks are not a substitute." When the
IBus-versus-Fcitx debate briefly resurfaced, Bastien Nocera advocated
starting the IBus integration work as soon as possible, then inviting
Fcitx to bring its own code demonstrating that it could fit the role
better. "A lot of the code should be reusable, and you can
show how much more awesome and complete your favourite IMF is."
When it looked like the core GNOME developers had made their final
decision to integrate IBus, a few IBus detractors asked the team to
postpone the decision for further discussion, but without success.
Nocera reiterated
his opinion that it would be better to pick an imperfect IM framework
and replace it in later releases than to do nothing:
If we choose to merge integration based on IBus (because of a variety of
reasons), then two things can happen:
- Developers of other Input Frameworks can start creating patches to the
upstream GNOME to provide a better integration than the default choice.
- They choose to start working on the selected IMF because it's the
selected IMF
- They choose to concentrate on other desktops
In all cases, the implementation will evolve, and the integration will
get better. I don't want to have the choice between 2 equally badly
integrated IMFs for GNOME.
But Jasper St. Pierre compared
the decision to other components, like audio. GNOME does not attempt
to support ESD and OSS in addition to PulseAudio due to limited
resources, he said, and GNOME's choice of IM frameworks will not force
distributions to follow suit. "I doubt we have the resources to
support both IBus and FCITX, and provide a good experience for
both. Individual distributions may, but that's their call, not
GNOME's."
The Fcitx lobby may not be satisfied with that response, but it looks
like IBus is here to stay for the 3.6 development cycle at the very
least. The GNOME wiki page
lists seven tracker bugs to follow the integration with GNOME Shell,
GNOME control center, and other components. There is still plenty of
integration and debugging work to be done, including improving IBus's
support
for Hong Kong localization and reducing interference
with other applications.
In an email, Matos said that he thought many of the user concerns
about losing Fcitx boiled down to UI/UX issues, particularly
attachment to "things like skins and themes for the UI
popups. In this regard I just can say that as far as gnome-shell is
(somewhat) themable people will be able to do themes for gnome-shell's
IM popups." On the other hand, he added, there are Fcitx
features like spell-checking that should be handled desktop-wide,
while features like retrieving word lists from an online service
"is orthogonal and there's no reason it can't be implemented [in]
IBus."
It may not be an easy journey, but clearly
pulling CJK users into the fold with first-order input support has
potential benefits that far outweigh the costs.
Comments (43 posted)
Page editor: Jonathan Corbet
Security
By Nathan Willis
June 27, 2012
UEFI Secure boot is expected to interfere with many users' desire
to replace Windows or dual-boot it with Linux, because Microsoft is mandating that
secure boot be enabled on Windows 8 machines at the time of sale. On
June 5, we reported on Fedora's
plans for handling the secure boot mechanism in UEFI. Ubuntu has
subsequently announced its own plans, which take a different approach.
To recap, the secure boot feature constrains the hardware to boot only
software that has been signed by a known cryptographic key. The point
is that booting only signed, trusted binaries prevents attacks through
boot-time malware that could be undetectable after the infected system
is up and running. Microsoft is requiring hardware vendors to have secure
boot enabled if they want to include the official logo for
the upcoming Windows 8, although x86 vendors are also required to
allow the machine's owner to turn off secure boot entirely
or to install new keys. That option is regarded as insufficient for
several reasons, notably that there may be users who are required
(e.g., by office rules) to keep secure boot switched on, and that
entering new keys for every alternative OS is likely to be an arduous
process (even more so for the scenario where one needs to boot a
temporary OS, such as from a CD or USB key).
Fedora's strategy is to enroll in Microsoft's developer program, which
allows the project to purchase an approved $99 key through Verisign,
a key which will be recognized by UEFI secure boot. The key will be
used to sign the shim
bootloader, which is a "trivial UEFI first-stage bootloader" whose
only job is to boot GRUB2. Fedora will also sign the GRUB2 bootloader
and the kernel, although the latter two binaries can be signed with
the Fedora project's own keys.
Ubuntu's plan
Canonical posted
a brief announcement about its own secure boot plan on the company
blog on June 22, although the details were to be found in Steve
Langasek's message
to the ubuntu-devel mailing list. Canonical has generated its own
signing key which will be pre-loaded on machines that ship with
Ubuntu already installed. Ubuntu CDs will ship with a shim bootloader
(the same shim bootloader used by Fedora) signed by one of the existing
Microsoft-certified keys, much like the Fedora plan.
After that point, however, the distribution is taking a markedly
different approach to the trusted bootloader chain. An Ubuntu system
will boot into the efilinux bootloader,
which will in turn boot an unsigned kernel image. Under Fedora's plan,
the shim bootloader verifies the integrity of GRUB2 before loading it,
and GRUB2 in turn verifies the integrity of the kernel. Canonical
says that its reading of the specification makes it clear that its
secure boot responsibilities stop at the bootloader, and do not extend to
the kernel:
We believe that the intention of secure boot is to protect against
malicious use or modification of pre-boot code, before the
ExitBootServices UEFI service is invoked. Currently, this call is
performed by the boot loader, before the kernel is executed.
Therefore, we will only be requiring authentication of boot loader
binaries. Ubuntu will not require signed kernel images or kernel
modules.
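The difference between the two plans can be sketched as a toy model of the boot chain. This is only an illustration, not either distribution's code: real secure boot checks cryptographic signatures, while here a signature is reduced to a key label, and the labels ("microsoft", "fedora", "canonical") and the idea that efilinux is signed with Canonical's key are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class Stage:
    name: str
    signed_by: Optional[str]              # key label, or None if unsigned
    checks_next: bool = False             # does this stage verify the next one?
    trusts: FrozenSet[str] = frozenset()  # keys it accepts for the next stage


def boot(firmware_keys: Iterable[str], chain: List[Stage]) -> List[str]:
    """Walk the chain; raise PermissionError when a required check fails."""
    trusted = frozenset(firmware_keys)
    enforcing = True  # the firmware always checks the first stage
    loaded = []
    for stage in chain:
        if enforcing and stage.signed_by not in trusted:
            raise PermissionError(f"{stage.name}: not signed by a trusted key")
        loaded.append(stage.name)
        # The stage just loaded decides whether, and how, the next is checked.
        enforcing, trusted = stage.checks_next, stage.trusts
    return loaded


FIRMWARE_KEYS = {"microsoft"}  # hypothetical label for the pre-loaded key


def fedora_chain(kernel_key: Optional[str] = "fedora") -> List[Stage]:
    fedora = frozenset({"fedora"})
    return [
        Stage("shim", "microsoft", checks_next=True, trusts=fedora),
        Stage("grub2", "fedora", checks_next=True, trusts=fedora),
        Stage("kernel", kernel_key),
    ]


def ubuntu_chain(kernel_key: Optional[str] = None) -> List[Stage]:
    return [
        Stage("shim", "microsoft", checks_next=True,
              trusts=frozenset({"canonical"})),
        Stage("efilinux", "canonical", checks_next=False),  # stops checking here
        Stage("kernel", kernel_key),
    ]
```

In this model, `boot(FIRMWARE_KEYS, fedora_chain(kernel_key=None))` raises because GRUB2 refuses an unsigned kernel, while the Ubuntu-style chain loads whatever kernel it is handed once the bootloader has been authenticated.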
The decision to use efilinux has its own justification. Because GRUB2
is licensed under the GPLv3, Canonical determined that machines with
Ubuntu pre-installed are subject to the "User Product" provisions of
GPLv3, which require
that the distributor provide the user with all authorization keys
required to install the software. The company consulted with the FSF
about that topic, and was warned that the authorization key clause
would probably (although not definitely) apply. Thus, if a hardware
vendor shipped an Ubuntu system and did not include a way for users to
install keys of their own, Canonical would be compelled to disclose its
key. Revealing the signing key would undermine the point
of secure boot and "at that point our certificates would of course
be revoked and everyone would end up worse off."
Signatures, revocation, and other fine print
Ubuntu's decision to use its own key for pre-installed machines has
spawned relatively little debate, but there is a sharp disagreement
over the decision not to sign kernel images. Red Hat's Matthew
Garrett (who authored the Fedora secure boot plan) argued
that signing only the bootloader is insufficient:
How are you going to prevent your bootloader from being used to launch a
trojaned Fedora kernel, for instance? This is the kind of decision that
doesn't just affect Ubuntu, it has ramifications for the security model
that other distributions use. This makes it impossible to implement any
kind of signed userspace unless the user explicitly revokes the Ubuntu
bootloader first or uses their own trust chain.
Jamie Strandboge replied
that "the UEFI specification and the Windows 8 logo requirements
is that Secure Boot is designed to protect early boot only,"
and that signing the kernel and large portions of userspace is
unattractive for several reasons, "not least of which is that it
reduces the utility of the distribution."
Strandboge also contended that signing the kernel does not offer a
significant level of protection over signing the bootloader, because
the existence of any exploitable bootloader undermines the
trust chain for all OS vendors. The argument goes that if
DistroX's signed bootloader is vulnerable, malware authors could use
it to create a malicious live CD image that will boot even on a
machine that normally runs DistroY's secure bootloader with its signed
kernel. Thus, signing the kernel image is useful for creating a
trusted environment for user space, but it does not strengthen the
protection of secure boot itself.
There is also the open question of how key-revocations and other
updates to the secure boot world will work in practice. Both Fedora
and Ubuntu plan to make use of a "shim" bootloader so that they can
issue updates to the main bootloader without getting the updates
signed by Microsoft. But the distributions will also need to issue
revocations for vulnerable, signed bootloader and/or kernel images, and
the process by which the OS vendor pushes those updates out has yet to
be determined.
Although most multi-boot discussions revolve around dual-booting
Windows and a single Linux distribution, that is hardly the only
scenario. Canonical said that it will not offer its own signing key to
sign the bootloaders of other distributions or vendors, which some
feared would make it impossible to install, for example, Fedora on a
machine that comes with Ubuntu pre-installed. However, the owners
of machines pre-loaded with Ubuntu will still be able to install
Fedora or other OSes in tandem, because the company will require its
OEMs to include the Microsoft key in the secure boot key database
alongside the Ubuntu key.
As Windows 8 draws near, the questions about UEFI secure boot and
its impact on users continue to swirl. Clearly there are risks in
handing the ultimate say in booting one's machine to a third party
(particularly a rival OS vendor like Microsoft), and even though two of the largest
distributions have crafted plans for dealing with secure boot's
restrictions, how much of an imposition the final product turns out to be
still hinges on unknowns like the revocation and update process. But the
biggest question that remains is whether it is wise to tacitly endorse secure
boot by playing its games in the first place. On that, the community may
never arrive at a single answer.
Comments (26 posted)
Brief items
If Microsoft's "reputation" database can't tell the difference between a
gambling site and an independently audited registered nonprofit
public-interest charity founded almost 30 years ago, it is certainly doing
you and your business more harm than good.
-- The Free Software Foundation is unimpressed at being tagged as a gambling site
Amazingly, Accenture, which sold its crap-on-a-stick high-school sophomoric completely insecure malfunctioning voter registration software to a bunch of states, so unsuccessfully that Colorado refused to pay and others, like Wisconsin and Shelby County, bought out the source code in order to try to bandaid it into a functional system, has decided to issue a DMCA protective order against Black Box Voting for exposing its flawed software.
Last time a voting system company did a DMCA takedown notice (Diebold, in 2004) it got socked with punitive charges for abusing the Digital Millennium Copyright Act, trying to use it to block distribution of material clearly published in the public interest.
-- Bev Harris gets a DMCA takedown request (the entire thread is interesting)
The firm gathers publicly available voter files from all 50 states and supplements this with records of political donations and other profiles purchased from commercial data brokers, says CEO Jeff Dittus. Then, working with about 100 high-traffic websites that register their users, they can match the offline data to the online identities of individuals.
Few Web surfers realize how widely data about them gets bought, sold, and
combined. But the practice is common. In a recent investigation, ProPublica
revealed that Microsoft and Yahoo each offer political campaigns the ability to target voters in similar ways.
-- Jessica Leber in Technology Review
Comments (1 posted)
Steve Langasek has posted a set of details on how Ubuntu's UEFI secure boot
mechanism will work. There are some real differences from the approach
taken by Fedora. "Microsoft's Windows 8 logo requirements do say that there must be a way
for users to disable secure boot or to install their own keys, and we
strongly support this in our own firmware guidelines; but in the event
that a manufacturer makes a mistake and delivers a locked-down system
with a GRUB 2 image signed by the Ubuntu key, we have not been able to
find legal guidance that we wouldn't then be required by the terms of
the GPLv3 to disclose our private key in order that users can install a
modified boot loader. At that point our certificates would of course be
revoked and everyone would end up worse off."
Full Story (comments: 112)
The H discusses a new demonstration application, published by a German security researcher, that is capable of reading credit card information over NFC. "Contactless credit card systems have been hacked in the past and while the problems with the technology are worrisome, access via NFC is not a viable way to harvest a great amount of credit card data for obvious reasons. The relatively easy availability of smartphone applications like paycardreader will most likely make them attractive for opportunist fraudsters, however."
Comments (12 posted)
New vulnerabilities
apache: privilege escalation
| Package(s): | apache |
CVE #(s): | CVE-2012-0883
|
| Created: | June 25, 2012 |
Updated: | February 12, 2013 |
| Description: |
From the CVE entry:
envvars (aka envvars-std) in the Apache HTTP Server before 2.4.2 places a zero-length directory name in the LD_LIBRARY_PATH, which allows local users to gain privileges via a Trojan horse DSO in the current working directory during execution of apachectl. |
| Alerts: |
|
Comments (2 posted)
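A zero-length entry of the kind described in the apache entry above typically appears when a script prepends a directory to a search-path variable that was empty or unset. The following is a minimal sketch of that pattern and of a check for it; it is not Apache's envvars script, and the directory name is hypothetical.

```python
def build_path_unsafe(new_dir: str, existing: str) -> str:
    # Mirrors the shell idiom VAR="new_dir:$VAR"; if $VAR is empty,
    # the result ends in ":" and so contains a zero-length entry.
    return f"{new_dir}:{existing}"


def build_path_safe(new_dir: str, existing: str) -> str:
    # Only add the separator when there is something to separate.
    return f"{new_dir}:{existing}" if existing else new_dir


def has_empty_entry(path_value: str) -> bool:
    # A zero-length entry is what the CVE text identifies as the problem:
    # the loader ends up searching the current working directory.
    return "" in path_value.split(":")


APACHE_LIB = "/opt/apache/lib"  # hypothetical install location

print(has_empty_entry(build_path_unsafe(APACHE_LIB, "")))  # True
print(has_empty_entry(build_path_safe(APACHE_LIB, "")))    # False
```

When the variable already holds a value such as `/usr/lib`, both helpers produce the same string, which is why the problem only shows up in environments where the variable starts out empty.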
asterisk: denial of service
| Package(s): | asterisk |
CVE #(s): | CVE-2012-3553
|
| Created: | June 26, 2012 |
Updated: | June 27, 2012 |
| Description: |
From the Red Hat bugzilla:
AST-2012-008 previously dealt with a denial of service attack exploitable in the Skinny channel driver that occurred when certain messages are sent after a previously registered station sends an Off Hook message. Unresolved in that patch is an issue in the Asterisk 10 releases, wherein, if a Station Key Pad Button Message is processed after an Off Hook message, the channel driver will inappropriately dereference a Null pointer.
Similar to AST-2012-008, a remote attacker with a valid SCCP ID can can use this vulnerability by closing a connection to the Asterisk server when a station is in the "Off Hook" call state and crash the server.
This only affects version 10, and is fixed in 10.5.1. |
| Alerts: |
|
Comments (none posted)
dhcpcd: remote code execution
| Package(s): | dhcpcd |
CVE #(s): | CVE-2012-2152
|
| Created: | June 25, 2012 |
Updated: | June 27, 2012 |
| Description: |
From the Debian advisory:
It was discovered that dhcpcd, a DHCP client, was vulnerable to a stack
overflow. A malformed DHCP message could crash the client, causing a denial of
service, and potentially remote code execution through properly designed
malicious DHCP packets. |
| Alerts: |
|
Comments (none posted)
gdk-pixbuf: integer overflow
| Package(s): | gdk-pixbuf |
CVE #(s): | CVE-2012-2370
|
| Created: | June 25, 2012 |
Updated: | January 17, 2013 |
| Description: |
From the Gentoo advisory:
The "read_bitmap_file_data()" function in io-xbm.c contains an
integer overflow error |
| Alerts: |
|
Comments (none posted)
ImageMagick: integer overflow
| Package(s): | ImageMagick |
CVE #(s): | CVE-2012-1620
|
| Created: | June 22, 2012 |
Updated: | June 27, 2012 |
| Description: |
From the Red Hat Bugzilla entry:
An out-of heap-based buffer read flaw was found in the way ImageMagick, an image display and manipulation tool for the X Window System, retrieved Exchangeable image file format (Exif) header tag information from certain JPEG files. A remote attacker could provide a JPEG image file, with EXIF header containing specially-crafted tag values, which once opened in some ImageMagick tool would lead to the crash of that tool (denial of service).
|
| Alerts: |
|
Comments (none posted)
kernel: NX emulation suspected broken
| Package(s): | kernel |
CVE #(s): | |
| Created: | June 25, 2012 |
Updated: | June 27, 2012 |
| Description: |
From the Fedora advisory:
Disabled 32bit NX emulation. Suspected of being broken and it deviates
from upstream.
|
| Alerts: |
|
Comments (none posted)
kernel: denial of service and iptables bypass
Comments (none posted)
libpng: multiple vulnerabilities
| Package(s): | libpng |
CVE #(s): | CVE-2009-5063
CVE-2011-3464
|
| Created: | June 22, 2012 |
Updated: | October 22, 2012 |
| Description: |
From the Gentoo advisory:
Multiple vulnerabilities have been discovered in libpng:
* The "embedded_profile_len()" function in pngwutil.c does not check
for negative values, resulting in a memory leak (CVE-2009-5063).
* The "png_formatted_warning()" function in pngerror.c contains an
off-by-one error (CVE-2011-3464).
|
| Alerts: |
|
Comments (none posted)
libwpd: code execution
| Package(s): | libwpd |
CVE #(s): | CVE-2012-2149
|
| Created: | June 27, 2012 |
Updated: | July 6, 2012 |
| Description: |
From the Red Hat advisory:
A buffer overflow flaw was found in the way libwpd processed certain
Corel WordPerfect Office documents (.wpd files). An attacker could provide
a specially-crafted .wpd file that, when opened in an application linked
against libwpd, such as OpenOffice.org, would cause the application to
crash or, potentially, execute arbitrary code with the privileges of the
user running the application. |
| Alerts: |
|
Comments (none posted)
links: multiple vulnerabilities
| Package(s): | links |
CVE #(s): | |
| Created: | June 26, 2012 |
Updated: | July 10, 2012 |
| Description: |
From the Gentoo advisory:
A SSL verification vulnerability and two unspecified vulnerabilities
have been discovered in Links. Please review the Secunia Advisory
referenced below for details.
An attacker might conduct man-in-the-middle attacks. The unspecified
errors could allow for out-of-bounds reads and writes. |
| Alerts: |
|
Comments (none posted)
logrotate: symlink and hard link attacks
| Package(s): | logrotate |
CVE #(s): | CVE-2011-1549
|
| Created: | June 26, 2012 |
Updated: | June 27, 2012 |
| Description: |
From the CVE entry:
The default configuration of logrotate on Gentoo Linux uses root privileges to process files in directories that permit non-root write access, which allows local users to conduct symlink and hard link attacks by leveraging logrotate's lack of support for untrusted directories, as demonstrated by directories under /var/log/ for packages. |
| Alerts: |
|
Comments (none posted)
mantis: multiple vulnerabilities
| Package(s): | mantis |
CVE #(s): | CVE-2012-1118
CVE-2012-1119
CVE-2012-1120
CVE-2012-1122
CVE-2012-1123
CVE-2012-2692
|
| Created: | June 25, 2012 |
Updated: | November 9, 2012 |
| Description: |
From the Debian advisory:
CVE-2012-1118:
Mantis installation in which the private_bug_view_threshold
configuration option has been set to an array value do not
properly enforce bug viewing restrictions.
CVE-2012-1119:
Copy/clone bug report actions fail to leave an audit trail.
CVE-2012-1120:
The delete_bug_threshold/bugnote_allow_user_edit_delete
access check can be bypassed by users who have write
access to the SOAP API.
CVE-2012-1122:
Mantis performed access checks incorrectly when moving bugs
between projects.
CVE-2012-1123:
A SOAP client sending a null password field can authenticate
as the Mantis administrator.
CVE-2012-2692:
Mantis does not check the delete_attachments_threshold
permission when a user attempts to delete an attachment from
an issue. |
| Alerts: |
|
Comments (none posted)
mediawiki: multiple vulnerabilities
| Package(s): | mediawiki |
CVE #(s): | CVE-2010-2789
CVE-2011-0537
CVE-2012-1578
CVE-2012-1579
CVE-2012-1580
CVE-2012-1581
CVE-2012-1582
|
| Created: | June 22, 2012 |
Updated: | June 27, 2012 |
| Description: |
From the Gentoo advisory:
MediaWiki allows remote attackers to bypass authentication, to perform
imports from any wgImportSources wiki via a crafted POST request, to
conduct cross-site scripting (XSS) attacks or obtain sensitive
information, to inject arbitrary web script or HTML, to conduct
clickjacking attacks, to execute arbitrary PHP code, to inject
arbitrary web script or HTML, to bypass intended access restrictions
and to obtain sensitive information.
|
| Alerts: |
|
Comments (none posted)
mini-httpd: code execution
| Package(s): | mini-httpd |
CVE #(s): | CVE-2009-4490
|
| Created: | June 25, 2012 |
Updated: | June 27, 2012 |
| Description: |
From the Gentoo advisory:
mini_httpd does not properly check for shell escapes when parsing HTTP
requests.
A remote attacker could send specially crafted HTTP requests, possibly
resulting in execution of arbitrary code with the privileges of the
process, or allowing for overwriting of files. |
| Alerts: |
|
Comments (none posted)
mono and mono-debugger: multiple vulnerabilities
| Package(s): | mono and mono-debugger |
CVE #(s): | CVE-2010-3332
CVE-2010-3369
CVE-2010-4225
|
| Created: | June 22, 2012 |
Updated: | June 27, 2012 |
| Description: |
From the Gentoo advisory:
A remote attacker could execute arbitrary code, bypass general
constraints, obtain the source code for .aspx applications, obtain
other sensitive information, cause a Denial of Service, modify internal
data structures, or corrupt the internal state of the security manager.
A local attacker could entice a user into running Mono debugger in a
directory containing a specially crafted library file to execute
arbitrary code with the privileges of the user running Mono debugger.
A context-dependant attacker could bypass the authentication mechanism
provided by the XML Signature specification.
|
| Alerts: |
|
Comments (none posted)
mosh: denial of service
| Package(s): | mosh |
CVE #(s): | CVE-2012-2385
|
| Created: | June 26, 2012 |
Updated: | April 10, 2013 |
| Description: |
From the Red Hat bugzilla:
A denial of service flaw was found in the way mosh, a remote terminal application, performed processing of parameters that have been passed to the terminal in the terminal dispatcher class (previously there was no limit for the count of parameters, which were allowed to be passed to the dispatcher). A remote attacker could use this flaw to cause a denial of service (mosh server to enter long for loop when trying to process the parameters) via specially-crafted escape sequence string. |
| Alerts: |
|
Comments (none posted)
msmtp: X.509 NULL spoofing
| Package(s): | msmtp |
CVE #(s): | CVE-2009-3942
|
| Created: | June 26, 2012 |
Updated: | June 27, 2012 |
| Description: |
From the CVE entry:
Martin Lambers msmtp before 1.4.19, when OpenSSL is used, does not properly handle a '\0' character in a domain name in the (1) subject's Common Name or (2) Subject Alternative Name field of an X.509 certificate, which allows man-in-the-middle attackers to spoof arbitrary SSL servers via a crafted certificate issued by a legitimate Certification Authority, a related issue to CVE-2009-2408. |
| Alerts: |
|
Comments (none posted)
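The '\0' problem in the msmtp entry above comes from treating a length-prefixed certificate field as a C string. The sketch below simulates that truncation in Python; it is not msmtp's or OpenSSL's code, and the host names are made-up examples.

```python
def as_c_string(value: bytes) -> bytes:
    """Return what C string functions would see: bytes up to the first NUL."""
    return value.split(b"\x00", 1)[0]


def naive_match(cert_name: bytes, host: str) -> bool:
    # Vulnerable pattern: compare only the part before the first NUL.
    return as_c_string(cert_name) == host.encode("ascii")


def careful_match(cert_name: bytes, host: str) -> bool:
    # Reject any name with an embedded NUL, then compare the full field.
    if b"\x00" in cert_name:
        return False
    return cert_name == host.encode("ascii")


# A CA would see this as a name under attacker.example, which the
# attacker legitimately controls.
crafted = b"www.bank.example\x00.attacker.example"

print(naive_match(crafted, "www.bank.example"))    # True: spoof succeeds
print(careful_match(crafted, "www.bank.example"))  # False
```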
nbd: denial of service
| Package(s): | nbd |
CVE #(s): | CVE-2011-1925
|
| Created: | June 26, 2012 |
Updated: | June 27, 2012 |
| Description: |
From the CVE entry:
nbd-server.c in Network Block Device (nbd-server) 2.9.21 allows remote attackers to cause a denial of service (NULL pointer dereference and crash) by causing a negotiation failure, as demonstrated by specifying a name for a non-existent export. |
| Alerts: |
|
Comments (none posted)
network-manager: insecure WPA AdHoc connections
| Package(s): | network-manager |
CVE #(s): | CVE-2012-2736
|
| Created: | June 27, 2012 |
Updated: | September 12, 2012 |
| Description: |
From the Ubuntu advisory:
It was discovered that certain wireless drivers incorrectly handled the
creation of WPA-secured AdHoc connections. This could result in AdHoc
wireless connections being created without any security at all. This update
removes WPA as a security choice for AdHoc connections in NetworkManager. |
| Alerts: |
|
Comments (none posted)
nvidia-drivers: privilege escalation
| Package(s): | nvidia-drivers |
CVE #(s): | CVE-2012-0946
|
| Created: | June 25, 2012 |
Updated: | June 27, 2012 |
| Description: |
From the Gentoo advisory:
A vulnerability has been found in the way NVIDIA drivers handle
read/write access to GPU device nodes, allowing access to arbitrary
system memory locations. A local attacker could gain escalated privileges. |
| Alerts: |
|
Comments (none posted)
openjpeg: code execution
| Package(s): | openjpeg |
CVE #(s): | CVE-2012-1499
|
| Created: | June 21, 2012 |
Updated: | June 28, 2012 |
| Description: |
From the Gentoo advisory:
An error in jp2.c of OpenJPEG could allow an out-of-bounds write error.
A remote attacker could entice a user to open a specially crafted JPEG
file, possibly resulting in execution of arbitrary code or a Denial of
Service condition.
|
| Alerts: |
|
Comments (none posted)
php: information disclosure/arbitrary code execution
| Package(s): | php |
CVE #(s): | CVE-2010-2950
|
| Created: | June 27, 2012 |
Updated: | July 2, 2012 |
| Description: |
From the Red Hat advisory:
A format string flaw was found in the way the PHP phar extension processed
certain PHAR files. A remote attacker could provide a specially-crafted
PHAR file, which once processed in a PHP application using the phar
extension, could lead to information disclosure and possibly arbitrary code
execution via a crafted phar:// URI. |
| Alerts: |
|
Comments (none posted)
python-httplib2: use of incorrect certificates
| Package(s): | python-httplib2 |
CVE #(s): | |
| Created: | June 25, 2012 |
Updated: | April 10, 2013 |
| Description: |
From the openSUSE advisory:
python-httplib2 used to ship it's own copy of Mozilla NSS
certificates, but should use the system-wide ones instead. |
| Alerts: |
|
Comments (none posted)
roundcubemail: cross-site scripting
| Package(s): | roundcubemail |
CVE #(s): | CVE-2012-1253
|
| Created: | June 22, 2012 |
Updated: | June 27, 2012 |
| Description: |
From the Red Hat Bugzilla entry:
Cross-site scripting (XSS) vulnerability in Roundcube Webmail before
0.7, when Internet Explorer is used, allows remote attackers to inject
arbitrary web script or HTML via vectors involving an embedded image
attachment. |
| Alerts: |
|
Comments (none posted)
rpm: multiple vulnerabilities
| Package(s): | rpm |
CVE #(s): | CVE-2010-2197
CVE-2010-2199
|
| Created: | June 25, 2012 |
Updated: | June 27, 2012 |
| Description: |
From the CVE entries:
rpmbuild in RPM 4.8.0 and earlier does not properly parse the syntax of spec files, which allows user-assisted remote attackers to remove home directories via vectors involving a ;~ (semicolon tilde) sequence in a Name tag. (CVE-2010-2197).
lib/fsm.c in RPM 4.8.0 and earlier does not properly reset the metadata of an executable file during replacement of the file in an RPM package upgrade or deletion of the file in an RPM package removal, which might allow local users to bypass intended access restrictions by creating a hard link to a vulnerable file that has a POSIX ACL, a related issue to CVE-2010-2059. (CVE-2010-2199). |
| Alerts: |
|
Comments (none posted)
tomcat: multiple vulnerabilities
| Package(s): | tomcat |
CVE #(s): | CVE-2010-4312
CVE-2011-1088
CVE-2011-1183
CVE-2011-1419
CVE-2011-1475
CVE-2011-1582
CVE-2011-2481
|
| Created: | June 25, 2012 |
Updated: | June 27, 2012 |
| Description: |
From the CVE entries:
The default configuration of Apache Tomcat 6.x does not include the HTTPOnly flag in a Set-Cookie header, which makes it easier for remote attackers to hijack a session via script access to a cookie. (CVE-2010-4312)
Apache Tomcat 7.x before 7.0.10 does not follow ServletSecurity annotations, which allows remote attackers to bypass intended access restrictions via HTTP requests to a web application. (CVE-2011-1088)
Apache Tomcat 7.0.11, when web.xml has no login configuration, does not follow security constraints, which allows remote attackers to bypass intended access restrictions via HTTP requests to a meta-data complete web application. NOTE: this vulnerability exists because of an incorrect fix for CVE-2011-1088 and CVE-2011-1419. (CVE-2011-1183)
Apache Tomcat 7.x before 7.0.11, when web.xml has no security constraints, does not follow ServletSecurity annotations, which allows remote attackers to bypass intended access restrictions via HTTP requests to a web application. NOTE: this vulnerability exists because of an incomplete fix for CVE-2011-1088. (CVE-2011-1419)
The HTTP BIO connector in Apache Tomcat 7.0.x before 7.0.12 does not properly handle HTTP pipelining, which allows remote attackers to read responses intended for other clients in opportunistic circumstances by examining the application data in HTTP packets, related to "a mix-up of responses for requests from different users." (CVE-2011-1475)
Apache Tomcat 7.0.12 and 7.0.13 processes the first request to a servlet without following security constraints that have been configured through annotations, which allows remote attackers to bypass intended access restrictions via HTTP requests. NOTE: this vulnerability exists because of an incomplete fix for CVE-2011-1088, CVE-2011-1183, and CVE-2011-1419. (CVE-2011-1582)
Apache Tomcat 7.0.x before 7.0.17 permits web applications to replace an XML parser used for other web applications, which allows local users to read or modify the (1) web.xml, (2) context.xml, or (3) tld files of arbitrary web applications via a crafted application that is loaded earlier than the target application. NOTE: this vulnerability exists because of a CVE-2009-0783 regression. (CVE-2011-2481) |
| Alerts: |
|
Comments (none posted)
Page editor: Jake Edge
Kernel development
Brief items
The current development kernel is 3.5-rc4,
released on June 24. Linus says: "So while
we still have 200+ commits in this -rc, they really are all pretty tiny and
insignificant. Sure, if the particular issue they fixed hit you (or you are
the developer of those life-changing lines ;), you may disagree with the
"insignificant" part, but to me, this is just how I like the -rc's at this
point."
Stable updates: the 3.0.36 and 3.4.4 stable kernels were released on
June 22.
Comments (none posted)
MUWUHAhaha, watch as I destroy your attempts to reduce line count
in your diffstat.
—
Mel Gorman
Kernel developers tend to look at code from the point of view "does
it work as designed", "is it clean", "is it efficient", "do I
understand it", etc. We often forget to step back and really
consider whether or not it should be merged at all.
—
Andrew Morton
Very few people add printk()s as "inform the system logging daemon
about an event". The prevailing mindset is that perfect code does
not need any logging, so what is left are over 50,000 call sites of
bragging and debugging code, occasionally massaged to be somewhat
user friendly.
—
Ingo Molnar
Comments (none posted)
The Linux Foundation has a couple of new installments in its "30 Linux
kernel developers in 30 weeks" series: Sarah Sharp ("Find a
medium-sized project, in a part of the Linux
kernel community that has a responsive mailing list. Don't waste your time
on a bunch of spelling fix patches.") and Thomas Gleixner ("Quite a
few people consider me to be one of the Grumpy
Old Men. That's related to my age and the age-related unwillingness to cope
with crap.").
Comments (1 posted)
Jim Gettys has posted a lengthy article collecting a lot of thoughts on
what's wrong with the Internet and how the problems can be addressed.
"'Fairness' between
applications is also essential. We should reduce/eliminate the current
perverse incentives for applications to abuse the network, as HTTP does
today. We’ve had an arms race conspiracy for the last decade between web
browsers and web sites to minimize latency that is destructive to other
traffic we may care about (such as telephony, teleconferencing and
gaming). Sometimes this is best addressed by fixing protocols to be both
more efficient and more friendly to the network, as HTTP/1.1 pipelining and
now SPDY are intended to do. But the 'web site sharding' problem is
impossible for clients to avoid."
Comments (8 posted)
Btrfs hacker Josef Bacik has let it be known that he will be leaving Red
Hat and joining the growing crowd of kernel developers at Fusion-io.
Full Story (comments: none)
Patches going into the mainline kernel contain a number of tags; a
"Signed-off-by:" from the author is mandatory, but most patches include
tags like "Acked-by:" or "Reported-by:" as well. The officially recognized
tags are documented in the SubmittingPatches file, but some developers have a
certain habit of inventing their own as well. Andrew Morton, while
grumpily trying to discourage such usage, did a bit of digging for
unofficial tags. The result surprised him, leading him to ask
"Geeze, guys. Who knew there were so many Kernel Komedians?"
For example, he found a number of variants on "Acked-by:", including:
Cautiously-acked-by:
Delightedly-acked-by:
Embarrassingly-Acked-by:
Emphatically-Acked-by:
Grudgingly-acked-by:
Hella-acked-by:
Sort-Of-Acked-By:
And so on. One can only feel bad for the developers who felt the need to
add "Repented-by:" or "Fatfingered-by:" and wonder about the story behind
tags like "Antagonized-by:" or "Signed-off-and-morning-tea-spilled-by:".
Andrew may dislike these tags, but others seem to find them amusing and
will give them a "Whatevered-by:" at worst.
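For anyone curious how such a survey might be done, here is a minimal Python sketch that counts tags falling outside the documented set. The `OFFICIAL_TAGS` set is an assumed subset of what SubmittingPatches lists, and the sample commit messages and addresses are made up for illustration; this is not Andrew's actual method.

```python
import re
from collections import Counter

# Assumed subset of the tags documented in SubmittingPatches; adjust as needed.
OFFICIAL_TAGS = {"signed-off-by", "acked-by", "cc", "reported-by",
                 "tested-by", "reviewed-by"}

# A tag line looks like "Some-words-by: Name <email>" at the start of a line.
TAG_RE = re.compile(r"^\s*([A-Za-z][A-Za-z-]*-by|Cc):",
                    re.MULTILINE | re.IGNORECASE)

def unofficial_tags(commit_messages):
    """Count tags that are not in OFFICIAL_TAGS across commit messages."""
    counts = Counter()
    for message in commit_messages:
        for tag in TAG_RE.findall(message):
            if tag.lower() not in OFFICIAL_TAGS:
                counts[tag.lower()] += 1
    return counts

# Made-up commit messages, for illustration only.
sample = [
    "fix frob\n\nSigned-off-by: A <a@example.org>\n"
    "Hella-acked-by: B <b@example.org>\n",
    "fix nozzle\n\nSigned-off-by: C <c@example.org>\n"
    "Grudgingly-acked-by: D <d@example.org>\n"
    "Hella-acked-by: E <e@example.org>\n",
]
print(unofficial_tags(sample).most_common())
```

In practice the messages could come from the output of `git log`, split into one string per commit.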
Kernel development news
By Jake Edge
June 27, 2012
A proposal
from Cong Wang
to discuss the various mechanisms to store the kernel's "dying breath"
spawned a rather large thread on the ksummit-2012-discuss mailing list.
While things like pstore were set up specifically
to provide a means to store kernel crash information, that doesn't
necessarily make it easy for users to access and report kernel
crashes. That led to suggestions and discussion of better ways for users
to get the information out of their crashed systems—including using
QR
codes to facilitate the process.
Most regular users do not have a serial console set up to record crash
information on a separate machine. So the kernel backtrace that appears
after a crash is just written to the console, which means that much of it
will have scrolled off the screen. Even the data that is there is hard to
extract, with some folks trying to type the information in, which is
tedious, not to mention error-prone. A QR code that encoded the relevant
data could certainly help there.
Konrad Rzeszutek Wilk was the first to broach
the QR code idea, though he said it did not originate with him. It
turns out that H. Peter Anvin and Dirk Hohndel have been "messing
with" the idea, but Will Deacon and Marc Zyngier actually showed
something along those lines at the recent Linaro Connect in Hong Kong.
Deacon was hesitant
to call it a prototype, but said that there was some work done on
encoding a kernel crash backtrace as a QR code. There were two
problems with their approach:
- Even without any error correction, the QR code started to get pretty
large (and unreadable) after more than a few lines of backtrace. This
should be fairly easy to fix by encoding the data in a more sensible
manner rather than just verbatim (especially since a backtrace is
a well-structured log). Maybe you could even gzip the whole thing after
that too (then sell an android app to gunzip it :p)
- Displaying the QR code on a panic could be problematic. We tried using
the ASCII option of libqrencode but we couldn't find any phone that
would read the result. So we need a way to get to the framebuffer once
we've sawn our head off (maybe this is easier with x86 and VGA modes?).
One of the original motivations for kernel
modesetting (KMS) was to get readable oops information to the screen.
Using KMS to display a fairly simple QR code graphic instead should be
workable, rather than creating an ASCII version as Deacon describes.
Matthew Garrett noted
that it should be fairly straightforward at least for hardware that has KMS
support:
KMS already has atomic modeswitch support for showing panics. We'd just
need to ensure that there's an unaccelerated path for dumping contents
directly to the framebuffer. If you don't have KMS then you don't get to
play with modern useful functionality.
There is some disagreement about where the decoding of any QR code should
take place. Garrett believes that existing QR
apps in phones should be used, while others are not convinced they can be
coerced into being flexible enough to deal with the large QR codes that might
result from a kernel backtrace. Garrett has also done some work on the problem
and described
his approach:
Basic design was as follows: Take the backtrace, compress it, encode in
an alphanumeric QR code including an http:// prefix, submit to
http://kbu.gs/blah automatically when user takes a picture
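As a rough illustration of that design, the following Python sketch compresses a backtrace and wraps it in text that fits QR "alphanumeric" mode, which only allows digits, upper-case letters, and a few symbols. The choice of zlib and Base32, the upper-cased URL prefix, and the sample backtrace are all assumptions made for this example; the thread does not say which compression or encoding Garrett actually used.

```python
import base64
import zlib

# Characters allowed in QR "alphanumeric" mode.
QR_ALPHANUMERIC = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:")

# Assumed upper-case form of the URL prefix mentioned in the thread.
URL_PREFIX = "HTTP://KBU.GS/"

def backtrace_to_qr_text(backtrace):
    """Compress a backtrace and wrap it in a QR-alphanumeric-safe URL."""
    compressed = zlib.compress(backtrace.encode("utf-8"), 9)
    # Base32 uses only A-Z and 2-7; drop the '=' padding, which is not
    # part of the QR alphanumeric set.
    payload = base64.b32encode(compressed).decode("ascii").rstrip("=")
    return URL_PREFIX + payload

def qr_text_to_backtrace(text):
    """Reverse backtrace_to_qr_text(), as a receiving server might."""
    payload = text[len(URL_PREFIX):]
    payload += "=" * (-len(payload) % 8)   # restore the Base32 padding
    return zlib.decompress(base64.b32decode(payload)).decode("utf-8")

# Made-up backtrace lines, for illustration only.
sample = "\n".join(
    "[<ffffffff8100%04x>] frob_nozzle+0x%x/0x80" % (i * 16, i)
    for i in range(20)
)
text = backtrace_to_qr_text(sample)
print(len(sample), "bytes of backtrace ->", len(text), "characters of QR text")
```

The resulting string would still have to be handed to a QR encoder and drawn on the screen, which is the part that Deacon found problematic.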
Anvin would rather see some kind of web
application that accepts a photo of the QR code and decodes it on the server. For
one thing, having one (working) decoding code base is desirable: "I can tell you just how bad a lot of the QR decoder software running on
smartphones are -- because I have tried them." In addition, though,
a web application would also have the photo itself, so even if it didn't
decode because of picture quality or other reasons, those photos could be
used to improve the quality of the decoder.
But that implies that a user would need to download an app to their phone
or use some web application as suggested
by John Hawley. Garrett was not in favor of either solution, noting that
requiring an app makes it harder for users, while a web application
doesn't really make it any better:
And now your workflow is "Take picture, move to browser, upload, wait to
see if it decodes, back to camera, back to browser", etc. I know we're
expected to be bad at UX here, but come on.
Given that many users already use photos to report crashes—taking a
picture of the screen with the last part of the backtrace—the QR code
mechanism, even if a bit cumbersome, might be able to provide the full
backtrace. But, as Dave Jones suggested,
just having scrollback available on the console after a crash would make
much of the problem disappear:
"What would be a thousand times more useful would be having working scrollback
when we panic, like we had circa 2.2".
Users could then take a photo, scroll back a
ways, take another, and so on. In the thread, there was widespread
agreement that console scrollback would be desirable. But it turns out that the
advent of USB keyboards caused the loss of that feature. Doing USB
handling inside the panic code would be messy,
so bringing that
feature back is difficult. Other ideas were mentioned, like providing
enough of the USB stack to write the crash information to a USB stick as
Anvin suggests, or
to "auto-scroll" the console output after a crash without requiring
keyboard input as proposed
by Paul Gortmaker.
Making it easier for users to report crashes with useful information was
one branch of the discussion, but the folks who work on the embedded side
are looking for more developer-oriented solutions as well. Tony Luck outlined
the pstore back-ends that are currently available to store crash and other
information in various places (ERST, EFI variables, RAM) that are
accessible after a reboot. Wang, Tim
Bird, Jason Wessel, and others are interested in discussing that piece of
the puzzle.
While QR codes may seem like something of a gimmick, they can pack a fair
amount of data into a form that can be digested elsewhere. Getting useful
information out of an unresponsive, crashed Linux system is fairly
difficult at this point, so finding better ways to do so would be good.
Should the program committee decide to add this topic, a lively discussion
seems likely. If not, though, enough people are looking
into the idea that something will emerge sooner or later.
By Jonathan Corbet
June 26, 2012
The
record-oriented logging patch set was
pulled into the mainline during the 3.5 merge window. These changes are
meant to make the processing of kernel messages generated by
printk() and friends more reliable, more informative, and more
easily consumed by automatic systems. But recently it has turned out that
these changes make
printk() less useful for kernel developers.
Now there is some uncertainty as to whether this feature can be repaired in
time, or whether it will be reverted back out of the 3.5 release.
One of the core design features of the new printk() is a change
from byte-streamed output to record-oriented output. Current kernels can
easily corrupt messages on their way to the log; for example, when the log
buffer overflows, the kernel simply wraps around and partially overwrites
older messages. Messages from multiple CPUs can also get confused,
especially if one or more CPUs are using multiple printk() calls
to output a single line of text. The switch to the record-oriented
mechanism eliminates these problems; it also makes it possible to attach
useful structured information to messages. As a whole, it looks like a
solid improvement to the kernel logging subsystem.
There is just one little problem, though: when the kernel outputs a partial
message (by passing a string to printk() that does not end with a
newline), the logging system will buffer the text until the rest of the
message arrives. The good news is that this buffering causes the full line
to be output together once it's complete—if things go well. The situation
when things do not go well was best summarized by Andrew Morton:
If a driver does
printk("testing the frobnozzle ...");
do_test();
printk(" OK\n");
and do_test() hangs up, we really really want the user to know that
there was a frobnozzle testing problem. Please tell me this isn't
broken.
Not only is this behavior now broken, but it has also burned at least one
developer who ended up spending a lot of time trying to figure out why the
kernel was hanging. Kernel developers depend heavily on printk(),
so this change has caused a fair amount of concern.
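The failure mode is easy to reproduce in a toy model. The Python sketch below is not kernel code; the `LineBufferedLog` class and the exception standing in for a hang are invented for illustration. It shows how text without a trailing newline never reaches the console if the code stops before the line is completed.

```python
class LineBufferedLog:
    """Toy model of a log that holds partial lines until a newline arrives."""

    def __init__(self):
        self.pending = ""    # text still waiting for its newline
        self.console = []    # complete lines the user can actually see

    def printk(self, text):
        self.pending += text
        while "\n" in self.pending:
            line, self.pending = self.pending.split("\n", 1)
            self.console.append(line)

def do_test():
    # Stands in for a hang: control never returns normally to the caller.
    raise RuntimeError("simulated hang")

log = LineBufferedLog()
log.printk("testing the frobnozzle ...")
try:
    do_test()
    log.printk(" OK\n")
except RuntimeError:
    pass

print("visible on console:", log.console)
print("stuck in buffer:", repr(log.pending))
```

The message that would have pointed at the frobnozzle stays in the buffer, which is exactly the information a developer needs when chasing a hang.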
Bugs happen, of course; the important thing is to fix them. A number of
possible fixes have been discussed on the list, including:
- Leave printk() as it is, and change specific callers to
output only full lines. Kay Sievers, the author of the
printk() changes, suggested
that approach, saying "We really should not optimize for
cosmetics (full lines work reliably, they are not buffered) of
self-tests, for the price of the reliability and integrity of all
other users."
- Adding a printk_flush() function to be called in places where
it is important to see partial lines even if things go wrong before
the newline character is printed. The problem with this approach is
that, like printing full lines only, it requires changing every place
in the code where the problem might hit. Experience says that many of
those places can only be found the hard way.
- Add a global knob by which buffering can be turned on or off; this
knob might be set by either user space or the kernel. This idea was
not particularly popular; it seems unlikely that the knob will be set
for unbuffered output when it really matters.
- Simply revert the printk() changes for 3.5 and try again for
3.6 or later. Ingo Molnar posted a
patch to this effect, seemingly as a way of pressuring Kay
to take the problem more seriously.
As of this writing, most of the discussion centers around a patch from Steven Rostedt that simply
removes the buffering from printk(). For the most part, the
advantages of the new code remain. But it is now possible that a single
line of output created with multiple printk() calls may be split
into multiple lines, with messages from other CPUs mixed in between. It
seems to many to be a reasonable compromise fix.
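The trade-off can be pictured with another toy model in Python. The `render()` function and its line-breaking rule are assumptions made for illustration, not a description of how Steven's patch is implemented: each fragment is emitted at once, and a partial line is cut short when another CPU starts printing.

```python
def render(fragments):
    """Emit (cpu, text) fragments immediately, breaking the line whenever
    a different CPU starts printing in the middle of an unfinished line."""
    out = []
    open_cpu = None              # CPU that owns the unfinished line, if any
    for cpu, text in fragments:
        if open_cpu is not None and cpu != open_cpu:
            out.append("\n")     # cut the other CPU's partial line short
        out.append(text)
        open_cpu = None if text.endswith("\n") else cpu
    return "".join(out)

# Made-up messages from two CPUs, in arrival order.
fragments = [
    (0, "testing the frobnozzle ..."),
    (1, "eth0: link up\n"),
    (0, " OK\n"),
]
print(render(fragments), end="")
```

Nothing is lost if CPU 0 hangs after its first call, but the " OK" now lands on a line of its own, after the unrelated message from CPU 1.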
Except that Kay still doesn't like the splitting
of continuation lines. Andrew Morton is also concerned about where the printk()
code is going, saying "The core printk code is starting to make one
think of things like 'cleansing with fire'." Steven, meanwhile, is
reconsidering the whole thing, saying that,
perhaps, printk() is not the right tool for structured logging and
other approaches should be considered. And Greg Kroah-Hartman has suggested that it might be better just to fix
the call sites rather than further complicating the printk() code.
Linus, however, has argued strongly for the
merging of Steven's patch. His view is that buffering at the logging level
is fine, but text emitted with printk() has to get to the console
immediately. So chances are that some version of Steven's fix will be
applied for the 3.5 release. But it has become clear, again, that
adding structured logging to the kernel while not making life harder for
kernel developers is a difficult problem.
By Jonathan Corbet
June 27, 2012
It has often been said that memory management patches can take a long time
to be accepted into the mainline kernel. Because memory management
performance regressions can take years to be discovered, developers
in this area have become highly conservative; making memory management
changes is not a recommended endeavor for those lacking patience. But
there may be an area where progress can be even more glacial, for different
reasons. Security-oriented changes are subject to arbitrary delays because
tighter security can break programs and irritate users.
Consider the classic symbolic link vulnerability, wherein an attacker fools
a privileged program into writing to a file behind an attacker-controlled
symbolic link. Such vulnerabilities can be exploited to overwrite files
that the attacker would not otherwise have access to. One does not have to
dig far into the LWN vulnerability list to
see that the identification and patching of symbolic link vulnerabilities
is an ongoing process. One might think that, if somebody could come up
with a way to eliminate such vulnerabilities altogether, it would be
adopted in a hurry.
As it happens, Kees Cook has a way to deal
with this class of vulnerabilities. It is based on the observations that
symbolic link vulnerabilities almost always involve links placed in
/tmp, and that /tmp has the "sticky" bit set in any
contemporary distribution. Given that:
The solution is to permit symlinks to only be followed when outside
a sticky world-writable directory, or when the uid of the symlink
and follower match, or when the directory owner matches the
symlink's owner.
In short, this change would make it so that nobody could create symbolic
links in /tmp and expect a privileged program to follow them.
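The rule quoted above can be written down as a small predicate. This Python sketch only models the decision; the function name, its parameters, and the user IDs are invented for illustration, and the real check lives in the kernel's path-lookup code.

```python
import stat

def may_follow_symlink(dir_mode, dir_uid, link_uid, follower_uid):
    """Model of the proposed rule for following a symbolic link.

    dir_mode and dir_uid describe the directory holding the link,
    link_uid is the link's owner, follower_uid is the process's uid.
    """
    sticky = bool(dir_mode & stat.S_ISVTX)
    world_writable = bool(dir_mode & stat.S_IWOTH)
    if not (sticky and world_writable):
        return True                  # outside a sticky world-writable directory
    if link_uid == follower_uid:
        return True                  # following one's own link
    if dir_uid == link_uid:
        return True                  # directory owner created the link
    return False

TMP_MODE = stat.S_IFDIR | 0o1777     # typical mode for /tmp
ROOT, ALICE, MALLORY = 0, 1000, 1001 # hypothetical user IDs

# A root-owned process is refused when it meets Mallory's link in /tmp.
print(may_follow_symlink(TMP_MODE, ROOT, MALLORY, ROOT))
# Alice can still follow a link she created there herself.
print(may_follow_symlink(TMP_MODE, ROOT, ALICE, ALICE))
```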
Lest one think that Kees is taking credit for this concept, he posted a bit
of history for this idea, starting with a 1996 Bugtraq
message from Zygo Blaxell and a
kernel patch by Andrew Tridgell from the same year. This idea, in
other words, has been floating around for at least 16 years, but an
implementation
has never found its way into the mainline kernel. Memory management
changes are amazingly fast in comparison.
The reason for the resistance, of course, is that this is a change in
filesystem semantics. There are concerns that it would break POSIX
compliance, though Kees claims that POSIX is silent on this particular
behavior. Also of concern is the possibility of breaking existing
applications. Kees responds that any broken applications would be easily
noticed (while those suffering from symbolic link vulnerabilities are not),
and that no applications relying on existing behavior have ever been
found. There have also been disagreements over how the patch should be
implemented, but those have seemingly mostly been resolved.
So Kees thinks that his current patch set
(a variant of one we have seen
before)
should be considered for merging, finally. The patches implement the
symbolic link restrictions, but also add a new rule for hard links: a hard
link to a file can only be created if the user owns the file or has write
access to it. Once again, this change eliminates a class of attacks, but
at a small cost: older versions of the "at" daemon break unless a small
patch is applied. No other problems have been found, Kees says, after 1.5
years of experience with this patch in the Ubuntu kernel.
Whether that is enough evidence to get the changes merged this time around
remains to be seen. It has only been 16 years, after all, and one would
not want to be too hasty about such a thing.
Meanwhile, Kees has put together a separate security-oriented patch that
has run into some concerns of its own. On Linux systems, there is a sysctl
knob (suid_dumpable) that controls whether a crashing setuid
process generates a core dump or not. Setting it to a non-zero value
allows core dumps to happen; setting it to two applies certain restrictions
that are intended to make it safe. But, Kees says, that's not the case; it
allows a user to create a file called core in almost any
directory, containing arbitrary text (environment strings, for example).
This capability is not necessarily as harmless as one might think; as the 2006 cron vulnerability shows, some
programs will happily pick out the strings they understand in a file full
of junk,
ignoring the rest. Thus, he claims, allowing users to create files in
arbitrary locations is asking for trouble.
His response has been through a number of iterations:
- Version 1 disallowed storing
core dumps from privileged executables into a file. If the
core_pattern knob is set to a pipe, instead, core dumps
happen as before. This was seen as an incompatible ABI change,
though, and one that would cause surprising results.
- Version 2 added a new setting (3) that
would only allow setuid core dumps to a pipe. The previous "safe"
setting (2) was deprecated; attempting to set it would fail with an
EINVAL error. This version ran into trouble as a result of how it
interacted with the sysctl mechanism.
- Version 3 fixed the sysctl
difficulties but was opposed by Andrew Morton, who feared that the
deprecation of the previous mode would break current systems in
surprising ways. He suggested keeping suid_dumpable=2 as a
working mode with a warning.
- Version 4 went back to something
closer to version 1, but with some loud warnings emitted. But
then Eric Biederman asked whether disallowing relative paths would be
a sufficient fix.
- Thus, version 5 (the current version,
as of this writing), just disallows the writing of setuid core dumps
to relative paths. Should core_pattern be set to a relative
path ("core", for example), a warning will be logged instead.
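The policy in the fifth version can be summarized in a few lines of Python. This is a sketch of the decision only; the function name and return values are invented, and it assumes that the relative-path restriction applies in the "safe" mode (suid_dumpable=2), which the description above implies but does not spell out.

```python
def setuid_core_dump_target(core_pattern, suid_dumpable):
    """Say where a crashing setuid process's core dump would go.

    Returns 'none', 'pipe', 'file', or 'refused'.
    """
    if suid_dumpable == 0:
        return "none"                # setuid processes never dump core
    if core_pattern.startswith("|"):
        return "pipe"                # handed to a user-space helper
    if suid_dumpable == 2 and not core_pattern.startswith("/"):
        return "refused"             # relative path: log a warning instead
    return "file"

# The handler and directory paths below are made up for illustration.
for pattern in ("core", "/var/crash/core.%p", "|/usr/libexec/handler %p"):
    print(pattern, "->", setuid_core_dump_target(pattern, 2))
```

On a running system the two values can be read from /proc/sys/fs/suid_dumpable and /proc/sys/kernel/core_pattern.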
Thus far, there has not been much in the way of complaints about the fifth
iteration of the patch. So, possibly, it will not be necessary to wait for
years until this particular bit of security tightening gets into the
mainline kernel. Of course, unlike the system's link behavior, the core
dump behavior can be changed now by concerned system administrators—no need
to wait at all.
Page editor: Jonathan Corbet
Distributions
By Nathan Willis
June 27, 2012
Fedora's Engineering Steering Committee (FESCo) recently voted
to approve an additional feature for Fedora 18: a new alternative
package manager called DNF. The primary author
intends for DNF to
replace Fedora's current package manager, Yum, but there are questions
as to when it will be ready and why starting over is
preferable to updating the existing code.
A handful of news sites reported DNF as "the" new package manager for
Fedora, which might lead one to conclude that it will be the default
option. However, DNF is only approved for inclusion in Fedora 18, as
it does not have feature-parity with Yum and is still in heavy
development. Yum will remain the system package manager, and may for
several more releases.
DNF is the creation of Red Hat's Aleš Kozumplík, and is a fork of Yum
3.4 that uses his own hawkey library to
resolve package dependencies. Like Yum, DNF sits on top of the
lower-level rpm package installer; its purpose is to simplify package
installation, update, and removal by automatically calculating
dependencies and handling the management of dependent packages. The moniker "DNF" officially stands for
nothing, and was chosen by Kozumplík solely as a placeholder, although
the choice prompted humorous responses that it stood for
un-complimentary phrases like "Did Not Finish" or the oft-maligned
Duke Nukem Forever. Hawkey itself is a wrapper around the
libsolv dependency
resolver from openSUSE.
Some of the goals described on the wiki page
include providing API access to languages in addition to Python (which
is currently Yum's only option), a "strict" API, and better
performance.
Migrating the package manager to libsolv is the primary goal, however,
with an eye toward eventually using libsolv as the dependency solver
for RPM itself, as well as making it available to other programs that
handle package management, such as the installer. For users, a
libsolv-powered DNF's principal advantage over Yum is expected to be its
speed. At FUDCon in February, for example, Yum was described as being
four times slower than untar, which no one seems to regard as
acceptable.
Libsolv gains its advantage by using a modern
satisfiability-solving algorithm (or "SAT solver"), rather than the
ad-hoc dependency checking methods employed by other major Linux package
managers. Libsolv's documentation
observes
that most package managers are optimized to find
updates to currently installed packages, with extensions tacked on to
handle other situations, resulting in slow and unpredictable code.
Libsolv uses a reimplementation of the open source Minisat solver, and has been adopted by openSUSE's Zypper package manager.
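To see what it means to treat dependency resolution as a satisfiability problem, consider this small Python sketch. Package names and rules are invented, and the brute-force search is a stand-in: Minisat-style solvers use much smarter algorithms, and libsolv's real rule set is far richer. The point is only that requirements and conflicts become boolean clauses that must all hold.

```python
from itertools import product

def solve(packages, clauses):
    """Return the smallest install set satisfying every clause, or None.

    Each clause is a list of (package, wanted) pairs; a clause holds if at
    least one package's installed state equals its 'wanted' value.
    """
    best = None
    for values in product([False, True], repeat=len(packages)):
        chosen = dict(zip(packages, values))
        if all(any(chosen[name] == wanted for name, wanted in clause)
               for clause in clauses):
            installed = {name for name in packages if chosen[name]}
            if best is None or len(installed) < len(best):
                best = installed
    return best

# Hypothetical packages and rules.
packages = ["app", "libfoo1", "libfoo2", "tool"]
clauses = [
    [("app", True)],                                          # user wants app
    [("app", False), ("libfoo1", True), ("libfoo2", True)],   # app needs a libfoo
    [("libfoo1", False), ("libfoo2", False)],                 # libfoo versions conflict
    [("tool", False), ("libfoo2", True)],                     # tool needs libfoo2
    [("tool", True)],                                         # tool must stay installed
]
print(sorted(solve(packages, clauses)))
```

Enumerating every combination takes exponential time, which is why real resolvers rely on a proper SAT algorithm rather than a loop like this one.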
Integration with Fedora
DNF uses the more modern libsolv for improved dependency solving,
but it does not implement all of Yum's other features. The missing
functionality
generated some feedback from administrators who were upset that some
of the features they depend on had evidently been marked as
deprecated. For example, Yum has a "history" function that allows the
user to investigate logged installation and update actions, and even
roll them back, but the feature was listed on DNF's features
considered for dropping page. When fans of the feature complained
about its removal, Kozumplík agreed
to implement it in a future version.
DNF also lacks PackageKit integration, which PackageKit
maintainer Richard Hughes cited as a major problem during the FESCo
debate
about approving the feature. Others in the discussion were unclear as
to the justification and plans for DNF to replace Yum. Bill
Nottingham asked why a fork was necessary, and Matthew Garrett asked:
Is this intended to prototype functionality that'll head back
to yum? Or is it going to sit there as an alternative and some time
later we'll have a contentious argument about defaults?
FESCo eventually voted to approve the feature on the grounds that (as
Miloslav Trmač put it) it was a "
'we added a package, please
come and play with it' kind of feature, not 'this is a change for
everybody' kind of feature."
Distinct from the issue of advertising DNF's readiness too soon is the
concern that Kozumplík appeared to have launched the project without
first attempting to implement his changes within the main Yum trunk or
working with the Yum team. Yum developer Tim Lauridsen asked on the
desktop-devel mailing list:
Would it not be better to work with yum upstream to make the current yum
depsolver more modular so you could plugin another libsolv based depsolver,
instead of making a fork of yum and starts trashing the current API. There
is a lot more to yum, that just solving dependencies. And making a fork
there is not fully compatible will put a lot of work on your shoulders :)
without the benefit on the work done by yum upstream :) like parallel
download etc.
Kozumplík replied that a fork was required to clean up the Yum API, so that legacy interfaces could be deprecated without simultaneously having to preserve backward compatibility. He also said
that he is collaborating with the Yum team, which Seth Vidal called
a contradiction, considering Lauridsen's question:
Tim _IS_ a yum developer. That he felt surprised by this means you've
only been communicating with yum devs on the internal red hat packaging
team not in the community channels related to yum specifically. It would
make good sense if you want to figure out what apis are important/matter
to discuss this on the yum-devel mailing list at the very least.
The future of solving
Of course, maintaining the delicate balance between company-paid developers and community developers is a challenge on which most corporate
distributions spend a lot of time. At least the two camps appear to be talking now. Doubtless there is
interest in the improvements offered by a libsolv-based backend from
the Yum project, particularly as other pieces of the Fedora
infrastructure (including rpm itself and the installer) move toward it. Libsolv is gaining supporters among other projects as well (Hughes, for instance, is also the
author of the alternative package manager zif, and
indicated he was planning to migrate it to libsolv in the future). There does not appear to be a plan to migrate apt or apt-rpm to libsolv, but there has been interest in writing a new SAT solver to replace apt's current dependency resolver.
But agreement on the goal does not make the process any less arduous;
there are inherent difficulties in replacing a piece of software as
close to the core as Yum. Dependency resolving is complicated by the
number of options an RPM package can declare, such as specific
versions of some libraries or "recommended" packages that a user might
select (thus adding still more dependencies to the puzzle). There are
also scenarios where a user wants to swap out one package for another,
and would prefer to leave the common dependencies in-place in order to
cut down on time required. Despite the fact that openSUSE has been
using libsolv in its own package manager Zypper, hawkey and DNF
will require a lot of testing to get all of the corner cases correct.
Until then, Fedora users can count on Yum staying right where it is.
Brief items
Kitten discussions probably don’t belong here, unless the kitten in
question is surprisingly adept on the command line.
--
Ubuntu IRC Council Blog
This is a development list, after all, not a ranting list.
--
Bill Nottingham
The first release candidate for the CyanogenMod 9.0 Android distribution is
now
available;
this will be the first CyanogenMod release based on Android 4.0 ("Ice
Cream Sandwich").
"It wasn’t quick or easy, but we are extremely proud of this release
and what it represents for us as a group. The jump from 2.3.7 to 4.0.4 in
many ways was a fresh start for this project, and as much as the code
changed, the structure and organization of CM as a whole changed as
well. It meant a lot of hard work, and late nights, but also a ton of
fun. We are in this for the challenge, and the reward is always the
satisfaction received when we release it to the masses as a ‘stable’
product. This RC1 brings us a step forward toward that payoff."
Red Hat Enterprise Linux 6.3 has
been
released. The release notes
are
here. From a
review
at The H: "
RHEL 6.3 officially supports allocating up to 160 processor cores and 2TB of working memory to guest systems; previously, the distribution only supported 64 cores and 512GB of RAM. Spice, which is involved in virtualising desktop PCs, can now pass through USB 2.0 devices from the system that is displaying the desktop to a local or remote virtualised guest operating system. Another new feature is support for SR-IOV-capable (Single Root I/O Virtualisation) networking hardware which can present a single physical card as multiple, separate virtual network cards; this support allows these virtual network cards to be allocated to guest systems."
James Bottomley has announced the availability of a version of the
Tianocore UEFI implementation
built into a KVM virtual machine; the result is a virtual system
implementing the UEFI secure boot mechanism. "
I'm releasing this now
because interest in UEFI Secure Boot is rising, particularly amongst the
Linux Distributions which don't have access to UEFI secure boot hardware,
so having a virtual platform should allow them to experiment with coming up
with their own solutions." It should be a useful tool for anybody
wanting to make a system that works within the UEFI secure boot environment.
The Open webOS project has
announced
an interim source release called the "Community Edition". "
The
Community Edition is focused on supporting the TouchPad. By contrast, the
Open webOS 1.0 release planned for September includes modernized
technologies to better enable the community to port webOS to the hardware
of their choice, and to integrate open source technologies in areas such as
BlueZ bluetooth and GStreamer."
Distribution News
Debian GNU/Linux
The Debian release team has announced the Wheezy freeze on June 30.
Packages will no longer migrate automatically from unstable (sid) to
testing (Wheezy) at that time.
Fedora
John Rose (aka inode0) has been appointed to fill the final slot on the
Fedora board.
Newsletters and articles of interest
Earlier this month LWN
covered the
announcement of an initial x32 release candidate of Gentoo. The x32
ABI enables the running of processes in 64-bit mode while using 32-bit
pointers. Gentoo developer Diego Elio "Flameeyes" Pettenò
isn't
convinced that x32 is the way to go, and debunks some common
misconceptions about the x32 ABI. "
The new x32 ABI has proven to be faster. Not really; what we have right now are a few benchmarks, published by those who actually created the ABI, Of course you’d expect that those who spent time to set it up found it interesting and actually faster, but I honestly have doubts about the results, for reasons that will be clearer by reading the next few entries."
Ars technica
takes
a look at Android 4.1 (Jelly Bean). "
For developers, the Jelly
Bean SDK will include a new profiling tool, systrace, that provides a clear
visualization of their applications' use of the CPU, GPU, and other system
components, so that bottlenecks can be more readily identified and
resolved." More information can be found in the
Jelly
Bean platform highlights.
Page editor: Rebecca Sobol
Development
By Jake Edge
June 27, 2012
Parsing HTML is sometimes a surprisingly complicated task. Even seemingly
trivial constructs tend to have a bunch of "fiddly bits" that
one needs to account for. For most of those kinds of tasks, then, one generally
turns to a full-blown HTML parser. While Python includes one in its
standard library, it can be somewhat painful to use as well. Some recent
experiments with Beautiful Soup, in
particular version 4 released earlier this year, have shown a
parser that is well-designed and easy to use.
Beautiful Soup is available
as a tarball that can be installed in the usual Python way (using
setup.py). It can also be installed
using pip or easy_install from the PyPI repository. It
is also packaged as python-bs4 for Debian and Ubuntu;
packages
for other
distributions will presumably be coming along as well. It
supports both Python 2.7 and Python 3, with few external dependencies.
Beautiful Soup uses the Python standard library parser, but can also use
optional faster parsers (lxml, html5lib) if they are installed.
Getting started with Beautiful Soup is pretty straightforward:
from bs4 import BeautifulSoup
soup = BeautifulSoup(string_or_filehandle)
From that point on, the
soup object is used to query and manipulate
the HTML contained in the argument. The input data is converted
to Unicode, parsed, and cleaned up so that it is valid. At that
point, simply outputting the object (e.g.
print soup) will produce
the cleaned-up HTML.
soup.prettify() will do even more, indenting
and matching up tags to create pretty output.
But, there's lots more available than simply cleaner HTML. It can also take
HTML fragments, which will be transformed into full HTML documents once
parsed. Beautiful Soup
breaks the HTML down into objects which correspond to the tags contained in
the input. For example:
print soup.head
print soup.body
will print the
<head> and
<body> sections.
For an HTML fragment,
soup.body will contain the parsed contents
of that fragment, and each
piece in the HTML can be accessed via the
.children iterator or
the
.contents list. So:
html_frag = '<p>foo bar</p><p>baz<p>yet another graf</p>'
soup = BeautifulSoup(html_frag)
print soup.body.contents
for c in soup.body.children:
    print c
# output
[<p>foo bar</p>, <p>baz</p>, <p>yet another graf</p>]
<p>foo bar</p>
<p>baz</p>
<p>yet another graf</p>
Note that the unclosed paragraph tag was fixed.
Each tag in the HTML gets turned into a Tag object by Beautiful Soup. Tag
objects
have attributes like .name, .children, .parent,
and so on, which can be used to distinguish various tags and to navigate
the tree. In addition, the HTML attributes of a particular tag are
available by accessing the tag as a dictionary:
soup = BeautifulSoup('<b class="boldest">foo</b> bar <b class="bolder">baz</b>')
print soup.b['class']
# output
['boldest']
The first tag of a given type can be referred to using the dot notation, so
soup.b is the first "b" tag in the object. One can access the
other tags by navigating in the HTML tree or by searching.
But wait, there's more:
soup.b['class'] = 'waylessbold'
print soup.body
# output
<body><b class="waylessbold">foo</b> bar <b class="bolder">baz</b></body>
So, changing an attribute is reflected in the output of the soup object. The
same dictionary-style access can be used to add new attributes
(tag['newattr']='foo') or to remove existing attributes (del tag['class']).
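A short sketch of adding and removing attributes follows. The input string
and the attribute name "newattr" are only placeholders, and 'html.parser'
is passed explicitly so that nothing beyond the standard library parser is
assumed.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup('<b class="boldest">foo</b>', 'html.parser')
tag = soup.b

tag['newattr'] = 'foo'   # add an attribute the tag did not have
del tag['class']         # remove an existing attribute

result = str(soup)
print(result)
```

Asking for an attribute that is not present with tag['class'] raises a
KeyError, as it would for a dictionary; tag.get('class') returns None
instead.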
For many transformation tasks, though, one might rather not step through
the whole HTML tree, and would, instead, want to search for tags of
interest. Beautiful Soup has some powerful capabilities in that area too.
Using the above example:
for b in soup.find_all('b'):
    b['class'] = 'justbold'
would change the class of all "b" tags in the object to "justbold"
(adding a "class" attribute to any that don't have it).
More complicated things can be done using regular expressions as well:
import re
for a in soup.find_all('a', href=re.compile(r'^/')):
    a['href'] = 'http://lwn.net%s' % a['href']
would turn site-relative links (those starting with "/") into full URLs,
for example. Using keyword arguments (like href above) will search the HTML
attributes of the tag based on a string or regular expression. There are
also ways to search based on the presence of an attribute (href=True), to
limit the number of results returned, or to change the default recursive
searching so that only direct children are searched.
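The search variations mentioned above might look like the following sketch.
The markup is invented for the example; the keyword arguments used (an
attribute name such as href, plus limit and recursive) are the ones
find_all() accepts, and 'html.parser' is chosen only to avoid assuming an
optional parser.

```python
import re
from bs4 import BeautifulSoup

# Made-up markup: two links directly inside the <div>, two inside a <p>.
markup = ('<div><a href="/Articles/1/">one</a>'
          '<p><a href="http://example.com/">two</a> <a name="x">three</a></p>'
          '<a href="/Articles/2/">four</a></div>')
soup = BeautifulSoup(markup, 'html.parser')

# Only <a> tags that have an href attribute at all.
with_href = soup.find_all('a', href=True)

# Stop after the first match.
first_only = soup.find_all('a', href=True, limit=1)

# Only <a> tags that are direct children of the <div>.
direct = soup.div.find_all('a', recursive=False)

# Only <a> tags whose href starts with a slash.
relative = soup.find_all('a', href=re.compile(r'^/'))

print([a.get_text() for a in direct])
```

Each call returns a list-like result, so the usual len() and indexing work
on it.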
One can also create a dictionary with just the attributes needed (or
wanted) on a particular tag type and simply assign it to the .attrs
attribute of each such tag:
for img in soup.find_all('img'):
    idict = { 'height' : 22, 'width' : 42, 'src' : img['src'] }
    img.attrs = idict
Something like that might be useful to remove HTML attributes like align=
that aren't acceptable in some forms of HTML (e.g. EPUB).
One can also create and insert entirely new tags into the soup object (and
thus the HTML). A tag can be created with new_tag() then
inserted before or after any other tag:
itag = soup.new_tag('i')
itag.string = 'italicized'
soup.b.insert_before(itag)
print soup.body
# output
<body><i>italicized</i><b class="justbold">foo</b> bar <b class="justbold">baz</b></body>
There is one seeming oddity with Beautiful Soup in that there is another
type of object called a NavigableString.
Any string in the input (even those that are the
child of a Tag, e.g. the "foo" in <b>foo</b>) will be represented as a NavigableString.
These can be encountered while moving around in the HTML tree:
soup = BeautifulSoup('<i>italicized</i><b class="justbold">foo</b> bar <b class="justbold">baz</b>')
for t in soup.body.children:
    print t.name
# output
i
b
Traceback (most recent call last):
...
AttributeError: 'NavigableString' object has no attribute 'name'
The unadorned "bar" in the fragment above is clearly not a tag, but because
it is represented by a different kind of object, one without the "standard"
.name attribute, it is a bit of a pain to deal with. The problem occurs
mostly in experimentation and debugging, but one needs to do something like:
from bs4 import NavigableString
if isinstance(child, NavigableString):
    ...
to detect it, which is somewhat annoying. Perhaps I don't understand the
ramifications of synthesizing a Tag object to hold those strings
(or at least providing things like .name on those objects),
but at first glance it seems like it would be a better way to
handle them.
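One way to write that check into a loop is sketched below; it sorts the
children into tag names and bare strings rather than tripping over the
missing attribute. The list names are arbitrary. Because 'html.parser' is
passed explicitly here, and that parser does not wrap a fragment in <html>
and <body> tags, the loop runs over soup.children rather than
soup.body.children.

```python
from bs4 import BeautifulSoup, NavigableString

soup = BeautifulSoup('<i>italicized</i><b class="justbold">foo</b> bar '
                     '<b class="justbold">baz</b>', 'html.parser')

names = []     # names of the Tag children
strings = []   # text of the NavigableString children

for child in soup.children:
    if isinstance(child, NavigableString):
        strings.append(str(child))
    else:
        names.append(child.name)

print(names)
print(strings)
```

The same isinstance() test works anywhere a mixture of tags and strings can
turn up, such as when walking .contents.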
Overall, Beautiful Soup is rather impressive. There is quite a bit more to it
than this brief overview shows, but the documentation
is excellent and provides lots of examples. If you have some need to parse
or transform HTML in Python, Beautiful Soup 4 is surely worth a look.
Brief items
I was asked a few weeks ago, "What was the biggest surprise you
encountered rolling out Go?" I knew the answer instantly: Although
we expected C++ programmers to see Go as an alternative, instead
most Go programmers come from languages like Python and Ruby. Very
few come from C++.
—
Rob Pike
It's fair enough to say that I wouldn't be a programmer today if it weren't for an interest in game programming, and that is true of several of my friends as well. But if that is true, why then do we have so few finished and polished free software games? Answering that question actually deserves of a post of its own (and indeed, solving that riddle is a good portion of the motive behind Liberated Pixel Cup), but it's enough to say for now that we are missing opportunities of encouraging future hackers by not making free software a welcoming playground for game development.
—
Chris Webber
Enlightenment developer Bruno Dilly announced that EFL has merged in EPhysics, a wrapper for the Bullet Physics library, making it "pretty simple for an EFL programmer and we expect them to adopt EPhysics to create their next splash screen, transition effects and even more games." The library allows Evas objects to "have physical attributes such as mass, friction and restitution and shape. They may receive impulses and collide between them."
The first release candidate for the KDE 4.9 desktop environment has landed. "With API, dependency and feature freezes in place, the KDE team's focus is now on fixing bugs and further polishing new and old functionality." Highlights include support for Qt Quick in Plasma, deeper integration of the "Activities" scheme for organizing workspaces, and improved metadata sorting-and-searching within the Dolphin file manager.
Version 1.2.99.1 of the SyncEvolution PIM-synchronization framework has been released. This is the first pre-release version of the upcoming 1.3 series, and it brings several new features, including KDE/Akonadi support, ActiveSync support, and rewritten D-Bus and CalDAV components. Despite the pre-release status, upgrading is still recommended for several reasons: "for example, SyncEvolution 1.3 is required for Evolution 3.4, otherwise photos are not exported properly. Further workarounds for recent changes in Google CalDAV were added."
Newsletters and articles
Ars Technica has posted a review of the Android version of
Firefox. "One of the key
features of Firefox for Android is its support for Mozilla's
synchronization service. It works seamlessly with the desktop version of
the browser, allowing the user to access their bookmarks and other browser
data. This capability works as expected and will likely be a major draw for
existing Firefox users."
Jasper St. Pierre has posted a lengthy overview of the Linux graphics
stack. It's a good starting point for anybody who is not clear on what all
those acronyms mean. "The X
server needs to know what’s happening here, though, so it can do things
like synchronization. This synchronization between your glxgears, the
kernel, and the X server is called DRI, or more accurately, DRI2. 'DRI'
stands for 'Direct Rendering Infrastructure', but it’s sort of a strange
acronym. 'DRI' refers to both the project that glued mesa and Xorg together
(introducing DRM and a bunch of the things I talk about in this article),
as well as the DRI protocol and library. DRI 1 wasn’t really that good, so
we threw it out and replaced it with DRI 2."
Page editor: Nathan Willis
Announcements
Brief items
The results of the 2012 GNOME Foundation board of directors election have
been announced. For the next year, the board will consist of Bastien Nocera,
Emmanuele Bassi, Andreas Nilsson, Joanmarie Diggs, Tobias Mueller,
Shaun McCance, and Seif Lotfy.
Articles of interest
Kiran Jonnalagadda talks about his personal experience with releasing code
under an open source license, and includes a tip for job
hunters. "The first person to
contribute a patch got hired and became HasGeek’s first employee. HasGeek
is now six people. If you’ve never seen us advertise on our own job board,
that’s because we don’t have to. We simply hire volunteers who contribute
to our code. It works great: we don’t need a ramp up period to get them
familiar with our systems, and we don’t need to create fake scenarios for
interviews. They are demonstrating capabilities with production
code." (Thanks to Biju Chacko)
Over at Linux.com, Amanda McPherson, the Linux Foundation's VP of Marketing and Developer Programs, makes the case for the open cloud. "But during any technology shift, users can either gain or lose power, and cloud computing represents both an opportunity and a threat. An opportunity to have more computing power, more cheaply and efficiently. But a threat to give up the freedom won by the open source software revolution. In fact, cloud computing platforms can deliver much more vendor lock in than the old client/server vendor-led worlds ever did." As part of the post, she also announced the schedule for the CloudOpen conference, which will be held in conjunction with LinuxCon North America in San Diego, August 29-31.
The Linux Foundation has chosen five finalists for its 2012 "Inspired by Linux" T-shirt Design Contest. "It's now time for the community to vote on its favorite design. Check out the designs, below, then head to our voting gallery for more information on the artists and to choose your favorite.
This year we’ve taken additional steps to ensure a fair voting
process. Voters must be registered members of Linux.com and must be logged
in to vote. Only one vote per person is allowed." Voting is open
until July 3.
The Fellowship of Free Software Foundation Europe has an interview with Bjarni Runar Einarsson. "The cloud is easy and convenient… until it isn’t. And when it stops fulfilling your needs, for whatever reason, you may have no recourse except to start from scratch somewhere else. I find it mind boggling how much people have invested in things like Facebook. Thousands of photos, annotations, conversations. Some of which you can copy, but not all. And you can lose access to it in an instant if some automated software routine decides you are an “abusive user” for whatever reason and closes your account. Or even if someone just steals your password."
Education and Certification
The Linux Professional Institute (LPI) has announced a "School-to-Work"
Linux training and certification program in Spain. "This program, led
by Spanish secondary schools and LPI-Spain, will initially prepare 300
students in careers as Linux professionals and IT trainers."
The Linux Professional Institute (LPI) has announced that its Linux
Essentials exam is available "at select LPI affiliate locations and
IT events in Europe, Middle East and Africa. The single Linux Essentials'
exam leads to a "Certificate of Achievement"."
Upcoming Events
The Linux Security Summit will be co-located with LinuxCon in San Diego,
California. The summit will run August 30-31, 2012. The schedule has been
published.
linux.conf.au 2013 will take place in Canberra, Australia, from January 28 to February 2, 2013. A prospectus for potential sponsors is available. "This is a great opportunity to get in early and take advantage of the sponsorship deals being offered for the conference. The conference routinely hosts 500-700 delegates in a different city each year. Canberra last hosted the event in 2005 for 550 delegates, and this year organisers are expecting it to be even bigger."
Events: June 28, 2012 to August 27, 2012
The following event listing is taken from the
LWN.net Calendar.
| Date(s) | Event | Location |
| June 26 - June 29 | Open Source Bridge: The conference for open source citizens | Portland, Oregon, USA |
| June 26 - July 2 | GNOME & Mono Festival of Love 2012 | Boston, MA, USA |
| June 30 - July 1 | Quack And Hack 2012 | Paoli, PA, USA |
| June 30 - July 6 | Akademy (KDE conference) 2012 | Tallinn, Estonia |
| July 1 - July 7 | DebConf 2012 | Managua, Nicaragua |
| July 2 - July 8 | EuroPython 2012 | Florence, Italy |
| July 5 | London Lua user group | London, UK |
| July 6 - July 8 | 3. Braunschweiger Atari & Amiga Meeting | Braunschweig, Germany |
| July 7 - July 8 | 10th European Tcl/Tk User Meeting | Munich, Germany |
| July 7 - July 12 | Libre Software Meeting / Rencontres Mondiales du Logiciel Libre | Geneva, Switzerland |
| July 8 - July 14 | DebConf12 | Managua, Nicaragua |
| July 9 - July 11 | GNU Tools Cauldron 2012 | Prague, Czech Republic |
| July 10 - July 11 | AdaCamp Washington, DC | Washington, DC, USA |
| July 10 - July 15 | Wikimania | Washington, DC, USA |
| July 11 | PuppetCamp Geneva @RMLL/LSM | Geneva, Switzerland |
| July 11 - July 13 | Linux Symposium | Ottawa, Canada |
| July 14 - July 15 | Community Leadership Summit 2012 | Portland, OR, USA |
| July 16 - July 20 | OSCON | Portland, OR, USA |
| July 26 - July 29 | GNOME Users And Developers European Conference | A Coruña, Spain |
| August 3 - August 4 | Texas Linux Fest | San Antonio, TX, USA |
| August 8 - August 10 | 21st USENIX Security Symposium | Bellevue, WA, USA |
| August 18 - August 19 | PyCon Australia 2012 | Hobart, Tasmania |
| August 20 - August 21 | Conference for Open Source Coders, Users and Promoters | Taipei, Taiwan |
| August 20 - August 22 | YAPC::Europe 2012 in Frankfurt am Main | Frankfurt/Main, Germany |
| August 25 | Debian Day 2012 Costa Rica | San José, Costa Rica |
If your event does not appear here, please
tell us about it.
Page editor: Rebecca Sobol