Systemd, an alternative to Upstart or System V init, has made big strides since it was announced at the end of April. It has been packaged for Fedora and openSUSE, and for users of Fedora Rawhide, it gets installed as the default. There are still bugs to be shaken out, of course, and that work is proceeding, especially in the context of Rawhide. The big question is whether Fedora makes the leap to use systemd as the init system for Fedora 14.
When last we looked in on systemd, Lennart Poettering intended to have a package ready for Fedora 14, which has happened, but it was unclear what, exactly, openSUSE's plans were. Since then, Kay Sievers, who worked with Poettering on developing systemd, has created an openSUSE Factory—essentially the equivalent of Fedora's Rawhide—package along with a web page of instructions for interested users. But most of the action seems to be going on in Fedora-land.
The Rawhide package gets installed in parallel with the existing Upstart daemon, so users can switch back and forth. That's important because of some glitches in the systemd installation that require users who upgrade from earlier Rawhides to manually enable certain services and targets:
    # systemctl enable getty@.service prefdm.service getty.target \
        rc-local.service remote-fs.target graphical.target

(The last "graphical.target" entry comes from a second message and will resolve the common "Failed to load configuration for default.target" problem.)
The magic incantation to switch between the two is detailed in a fedora-devel posting announcing that systemd was made the default in Rawhide near the end of July. The kernel command line init parameter can be used to choose either systemd (init=/bin/systemd) or Upstart (init=/sbin/upstart); as Poettering points out: "note the /sbin vs. /bin!"
There are various other problems being reported on fedora-devel as well, from the network not coming up automatically to shutdown behaving strangely. The fixes are generally straightforward; for the network, it is a simple matter of doing:
    # systemctl enable NetworkManager.service
    # systemctl start NetworkManager.service

and the shutdown problem was addressed by Poettering in the systemd git repository within a few days of the report. While he was congratulated for his quick response on the shutdown problem, Poettering's responses to other systemd breakage have caused some consternation among other Fedora developers.
In particular, a dubious interpretation of the "noauto" mount option in /etc/fstab caused numerous complaints. Essentially, that option is meant to tell the system that the filesystem should never be mounted except by an explicit action of the user. But the systemd developers decided that "noauto" just meant that the boot shouldn't wait for the filesystem to be mounted; systemd would still proceed to mount it if the device was present at boot time or plugged in later.
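To make the dispute concrete, consider an fstab entry like the following (the device and mount point are purely illustrative):

```
/dev/sdb1   /backup   ext4   noauto,defaults   0 0
```

Under the documented semantics, /backup is never mounted at boot or on hotplug; it waits for an explicit "mount /backup" from the administrator. Under systemd's initial interpretation, the boot would merely not block on the mount, but the filesystem would still be mounted whenever the device appeared.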
Poettering has changed systemd so that its "noauto" behavior matches expectations—and the documented semantics—but some of his responses rubbed folks the wrong way. Worse, though, is the concern that Fedora will be making a big mistake by adopting systemd now, instead of making it optional for Fedora 14 and then the default in Fedora 15 if all is well. Matthew Miller voiced his concerns:
So, my question is serious. If we go ahead with systemd for F14, will we be hit with an onslaught of confusion, trouble, and change? That would be good for testing systemd, but *not awesome* for the distribution or for its users. Or, on the other hand, is it a matter of a few kinks which we can get solved before release?
Poettering is characteristically optimistic in response: "Quite frankly I'd draw a completely different conclusion of the current state of affairs: everything is looking pretty great." He goes on to note that any issues reported have been quickly fixed, and that the number of bugs has actually been fairly small:
So, [breathe] deeply, and let's all be jolly!
He has also started a series of blog posts to help administrators become familiar with systemd and to ease their transition. But his overall approach sometimes comes under question. At least partially because of the way he handled PulseAudio complaints, Poettering has something of a reputation for waving away bugs as features, which tends to irk some. Miller puts it this way: "The pattern I see is that each time, you respond initially with (paraphrased) 'It's not broken -- we've made it better!' before implementing the fix."
The thread went on for quite a ways, even by fedora-devel standards, but the crux of the issue seems to be that there are worries that major changes that break things are chasing Fedora users away and a switch to systemd may be just that kind of change. There is no hard evidence of that, but it is clearly a fairly widespread—though not universal—belief. To try to escape the flames and assist with moving the decision about using systemd as the default in Fedora 14 to a technical, rather than emotional, basis, Bill Nottingham started a thread about acceptance criteria for systemd: "From [reading] the thread, there are many things that I think people would like covered with systemd before they would feel comfortable with it. So, I'm going to attempt to quantify what would need to be tested and verified."
As might be guessed, the flames didn't completely subside but the focus did shift to technical considerations, with an emphasis on backward compatibility to existing Upstart—and earlier System V init—functionality. There are lots of moving parts that go into making the transition from Upstart (which is mostly run in sysvinit compatibility mode in Fedora currently) to systemd, however, and Poettering is feeling like he is getting saddled with fixing any and all problems, even those that are legitimately outside of systemd.
There is a difference between what the Fedora and systemd projects need to do, though. As pointed out by several people, Fedora needs to ensure that the system works well, regardless of where the problem lies. As Nottingham notes, there is a higher burden on the person pushing a major change, but the distribution as a whole needs to get it right:
The onus is on the introducer of a large change to make sure that change works in the system. Sure, we can fire up requests for help, we can lean on people to fix their packages, and we've got people who are willing to help.
As he surely recognizes, Poettering's best strategy is to fix any bugs found in systemd and in any other component where he has the time and enough understanding to do so. Enlisting aid for any that he, or other systemd developers, can't address seems appropriate as well. There is no one in Fedora who can wave a magic wand and force developers to work on specific tasks, so some amount of team building may be necessary. There is always a hump to get over between the development and deployment of a new feature, and the hump in this case is larger than many. That said, none of the remaining issues looks so daunting that it cannot possibly be overcome in the next few weeks.
The Fedora 14 schedule shows a "Beta Change Deadline" on September 14. On or before that date, there will undoubtedly be a "go or no go" decision made on systemd for Fedora 14. Between now and then, testing systemd, reporting bugs, and, perhaps more importantly, pitching in to fix bugs are all things that systemd fans can do to push it forward. Otherwise, we may have to wait until Fedora 15 to really see what systemd can do.

Consider this ZDNet article on "Android's dirty little secret." According to that article, the openness of Android has led to an increase in the control held by handset manufacturers and wireless carriers, and to the fragmentation of the platform. The Open Handset Alliance is in a "shambles," and Android phones have undone all the gains won by that great standard bearer for openness and freedom - the iPhone. One might easily conclude that Android is just business as usual for the mobile telephony industry, but there are a few things worth contemplating here.
The authors seem surprised by the fact that the Open Handset Alliance is not functioning like a free software project, and that manufacturers are not feeding their changes back into the common software core. That is very much true; very little code from HTC, Samsung, or Motorola (for example) can be found in the Android distribution. This outcome is unsurprising, though, for a number of reasons.
The first of those, quite simply, is that Android is still not run like a free software project. Work is done behind closed doors at Google, with the code being "dropped" into the public repository on occasion. It is not uncommon for the code to show up some time after the software starts shipping on handsets. Outsiders have little visibility into what is going on and little say over the direction of Android development; there is no easy way (or incentive) for them to contribute back.
When manufacturers have contributed back, it has tended to be at the lower levels - in the kernel, for example. Some of those contributions go directly upstream, which is where they should go. Others tend to sit in the Android repository. But, regardless of where these contributions end up, they tend not to be the sort of high-level, user-visible code that the article is talking about. The manufacturers prefer to keep that code to themselves.
And "keep it to themselves" is exactly what these manufacturers are entitled to do. At that level of the system, permissive licenses rule, by Google's choice. If more of the Android system were licensed under the GPL, manufacturers would have been required to contribute their changes back - at least, those changes which are demonstrably derived from the GPL-licensed code. Google's decision to avoid the GPL is arguably regrettable, but it is motivated by an understandable fear: manufacturers would simply refuse to use a GPL-licensed Android system. If the choice is truly between permissive licensing and obscurity - and that's how this choice is seen by many - it's not surprising that Google opted for the former.
A related complaint is that the openness of Android allows every manufacturer - and carriers too - to put their own code onto the handsets they sell. Such code can be custom user interfaces, "crapware," or restrictions on what the handset can do. These additions are seen to be undesirable because they contribute to the fragmentation of the system and because they are often antifeatures that customers would not choose to have. The implied subtext is that an Android which disallowed such changes would be a better platform with fewer "dirty little secrets."
As many have pointed out, Android does not differ from other mobile platforms in its malleability in the hands of manufacturers and carriers. Handsets based on Symbian or Windows Mobile can also be customized by vendors. Those handsets, too, can be locked to carriers, restricted in the applications which can be installed, or otherwise crippled. The article presents the iPhone as an independent platform which is immune from this sort of meddling, but the stories of the Google Voice application and the control over the App Store in general say otherwise. Android has not magically fixed this problem (and it is a problem), but neither has it created the problem.
MeeGo, possibly, is trying to do something about the fragmentation side of the issue; there are various rules which must be followed to be able to use the MeeGo name. How useful that will be remains to be seen; there are a number of Android-based devices on the market now which do not advertise the provenance of their software. And, despite the fact that MeeGo is not as afraid of the GPL as Android is, it is still true that MeeGo does not expect to flourish by restricting the flexibility of manufacturers and carriers. When we are lucky enough to be able to obtain MeeGo-based devices, we'll see that they've been messed with in many of the same ways as all the others.
In summary, there are two separate concerns here: fragmentation and freedom. Fragmentation, of course, has been a staple of anti-Linux FUD since the beginning; surely, with the source in the open and with all those distributions, Linux would have to go in many different directions. But Linux has not fragmented in the way that Unix did twenty years ago. The main reason for this, arguably, is the common core (kernel, plumbing layer, and more) that is used by all distributions. Even if strange things are done at the higher levels in a specific distribution, it's still Linux underneath. A convincing case can be made that the use of the GPL at those levels has done a lot to prevent forks and keep everybody in sync.
Android is still Linux underneath, albeit a somewhat limited and strange Linux. But one could argue that much of the Android core is no longer GPL-licensed. So, while Android is based on the Linux kernel, the rest of the system more closely resembles BSD from a licensing point of view. That might make Android more susceptible to fragmentation; perhaps Android heralds the return of the Unix wars. Or it might not; most vendors do eventually realize that the costs of straying too far are too high. In any case, it's hard to imagine manufacturers going too far afield as long as Google continues to put resources into pushing Android forward at high speed.
Freedom seems like a harder problem. The demise of the Nexus One as a mass-market product was taken by some as a sign that consumers have little interest in freedom in general. There are a couple of things worth noting, though, starting with the fact that the Nexus One, in its new role as a developer phone, has quickly sold out. Clearly there is some interest in relatively open hardware out there.
Then there is the tremendous level of interest in the jailbreaking of phones, loading of alternative distributions, etc. Contributions to CyanogenMod have reached a point where the project has had to set up its own Gerrit system to manage the review process. Your editor suspects that very few of the people who are jailbreaking phones or installing new firmware actually need to do that in order to use their handsets. Instead, those people want the freedom to mess with the device and see what comes of it. In other words, Android has kicked off a small (but growing) community of developers and users with an interest in freedom and making full use of the hardware they have bought. With any luck at all, this community will grow, providing a market for vendors who sell open handsets and resisting legislative attempts to make handset hacking harder. Interesting things can only come of all that.
If Android has a "dirty little secret," it's that freedom works both ways. Of course customer-hostile companies (of which there are many in the mobile telephone business) can make use of the freedom provided by Android to do customer-hostile things. But that freedom also appears to be supporting a whole new generation of hobbyists, enthusiasts, and hackers who want to do interesting things with current computing platforms. All told, Android has not made things worse; instead, it looks like it is making things better.
The GNU Flash player Gnash released version 0.8.8 on August 22, the first release advertised as supporting 100% of the Flash videos hosted at YouTube, in addition to GPU acceleration and a host of new features. It is also the first release following a public disagreement between the leading contributors about the project's development process. In addition, although the Gnash project and the alternative free software Flash player Lightspark continue to cover different parts of the Flash specification, development is progressing on ways for users to seamlessly integrate both into the web browsing experience.
The disagreement started in mid-May, between Gnash maintainer Rob Savoye and leading contributor Benjamin Wolsey over development policies. Savoye was in favor of making frequent, experimental commits, while Wolsey argued that commits must be rejected if they broke existing tests — or else the stability of Gnash would suffer.
Eventually the two sides settled on drafting a set of commit policies for the project, which clarified the need to prevent test regressions and outlined a multi-step policy for handling checkins, without the potential for conflict that can arise when one developer reverts another's changes. Documented on the project wiki, the policy deems reversions "a last resort" to be taken only after discussing the issue on IRC and on the mailing list, and after blocking out the offending code.
Judging by the mailing list traffic and the progress of the code since then, the policy and the discussion surrounding it seem to have succeeded, and development returned to normal. A much bigger hurdle for the project is lack of funding. Savoye is historically the only developer who works full-time on Gnash, and donations to the non-profit Open Media Now project he established to raise funds for paying developers have slowed down to the point where he has started taking on other coding jobs.
Gnash's lack of sustained funding has been a problem for all of 2010, even forcing the team to drop plans to develop support for newer Flash 9 features like ActionScript 3.0. The project is one of the Free Software Foundation's high priority projects, but that status does not bestow any funds to help development, only publicity done by the FSF. Savoye told the Gnash mailing list in early August that unless donations or other funding pick up, the maintenance of the existing code — including the multiple rendering paths targeting different desktop and embedded platforms — will consume enough time that integrating new features will take a back seat, and the release schedule may have to slow down.
As to the code itself, source tarballs are provided on the GNU FTP mirrors, and the release is available via Git. The GetGnash.org site hosts experimental packages of the release, including Debian packages for Debian, gNewSense, and Ubuntu via an Apt repository, as well as Fedora and OLPC packages via a Yum repository.
At the time of this writing, the "experimental" nature of the GetGnash.org packages is fully in evidence, at least for the Ubuntu package, which does not install due to an unresolvable dependency. Compiling Gnash from source was more successful, however.
Due to the number and variety of media formats that can be encapsulated by Flash, the list of dependencies is long. It can be reduced, however, by specifying only a subset of the rendering, GUI, and multimedia options. For example:
    ./configure --enable-renderer=cairo --enable-media=GST --enable-gui=gtk

builds support for just the Cairo rendering engine, the GStreamer media handler, and the GTK+ stand-alone player GUI. The default settings add OpenGL, Anti-Grain Geometry (AGG), FFmpeg, SDL, and KDE4 dependencies.
The plugin for Mozilla-based browsers is built by default, but does not require Firefox development packages. Gnash's make install installs the standalone player; make install-plugins installs the browser plugin, by default placing it in $HOME/.firefox/plugins.
The flexibility in playback engines is one of 0.8.8's main new features, though. If built with FFmpeg and GStreamer media handlers, the choice between them can be made at runtime with the -M switch. Likewise, the -R switch allows runtime selection between Cairo, OpenGL, and AGG rendering (the latter being targeted for framebuffer-only devices).
In addition, Gnash supports switching between two hardware GPU acceleration APIs at runtime: XVideo and the newer VAAPI. XVideo is not recommended, as the current builds may even be slower than software video rendering due to video scaling. VAAPI hardware acceleration includes support for NVIDIA cards using VDPAU, ATI cards using XvBA, and native support for Intel GPUs.
The project also claims that 100% of the content hosted on YouTube will now play in Gnash. There have been many and varied reasons for YouTube breakage in the past (not the least of which is that the SWF player served up by YouTube changes regularly), but passing 100% of the tests is a milestone indeed. The Gnash developers suggest that everyone experiencing problems with YouTube and Gnash 0.8.8 clear out their YouTube cookies and try again before filing a bug report.
Finally, Savoye has been working on ARM support in recent releases, and Gnash 0.8.8 supports Android devices. This support is likely to be slower than Adobe's official Flash player, however, because for the time being it uses software rendering — but the availability of a free Flash player for mobile devices is an important step.
A question that popped up several times during the last development cycle was whether there was a chance that Gnash might join forces with Lightspark, another free Flash player replacement that works as a browser plugin. Lightspark focuses on supporting the current version of ActionScript, version 3.0, which was introduced with Flash 9. Gnash focuses on supporting older versions of ActionScript, which run on the AVM1 virtual machine from Flash 8 and before. Lightspark implements AVM2 for its ActionScript 3.0 support, and maintainer Alessandro Pignotti has indicated that he cannot feasibly add support for AVM1 in addition to maintaining AVM2. Complicating matters is the fact that Flash 9 and Flash 10 files can incorporate AVM1 code if the developer so chooses.
Each plugin could test for the presence of AVM1 or AVM2 code in a given SWF file, though, so it is theoretically possible for Gnash and Lightspark to co-exist and allow users to view both generations of Flash content. Marek Aaron Sapota pointed out on the Gnash mailing list that the Chrome and Chromium browsers allow both plugins to be installed simultaneously, but that Firefox becomes "confused" in the same situation — even if one of the plugins is disabled.
Progress on that front came on August 2, when Pignotti released Lightspark 0.4.2.2. This release of the plugin tests for the virtual machine version in SWF files, and calls the Gnash program if it uses AVM1 (assuming it detects that Gnash is installed). Consequently, a user could install the Lightspark plugin and Gnash standalone player, not install the Gnash plugin, and play Flash content using both AVM1 and AVM2, seamlessly.
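The dispatch decision described above hinges on identifying which virtual machine a SWF file targets. As a rough illustration (not Lightspark's actual code), a dispatcher could start from the SWF header: the file begins with a three-byte signature and a one-byte format version, and AVM2 was only introduced with Flash 9. This is a heuristic only, since a Flash 9+ file may still contain AVM1 code; a complete check would have to parse the FileAttributes tag's ActionScript3 flag.

```python
def swf_version(data: bytes) -> int:
    # A SWF file starts with a 3-byte signature ("FWS" uncompressed,
    # "CWS" zlib-compressed, "ZWS" LZMA-compressed) followed by a
    # one-byte format version.
    if data[:3] not in (b"FWS", b"CWS", b"ZWS"):
        raise ValueError("not a SWF file")
    return data[3]

def likely_avm2(data: bytes) -> bool:
    # Heuristic: AVM2 (ActionScript 3.0) arrived with Flash 9, so any
    # older file must be AVM1.  A version >= 9 file *may* use AVM2;
    # deciding for certain requires inspecting the FileAttributes tag.
    return swf_version(data) >= 9
```

A dispatcher built on this sketch would hand files reporting version 8 or lower straight to Gnash and attempt Lightspark for the rest.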
There is a drawback to this approach, because Lightspark has not yet implemented ExternalInterface. So, for the time being, it is a toss-up whether Gnash or Lightspark will offer the best support for any arbitrary SWF file encountered in the browser. Savoye, however, encouraged Pignotti to make use of Gnash's ExternalInterface code, since both projects are released under the GPL.
Of course, the current habit of using YouTube as a test case is of dubious value, particularly in the long term. Not only does video playback test just a small portion of Flash's capabilities when compared to interactive education and game content, but HTML 5 is clearly a better platform for video delivery in the future. Already, YouTube itself allows users to opt in to an HTML 5 version of the site, as do several other video hosting services.
Even if HTML 5 becomes the preferred web delivery method for audio and video, and HTML 5 with CSS 3 implements rich interactivity that obsoletes most of Flash's other major uses, both Gnash and Lightspark will remain valuable simply because of the millions of Flash files already in existence. In addition, mobile and embedded devices will likely remain a pain point for free software supporters for years to come, as device makers routinely include proprietary components like Adobe's Flash player, and do not make alternatives user-selectable.
It continues to be an interesting year for open source Flash playback. The level of interaction and cooperation between the two main projects is a welcome sign, as is the experimentation being done to bring future releases to previously inaccessible platforms. Case in point: there have been numerous threads in recent months documenting individuals' attempts to get Gnash running on iOS devices like Apple's iPad — which is certainly something the proprietary companies have no intention of pursuing.
It has been said that the US National Security Agency (NSA) blocked the implementation of encryption in the TCP/IP protocol for the original ARPANET, because it wanted to be able to listen in on the traffic that crossed that early precursor to the internet. Since that time, we have been relegated to always sending clear-text packets via TCP/IP. Higher level application protocols (i.e. ssh, HTTPS, etc.) have enabled encryption for some traffic, but the vast majority of internet communication is still in the clear. The Tcpcrypt project is an attempt to change that, transparently, so that two conforming nodes can encrypt all of the data portion of any packets they exchange.
One of the key benefits that Tcpcrypt offers is transparency. That means that if both endpoints of a connection support it, the connection will be encrypted, but if one doesn't support Tcpcrypt, the other will gracefully fall back to standard clear-text TCP/IP. No applications are required to change, and no "new" protocols are required (beyond Tcpcrypt itself, of course) as applications will send and receive data just as they do today. But there is an additional benefit available for those applications that are willing to change: strong authentication.
Tcpcrypt has the concept of a "session ID" that is generated on both sides as part of the key exchange. This ID can be used in conjunction with a shared secret, like a password, to authenticate both ends of the communication. Because the client and server can exchange cryptographic hash values derived from the shared secret and session ID, they can be assured that each is talking over an encrypted channel to an endpoint that has the key (password). A "man in the middle" would not have access to the password and therefore can't spoof the exchange.
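The authentication pattern described above can be sketched in a few lines. This is an illustrative construction, not the exact scheme from the tcpcrypt paper: each side computes a keyed hash over the session ID and the shared password, with a role label to keep the two directions distinct, and verifies the peer's value with a constant-time comparison.

```python
import hashlib
import hmac
import os

def auth_token(password: bytes, session_id: bytes, role: bytes) -> bytes:
    # Prove knowledge of the shared password by hashing it together
    # with the tcpcrypt session ID.  Binding the token to the session
    # ID means it is only valid on this particular encrypted channel:
    # a man in the middle holds a *different* session ID with each
    # victim, so relaying the token to the other side fails.
    return hmac.new(password, role + session_id, hashlib.sha256).digest()

# Both endpoints derive the same session ID from the key exchange;
# os.urandom() here is just a stand-in for that negotiated value.
session_id = os.urandom(32)
password = b"shared secret"

client_token = auth_token(password, session_id, b"client")
server_token = auth_token(password, session_id, b"server")

# The server recomputes and checks the client's token (and vice
# versa) using a comparison that resists timing attacks.
assert hmac.compare_digest(
    client_token, auth_token(password, session_id, b"client"))
```

Note that the password itself never crosses the wire; only the derived hash values are exchanged, which is why an eavesdropper or active attacker without the password cannot spoof either side.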
Even without any application changes for stronger authentication, Tcpcrypt would defend against passive man-in-the-middle attacks, like eavesdropping. Active attacks could still spoof responses that said Tcpcrypt was not supported, even if the other endpoint did support it, or even relay encrypted traffic. That would still be better than the usual situation today where a passive attacker can gather an enormous amount of clear-text traffic, especially from unencrypted or weakly encrypted wireless networks.
There is an Internet Engineering Task Force (IETF) draft available that describes how Tcpcrypt works by using two new TCP options. Those two options, CRYPT and MAC, will not be recognized by endpoints without Tcpcrypt support, and are therefore harmless. The CRYPT option is used to negotiate the use of Tcpcrypt and to exchange encryption keys, while the MAC option carries a hash value that can be used to verify the integrity of the packet data.
In addition to the IETF draft, the project has produced a paper, The case for ubiquitous transport-level encryption [PDF], that was presented at the 2010 USENIX Security conference. It gives a somewhat higher-level look at how Tcpcrypt integrates with TCP/IP, while providing a lot more information on the cryptographic and authentication algorithms. The slides [PDF] from the presentation are also instructive.
One of the basic premises that underlies Tcpcrypt is that computers have gotten "fast enough" to handle encrypting all internet traffic. Doing so at the transport level, rather than in application protocols (e.g. ssh), can make it transparent to applications. In addition, Tcpcrypt can work through NAT devices, which is something that another lower-layer encryption protocol, IPSec, cannot handle.
Because Tcpcrypt keys are short-lived, non-persistent public/private key pairs, it does not require the public key infrastructure (PKI) that other solutions, like HTTPS, need. That means that endpoints can communicate without getting certificates signed by centralized authorities. Of course the existing PKI certificates will work just fine on top of Tcpcrypt.
While computers may be "fast enough" to handle encryption on every packet, there is still the problem of asymmetry. Servers typically handle much more traffic than clients, so Tcpcrypt is designed to put the most difficult parts of the key negotiation and encryption onto the client side. The claim is that speeds of up to 25x that of HTTPS (i.e. SSL/TLS) can be achieved by Tcpcrypt. One wonders whether mobile devices are "fast enough," but that problem, if it even is one, probably won't remain one for much longer.
Overall, Tcpcrypt is an intriguing idea. It certainly isn't a panacea for all of today's network ills, but that is no surprise. Unlike other proposals, Tcpcrypt can be incrementally deployed without requiring that we, somehow, restart the internet. Since it won't break existing devices, it can be developed and tested within the framework of the existing net. If for no other reason, that should give Tcpcrypt a leg up on other potential solutions.
Created: August 20, 2010; Updated: September 1, 2010
Description: From the Red Hat advisory:
This update fixes a vulnerability in Adobe Reader. This vulnerability is detailed on the Adobe security page APSB10-17, listed in the References section. A specially-crafted PDF file could cause Adobe Reader to crash or, potentially, execute arbitrary code as the user running Adobe Reader when opened.
Package(s): cacti
CVE #(s): CVE-2010-1644 CVE-2010-1645 CVE-2010-2543 CVE-2010-2544 CVE-2010-2545
Created: August 24, 2010; Updated: January 9, 2012
Description: From the Mandriva advisory:
Multiple cross-site scripting (XSS) vulnerabilities in Cacti before 0.8.7f, allow remote attackers to inject arbitrary web script or HTML via the (1) hostname or (2) description parameter to host.php, or (3) the host_id parameter to data_sources.php (CVE-2010-1644).
Cacti before 0.8.7f, allows remote authenticated administrators to execute arbitrary commands via shell metacharacters in (1) the FQDN field of a Device or (2) the Vertical Label field of a Graph Template (CVE-2010-1645).
Cross-site scripting (XSS) vulnerability in include/top_graph_header.php in Cacti before 0.8.7g allows remote attackers to inject arbitrary web script or HTML via the graph_start parameter to graph.php. NOTE: this vulnerability exists because of an incorrect fix for CVE-2009-4032.2.b (CVE-2010-2543).
Cross-site scripting (XSS) vulnerability in utilities.php in Cacti before 0.8.7g, allows remote attackers to inject arbitrary web script or HTML via the filter parameter (CVE-2010-2544).
Multiple cross-site scripting (XSS) vulnerabilities in Cacti before 0.8.7g, allow remote attackers to inject arbitrary web script or HTML via (1) the name element in an XML template to templates_import.php; and allow remote authenticated administrators to inject arbitrary web script or HTML via vectors related to (2) cdef.php, (3) data_input.php, (4) data_queries.php, (5) data_sources.php, (6) data_templates.php, (7) gprint_presets.php, (8) graph.php, (9) graphs_new.php, (10) graphs.php, (11) graph_templates_inputs.php, (12) graph_templates_items.php, (13) graph_templates.php, (14) graph_view.php, (15) host.php, (16) host_templates.php, (17) lib/functions.php, (18) lib/html_form.php, (19) lib/html_form_template.php, (20) lib/html.php, (21) lib/html_tree.php, (22) lib/rrd.php, (23) rra.php, (24) tree.php, and (25) user_admin.php (CVE-2010-2545).
Created: August 20, 2010; Updated: February 7, 2014
Description: From the CVE entry:
freeciv 2.2 before 2.2.1 and 2.3 before 2.3.0 allows attackers to read arbitrary files or execute arbitrary commands via scenario that contains Lua functionality, related to the (1) os, (2) io, (3) package, (4) dofile, (5) loadfile, (6) loadlib, (7) module, and (8) require modules or functions.
Package(s): linux-2.6
CVE #(s): CVE-2009-4895 CVE-2010-2803 CVE-2010-2959 CVE-2010-3015
Created: August 20, 2010; Updated: March 3, 2011
Description: From the Debian advisory:
Kyle Bader reported an issue in the tty subsystem that allows local users to create a denial of service (NULL pointer dereference). (CVE-2009-4895)
Kees Cook reported an issue in the DRM (Direct Rendering Manager) subsystem. Local users with sufficient privileges (local X users or members of the 'video' group on a default Debian install) could acquire access to sensitive kernel memory. (CVE-2010-2803)
Ben Hawkes discovered an issue in the AF_CAN socket family. An integer overflow condition may allow local users to obtain elevated privileges. (CVE-2010-2959)
Toshiyuki Okajima reported an issue in the ext4 filesystem. Local users could trigger a denial of service (BUG assertion) by generating a specific set of filesystem operations. (CVE-2010-3015)
|Package(s):||kvm||CVE #(s):||CVE-2010-0431 CVE-2010-0435 CVE-2010-2784|
|Created:||August 20, 2010||Updated:||March 3, 2011|
|Description:||From the Red Hat advisory:
It was found that QEMU-KVM on the host did not validate all pointers provided from a guest system's QXL graphics card driver. A privileged guest user could use this flaw to cause the host to dereference an invalid pointer, causing the guest to crash (denial of service) or, possibly, resulting in the privileged guest user escalating their privileges on the host. (CVE-2010-0431)
A flaw was found in QEMU-KVM, allowing the guest some control over the index used to access the callback array during sub-page MMIO initialization. A privileged guest user could use this flaw to crash the guest (denial of service) or, possibly, escalate their privileges on the host. (CVE-2010-2784)
A NULL pointer dereference flaw was found when the host system had a processor with the Intel VT-x extension enabled. A privileged guest user could use this flaw to trick the host into emulating a certain instruction, which could crash the host (denial of service). (CVE-2010-0435)
|Package(s):||moin||CVE #(s):||CVE-2010-2969 CVE-2010-2970|
|Created:||August 25, 2010||Updated:||October 19, 2012|
|Description:||Versions of the MoinMoin wiki system through 1.7.3 or prior to 1.9.3 suffer from multiple cross-site scripting vulnerabilities.|
|Package(s):||moodle||CVE #(s):||CVE-2010-2795 CVE-2010-2796|
|Created:||August 23, 2010||Updated:||February 23, 2011|
|Description:||From the CVE entries:
phpCAS before 1.1.2 allows remote authenticated users to hijack sessions via a query string containing a crafted ticket value. (CVE-2010-2795)
Cross-site scripting (XSS) vulnerability in phpCAS before 1.1.2, when proxy mode is enabled, allows remote attackers to inject arbitrary web script or HTML via a callback URL. (CVE-2010-2796)
|Package(s):||firefox, thunderbird, sunbird||CVE #(s):||CVE-2010-2755|
|Created:||August 20, 2010||Updated:||January 19, 2011|
|Description:||From the CVE entry:
layout/generic/nsObjectFrame.cpp in Mozilla Firefox 3.6.7 does not properly free memory in the parameter array of a plugin instance, which allows remote attackers to cause a denial of service (memory corruption) or possibly execute arbitrary code via a crafted HTML document, related to the DATA and SRC attributes of an OBJECT element. NOTE: this vulnerability exists because of an incorrect fix for CVE-2010-1214.
|Package(s):||openoffice.org||CVE #(s):||CVE-2010-2935 CVE-2010-2936|
|Created:||August 23, 2010||Updated:||April 19, 2011|
|Description:||From the Red Hat advisory:
An integer truncation error, leading to a heap-based buffer overflow, was found in the way the OpenOffice.org Impress presentation application sanitized a file's dictionary property items. An attacker could use this flaw to create a specially-crafted Microsoft Office PowerPoint file that, when opened, would cause OpenOffice.org Impress to crash or, possibly, execute arbitrary code with the privileges of the user running OpenOffice.org Impress. (CVE-2010-2935)
An integer overflow flaw, leading to a heap-based buffer overflow, was found in the way OpenOffice.org Impress processed polygons in input documents. An attacker could use this flaw to create a specially-crafted Microsoft Office PowerPoint file that, when opened, would cause OpenOffice.org Impress to crash or, possibly, execute arbitrary code with the privileges of the user running OpenOffice.org Impress. (CVE-2010-2936)
|Package(s):||php||CVE #(s):||CVE-2010-2190 CVE-2010-1914 CVE-2010-1915|
|Created:||August 24, 2010||Updated:||October 6, 2010|
|Description:||From the CVE entries:
The (1) trim, (2) ltrim, (3) rtrim, and (4) substr_replace functions in PHP 5.2 through 5.2.13 and 5.3 through 5.3.2 allow context-dependent attackers to obtain sensitive information (memory contents) by causing a userspace interruption of an internal function, related to the call time pass by reference feature. (CVE-2010-2190)
The Zend Engine in PHP 5.2 through 5.2.13 and 5.3 through 5.3.2 allows context-dependent attackers to obtain sensitive information by interrupting the handler for the (1) ZEND_BW_XOR opcode (shift_left_function), (2) ZEND_SL opcode (bitwise_xor_function), or (3) ZEND_SR opcode (shift_right_function), related to the convert_to_long_base function. (CVE-2010-1914)
The preg_quote function in PHP 5.2 through 5.2.13 and 5.3 through 5.3.2 allows context-dependent attackers to obtain sensitive information (memory contents) by causing a userspace interruption of an internal function, related to the call time pass by reference feature, modification of ZVALs whose values are not updated in the associated local variables, and access of previously-freed memory. (CVE-2010-1915)
|Created:||August 23, 2010||Updated:||September 13, 2010|
|Description:||From the Red Hat bugzilla:
Several cross-site scripting (XSS) vulnerabilities were found in phpMyAdmin versions prior to 2.11.10.1 and 3.3.5.1. A remote attacker was able to conduct an XSS attack using crafted URLs or POST parameters on several pages.
|Package(s):||qspice||CVE #(s):||CVE-2010-0428 CVE-2010-0429|
|Created:||August 20, 2010||Updated:||August 27, 2010|
|Description:||From the Red Hat advisory:
It was found that the libspice component of QEMU-KVM on the host did not validate all pointers provided from a guest system's QXL graphics card driver. A privileged guest user could use this flaw to cause the host to dereference an invalid pointer, causing the guest to crash (denial of service) or, possibly, resulting in the privileged guest user escalating their privileges on the host. (CVE-2010-0428)
It was found that the libspice component of QEMU-KVM on the host could be forced to perform certain memory management operations on memory addresses controlled by a guest. A privileged guest user could use this flaw to crash the guest (denial of service) or, possibly, escalate their privileges on the host. (CVE-2010-0429)
|Created:||August 25, 2010||Updated:||August 26, 2010|
|Description:||From the Red Hat advisory: A race condition was found in the way the SPICE Mozilla Firefox plug-in and the SPICE client communicated. A local attacker could use this flaw to trick the plug-in and the SPICE client into communicating over an attacker-controlled socket, possibly gaining access to authentication details, or resulting in a man-in-the-middle attack on the SPICE connection.|
|Created:||August 25, 2010||Updated:||August 26, 2010|
|Description:||The SPICE firefox plugin suffers from a symbolic link vulnerability enabling a local attacker to overwrite files.|
|Created:||August 23, 2010||Updated:||August 25, 2010|
|Description:||From the CVE entry:
The default configuration of the <Button2> binding in Uzbl before 2010.08.05 does not properly use the @SELECTED_URI feature, which allows user-assisted remote attackers to execute arbitrary commands via a crafted HREF attribute of an A element in an HTML document.
|Created:||August 25, 2010||Updated:||August 25, 2010|
|Description:||Zabbix prior to 1.8.3 suffers from multiple cross-site scripting vulnerabilities; see this advisory for details.|
|Created:||August 25, 2010||Updated:||August 25, 2010|
|Description:||It turns out that the zope-ldapuserfolder extension does not verify passwords when somebody logs in as the emergency user.|
Page editor: Jake Edge
Brief items

The current development kernel is 2.6.36-rc2, released by Linus on August 22. It contains mostly fixes, but Linus did also pull some small parts of the VFS scalability patch set. "The other big merge in -rc2 is the intel graphics update. I'm not hugely happy about the timing of it, but I think I needed to pull it. Apart from that, there's a number of random fixes all over, the appended shortlog gives you a taste of it." See said shortlog for details, or the full changelog for all the details.
About 200 changes have been merged (as of this writing) since the 2.6.36-rc2 release. They are dominated by fixes, but there's also a new driver for Marvell pxa168 Ethernet controllers and a core mutex change (see below).
Stable updates: the 2.6.27.52, 2.6.32.20, 2.6.34.5, and 2.6.35.3 stable kernel updates were released on August 20. These are relatively small updates containing fixes for the new "stack guard page" feature, which was added to close the recently-disclosed X.org local root vulnerability.
There is a rather larger set of updates in the review process currently; they can be expected on or after August 26.
Except that, as it turns out, it doesn't always perform better. While doing some testing on a 64-core system, Tim Chen noticed a problem: multiple threads can be waiting for the same mutex at any given time. Once the mutex becomes available, only one of those spinning threads will obtain it; the others will continue to spin, contending for the lock. In general, optimism can be good, but excessive optimism can be harmful if it leads to continued behavior which does not yield useful results. That would appear to be the case here.
Tim's response was a patch changing the optimistic spinning implementation slightly. There is now an additional check in the loop to see if the owner of the mutex has changed. If the ownership of a mutex changes while a thread is spinning, waiting for it, that means that it was released and somebody else grabbed it first. In other words, there is heavy contention and multiple CPUs are spinning in a race that only one of them can win. In such cases, it makes sense to just go to sleep and wait until things calm down a bit.
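The decision logic can be sketched in user space. Everything below is our own illustration, not the kernel's code: the real implementation lives in kernel/mutex.c and tracks the owning task_struct, while this sketch uses pthread primitives, an invented spin cap, and thread IDs.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical user-space analogue of optimistic spinning with the
 * owner-change check: a contender spins briefly, but blocks as soon
 * as it sees the lock change hands, since that means another waiter
 * won the race and further spinning is likely wasted work. */
struct adaptive_mutex {
    pthread_mutex_t lock;
    atomic_uintptr_t owner;     /* 0 when the mutex is free */
};

static void adaptive_mutex_init(struct adaptive_mutex *am)
{
    pthread_mutex_init(&am->lock, NULL);
    atomic_store(&am->owner, 0);
}

static void adaptive_lock(struct adaptive_mutex *am)
{
    uintptr_t seen = atomic_load(&am->owner);
    int spins = 0;

    while (pthread_mutex_trylock(&am->lock) != 0) {
        uintptr_t now = atomic_load(&am->owner);
        /* Owner changed while we were spinning: heavy contention,
         * so stop burning CPU and block instead. The spin cap is
         * just to keep this sketch from looping forever. */
        if (now != seen || ++spins > 10000) {
            pthread_mutex_lock(&am->lock);
            break;
        }
    }
    atomic_store(&am->owner, (uintptr_t)pthread_self());
}

static void adaptive_unlock(struct adaptive_mutex *am)
{
    atomic_store(&am->owner, 0);
    pthread_mutex_unlock(&am->lock);
}
```

In the uncontended case this behaves like a plain mutex; the owner field only matters to threads that find the lock busy.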
Various benchmark results showed significant performance improvements in heavily-contended situations. That was enough to get the patch merged for 2.6.36-rc2.

A separate discussion concerned the __GFP_NOFAIL flag, which tells the page allocator to loop until an allocation succeeds rather than ever return failure; David Rientjes posted a patch to get rid of it by pushing the (possibly) infinite looping back to callers.
Andrew Morton was not convinced by the patch:
All of these are wrong, bad, buggy and mustfix. So we consolidated the wrongbadbuggymustfix concept into the core MM so that miscreants could be easily identified and hopefully fixed.
David's response is that said miscreants have not been fixed over the course of many years, and that __GFP_NOFAIL imposes complexity on the page allocator which slows things down for all users. Andrew came back with a suggestion for special versions of the allocation functions which would perform the looping; that would move the implementation out of the core allocator, but still make it possible to search for code needing to fix; David obliged with a patch adding kmalloc_nofail() and friends.
This kind of patch is guaranteed to bring out comments from those who feel that it is far better to just fix code which is not prepared to deal with memory allocation failures. But, as Ted Ts'o pointed out, that is not always an easy thing to do:
Ted's point is that there are always going to be places where recovery from a memory allocation failure is quite hard, if it's possible at all. So the kernel can provide some means by which looping on failure can be done centrally, or see it done in various ad hoc ways in random places in the kernel. Bad code is not improved by being swept under the rug, so it seems likely that some sort of central loop-on-failure mechanism will continue to exist indefinitely.
Kernel development news

Linus made his enthusiasm for Nick Piggin's VFS scalability patches clear in the 2.6.35 announcement, though:
It's a rare developer who, upon having tickled the Big Penguin to that particular shade, will hold off on merging his changes. But Nick asked that the patches sit out for one more cycle, perhaps out of the entirely rational fear of bugs which might irritate users to a rather deeper shade. So Linus will have to wait a bit for his RCU pathname lookup code. That said, some parts of the VFS scalability code did make it into the mainline for 2.6.36-rc2.
Like most latter-day scalability work, the VFS work is focused on increasing locality and eliminating situations where CPUs must share resources. Given that a filesystem is an inherently global structure, increasing locality can be a challenging task; as a result, parts of Nick's patch set are on the complex and tricky side. But, in the end, it comes down to dealing with things locally whenever possible, but making global action possible when the need arises.
The first step is the introduction of two new lock types, the first of which is called a "local/global lock" (lglock). An lglock is intended to provide very fast access to per-CPU data while making it possible (at a rather higher cost) to get at another CPU's data. An lglock is created with:
    #include <linux/lglock.h>

    DEFINE_LGLOCK(name);
The DEFINE_LGLOCK() macro is a 99-line wonder which creates the necessary data structure and accessor functions. By design, lglocks can only be defined at the file global level; they are not intended to be embedded within data structures.
Another set of macros is used for working with the lock:
    lg_lock_init(name);
    lg_local_lock(name);
    lg_local_unlock(name);
    lg_local_lock_cpu(name, int cpu);
    lg_local_unlock_cpu(name, int cpu);
Underneath it all, an lglock is really just a per-CPU array of spinlocks. So a call to lg_local_lock() will acquire the current CPU's spinlock, while lg_local_lock_cpu() will acquire the lock belonging to the specified cpu. Acquiring an lglock also disables preemption, which would not otherwise happen in realtime kernels. As long as almost all locking is local, it will be very fast; the lock will not bounce between CPUs and will not be contended. Both of those assumptions go away, of course, if the cross-CPU version is used.
Sometimes it is necessary to globally lock the lglock:
    lg_global_lock(name);
    lg_global_unlock(name);
    lg_global_lock_online(name);
    lg_global_unlock_online(name);
A call to lg_global_lock() will go through the entire array, acquiring the spinlock for every CPU. Needless to say, this will be a very expensive operation; if it happens with any frequency at all, an lglock is probably the wrong primitive to use. The _online version only acquires locks for CPUs which are currently running, while lg_global_lock() acquires locks for all possible CPUs.
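As a rough illustration of these semantics, here is a user-space analogue of an lglock; pthread mutexes stand in for the kernel's per-CPU spinlocks (and for the preemption disabling), and the names are ours, not the kernel's.

```c
#include <pthread.h>

#define NR_CPUS 4   /* stand-in for the kernel's CPU count */

/* Sketch of an lglock: simply an array of per-"CPU" locks. */
struct lglock_sim {
    pthread_mutex_t cpu_lock[NR_CPUS];
};

static void lg_init(struct lglock_sim *lg)
{
    for (int i = 0; i < NR_CPUS; i++)
        pthread_mutex_init(&lg->cpu_lock[i], NULL);
}

/* Fast path: take only the named CPU's lock. */
static int lg_local_lock_sim(struct lglock_sim *lg, int cpu)
{
    return pthread_mutex_lock(&lg->cpu_lock[cpu]);
}

static int lg_local_unlock_sim(struct lglock_sim *lg, int cpu)
{
    return pthread_mutex_unlock(&lg->cpu_lock[cpu]);
}

/* Slow path: sweep the whole array, excluding every CPU at once.
 * A brlock maps straight onto this pair: br_read_lock() is the
 * local acquisition, br_write_lock() is the global sweep. */
static int lg_global_lock_sim(struct lglock_sim *lg)
{
    for (int i = 0; i < NR_CPUS; i++)
        pthread_mutex_lock(&lg->cpu_lock[i]);
    return 0;
}

static int lg_global_unlock_sim(struct lglock_sim *lg)
{
    for (int i = NR_CPUS - 1; i >= 0; i--)
        pthread_mutex_unlock(&lg->cpu_lock[i]);
    return 0;
}
```

The asymmetry is the whole point: the local path touches one cache-local lock, while the global path pays a cost proportional to the number of CPUs.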
The VFS scalability patch set also brings back the "big reader lock" concept. The idea behind a brlock is to make locking for read access as fast as possible, while making write locking possible. The brlock API (also defined in <linux/lglock.h>) looks like this:
    DEFINE_BRLOCK(name);

    br_lock_init(name);
    br_read_lock(name);
    br_read_unlock(name);
    br_write_lock(name);
    br_write_unlock(name);
As it happens, this version of brlocks is implemented entirely with lglocks; br_read_lock() maps directly to lg_local_lock(), and br_write_lock() turns into lg_global_lock().
The first use of lglocks is to protect the list of open files which is attached to each superblock structure. This list is currently protected by the global files_lock, which becomes a bottleneck when a lot of open() and close() calls are being made. In 2.6.36, the list of open files becomes a per-CPU array, with each CPU managing its own list. When a file is opened, a (cheap) call to lg_local_lock() suffices to protect the local list while the new file is added.
When a file is closed, things are just a bit more complicated. There is no guarantee that the file will be on the local CPU's list, so the VFS must be prepared to reach across to another CPU's list to clean things up. That, of course, is what lg_local_lock_cpu() is for. Cross-CPU locking will be more expensive than local locking, but (1) it only involves one other CPU, and (2) in situations where there is a lot of opening and closing of files, chances are that the process working with any specific file will not migrate between CPUs during the (presumably short) time that the file is open.
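The open/close pattern can be sketched as follows; the structure and function names are invented for illustration and are much simpler than the real per-superblock lists.

```c
#include <pthread.h>

#define NR_CPUS 4

/* Simplified sketch of per-CPU open-file lists: each "CPU" keeps
 * its own singly-linked list under its own lock, and a file
 * records which list it went on so close() can reach across. */
struct file_node {
    struct file_node *next;
    int cpu;                    /* list this file was opened on */
};

static pthread_mutex_t list_lock[NR_CPUS] = {
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER,
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER
};
static struct file_node *open_files[NR_CPUS];

/* open(): cheap, purely local locking (lg_local_lock()). */
static void sim_file_open(struct file_node *f, int cpu)
{
    pthread_mutex_lock(&list_lock[cpu]);
    f->cpu = cpu;
    f->next = open_files[cpu];
    open_files[cpu] = f;
    pthread_mutex_unlock(&list_lock[cpu]);
}

/* close(): the file may be on another CPU's list, so take that
 * CPU's lock (lg_local_lock_cpu()) and unlink it there. */
static void sim_file_close(struct file_node *f)
{
    int cpu = f->cpu;
    pthread_mutex_lock(&list_lock[cpu]);
    for (struct file_node **p = &open_files[cpu]; *p; p = &(*p)->next) {
        if (*p == f) {
            *p = f->next;
            break;
        }
    }
    pthread_mutex_unlock(&list_lock[cpu]);
}
```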
The real reason that the per-superblock open files list exists is to let the kernel check for writable files when a filesystem is being remounted read-only. That operation requires exclusive access to the entire list, so lg_global_lock() is used. The global lock is costly, but read-only remounts are not a common occurrence, so nobody is likely to notice.
Also for 2.6.36, Nick changed the global vfsmount_lock into a brlock. This lock protects the tree of mounted filesystems; it must be acquired (in a read-only mode) whenever a pathname lookup crosses from one mount point to the next. Write access is only needed when filesystems are mounted or unmounted - again, an uncommon occurrence on most systems. Nick warns that this change is unlikely to speed up most workloads now - indeed, it may slow some down slightly - but its value will become clearer when some of the other bottlenecks are taken care of.
Aside from a few smaller changes, that is where VFS scalability work stops for the 2.6.36 development cycle. The more complicated work - dealing with dcache_lock in particular - will go through a few more months of testing before it is pushed toward the mainline. Then, perhaps, we'll see Linus in a proper shade of pink.
Adding an interface for user space to be able to access the kernel crypto subsystem—along with any hardware acceleration available—seems like a reasonable idea at first blush. But adding a huge chunk of formerly user-space code to the kernel to implement additional cryptographic algorithms, including public key cryptosystems, is likely to be difficult to sell. Coupling that with an ioctl()-based API, with pointers and variable length data, raises the barrier further still. Still, there are some good arguments for providing some kind of user-space interface to the crypto subsystem, even if the current proposal doesn't pass muster.
Miloslav Trmač posted an RFC patchset that implements the /dev/crypto user-space interface. The code is derived from cryptodev-linux, but the new implementation was largely developed by Nikos Mavrogiannopoulos. The patchset is rather large, mostly because of the inclusion of two user-space libraries for handling multi-precision integers (LibTomMath) and additional cryptographic algorithms (LibTomCrypt); some 20,000 lines of code in all. That is the current implementation, though there is mention of switching to something based on Libgcrypt, which is believed to be more scrutinized as well as more actively maintained, but is not particularly small either.
One of the key benefits of the new API is that keys can be handled completely within the kernel, allowing user space to do whatever encryption or decryption it needs without ever exposing the key to the application. That means that application vulnerabilities would be unable to expose any keys. The keys can also be wrapped by the kernel so that the application can receive an encrypted blob that it can store persistently to be loaded back into the kernel after a reboot.
Ted Ts'o questioned the whole idea behind the interface, specifically whether hardware acceleration would really speed things up:
He was also concerned that the key handling was redundant: "If the goal is access to hardware-escrowed keys, don't we have the TPM [Trusted Platform Module] interface for that already?" But Mavrogiannopoulos noted that embedded systems are one target for this work, "where the hardware version of AES might be 100 times faster than the software". He also said that the TPM interface was not flexible enough and that one goal of the new API is that "it can be wrapped by a PKCS #11 [Public-Key Cryptography Standard for cryptographic tokens like keys] module and used transparently by other crypto libraries (openssl/nss/gnutls)", which the TPM interface is unable to support.
There is already support in the kernel for key management, so Kyle Moffett would like to see that used: "We already have one very nice key/keyring API in the kernel (see Documentation/keys.txt) that's being used for crypto keys for NFSv4, AFS, etc. Can't you just add a bunch of cryptoapi key types to that API instead?" Mavrogiannopoulos thinks that because the keyring API allows exporting keys to user space—something that the /dev/crypto API explicitly prevents—it would be inappropriate. Keyring developer David Howells suggests an easy way around that particular problem: "Don't provide a read() key type operation, then".
But the interface itself also drew complaints. To use /dev/crypto, an application needs to open() the device, then start issuing ioctl() calls. Each ioctl() operation (which are named NCRIO_*) has its own structure type that gets passed as the data parameter to ioctl():
    res = ioctl(fd, NCRIO_..., &data);
Many of the structures contain pointers for user data (input and output), which are declared as void pointers. That necessitates using the compat_ioctl to handle 32 vs. 64-bit pointer issues, which Arnd Bergmann disagrees with: "New drivers should be written to *avoid* compat_ioctl calls, using only very simple fixed-length data structures as ioctl commands." He doesn't think that pointers should be used in the interface at all if possible: "Ideally, you would use ioctl to control the device while you use read and write to pass actual bits of data".
Beyond that, the interface also mixes in netlink-style variable-length attributes to support things like algorithm choice, initialization vector, key type (secret, private, public), key-wrapping algorithm, and many additional attributes that are algorithm-specific, like key length or RSA- and DSA-specific values. Each of these can be tacked onto the end of the operation-specific structure, for most (but not all) of the operations, as an array of (struct nlattr, attribute data) pairs using the same formatting as netlink messages. It is, in short, a complex interface that is reasonably well documented in the first patch of the series.
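To make that layout concrete, here is a hedged sketch of how such netlink-style trailers could be packed and walked. The structure definition is a stand-in for struct nlattr (so the example is self-contained), and the attribute type used in the usage note is purely hypothetical, not an actual /dev/crypto constant.

```c
#include <stdint.h>
#include <string.h>

/* Minimal stand-in for the kernel's struct nlattr: a 16-bit total
 * length (header + payload) followed by a 16-bit type. */
struct nlattr_s {
    uint16_t nla_len;
    uint16_t nla_type;
};

/* Netlink attributes are padded to 4-byte boundaries. */
#define NLA_ALIGN_SIM(n) (((n) + 3u) & ~3u)

/* Append one (header, payload) attribute at offset *off* in buf,
 * the way the netlink-style trailer would be laid out; returns the
 * offset of the next attribute slot. */
static size_t put_attr(unsigned char *buf, size_t off,
                       uint16_t type, const void *data, uint16_t len)
{
    struct nlattr_s hdr = { (uint16_t)(sizeof(hdr) + len), type };

    memcpy(buf + off, &hdr, sizeof(hdr));
    memcpy(buf + off + sizeof(hdr), data, len);
    return off + NLA_ALIGN_SIM(hdr.nla_len);
}

/* Walk the trailer; return the payload length of the first
 * attribute of the given type, or -1 if it is absent. */
static int find_attr(const unsigned char *buf, size_t total, uint16_t type)
{
    size_t off = 0;

    while (off + sizeof(struct nlattr_s) <= total) {
        struct nlattr_s hdr;

        memcpy(&hdr, buf + off, sizeof(hdr));
        if (hdr.nla_len < sizeof(hdr))
            break;              /* malformed attribute */
        if (hdr.nla_type == type)
            return (int)(hdr.nla_len - sizeof(hdr));
        off += NLA_ALIGN_SIM(hdr.nla_len);
    }
    return -1;
}
```

A caller would build the fixed operation structure first, then append attributes (say, a hypothetical key-length attribute holding a uint32_t) with put_attr() before issuing the ioctl().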
Bergmann and others are also concerned about the inclusion of all of the extra code, as well:
Mavrogiannopoulos thinks that the "benefits outweigh the risks" of adding the extra code, likening it to the existing encryption and compression facilities in the kernel. The difference, as Bergmann points out, is that the kernel actually uses those facilities itself, so they must be in the kernel. The additional code being added here is strictly to support user space.
In the patchset introduction, Trmač lists a number of arguments for adding more algorithms to the kernel and providing a user-space API, most of which boil down to various government specifications that require a separation between the crypto provider and user. The intent is to keep the key material separate from the—presumably more vulnerable—user-space programs, but there are other ways to do that, including having a root daemon that offers the needed functionality, as noted in the introduction. There is a worry that the overhead of doing it that way would be too high: "this would be slow due to context switches, scheduler mismatching and all the IPC overhead". However, no numbers have yet been offered to show how much overhead is added.
There are a number of interesting capabilities embodied in the API, in particular for handling keys. A master AES key can be set for the subsystem by a suitably privileged program which will then be used to encrypt and wrap keys before they are handed off to user space. None of the key handling is persistent across reboots, so user space will have to store any keys that get generated for it. Using the master key allows that, without giving user space access to anything other than an encrypted blob.
All of the expected operations are available through the interface: encrypt, decrypt, sign, and verify. Each is accessible from a session that gets initiated by an NCRIO_SESSION_INIT ioctl(), followed by zero or more NCRIO_SESSION_UPDATE calls, and ending with a NCRIO_SESSION_FINAL. For one-shot operations, there is also a NCRIO_SESSION_ONCE call that handles all three of those operations in one call.
While it seems to be a well thought-out interface, with room for expansion to handle unforeseen algorithms with different requirements, it's also very complex. Other than the separation of keys and faster encryption for embedded devices, it doesn't offer that much for desktop or server users, and it adds an immense amount of code and the associated maintenance burden. In its current form, it's hard to see /dev/crypto making its way into the mainline, but some of the ideas it implements might—particularly if they are better integrated with existing kernel facilities like the keyring.
Back in July, Gleb Natapov submitted a patch changing the way paging is handled in KVM-virtualized guests. Included in the patch was the collection of a couple of new statistics on page faults handled in each virtual CPU. More than one month later (virtualization does make things slower), Avi Kivity reviewed the patch; one of his suggestions was:
Nobody questioned this particular bit of advice. Perhaps that's because virtualization seems boring to a lot of developers. But it is also indicative of a wider trend.
That trend is, of course, the migration of much kernel data collection and processing to the "perf events" subsystem. It has only been one year since perf showed up in a released kernel, but it has seen sustained development and growth since then. Some developers have been known to suggest that, eventually, the core kernel will be an obscure bit of code that must be kept around in order to make perf run.
Moving statistics collection to tracepoints brings some obvious advantages. If nobody is paying attention to the statistics, no data is collected and the overhead is nearly zero. When individual events can be captured, their correlation with other events can be investigated, timing can be analyzed, associated data can be captured, etc. So it makes some sense to export the actual events instead of boiling them down to a small set of numbers.
The down side of using tracepoints to replace counters is that it is no longer possible to query statistics maintained over the lifetime of the system. As Matt Mackall observed over a year ago:
Most often, your editor would surmise, administrators and developers are looking for changes in counters and do not need to integrate from time=0. There are times, though, when that information can be useful to have. One could come close by enabling the tracepoints of interest during the bootstrap process and continuously collecting the events, but that can be expensive, especially for high-frequency events.
There is another important issue which has been raised in the past and which has never really been resolved. Tracepoints are generally seen as debugging aids used mainly by kernel developers. They are often tied into low-level kernel implementation details; changes to the code can often force changes to nearby tracepoints, or make them entirely obsolete. Tracepoints, in other words, are likely to be nearly as volatile as the kernel that they are instrumenting. The kernel changes rapidly, so it stands to reason that the tracepoints would change rapidly as well.
Needless to say, changing tracepoints will create problems for any user-space utilities which make use of those tracepoints. Thus far, kernel developers have not encouraged widespread use of tracepoints; the kernel still does not have that many of them, and, as noted above, they are mainly debugging tools. If tracepoints are made into a replacement for kernel statistics, though, then the number of user-space tools using tracepoints can only increase. And that will lead to resistance to patches which change those tracepoints and break the tools.
In other words, tracepoints are becoming part of the user-space ABI. Despite the fact that concerns about the ABI status of tracepoints have been raised in the past, this change seems to be coming in through the back door with no real planning. As Linus has pointed out in the past, the fact that nobody has designated tracepoints as part of the official ABI or documented them does not really change things. Once an interface has been exposed to user space and come into wider use, it is part of the ABI regardless of the developers' intentions. If user-space tools use tracepoints, kernel developers will have to support those tracepoints indefinitely into the future.
Past discussions have included suggestions for ways to mark tracepoints which are intended to be stable, but no conclusions have resulted. So the situation remains murky. It may well be that things will stay that way until some future kernel change breaks somebody's tools. Then the kernel community will be forced to choose between restoring compatibility for the broken tracepoints or overtly changing its longstanding promise not to break the user-space ABI (too often). It might be better to figure things out before they get to that point.
Patches and updates
Core kernel code
Filesystems and block I/O
Virtualization and containers
Page editor: Jonathan Corbet
Attendees at LinuxCon 2010 were lucky enough to have not just one, but two presentations devoted to boot speed. The first was "How We Made Ubuntu Faster", by Upstart creator Scott James Remnant; the other was "Improving Android Boot-Up Time", by Tim Bird of Sony. As expected, Scott's talk was centered around netbooks running Ubuntu, while Tim focused on different development boards running Android. Nevertheless, there were some commonalities between both projects.
No discussion of boot up speed would be complete without mentioning the 5 second boot achieved by Arjan de Ven and Auke Kok of Intel's Open Source Technology Center. In fact, a number of things from Scott's session assumed a knowledge of that effort by Intel.
Good metrics are pivotal for improving boot time, and to get good metrics one must standardize the variables. The hardest of these is the machine, because everyone has different computers with components that are slower or faster than others. The Ubuntu team realized they would have to buy a whole bunch of "standard" computers. They chose the Dell Inspiron Mini 10 netbook, dubbed the "touchpad from hell" by Scott because it was hard to use without the pointer jumping around. The machine met the key requirement of being available in both SSD and rotational-media configurations, and was cheap enough to keep the project under budget.
The next important piece is to have a goal in mind. They chose 10 seconds, by "doubling the numbers that Arjan came up with". The kernel and initramfs get a total of two seconds. The same is allocated for platform initialization, such as init scripts. The X server gets another two seconds, and the desktop environment, Gnome, gets four. It turns out these numbers weren't accurate predictors in the long run, but in some cases, such as the kernel, the team was able to beat its deadline.
In order to create an automated system to measure the changes over time, the team threw together a pretty elaborate configuration where the system would reinstall the latest nightly builds, and then profile the resulting boot automatically. They compiled all the results and put them on Scott's people page.
One of the big portions of the Moblin kernel improvement was the early use of asynchronous kernel threads. They improved boot time by initializing the SATA controller, to handle storage, at the same time as the USB host adapters. Canonical built upon this work by moving populate_rootfs(), the function responsible for unpacking the initramfs, to yet another asynchronous thread.
Though Intel claimed a speed boost from statically compiling modules into the kernel, the Canonical team had to be able to support more than just Intel netbooks. Instead, they cleaned up some of the slower parts of the init scripts, for example replacing a 10-millisecond poll of the blkid binary with an event-based call to libudev. In the end the team was able to beat its two-second target, even with the requirement to use an initramfs.
Scott took some time here to plug Upstart. Though Intel ultimately settled on a hand-tuned invocation of the old System V init daemon to improve boot, Scott insisted that an event-based system is better than "thousands of lines of shell script". That is even more true today, when pretty much every system on the market has more than one CPU.
The Gnome environment took quite a bit of time to boot as well. Ubuntu uses Compiz by default, and it consumed almost half of the time allocated for the desktop environment. The audience asked if Compiz could be eliminated, but too many Ubuntu features depend on its inclusion. Other large offenders were gnome-panel and Nautilus. Altogether these components added up to a ten-second Gnome startup, more than double the four-second allotment.
Their research revealed that storage is the ultimate bottleneck; "hard drives suck, but SSDs suck too" was the specific wording. To improve the situation, Scott used the well-known readahead tool. Initially developed at Red Hat, readahead logs the filename for every open() and execve() call during the first 60 seconds of boot. On the next boot, a readahead process is spawned early that pulls all of the files in that list into the page cache, ensuring that later reads are simple memory accesses.
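The replay half of that scheme is simple enough to sketch. The following is a toy illustration in Python, not the actual readahead implementation; the log format and file names are hypothetical:

```python
import os

def preload(list_path):
    """Warm the page cache with every file named in a boot-time access log.

    A toy version of readahead's replay phase: the real tool records the
    files touched by open() and execve() during one boot, then reads them
    all back early in the next boot so later accesses hit memory.
    """
    warmed = 0
    with open(list_path) as log:
        for line in log:
            path = line.strip()
            if not path or not os.path.isfile(path):
                continue
            fd = os.open(path, os.O_RDONLY)
            try:
                size = os.fstat(fd).st_size
                if hasattr(os, "posix_fadvise"):
                    # Ask the kernel to pull the whole file into the page
                    # cache without copying it to user space.
                    os.posix_fadvise(fd, 0, size, os.POSIX_FADV_WILLNEED)
                else:
                    # Portable fallback: reading the file caches it too.
                    os.read(fd, size)
            finally:
                os.close(fd)
            warmed += 1
    return warmed
```

The real tool gains far more than this sketch suggests by running before anything else needs the disk, so the reads overlap with CPU-bound initialization.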
Intel improved on Red Hat's readahead with super-readahead, or sreadahead. It does the same thing, but was modified to preload only the blocks that will actually be read, instead of entire files. Since a MeeGo system is assumed to be running on an SSD, where seek time is negligible, the order of blocks on disk is not taken into account. Using an SSD, the Ubuntu system can read all of the blocks necessary for boot in three seconds.
However, Ubuntu has to run on rotating media as well, so yet another iteration called über-readahead was created by Scott. The daemon was modified so that it reads blocks in disk order when using a rotating hard drive. The graph of this optimization shows a few random metadata reads, followed by a smooth linear path across the platter. On rotational media, all pages necessary can be read into the page cache in fewer than seven seconds. Scott said that things could go even faster if the initial metadata reads could be sorted and done in order prior to performing readahead on the file contents. A few filesystem patches to that end were sent to LKML, but inclusion does not seem likely at this point.
Scott concluded the discussion by admitting that they didn't achieve their goal: the desktop environment portion alone took around ten seconds, precluding a sub-ten-second overall boot. Note that this is on a Dell netbook, so the numbers will likely be better on systems with beefier processors and I/O subsystems, which includes almost all desktops and traditional laptops; indeed, the presentation abstract states that some machines boot Ubuntu in as little as five seconds. The good news is that the kernel now takes under two seconds to initialize, even with the initramfs requirement, and Scott did a lot of useful work that can be picked up by the larger community. Only time will tell whether other distributions take advantage of it.
Tim ran into a unique set of problems with Android handsets. Whereas Scott's problems were already well known in the much wider desktop Linux community, Tim is working with a suite of tools with names like Dalvik and Zygote, whose source code has rarely been modified outside of Google. As such, Tim's focus was on getting an initial performance profile to find which part of the boot process would yield the largest reduction in time, and thus should get the most developer effort.
He profiled three different platforms: the HTC-built ADP1 and Nexus One, and an OMAP3 EVM board from Mistral Solutions. The overall boot times for these machines were 57, 36, and 62 seconds, respectively. Though these are all ostensibly development machines, those numbers still seem huge compared to the netbook boot times, but it should be mentioned that these boards have much slower processors and storage. By the same token, you can do a lot more with a fully functional Gnome desktop than with a smartphone. Tim pointed out that "it's really sad that you can use a stopwatch to accurately measure a phone booting up".
The boot chart for the EVM board revealed a number of areas for improvement. Android uses a rewritten, Java-esque virtual machine called Dalvik in all of its phones. For an optimal user experience, all of the classes must be preloaded before the phone is used. Zygote, the utility responsible for doing this work, spends about 21 seconds in I/O wait. The application classes don't have to be preloaded; one can choose to load them on demand, but that just pushes the problem back and causes longer application load times. Worse, there is a memory penalty, because each class then has to be loaded in a different heap, so the memory for identical classes can't be shared.
A potential solution is to figure out how to load every class into Zygote's heap, giving something akin to shared libraries in a conventional OS. Another possibility is to make Zygote threaded, with one thread using the CPU while another reads from storage. A more far-out possibility avoids parsing the classes at all and loads the heap as a binary blob, though this would take the most development effort and would require a rebuild whenever new classes are installed.
The other potential speed gain lies in the package scanning tool. The purpose of this tool wasn't exactly obvious to Tim, but he illustrated its complexity by showing the call tree. At the end of it all is the parseZipArchive() function, which is called 138 times. There is some low-hanging fruit there; for example, Tim shaved off a few seconds by commenting out a sanity check of the zip file headers. Just above that is a ZipFileRO::open() call, which mmap()s the zip file into memory. The problem is that parseZipArchive() walks the mapped region and builds a hash table to make subsequent accesses easier, causing page faults for the entire archive. All of this is done just to extract one file, AndroidManifest.xml, so the time and memory spent faulting in all those pages and building the hash table is essentially wasted.
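For contrast, here is a sketch of what extracting a single member should cost, using Python's zipfile module as a stand-in for the Android code: the reader seeks to the central directory at the end of the archive and then reads only the requested member's data, rather than faulting in and hashing every entry.

```python
import zipfile

def read_manifest(apk_path, member="AndroidManifest.xml"):
    """Extract a single member from a zip archive.

    zipfile parses the central directory (a small region at the end of
    the file) to locate the member, then reads just that member's bytes;
    the bulk of the archive is never touched.
    """
    with zipfile.ZipFile(apk_path) as zf:
        return zf.read(member)
```

This is an illustration of the access pattern, not of the actual Android fix, which would have to be made in the C++ package scanner itself.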
There is an emerging consensus within the Android development community that a lot of time could be shaved off by using readahead. Tim, though, thought that it would mask the underlying problem: some of those blocks shouldn't be read at all, much less used to populate the page cache. Scott, who was in the audience for this discussion, noted that readahead isn't really about "papering over problems"; it is about keeping the CPU busy while populating the page cache. Tim still felt that readahead would make the problems with the code less noticeable, so developers wouldn't be as motivated to fix them. Both agreed that, when this ships on a device to consumers, it should have readahead enabled.
Unfortunately, there have been no significant speedups in boot time yet, but there is still work to do. Interested readers are encouraged to sign up for the Android mailing lists and check out the eLinux wiki.
Though Android and Dalvik are a departure from the traditional GNU user space that Ubuntu uses, they do have some commonalities. First, because the kernel is not affected by user-space differences, kernel improvements will be available to any and all Linux devices. Tim didn't cover the kernel because there are already many well-known techniques for booting it faster, so it was outside the scope of his talk. Presumably, the techniques covered in the Ubuntu presentation would also help an Android system boot more quickly.
Some improvements in user space carry over as well. Readahead is a generic enough technique that it can be included in pretty much any environment. Similarly, profiling tools like bootchart and ftrace can be run in both environments. However, the generic GNU user space has the advantage of more code sharing and reuse than third-party environments like Android. Improvements to X server startup, for example, will be felt across Ubuntu, MeeGo, and the other desktop Linux distributions; that isn't the case for Android.
Even so, the developer community for Android is growing and Tim is evidence of that. The problem of slow Android boots has probably not been thought about much outside of Google's office walls, but that is changing. The potential for improvement is there, especially in Android-specific places like the package scanner and Zygote. For desktop distributions and even specialized distributions like MeeGo, the fast boot story may be largely coming to an end. For Android, it's only just beginning.
[ Bootcharts, graph, and photo courtesy of Scott James Remnant of Canonical and Tim Bird of Sony. ]
New Releases

The Fedora project is asking for testers for the Fedora 14 Alpha: "We need your help to make Fedora 14 the best release yet, so please take a moment of your time to download and try out the Alpha and make sure the things that are important to you are working. If you find a bug, please report it -- every bug you uncover is a chance to improve the experience for millions of Fedora users worldwide."

The Nexenta project has described its plans for the next Nexenta Core Platform release: "The move to NCP 4.0 will be in 2 phases. The first immediate change would be to move from OpenSolaris b134 to a recent Illumos build. With this the Nexenta project will change it's base from OpenSolaris to Illumos."
Debian GNU/Linux

The Debian Women project is recruiting women to participate in Debian, whether that be as packagers, bug reporters, technical documentation writers, bug fixers, translators, artists, or any other area that helps the development of Debian. "There have been at least 38 women that have contributed in packaging software for Debian, and there are currently 11 female Debian Developers and 1 Debian Maintainer. We'd like to raise those numbers to 50 packagers by the end of 2011, and 20 Debian Developers by the end of 2012."
Red Hat Enterprise Linux

Red Hat has announced Red Hat Enterprise Linux Extended Life Cycle Support (ELS), which allows customers to continue use of Red Hat Enterprise Linux (RHEL) major releases, such as RHEL 3, beyond the regular 7-year life cycle. "Available as an add-on option, Extended Life Cycle Support complements the customers in-place Red Hat Enterprise Linux subscription and is available in single-year subscriptions that allow customers to extend the total use of given major releases by extending the overall supported life cycle from 7 years up to a total of 10 years. This new offering requires the customer to have an existing RHEL subscription with equivalent subscription terms and support level."
Ubuntu family

Allison Randal has announced that she has taken on a new role as the Technical Architect for the Ubuntu distribution. "Right at the start, I should make it clear that I am not the SABDFL. I'm here to help turn his vision into reality. That's what architects do, translate between the potential for a building and carefully measured graphite on paper, then act as a resource for the whole crew as they work together to translate an abstract plan into hard steel, warm brick, and shining glass. I'm here to champion the community's vision for Ubuntu, to facilitate conversations as we integrate multiple perspectives and balance multiple needs, to ask good questions that help us find better solutions. I'm here to help smooth some of the bumps in the road, because no road worth traveling is ever completely easy."

Separately, the Ubuntu kernel team is dropping the ia64 and sparc architectures from the Maverick kernel: "No one has stepped forward to take ownership. The Tech Board has subsequently voted (and approved) to decommission support for both these ports archs. The following set of patches drop support for ia64 and sparc from our Maverick kernel."
Newsletters and articles of interest
Page editor: Rebecca Sobol
Vim development doesn't move quickly, but the popular vi-replacement text editor continues to evolve. Maintainer Bram Moolenaar released Vim 7.3 on August 15th with support for new language interfaces and the ability to retain undo information between sessions.
The 7.3 release comes about two years after 7.2 and is primarily a maintenance and bugfix release, but it does include a few notable new features and changes. Support for GTK+ 1.x has been removed in Vim 7.3 in favor of GTK+ 2.x, which shouldn't pose a problem for any users on modern Linux distributions. For the full list of changes from 7.2, users can run ":help version-7.3" from within the editor.
The most interesting feature in Vim 7.3, at least for most users, is the addition of persistent undo. Prior to 7.3, the undo history for a file was lost when exiting Vim or unloading a buffer. Vim 7.3 adds the 'undofile' option, which allows you to save the undo history and restore it when reopening a file. Since it's possible that a file was changed in another editor or by another process, Vim saves a hash of the file and compares it when reopening the file. If the file being edited has changed, the undo history from the previous session is discarded to avoid problems.
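Enabling the feature takes a couple of lines in ~/.vimrc; the undo directory below is an arbitrary choice and must be created by hand:

```vim
" keep undo history across editing sessions (new in Vim 7.3)
set undofile
" store all undo files in one directory instead of alongside each file
set undodir=~/.vim/undo
```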
The Vim 7.x series has added a number of features that make it easier to undo changes and revert to prior versions of a file. The 7.0 release introduced the :earlier and :later operations, which can move through the file's history by time rather than stepping through and undoing operations one by one. For example, it's possible to revert to a buffer's state as it was an hour or a day ago using :earlier 1h or :earlier 1d; using :later 1h or :later 1d would restore the buffer.
Vim 7.3 builds on this by adding a file-write option in addition to the time-based ones. Using :earlier 1f moves back one file write, :earlier 2f back two writes, and so on, while :later 1f and friends restore the buffer to later file writes (if any).
Python 3 and Lua support is now available in Vim 7.3, so users can use Python 3 or Lua within Vim to create macros. This is similar to the way that Emacs uses Lisp, but users can choose to compile in support for Python, Lua, Perl, Ruby, and others. Vim has supported Python 2 for some time, but Vim 7.3 adds the Python 3 interface as an option. Vim can have both Python 2 and 3 compiled in, but only one version can be active at a time.
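If Vim was built with the +python3 feature, Python 3 can be called directly from the ex command line; a trivial (hypothetical) session might look like:

```vim
" check whether this Vim has the Python 3 interface compiled in
:echo has('python3')
" run a line of Python 3; output appears in the message area
:py3 print("hello from Python 3")
```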
Vim has supported file encryption for some time: files can be encrypted using the :X command, after which a password is required to decrypt the file on opening. (Note that the swap file is not encrypted during the editing session.) Earlier versions of Vim used a weak encryption scheme based on PkZip; 7.3 adds Blowfish for stronger encryption. The weak method is still used by default, but that can be overridden with the 'cryptmethod' option. To set the option during a Vim session, you'd run ":set cryptmethod=blowfish".
For users who haven't looked at Vim in a while, the Vim 7 series brings quite a few new features to Vim and helps Vim stand out as far more than just a simple vi clone.
Many users prefer Vim for programming, but those who turn to a text editor for prose will be happy to know that Vim 7 sports a spell checker. Using it, Vim will not only highlight misspelled words, but can also suggest replacements.
It seems that all programs eventually trend towards a tabbed interface, and Vim is no exception. With the 7.0 release, Vim introduced a tabbing feature that allows users to open each buffer in its own tab (if they so choose). This is available in the standard text-mode version of Vim as well as GUI Vim (GVim). Vim also supports a command called :tabdo to allow users to execute commands throughout all buffers that are open in tabs, not just the current buffer.
Vim 7 also introduced an internal version of grep. Previous versions of Vim could use an external grep to search files on disk, but that was a problem for Windows versions of Vim, and different systems come with different grep implementations. In Vim 7, users can use :vimgrep to search through files for a pattern, then the :copen command to see the files that match (if any) and edit them.
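A typical (hypothetical) session might look like:

```vim
" search every C file below the current directory for the pattern
:vimgrep /TODO/ **/*.c
" open the quickfix window listing the matches; Enter jumps to one
:copen
```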
It's also worth mentioning that Vim's license is unique. Moolenaar distributes Vim under a "charityware" license that allows using and copying Vim as much as one likes, but encourages donations to needy children in Uganda via ICCF Holland. The full text of the license is available in Vim's documentation.
The recommended method for getting Vim sources is via the Mercurial repository, but source tarballs are also available via FTP. Vim 7.3 compiles without any problems on Ubuntu 10.04 after installing the needed dependencies.
Overall, Vim 7 was a major leap over Vim 6, with a lot of miscellaneous new features and improvements. If you're a heavy Vim user, the 7.3 release is worth the time to download and compile just for the persistent undo feature. It's also interesting if you use Python 3 or Lua; otherwise, it's fairly light on new features. If for some reason you're still on a version of Vim prior to 7.0, now would be a good time to update.
Newsletters and articles
Page editor: Jonathan Corbet
Non-Commercial announcements

The OpenSolaris Governing Board (OGB) has resigned. From the resolution: "[...] Whereas, without the continued support and participation of Oracle in the open development of OpenSolaris, the OGB and the community Sun/Oracle created to support the open Solaris development partnership have no meaning, and [...]"
Articles of interest

The Electronic Frontier Foundation has issued a release regarding a new patent application from Apple involving some interesting antifeatures. "Essentially, Apple's patent provides for a device to investigate a user's identity, ostensibly to determine if and when that user is 'unauthorized,' or, in other words, stolen. More specifically, the technology would allow Apple to record the voice of the device's user, take a photo of the device's user's current location or even detect and record the heartbeat of the device's user. Once an unauthorized user is identified, Apple could wipe the device and remotely store the user's 'sensitive data.'" The patent claims list jailbreaking explicitly as an "unauthorized" use.

Groklaw examines HTC's response to Apple's anti-Android patent infringement lawsuit. Naturally enough, they deny infringing anything and have launched a counterattack against the validity of several of the patents at stake. "You know how in the movies when two guys get into a fight on the street, another guy will run into a bar and yell, Fight! and everyone runs outside to watch? I feel like that guy reading this filing, because I see HTC intends to fight back."

Ruth Suehle has posted a writeup of Eben Moglen's LinuxCon keynote, with extensive quotes from the talk, including: "It's crucial to the system to keep imagination in the businesses thinking of new ways to compete using free software. Nobody can get complacent. If we become complacent, we will fail the system of coexistence. We will fail the promise we have among ourselves that the people making freedom can also help the people who love business."

A ZDNet article claims that the openness of Android is handing control back to carriers and handset manufacturers. "As a result, we now have a situation where the U.S. telecoms are reconsolidating their power and putting customers at a disadvantage. And, their empowering factor is Android. The carriers and handset makers can do anything they want with it. Unfortunately, that now includes loading lots of their own crapware onto these Android devices, using marketing schemes that confuse buyers (see the Samsung Galaxy S), and nickle-and-diming customers with added fees to run certain apps such as tethering, GPS navigation, and mobile video."

An interview with Jos Poortvliet covers his work on KDE and his new job as the openSUSE community manager. "Jos Poortvliet: I am, or rather was, the KDE marketing team lead. Since I have moved to Novell as their new openSUSE community manager, I will become less involved in KDE marketing as I move on to work in openSUSE. This will also mean I will have a closer relationship with GNOME. That is exciting - they have a very different approach to things but are also working more and more with the KDE community. I hope to foster that collaboration."

It has been reported that Google has made voice and video chat available for Linux. "Google's web-based voice and video chat service, a long-time staple for users of Windows and the Macintosh operating systems, now is available to Linux users. The free plug-in supports Ubuntu and other Debian-based Linux distributions. RPM support will be coming soon, said Tristan Schmelcher, software engineer, in a company blog."

Another article takes a look at a way to run Linux on your PS3 without voiding the warranty. "According to PSX Scene a bunch of open source hardware hackers have released a dongle called PS Jailbreak that will turn the PS3 back into a Linux machine. The FAT32 file system is currently supported and they are working on NTFS. It works on current firmware but the dongle is fully updatable. It also allows ordinary online game play."
Upcoming Events

The Ubuntu Developer Summit is "one of the most important events in the Ubuntu calendar and at it we discuss, debate and design the next version of Ubuntu. We bring together the entire Canonical development team and sponsor a large number of community members across the wide range of areas in which people contribute to Ubuntu."
- OOoCon 2010 (Budapest, Hungary)
- Free and Open Source Software for Geospatial Conference (Barcelona, Spain)
- DjangoCon US 2010 (Portland, OR, USA)
- CouchCamp: CouchDB summer camp (Petaluma, CA, USA)
- Ohio Linux Fest (Columbus, Ohio, USA)
- September 11: Open Tech 2010 (London, UK)
- Open Source Singapore Pacific-Asia Conference (Sydney, Australia)
- X Developers' Summit (Toulouse, France)
- 3rd International Conference FOSS Sea 2010 (Odessa, Ukraine)
- Italian Debian/Ubuntu Community Conference 2010 (Perugia, Italy)
- WordCamp Portland (Portland, OR, USA)
- September 18: Software Freedom Day 2010 (Everywhere)
- September 23: Open Hardware Summit (New York, NY, USA)
- BruCON Security Conference 2010 (Brussels, Belgium)
- PyCon India 2010 (Bangalore, India)
- Japan Linux Symposium (Tokyo, Japan)
- Workshop on Self-sustaining Systems (Tokyo, Japan)
- September 29: 3rd Firebird Conference - Moscow (Moscow, Russia)
- Open World Forum (Paris, France)
- Open Video Conference (New York, NY, USA)
- October 1: Firebird Day Paris - La Cinémathèque Française (Paris, France)
- Foundations of Open Media Software 2010 (New York, NY, USA)
- IRILL days - where FOSS developers, researchers, and communities meet (Paris, France)
- Utah Open Source Conference (Salt Lake City, UT, USA)
- Free Culture Research Conference (Berlin, Germany)
- 17th Annual Tcl/Tk Conference (Chicago/Oakbrook Terrace, IL, USA)
- Linux Foundation End User Summit (Jersey City, NJ, USA)
- October 12: Eclipse Government Day (Reston, VA, USA)
- October 16: FLOSS UK Unconference Autumn 2010 (Birmingham, UK)
- October 16: Central PA Open Source Conference (Harrisburg, PA, USA)
- 7th Netfilter Workshop (Seville, Spain)
- Pacific Northwest Software Quality Conference (Portland, OR, USA)
- Open Source in Mobile World (London, UK)
- openSUSE Conference 2010 (Nuremberg, Germany)
- OLPC Community Summit (San Francisco, CA, USA)
- GitTogether '10 (Mountain View, CA, USA)
- Real Time Linux Workshop (Nairobi, Kenya)
- GCC & GNU Toolchain Developers Summit (Ottawa, Ontario, Canada)
- Ubuntu Developer Summit (Orlando, Florida, USA)
- October 26: GStreamer Conference 2010 (Cambridge, UK)
- October 27: Open Source Health Informatics Conference (London, UK)
- Hack.lu 2010 (Parc Hotel Alvisse, Luxembourg)
- Embedded Linux Conference Europe 2010 (Cambridge, UK)
- Government Open Source Conference 2010 (Portland, OR, USA)
- European Conference on Computer Network Defense (Berlin, Germany)
- Free Software Open Source Symposium (Toronto, Canada)
- Debian MiniConf Paris 2010 (Paris, France)
If your event does not appear here, please tell us about it.
Audio and Video programs

Recordings are available from Flumotion, in WebM format.
Page editor: Rebecca Sobol
Copyright © 2010, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds