LWN.net Weekly Edition for April 15, 2010

Dispatches from the compiler front

By Jonathan Corbet
April 14, 2010
Your editor has recently noticed a string of interesting announcements and discussions in the GCC and LLVM compiler communities. Here is an attempt to pull together a look at a few of these discussions, including resistance to cooperation between the two projects, building an assembler into LLVM, and more.

The up-and-coming LLVM compiler has been an irritation to some GCC developers for some time; LLVM apparently comes off as an upstart trying to muscle into territory which GCC has owned for a long time. So it's not surprising that occasionally the relationship between the two projects gets a little frosty.

Consider the case of DragonEgg, a GCC plugin which replaces the bulk of GCC's optimization and code-generation system with the LLVM implementation. DragonEgg is clearly a useful tool for LLVM developers, who can focus on improving the backend code while making use of GCC's well-developed front ends. Jack Howarth recently proposed the addition of DragonEgg as an official part of the GCC code base. Some developers welcomed the idea; Basile Starynkevitch, for example, thought it would make a good plugin example. But from others came complaints like this:

So, no offense, but the suggestion here is to make this subversive (for FSF GCC) plugin part of FSF GCC? What is the benefit of this for GCC? I don't see any. I just see a plugin trying to piggy-back on the hard work of GCC front-end developers and negating the efforts of those working on the middle ends and back ends.

It's not clear that this is a majority opinion; some GCC developers see DragonEgg as an easy way to try out LLVM code and compare it against their own. If LLVM comes out on top, GCC developers can then figure out why or, possibly, just adopt the relevant LLVM code. Those developers see only benefit in some cooperative competition between the projects.

Others, though, see the situation as more of a zero-sum game; when viewed through that lens, cooperation with LLVM would appear to make little sense. But free software is not a zero-sum game; the more we can learn from each other, the better off we all are. GCC need not worry about being displaced by LLVM (or anything else) any time in the near future. Barring technical issues with the merging of DragonEgg (and none have been mentioned), accepting the code seems like it should be ultimately beneficial to the project.

In a side discussion, GCC developers wondered why LLVM seems to be more successful in attracting developers and mindshare in general. One suggestion was that LLVM has a clear leader who is able to set the direction of the project, while GCC is more scattered. Others have a different view; in this context, Ian Lance Taylor's notes are worth a look:

What I do see is that relatively few gcc developers take the time to reach out to new people and help them become part of the community. I also see a lot of external patches not reviewed, and I see a lot of back-and-forth about patches which is simply confusing and offputting to those trying to contribute. Joining the gcc community requires a lot of self-motivation, or it takes being paid enough to get over the obstacles.

There is also the matter of the old code base, the lack of a clean separation between passes, and, most important, weak internal documentation.

Some of these issues are being fixed; others will take longer. It seems clear that attending to these problems is important for the long-term future of the project.

Lest things look too grim, though, it's worth perusing this posting from Taras Glek on his success with the GCC "profile-guided optimization" (PGO) feature. PGO works by instrumenting the binary, running it to collect profile information, then rebuilding the program with optimization driven by that information. With Firefox, Taras was able to cut the startup time by one third and to reduce initial memory use considerably as well. Taras says:

I think the numbers speak for themselves. Isn't it scary how wasteful binaries are by default? It amazes me that Firefox can shrug off a significant amount of resource bloat without changing a single line of code.
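
For readers who have not tried PGO, the workflow amounts to three steps: build with instrumentation, run a representative workload so that profile data gets written, then rebuild using that data. What follows is only a minimal sketch of those steps as command lines, not Taras's actual Firefox build setup; the file names (app.c, app) and the workload argument are hypothetical placeholders, while -fprofile-generate and -fprofile-use are the GCC options involved.

```python
import shlex


def pgo_steps(source="app.c", binary="app", workload_args=("--typical-run",)):
    """Return the three command lines of a GCC profile-guided build.

    The source name, binary name, and workload arguments are hypothetical
    placeholders; only the two -fprofile-* options are real GCC flags.
    """
    # Step 1: build an instrumented binary that records execution counts.
    instrumented = ["gcc", "-O2", "-fprofile-generate", source, "-o", binary]
    # Step 2: run a representative workload; this writes profile data files.
    training_run = ["./" + binary, *workload_args]
    # Step 3: rebuild, letting the optimizer consult the recorded profile.
    optimized = ["gcc", "-O2", "-fprofile-use", source, "-o", binary]
    return [instrumented, training_run, optimized]


if __name__ == "__main__":
    # Print the commands rather than running them, so the sketch is safe
    # to execute on a machine without GCC or the hypothetical sources.
    for step in pgo_steps():
        print(shlex.join(step))
```

The quality of the result depends heavily on the second step: the profile only helps if the training run resembles what users actually do, which is why a startup-focused run suits a startup-time goal.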

There's no shortage of interesting, development-oriented tools being integrated into GCC, and the addition of the plugin architecture can only result in an acceleration of this process. Things have reached a point where more projects should probably be looking into the use of these tools to improve the experience for their users.

Meanwhile, on the LLVM side, the developers have recently unveiled the LLVM MC project. "MC" stands for "machine code" in this context; in short, the LLVM developers are trying to integrate the assembler directly into the compiler. There are a number of reasons for doing this, including performance (formatting text for a separate assembler and running that assembler are expensive operations), portability (not all target systems have an assembler out of the box), and the ability to easily add support for new processor instructions. Much of this functionality is required anyway for LLVM's just-in-time compiler features, so it makes sense to just finish the job.

This work appears to be fairly well advanced, with much of the basic functionality in place. Chris Lattner says:

If you're interested in this level of the tool chain, this is a great area to get involved in, because there are lots of small to mid-sized projects just waiting to be tackled. I believe that the long term impact of this work is huge: it allows building new interesting CPU-level tools and it means that we can add new instructions to one .td file instead of having to add them to the compiler, the assembler, the disassembler, etc.

In summary: there is currently a lot going on in the area of development toolchains. Given that all of us - including those who do no development - depend on those toolchains, this can only be a good thing. Computers can do a lot to make the task of programming them easier and more robust; despite the occasional glitch, developers for both GCC and LLVM appear to be working hard to realize that potential.

ELC: Android and the community

By Jake Edge
April 14, 2010

Greg Kroah-Hartman delivered some "tough love" to Android in his keynote at this year's Embedded Linux Conference (ELC). He is very clearly excited about Android and what it can do (he uses it daily as his regular phone) but is unhappy with Google's lack of community engagement. There is hope that things will change, he said; there has been a fair amount of "introspection" at Google that he hopes will lead it in a more community-oriented direction.

CE Linux Forum architecture group chair and ELC organizer Tim Bird introduced Kroah-Hartman by noting how many kernel subsystems he maintains: eight. The "amount of work that Greg does in the Linux kernel is really hard to comprehend", Bird said. In addition, he has a knack for "involving people in the community". Kroah-Hartman is "omnipresent" in kernel circles and "when he sees a problem, he just comes up with a solution for it". It would seem that the keynote was targeted at solving the "Android problem".

As part of his disclaimer, Kroah-Hartman noted that he comes at the problem as an experienced kernel developer who also has a background in the embedded space. While he works on MeeGo for Novell as his day job, "they don't even realize I'm here giving this talk". He also noted that he is only looking at the Android kernel, as the user space is "weird" and won't be part of his talk.

Five years ago, phone manufacturers approached the kernel developers because they wanted to use Linux on phones. They needed a stable kernel and X server, along with a free Java. The latter was the sticking point because, at that time, there was no free Java—and it isn't something that the kernel hackers could provide. But Android changed that, providing the third piece that the phone makers need.

He pointed out that all of the complaints he was about to make about Android could be fixed "tomorrow" and that the Android user-space applications wouldn't have to change at all. But it's not something that the kernel community can do alone; Google needs to be involved to change the libraries that make up the platform.

Right moves

There were several things that Google did right, Kroah-Hartman said, starting with its choice of Linux. In an aside, he noted that all phone manufacturers bring up their phones using Linux, including Apple with the iPhone; "a little-known fact". He also lauded Google for following the kernel license, which is something that Palm didn't initially do with WebOS, he said. He pointed to android.git.kernel.org as a "wonderful site" that contains all of the Android code in easily accessible Git repositories. But "that's all the good".

Wrong moves

The list of things that Google got "wrong" starts with android.git.kernel.org because it is so disorganized and chaotic. There are six different kernel trees available there with 33 separate branches. There are three branches of 2.6.34, which is "not even released yet". The trees go back to 2.6.25 and some of the older trees are still being updated. Some of the branches are for different hardware, but some aren't.

There are also things like an "old stale Linus tree" in the collection, along with two standalone drivers in a separate repository. One of those drivers has 13 branches, "not all for just one kernel version". There is also an empty repository—with four branches. "It's a mess", he said, and Google is "burying the code in public". That's a common thing for companies to do; "they are doing it well, and it's crap".

He looked at a branch based on 2.6.34-rc2, noting that there were 283 files changed, with 47,715 lines added and 363 lines removed. Half of that was driver code, 30% for filesystems, 15% for architecture-specific code, and the "really scary one" was the 5% that were changes to core kernel code. That "changes the kernel in ways that make it not Linux", he said.

Tons of drivers

Kroah-Hartman then went through a long list of drivers that were in Google's changes, but were not upstream. For most of them, there is "no reason this code needs to be out of the kernel". Some of the changes, like the pmem driver, which the Google developers have publicly called "voodoo", and a logging infrastructure that should be done in user space, are not good candidates for the upstream kernel, but the vast majority is.

The big problem with all of this code is that it implements features and fixes bugs that others in the community need. Google rewrote the USB gadget code because the kernel version couldn't easily support multi-function gadgets in 2007, but never got it into the mainline. So, Samsung had to fix the mainline gadget code. The Android developers also fixed a "bunch of bugs" in the Bluetooth code, which is a "nasty protocol to get right", but they never pushed it upstream.

There are lots of small changes in various locations that make it much harder for those who want to port Android to new hardware, he said. Missing a three-line change in the inotify code will cause some applications to fail. There is also a special Android-specific /proc file type that needs to be ported into any kernel bound for Android. And on and on.

Wakelocks

One of the big problems with much of the Android code is wakelocks, which are an Android-specific power-management feature. There are wakelocks in "every Android driver", but wakelocks are not in the mainline. Kernel developers have tried tearing the wakelocks out of the drivers, but then the code diverges quickly. The right solution is to get wakelocks upstream, which has gotten close to happening, but hasn't.

Wakelocks are a good example of how Google is ignoring the community, Kroah-Hartman said. The wakelock code would be submitted for inclusion, there would be some discussion and then the Android developers would "disappear for six months". When they returned, the process would start over. The solution is to "submit and be persistent", to answer the questions, and participate in the discussion. Wakelocks were "one patch submission away from being accepted", he said, but the developers disappeared again.

Ignoring the community

Google is allowed to ignore the community, but "it's sad". He took the Android drivers into the staging tree and Google said it would help get them into mainline shape, but they didn't. So Kroah-Hartman had to remove them, which means that no one else gets the benefit of those drivers. When the drivers were in the staging tree, Fujitsu was working on fixing bugs in the Android low memory killer to use on their "big supercomputer stuff", but now that work has been dropped.

The community wants the code, but Google thinks that it's unique and doesn't need to work with the community. "Don't think you are unique, you aren't", he said. He was not picking on Google, because "this applies to everyone". Ironically, the Google server folks would like to use some of the changes that are in the Android tree, but they aren't upstream.

It's just a fork

Google's response has been that "it's just a fork, no big deal". Every embedded shop has done a fork and Google is abiding by the license, "so who cares?". But Kroah-Hartman wants to see the Android platform succeed and Google is making its partners' jobs harder by making them depend on extra code that needs to be ported to their hardware. In order for the platform to succeed, Google needs to work with the community. If it wants to do it alone, "that's fine", but "they don't and we don't want them to". He noted that Qualcomm, a company not noted for community engagement, had approached the kernel developers complaining about the Android situation, which is a glaring indicator of a problem.

That was the end of the "fire and brimstone" portion of Kroah-Hartman's talk. Looking to the future, he said there is "hope" (accompanied by a slide of the Android logo in the style of Shepard Fairey's iconic Barack Obama "Hope" poster). There will be a meeting very soon where the kernel hackers and Android developers are going to "lock themselves in a room and hash this all out", he said.

For vendors using Android, Kroah-Hartman recommended that they "push on Google to make these changes", and said that he will be "pushing as hard as I can on the kernel side". He pointed out that it takes less time to get the code upstream than will be spent maintaining it moving forward. Android is "not an end-of-life project", with 50 new handsets announced at the last trade show. "It's not that much work, truly", he said, and there are plenty of consultants that would be willing to help if the Open Handset Alliance wanted to fund this work.

It was noted that no one from Google attended the talk, and an audience member asked how that could be. Kroah-Hartman seemed a bit disappointed that there were no Google representatives, and did say that they had been invited. Perhaps it was the "long drive from the South Bay" that kept them away, he noted wryly. It may also be that memories of his Canonical bashing from the 2008 Linux Plumbers Conference linger; Google folks may not have wanted to sit through something like that.

The Android user space is "divergent from everything you ever thought about Linux user space", which is "great", he said. The idea that you can run something entirely different on top of the kernel makes it very interesting. Google could replace Linux with BSD or something else and all of the applications would still run. But Kroah-Hartman would like to see "Linux succeed for this market" and Google has followed the kernel license as they should; "I want to reward them for that". He has offered to help before and reiterated that offer. It's clear that he is a big fan of the platform, and is just trying to help fix the problems that he sees.

Catching up with Leslie Hawthorn

April 9, 2010

This article was contributed by Joe 'Zonker' Brockmeier.

Few people in the open source community have touched as many projects as Leslie Hawthorn, the now-former open source program manager for Google. As one of fewer than ten employees in Google's open source programs office, Hawthorn was at the center of the Google Summer of Code, a program that has worked with hundreds of projects and thousands of college students since its inception in 2005. When Hawthorn announced at the end of March that she was leaving Google, we decided to catch up with her and find out what she's learned from her time at Google and what she has planned next.

LWN: A big part of your job has been the Google Summer of Code program. What lessons have you taken away from the program, and what would you do differently if you could reboot the program?

I think the most important lesson I learned was to just get out there and make things happen, no matter how many mistakes you'll make along the way and no matter how many pairs of eyes are watching you make those mistakes. When I started with the team, I had no FOSS experience whatsoever and was quite worried about doing the wrong thing, and very publicly no less. Over time, I realized that it was far better to make as much progress as possible and make the inevitable course corrections required rather than try to get everything right from the start.

I've also learned that a small group of people who are dedicated to making the world a better place can be incredibly successful in doing just that. It may sound like a cliche, but I never imagined when I started working on Summer of Code that I would have the chance to help so many individuals and projects make things better, from finding new contributors to improving their community structure and relations. Granted, having the power of Google as a company and brand behind the program contributed a great deal, but the real power in Summer of Code is the individuals who contribute to making it a successful way for students to contribute to FOSS.

As for what I would do differently if I could reboot the program, I think it's working quite well the way it is. It would be excellent to scale pay for the student participants according to their respective geographies or vary the program time line a bit more to take into account various school schedules, but that's just not possible with 2-3 people on the Google side managing logistics. I like to imagine how epic Summer of Code could be with a team of ten people working on it, but that's not particularly logical when you consider the overall size of the Open Source Team.

There are times I think that removing the monetary component would be a good thing - just have folks do the program for the recognition and the t-shirt. It would certainly make administration easier and allow more students to participate in the program. However, I also realize that one of the original goals of the program was to provide students with gainful employment in addition to programming experience, and t-shirts, while completely awesome, do not pay the rent or put food on the table.

LWN: In your experience, how effective has GSoC been in drawing new contributors to open source projects?

I think it has been incredibly effective. When I meet up with GSoC students, most of the individuals I speak with had only been using FOSS, but had not contributed patches or fixed more than a few minor bugs. The best aspect of the program, though, is simply spreading the idea of FOSS' importance to the next generation of programmers. The participants in the program become wonderful evangelists for open source, whether or not they remain core contributors to the projects they worked on for GSoC. I think many of the students who have participated in the program will be instrumental in seeing that FOSS is used in the companies that employ them once they enter the workforce.

LWN: How effective has GSoC been for bringing in new code that becomes part of projects?

I think about half the code produced during GSoC ends up merged into project source trees, but that's just an educated guess. Several projects ask their students to work on research implementations of particular features and, as with all software development, some of what is produced ends up not being the most practical solution. On the other hand, I sometimes hear from program mentors more than a year after a particular GSoC that their student's code just entered the mainline. I think the education that the students receive in FOSS and software development in general is more important than the project work they produce, but certainly a great deal of useful code has been written due to GSoC.

LWN: GSoC was probably the most visible part of the open source programs office, as it touched so many projects. What other projects have you worked on that you're happy with?

I'm particularly proud of the Google Highly Open Participation Contest, which was the world's first global initiative to introduce pre-university students to open source. The contest has only occurred once, though I believe Google has plans to hold it again in the future. The effort was incredibly successful - more than 350 students worldwide participated and completed more than 1,000 bite sized tasks for 10 open source projects. The best thing about GHOP was that it focused on all aspects of FOSS, including documentation, user experience research, marketing and advocacy, not just code. That holistic approach to software development made FOSS more accessible to a wider range of young contributors. I'm actually in Vermont right now as I'll be accepting an award for GHOP from the National Center for Open Source in Education on Friday [April 9, 2010].

I'm also proud of launching and managing the Google Open Source Blog, which now has over 17,000 subscribed readers. The blog gave the team the opportunity to talk about all the great work Google does for the open source community. It also gave Google a great venue to showcase all the wonderful things the community was able to do with the company's help, from news about GSoC to FOSS development sponsored by Google.

LWN: A big part of GSoC is mentoring. You've obviously observed quite a few projects mentoring students — what tips and advice would you give readers who are participating in open source projects about mentoring?

First and foremost, make it very clear if you are willing to mentor folks on your project. A "help wanted" note on your project home page is a good start. Don't be afraid to clearly spell out exactly the kind of contributions you are looking for - if your project will only benefit from folks who have advanced experience in machine learning, that's OK. Folks with a less developed skill set in that area will find another good home. It's also great to have someone specifically assigned to dealing with newcomers and helping them get up to speed on the basics quickly before handing them over to someone who can help guide them in a particular area in depth.

Share your mistakes with your mentees. It is incredibly easy for someone less experienced to feel like you are some kind of deity and that your hard won knowledge is somehow an innate ability. By showing off some particularly ugly code or talking about your own faux pas on a development mailing list, you give those with less experience more confidence in their ability to go from newbie to seasoned contributor.

Be available as much as you possibly can be, especially when first starting the mentoring relationship. People do well when they feel like their contributions matter and their voice is heard, and they'll feel neither if they don't hear from you early and often. Once the relationship has been established, you can take a bit more time to answer questions, etc., but during the first few weeks being there as much as you can be is incredibly important. It's also worthwhile to be there often so you can steer your mentee to asking the community for help so that he is not solely reliant on your guidance or opinions. As long as your mentee feels like you're there for a particularly complex or troubling issue, she will be more likely to ask others for help on the easier matters and become more integrated into the wider community as a result.

LWN: On the flip side, any suggestions or advice to readers looking to be mentored with projects — especially those who don't have avenues like GSoC?

Don't be nervous about your lack of experience, just dive right in. Check out a project's mailing list for an idea of what's most important right now and what has been important in the past six months or so. Then join the project IRC channel and lurk for a while. Don't worry if no one acknowledges you or says anything at first. Eventually, when you have an idea of what you'd like to do for the project, say implement a cool feature or host a user group in your home town, then speak up in the channel. Don't ask to ask a question, just ask it. Be prepared for people to not think you'll carry through on your intended efforts - many people get very excited by the idea of a project but don't actually get stuff done, so the folks you talk to will naturally wonder if you fall into that category. This is OK and completely natural - once you've sent in your first patches, updated the documentation or created that project presentation and asked for feedback, you'll see more engagement with your ideas and get more help from the community to further expand on your work.

LWN: You recently gave a talk at LibrePlanet's Women's Caucus about mentoring women in open source. Can you talk a bit about that, and give specific advice on mentoring women for those who are interested in attracting a more diverse group of contributors to projects?

Thanks to Deb Nicholson of the Free Software Foundation, an entire day's track at LibrePlanet was devoted to getting more women involved in Free Software. My talk revolved around some basic points for mentoring, a few of which are enumerated above. The points aren't gender specific, but I hear from many, many projects that they are specifically interested in mentoring folks who come from groups underrepresented in their project, particularly women. I think mentoring is particularly valuable to women and other underrepresented groups since the social group that they are approaching doesn't have as many people in it who look like them, literally, and it can be difficult to feel comfortable and welcome in such an environment.

For groups looking to attract a more diverse group of contributors, start with the basics. Go to the members of your community who are underrepresented and ask them what they think can be done to improve the situation. (Note that no woman can speak for all women in FOSS, just like no man could speak for all men in FOSS, so this may be a difficult conversation at first.) Ask them why they were excited about the project in the first place, whether they would recommend working on the project to their colleagues and what the project can do to be most welcoming to newcomers in general. When you hear great ideas, implement them.

Make sure to include women speakers on the agenda of your project's conferences. Women in particular tend not to attend technical conferences if they feel like they are the only woman present. The closer you can come to gender parity on your conference program, the more women you'll attract to your conference. And I think we all know the importance of connecting in person to making FOSS happen.

Another way to help increase diversity in a project is to publish a diversity statement or a mission statement that explicitly states that the project is open to all and encourages participation from as many different kinds of people as possible. It may be obvious to all the project members that you are open to anyone's participation, but actually reading that a project is looking for people from all walks of life makes people feel welcomed and valued. Specifically mentioning that you are looking for newbies is a great way to attract a more diverse community if most of your contributors eat, sleep and breathe your project. More perspectives help a project grow and thrive.

LWN: Women in open source is a topic that's received a lot of attention the past four or five years. Where do you think we're at in terms of interesting women in participating in open source and retaining new contributors?

Please note that I am going to make a whole bunch of generalizations here, and these are just my opinions.

I think things are getting a whole lot better. The conversation is taking a more positive tone - we're not just focusing on the problems anymore, we're giving people concrete advice on how to make things better. A lot of my colleagues have been focusing on providing positive role models for other women to participate and talking about all the reasons why we enjoy contributing to FOSS, all of which helps more women get involved. I've also seen much more discussion about the role that FOSS plays in making the world a better place, which definitely appeals to women; women tend to want to know that their work makes a wider impact rather than just wanting to scratch their own itch.

I think the growth of women in FOSS will really take off in the next few years since more women are stepping up and being vocal about their contributions to the community and their passion for their work. Sharing one's enthusiasm and joy makes FOSS that much more attractive to all new contributors, not just women.

LWN: What are the biggest obstacles here? Is it entirely a cultural problem, or are there outside factors at play as well?

I think there are many factors involved. Women tend to be socialized away from technical endeavors, women tend to get their first computer much later than men do, women tend to have fewer technical role models than men do, etc. I think there's definitely a cultural aspect, too; studies have shown that women are less likely to get involved in competitive scenarios and more likely to underrate their skills when in competitive scenarios. The very nature of FOSS is pretty competitive - one solution vies with another for primacy, one often has to be more vocal to get one's opinions heard - and women may be less likely to be comfortable in those situations. I'm not suggesting FOSS development practices should change overall, but easing all new contributors into your community processes will yield better long term results since more people will feel invested in the project if they have a positive initial experience. Save the harshest critiques of patches or the RTFM responses for those who are a bit more advanced in FOSS and I think we'll see many more people stick around and continue contributing, especially women.

LWN: You've been with Google for quite a while, long enough to see it grow from a company where you knew all the engineers to the enormous company it is today. After six years, you've decided to take on a new challenge. Surely you've been offered other positions — anything in particular that made you decide it's time to leave Google now?

Honestly, I just felt like it was time for a change. I loved my role at Google and I love working with the FOSS community, but I wanted to do something different. Consulting allows me to get a wider range of experience in business and I'm looking forward to expanding my skill set.

LWN: You mentioned you'll be working with hackers in Costa Rica. What, specifically, will you be doing? What's the organization you'll be working with?

Heh. I've been idly talking about creating an intentional community for years now and hackers are just my kind of people. I love Costa Rica and the company that I'll be working with in the near term has offices in the Valley and Costa Rica. I'm looking forward to investigating the viability of a "Costa Rican hacker colony" while consulting. Specifically I'll be working on marketing and business development, and consulting gives me plenty of time to continue pursuing my passion for FOSS.

LWN: Beyond Google - how has the open source community evolved in your view in the last decade or so, and is it improving? Any disappointments?

I'm delighted to say that I am seeing the community taking a more expansive role of the ideas of contribution and contributor. Once upon a time, if you couldn't write code you simply weren't needed. I'm now seeing that attitude change and more individuals and projects explicitly state that any and all contributions are to be valued, be they coding, documentation, running a booth at a conference or organizing user group meetings. All of these require different talents and all of these efforts make a project better. The more inclusive we all are at recognizing contributions, the more contributors we'll find donating their time and expertise to FOSS.

Thanks, Leslie, for taking the time to answer our questions.

Comments (6 posted)

2010 LWN reader survey

Advertising on LWN is not particularly popular, either for readers or for editors. Unfortunately, it is a part of our business model, though. That said, there are some big differences in the kinds of ads that a site can run, with lower-tier sites having to rely on the automated advertising networks like the one that Google runs. But there are higher quality—both in appearance and revenue—ads out there for sites that can attract them. In order to do that, a site needs something called a "media kit".

A media kit is a PDF that describes the site to advertisers (and advertising agencies) in terms that they understand. Essentially, it gives an overview of the site, its content, authors, editors, and, most crucially, its readers. Advertisers are trying to reach decision makers and those who advise them, so they want to understand the demographics of a site's readership in those terms. We have engaged Don Marti to help us create a media kit as the first step towards trying to bring in more advertising revenue—hopefully with fewer ads. Once it is complete, we will be trying to sell advertising to companies in the Linux and free software industries.

We and Don have come up with a survey that we would like our readers to fill out. It combines questions that will be aggregated into the media kit with some others about things like LWN content and your usage of it. We will not distribute any personally identifiable information from your answers, only aggregated statistics. We will also summarize the results of the survey once it has closed, which will be in two weeks. We would really appreciate it if you could take a few minutes and fill out the survey. Thanks in advance from all of us here at LWN.

Comments (none posted)

Page editor: Jonathan Corbet

Security

Threat models for embedded devices

By Jake Edge
April 14, 2010

Understanding the threats that a system could be subjected to is a starting point for deciding on what countermeasures to use to protect it. That idea is the motivation behind the development of "threat models", which is a term used in various contexts including both military and computer security. It is impossible to defend against all possible threats, so understanding which threats are to be defended against helps focus efforts on the elements that will thwart those attacks, without getting lost in a wide variety of other possibilities.

In the embedded systems context, a threat model is the summation of the kinds of threats the device is meant to thwart. It is essentially a list of the attack types that will be defended against by all elements of the system: hardware and software. The good folks at the Embedded Linux Conference accepted my talk, "Understanding threat models for embedded devices", which I thought might best be organized around this article, effectively killing two birds with one stone.

The threat model can be considered long before specific software choices for a device are made, as it is largely determined by the intended function of the device. There are several things that need to be considered, including things like the connectivity of the device (wired, wireless, remote control, etc.), the data that the device has—or controls—access to, the kinds of environments where it will be installed, and the technical sophistication expected of its users. From a threat model perspective, those characteristics are interdependent, so they can't be analyzed in isolation.

What is being protected?

One of the first things that needs to be identified is what is being protected. At the most basic level, the proper functioning of the device likely requires protection, so that attackers cannot cause a denial-of-service, but there are often more assets to protect than that. Another way to look at the problem is to consider what the consequences are if the device is successfully attacked. Beyond just being able to disrupt the user's enjoyment of the device, the data stored in the device or protected by it—think router/firewall for example—needs to be considered.

The value of that data, both to the user and to a potential attacker, is a key factor in determining what threats are relevant. For something like a television or microwave, which either don't store much in the way of data or only store data that has a fairly low value (various settings, favorite channels, etc.), protecting the data from disclosure or erasure is probably not a high priority. But for other devices, a successful attack might drain a user's bank account, snoop on their phone calls, or permanently delete their family photo album. From a user—customer—perspective those differences are very important.

As the value of the data to an attacker increases, the sophistication of the attacks also increases. A basic tenet of security is that an effective way to repel attacks is to make them cost more in resources than a successful attacker would gain.

Inputs

One of the biggest factors that impacts a device's threat model is the kind of connectivity it has to the external world. Devices like home wireless internet routers or network-attached storage servers have obvious network connections, but other devices, even some that might seem to be connection-free, often still have some means of interacting with users that might be a potential means for attack. Remote controls (or even front panel buttons, depending on the installation location) offer a way to provide input to a device. Any input can, of course, be used for good or ill.

There are very few devices that take no input at all, so identifying those inputs is important. Some inputs are obvious: wired and wireless networking, Bluetooth, GPS signals, cellular voice/data, etc. There are more subtle, and likely less useful, inputs that should also be considered. Cameras are being added to more devices these days and, depending on what kind of processing is done to an image, they could become a vector for attack. There have certainly been enough exploits of flaws in image-handling libraries to give one pause.

Some inputs have much more potential for abuse than others, but as part of coming up with a threat model, all of the system inputs should at least be considered. It may make sense for business or other reasons to explicitly ignore certain kinds of attacks, especially if the input mechanism is protected in other ways. If the device is intended to be physically secure (locked up in someone's house or server room, for instance), it may be reasonable to exclude attacks requiring physical access from the model. That doesn't mean that those attacks are impossible, but that they require an attack focused on a particular individual or organization, which is one of the hardest kinds of attack to thwart.

Installation location

Physical security is often assumed for servers, but embedded devices don't necessarily have that luxury. Depending on the function of the device and the technical knowledge of its owner, devices may be deployed in ways that are outside of the expectations of the developers. "Home" wireless routers are often used in small businesses and are sometimes explicitly set up for use by the general public, in coffee shops for example. Televisions and DVRs may be installed in bars and restaurants, which expands the kinds of attacks those devices might be subjected to.

It is important to at least think about other ways that a device might be used. It is much easier to assume that a device will only be installed in a "friendly" environment, or that its owner will keep it physically safe, but those expectations may not be borne out in practice. It is also important to consider the target market for the device and the expected level of technical knowledge for its users.

Users

Technically savvy computer users would probably never even consider hooking up their NAS—with their music, photo, and movie collection—directly to the Internet, but less sophisticated users might. If they were trying to share photos with a relative, or some kind of photo printing service, it might seem to be far easier to just "put it on the net" without recognizing the privacy and security implications.

So some consideration should be given to the target market. Less sophisticated users probably need a higher level of security, at least for some types of devices. If the device is targeting the IT department at a company, some level of security knowledge can probably—hopefully—be assumed. But that same device in the hands of someone with no security training or awareness may be at much more risk.

Another consideration is how the device gets its updates in the event of a security flaw. These days, most devices have some means to update the firmware for security, other bugs, or to add—and sometimes subtract—functionality. In order for security updates to get installed, though, the user must know about the problem, and have the capability to perform the upgrade. Systems that do not plan to have an easy path for user upgrades will probably want to put more effort into ensuring that the firmware that ships has been carefully vetted based on the other factors in its threat model.

Examples

A television with HDMI, remote control, and front panel inputs is not susceptible to very many kinds of attacks. There are potentially denial-of-service attacks possible via the latter two input methods, but the data being protected is of fairly low value (if there is any data at all). The users are likely to have little security awareness, but since there is little of value and few ways to impact the device, the security needs are fairly minimal. The biggest problem might be someone surreptitiously changing the channel or volume via a universal remote—something that is completely outside of the scope of problems that a television manufacturer should be expected to thwart.

A NAS box targeted at home users to store various multimedia files throughout the home has an entirely different profile. Other than a power switch, there probably isn't any real front panel, but there is a network interface, which is where most of the exposure comes from. The data is of fairly high value to its owner, and while its value to an attacker is variable, it is probably fairly low. An attacker who could deny access to the device or its contents might be in a position to extract a ransom to restore it. If the system could be compromised, though, it is likely to be powerful enough to be of interest as a botnet participant, which may be the most likely attack scenario.

Targeting the home market means that users are not likely to have a lot of technical expertise, so they may install and use the device in unexpected ways. But even for those devices that are properly installed behind a router/firewall, there may be attacks via malware running on other computers in the network. Home networks are often considered to be "safe", but browser and other malware attacks against devices behind the firewall, or the firewall itself, are certainly not unknown. Probably one of the more common mistakes that is made with these kinds of devices is using a default administrative password that never gets changed—malware doesn't have to work very hard to exploit that flaw.

Other devices will have different characteristics, of course, but analyzing the device to determine the threats to defend against is similar. Using the value of the data, the kinds of inputs the device has, its users, and how it is intended to be installed, one can start prioritizing which functional areas of the device software need the most attention. Explicitly rejecting certain kinds of attack scenarios will also assist in cutting down on the work that needs to be done, as long as customer expectations are set correctly. Doing this analysis during product planning, rather than after there is a working prototype or, worse yet, a shipping product, will make it much less disruptive, resulting in a better, more secure device.
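
The factors listed above can be written down as a small record, which makes explicit rejections visible. The following Python sketch is only an illustration: the class name, the field names, and the values chosen for the television and NAS are assumptions loosely based on the examples in this article, not an established methodology.

```python
from dataclasses import dataclass, field


@dataclass
class ThreatModel:
    """Hypothetical record of the factors discussed in the article."""
    device: str
    inputs: set              # every way data or commands can reach the device
    data_value: str          # rough value of the data: "low" or "high"
    user_expertise: str      # expected technical sophistication of the users
    physically_secure: bool  # is a "friendly" installation location assumed?
    rejected: set = field(default_factory=set)  # inputs explicitly ruled out

    def inputs_to_defend(self):
        """Inputs that remain in scope once explicit rejections are removed."""
        return sorted(self.inputs - self.rejected)


tv = ThreatModel(
    device="television",
    inputs={"hdmi", "remote control", "front panel"},
    data_value="low",
    user_expertise="low",
    physically_secure=True,
    # Assumption: channel changing via a stray remote is declared out of scope.
    rejected={"remote control", "front panel"},
)

nas = ThreatModel(
    device="home NAS",
    inputs={"network", "power switch"},
    data_value="high",
    user_expertise="low",
    physically_secure=True,
    rejected={"power switch"},
)

for model in (tv, nas):
    print(model.device, "->", model.inputs_to_defend())
```

Keeping the rejected inputs in the record, rather than simply omitting them, documents the decision so that customer expectations can be set to match.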

Conclusion

Device manufacturers, like software vendors, often see security problems as a public relations issue, which they are to some extent. But it is really more of a customer relations problem: if customers have been burned by security problems in a particular vendor's device, they are much less likely to purchase from that vendor again. As a large software vendor in the Pacific Northwest found out, once a reputation for poor security is established, it can be very hard to undo. It's much better to "bake security in" rather than just hope that no security issues arise.

Comments (7 posted)

Brief items

Apache.org services attacked

The Apache Infrastructure Team has reported a direct, targeted attack against the server hosting their issue-tracking software. "If you are a user of the Apache hosted JIRA, Bugzilla, or Confluence, a hashed copy of your password has been compromised. JIRA and Confluence both use a SHA-512 hash, but without a random salt. We believe the risk to simple passwords based on dictionary words is quite high, and most users should rotate their passwords. Bugzilla uses a SHA-256, including a random salt. The risk for most users is low to moderate, since pre-built password dictionaries are not effective, but we recommend users should still remove these passwords from use. In addition, if you logged into the Apache JIRA instance between April 6th and April 9th, you should consider the password as compromised, because the attackers changed the login form to log them."
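
The difference between the two cases in the report comes down to whether a pre-built dictionary of hashes can be reused. The Python sketch below illustrates that point only; the word list is made up, and prepending the salt to the password is an assumption, since the report does not say how Bugzilla combines the two.

```python
import hashlib
import os


def unsalted_sha512(password: str) -> str:
    """Hash with no salt: the same password always gives the same digest."""
    return hashlib.sha512(password.encode("utf-8")).hexdigest()


def salted_sha256(password: str, salt: bytes) -> str:
    """Hash with a per-user random salt (concatenation is an assumption)."""
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


# A hypothetical attacker's word list, hashed once ahead of time.
wordlist = ["password", "letmein", "dragon"]
prebuilt_sha512 = {unsalted_sha512(w): w for w in wordlist}
prebuilt_sha256 = {hashlib.sha256(w.encode("utf-8")).hexdigest(): w
                   for w in wordlist}

# Unsalted: the stolen digest is found directly in the pre-built table.
stolen_unsalted = unsalted_sha512("letmein")
print("unsalted lookup:", prebuilt_sha512.get(stolen_unsalted))

# Salted: the same table is useless because the salt changes the digest.
salt = os.urandom(16)
stolen_salted = salted_sha256("letmein", salt)
print("salted lookup:", prebuilt_sha256.get(stolen_salted))
```

An attacker who also holds the salt can still try guesses one account at a time, which fits the report's "low to moderate" rating rather than "none".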

Comments (28 posted)

New vulnerabilities

acroread: multiple vulnerabilities

Package(s):acroread CVE #(s):CVE-2010-0190 CVE-2010-0191 CVE-2010-0192 CVE-2010-0193 CVE-2010-0194 CVE-2010-0195 CVE-2010-0196 CVE-2010-0197 CVE-2010-0198 CVE-2010-0199 CVE-2010-0201 CVE-2010-0202 CVE-2010-0203 CVE-2010-0204 CVE-2010-1241
Created:April 14, 2010 Updated:September 8, 2010
Description: From the Red Hat advisory:

This update fixes several vulnerabilities in Adobe Reader. These vulnerabilities are summarized on the Adobe Security Advisory APSB10-09 page. A specially-crafted PDF file could cause Adobe Reader to crash or, potentially, execute arbitrary code as the user running Adobe Reader when opened.

Alerts:
Gentoo 201009-05 2010-09-07
Red Hat RHSA-2010:0349-01 2010-04-14

Comments (none posted)

alienarena: denial of service

Package(s):alienarena CVE #(s):
Created:April 9, 2010 Updated:April 14, 2010
Description: From the Fedora advisory:

By supplying various invalid parameters to the download command, it is possible to cause a DoS condition by causing the server to crash. A path ending in . or / will crash on Linux. Supplying a negative offset will cause a crash on all platforms. - Fix buffer overflow identified in R1Q2 client code.

Alerts:
Fedora FEDORA-2010-6132 2010-04-09
Fedora FEDORA-2010-6068 2010-04-09

Comments (none posted)

clamav: denial of service

Package(s):clamav CVE #(s):CVE-2010-0098
Created:April 9, 2010 Updated:September 8, 2010
Description: From the Ubuntu advisory:

It was discovered that ClamAV did not properly verify its input when processing CAB files. A remote attacker could send a specially crafted CAB file to evade malware detection. (CVE-2010-0098)

It was discovered that ClamAV did not properly verify its input when processing CAB files. A remote attacker could send a specially crafted CAB file and cause a denial of service via application crash.

Alerts:
Gentoo 201009-06 2010-09-07
Mandriva MDVSA-2010:082-1 2010-05-20
SuSE SUSE-SR:2010:010 2010-04-27
Pardus 2010-55 2010-04-20
Mandriva MDVSA-2010:082 2010-04-18
Ubuntu USN-926-1 2010-04-08

Comments (none posted)

drupal-views: multiple vulnerabilities

Package(s):drupal-views CVE #(s):
Created:April 12, 2010 Updated:April 14, 2010
Description: From the Fedora advisory:

Views module provides a flexible method for Drupal site designers to control how lists of content are presented. Views accepts parameters in the URL and uses them in an AJAX callback. The values were not filtered, thus allowing injection of JavaScript code via the AJAX response. A user tricked into visiting a crafted URL could be exposed to arbitrary script or HTML injected into the page. In addition, the Views module does not properly sanitize file descriptions when displaying them in a view, thus the file descriptions may be used to inject arbitrary script or HTML. Such cross site scripting [1] (XSS) attacks may lead to a malicious user gaining full administrative access. These vulnerabilities affect only the Drupal 6 version. The file description vulnerability is mitigated by the fact that the attacker must have permission to upload files. In both the Drupal 5 and Drupal 6 versions, users with permission to 'administer views' can execute arbitrary PHP code using the views import feature. An additional check for the permission 'use PHP for block visibility' has been added to ensure that the site administrator has already granted users of the import functionality the permission to execute PHP.

Alerts:
Fedora FEDORA-2010-6356 2010-04-10
Fedora FEDORA-2010-6317 2010-04-10

Comments (none posted)

firefox: security bypass

Package(s):firefox CVE #(s):CVE-2010-0182
Created:April 12, 2010 Updated:August 9, 2010
Description: From the Ubuntu advisory:

Wladimir Palant discovered that Firefox did not always perform security checks on XML content. An attacker could exploit this to bypass security policies to load certain resources.

Alerts:
CentOS CESA-2010:0500 2010-08-06
Debian DSA-2075-1 2010-07-27
SuSE SUSE-SR:2010:013 2010-06-14
Mandriva MDVSA-2010:070-1 2010-04-20
Mandriva MDVSA-2010:070 2010-04-13
SuSE SUSE-SA:2010:021 2010-04-14
Ubuntu USN-921-1 2010-04-09
Red Hat RHSA-2010:0501-01 2010-06-22
CentOS CESA-2010:0501 2010-06-24
Gentoo 201301-01 2013-01-07

Comments (none posted)

firefox: multiple vulnerabilities

Package(s):firefox CVE #(s):CVE-2010-0164 CVE-2010-0165 CVE-2010-0167 CVE-2010-0168 CVE-2010-0170 CVE-2010-0172 CVE-2010-1122
Created:April 14, 2010 Updated:October 3, 2011
Description: From the Mandriva advisory:

Security researcher regenrecht reported (via TippingPoint's Zero Day Initiative) a potential reuse of a deleted image frame in Firefox 3.6's handling of multipart/x-mixed-replace images. Although no exploit was shown, re-use of freed memory has led to exploitable vulnerabilities in the past (CVE-2010-0164).

Mozilla developers identified and fixed several stability bugs in the browser engine used in Firefox and other Mozilla-based products. Some of these crashes showed evidence of memory corruption under certain circumstances and we presume that with enough effort at least some of these could be exploited to run arbitrary code (CVE-2010-0165, CVE-2010-0167).

Mozilla developer Josh Soref of Nokia reported that documents failed to call certain security checks when attempting to preload images. Although the image content is not available to the page, it is possible to specify protocols that are normally not allowed in a web page such as file:. This includes internal schemes implemented by add-ons that might perform privileged actions resulting in something like a Cross-Site Request Forgery (CSRF) attack against the add-on. Potential severity would depend on the add-ons installed (CVE-2010-0168).

Mozilla developer Blake Kaplan reported that the window.location object was made a normal overridable JavaScript object in the Firefox 3.6 browser engine (Gecko 1.9.2) because new mechanisms were developed to enforce the same-origin policy between windows and frames. This object is unfortunately also used by some plugins to determine the page origin used for access restrictions. A malicious page could override this object to fool a plugin into granting access to data on another site or the local file system. The behavior of older Firefox versions has been restored (CVE-2010-0170).

Unspecified vulnerability in Mozilla Firefox 3.5.x through 3.5.8 allows remote attackers to cause a denial of service (memory corruption and application crash) and possibly have unknown other impact via vectors that might involve compressed data, a different vulnerability than CVE-2010-1028 (CVE-2010-1122).

Alerts:
Mandriva MDVSA-2011:140 2011-10-01
Mandriva MDVSA-2011:141 2011-10-01
Mandriva MDVSA-2011:139 2011-10-01
Mandriva MDVSA-2010:070-1 2010-04-20
Mandriva MDVSA-2010:070 2010-04-13
Mageia MGASA-2012-0176 2012-07-21
Gentoo 201301-01 2013-01-07

Comments (none posted)

kdm: privilege escalation

Package(s):kdebase3 kde4-kdm CVE #(s):CVE-2010-0436
Created:April 14, 2010 Updated:June 1, 2010
Description: From the KDE advisory:

KDM contains a race condition that allows local attackers to make arbitrary files on the system world-writeable. This can happen while KDM tries to create its control socket during user login. This vulnerability has been discovered by Sebastian Krahmer from the SUSE Security Team.

Alerts:
CentOS CESA-2010:0348 2010-06-01
CentOS CESA-2010:0348 2010-06-01
Slackware SSA:2010-110-02 2010-04-21
CentOS CESA-2010:0348 2010-04-20
Ubuntu USN-932-1 2010-04-19
Pardus 2010-50 2010-04-20
Debian DSA-2037-1 2010-04-17
Fedora FEDORA-2010-6077 2010-04-09
Fedora FEDORA-2010-6096 2010-04-09
Mandriva MDVSA-2010:074 2010-04-15
Red Hat RHSA-2010:0348-01 2010-04-14
SuSE SUSE-SR:2010:009 2010-04-14

Comments (none posted)

MoinMoin: access restriction bypass

Package(s):moin CVE #(s):CVE-2010-1238
Created:April 8, 2010 Updated:April 14, 2010
Description: It was discovered that the TextCha protection in MoinMoin could be bypassed by submitting a crafted form request. This issue only affected Ubuntu 8.10. (CVE-2010-1238)
Alerts:
Ubuntu USN-925-1 2010-04-08
Gentoo 201210-02 2012-10-18

Comments (none posted)

spamass-milter: arbitrary code execution

Package(s):spamass-milter CVE #(s):CVE-2010-1132
Created:April 9, 2010 Updated:April 27, 2010
Description: From the Fedora advisory:

This update includes a fix for a problem where if the milter is running using the "-x" option to expand aliases before passing inbound mail through SpamAssassin, a malicious client using a carefully-crafted SMTP session could execute arbitrary code on the mail server. The fix avoids the use of a shell in the alias expansion and hence there is no longer a problem with having to sanitize input from the client.
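
The general technique behind this kind of fix, running a helper program without a shell so that client-supplied text is never parsed as shell syntax, can be sketched in Python. This is not the milter's code (which is not written in Python); the helper command here is a stand-in that simply echoes its argument, and the function name is hypothetical.

```python
import subprocess
import sys

# Stand-in for the external alias-expansion command: a tiny Python
# program that prints the single argument it was given.
ECHO_ARG = [sys.executable, "-c", "import sys; print(sys.argv[1])"]


def expand_without_shell(recipient: str) -> str:
    """Pass the recipient as one argv entry; no shell ever parses it."""
    result = subprocess.run(
        ECHO_ARG + [recipient],  # a list of arguments, not a command string
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.rstrip("\n")


# A hostile "address" containing shell metacharacters arrives at the
# helper as plain data, so nothing after the semicolon is executed.
hostile = "user@example.com; touch /tmp/pwned"
print(expand_without_shell(hostile))
```

Because the argument list goes straight to the operating system, there is no quoting to get right and no metacharacter to sanitize.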

Alerts:
Debian DSA-2021-2 2010-04-26
Fedora FEDORA-2010-5176 2010-03-23
Fedora FEDORA-2010-5096 2010-03-23

Comments (none posted)

Page editor: Jake Edge

Kernel development

Brief items

Kernel release status

The current development kernel is 2.6.34-rc4, released on April 12, about a week later than would have been expected. The delay was the result of a nasty VM regression (see below). All told, some 500 fixes have been merged since -rc3; see the announcement for the short-form changelog, or see the full changelog for all the details.

There have been no stable updates released in the last week.

Comments (none posted)

Quotes of the week

Hey, all my other theories made sense too.. They just didn't work.

But as Edison said: I didn't fail, I just found three other ways to not fix your bug.

-- Linus Torvalds

To be honest I think 4K stack simply has to go. I tend to call it "russian roulette" mode.

It was just a old workaround for a very old buggy VM that couldn't free 8K pages and the VM is a lot better at that now. And the general trend is to more complex code everywhere, so 4K stacks become more and more hazardous. It was a bad idea back then and is still a bad idea, getting worse and worse with each MLOC being added to the kernel each year.

-- Andi Kleen

Comments (7 posted)

Idle cycle injection

By Jonathan Corbet
April 14, 2010
When Google's Mike Waychison addressed the 2009 Kernel Summit, one of the goals he laid out was the merging of Google's idle cycle injection code into the mainline. Idle cycle injection is the forced idling of the CPU to avoid overheating; essentially, it is Google's way of running processors to the very edge of their capability without going past that edge and allowing the smoke to escape. This sort of power management is certainly not a Google-specific problem, so it makes sense to get the code upstream. Salman Qazi's recently posted kidled patch series shows the current form of this work.

The core idea is simple: through some new control files under /proc/sys/kernel/kidled, the system administrator can set, on a per-CPU basis, the percentage of time that the CPU should be idle and an interval over which that percentage is calculated. If the end of an interval draws near and the CPU has not been naturally idle for the requisite time, kidled will force the processor to go idle for a while.

Naturally enough, there are some complications. The first is that it would be nice to avoid forcing idle cycles when important processes are running. So kidled includes the notion of "eager cycle injection." By way of the control group mechanism, processes can be marked as being "interactive." When so-marked processes are not running, kidled will try to get its forced idle cycles in early. When interactive processes are running, instead, idle cycles will be forced only when strictly necessary. In this way, "interactive" processes will not be impeded by idle cycle injection except when there is no alternative.
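
The policy described so far can be modeled in a few lines. The Python sketch below is not the kidled code; it is a simplified model of the decision as this article describes it, and the function and parameter names are hypothetical.

```python
def must_inject_now(interval_ms, idle_percent, elapsed_ms,
                    idle_so_far_ms, interactive_running):
    """Decide whether an idle cycle should be forced at this moment.

    interval_ms         -- length of the accounting interval
    idle_percent        -- share of the interval the CPU must spend idle
    elapsed_ms          -- how far into the current interval we are
    idle_so_far_ms      -- idle time (natural or forced) already accumulated
    interactive_running -- True if an "interactive" process is on the CPU
    """
    required_ms = interval_ms * idle_percent / 100.0
    deficit_ms = max(0.0, required_ms - idle_so_far_ms)
    if deficit_ms == 0:
        return False  # the CPU has already been idle for long enough
    if not interactive_running:
        return True   # eager injection: get the idle time in early
    # Lazy injection: wait until the deficit can only just fit in
    # what is left of the interval.
    remaining_ms = interval_ms - elapsed_ms
    return remaining_ms <= deficit_ms


# 20% idle required over a 100ms interval, no idle time so far.
print(must_inject_now(100, 20, 50, 0, interactive_running=True))
print(must_inject_now(100, 20, 80, 0, interactive_running=True))
print(must_inject_now(100, 20, 50, 0, interactive_running=False))
```

In this model an interactive process is interrupted only in the last 20ms of the interval, while a non-interactive workload gives up its share as soon as a deficit exists.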

The other twist has to do with the accounting of idle CPU time. The injection of idle cycles takes CPU time away from somebody; the kidled code allows the administrator to say who the victims should be. There is another control group parameter which controls the "power capping priority" of each process. When idle cycles are injected, kidled will mess around in the scheduler's data structures, causing processes with lower priorities to be charged for the idle time. That means that, when CPU usage must be throttled, specific processes can be made to suffer more than others.

As of this writing, there has been little public discussion of the patches. The core concept is not controversial, but it will be interesting to see how the scheduler-related parts of the series are received.

Comments (10 posted)

Kernel development news

ELC: Status of embedded Linux

By Jake Edge
April 14, 2010

Embedded Linux Conference (ELC) organizer Tim Bird surveyed the embedded Linux landscape in a talk he gave at the conference. He looked at new and proposed kernel features that embedded developers might be interested in as well as issuing a "call to arms" to those developers to get more involved with the rest of the community. This talk is a regular feature at ELC to help the embedded community stay on top of the "ripping" speed of kernel development.

There have been four new kernels since last year's conference, and Bird listed the interesting features for the embedded space in each of 2.6.30-33, as well as noting that LogFS had finally made it into the kernel in 2.6.34, something that he was concerned might not ever happen. The speed of kernel development is amazing, he said, and the great thing about it is that "even while I am sleeping in my bed, people are pounding away on it".

He pointed to a few "patches to watch" that may be coming in new kernels, specifically the kbuild CROSS_COMPILE option, which will make it easier to build for multiple architectures. He also noted Arnd Bergmann's asm-generic patches that are geared towards making it easier to add new architectures to the kernel—without propagating the bugs and quirks from existing ones.

Boot speed

Bird then looked at different "technology areas" to point out interesting features or work going on in those areas. Boot time is a "hot topic" right now; it would have been in the past if the embedded community had been more involved in mainline kernel development. The Moblin five second boot effort really kickstarted that work. He noted that he has a Sony (his employer) video camera that boots Linux in 1.5 seconds; "I'm very proud of that", he said.

Several new kernel features are available to help reduce boot time, including asynchronous function calls, which allow some parts of device initialization to run in parallel. There is also scripts/bootgraph.pl to help visualize where boot time is being spent.

Devtmpfs was also noted as a way to decrease boot times, with some seeing a 0.6 second reduction on desktops. Bird said that there needs to be some testing done on the embedded side to see how much it can help there. He also listed two patches that speed up symbol resolution for module loading by getting rid of the current linear search. One switches to a binary search and the other uses a hash table. For Bird's use cases, he always statically links in drivers, but has heard that more embedded developers are going the loadable module route.

Greg Kroah-Hartman piped up that he needed one of those two patches for MeeGo, but that the submitters had disappeared. There was general agreement that contacting them and getting something upstream would be good.
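
The three lookup strategies mentioned above (the current linear search, a binary search, and a hash table) can be compared in a short Python sketch. The symbol names and addresses are made up for illustration, and this says nothing about how the actual patches are implemented.

```python
import bisect

# Hypothetical exported symbols: (name, address) pairs in no particular order.
SYMBOLS = [
    ("schedule", 0xC0103000),
    ("kmalloc", 0xC0102000),
    ("printk", 0xC0101000),
    ("kfree", 0xC0102800),
]


def linear_lookup(name):
    """O(n): walk the table until the name matches."""
    for sym, addr in SYMBOLS:
        if sym == name:
            return addr
    return None


# Binary search needs the table sorted by name ahead of time.
SORTED = sorted(SYMBOLS)
SORTED_NAMES = [sym for sym, _ in SORTED]


def binary_lookup(name):
    """O(log n): bisect the sorted list of names."""
    i = bisect.bisect_left(SORTED_NAMES, name)
    if i < len(SORTED_NAMES) and SORTED_NAMES[i] == name:
        return SORTED[i][1]
    return None


# Python's dict is a hash table, so it stands in for the hashed variant.
HASHED = dict(SYMBOLS)


def hash_lookup(name):
    """O(1) on average: a single hash-table probe."""
    return HASHED.get(name)


for lookup in (linear_lookup, binary_lookup, hash_lookup):
    print(lookup.__name__, hex(lookup("kfree")), lookup("no_such_symbol"))
```

With only a handful of symbols the difference is invisible; it matters when each module load has to resolve many symbols against a table of thousands.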

Filesystems

Several different filesystems for embedded use cases were listed by Bird. Squashfs had been out of the mainline for years, but was merged in 2.6.29, and has since been improved by others in a "classic case" demonstrating the advantages of mainline code. Ubifs is also in the mainline and folks at Toshiba have been characterizing its performance, which they reported on at the CE Linux Forum (CELF) Japan Jamboree. It has "really slow mount times" in some cases, which CELF would like to fund someone to fix.

LogFS is "way better optimized" for certain flash devices and has fast mount times, he said. He noted that AXFS, the advanced execute-in-place (XIP) filesystem, had kind of disappeared, so it didn't appear to be on track for mainlining. He has been playing with AXFS at Sony to try to further decrease boot time.

Bird also noted that the VFAT patent avoidance patches had not made it into the mainline. They would be useful for some embedded devices, he said. Most embedded developers work around the patent by disabling VFAT and using 8.3 filenames, which is somewhat unfortunate. Another thing he is keeping an eye on is VFS-based union mounts, which would allow embedded developers to stop creating "filesystems with weird links" between them as is currently common.

Power management and realtime

The runtime power management code has been merged, which will allow suspending and resuming individual system components to reduce power consumption. There is ongoing work on asynchronous suspend/resume, which Bird said he didn't know very much about, but it's "gotta be really cool". An audience member helped out by saying that it is in some ways like the asynchronous initialization code (for faster boot), but "in the other direction".

The RT_PREEMPT patchset "continues its slow march into the kernel", with threaded interrupt handlers being merged in 2.6.30 and preparatory work for the future sleeping spinlocks merge going into 2.6.33. There are still some big kernel lock (BKL) issues to be resolved and CELF may fund some work in that area.

Kernel size

The slide for kernel size and memory use had a picture of a "hybrid Winnebago", which is the image Bird has of the kernel today. It just keeps growing in size. To help embedded developers make better use of limited memory, there is the smem tool that was funded by CELF. He has used it in a few projects this year and it "has been very helpful".

Various compression methods have been added to compress the kernel image in different ways. LZMA can be up to 30% better than gzip, and LZO is not as good at compression, but is much faster. There are tradeoffs dependent on processor speed and I/O bandwidth that make it more difficult to pick the right compression method, as Dirk Hohndel pointed out.
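The tradeoff Hohndel pointed out can be illustrated with a back-of-the-envelope model: the time to get a kernel image into memory is roughly the time to read the compressed image plus the time to decompress it. The sketch below is only illustrative; the image size, compression ratios, and throughput figures are invented assumptions, not measurements, and the names `Compressor`, `load_time`, and `best` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Compressor:
    name: str
    ratio: float            # compressed size / uncompressed size (assumed)
    decompress_mb_s: float  # decompressed output rate on the target CPU (assumed)


def load_time(image_mb, io_mb_s, c):
    """Seconds to read the compressed image and then decompress it."""
    read_time = image_mb * c.ratio / io_mb_s
    decompress_time = image_mb / c.decompress_mb_s
    return read_time + decompress_time


def best(image_mb, io_mb_s, candidates):
    """Name of the candidate with the lowest modeled load time."""
    return min(candidates, key=lambda c: load_time(image_mb, io_mb_s, c)).name


# Hypothetical figures for a slow embedded CPU; real numbers must be measured.
CANDIDATES = [
    Compressor("gzip", ratio=0.50, decompress_mb_s=20.0),
    Compressor("lzma", ratio=0.35, decompress_mb_s=5.0),
    Compressor("lzo", ratio=0.55, decompress_mb_s=60.0),
]

if __name__ == "__main__":
    for io_mb_s in (0.5, 5.0, 50.0):
        print(io_mb_s, "MB/s ->", best(4.0, io_mb_s, CANDIDATES))
```

With these made-up numbers, the densest compressor wins only when storage is very slow; once I/O is fast relative to the CPU, decompression speed dominates. The crossover point depends entirely on the hardware, which is why the choice is hard to make in general.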

The ramzswap device (also known as compcache) allows in-memory compressed swap. It is "really cool" but the maintainer was only able to benchmark on desktop systems. It would be good if someone could do some benchmarking on embedded systems, Bird said.
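The idea of in-memory compressed swap can be sketched in a few lines: pages that would be swapped out are compressed and kept in RAM, trading CPU time for memory. This is a toy model, not the ramzswap implementation; the `CompressedSwap` class is hypothetical, and zlib is used only because it ships with Python, not as a claim about which compressor ramzswap uses.

```python
import zlib

PAGE_SIZE = 4096  # assumed page size for the illustration


class CompressedSwap:
    """Toy in-memory swap area that stores pages in compressed form."""

    def __init__(self):
        self._slots = {}

    def swap_out(self, slot, page):
        if len(page) != PAGE_SIZE:
            raise ValueError("expected exactly one page")
        self._slots[slot] = zlib.compress(page)

    def swap_in(self, slot):
        # Decompress the page and free the slot.
        return zlib.decompress(self._slots.pop(slot))

    def memory_used(self):
        return sum(len(blob) for blob in self._slots.values())


if __name__ == "__main__":
    swap = CompressedSwap()
    page = (b"embedded linux " * 300)[:PAGE_SIZE]
    swap.swap_out(0, page)
    print("bytes held for one 4096-byte page:", swap.memory_used())
```

How much is saved depends on how compressible real page contents are, which is exactly the kind of thing embedded benchmarking would need to establish.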

Tracing and security

Ftrace now has support for dynamic probes that came in 2.6.33, and the perf tool can place and use those probes as well. There is tracing of kernel variable access and modification available now. The perf "diff" mode can show the performance differences between two runs, and also came in 2.6.33.

The TOMOYO merge in 2.6.30 was "a big deal" because it finally was able to get path-based security into the kernel. NTT Data is now adding TOMOYO rules to Android. Bird is in favor of a diversity of choices for security as it gives people a chance to demonstrate which is the best solution for various use cases. As part of that, CELF funded a study [PDF] for applying the Smack security module in a television use case and found that the overhead was higher than they expected.

CELF contract work

CELF has funded various projects over the last year including smem, out-of-memory notifications in cgroups, SquashFS, the Smack analysis, device trees for ARM, and the -ffunction-sections work to put each kernel function in its own section to assist with dead code removal. Going forward, CELF has an open project proposal plan that will start funding new projects in the next few weeks. It is also sponsoring Matt Mackall to be one of the two Linux kernel embedded maintainers (David Woodhouse is the other).

A call to arms

Bird ended his talk with a list of things that embedded developers can do to work better with the community. At the top of that list was "work at top of tree". He realized that when he gives these talks, he is generally talking to people who aren't using the kernels he talks about because embedded folks tend to pick a kernel and stick with it. "It's difficult" to work with the most recent kernel, but it's worth it. "Version gap is the single biggest problem" in embedded Linux. He suggested that embedded developers beat up on their board vendor to get board support packages using the latest kernels and to do their testing on boards that are already supported in the mainline.

The other suggestion he had was not to "wait for others to test new features", and instead to do the testing themselves. He listed a number of things that need testing in the mainline: LogFS, Ubifs mount times, ramzswap, runtime power management, and so on. "Post the results to the elinux.org wiki" or come to the next conference (October 27-28 in Cambridge, UK) and tell him about it.

The case of the overly anonymous anon_vma

By Jonathan Corbet
April 13, 2010
During the stabilization phase of the kernel development cycle, the -rc releases typically happen about once every week. 2.6.34-rc4 is a clear exception to that rule, coming nearly two weeks after the preceding -rc3 release. The holdup in this case was a nasty regression which occupied a number of kernel developers nearly full time for days. The hunt for this bug is a classic story of what can happen when the code gets too complex.

Sending email to linux-kernel can be an intimidating prospect for a number of reasons, one of which is that one never knows when a massive thread - involving hundreds of messages copied back to the original sender - might result. Borislav Petkov's 2.6.34-rc3 bug report was one such posting. In this case, though, the ensuing thread was in no way inflammatory; it represents, instead, some of the most intensive head-scratching which has been seen on the list for a while.

The bug, as reported by Borislav, was a null pointer dereference which would happen reasonably reliably after hibernating (and restarting) the system. It was quickly recognized as being the same as another bug report filed the same day by Steinar H. Gunderson, though this one did not involve hibernation. The common thread was null pointer dereferences provoked by memory pressure. The offending patch was identified by Linus almost immediately; it's worth taking a look at what that patch did.

Way back in 2004, LWN covered the addition of the anon_vma code; this patch was controversial at the time because the upcoming 2.6.7 kernel was still expected to be an old-style "stable, no new features" release. This patch, a 40-part series which fundamentally reworked the virtual memory subsystem, was not seen as stable material, despite Linus's attempt to characterize it as an "implementation detail." Still, over time, this code has proved solid and has not been changed significantly since - until now.

The problem solved by anon_vma was that of locating all vm_area_struct (VMA) structures which reference a given anonymous (heap or stack memory) page. Anonymous pages are not normally shared between processes, but every call to fork() will cause all such pages to be shared between the parent and the new child; that sharing will only be broken when one of the processes writes to the page, causing a copy-on-write (COW) operation to take place. Many pages are never written, so the kernel must be able to locate multiple VMAs which reference a given anonymous page. Otherwise, it would not be able to unmap the page, meaning that the page could not be swapped out.

The reverse mapping solution originally used in 2.6 proved to be far too expensive, necessitating a rewrite. This rewrite introduced the anon_vma structure, which heads up a linked list of all VMAs which might reference a given page. So a fork() also causes every VMA in the child process which contains anonymous pages to be added to the list maintained in the parent's anon_vma structure. The mapping pointer in struct page points to the anon_vma structure, allowing the kernel to traverse the list and find all of the relevant VMA structures.

This diagram, from the 2004 article, shows how this data structure looks:

[anonvma]

This solution scaled far better than its predecessor, but eventually the world caught up. So Rik van Riel set out to make things faster, writing this patch, which was merged for 2.6.34. Rik describes the problem this way:

In a workload with 1000 child processes and a VMA with 1000 anonymous pages per process that get COWed, this leads to a system with a million anonymous pages in the same anon_vma, each of which is mapped in just one of the 1000 processes. However, the current rmap code needs to walk them all, leading to O(N) scanning complexity for each page.

Essentially, by organizing all anonymous pages which originated in the same parent under the same anon_vma structure, the kernel created a monster data structure which it had to traverse every time it needed to reverse-map a page. That led to the kernel scanning large numbers of VMAs which could not possibly reference the page, all while holding locks. The result, says Rik, was "catastrophic failure" when running the AIM benchmark.
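The scaling problem can be reproduced with a small model of the old scheme, in which a single anon_vma heads a list of every related VMA. This is a simplified sketch, not kernel code: the Python classes and the `build` and `rmap_walk` helpers are hypothetical stand-ins, and pages are represented as plain tuples.

```python
class VMA:
    def __init__(self, owner):
        self.owner = owner
        self.pages = set()  # anonymous pages currently mapped by this VMA


class AnonVMA:
    """Old scheme: one anon_vma heads a list of every related VMA."""

    def __init__(self):
        self.vmas = []


def rmap_walk(anon_vma, page):
    """Return (VMAs mapping the page, number of VMAs that had to be examined)."""
    hits = [vma for vma in anon_vma.vmas if page in vma.pages]
    return hits, len(anon_vma.vmas)


def build(children, pages_per_child):
    """One parent, many children, each with pages private to it after COW."""
    anon_vma = AnonVMA()
    anon_vma.vmas.append(VMA("parent"))
    kids = []
    for i in range(children):
        child = VMA("child%d" % i)
        anon_vma.vmas.append(child)  # fork(): child VMA joins the parent's list
        for p in range(pages_per_child):
            child.pages.add((i, p))  # page private to this child
        kids.append(child)
    return anon_vma, kids


if __name__ == "__main__":
    # Fewer pages per child than in Rik's description, to keep the demo small.
    anon_vma, kids = build(children=1000, pages_per_child=10)
    hits, examined = rmap_walk(anon_vma, (3, 0))
    print(len(hits), "VMA maps the page;", examined, "VMAs examined")
```

Every page hangs off the same anon_vma, so finding the single VMA which maps a given page means looking at all of them.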

Rik's solution was to create an anon_vma structure for each process and to link those together instead of the VMA structures. This linking is done with a new structure called anon_vma_chain:

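    struct anon_vma_chain {
        struct vm_area_struct *vma;       /* the VMA this entry links */
        struct anon_vma *anon_vma;        /* the anon_vma this entry links */
        struct list_head same_vma;        /* all AVCs belonging to the same VMA */
        struct list_head same_anon_vma;   /* all AVCs attached to the same anon_vma */
    };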

Each anon_vma_chain entry (AVC) maintains two lists: all anon_vma structures relevant to a given vma (same_vma), and all VMAs which fall within the area covered by a given anon_vma structure (same_anon_vma). It gets complicated, so some diagrams might help. Initially, we have a single process with one anonymous VMA:

[AV Chain]

Here, "AV" is the anon_vma structure, and "AVC" is the anon_vma_chain structure seen above. The AVC links to both the anon_vma and VMA structures through direct pointers. The (blue) linked list pointer is the same_anon_vma list, while the (red) pointer is the same_vma list. So far, so simple.

Imagine now that this process forks, causing the VMA to be copied in the child; initially we have a lonely new VMA like this:

[AV Chain]

The kernel needs to link this VMA to the parent's anon_vma structure; that requires the addition of a new anon_vma_chain:

[AV Chain]

Note that the new AVC has been added to the blue list of all VMAs referencing a given anon_vma structure. The new VMA also needs its own anon_vma, though:

[AV Chain]

Now there's yet another anon_vma_chain structure linking in the new anon_vma. The new red list has been expanded to contain all of the AVCs which reference relevant anon_vma structures. As your editor said, it gets complicated; the diagram for the 1000-child scenario which motivated this patch will be left as an exercise for the reader.

When the fork() happens, all of the anonymous pages in the area point back to the parent's anon_vma structure. Whenever the child writes to a page and causes a copy-on-write, though, the new page will map back to the child's anon_vma structure instead. Now, reverse-mapping that page can be done immediately, with no need to scan through any other processes in the hierarchy. That makes the lock contention go away, making benchmarkers happy.
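The chain structure and its effect on reverse mapping can be modeled outside the kernel. The sketch below is a simplification under stated assumptions: the Python classes mirror the names in the article, but `new_process_vma`, `fork_vma`, `cow`, and `rmap_walk` are hypothetical helpers, and locking, VMA merging, and address ranges are ignored entirely.

```python
class AnonVMA:
    def __init__(self, name):
        self.name = name
        self.same_anon_vma = []  # AVCs of every VMA linked here (the blue list)


class VMA:
    def __init__(self, name):
        self.name = name
        self.same_vma = []    # AVCs of every anon_vma this VMA links to (red list)
        self.anon_vma = None  # this VMA's own anon_vma, used for newly COWed pages
        self.pages = set()


class AVC:
    """anon_vma_chain: ties one VMA to one anon_vma and sits on both lists."""

    def __init__(self, vma, anon_vma):
        self.vma = vma
        self.anon_vma = anon_vma
        vma.same_vma.append(self)
        anon_vma.same_anon_vma.append(self)


class Page:
    def __init__(self, anon_vma):
        self.mapping = anon_vma


def new_process_vma(name):
    vma = VMA(name)
    vma.anon_vma = AnonVMA(name)
    AVC(vma, vma.anon_vma)
    return vma


def fork_vma(parent, name):
    child = VMA(name)
    for avc in parent.same_vma:       # link the child to the parent's anon_vmas...
        AVC(child, avc.anon_vma)
    child.anon_vma = AnonVMA(name)    # ...then give it an anon_vma of its own
    AVC(child, child.anon_vma)
    child.pages = set(parent.pages)   # pages stay shared until written
    return child


def cow(vma, old_page):
    """Write fault: the writer gets a private copy tied to its own anon_vma."""
    vma.pages.discard(old_page)
    new_page = Page(vma.anon_vma)
    vma.pages.add(new_page)
    return new_page


def rmap_walk(page):
    """Return (VMAs mapping the page, number of VMAs examined)."""
    candidates = [avc.vma for avc in page.mapping.same_anon_vma]
    return [vma for vma in candidates if page in vma.pages], len(candidates)


if __name__ == "__main__":
    parent = new_process_vma("parent")
    shared = Page(parent.anon_vma)
    parent.pages.add(shared)
    kids = [fork_vma(parent, "child%d" % i) for i in range(1000)]
    private = cow(kids[7], shared)
    print("COWed page:", rmap_walk(private)[1], "VMA examined")
    print("shared page:", rmap_walk(shared)[1], "VMAs examined")
```

In this model a freshly forked child VMA carries two chain entries, matching the last diagram, and a page the child has written is found by examining a single VMA. Pages still shared with the parent continue to require the full walk.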

The only problem is that embarrassing oops issue. Linus, Rik, Borislav, and others chased after it, trying no end of changes. For a while, it seemed that a bug causing excessive reuse of anon_vma structures when VMAs were merged could be the problem, but fixing the bug did not fix this oops. Sometimes, changing VMA boundaries with mprotect() could cause the wrong anon_vma to be used, but fixing that one didn't help either. The reordering of chains when they were copied was also noted as a problem...but it wasn't the problem.

Linus was clearly beginning to wonder when it might all end: "Three independent bugs found and fixed, and still no joy?" He repeatedly considered just reverting the change outright, but he was reluctant to do so; the solution seemed so tantalizingly close. Eventually he developed another hypothesis which seemed plausible. An anonymous page shared between parent and child would initially point to the parent's anon_vma:

[AV Chain]

But, if both processes were to unmap the page (as could happen during system hibernation, for example) and the child then referenced it first, it could end up pointing to the child's anon_vma instead:

[AV Chain]

If the parent mapped the page later and the child then unmapped it (by exiting, perhaps), the parent would be left with an anonymous page pointing to the child's anon_vma - which no longer exists:

[AV Chain]

Needless to say, that is a situation which is unlikely to lead to anything good in the near future.

The fix is straightforward; when linking an existing page to an anon_vma structure, the kernel needs to pick the one which is highest in the process hierarchy; that guarantees that the anon_vma will not go away prematurely. Early testing suggests that the problem has indeed been fixed. In the process, three other problems have been fixed and Linus has come to understand a tricky bit of code which, if he has his way, will soon gain some improved documentation. In other words, it would appear to be an outcome worth waiting for.
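The failure and the fix can be shown side by side in a toy model. This is a sketch of the idea as described above, not the actual patch: the classes and the `map_page_buggy`, `map_page_fixed`, and `exit_process` helpers are hypothetical, each VMA's chain is assumed to be ordered with its own anon_vma first and the oldest ancestor's last, and an anon_vma is simply marked dead when its process exits.

```python
class AnonVMA:
    def __init__(self, owner):
        self.owner = owner
        self.alive = True


class VMA:
    def __init__(self, owner, inherited_chain=()):
        self.anon_vma = AnonVMA(owner)
        # Own (newest) anon_vma first, the oldest ancestor's anon_vma last.
        self.chain = [self.anon_vma] + list(inherited_chain)


class Page:
    mapping = None


def map_page_buggy(page, vma):
    """Pre-fix behavior: the page points at whoever maps it first."""
    page.mapping = vma.anon_vma


def map_page_fixed(page, vma):
    """Fix: point the page at the anon_vma highest in the process hierarchy."""
    page.mapping = vma.chain[-1]


def exit_process(vma):
    vma.anon_vma.alive = False  # simplification: the anon_vma goes away on exit


if __name__ == "__main__":
    parent = VMA("parent")
    child = VMA("child", parent.chain)

    page = Page()
    map_page_buggy(page, child)   # child touches the page first
    fixed_page = Page()
    map_page_fixed(fixed_page, child)

    exit_process(child)
    print("buggy mapping still valid:", page.mapping.alive)
    print("fixed mapping still valid:", fixed_page.mapping.alive)
```

Because a parent's anon_vma outlives those of its children in this model, a page tied to the top of the hierarchy never ends up pointing at freed memory.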

Page editor: Jonathan Corbet

Distributions

News and Editorials

SimplyMepis 8.5

April 14, 2010

This article was contributed by Susan Linton

I have been looking forward to the release of SimplyMepis 8.5 for at least six months. I used SimplyMepis 8.0 the first half of last year with its KDE 3.5 desktop and was very content. I was late to the KDE 4 bandwagon, but I became interested last summer when reports pegged it as stable and usable. Reports were already circulating that the next release of Mepis would have KDE 4 as the desktop. It was a long hard wait, but the opportunity finally arrived on March 30 with the release of SimplyMepis 8.5.

Under the Hood

[SimplyMepis desktop]

This release of MEPIS is an update to version 8.0 released last year. Although many bugfixes, enhancements, and features have been added, much of the underlying code has remained the same. For example, this release was compiled with GCC 4.3.2, like 8.0, and still uses Glibc 2.7, Perl 5.10.0, Python 2.5.2, and Xorg X Server 1.4.2. That doesn't mean significant updates haven't been applied. Qt was updated to 4.5.3, GTK+ updated to 2.18.3, and the kernel updated to 2.6.32. Basically, SimplyMepis 8.5 was updated as much as possible without breaking compatibility with Debian 5.0 Lenny.

A newer kernel was implemented in order to support some of the newer hardware that may have come into use since 2.6.27 was released. MEPIS uses the kernel from the Debian developmental branch and Debian is known to patch its kernels for features, security and bug fixes. A few fixes found in recent patch sets include code to repair breakage in Dosemu and Wine, expand some bug reporting information, and fix a Debian-specific bug in modules.dep generation by module-init-tools. Linux 2.6.32 brought support for goodies such as ATI R600/R700 3D graphic cards, Micrel KS8851 Ethernet chips, ACPI 4.0, and CX25821 and Hauppauge HVR TV cards.

On the Surface

As with each new SimplyMepis release, the boot and login images have been redesigned for beauty. But what's most striking at first look at the desktop is how SimplyMepis it truly is. Yes, the desktop has been upgraded to KDE 4.3.4 in this release, yet it's still very SimplyMepis.

The background is in the tradition of typical SimplyMepis in shades of blue and featuring the newest Pyramid logo. The panel is small and unobtrusive with very few applets and several application launchers. A few icons populate the desktop. The trademark KDE Widget Curls give away the desktop version underneath. SimplyMepis uses the Oxygen Widget style and Crystal Window decorations with Kubuntu Feisty buttons, and the OxywinM Panel background, dialogs and tooltip color scheme. On the desktop are icons for Trash, Documents folder, MEPIS Website, and the MEPIS manual. SimplyMepis 8.5 arrives with no window effects enabled.

In the Menu

SimplyMepis features a classic hierarchical menu familiar in KDE 3 and many other desktop environments. It, as with other KDE 4 desktops, can be changed to the Kickoff menu with a right-click of the mouse. Lancelot is also available in the Add Widgets dialog.

The GIMP was removed from SimplyMepis' default install in 8.0, but Gwenview is provided for image viewing and browsing. OpenOffice.org 3.1.1 is available for office tasks and Firefox 3.5.6 is included for Web browsing. Video enjoyment can be had with KMPlayer, and Google Gadgets can be used to customize and tweak your desktop. If your partition gets a bit tight, then perhaps Sweeper can assist with clearing some space. Of course, APT and Synaptic can fill in the gaps. As expected, lots of applications were updated with bug fixes this cycle.

From MEPIS

[SimplyMepis wc search]

MEPIS includes a number of configuration tools for the SimplyMepis distribution. Most are uncomplicated, yet provide essential functionality. Several received some code updates but few if any changes appear in the GUI. Those we've seen before include the Network Assistant that will set up wired and wireless connections and a System Assistant that can set up hostname, a bootable USB key, install GRUB, or check a partition. The User Assistant adds or deletes users and the X-Window Assistant sets up mice, keyboards, and graphics.

This release did receive two new assistants. One is the NDisWrapper Manager. It can either load one of the commonly bundled wireless Ethernet drivers or can install one from a file. It can even be used to blacklist any that may be loaded at boot during auto-detection.

[SimplyMepis wc apps]

The second new tool is the MEPIS Welcome Center, which seems aimed at new users in particular. Divided into two sections, the first provides information and the second allows certain package management functions. Under Recommended First Steps one can search for particular keywords or just peruse the MEPIS Manual, Wiki, or Forums. Under the Optional Extras, one can install language packs, quickly install popular applications, or activate community software repositories. Some of the popular applications available are Amarok, GIMP, and Wine. The Welcome Center can be found both stand-alone in the menu and bundled in the KDE Control Center.

Personal Experience

My experience with SimplyMepis so far has been smooth and pretty much uneventful. The install process completed with no issues. I did experience some quirks with KDE. The infamous random Akregator crashes have occurred a few times, sometimes losing the latest feeds. One time the whole Kwin window manager crashed and prompted a ctrl+alt+backspace. A couple of times some Konqueror windows just disappeared. After a fresh login, the wallpaper on my second monitor isn't displayed, instead reverting to a solid color background. Finally, Nspluginviewer still consumes all the CPU resources even if plugins are disabled.

I'm afraid I can't really comment accurately on performance. I upgraded my computer hardware just prior to the release of SimplyMepis 8.5, and unfortunately, this erased my frame of reference. All I can really say is that on my new computer SimplyMepis with KDE 4.3.4 is very responsive even with window effects enabled.

In conclusion, while technically this was a minor version update, SimplyMepis 8.5 represents a big change for developers and users. As the last official KDE 3 holdout moves on, it signals the true beginnings of a new era. I still get emails every once in a while from users complaining about being forced to migrate to KDE 4 and for that body of users, SimplyMepis 8.5 is a wonderful transitional release. It presents KDE 4 in an environment that remains very similar in appearance to its previous KDE 3 desktop. For SimplyMepis users, it is still very much like home. For new users, it could be a gentle introduction to KDE 4.

New Releases

Announcing the release of Fedora 13 Beta

The beta release of Fedora 13 is now available. "The beta release is the last important milestone of Fedora 13. Only critical bug fixes will be pushed as updates leading up to the general release of Fedora 13, scheduled to be released in the middle of May. We invite you to join us and participate in making Fedora 13 a solid release by downloading, testing, and providing your valuable feedback."

openSUSE 11.3 Milestone 5: The Community Strikes Back

The openSUSE project has released openSUSE 11.3 Milestone 5. "M5 was marked by significant contributions from both the openSUSE Community, and the larger Linux community. We've added some interesting new packages, made some updates to core processes, and participated in a coordinated multi-distribution upgrade of a major multimedia component. Over 50 bugs were fixed and 8 new features were implemented."

Ubuntu 10.04 LTS Beta 2 released

The Ubuntu team has announced the second beta release of Ubuntu 10.04 LTS (Long-Term Support) Desktop and Server Editions and Ubuntu 10.04 LTS Server for Ubuntu Enterprise Cloud (UEC) and Amazon's EC2, as well as Ubuntu 10.04 Netbook Edition.

Distribution News

Debian GNU/Linux

New Debian archive snapshot service

The Debian project has announced a new service, a wayback machine that allows access to old packages based on dates and version numbers. "The ability to install packages and view source code from any given date can be very helpful to developers and users alike. It provides a valuable resource for tracking down when regressions were introduced, or for providing a specific environment that a particular application may require to run. The snapshot archive is accessible like any normal apt repository, allowing it to be easily used by all."

Fedora

Fedora Board Meeting Recap 2010-04-08

Click below for a recap of the April 8, 2010 meeting of the Fedora Advisory Board. Topics include Election schedule, Spins, User base, and Fedora UX designers.

Ubuntu family

Ubuntu switches search back to Google

Canonical's plan to switch the default search provider to Yahoo for the 10.04 release appears to have not worked out; the default will be changed back to Google before the release is made. "It was not our intention to 'flap' between providers, but the underlying circumstances can change unpredictably. In this case, choosing Google will be familiar to everybody upgrading from 9.10 to 10.04 and the change will only be visible to those who have been part of the development cycle for 10.04."

Other distributions

Linux Mint 6 Felicia reaches end of life

The Linux Mint team has announced that Linux Mint 6 Felicia will reach end of life on April 30, 2010. This release was based on Ubuntu 8.10, which is planned to reach end of life on the same date.

Distribution Newsletters

Debian Project News - April 12th, 2010

The first issue of Debian Project News for 2010 is available. Topics include Debian Project Leader elections, Bits from the Release Team, Estimates of the number of Debian users, Bits from the DPL, New archive snapshot service available, MiniDebConf held in Panama, First German Debian Mini Conference, Graphical Installer for ARM-Based netbooks, QEMU image for SH4 port available, and more.

DistroWatch Weekly, Issue 349

The DistroWatch Weekly for April 12, 2010 is out. "The constantly evolving development branches of major distributions are a double-edge sword: one one hand, they offer the very latest applications and technologies, but on the other, they tend to break in the most inopportune moments. The sidux project, which aims to stabilise Debian "sid" and release it as a well-tested, yet cutting-edge distro, could be a great compromise between the typical geek's two conflicting desires. Read on for our first-look review of sidux 2009-04 and a brief interview with the distribution's lead developers. In the news section, the Arch Linux release engineering team updates the ISO images release process, Gentoo announces the launch of a new cooperative Wiki project, TuxRadar presents a comprehensive group test of today's most prominent lightweight distributions, and North Korea is rumoured to have developed its own Linux-based operating system. Also in this issue, news about an interesting multi-boot live DVD containing 11 mini-distributions and a brief look at some of today's gaming options on Linux. Happy reading!"

Fedora Weekly News 220

The Fedora Weekly News for April 7, 2010 is out. "This week's issue kicks off with a couple announcements, including news on opening bids for FUDCon 2011 locations, details on the one week slip on Fedora 13 beta, and links to upcoming Fedora events globally. From the Fedora Planet, news and views from Fedora community members including availability of Red Hat Enterprise Virtualization 2.2, lots of libguestfs tips, the creation of "A K12 Educator's Guide to Open Source Software", and the availability of the open source texbook on Open Source...."

openSUSE Weekly News/118

The openSUSE Weekly News for April 10, 2010 is out with news and articles from the openSUSE community. "From this issue on, we have a new Layout. We have more Teamreports, an Kernel Review (WIP) and the Sections "From the Ambassadors" and 'openSUSE in $Country'. In that Place every Translation Team can post local Events and other stuff..."

Ubuntu Weekly Newsletter #188

The Ubuntu Weekly Newsletter for April 10, 2010 is out. "In this issue we cover: Ubuntu 10.04 LTS Beta 2 released, Countdown Banner is live, help spread the word, Regional Membership Boards: Restaffing, Call for New Operators in the #ubuntu, #kubuntu and #ubuntu-offtopic channels, Patch Day, May 5th 2010, Next Ubuntu Hug Day! - April 15, Being passionate about some things, Website Localization Project Meeting, Reviving the Ubuntu Accessibility Team, Ubuntu One contact phone sync opened again, Canonical Upgrading GNOME Bugzilla and Commercial Sponsorship, Ubuntu's News Web Office Integration, and much, much more!"

Distribution reviews

What's the best lightweight Linux distro? (TuxRadar)

TuxRadar takes a look at several light-weight Linux distributions. "The important things that we'll look at here are the amount of space needed, how much processing power is required to get the distro running at an acceptable level, and the effort required to get it to work. Something to bear in mind is that one of the ways in which developers are able to create slimmed-down distros is by ditching the scripts and wizards that we've come to take for granted. This can complicate tasks that you might expect to be straightforward, such as installing software." The article looks at Damn Small Linux, CrunchBang, Lubuntu, Puppy Linux, SliTaz, Tiny Core Linux, Unity Linux, and VectorLinux.

Page editor: Rebecca Sobol

Development

MongoDB: leave your SQL at home

April 14, 2010

This article was contributed by Nathan Willis

MongoDB is an open source document-oriented database system that is designed for speed and scalability in web site data operations, bridging the gap between simple "key/value" structured storage and the heavyweight requirements of relational database systems. Like other databases in the so-called "NoSQL" vein, MongoDB trades in full ACID compliance for the ability to solve a smaller set of problems easily and quickly.

MongoDB theory

MongoDB's data sets are called "collections" and are roughly analogous to the tables in a traditional relational database. Unlike relational database tables, however, they have no predefined structure (or schema, to use the canonical term) — each record in the collection is a "document" that can potentially have a different structure than every other document in the collection.

This is not to say that MongoDB documents are unstructured, of course; they use a key-value pair syntax modeled on the popular JavaScript Object Notation (JSON) format. MongoDB calls this syntax BSON (alternately expanded as "Binary JSON" and "Binary Serialized dOcument Notation"), and it is designed to be easily traversed, easily coded-to, and lightweight, enough so that it is also MongoDB's network transfer format. Document keys are strings, and values can be a variety of types including strings, arrays, and even other documents.

For example, a JSON object such as

    {
        "firstName": "Nathan",
        "lastName": "Willis",
        "Url": "http://www.freesoftwhere.org"
    }

would appear quite simply as the document:

    {"firstName" : "Nathan" , "lastName" : "Willis" , "Url": "http://www.freesoftwhere.org" , \
     "_id" : ObjectId("497cf6075172cf775cace8fb")}

in a MongoDB collection.

MongoDB's query language is also based on the BSON syntax, so data can be fetched with simple expressions such as db.users.find({'lastName': 'Willis'}) or sorted with db.users.find({}).sort({lastName: 1}). All of MongoDB's queries are dynamic, however, meaning that clients can query the database on any key, without first having to calculate a "view" that indexes the data based on a particular key. This is different from other document-oriented databases, such as CouchDB, which can perform only static queries.
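What "dynamic" means here can be illustrated without a database at all: a query is just a document of key/value pairs, and any key may be used without preparation. The sketch below is a toy in-memory model using plain Python dictionaries, not the MongoDB driver API; the `find` and `sort_by` helpers and the second and third sample records are hypothetical.

```python
def find(collection, query=None):
    """Return the documents whose fields equal every key/value in `query`."""
    query = query or {}
    return [doc for doc in collection
            if all(doc.get(key) == value for key, value in query.items())]


def sort_by(docs, key, direction=1):
    """Sort documents on `key`; 1 is ascending, -1 descending (key must exist)."""
    return sorted(docs, key=lambda doc: doc[key], reverse=(direction < 0))


# Documents in one collection need not share a structure.
users = [
    {"firstName": "Nathan", "lastName": "Willis",
     "Url": "http://www.freesoftwhere.org"},
    {"firstName": "Alice", "lastName": "Adams"},                  # hypothetical
    {"firstName": "Bob", "lastName": "Baker", "team": "docs"},    # hypothetical
]

if __name__ == "__main__":
    print(find(users, {"lastName": "Willis"}))
    print([doc["lastName"] for doc in sort_by(find(users), "lastName")])
    print(find(users, {"team": "docs"}))  # a key only some documents have
```

A real server would normally use indexes to avoid scanning every document; the linear scan here is only meant to show the shape of the queries.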

The conceptual differences between MongoDB's schema-free documents and a traditional relational database produce some limitations, but also enable some real-world speed optimizations. Developer Richard Kreuter described MongoDB in a talk at Texas Linux Fest on April 10. He said that because documents are schema-free, the database can be designed to store information commonly accessed in a serial fashion within a single document — for example, a blog post's content and all of the reply comments. 99 percent of the time, he said, they will be retrieved in precisely that order. By not storing the post, user names, and comments in separate tables, access is substantially sped up. The only cost is loss of the comparatively-infrequently-needed ability to atomically update the post and the comments simultaneously from different database clients.
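Kreuter's blog example can be sketched by comparing the two layouts directly. Everything below is illustrative: the field names, sample data, and helper functions are assumptions made for the sketch, not a schema taken from the talk.

```python
# Relational style: three "tables" that must be joined on every page view.
posts = {1: {"title": "Hello", "body": "First post"}}
users = {10: "alice", 11: "bob"}
comments = [
    {"post_id": 1, "user_id": 10, "text": "Nice"},
    {"post_id": 1, "user_id": 11, "text": "Agreed"},
]


def page_from_tables(post_id):
    """Assemble a page by looking up the post, its comments, and their authors."""
    post = posts[post_id]
    replies = [(users[c["user_id"]], c["text"])
               for c in comments if c["post_id"] == post_id]
    return post["title"], post["body"], replies


# Document style: the post and its replies live together, already in order.
post_doc = {
    "_id": 1,
    "title": "Hello",
    "body": "First post",
    "comments": [
        {"author": "alice", "text": "Nice"},
        {"author": "bob", "text": "Agreed"},
    ],
}


def page_from_document(doc):
    """Assemble the same page from a single document."""
    replies = [(c["author"], c["text"]) for c in doc["comments"]]
    return doc["title"], doc["body"], replies


if __name__ == "__main__":
    print(page_from_tables(1) == page_from_document(post_doc))
```

Both functions produce the same page, but the document version needs a single fetch. The price, as noted above, is that two clients cannot atomically update the post and its comments at the same time.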

The project lists web site content management, real-time analytics, caching, and logging as ideal use cases for MongoDB. Highly transactional systems, on the other hand, are a poor fit, as the MongoDB server can enforce transactionality only on operations that touch a single document.

In addition to its overall document-centric design, MongoDB also offers several interesting features that database application developers are likely to find convenient. One example is the "upsert" operation, which updates a document if a matching one already exists, and creates it if it does not. Another example is "capped collections," in which a collection is created with a fixed size, and the oldest entries are automatically removed. Capped collections automatically preserve insertion order, and free the developer from having to manually "age-out" the oldest objects by tracking their timestamps.
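Both features are easy to picture with ordinary Python containers. The sketch below is a toy model, not MongoDB's API: `upsert` and `capped_collection` are hypothetical helpers, and the capped collection here is limited by document count purely for simplicity.

```python
from collections import deque


def upsert(collection, query, changes):
    """Update the first document matching `query`, or insert a new one."""
    for doc in collection:
        if all(doc.get(key) == value for key, value in query.items()):
            doc.update(changes)
            return "updated"
    new_doc = dict(query)
    new_doc.update(changes)
    collection.append(new_doc)
    return "inserted"


def capped_collection(max_docs):
    """A fixed-size container: appending to a full deque drops the oldest entry."""
    return deque(maxlen=max_docs)


if __name__ == "__main__":
    hits = []
    print(upsert(hits, {"page": "/index"}, {"count": 1}))
    print(upsert(hits, {"page": "/index"}, {"count": 2}))
    print(hits)

    log = capped_collection(3)
    for n in range(5):
        log.append({"n": n})
    print(list(log))  # only the three newest entries remain, oldest first
```

The upsert removes the usual "check, then insert or update" dance from client code, while the capped container keeps a rolling window of recent entries without any timestamp bookkeeping.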

MongoDB deployment and administration

MongoDB is developed primarily at 10gen, a company which offers commercial support contracts and training for MongoDB administration and development. The latest release is version 1.4, from March 22, 2010, and is under the AGPL version 3. The project provides packages for 32-bit and 64-bit versions of x86 Linux, Solaris, Windows, and Mac OS X, as well as an Apt repository for Debian and Ubuntu.

The main MongoDB server runs as the mongod process. Packages include a shell interpreter interface called mongo, which uses JavaScript as its command language; most of the documentation and tutorials on the Mongo web site use this interface for their examples. Language drivers are available for C, C++, Python, Java, and Perl clients in the official packages, and C#, REST, ColdFusion, Ruby, PHP, JavaScript, and several others in community-supported add-ons.

Mongo supports several replication configurations, including the usual master-slave, as well as "replica sets" that automatically negotiate which database server functions as the master at a given point in time. Master-master replication is supported only in a limited fashion.

Mongo is designed to be highly horizontally scalable, supporting database cluster functionality like failover, map/reduce, and sharding. The current release supports auto-sharding, in which a routing process called mongos interacts with the client in order to abstract away the actual cluster of mongod servers.
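The routing role can be pictured as a lookup from a document's shard key to the server holding it. The sketch below uses generic range partitioning as an assumption; the article does not describe how mongos actually divides data, and the `Router` class, split points, and shard names are all hypothetical.

```python
import bisect


class Router:
    """Toy stand-in for a routing process: maps a shard-key value to a server."""

    def __init__(self, split_points, shards):
        # split_points must be sorted; there is one more shard than split point.
        if len(shards) != len(split_points) + 1:
            raise ValueError("need exactly len(split_points) + 1 shards")
        self.split_points = list(split_points)
        self.shards = list(shards)

    def shard_for(self, key):
        # Keys below the first split point go to shard 0, and so on upward.
        return self.shards[bisect.bisect_right(self.split_points, key)]


if __name__ == "__main__":
    router = Router(["h", "q"], ["shard0", "shard1", "shard2"])
    for last_name in ("adams", "hopper", "willis"):
        print(last_name, "->", router.shard_for(last_name))
```

The client only ever talks to the router, so servers can be added or ranges rebalanced behind it without the application changing.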

Because Mongo does not support transactions in the sense that relational databases support them, it does not support transaction logs that enable database repair — the only real protections against data loss are backups and replication. One other feature worth noting is that the current releases of Mongo only support username-and-password authentication that grants read-write or read-only access to a particular database. Deployments that need stronger security or more fine-grained access control may not find Mongo a good fit.

Still, there are plenty of large-scale production MongoDB servers in the wild, most notably the web, project, and download pages on SourceForge.net, the GitHub service, and the Disqus blog-discussion system. Those examples and the others listed on the Mongo "production deployments" page all seem to fit broadly into the problem space that Mongo is optimized for: "high-volume, low-value data" web sites, which have little need for the transactional guarantees that a relational database system like MySQL provides. If your site also fits the pattern, MongoDB deserves a close look.

Comments (5 posted)

Brief items

Quotes of the week

Developing the release process is almost as hard as developing the code.
-- Keith Packard

Think Shakespearean and you can get an accurate count though:

notmuch count to be or not to be

OK, we need a simpler search syntax than that...

-- Carl Worth

Comments (none posted)

Bricolage 2.0 released

Bricolage is a content management system aimed at organizations with large amounts of content; the 2.0 release has been announced. Changes include a reworked interface ("The amazingly-flexible Bricolage approach to document editing is now also amazingly easy to work with"), a number of backend improvements, and more. See the changelog for details.

Full Story (comments: none)

GNUmed 0.7.0

Version 0.7.0 of the GNUmed medical records management package is out. It has a number of new features which are certainly unique to this type of software ("manage date of death per patient"), but the core feature seems to be "a rather unexpected new functionality" in the form of visual progress notes. See this posting for more information.

Full Story (comments: none)

IcedTea6 1.8 released

IcedTea is a Java development kit build done entirely with open source tools. The 1.8 release is out; it includes an OpenJDK update, but the key aspect of this release would appear to be the fixing of a discouragingly large number of security issues.

Full Story (comments: 3)

Perl 5.12.0 released

The Perl 5.12.0 release is out; it marks the transition to a time-based release process for Perl 5, where a major release will happen each (northern-hemisphere) spring. Changes in this release include better Unicode support, some new APIs to make it even easier to extend the language, a solution to the Y2038 problem, a "yada yada operator," and more; see this page for a detailed list.

Full Story (comments: 18)

Python 2.7 beta 1 released

The Python development team has announced the first beta release of Python 2.7. Python 2.7 is likely to be the last major version in the 2.x series, although more major releases have not been absolutely ruled out. "2.7 includes many features that were first released in Python 3.1. The faster io module, the new nested with statement syntax, improved float repr, set literals, dictionary views, and the memoryview object have been backported from 3.1. Other features include an ordered dictionary implementation, unittests improvements, a new sysconfig module, and support for ttk Tile in Tkinter."
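For readers who have not followed the 3.x series, a short sketch of a few of the features named in the announcement may help. The variable names and sample data are made up for illustration; the code sticks to syntax and modules that work the same way in 2.7 and in current 3.x interpreters, which is why dictionary views (spelled differently in the two series) are left out.

```python
import io
import sysconfig
from collections import OrderedDict

# Set literal: no need to write set([2, 3, 5, 7]).
primes = {2, 3, 5, 7}

# Ordered dictionary: iteration follows insertion order.
releases = OrderedDict()
releases["2.6"] = 2008
releases["2.7"] = 2010
release_order = list(releases)

# Nested "with" syntax: two context managers in a single statement.
with io.StringIO(u"beta 1\n") as src, io.StringIO() as dst:
    dst.write(src.read().upper())
    copied = dst.getvalue()

# memoryview: modify a buffer in place without copying it.
buf = bytearray(b"python")
view = memoryview(buf)
view[0:1] = b"P"

# Improved float repr: the shortest string that round-trips.
tenth = repr(0.1)

# sysconfig: query the interpreter's build configuration.
version = sysconfig.get_python_version()

print(primes, release_order, copied, bytes(buf), tenth, version)
```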

Full Story (comments: none)

WebKit2 posted

A new major revision of the WebKit rendering engine has been posted by Apple. "WebKit2 is designed from the ground up to support a split process model, where the web content (JavaScript, HTML, layout, etc) lives in a separate process. This model is similar to what Google Chrome offers, with the major difference being that we have built the process split model directly into the framework, allowing other clients to use it." Unfortunately, it lacks a Linux port at the moment, but one assumes that can be fixed.

Full Story (comments: none)

Xen hypervisor 4.0.0 released

Xen has released the Xen hypervisor 4.0.0. See the release notes for more information. "Xen 4.0 includes and builds the new pvops dom0 Linux 2.6.31.x kernel as a default. There's also long-term supported Linux 2.6.32.x based pvops dom0 kernel tree available. You can also use the old-style linux-2.6.18-xen as the dom0 kernel, or any of the various forward-ports of the 2.6.18 xen patches to newer kernels."

Comments (4 posted)

Newsletters and articles

Development newsletters from the last week

Comments (none posted)

Ubuntu's Success Story: the Upstart Startup Manager (LinuxPlanet)

Over at LinuxPlanet, Akkana Peck looks at Upstart, which is rapidly supplanting System V init for many distributions. "Upstart, in contrast, is event based. An 'event' can be something like 'booting' ... or it can be a lot more specific, like 'the network is ready to use now'. You can specify which scripts depend on which events. Anything that isn't waiting for an event can run whenever there's CPU available. [...] This event-based system has another advantage: you can theoretically use it even after the system is up and running. Upstart is eventually slated to take over tasks such as plugging in external devices like thumb drives (currently handled by udev and hal), or running programs at specific times (currently handled by cron)."
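The event-driven idea described in the quote can be sketched in a few lines of Python. This is a toy model only, not Upstart's job format or implementation; the ToyInit class and the job and event names ("startup", "net-up", "usb-plugged") are hypothetical. Each job waits for one event, and a job may emit a further event once it has started.

```python
from collections import defaultdict


class ToyInit:
    """Hypothetical event-driven job starter; not how Upstart is implemented."""

    def __init__(self):
        self.waiting = defaultdict(list)  # event name -> jobs waiting on it
        self.started = []                 # job names, in start order

    def job(self, name, start_on, emits=None):
        # Register a job that starts when the start_on event is seen.
        self.waiting[start_on].append((name, emits))

    def emit(self, event):
        # Start every job waiting on this event; pop() ensures each
        # registered job is started only once.
        for name, emits in self.waiting.pop(event, []):
            self.started.append(name)
            if emits:
                self.emit(emits)


init = ToyInit()
init.job("networking", start_on="startup", emits="net-up")
init.job("sshd", start_on="net-up")
init.job("automount", start_on="usb-plugged")

init.emit("startup")      # boot: starts networking, which triggers sshd
boot_jobs = list(init.started)
init.emit("usb-plugged")  # later, on a running system
print(boot_jobs, init.started)
```

The second emit call is the part that a fixed, sequential boot script cannot express: the same mechanism keeps reacting to events after the system is up.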

Comments (46 posted)

Page editor: Jonathan Corbet

Announcements

Non-Commercial announcements

Fedora Summer Coding 2010 is looking for students

Fedora is sponsoring a "summer of code"-style program for students called Fedora Summer Coding, which is looking for student participants. The ideas page is a little thin right now, but mentors have until April 14 to add to that list. "We are rapidly constructing this summer coding program. We know what we are doing, but because of timing, we are building the infrastructure, process, and requirements as we go. It's like moving in to a house while the scaffolding is still outside. The Fedora Project makes it easy to do stuff like this, since the plumbing and stuff are already in place. (Enough of that metaphor ?)". Click below for Karsten Wade's full invitation to students.

Full Story (comments: 7)

Commercial announcements

Mandriva Announces Arnaud Laprévote as CEO

Mandriva has announced that its board of directors has named Arnaud Laprévote to serve as the company's Chief Executive Officer. Laprévote will also hold the position of Chief Technical Officer and of Director of Research and Development.

Comments (none posted)

OIN announces increase in licensing program

The Open Invention Network (OIN) has announced a significant increase in the number of new licensees in its most recent fiscal quarter. "During the first quarter of 2010, OIN signed 40 new licensees. OIN licensees benefit from enhanced leverage that is driven by access to OIN and shared intellectual property resources that may be employed to deter patent aggression against open source users and community members."

Comments (none posted)

Ulteo Joins Open Invention Network

Ulteo has joined the Open Invention Network (OIN). "“We view an OIN license as one of the key methods through which open source innovators can deter patent aggression,” said Gaël Duval, co-founder. “We are committed to freedom of action in Linux, and in taking a license we help to address the threat from companies that support proprietary platforms to the exclusion of open source initiatives, and whose behaviors reflect a disdain for inventiveness and collaboration.”"

Comments (none posted)

rPath Makes Linux Patching Faster, Safer, More Predictable

rPath has announced enhancements to the intelligent patching capabilities of its next-generation system automation platform. "Specifically, rPath now automates inventory discovery, allows users to "cherry pick" updates and errata for incremental updates, and simplifies the user experience for Linux patching and system administration. To encourage Red Hat Network (RHN) Satellite users to try the rPath platform, the company has launched its "Satellite Swap-Out" promotional offer. For existing RHN Satellite customers, rPath will match or beat their current subscription to RHN Satellite with a richer, more complete solution."

Full Story (comments: none)

Cray Releases Latest Version of Its Linux Operating System Equipped With New Cluster Compatibility Mode

Cray Inc. has announced the release of the latest version of its Cray Linux Environment. "This third generation of the Cray Linux Environment includes the introduction of Cluster Compatibility Mode, allowing Cray XT supercomputers to run applications from Independent Software Vendors (ISVs) without modifications."

Comments (none posted)

Articles of interest

Livnat Peer: Switching from C# to Java

Livnat Peer looks at porting a C# application to Java. "In 2008 RedHat acquired Qumranet, a startup whose focus was Virtualization. Among other products Qumranet developed a management application for Virtualization. The management application was written in C# and one of the first tasks we got was to make the management application cross platform, well this was expected considering the fact that the acquisition was done by RedHat... We started exploring the web looking for ideas how to approach this task. At the beginning things did not look promising most of the references we found for porting projects from one technology to another were about complete failures, the only obvious suggestion that we saw all over was not to change technology and architecture at the same time."

Comments (29 posted)

Interesting times for Video on the Web

Robin Watts has a weblog post about Google funding for the TheorARM project. From the blog: "These ARM based devices represent the single biggest class of devices still needing work for decent Theora playback. Any efficiency savings we can make feed back directly into being able to cope with larger screen sizes or giving longer battery life. This is where Google's grant comes in - by helping fund the development of TheorARM (a free optimised ARM version of Theora), they are helping to hasten the day when video works everywhere on the web, for everyone. That's got to be something to be pleased about." (Thanks to Paul Wise)

Comments (30 posted)

Welte: Anatomy of contemporary GSM cellphones

Harald Welte has posted a low-level look at how GSM phones work. "The specifications of the GSM proprietary On-air encryption A5/1 and A5/2 are only made available to GSM baseband chip makers who declare their confidentiality. Implementing the algorithm in software is apparently considered as breach of that confidentiality. Thus, the encryption algorithms are only implemented in hardware - despite them being reverse-engineered and publicly disclosed by cryptographers as early as 1996."

Comments (3 posted)

Hacker previews custom 3.21 PS3 firmware with Linux support (The H)

PlayStation3 hacker George Hotz has shown a proof-of-concept hack to restore the "Install Other OS" functionality to the device, as reported by The H. "Less than 48 hours after Sony's announcement, Hotz called for PS3 owners not to update their systems and announced that he would find 'a safe way of updating to retain OtherOS support'. Hotz has now published a video of the working custom firmware, which he calls 3.21OO, as proof on YouTube. The hacker says that the custom firmware can be 'installed without having to open up your PS3, just by restoring a custom generated PUP file, but only from 3.15 or previous'." We looked at Sony's move to disallow running Linux on PS3s in last week's edition, and noted that Hotz seemed likely to make good on his threat to develop custom firmware that routed around Sony's intentions.

Comments (2 posted)

10th Anniversary of Linux for the Mainframe: Beginning to Today (eWeek)

In celebration of ten years of mainframe Linux deployments, eWeek takes a look at the history of Linux on mainframes. The article is annoyingly broken up over six pages, but does cover the milestones in developing and deploying Linux on mainframes. "Over the years, various features available on the mainframe have made their way into the Linux code base for multiple platforms, leading to significant improvements in the Linux operating system. For example, mainframe dynamic resource management capabilities have made their way into Linux for x86 platforms, and features such as the tickless timer have also made their way from the mainframe into the Linux code base."

Comments (8 posted)

Father of Java leaves Oracle (The H)

The H reports that James Gosling has left Oracle. "Just a few months after Sun's acquisition by Oracle, James Gosling, inventor of the Java programming language and Chief Technology Officer of Sun's Developer Products group, has left the company. In a post on his blog, Gosling confirms that he resigned from Oracle on the 2nd of April."

Comments (5 posted)

Mueller: Patents used by IBM also a threat to other FOSS projects

Florian Mueller has published an initial analysis identifying a dozen of the patents IBM asserted against Hercules that may also read on other major Free and Open Source Software (FOSS) projects. "In order to look into this in more depth and find additional IBM patents assertable against Free and Open Source Software, Mueller issues a "Call to Research", encouraging members of the FOSS community to provide further analysis and identify additional problems that IBM's patents -- especially the ones already asserted against the French open source startup TurboHercules -- could represent."

Full Story (comments: none)

Interviews

Q&A with Nokia's Ari Jaaksi: MeeGo Revs Up (Linux.com)

Jennifer Cloer talks with Ari Jaaksi, Nokia's Vice President of MeeGo Devices. "How is the "big merge" going and are things on track to deliver MeeGo v1.0 in Q2? Jaaksi: We're moving right along and making great progress. Following the initial announcement at Mobile World Congress, we've released the MeeGo core operating system repositories - anyone can go to meego.com and download this package for free. And just yesterday, a number of leading companies spanning chipset designers, device manufacturers, software vendors and more announced their support for MeeGo. We're well on our way toward the MeeGo 1.0 release."

Comments (none posted)

Meeting Minutes

GNOME Foundation Meeting Minutes - April 1, 2010

Click below for a look at the minutes from the April 1, 2010 meeting of the GNOME Foundation Board. Topics included board meetings at GUADEC, acceptance of the new Code of Conduct, approval of Hackfest sponsorships, the budget for Brazilian events, events in Africa, and several others.

Full Story (comments: none)

Calls for Presentations

SciPy 2010 News: Specialized track deadline extended

The deadline for submitting an abstract for a SciPy 2010 specialized track has been extended until April 25, 2010.

Full Story (comments: none)

DeepSec 2010 - Call for Papers and Experts

DeepSec In-Depth Security Conference 2010 has announced a call for papers. The conference will be in Vienna, Austria from November 23-26, 2010. The call for proposals is open until July 31, 2010.

Full Story (comments: none)

Upcoming Events

Register for Akademy 2010

The 2010 Akademy summit is open for registration. "Starting July 3rd 2010, hundreds of KDE community members, employees of companies working with us and many other Free Software enthusiasts will gather at Tampere, Finland. There, at the University of Tampere, the annual Akademy summit 2010 will take place. For a full week, Tampere will be the place where stunning new technology is demonstrated, hundreds of prominent Free Software contributors walk the corridors and new plans for the future of the Free Desktop emerge."

Comments (none posted)

Events: April 22, 2010 to June 21, 2010

The following event listing is taken from the LWN.net Calendar.

Date(s) | Event | Location
April 23 - April 25 | FOSS Nigeria 2010 | Kano, Nigeria
April 23 - April 25 | QuahogCon 2010 | Providence, RI, USA
April 24 | Festival Latinoamericano de Instalación de Software Libre | Many, Many
April 24 | Open Knowledge Conference 2010 | London, UK
April 24 - April 25 | OSDC.TW 2010 | Taipei, Taiwan
April 24 - April 25 | BarCamb 3 | Cambridge, UK
April 24 - April 25 | Fosscomm 2010 | Thessaloniki, Greece
April 24 - April 25 | LinuxFest Northwest | Bellingham WA, USA
April 24 - April 26 | First International Workshop on Free/Open Source Software Technologies | Riyadh, Saudi Arabia
April 25 - April 29 | Interop Las Vegas | Las Vegas, NV, USA
April 28 - April 29 | Xen Summit North America at AMD | Sunnyvale, CA, USA
April 29 | Patents and Free and Open Source Software | Boulder, CO, USA
May 1 - May 2 | OggCamp | Liverpool, England
May 1 - May 2 | Devops Down Under | Sydney, Australia
May 1 - May 4 | Linux Audio Conference | Utrecht, NL
May 3 - May 6 | Web 2.0 Expo San Francisco | San Francisco, CA, USA
May 3 - May 7 | SambaXP 2010 | Göttingen, Germany
May 6 | NLUUG spring conference: System Administration | Ede, The Netherlands
May 7 - May 8 | Professional IT Community Conference | New Brunswick, NJ, USA
May 7 - May 9 | Pycon Italy | Firenze, Italy
May 10 - May 14 | Ubuntu Developer Summit | Brussels, Belgium
May 17 - May 21 | Fourth African Conference on FOSS and the Digital Commons | Accra, Ghana
May 18 - May 21 | PostgreSQL Conference for Users and Developers | Ottawa, Ontario, Canada
May 24 - May 25 | Netbook Summit | San Francisco, CA, USA
May 24 - May 26 | DjangoCon Europe | Berlin, Germany
May 24 - May 30 | Plone Symposium East 2010 | State College, PA, USA
May 27 - May 30 | Libre Graphics Meeting | Brussels, Belgium
June 1 - June 4 | Open Source Bridge | Portland, Oregon, USA
June 3 - June 4 | Athens IT Security Conference | Athens, Greece
June 7 - June 9 | German Perl Workshop 2010 | Schorndorf, Germany
June 7 - June 10 | RailsConf 2010 | Baltimore, MD, USA
June 9 - June 11 | PyCon Asia Pacific 2010 | Singapore, Singapore
June 9 - June 12 | LinuxTag | Berlin, Germany
June 10 - June 11 | Mini-DebConf at LinuxTag 2010 | Berlin, Germany
June 12 - June 13 | SouthEast Linux Fest | Spartanburg, SC, USA
June 15 - June 16 | Middle East and Africa Open Source Software Technology Forum | Cairo, Egypt
June 19 | FOSSCon | Rochester, New York, USA

If your event does not appear here, please tell us about it.

Page editor: Rebecca Sobol

Copyright © 2010, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds