LWN.net Weekly Edition for May 19, 2011
LGM: Usability and AdaptableGIMP
University of Waterloo Human-Computer Interaction (HCI) researcher Michael Terry always has intriguing work on display when he comes to the Libre Graphics Meeting. Two years ago, it was ingimp, a GIMP derivative that collects real-time usability data, and Textured Agreements, a study that showed how semantic markup and illustration dramatically increase the percentage of users that read and understand end-user license agreements and other click-through "legalese."
2011 was no different. Terry and his students presented two talks that could change the way open source developers work. The first was an analysis of real-world frequently-asked-questions mined through the Google Suggest search auto-completion feature. The second was another version of GIMP that — drawing on ingimp results and the Google Suggest analysis — presents a task-oriented interface, where users choose a task set to work on, and the tool palette morphs to fit the job at hand. The result is less confusing for new users, and because the set of supported tasks is customizable, it can grow through community contributions and individual personalization.
How do I ...
The first talk, entitled Quick and Dirty Usability: Leveraging Google Suggest to Instantly Know Your Users, detailed an analysis project led by PhD candidate Adam Fourney. Fourney and his collaborators started with what many Google users already know: by typing a partial search query into Google Suggest, one can see the most popular complete queries entered by Google users at large. Although this is often done for comedic purposes, they instead scripted the process and automated search query prospecting runs for Firefox, GIMP, Inkscape, and a small set of other open source applications, using an assortment of "how to" phrases and re-wording tricks to grab the broadest set of results.
After collecting data for three months, the team analyzed and classified the results. Additional Google tools such as Google Insights and AdWords helped to turn the raw query results into a usable data set. The queries break down into six basic phrase categories (e.g., imperatives like "gimp rotate text" versus questions like "how to draw a line in gimp"), six "intents" (e.g., troubleshooting problems versus seeking instructions), and in many cases related words allow certain queries to be correlated (e.g., "bring the window back" and "lost my window").
Analyzing the data may make for fascinating reading, but the researchers jumped straight to the potential practical applications for open source software projects. Among the findings that a Google Suggest query-study can uncover for a project are where the users' terminology and the project's diverge, functionality desired by the user base that might not be on the project's radar, and usability problems.
In the first category, the researchers observed that GIMP users frequently search for help making a picture "black-and-white." In reality, of course, the users are most likely not interested in a binary, two-color result: they want a grayscale image. Thus the project can help lead the user in the right direction by including "black-and-white" in the relevant tutorials and built-in help. The second and third categories are probably more self-explanatory: if users are consistently looking for help with a specific topic or error dialog message, it is simple to move from that knowledge to a feature request or bug.
Harvesting Google search results itself is not a new idea, but the talk (and the paper from which it originates) does an excellent job of explaining how to take a simple query and systematically glean useful data from it. An audience member asked Fourney if he planned to release his query-harvesting code; he declined on the grounds that Google's terms-of-use do not explicitly address using Google Suggest for this type of data mining, and he fears that publishing the code could lead to an obfuscation campaign from the search engine. Nevertheless, he said, the actual code was simple, and anyone with scripting experience could duplicate it in short order. The real genius, of course, comes in correctly interpreting the results in order to improve one's application.
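To make the idea concrete, here is a minimal, hypothetical sketch (not the researchers' code) of what one prospecting request might look like, written with libcurl. The suggestqueries.google.com endpoint, its client and q parameters, and the JSON reply format are assumptions based on the publicly visible autocomplete service; a real harvester would loop over many seed phrases, throttle itself, and mind Google's terms of use.

```c
/* Hypothetical sketch of a single Google Suggest prospecting request.
 * The endpoint URL and reply format are assumptions, not details from
 * the talk; build with -lcurl. */
#include <curl/curl.h>
#include <stdio.h>
#include <string.h>

static size_t collect(void *data, size_t size, size_t nmemb, void *userp)
{
    /* Append the response fragment to a fixed buffer, truncating if needed. */
    char *buf = userp;
    size_t len = strlen(buf), add = size * nmemb;

    if (len + add > 65535)
        add = 65535 - len;
    memcpy(buf + len, data, add);
    buf[len + add] = '\0';
    return size * nmemb;
}

int main(void)
{
    static char reply[65536];
    char url[512];
    CURL *curl = curl_easy_init();

    if (!curl)
        return 1;
    /* Seed query: one of the "how to" prefixes used to prospect for
     * popular GIMP searches. */
    snprintf(url, sizeof(url),
             "http://suggestqueries.google.com/complete/search?client=firefox&q=%s",
             "how%20to%20gimp");
    curl_easy_setopt(curl, CURLOPT_URL, url);
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, collect);
    curl_easy_setopt(curl, CURLOPT_WRITEDATA, reply);
    if (curl_easy_perform(curl) == CURLE_OK)
        printf("%s\n", reply);   /* JSON array of popular completions */
    curl_easy_cleanup(curl);
    return 0;
}
```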
Adaptation
A prime example of how developers could use search query analysis was on display in the second talk, Introducing AdaptableGIMP. AdaptableGIMP is built on top of GIMP 2.6 (the current stable release), and seeks to add a flexible, new-user-friendly interface to the image editor. Binary packages are available from the project for Windows and 32-bit Linux systems. The Linux builds are provided as Debian packages: one for AdaptableGIMP itself, and replacements for the standard gimp-data and libgimp2.0. Source tarballs and a Github repository are also provided for those who would prefer to install from source.
AdaptableGIMP does not remove any functionality from upstream GIMP. Rather, it replaces the standard toolbox palette, and links in to a web-based archive of "task sets," each of which loads a customized tool palette of its own. When you are not working with one of the loadable task sets, however, you can switch back to the default toolbox with its grid of tools. The custom palettes presented by each task set give you an ordered list of buttons to click that steps you through the process of performing the task at hand. It's a bit like having a built-in tutorial; no more getting lost or searching through the menus and tool-tips one by one.
![[AdaptableGIMP search]](https://static.lwn.net/images/2011/adaptablegimp-search-sm.png)
At launch time, AdaptableGIMP asks you to log in with an AdaptableGIMP "user account." This is optional, and is designed to tie in to the project's wiki service (as well as to allow multiple users on a shared system to maintain distinct identities, a feature that is probably more useful to Windows households). At the top of AdaptableGIMP's toolbox is a search box, into which you can type the name of a particular task or keywords. As you type, an overlay presents live search results retrieved from the AdaptableGIMP task set wiki. Each has a title and a short description; if the pop-up results are not helping, you can also launch a search dialog window that offers more information, including a preview of the steps involved in the task and the associated wiki page.
![[AdaptableGIMP help]](https://static.lwn.net/images/2011/adaptablegimp-redeyehelp-sm.png)
Whenever you select a task, it loads a custom list of commands into the toolbox. This is a bit like loading a recorded macro, except that the steps are not executed automatically; you perform each one in sequence. An "information" button next to the task title opens up a pop-up window explaining it in more detail. For example, the "Convert a Picture to a Sketch" task set involves four steps: converting the image to grayscale, adjusting the contrast in the Levels dialog, applying the Sobel filter, and inverting the result.
Each step is represented by a button on the task set toolbox; as you click through them, they change shape to the "pressed" look. It helps to have the Undo History palette open, because clicking the buttons again does not undo the completed step. In any event, clicking the buttons opens up the correct tool, filter, or adjustment window, pre-loaded with the correct settings (if applicable), but clicking does not automatically execute every step, because some require input — selecting the right portion of the image, for example.
Looking beyond the individual task sets, AdaptableGIMP allows the user to personalize their experience by downloading local copies of frequently used task sets. The interface shows the last-update-date of each task set, which allows the user to compare it to the public version and retrieve updates. The task sets are stored on the project's wiki in XML. You can create your own task set on the wiki by using the built-in editor, but you can also create it or edit an existing task set within AdaptableGIMP, simply by clicking on the "Edit commands" button.
On the down side, the AdaptableGIMP team admits it is not experienced at building .debs — and even appeals for help with the packaging. Currently the Debian packages weigh in at 16.7MB and installing them is less than trivial. The AdaptableGIMP package conflicts with standard GIMP 2.6 and will not install until GIMP is removed, while the gimp-data and libgimp2.0 packages do install but break GIMP 2.6 as a result. Fortunately, all are easily removed and replaced with standard GIMP once testing is complete. Hopefully someone more experienced with packaging will offer to lend a hand, because AdaptableGIMP is an interesting package that distributions may actually want to consider offering.
AdaptableGIMP's launch-time selection of public task sets came from the results of the Google search query study discussed above, plus the ongoing data collection provided by ingimp. The wiki is open to all, however, and the concept of AdaptableGIMP "user accounts" is advertised as enabling good task set authors to develop a positive reputation with the user community by sharing their work.
All together now!
At the moment, AdaptableGIMP is tied to the wiki for retrieving remote task sets, but Terry pointed out that this is not set in stone, and that in the future it should be possible to add in other remote "task repositories." The researchers have not been able to do this yet solely for lack of time, which is also why the adaptable interface and task-set framework have not yet been modularized so that they could be re-used in other applications.
Inkscape and Blender team members in the audience asked follow-up questions about that point, including how to stay in contact with the research group as development continued. Like GIMP, both applications support a vast array of unrelated use cases, to the point where casual users sometimes do not know where to get started. From the GIMP camp itself, Øyvind "pippin" Kolås was highly complimentary of the work, saying "we think you're on crack — but it's good crack." He went on to say that the GIMP UI team found a lot of the ideas interesting and useful, although AdaptableGIMP probably could not be directly incorporated into the upstream GIMP.
Terry agreed with the latter point after the talk was over, explaining that the AdaptableGIMP interface is not meant to replace the traditional GIMP interface. Also, in the present code base, the process of changing the GIMP toolbox was not straightforward enough to make AdaptableGIMP distributable as a GIMP plug-in because it removes and replaces parts of the UI, which are activities not covered by the plug-in API.
As for distant versions of GIMP, Inkscape, or Blender, who knows? Terry and his research students intend to keep developing the AdaptableGIMP software — including the communal, public task set architecture, which (more than the interface changes) makes up the backbone of the system. As the team stated at the beginning of the talk, years of ingimp research show that most users make use of only six GIMP tools on average — but everyone's six is different. That is why GIMP and the other creative applications are so complex: everyone uses a different subset of their functionality. By separating "the tasks" from "the toolbox," AdaptableGIMP shows that it is possible to carve a usable path through even the most complex set of features, provided that you make getting something done the goal, instead of showing off everything you can do.
Graphics tools are certainly not the only subset of open source applications that could learn from this approach. Just about any full-featured program offers more functionality than a single task requires of it. The other important lesson from AdaptableGIMP is that presenting a streamlined interface does not necessitate removing functionality or even hiding it — only offering a human- and task-centric view on the same underlying features.
Mark Shuttleworth on companies and free software
I had the opportunity to sit down with Mark Shuttleworth, founder of Ubuntu and Canonical, for a wide-ranging, hour-long conversation while at the Ubuntu Developer Summit (UDS) in Budapest. In his opening talk, Shuttleworth said that he wanted to "make the case" for contributor agreements, which is something he had not been successful in doing previously. In order to do that, he outlined a rather different vision than he has described before of how to increase Linux and free software adoption, particularly on the desktop, in order to reach his goal of 200 million Ubuntu users in the next four years. While some readers may not agree with various parts of that vision, it is definitely worth understanding Shuttleworth's thinking here.
Company participation in free software
In Shuttleworth's view, the participation of companies is vital to bringing the Linux desktop to the next level, and there is no real path for purely software companies to move from producing proprietary software toward making free software. There is a large "spike-filled canyon" between the proprietary and the free license world. Companies that do not even try to move in a "more free" direction are largely ignored by the community, while those which start to take some tentative steps in that direction tend to be harassed, "barbed", and "belittled". That means that companies have to leap that canyon all in one go or face the wrath of the "ideologues" in the community. It sets up a "perverse situation where companies who are trying to engage get the worst experience", he said.
The community tends to distrust the motives of companies and even fear them, but it is a "childish fear", he said. If we make decisions based on that fear, they are likely to be bad ones. Like individuals, companies have varied motives, some of which align with the interests of the community and some of which don't. Using examples like Debian finding the GNU Free Documentation License to be non-free, while Debian is not a free distribution under the FSF's guidelines, he noted that the community can't even define what a "fully free" organization looks like. Those kinds of disagreements make it such that we are "only condemning ourselves to a lifetime of argument". In addition, because it is so unclear, "professional software companies" aren't likely to run the gauntlet of community unhappiness to start down the path that we as a community should want them to.
Essentially, Shuttleworth believes that it is this anti-corporate, free-license-only agenda that is holding free software back. For some, "the idea of freedom is more important than the reality", and those people may "die happy" knowing that their ideal was never breached, but that isn't what's best for free software, its adoption, and expansion. The "ideologues are costing free software the chance" to get more corporate participation. What's needed is a "more mature understanding of how free software can actually grow", he said.
Existing company participation
There are, of course, companies that do contribute to free software, but those companies "do something orthogonal" to software development, he said. He pointed to Intel as a hardware vendor that wants to sell more chips, and Google, which provides services, as examples of these kinds of participants. There are also the distribution companies, Red Hat, SUSE, Canonical, and others, but they have little interest in seeing free software projects become empowered (by which he means able to generate revenue streams of their own), he said, because that means that anyone looking for support or "assurances about the software" can only get it through the distribution companies.
Though some at Canonical disagree with the approach—because it will reduce the company's revenues—Shuttleworth is taking a stand in favor of contributor agreements to try to empower the components that make up distributions. By doing that, "it will weaken Canonical", but will strengthen the ecosystem. There needs to be more investment into the components, he said, which requires that those components have more power, some of which could come from the projects owning the copyright of the code. Whether those projects are owned by Canonical, some other company, or by a project foundation, owning the code empowers the components.
The other main reason that Shuttleworth is "taking a strong public view" about contributor agreements is to provide some cover for those who might want to use them. He has "thick skin" and would like to move the free software ecosystem toward getting more "companies that are actually interested in software" involved. So far, he has "seen no proposals from the ideologues" on how to do that.
Companies may be more willing to open up their code and participate if they know they can also offer the code under different terms. That requires that, at least some of the time, contributors be willing to give their patches to the project. Those who are unwilling to do so are just loaning their patches to the project, and "loaning a patch is very uncool". The "fundamentalists" who are unwilling to contribute their code under a copyright assignment (while retaining broad rights to the code in question) are simply not being generous, he said.
The state of free software today
The goal should be to "attract the maximum possible participation to projects that have a free element", he said. He is "not arguing for proprietary software", but he is tired of seeing "80% done" software. In addition, the free software desktop applications are generally far behind their proprietary counterparts in terms of functionality and usability. He would like to "partner with companies that get things done", specifically pointing to Mozilla as an organization that qualifies.
The fear that our code will be taken proprietary is holding us back, Shuttleworth said. In the meantime, we have many projects where the job is only 80% done, and there is no documentation. A lot of those projects eventually end up in the hands of new hackers who take over the project and want to change everything, which results in a different unfinished application or framework.
Involving software companies will not be without its own set of problems, as those companies will still do "other things that we don't like", but there is a need for professional software companies to help get free software over the hump.
The "lone hacker
" style of development is great as far as it
goes, but there are lots of other pieces that need to come together. He
pointed to the differences between Qt and GTK as one example. GTK is a
"hacker toolkit
", whereas Qt is owned by a company that does
documentation, QA, and other tasks needed to turn it into a "professional
toolkit
". Corporate ownership of the code will sometimes lead to
abuse, like "Oracle messing around with Java
", but free
software needs to "use
" companies in a kind of
"jujitsu
" that leverages the use of the companies' code in
ways that are beneficial to the ecosystem.
He said that some of the biggest free software success stories come from companies being involved with the code. MySQL and PostgreSQL are "two great free software databases", which have companies behind their development or providing support. CUPS is a great printing subsystem at least partly because it is owned and maintained by Apple. Android is another example of an open source success; it has Google maintaining strict control over the codebase.
Shuttleworth has a fairly serious disagreement with how the OpenOffice.org/LibreOffice split came about. He said that Sun made a $100 million "gift" to the community when it opened up the OpenOffice code. But a "radical faction" made the lives of the OpenOffice developers "hell" by refusing to contribute code under the Sun agreement. That eventually led to the split and, in turn, led Oracle to finally decide to stop OpenOffice development and lay off 100 employees. He contends that the pace of development for LibreOffice is not keeping up with what OpenOffice was able to achieve and wonders if OpenOffice would have been better off if the "factionalists" hadn't won.
There is a "pathological lack of understanding
" among some
parts of the community about what companies bring to the table, he said.
People fear and mistrust the companies on one hand, while asking
"where can I get a job in free software?
" on the
other. Companies bring jobs, he said. There is a lot of "ideological
claptrap
" that permeates the community and, while it is reasonable
to be cautious about the motives of companies, avoiding them entirely is
not rational.
Project Harmony
The Canonical contributor agreement is "mediocre at best", but does have "some elements which are quite generous", he said. It gives a wide license back for code that is contributed so the code can be released under any license the author chooses. In addition, Canonical will make at least one release of the project using the patch under the license that governs the project, he said. That guarantee does not seem to appear in the actual agreement [PDF], however.
These kinds of contributor agreements are going to continue to exist, he said, and believing otherwise "denies the reality of the world we live in". The problem is that there are so many different agreements that are "all amateur in one form or another", so there is a need to "distill the number of combinations and permutations" of those agreements into a consistent set. That is the role of Project Harmony, he said.
The project brought together various groups, companies, organizations, and individuals with different ideas about contributor agreements, including some who are "bitterly opposed" to copyright assignment. The project has produced draft 1.0 agreements that have "wide recognition" that they represent the set of options that various projects want.
The agreements will help the community move away from "ad hoc" agreements to a standard set, which is "akin to Creative Commons", he said. The idea is that it will become a familiar process for developers so they don't have to figure out a different agreement for each project they contribute to. Down the road, Shuttleworth sees the project working on a 2.0 version of the agreements which would cover more jurisdictions, and address any problems that arise.
Shuttleworth's vision
In the hour that we spoke, Shuttleworth was clearly passionate about free software, while being rather frustrated with the state of free software applications today. He has a vision for the future of free software that is very different from the current approach. One can certainly disagree with that vision, but it is one that he has carefully thought out and believes in. One could also argue that huge progress has been made with free software over the last two or three decades—and Shuttleworth agrees—but will our current approach take things to the "next level"? Or is some kind of different approach required?
As far as contributor agreements go, it seems a bit late to be making the case for them at this point—something that Shuttleworth acknowledged in his talk at the UDS opening. Opposition to the agreements, at least those requiring copyright assignment, is fairly high, and opponents have likely dug in their heels. While he bemoans ideology regarding contributor agreements, there are procedural hurdles that make them unpopular as well; few want to run legal agreements by their (or their company's) lawyers.
The biggest question, though, seems to be whether a more agreement-friendly community would lead to more participation by companies. If the goal is to get free software on some rather large number of desktops in a few short years—a goal that may not be shared by all—it would certainly seem that something needs to change. Whether that means including more companies who may also be pursuing proprietary goals with the same code is unclear, but it is clear that Shuttleworth, at least, is going to try to make that happen.
NLUUG: Filling the gaps in open telephony
On May 12, NLUUG held its Spring Conference with the theme "Open is efficient". With such a general theme, it won't be a surprise that the program was a mix of talks about policy matters, case studies, and technical talks in various domains. However, two talks were particularly interesting because they pinpointed some gaps in our current solutions for open (source) telephony.
Open SIP firmware
The Dutch cryptographer Rick van Rein presented [PDF] his project to build open source firmware for SIP (Session Initiation Protocol) telephony. His vision is that SIP holds great promise for the future of telephony, but that nobody is unleashing its potential:
All this holds for SIP devices, but there's another type of SIP phone that currently has much more advanced functionality: the softphones (software implementing SIP functionality on a computer or smartphone). According to van Rein, the softphone market is where the real innovation is happening, with advanced features such as presence settings, IPv6 connectivity, and end-to-end encryption of phone calls with ZRTP (a cryptographic key-agreement protocol to negotiate the keys for encryption in VoIP calls). In short, open source is great at handling SIP functionality, but this doesn't help the people that have bought a SIP phone (the hardware), because the firmware of these devices remains "as open as an oyster".
Van Rein's goal is to build open source firmware that can be installed on such a SIP device instead of its original closed firmware, and ultimately it should be able to bring the advanced SIP features of softphones to these phones too. The project is called 0cpm, and is partially funded by NLnet Foundation. As a proof of concept, van Rein is now implementing his firmware for the Grandstream BT200, a fairly typical and affordable SIP phone for home and office use, but the framework is designed with portability in mind. The 0cpm firmware, called Firmerware, is GPLv3 licensed.
As some of these SIP phones have only 256K of RAM, Linux would be too big an operating system to run on them; even its microcontroller cousin uClinux would be too large. So van Rein wrote his own tickless realtime operating system, with a footprint of around 13K. Together with the network stack and the application, this fits well into the 512K NOR flash that is typical for smaller devices. According to van Rein, it's important that 0cpm is able to run on cheap and energy efficient phones (because they're always on) with limited resources, so he doesn't have the luxury to use Linux. At the bottom of the 0cpm firmware stack, you need drivers for all chips and peripherals of the phone, and on top of it there will be some applications running, such as a SIP phone application.
One of the main goals of the 0cpm project is to enable SIP on IPv6. For most end users, current SIP phones are too complex to configure due to a dependency on IPv4 and NAT routers. To tackle these issues, most SIP vendors end up passing all traffic through their own servers, but of course this isn't free. Van Rein believes that only an IPv6 SIP project will be able to offer an open but easy-to-configure SIP experience to end users. With IPv6, direct calls are always possible, and with technologies like ITAD (defined in RFC 3219) and ENUM (E.164 NUmber Mapping), SIP telephone numbers can be found using a DNS-based lookup. By combining all these existing pieces in the 0cpm project, users can finally call freely. Not only free as in beer (that's where the project's name comes from, "zero cents per minute"), but also free as in speech, van Rein emphasizes. For devices that have no IPv6 connectivity whatsoever, the 0cpm firmware will fall back to a suitable device-local tunnel for IPv6 access.
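The ENUM half of that lookup is mechanical enough to show concretely: the digits of an E.164 number are reversed, dot-separated, and appended to the e164.arpa zone, and the resulting name is queried for NAPTR records carrying the SIP URI. A minimal sketch, with a made-up number and without the actual DNS query:

```c
/* A minimal sketch of the ENUM mapping step: turning an E.164 telephone
 * number into the domain name whose NAPTR records point at the SIP URI.
 * The example number is invented; resolving the NAPTR records themselves
 * is left out. */
#include <ctype.h>
#include <stdio.h>

/* Build "8.7.6.5.4.3.2.1.0.2.1.3.e164.arpa" from "+31 20 12345678". */
static void enum_domain(const char *e164, char *out, size_t outlen)
{
    char digits[32];
    size_t n = 0, pos = 0;

    for (const char *p = e164; *p && n < sizeof(digits); p++)
        if (isdigit((unsigned char)*p))
            digits[n++] = *p;          /* keep digits only; drop '+' and spaces */

    for (size_t i = n; i > 0 && pos + 2 < outlen; i--) {
        out[pos++] = digits[i - 1];    /* reversed digit order */
        out[pos++] = '.';
    }
    snprintf(out + pos, outlen - pos, "e164.arpa");
}

int main(void)
{
    char name[128];

    enum_domain("+31 20 12345678", name, sizeof(name));
    printf("%s\n", name);   /* query this name for NAPTR records */
    return 0;
}
```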
Reverse engineering
But before we get there, there's a lot of reverse engineering to do. In his talk at the NLUUG conference, van Rein gave some tips and tricks he used to reverse engineer the inner workings of his Grandstream BT200 device. One of the tips he gave was to use the Logical Link Control (LLC) network protocols, which are extremely easy to implement and come in handy for reverse engineering. The LLC1 protocol offers a trivial datagram service directly over Ethernet and it has minimal requirements: just memory, network and booting code. A stand-alone LLC1 image is about 10K, including console access. You can install a bootloader using TFTP over LLC1 instead of TFTP over UDP. In a similar way you can connect to the console over LLC2 (a trivial stream service) instead of over TCP. It has the same minimal requirements as LLC1 and adds about 200 lines of C code. Van Rein calls LLC "a generally useful tool for reverse engineering", emphasizing that with LLC2 it's even possible to show the boot logs before the device gets an IP address.
When reverse engineering phone hardware, though, you first have to figure out what each component does. In general, most phones contain a System-on-Chip, RAM, flash storage, Ethernet connectivity, and GPIO (General Purpose Input/Output) pins. Reverse engineering is more of an art than a science, and it includes identifying the components, gathering datasheets, and finding an open source compiler toolchain. Finally, you have to figure out a way to launch your own code on the device. Then you can start writing small "applications" to test your drivers: for instance, an application that flashes the LEDs to test the drivers for timers and interrupts, and an application that shows typed numbers to test the drivers for the keys and the display (a rough sketch of the idea appears below). By building several of these simple applications, you can test the drivers individually. The 0cpm software contains these applications to make porting easier. Van Rein is currently still working on the drivers, and he has a simple application that gets an IPv6 address. He hopes to be able to show a working phone application in the second half of this year.
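As a rough illustration of what such a test application amounts to, here is a hypothetical sketch; the led_set() and timer_wait_ms() hooks and the LED count are invented for the example and do not correspond to 0cpm's real driver interfaces.

```c
/* Hypothetical sketch of an "LED blink" driver test application.
 * The hooks below are invented for illustration only. */
#include <stdbool.h>
#include <stdint.h>

extern void led_set(int led, bool on);      /* assumed GPIO/LED driver hook */
extern void timer_wait_ms(uint32_t ms);     /* assumed timer driver hook */

#define NUM_LEDS 3                          /* assumed LED count */

void test_leds(void)
{
    /* Chase the LEDs forever; if they blink in order, the GPIO, timer
     * and interrupt drivers underneath are all doing their jobs. */
    for (int i = 0; ; i = (i + 1) % NUM_LEDS) {
        led_set(i, true);
        timer_wait_ms(250);
        led_set(i, false);
    }
}
```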
In short, van Rein truly believes that progress in SIP phones will come from open source firmware, and with the 0cpm project he wants to build this firmware. While the current proof of concept phone is the Grandstream BT200, he invites anyone to port to their own hardware. For interested developers, there's a Git repository with the source code (git://git.0cpm.org/firmerware/, which is only accessible over IPv6). Reverse engineering current SIP phone hardware is a big task, and van Rein emphasizes that 0cpm is not even alpha-quality code. If the project can generate a critical mass, its vision of generic SIP firmware could come true, in much the same way as we now have free firmware projects such as OpenWrt and DD-WRT for wireless routers.
A nice by-product will be that 0cpm allows a truly secure way of calling each other, thanks to the direct IPv6 connectivity without any central server that can wiretap all media streams, as well as the encryption and mutual authentication offered by the ZRTP protocol. On a related note, GNU SIP Witch 1.0 was released last week, which offers a secure peer-to-peer SIP server that lets ZRTP phones talk without the need for an intermediate service provider.
An "open" GSM network operator
Another telephony-related project that was presented at the NLUUG conference is Limesco [in Dutch], an "open" and transparent pan-European not-for-profit GSM network. Mark van Cuijck, one of the three founders, presented the rationale behind this project:
On the other hand, the telephones and some of the telecom infrastructure are becoming more and more open. Van Cuijck mentioned the popular Android platform, which is largely open source and has a big open applications ecosystem. There's also OsmocomBB, an open source GSM baseband software implementation, and OpenBTS, an open source software-based implementation of a GSM base station. And in the SIP domain, we have open source SIP softphones such as Ekiga and server software like Asterisk and FreeSWITCH. But what good are all these open source programs if the mobile network operators are very restrictive about what happens on their network? That's where Limesco would come in.
Limesco is still in the research phase, so even the founders aren't sure yet that it will become reality. They are researching now whether it would be financially and technically viable to become a mobile virtual network operator (using another operator's infrastructure) aimed at a target audience of users who value freedom, openness, and transparency. They have published a survey that has been filled out by 1200 people, and they are talking with companies in the telecom market to get an idea of the costs. By June 1, they want to decide whether to go further with the project or whether to abandon it.
A targeted approach
Whatever happens with the project, the idea is interesting, and it seems like it should appeal to enough people to make it a viable business model. There are already a lot of mobile virtual network operators that have a specific target audience. For instance, van Cuijck mentioned the Dutch mobile operator Telesur, which targets Surinamese inhabitants of the Netherlands. Many of these people still have relatives in this former colony of the Netherlands, and Telesur responds by offering them very cheap call rates between the Netherlands and Suriname. It's this kind of targeted approach that the Limesco founders are thinking about, but this time targeted at the "hacker" community.
Van Cuijck presented some ideas. One of the goals of Limesco is to be more transparent about the call rates:
The preliminary results from the survey also show that the respondents are interested in knowing what data Limesco would store about the subscribers. Each mobile network operator has to store certain data and cooperate with the police, and Limesco would have to obey these laws as any other network operator, but it wants to make the difference by being completely transparent about it.
Another goal is to give the subscribers the freedom to manage their own services. For instance, instead of offering services on the level of the operator network, subscribers could get the capability to manage their own voice mail application, their own conference call implementation, and so on. Subscribers could also get the ability to block certain numbers, and it should even be possible to link two mobile phone numbers to one SIM card or to have two SIM cards linked to the same mobile phone number. All these examples van Cuijck mentioned are clearly features that are not interesting for most of the public but that could be interesting for a niche audience of do-it-yourself people.
This all sounds interesting, but it's not yet reality, and it might be that the project is a bit naïve. One of the people in the audience made an excellent observation after van Cuijck's talk: a mobile operator makes its profit from the difference between how much you think you call and how much you actually call. All these various call plans have only one goal: to confuse subscribers into choosing a suboptimal plan for their situation. So if Limesco is completely transparent about its call rates, it gives up that information asymmetry and will presumably make less profit than the other mobile operators. While this observation may be true, though, your author thinks that Limesco can have a competitive advantage over other mobile operators with its do-it-yourself approach: if its subscribers are not interested in the many services other operators offer, like voice mail, conference calls, call forwarding, and so on, Limesco doesn't have to build or outsource these services, and maybe that's how it can lower the costs for the infrastructure it rents.
Filling the gaps in open telephony
Both 0cpm and Limesco are interesting projects in that they are filling the few remaining gaps in our open telephony infrastructure. We have good open source SIP softphones like Ekiga, we have good SIP server software like Asterisk, we even have open source GSM base station software such as OpenBTS and open source GSM baseband software like OsmocomBB, but we still lack two important components to have a fully open and transparent telephony experience: open firmware for SIP phones, and a mobile network operator that doesn't hamper what we can do with our mobile phone connection. Perhaps we will see that change in the relatively near future.
Security
Seccomp: replacing security modules?
LWN recently discussed an enhancement to the seccomp mechanism which would allow applications to restrict their future access to system calls. By blocking off some calls altogether and by using simple, ftrace-style filters to restrict the possible arguments to other system calls, a process could construct a small sandbox which would constrain it (and its children) going forward. Getting support for new security mechanisms in the kernel is often a challenge, but not this time - almost all of the reviews are positive. The biggest complaint seems to be that the patches are not ambitious enough; at least one developer would like to see developer Will Drewry shoot for replacing the Linux security module (LSM) mechanism altogether.

Ingo Molnar has been a supporter of this work; indeed, he suggested some of the ideas which led to the current set of patches. But he is now asking Will to be a bit more ambitious in his goals. Rather than act as a gatekeeper for system calls, why not implement the ability to hook into arbitrary kernel events and filter access there? Those who have watched Ingo over the last couple of years are unlikely to be surprised to see that he suggests hooking into the perf events subsystem for this task. Perf already allows an application to attach to events to get notifications and counts; adding per-process filter expressions, he suggests, is a natural evolution of that capability.
In other words, Ingo suggests dropping the current interface, which is implemented with prctl(), in favor of a perf-based (or, at least, perf-like) interface which could operate on kernel events. In principle, any software event that perf can deal with now (including tracepoints) could be used, but these events would have to be explicitly modified by kernel developers to enable this sort of "active" use. For events modified in this way, filters written in an expanded language could be provided by an application. See this message from Ingo for an example of how this sort of functionality might be used.
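Will's proposed interface builds on the existing seccomp mechanism, which today offers only a single, rigid "strict" mode entered through prctl(). The sketch below illustrates that existing mainline behavior (not the proposed filtering interface, whose details were still under discussion): after the call, only read(), write(), exit() and sigreturn() are permitted, and anything else kills the process.

```c
/* Minimal demonstration of seccomp's long-standing "strict" mode, the
 * baseline that the new filtering patches generalize. */
#include <unistd.h>
#include <sys/prctl.h>
#include <sys/syscall.h>

#ifndef PR_SET_SECCOMP
#define PR_SET_SECCOMP 22            /* older headers may lack the constant */
#endif

int main(void)
{
    static const char msg[] = "sandboxed\n";

    if (prctl(PR_SET_SECCOMP, 1))    /* 1 selects the strict mode */
        return 1;                    /* kernel built without seccomp? */

    write(STDOUT_FILENO, msg, sizeof(msg) - 1);   /* still allowed */

    /* glibc's _exit() uses exit_group(), which is not on the whitelist,
     * so make the raw exit(2) call instead. */
    syscall(SYS_exit, 0);
    return 0;                        /* not reached */
}
```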
One of the biggest advantages of hooking to arbitrary events is that filters could be applied at the real decision points. A filter which allows access to the open() system call based on the name of the file being opened is not truly secure; the application could change the name between the pre-open() check and when open() actually uses it. Checking at a tracepoint placed more deeply within the VFS lookup code, instead, would block this sort of attack. A check placed in the right location could also be more efficient, replacing several checks at the system call level.
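That race is the classic time-of-check-to-time-of-use problem. The following contrived sketch (hypothetical, and timing-dependent by nature) shows the shape of the attack: one thread passes an innocuous-looking pathname to open() while another rewrites the string before the kernel copies it in.

```c
/* Illustration (not from the patch discussion) of why checking a pathname
 * at system-call entry is racy: a second thread can rewrite the string
 * between a filter's check and the moment open() actually copies it in.
 * Build with -lpthread; whether the race is won depends on timing. */
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static char path[64] = "/tmp/allowed";     /* would pass a name-based filter */

static void *flipper(void *arg)
{
    (void)arg;
    usleep(50);                            /* give the main thread a head start */
    strcpy(path, "/etc/shadow");           /* then swap in the real target */
    return NULL;
}

int main(void)
{
    pthread_t t;

    pthread_create(&t, NULL, flipper, NULL);
    /* If a filter inspected "path" before open() re-read it from user
     * memory, the check and the use could see two different strings. */
    int fd = open(path, O_RDONLY);
    printf("open() returned %d\n", fd);
    pthread_join(t, NULL);
    return 0;
}
```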
According to Ingo, there are a lot of advantages to providing this sort of capability. It would allow, for the first time, security policies to be set up by unprivileged applications; developers could thus take a more active role in ensuring the security of their code. The feature could be made stackable, allowing multiple application layers to add restrictions. In fact, he thinks it's such a great idea that he said:
Someday, he said, event-based filters could simply replace LSM, which he blamed for a number of ills, including stalled security efforts, desktop annoyances, infighting, fragmentation, and "probably *less* Linux security". Merging the code in its current form, he said, would take away the incentive to go all the way, so he'd like to see it reworked along these lines first.
Needless to say, this idea is not universally popular in the Linux security module community. James Morris supports the merging of the current patch, which, he says, is a good way to reduce the attack surface of the system call interface, but, he said, it is the wrong place for more serious security checks. Real security policies, he said, should be done at the LSM level. Eric Paris suggested that the filter capability should be implemented as an LSM, but he also pointed out a key weakness of that approach:
Getting application developers to make use of a Linux-specific security mechanism is already asking a lot. Getting them to use a mechanism which may or may not be present even on Linux systems is even harder; that may be part of why application developers have never really stepped forward to provide SELinux policies for their code. The filtering capability envisioned by Ingo would be part of the core kernel itself; that alone could help to make it the "single security model" that Eric was wishing for.
Any such outcome is to be found well in the future, though; there are numerous obstacles to overcome. The amount of work needed to implement this capability is not trivial. Individual tracepoints within the kernel would have to be evaluated to determine whether making them "active" makes any sense. Without a great deal of care, allowing applications to block operations within the kernel could well introduce security problems of its own. Based on past experience, the developers of the existing security mechanisms in the kernel might oppose the addition of an entirely new security-related framework. Even Linus, in the past, has been resistant to the idea of creating a single security policy mechanism for the kernel.
For the near future, Will has indicated that he will look at implementing the feature along the lines suggested by Ingo. Once some code is out there, developers will be able to see its implications and the debate can start for real. The chances of the discussion going on for some time are fairly good.
Brief items
Security quotes of the week
New vulnerabilities
acpid: denial of service
Package(s): acpid
CVE #(s): CVE-2011-1159
Created: May 16, 2011; Updated: May 31, 2012
Description: From the Red Hat Bugzilla entry:
It was reported that acpid opened the UNIX socket that informs unprivileged processes about ACPI events in blocking mode. If an unprivileged process were to stop reading data from the socket, then after some time the socket queue fills up, which would then lead to a hang of the privileged acpid daemon. The daemon will hang until the socket peer process reads some portion of the queued data or the peer process exits or is killed.
apr: denial of service
Package(s): apr
CVE #(s): CVE-2011-0419
Created: May 13, 2011; Updated: August 2, 2011
Description: From the Mandriva advisory:
It was discovered that the apr_fnmatch() function used an unconstrained recursion when processing patterns with the '*' wildcard. An attacker could use this flaw to cause an application using this function, which also accepted untrusted input as a pattern for matching (such as an httpd server using the mod_autoindex module), to exhaust all stack memory or use an excessive amount of CPU time when performing matching.
apturl: denial of service
Package(s): apturl
CVE #(s): none
Created: May 17, 2011; Updated: May 18, 2011
Description: From the Ubuntu advisory:
It was discovered that apturl incorrectly handled certain long URLs. If a user were tricked into opening a very long URL, an attacker could cause their desktop session to crash, leading to a denial of service.
exim: arbitrary code execution
Package(s): exim4
CVE #(s): CVE-2011-1407
Created: May 13, 2011; Updated: May 31, 2011
Description: From the Debian advisory:
It was discovered that Exim is vulnerable to command injection attacks in its DKIM processing code, leading to arbitrary code execution.
flash-plugin: multiple vulnerabilities
Package(s): flash-plugin
CVE #(s): CVE-2011-0579 CVE-2011-0618 CVE-2011-0619 CVE-2011-0620 CVE-2011-0621 CVE-2011-0622 CVE-2011-0623 CVE-2011-0624 CVE-2011-0625 CVE-2011-0626 CVE-2011-0627
Created: May 13, 2011; Updated: May 18, 2011
Description: From the Red Hat advisory:
Multiple security flaws were found in the way flash-plugin displayed certain SWF content. An attacker could use these flaws to create a specially-crafted SWF file that would cause flash-plugin to crash or, potentially, execute arbitrary code when the victim loaded a page containing the specially-crafted SWF content. (CVE-2011-0618, CVE-2011-0619, CVE-2011-0620, CVE-2011-0621, CVE-2011-0622, CVE-2011-0623, CVE-2011-0624, CVE-2011-0625, CVE-2011-0626, CVE-2011-0627) This update also fixes an information disclosure flaw in flash-plugin. (CVE-2011-0579)
perl: denial of service
Package(s): perl
CVE #(s): CVE-2010-4777
Created: May 13, 2011; Updated: July 7, 2011
Description: From the openSUSE advisory:
This update fixes a bug in perl that makes spamassassin crash and does not allow bypassing taint mode by using lc() or uc() anymore.
perl-Mojolicious: cross-site scripting
Package(s): perl-Mojolicious
CVE #(s): CVE-2011-1841 CVE-2010-4803
Created: May 16, 2011; Updated: May 25, 2011
Description: From the CVE entries:
Cross-site scripting (XSS) vulnerability in the link_to helper in Mojolicious before 1.12 allows remote attackers to inject arbitrary web script or HTML via unspecified vectors. (CVE-2011-1841)
Mojolicious before 0.999927 does not properly implement HMAC-MD5 checksums, which has unspecified impact and remote attack vectors. (CVE-2010-4803)
pure-ftpd: command injection
Package(s): pure-ftpd
CVE #(s): CVE-2011-1575
Created: May 13, 2011; Updated: May 31, 2011
Description: From the openSUSE advisory:
Pure-ftpd is vulnerable to the STARTTLS command injection issue similar to CVE-2011-0411 of postfix.
tor: multiple vulnerabilities
Package(s): tor
CVE #(s): CVE-2011-0015 CVE-2011-0016 CVE-2011-0490 CVE-2011-0491 CVE-2011-0492 CVE-2011-0493
Created: May 16, 2011; Updated: June 9, 2011
Description: From the CVE entries:
Tor before 0.2.1.29 and 0.2.2.x before 0.2.2.21-alpha does not properly check the amount of compression in zlib-compressed data, which allows remote attackers to cause a denial of service via a large compression factor. (CVE-2011-0015)
Tor before 0.2.1.29 and 0.2.2.x before 0.2.2.21-alpha does not properly manage key data in memory, which might allow local users to obtain sensitive information by leveraging the ability to read memory that was previously used by a different process. (CVE-2011-0016)
Tor before 0.2.1.29 and 0.2.2.x before 0.2.2.21-alpha makes calls to Libevent within Libevent log handlers, which might allow remote attackers to cause a denial of service (daemon crash) via vectors that trigger certain log messages. (CVE-2011-0490)
The tor_realloc function in Tor before 0.2.1.29 and 0.2.2.x before 0.2.2.21-alpha does not validate a certain size value during memory allocation, which might allow remote attackers to cause a denial of service (daemon crash) via unspecified vectors, related to "underflow errors." (CVE-2011-0491)
Tor before 0.2.1.29 and 0.2.2.x before 0.2.2.21-alpha allows remote attackers to cause a denial of service (assertion failure and daemon exit) via blobs that trigger a certain file size, as demonstrated by the cached-descriptors.new file. (CVE-2011-0492)
Tor before 0.2.1.29 and 0.2.2.x before 0.2.2.21-alpha might allow remote attackers to cause a denial of service (assertion failure and daemon exit) via vectors related to malformed router caches and improper handling of integer values. (CVE-2011-0493)
wireshark: denial of service
Package(s): wireshark
CVE #(s): CVE-2011-1592
Created: May 13, 2011; Updated: June 7, 2011
Description: From the CVE entry:
The NFS dissector in epan/dissectors/packet-nfs.c in Wireshark 1.4.x before 1.4.5 on Windows uses an incorrect integer data type during decoding of SETCLIENTID calls, which allows remote attackers to cause a denial of service (application crash) via a crafted .pcap file.
Page editor: Jake Edge
Kernel development
Brief items
Kernel release status
The current development kernel remains 2.6.39-rc7. Linus has stated his intent to release the final 2.6.39 kernel on May 18, but that release has not happened as of this writing. Presumably he is simply waiting for the LWN Weekly Edition to be published; 2.6.39 will almost certainly be out by the time you read this.

Stable updates: there have been no stable kernel updates in the last week.
Quotes of the week
This feature can be replicated for any subdirectory, where the parent holds multiple snapshots of said directory. There is no global snapshot table per-say.
This makes it possible to trivially construct and maintain multiple mirroring domains within any subdirectory structure. For example you can construct a HAMMER2 filesystem which holds multiple roots and then mount the desired one based on a boot menu item, and you can work within these roots as if they were the root of the whole filesystem (even though they are not).
Pushback on pointer hiding
There has been a determined effort over the last few kernel development cycles to eliminate the leakage of kernel addresses into user space. A determined attacker, it is thought, could use address information to figure out where important data structures are in memory; that is an important step toward corrupting those structures. So it arguably makes sense to avoid exposing kernel addresses in /proc files and other places where the kernel provides information to user space.

Early in the 2.6.39 development cycle, a patch was applied to censor kernel addresses appearing in /proc/kallsyms and /proc/modules. On an affected system, /proc/kallsyms looks like this:
```
...
0000000000000000 V callchain_recursion
0000000000000000 V rotation_list
0000000000000000 V perf_cgroup_events
0000000000000000 V nr_bp_flexible
0000000000000000 V nr_task_bp_pinned
0000000000000000 V nr_cpu_bp_pinned
...
```
Needless to say, zeroing out the address information makes this file rather less useful than it had been previously. What drew attention to this change, though, was a report that perf produces bogus information in this situation. It seems that perf was not detecting the hiding of kernel addresses, so it happily went forward with all those zero values.
That is obviously a bug in perf; it will be fixed shortly. But a number of developers complained about the practice of hiding kernel addresses by default. That behavior makes the system less useful than it was before, and will certainly cause other surprises. People who want whatever extra security is provided by this behavior should have to ask for it explicitly, it was said; David Miller pointed out that other security technologies - like SELinux - are not turned on by default.
That argument won the day, so the final 2.6.39 release will not hide kernel pointers by default. Anybody wanting pointer hiding should turn it on by setting the kernel.kptr_restrict knob to 1.
Kernel development news
Integrating memory control groups
The control group mechanism allows an administrator to group processes together and apply any of a number of resource usage policies to them. The feature has existed for some time, but only recently have we seen significant use of it. Control groups are now the basis for per-group CPU scheduling (including the automatic per-session group scheduling that was merged for 2.6.38), process management in systemd, and more. This feature is clearly useful, but it also has a bad reputation among many kernel developers who often are heard to mutter that they would like to yank control groups out of the kernel altogether. In the real world, removing control groups is an increasingly difficult thing to do, so it makes sense to consider the alternative: fixing them.

One of the complaints about control groups is that they have been "bolted on" to existing kernel mechanisms rather than properly integrated into those mechanisms. Given the relatively late arrival of control groups, that is, perhaps, not a surprising outcome. When attaching a significant new feature to long-established core kernel code, it is natural to try to keep to the side and minimize the intrusion on the existing code. But bolting code onto the side is not always the way toward an optimal solution which can be maintained over the long term. Some recent work with the memory controller highlights this problem - and points toward an improvement of the situation.
The system memory map consists of one struct page for each physical page in the system; it can be thought of as an extensive array of structures matching the array of pages:
The kernel maintains a global least-recently-used (LRU) list to track active pages. Newly-activated pages are placed at the end of the list; when it is time to reclaim pages, the pages at the head of the list will be examined first. The structure looks something like this:
Much of the tricky code in the memory management subsystem has to do with how pages are placed in - and moved within - this list. Of course, the situation is a little more complicated than that. The kernel actually maintains two LRU lists; the second one holds "inactive" pages which have been unmapped, but which still exist in the system:
The kernel will move pages from the active to the inactive list if it thinks they may not be needed in the near future. Pages in the inactive LRU can be moved quickly back to the active list if some process tries to access them. The inactive list can be thought of as a sort of probationary area for pages that the system is considering reclaiming soon.
Of course, the situation is still more complicated than that. Current kernels actually maintain five LRU lists. There are separate active and inactive lists for anonymous pages - reclaim policy for those pages is different, and, if the system is running without swap, they may not be reclaimable at all. There is also a list for pages which are known not to be reclaimable - pages which have been locked into memory, for example. Oh, and it's only fair to say that one set of those lists exists for each memory zone. Despite the proliferation of lists, this set, as a whole, is called the "global LRU."
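In code, the arrangement looks roughly like the abridged sketch below, which is based on the include/linux/mmzone.h definitions of this era with almost all fields omitted; names and layout are approximate.

```c
/* Abridged sketch of the per-zone LRU arrangement (most fields omitted).
 * Together, these per-zone lists make up the "global LRU" described above. */
struct list_head { struct list_head *next, *prev; };  /* as in <linux/list.h> */

enum lru_list {
    LRU_INACTIVE_ANON,   /* anonymous pages on probation */
    LRU_ACTIVE_ANON,     /* recently used anonymous pages */
    LRU_INACTIVE_FILE,   /* page-cache pages on probation */
    LRU_ACTIVE_FILE,     /* recently used page-cache pages */
    LRU_UNEVICTABLE,     /* mlocked and otherwise unreclaimable pages */
    NR_LRU_LISTS
};

struct zone {
    /* ... locks, watermarks, free-page lists ... */
    struct zone_lru {
        struct list_head list;
    } lru[NR_LRU_LISTS];
    /* ... */
};
```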
Creating a diagram with all these lists would overtax your editor's rather inadequate drawing skills, though, so envisioning that structure is left as an exercise for the reader.
The memory controller adds another level of complexity as the result of its need to be able to reclaim pages belonging to specific control groups. The controller needs to track more information for each page, including a simple pointer associating each page with the memory control group it is charged to. Adding that information to struct page was not really an option; that structure is already packed tightly and there is little interest in making it larger. So the memory controller adds a new page_cgroup structure for each page; it has, in essence, created a new, shadow memory map:
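An abridged sketch of that per-page shadow entry, based on the include/linux/page_cgroup.h of the 2.6.3x kernels (exact fields vary with kernel version and configuration):

```c
/* One of these exists for every physical page when the memory controller
 * is configured in; together they form the "shadow memory map". */
struct list_head { struct list_head *next, *prev; };  /* as in <linux/list.h> */
struct page;
struct mem_cgroup;

struct page_cgroup {
    unsigned long flags;             /* charge/cache state bits */
    struct mem_cgroup *mem_cgroup;   /* group this page is charged to */
    struct page *page;               /* back-pointer to the real struct page */
    struct list_head lru;            /* links the page into the group's LRU lists */
};
```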
When memory control groups are active, there is another complete set of LRU lists maintained for each group. The list_head structures needed to maintain these lists are kept in the page_cgroup structure. What results is a messy structure along these lines:
(Once again, the situation is rather more complicated than has been shown here; among other things, there is a series of intervening structures between struct mem_cgroup and the LRU lists.)
There are a number of disadvantages to this sort of arrangement. Global reclaim uses the global LRU as always, so it operates in complete ignorance of control groups. It will reclaim pages regardless of whether those pages belong to groups which are over their limits or not. Per-control-group reclaim, instead, can only work with one group at a time; as a result, it tends to hammer certain groups while leaving others untouched. The multiple LRU lists are not just complex, they are also expensive. A list_head structure is 16 bytes on a 64-bit system. If that system has 4GB of memory, it has 1,000,000 pages, so 16 million bytes are dedicated just to the infrastructure for the per-group LRU lists.
This is the kind of situation that kernel developers are referring to when they say that control groups have been "bolted onto" the rest of the kernel. This structure was an effective way to learn about the memory controller problem space and demonstrate a solution, but there is clearly room for improvement here.
The memcg naturalization patches from Johannes Weiner represent an attempt to create that improvement by better integrating the memory controller with the rest of the virtual memory subsystem. At the core of this work is the elimination of the duplicated LRU lists. In particular, with this patch set, the global LRU no longer exists - all pages exist on exactly one per-group LRU list. Pages which have not been charged to a specific control group go onto the LRU list for the "root" group at the top of the hierarchy. In essence, per-group reclaim takes over the older global reclaim code; even a system with control groups disabled is treated like a system with exactly one control group containing all running processes.
Algorithms for memory reclaim necessarily change in this environment. The core algorithm now performs a depth-first traversal through the control group hierarchy, trying to reclaim some pages from each. There is no global aging of pages; each group has its oldest pages considered for reclaim regardless of what's happening in the other groups. Each group's hard and soft limits are considered, of course, when setting reclaim targets. The end result is that global reclaim naturally spreads the pain across all control groups, implementing each group's policy in the process. The implementation of control group soft limits has been integrated with this mechanism, so now soft limit enforcement is spread more fairly across all control groups in the system.
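The shape of that reclaim pass can be sketched as a recursive, depth-first walk. This is conceptual pseudo-C only; the helper names below are invented for illustration and do not come from the patch set:

```c
struct mem_cgroup;

/* Hypothetical helpers: hierarchy traversal and per-group LRU scanning. */
struct mem_cgroup *first_child(struct mem_cgroup *group);
struct mem_cgroup *next_sibling(struct mem_cgroup *group);
unsigned long shrink_group_lrus(struct mem_cgroup *group, unsigned long goal);

/*
 * Ask every group in the hierarchy to give up some of its oldest pages,
 * honoring each group's limits inside shrink_group_lrus(), instead of
 * scanning one global list in ignorance of group boundaries.
 */
static unsigned long shrink_hierarchy(struct mem_cgroup *group, unsigned long goal)
{
        unsigned long reclaimed = 0;

        for (; group && reclaimed < goal; group = next_sibling(group)) {
                reclaimed += shrink_group_lrus(group, goal - reclaimed);
                reclaimed += shrink_hierarchy(first_child(group), goal - reclaimed);
        }
        return reclaimed;
}
```

In this picture, reclaim for a single over-limit group is simply the same walk started at that group instead of at the root.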
Johannes's patch improves the situation while shrinking the code by over 400 lines; it also gets rid of the memory cost of the duplicated LRU lists. On the down side, it makes some fundamental changes to the kernel's memory reclaim algorithms and heuristics; such changes can cause surprising regressions on specific workloads and, thus, tend to need a lot of scrutiny and testing. Absent any such surprises, this early-stage patch set looks like a promising step toward the goal of turning control groups into a proper kernel feature.
ARM kernel consolidation
Some of you might have heard about some discomfort with the state of the ARM architecture in the kernel recently. Given that ARM Linux consolidation was one of the issues that Linaro was specifically set up to address, it is only fair to ask “What is Linaro doing about this?” So it should not come as a surprise that this topic featured prominently at the recent Linaro Developers Summit in Budapest, Hungary.
Duplicate code and out-of-tree patches make Linux on ARM more difficult to use and develop for. Therefore, Linaro is working to consolidate code and to push code upstream. This should make the upstream Linux kernel more capable of handling ARM boards and system-on-chips (SoCs). However, ARM Linux kernel consolidation is an issue not just for Linaro, but rather across the entire ARM Linux kernel community, as well as the ARM SoC, board, and system vendors. Therefore, although I expect that Linaro will play a key role, the ultimate solution spans the entire ARM community. It is also important to note that this effort is a proposal for an experiment rather than a set of hard-and-fast marching orders.
Code organization
If we are to make any progress at all, we must start somewhere. An excellent place to start is by organizing the ARM Linux kernel code by function rather than by SoC/board implementation. Grouping together code with similar purposes will make it easier to notice common patterns and, indeed, common code.
For example, currently many ARM SoCs use similar “IP blocks” (such as I2C controllers) but each SoC provides a completely different I2C driver that lives in the corresponding arch/arm/mach- directory. We expect that drivers for identical hardware “IP blocks” across different ARM boards and SoCs will be consolidated into a single driver that works with any system using the corresponding IP block. In some cases, differences in the way that a given IP block is connected to the SoC or board in question may introduce complications, but such complications can almost always be addressed.
This raises the question of where similar code should be moved to. The short answer that was agreed to by all involved is “Not in the arch/arm directory!” Drivers should of course move to the appropriate subdirectory of the top-level drivers tree. That said, ARM SoCs have a wide variety of devices ranging from touchscreens to GPS receivers to accelerometers, and new types of devices can be expected to appear. So in some cases it might be necessary not merely to move the driver to a new place, but also to create a new place in the drivers tree.
But what about non-driver code? Where should it live? It is helpful to look at several examples: (1) the struct clk code that Jeremy Kerr, Russell King, Thomas Gleixner, and many others have been working on, (2) the device-tree code that Grant Likely has been leading, and (3) the generic interrupt chip implementation that Thomas Gleixner has been working on.
The struct clk code is motivated by the fact that many SoCs and boards have elaborate clock trees. These trees are needed, among other things, to allow the tradeoff between performance and energy efficiency to be set as needed for individual devices on that SoC or board. The struct clk code allows these trees to be represented with a common format while providing plugins to accommodate behavior specific to a given SoC or board.
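The plugin idea can be sketched roughly as follows; this is an illustrative sketch rather than the actual API under discussion, and the field and function names are chosen for clarity rather than accuracy:

```c
struct clk;

struct clk_ops {                                /* supplied by SoC-specific code */
        int           (*enable)(struct clk *clk);
        void          (*disable)(struct clk *clk);
        unsigned long (*get_rate)(struct clk *clk);
        int           (*set_rate)(struct clk *clk, unsigned long rate);
};

struct clk {
        const struct clk_ops *ops;      /* the SoC's plugin */
        struct clk           *parent;   /* position in the clock tree */
        unsigned int          enable_count;
};

/* The tree walk is written once, in common code, for every SoC: */
int clk_enable(struct clk *clk)
{
        int ret = 0;

        if (clk->parent)
                ret = clk_enable(clk->parent);
        if (ret)
                return ret;
        if (clk->enable_count++ == 0 && clk->ops->enable)
                ret = clk->ops->enable(clk);
        return ret;
}
```

With the tree management written once, a new SoC would only need to fill in an ops structure describing how its particular clock hardware is switched and scaled.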
The generic interrupt chip implementation has a similar role, but with respect to interrupt distribution rather than clock trees.
Device trees are intended to allow the hardware configuration of a board to be represented via data rather than code, which should ease the task of creating a single Linux kernel binary that boots on a variety of ARM boards. The device-tree infrastructure patches have recently been accepted by Russell King, which should initiate the transition of specific board code to device tree descriptions.
The struct clk code is already used by both the ARM and SH CPU architectures, so it is not ARM-specific, but rather core Linux kernel code. Similarly, the device-tree code is not ARM-specific; it is also used by the PowerPC, Microblaze, and SPARC architectures, and even by x86, and is therefore also core Linux kernel code. The generic interrupt code goes even further, being common across all CPU architectures. The lesson here is that ARM kernel code consolidation need not necessarily be limited to ARM. In fact, the more architectures that a given piece of code supports, the more developers can be expected to contribute both code and testing to it, and the more robust and maintainable that code will be.
There will of course need to be at least some ARM-specific code, but the end goal is for that code to be limited to ARM core architecture code and ARM SoC core architecture code. Furthermore, the ARM SoC core architecture code should consist primarily of small plugins for core-Linux-kernel frameworks, which should in turn greatly ease the development and maintenance of new ARM boards and SoCs.
It is easy enough to write about doing this, but quite another thing to actually accomplish it. After all, although there are a good number of extremely talented and energetic ARM developers and maintainers, many of the newer ARM developers are also new to the Linux kernel, and cannot be expected to know where new code should be placed. Such people might be tempted to continue placing most of their code in their SoC and board subdirectories, which would just perpetuate the current ARM Linux kernel difficulties.
Part of the solution will be additional documentation, especially on writing ARM drivers and board ports. Deepak Saxena, the new Linaro Kernel Working Group lead, will be making this happen. Unfortunately, documentation is only useful to the extent that anyone actually reads it. Fortunately, just as every problem in computer science seems to be solvable by adding an additional level of indirection, every maintainership problem seems to be solvable by adding an additional git tree and maintainers. These maintainers would help generate common code and of course point developers at documentation as it becomes available.
Git trees
One approach would be to use Nicolas Pitre's existing Linaro kernel git tree. However, Nicolas's existing git tree is an integration tree that allows people to easily pull the latest and greatest ARM code against the most recent mainline kernel version. In contrast, a maintainership tree contains patches that are to be upstreamed, normally based on a more-recent mainline release candidate. If we tried to use a single git tree for both integration and for maintainership, we would either unnecessarily expose ARM users to unrelated core-kernel bugs, or we would fail to track mainline closely enough for maintainership, which would force a full rebase and testing cycle to happen in a very short time at the beginning of each merge window.
Of course, in theory we could have both maintainership and integration branches within the same git tree, but separating these two very different functions into separate git trees is most likely to work well, especially in the beginning.
This new git tree (which was announced on May 18) will have at least one branch per participating ARM sub-architecture, and these branches will not normally be subject to rebasing, thus making it easy to develop against this new tree. Following the usual practice, maintainers of participating ARM sub-architectures will send pull requests to a group of maintainers for this new tree. Also following the usual practice, a merge of all the branches will be sent to Stephen Rothwell's -next tree, but the branches will be individually pushed to Linus Torvalds, perhaps via Russell King's existing ARM tree.
The pushing of individual branches to Linus might seem surprising, but Linus really does want to see the conflicts that arise. Such conflicts presumably help Linus identify areas in need of his attention.
Of course, this new git tree will not be limited to Linaro, but neither is it mandatory outside of Linaro. That said, I am very happy to say that some maintainers outside of Linaro have expressed interest in participating in this effort.
The Budapest meeting put forward a list of members of the maintainership group for this new git tree, namely Arnd Bergmann, Nicolas Pitre, and Marc Zyngier, with help from Thomas Gleixner. Russell King will of course also have write access to this tree. The tree will be set up in time to handle the 2.6.41 merge window. The plan is to start small and grow by evolution rather than by any attempts at intelligent design.
As noted at the beginning of this article, this effort is an experiment rather than a set of hard-and-fast marching orders. Although this proposed experiment cannot be expected to solve each and every ARM Linux problem, it will hopefully provide a good start. Every little bit helps, and every cleanup frees a little time to start on the next cleanup. There is reason to hope that this effort will help to reduce the “endless amounts of new pointless platform code” that irritated Linus Torvalds last month.
Acknowledgments
I owe thanks to the many people who helped take notes at the recent Linaro Developers Summit in Budapest, and to all the people involved in the discussions, both in the room and via IRC. Special thanks go to Jake Edge, David Rusling, Nicolas Pitre, Deepak Saxena, and Grant Likely for their review of an early draft of this article. However, all remaining errors and omissions are the sole property of the author.
The platform problem
Your editor first heard the "platform problem" described by Thomas Gleixner. In short, the platform problem comes about when developers view the platform they are developing for as fixed and immutable. These developers feel that the component they are working on specifically (a device driver, say) is the only part that they have any control over. If the kernel somehow makes their job harder, the only alternatives are to avoid or work around it. It is easy to see how such an attitude may come about, but the costs can be high.
Here is a close-to-home example. Your editor has recently had cause to tear into the cafe_ccic Video4Linux2 driver in order to make it work in settings beyond its original target (which was the OLPC XO 1 laptop). This driver has a fair amount of code for the management of buffers containing image frames: queuing them for data, delivering them to the user, implementing mmap(), implementing the various buffer-oriented V4L2 calls, etc. Looking at this code, it is quite clear that it duplicates the functionality provided by the videobuf layer. It is hard to imagine what inspired the idiotic cafe_ccic developer to reinvent that particular wheel.
Or, at least, it would be hard to imagine except for the inconvenient fact that said idiotic developer is, yes, your editor. The reasoning at the time was simple: videobuf assumed that the underlying device was able to perform scatter/gather DMA operations; the Cafe device was nowhere near so enlightened. The obvious right thing to do was to extend videobuf to handle devices which were limited to contiguous DMA operations; this job was eventually done by Magnus Damm a couple years later. But, for the purposes of getting the cafe_ccic driver going, it simply seemed quicker and easier to implement the needed functionality inside the driver itself.
That decision had a cost beyond the bloating of the driver and the kernel as a whole. Who knows how many other drivers might have benefited from the missing capability in the years before it was finally implemented? An opportunity to better understand (and improve) an important support layer was passed up. As videobuf has improved over the years, the cafe_ccic driver has been stuck with its own, internal implementation which has seen no improvements at all. We ended up with a dead-end, one-off solution instead of a feature that would have been more widely useful.
Clearly, with hindsight, the decision not to improve videobuf was a mistake. In truth, it wasn't even a proper decision; that option was never really considered as a way to solve the problem. Videobuf could not solve the problem at hand, so it was simply eliminated from consideration. The sad fact is that this kind of thinking is rampant in the kernel community - and well beyond. The platform for which a piece of code is being written appears fixed and not amenable to change.
It is not all that hard to see how this kind of mindset can come about. When one develops for a proprietary operating system, the platform is indeed fixed. Many developers have gone through periods of their career where the only alternative was to work around whatever obnoxiousness the target platform might present. It doesn't help that certain layers of the free software stack also seem frustratingly unfixable to those who have to deal with them. Much of the time, there appears to be no alternative to coping with whatever has been provided.
But the truth of the matter is that we have, over the course of many years, managed to create a free operating system for ourselves. That freedom brings many advantages, including the ability to reach across arbitrary module boundaries and fix problems encountered in other parts of the system. We don't have to put up with bugs or inadequate features in the code we use; we can make it work properly instead. That is a valuable freedom that we do not exploit to its fullest.
This is a hard lesson to teach to developers, though. A driver developer with limited time does not want to be told that a bunch of duplicated or workaround code should be deleted and common code improved instead. Indeed, at a kernel summit a few years ago, it was generally agreed that, while such fixes could be requested of developers, to require them as a condition for the merging of a patch was not reasonable. While we can encourage developers to think outside of their specific project, we cannot normally require them to do so.
Beyond that, working on common code can be challenging and intimidating. It may force a developer to move out of his or her comfort zone. Changes to common code tend to attract more attention and are often held to higher standards. There is always the potential of breaking other users of that code. There may simply be the lack of time for - or interest in - developing the wider view of the system which is needed for successful development of common code.
There are no simple solutions to the platform problem. A lot of it comes down to oversight and mentoring; see, for example, the ongoing effort to improve the ARM tree, which has a severe case of this problem. Developers who have supported the idea of bringing more projects together in the same repository also have the platform problem in mind; their goal is to make the lines between projects softer and easier to cross. But, given how often this problem shows up just within the kernel, it's clear that separate repositories are not really the problem. What's really needed is for developers to understand at a deep level that platforms are amenable to change and that one does not have to live with second-rate support.
Patches and updates
Architecture-specific
Core kernel code
Device drivers
Documentation
Filesystems and block I/O
Memory management
Networking
Security-related
Benchmarks and bugs
Miscellaneous
Page editor: Jonathan Corbet
Distributions
UDS security discussions
The Ubuntu Developer Summit (UDS) is an interesting event. It has a much different format than a traditional technical conference because its focus is on making decisions for the upcoming Ubuntu release. I spent much of the conference's second day attending meetings in the security track, where the Ubuntu security team and members of the Ubuntu community discussed various topics of interest, with an eye toward what would end up in Oneiric Ocelot (11.10). Those meetings provided a look into the workings of the distribution and its security team.
![[UDS group photo]](https://static.lwn.net/images/2011/uds-group1-sm.jpg)
All of the UDS meetings are set up the same, with a "fishbowl" of half-a-dozen chairs in the center where the microphone is placed so that audio from the meeting can be streamed live. There are two projector screens in each room, one showing the IRC channel so that external participants can comment and ask questions; the other is generally "tuned" to the Etherpad notes for the session, though it can be showing the Launchpad blueprint or some other document of interest.
The team that is running the meeting sits in the fishbowl, while the other attendees are seated just outside of it, sometimes all over the floor and spilling out into the hallway. "Audience" participation is clearly an important part of UDS sessions.
Involving the community
The session on improving the community's relationship with the security team was rescheduled at the last minute—something that occurs with some frequency at UDS—but that didn't deter the folks who showed up from discussing the issue. The discussions at UDS "require" the presence of certain individuals or representatives of teams, and when they are not available, meetings get rescheduled. In this case, a community team representative was unavailable, but the discussion rolled on anyway.
One attendee asked about the opportunities available for those who are interested in helping out with Ubuntu security. Team members responded that one of the main areas where the security team needs help is in handling updates for the "universe" packages. Those packages are community maintained, which in practice means that some are well maintained and some are not. That stands in contrast to the packages in "main", which are maintained by the security team, so security updates are issued in a timely fashion.
Organizations or individuals could adopt packages in universe and ensure that they get timely updates as well. It is an excellent opportunity for anyone who does some programming to learn about Debian packaging, security practices, using the repositories, and more. For folks who are looking to become Ubuntu developers, doing updates for a package or packages would be good experience and would look good in the application process.
Interested users don't have to be security experts to help out; it's a matter of following any upstream security updates, applying them, and then testing. The testing needs to determine, at a minimum, that the update fixes the problem in question and doesn't break the application. For people interested in security but who don't know where to start, adopting a universe package for security updates is an excellent starting point.
Another audience member was interested in what protections there were for the process of creating and signing packages, as well as how the security team would handle a situation where a sensitive account was compromised. Without getting into specifics, the team explained that there are tools, logging, and procedures in place to detect and handle these kinds of problems. Should an event like that occur, emergency plans have been devised to handle it.
For updates, the security team acts as a buffer between the packagers and the signing process. There is a process that updates go through, which includes looking at the updated code and some automated testing to look for things like permission problems and setuid programs. There is a lot more testing that could be done automatically, which is something that the team would like to add and that could be done in collaboration with others. A Google employee who attended noted that the company was interested in using those kinds of tools, so that might be one opportunity for collaboration.
Security roundtable
Daily "roundtable" sessions are held in the morning for several of the tracks at UDS. Security was no exception. The sessions are kind of "catch-alls" that allow discussion of smaller topics that don't require a full hour, as well as allowing people to bring up topics that might need a full session scheduled later in the week.
The two main topics discussed in Tuesday's roundtable were the dissemination of Ubuntu security notices (USNs) and the idea of putting together a whitepaper that looked at the statistics of security updates for various releases. The team was trying to decide if there was value in continuing to put the USNs on the Bugtraq and Full-disclosure mailing lists given that there is a dedicated Ubuntu-security-announce list as well as a web page that tracks the USNs (the LWN security updates were also noted as a good source).
The general feeling was that anyone truly interested in following Ubuntu security updates would use one of the other methods, rather than try to extract them from the rest of the traffic on Bugtraq or Full-disclosure. One team member noted that the usefulness of Bugtraq dropped significantly when vendors started overwhelming the list with security updates. The current practice of sending to those lists is something that came into Ubuntu from Debian's practices. In the end, it was decided that an announcement would be made to those lists that Ubuntu would no longer post USNs there and pointing to the other sources of the update information.
A whitepaper that looked at the types and severities of vulnerabilities fixed for a particular Ubuntu release was also discussed. It would be a fair amount of work to pull together and there were questions about the purpose a study like that would serve. There are some technical hurdles in that many of the CVEs in the Ubuntu system are classified as "medium" because those values represent the priority of getting a fix out rather than the severity of the vulnerability.
There was some concern that such a study would not be worth the time that was invested in it. There were also worries that the whitepaper might be interpreted to be a comparison to the reports periodically issued by Red Hat. On the other hand, there may be lessons for the team in a report of that sort, including finding classes of vulnerabilities that could be short-circuited via kernel or other changes. Some of the data is already collected as part of the self-analysis that the team does monthly, so the general feeling seemed to be that at least gathering the rest of the data would be a useful exercise. Whether it results in a whitepaper probably depends on how much time the team can find to pull one together.
Ubuntu security notices
USNs came up in another discussion, this one about the format of those announcements. While it may be something of a boring subject for some, it is important that security announcements communicate the relevant information about a vulnerability clearly. In addition, some of the elements that make up the announcement end up in other places, like the Update Manager's description of an update.
The team had recently tweaked the format of USNs and wanted to discuss formalizing some of the language and formatting in the various elements. The first order of business was the "Issue Summary", which is a one-line description of the update. That is what appears in Update Manager so it needs to be written so that it is readable by anyone. That, of course, is difficult to do without either getting into security jargon or going on at length. Some standardization of the terminology and descriptions was deemed desirable, so the plan is to try to create templates for ten or so common application types (e.g. kernel, apache, firefox, ...) that would be vetted by the documentation and user experience teams for readability.
Other tweaks to the format were also discussed such as cleaning up the "Software Description" section that sometimes has multiple entries for the same package, which can be confusing. Creating a wiki page that described how to update the system (i.e. run Update Manager) which could be linked from announcements was also planned.
A look inside the sausage factory
![[UDS group photo]](https://static.lwn.net/images/2011/uds-group2-sm.jpg)
While some of these sessions may seem a little bland, I found them interesting for a number of reasons. In a week's time I was able to get a glimpse into what goes into the decision-making process for an Ubuntu release. It is a very community-oriented process that tries to be as inclusive as it can be, while still being focused on the task at hand. Rather than just having Canonical employees make decisions about the smaller details, hearing and acting on the community's concerns is clearly an important part of the process.
There were two or three hundred such sessions throughout the week, some looking into fairly emotional topics (like the default web browser and email client), others into adding support for an out-of-tree kernel feature to the distribution. I also sat through a review of the kernel configuration parameters to determine which would be enabled or disabled for the Oneiric kernel. It was somewhat tedious, as would probably be expected, but an interesting exercise that is done, in public, for each Ubuntu release.
And that is the essence of UDS: public review and discussion of the features that will appear in five-and-a-half months or so. There are also meals, parties, and lots of talks outside of the sessions—much like any conference—but the focus is clearly on getting things done.
[ I would like to thank Canonical for sponsoring my travel to Budapest for UDS. ]
Brief items
Distribution quotes of the week
So I'm using this as an excuse to remind everyone that if you "yum remove gnome* -y" from your Fedora 15 computer, and don't have KDE or any other graphical user interface to fall back to, the next time you reboot, it won't actually finish booting unless you set it to boot as Run Level 3.
Fedora 15 is on track
Fedora 15 is on track for its scheduled release on May 24. In the meantime, there is a release candidate available for testing.
Fedora 13 ARM beta
The third beta of the Fedora 13 ARM release is available for testing. "This release includes additional software not found in Beta2, most notably Abiword for your word processing needs. Unfortunately at this time we are not able to offer OpenOffice due to some issues with java packages ( java experts we could use your help! )."
Mageia 1 RC is available
The Mageia project has announced the availability of Mageia 1 RC.
- How does the upgrade feel like, starting from Mandriva Linux 2010.1 and 2010.2 to Mageia 1 RC?
- Are your computers and peripherals properly recognised and handled by Mageia?
- How does the installation process feel like to you?
LTTng in the Ubuntu kernel
The Linux Trace Toolkit next generation (LTTng) is a high-performance out-of-tree kernel tracer that has been integrated into several embedded distributions. It is currently being used in the Linaro kernel, which is based on Ubuntu's, but because of the size of the patch set, that is not seen as sustainable into the future. As it turns out, the LTTng team has been working on a way to reduce or eliminate the need for patches to the core kernel by turning LTTng into a kernel module.
Julien Desfossez attended the recent Ubuntu Developer Summit (UDS) to propose adding LTTng to Ubuntu, both for the upcoming 11.10 ("Oneiric Ocelot") as well as the 12.04 LTS release coming next year. Because of its integration with user space, as well as its use in Linaro, LTTng is seen as a desirable feature for Ubuntu. The question came down to how to get there.
There are two versions of LTTng 2.0, one of which requires a substantial, rather intrusive set of patches, while the other, 2.0-distro, only requires a small handful of changes that look to be fairly minor cleanups in the kernel. The ring buffer and the rest of LTTng have been moved to modules for 2.0-distro. Most of the functionality of LTTng is preserved, though there are a few missing pieces. The trace clock has been removed from 2.0-distro, so tracing in NMI contexts is no longer possible. In addition, there is no support for NO_HZ kernels.
Since it was not clear that the changes needed for 2.0-distro would make it upstream before the 11.10 kernel freeze, it was determined that the kernel team would help Desfossez create a personal package archive (PPA) for this release, with an eye toward enabling the feature in the 12.04 release. The LTTng team is still working to try to get the full 2.0 code upstream (beginning with the generic ring buffer that could be shared with Ftrace and perf), but the modularized version will be useful in the meantime.
Later in the week, Desfossez said that, in the day or two after the meeting, Mathieu Desnoyers had found a way to eliminate the need for any core kernel changes in 2.0-distro. Whether that will result in Ubuntu building and shipping the LTTng modules as a dkms package for 11.10 is not yet known. In any case, it would seem that LTTng will be available, in one form or another, for Ubuntu kernels going forward.
Distribution News
Debian GNU/Linux
DebConf "Newbies" Funding Initiative
This year the Debian Project invites "Newbies" and non-regular attendees to join DebConf11, which will take place in late July in Banja Luka, Bosnia. "As a special incentive, an extra travel fund has been set up, which is only available to new or returning DebConf attendees."
DebConf10 Final Report released
The DebConf team has released the DebConf10 Final Report. "It's a 46-page document which gives the reader an idea about the conference as a whole. It includes descriptions of talks, DebCamp and Debian Day activities, personal impressions, attendee and budgeting numbers, the work of various teams, social events, funny pictures and so on."
DNS security extensions now available for Debian's zone entries
The Debian Project has announced that its domains debian.org and debian.net are now secured by the DNS Security Extension (DNSSEC). The corresponding DNS records have recently been added in the .net and .org zones.
Removal of alpha/hppa from ftp-master.debian.org
The Debian ports for alpha and hppa have moved into the Debian Ports archive. "If you are a user of one of those two architectures please ensure that your sources.list entries point to the new location." Support for alpha and hppa in the Lenny release will continue.
Bits from the perl maintainers
The Debian perl maintainers have completed the perl 5.12 transition in the unstable repository. "As you'll probably have noticed by now, around 12 months after the first upstream 5.12 release, perl 5.12.3 was uploaded to unstable and has now, thanks to the superb work of the release team, has migrated to testing. This marks the first major new version of perl in Debian for three years."
Fedora
Fedora project switching to new contributor agreement
The word has gone out that Fedora is switching to its new, improved contributor agreement. Anybody wanting to contribute to Fedora (with a few exceptions) will need to accept the new agreement by June 17. The full text of the agreement (preceded by a FAQ) is available for the curious.
Change in requirements for Board, FESCo, and FAmSCo candidates
The Fedora Project has amended the requirements for candidates to elected and appointed roles in the Fedora Community. "Unfortunately, the laws in the United States which Fedora and Red Hat are subject to place very tight restrictions on the involvement of citizens of certain countries."
openSUSE
openSUSE 11.2 has reached end of SUSE support
Support for openSUSE 11.2 is officially over. The openSUSE Evergreen community effort will continue to provide some support for 11.2. From the Evergreen project page: "As this is the first trial it's not fully outlined which scope of the full distribution can be supported. The plan is to provide updates for as many components as possible. The same holds true for the time period of the support."
openSUSE announces location of 2011 conference
The 2011 openSUSE Conference will take place in Nuremberg, Germany, September 11-14. The conference will be co-located with the SUSE Labs conference. Additional information on the call for proposals can be found on this week's Announcements page.
Newsletters and articles of interest
Distribution newsletters
- Debian Project News (May 17)
- DistroWatch Weekly, Issue 405 (May 16)
- openSUSE Weekly News, Issue 175 (May 14)
Riding the Narwhal: Ars reviews Unity in Ubuntu 11.04 (ars technica)
Ryan Paul reviews the Unity shell in Ubuntu 11.04. "Ubuntu 11.04 pulls together years of Ubuntu usability enhancement efforts—including Unity and the much-improved panel system that has gradually emerged from the Ayatana project—and ties them together to deliver a rich and highly cohesive desktop experience. Although the result is compelling, there are still a lot of rough spots and limitations that chafe along the environment's edges. Some parts—such as the application lens—seem awkward, poorly designed, and incomplete."
New tasty delights in Android's Ice Cream Sandwich (CNet)
Marguerite Reardon answers questions about the Android "Ice Cream Sandwich" release. "The real purpose of Ice Cream Sandwich is to become the unifying version of Android for all mobile products that use the operating system. Though Honeycomb was designed for tablets, Ice Cream Sandwich will be a cross-platform OS. It will be the one OS that runs everywhere. This means it will allow developers to create apps once and then those apps will be able to operate on different devices with different screen sizes and different capabilities. The OS will essentially be smart enough to figure out which type of device the app is running on and then adjust parameters."
Puppy Linux: Top Dog of the Lightweight Distros (OSNews)
Howard Fosdick takes a look at Puppy Linux. "Flexibility is essential when working with low-end computers. You need software that runs on the system you have, rather than requiring you to upgrade, change, or fix hardware. Puppy doesn't impose hardware requirements. For example, Puppy installs and boots from any bootable device and saves your work to any writeable device. No hard disk, optical drive, or USB? No problem. Want to use your old SCSI drive, floppy, Zip drive, LS-120/240 Superdisk, or compact flash memory? Puppy does it. It's great to see a distro that leverages whatever odd old devices your system has."
Slack World interviews
The Slack World has interviews with Robby Workman and Eric Hameleers on the Slackware 13.37 release. Hameleers: "As you are well aware, a primary objective for Slackware (and essentially that of its creator Pat Volkerding) is to release "when it is ready". I think that the first time I mentioned to Pat that he should be thinking about finalizing what would become Slackware-13.2, was as far back as November or December. But Pat held out, waiting for a kernel version that he was satisfied with. That took a while! And the situation of driver stability in X.Org is a permanent source of frustration. It seems that we can never make everybody happy at the same time using any combination of X.Org, mesa and dri versions. That is the reason why you find alternative versions of mesa and the nouveau driver in the /testing directory of Slackware 13.37."
Page editor: Rebecca Sobol
Development
DVCS-autosync
Dropbox is a fantastic concept, but it has some major flaws — namely that the software is proprietary, and that users must entrust their files to a third party. Unfortunately, Linux users have few alternatives to Dropbox — and even fewer that are free software and mature enough for daily use. While not quite mature, DVCS-Autosync is looking like an interesting alternative for file synchronization between Linux systems.
One other option is SparkleShare, an attempt to provide a multi-platform, git-based replacement for Dropbox. However, SparkleShare is still in early development, and is somewhat ambitious — it attempts to replicate not only Dropbox's main feature (synchronizing files seamlessly) but also to provide collaboration tools and an easy to use GUI frontend for viewing file revisions. SparkleShare also puts off some potential users because of the project's technology choices — namely that SparkleShare is written in Mono and uses git rather than another Distributed Version Control System (DVCS).
Many users simply want a tool that will synchronize files, and have no need for collaboration features, multiplatform support, or a GUI to manage all of it. For those users who don't find SparkleShare appealing, DVCS-Autosync is a Python-based alternative that is not as full-featured — but it does get the job done.
René Mayrhofer announced DVCS-Autosync on March 10th of this year, though the tool had actually been around for quite a while before that. Mayrhofer says that the script started life before SparkleShare was announced, as part of managing his home directory using SVN and then git. Mayrhofer says he lacks time to try to contribute to SparkleShare, but was "pleased to see others go into the same direction" and that he wanted to publish what he had so far.
All about DVCS-Autosync
So what is DVCS-Autosync? It's a Python script that keeps DVCS repositories in sync when files are added, changed, or removed by automatically committing the changes, pushing them to the server, and having the clients pull them. It does this by watching a directory for changes using inotify, and by communicating between clients using XMPP (Jabber). Thus, when changes are made, clients receive a message over XMPP to notify them of the change and to initiate a pull from the git repository.
DVCS-Autosync has been based around (and tested with) git, but it's been written to allow use with other distributed version control systems — so if a user has a strong loyalty to Mercurial, they would be able to configure DVCS-Autosync to use Mercurial or another DVCS instead by editing the commands used for various DVCS-Autosync operations. For example, users could replace the default pull and push commands ("git pull" and "git push," respectively) with the appropriate pull and push commands for another DVCS.
Setting up and using DVCS-Autosync
Right now, DVCS-Autosync is not packaged for most distributions; Arch is the exception. Packages for Debian/Ubuntu are on the way but, for now, using DVCS-Autosync requires grabbing the source from Gitorious and installing the necessary dependencies. Users will need Python 2.6 or later (2.7 seems to work just fine), Pyinotify, and xmpppy (packaged as python-jabberbot on Ubuntu). Of course, git or your favorite DVCS is also required. Because the utility uses Jabber/XMPP to communicate between clients, you'll also need a Jabber account that you can use for its notifications.
The installation instructions on the DVCS-Autosync page seem to be a bit outdated. There's no "autosync.py" included with the source; the dvcs-autosync script is now what you're looking for. Also, you can install DVCS-Autosync using "python setup.py install" rather than copying the scripts manually.
The next step is to initialize a git repository on a central server, and then to clone that repository on each client that will be using DVCS-Autosync. Each client also requires a config file (included as .autosync-example in the source, under the main directory) that needs to be customized to add Jabber account information. After that, run dvcs-autosync and, if all is configured properly, it should just go. As long as dvcs-autosync is running, it will automatically commit files after they're added to the directory or changed, and synchronize files to the other clients after they've been checked in. Users do not need to make commits manually with dvcs-autosync, nor do they have the ability to.
Note that it will not operate quietly, however. The application is set up to display desktop notifications for each change. This is similar to Dropbox's default behavior — but seems utterly unnecessary past the first few hours when one might wish to ensure that the script is actually working. Mayrhofer says that the devel branch now allows users to configure notifications to go to the desktop, XMPP, both, or turn them off entirely.
Users can recover older versions of files, or view commit logs, using the standard git utilities — but dvcs-autosync doesn't have any special tools for that, either. In short, it is currently a single-purpose utility that syncs files between machines.
Looking ahead for DVCS-Autosync
DVCS-Autosync is an interesting alternative to Dropbox and SparkleShare, but it still has some room for improvement and a few missing features that may be problematic for some users.
First and foremost, DVCS-Autosync is missing encryption. While Dropbox's encryption is significantly compromised by the fact that the company can decrypt its users' files without the users' permission or knowledge, it does at least offer the feature. DVCS-Autosync, on the other hand, makes no provision for encryption at all. Mayrhofer says that he "definitely" plans to implement encryption at some point — but for now it's assumed that the repository is trusted. Mayrhofer is undecided how to implement encryption but may look at using git-annex, which has gained support for encrypted backends recently.
Using git-annex might help with another weakness in the current implementation: if you're using DVCS-Autosync for syncing large files that change often, you may be looking at a fairly large repository after a while because it will store the entire file for each change. However, this is a known problem and DVCS-Autosync contributor Dieter Plaetinck has suggested using git-annex to handle large files as a way around the current problem.
Moving directories gracefully, and providing a single commit entry for multiple file events (such as moving a directory, or uncompressing a tarball within the synced directory) are also on the list of bugs to handle soon. Longer term, Mayrhofer has indicated that he'd like to add context to commit messages (such as which applications are open at the time), and a way to specify commit messages via a traybar icon or popup rather than the general commit messages that are generated by DVCS-Autosync currently.
In its current state, DVCS-Autosync has a few kinks to be worked out before it can reliably replace Dropbox for most users. However, there seems to be some strong interest in the application already, and it could quickly become reliable enough to become a staple for Linux users who just want a simple file synchronization tool that requires little attention.
Brief items
Quotes of the week
A Linux system running over JavaScript
Fabrice Bellard has posted an x86 emulator written entirely in JavaScript which enables one to boot a Linux system inside a web browser. There are some technical details available for those who want to know how it works. "The CPU is close to a 486 compatible x86 without FPU. The lack of FPU is not a problem when running Linux as Operating System because it contains a FPU emulator. In order to be able to run Linux, a complete MMU is implemented."
NumPy 1.6.0 released
Release 1.6.0 of the NumPy numeric Python module is out. New features include a 16-bit floating-point type, a new and improved iterator, more polynomial types, and more.
Perl 5.14.0 released
The Perl 5.14.0 release is out. Changes in this release include Unicode 6.0 support, better IPv6 support, some regular expression enhancements, performance improvements, and more. See the 5.14.0 perldelta page for lots of details.
GNU SIP Witch 1.0 released
GNU SIP Witch is a SIP protocol provisioning and call server; it is a part of the GNU Free Call project. The 1.0 release is now available. "In conjunction with this release, the GNU Free Call project is distributing an initial release of our technological assistance package for common computing platforms by providing our switchview desktop client for use with GNU SIP Witch on your local machine. In the future TAP will enable multi-platform personal encryption, include further support for desktop and mobile secure calling, and provide other basic and common computing services missing on some platforms."
Announcing TermKit
Steven Wittens has written a lengthy weblog entry describing TermKit, a terminal emulator built on WebKit. "So while I agree that having a flexible toolbox is great, in my opinion, those pieces could be built a lot better. I don't want the computer equivalent of a screwdriver and a hammer, I want a tricorder and a laser saw. TermKit is my attempt at making these better tools and addresses a couple of major pain points. I see TermKit as an extension of what Apple did with OS X, in particular the system tools like Disk Utility and Activity Monitor. Tech stuff doesn't have to look like it comes from the Matrix." Source is available on Github.
De Icaza: Announcing Xamarin
On his blog, Miguel de Icaza has announced Xamarin, which is a new company formed to create Mono-based products, specifically for iOS and Android. The company is made up of the Mono team that was recently laid off by Attachmate, and will also continue development of the free software Mono and Moonlight platforms. "The new versions of .NET for the iPhone and Android will be source compatible with MonoTouch and Mono for Android. Like those versions, they will be commercial products, built on top of the open core Mono. [...] In addition, we are going to provide support and custom development of Mono. A company that provides International Mono Support, if you will."
GNOME release-team-lurkers: for people who can read and not respond
The GNOME release-team mailing list is intended for private discussions between release team members. However, for those who would like to follow along, the release-team-lurkers list has been set up for a trial run. "If you wanted to follow and can *refrain* from taking part in discussions, subscribe to above list and you'll get a copy of all release-team emails."
Newsletters and articles
Development newsletters from the last week
- Caml Weekly News (May 17)
- PostgreSQL Weekly News (May 15)
- Python-URL! (May 18)
- Tcl-URL! (May 17)
What Every C Programmer Should Know About Undefined Behavior #1/3
The LLVM project blog has the beginning of a three-part series on undefined behavior in C. "Undefined behavior exists in C-based languages because the designers of C wanted it to be an extremely efficient low-level programming language. In contrast, languages like Java have eschewed undefined behavior because they want safe and reproducible behavior across implementations, and willing to sacrifice performance to get it. While neither is 'the right goal to aim for,' if you're a C programmer you really should understand what undefined behavior is."
What every C Programmer should know about undefined behavior #2/3
The second installment in the series on undefined behavior in C has been posted to the LLVM blog. "The end result of this is that we have lots of tools in the toolbox to find some bugs, but no good way to prove that an application is free of undefined behavior. Given that there are lots of bugs in real world applications and that C is used for a broad range of critical applications, this is pretty scary."
Poettering: systemd for developers I
Lennart Poettering has started a new series of systemd articles; this set is aimed at developers. "As you can see this code is actually much shorter then the original. This of course comes at the price that our little service with this change will no longer work in a non-socket-activation environment. With minimal changes we can adapt our example to work nicely both with and without socket activation..."
Page editor: Jonathan Corbet
Announcements
Brief items
Google "Chromebooks" launch
Google has announced the forthcoming commercial availability of "Chromebook" systems built on ChromeOS. "These are not typical notebooks. With a Chromebook you won't wait minutes for your computer to boot and browser to start. You'll be reading your email in seconds. Thanks to automatic updates the software on your Chromebook will get faster over time. Your apps, games, photos, music, movies and documents will be accessible wherever you are and you won't need to worry about losing your computer or forgetting to back up files. Chromebooks will last a day of use on a single charge, so you don't need to carry a power cord everywhere. And with optional 3G, just like your phone, you'll have the web when you need it. Chromebooks have many layers of security built in so there is no anti-virus software to buy and maintain. Even more importantly, you won't spend hours fighting your computer to set it up and keep it up to date." These systems have Linux inside, of course, though one would be hard put to tell from the announcement; LWN reviewed a ChromeOS system in January.
HP: 10 Reasons for Geeks to Love HP webOS
HP has posted a page promoting WebOS for developers. "We have an awesome independent developer community in webOS Internals that does things like replacement kernels, new system services, and overclocking tools. Our community produces innovations that have made their way into later webOS releases; for example, we liked the page cache compression work that they did to improve webOS 1.4.5 so much that we made it part of our standard Linux kernels on webOS 2.0. HP hasn't tried to stop or silence these groups; instead we work with them when possible and even give them hardware to help with their explorations."
Mark Webbink takes over Groklaw 2.0
Pamela Jones has used her last full-time day at Groklaw to announce that Mark Webbink will be running the show from here. "Now that the battlefield has shifted from SCO attacking Linux to Microsoft using patents against it and from servers to mobiles, I realized that Groklaw needs a lawyer at the helm. So I asked Mark Webbink if he would take on this role, and I'm thrilled to tell you that he has accepted. He is the new editor of Groklaw as of today. Mark was General Counsel at Red Hat, as you know, and he is on the board of the Software Freedom Law Center. He is also a law professor, which as I'll explain is a vital piece of what he has planned. Mark is a visiting professor at New York Law School where he runs the Center for Patent Innovations, oversees the Peer To Patent project run with the U.S. Patent and Trademark Office, has been active in seeking reform of the U.S. patent system, and teaches patent licensing."
Announcements from Libre Graphics Meeting
The 6th Libre Graphics Meeting 2011 was held May 10-13, in Montreal. During the meeting the company Fabricatorz helped launch new releases of the Open Clip Art Library and the Open Font Library. They also demonstrated the Milkymist One video synthesizer.
Articles of interest
Some Observations on Oracle v. Google (Groklaw)
Groklaw has an article by Mark Webbink (law professor, former Red Hat general counsel). "So if you are going to develop a new implementation of something like the Java run-time environment, you have to not only use a clean room in order to avoid copyright claims, you also have to work around any relevant patents (and this doesn't require a clean room). Suffice it to say that the approach Google has taken has some potential holes in it with respect to patents."
How Yahoo won the Bedrock patent trial that Google lost (Thomson Reuters)
Google was not the only company sued for patent infringement by Bedrock; there were several other defendants, including Yahoo. Most of those defendants have settled, but Yahoo stuck it out to the end and got a "not infringing" verdict for its pain. Thomson Reuters looks at what happened differently this time to enable Yahoo to win. "First off, Bedrock had a stronger case against Google. [Bedrock counsel Douglas] Cawley put on evidence that Google used Bedrock's Linux code on its servers (although Google got rid of the code before trial). Yahoo, on the other hand, used a different form of Linux, and its lead trial lawyer, Yar Chaikovsky and Fay Morisseau of McDermott Will, were able to argue that Yahoo never executed the Bedrock code."
Groklaw - "The blog that made a difference"
The H talks with Pamela "PJ" Jones about her work on Groklaw. "I could never have done Groklaw without all those volunteers who helped me carry the burdens and shared the fun. People show up with skills and because they have those skills and you don't, you'd never think to try what they propose, but when they show you, it's wonderful. That's how we started doing charts of legal documents, comparing versions of a complaint and highlighting the changes or doing one of a complaint and an answer or two opposing memorandums of law. You can see in a glance what matters, what changed, what is at issue, with color coding."
Education and Certification
LPI Announces New Training Partners in China and Philippines
The Linux Professional Institute (LPI) has announced new LPI-Approved Training Partners in the region: Beijing Shenghao Boyuan Technology Company of mainland China and Concentrix of the Philippines.
Calls for Presentations
COSCUP 2011 Call For Proposals
The Conference for Open Source Coders, Users and Promoters (COSCUP) will be held August 20-21, 2011, in Taipei, Taiwan. The call for proposals is open until June 17. "COSCUP is the largest annual FLOSS conference organized by local communities in Taiwan. The conference has sessions for new users, enthusiastic promoters, coders or anyone who is interested in cutting-edge FLOSS technologies. The goal is to create a friendly and informative environment for people from different communities to make friends, learn new technologies and inspire each other."
openSUSE Conference 2011 CfP open!
This year's openSUSE Conference will take place September 11-14, in Nuremberg, Germany. The call for proposals is open until July 11. "The committee is looking for a wide range of talks and sessions from Free Software contributors, however openSUSE related topics are obviously our focus. To simplify the wide range of activities one could plan, we have created three different sessions following the Read/Write/Execute theme. For Read-Only there are talks with the traditional slides and 5-10 minutes Q&A at the end. For Write there are the BOF sessions where discussions can take place. Finally, in Workshops the Execute bit can be set!"
Upcoming Events
Events: May 26, 2011 to July 25, 2011
The following event listing is taken from the LWN.net Calendar.
Date(s) | Event | Location |
---|---|---|
June 1–June 3 | Workshop Python for High Performance and Scientific Computing | Tsukuba, Japan |
June 1 | Informal meeting at IRILL on weaknesses of scripting languages | Paris, France |
June 1–June 3 | LinuxCon Japan 2011 | Yokohama, Japan |
June 3–June 5 | Open Help Conference | Cincinnati, OH, USA |
June 6–June 10 | DjangoCon Europe | Amsterdam, Netherlands |
June 10–June 12 | Southeast LinuxFest | Spartanburg, SC, USA |
June 13–June 15 | Linux Symposium'2011 | Ottawa, Canada |
June 15–June 17 | 2011 USENIX Annual Technical Conference | Portland, OR, USA |
June 20–June 26 | EuroPython 2011 | Florence, Italy |
June 21–June 24 | Open Source Bridge | Portland, OR, USA |
June 27–June 29 | YAPC::NA | Asheville, NC, USA |
June 29–July 2 | 12º Fórum Internacional Software Livre | Porto Alegre, Brazil |
June 29 | Scilab conference 2011 | Palaiseau, France |
July 9–July 14 | Libre Software Meeting / Rencontres mondiales du logiciel libre | Strasbourg, France |
July 11–July 16 | SciPy 2011 | Austin, TX, USA |
July 11–July 12 | PostgreSQL Clustering, High Availability and Replication | Cambridge, UK |
July 11–July 15 | Ubuntu Developer Week | online event |
July 15–July 17 | State of the Map Europe 2011 | Wien, Austria |
July 17–July 23 | DebCamp | Banja Luka, Bosnia |
July 19 | Getting Started with C++ Unit Testing in Linux | |
July 24–July 30 | DebConf11 | Banja Luka, Bosnia |
If your event does not appear here, please tell us about it.
Page editor: Rebecca Sobol