Weekly Edition for March 14, 2013

Some impressions from Linaro Connect

By Jonathan Corbet
March 13, 2013
One need only have a quick look at the LWN conference coverage index to understand that our community does not lack for opportunities to get together. A relatively recent addition to the list of Linux-related conferences is the series of "Linaro Connect" events. Recently, your editor was finally able to attend one of these gatherings: Linaro Connect Asia in Hong Kong. Various talks of interest have been covered in separate articles; this article will focus on the event itself.

Linaro is an industry consortium dedicated to improving the functionality and performance of Linux on the ARM processor; its list of members includes many of the companies working in this area. Quite a bit of engineering work is done under the Linaro banner, to the point that it was the source of 4.6% of the changes going into the 3.8 kernel. A lot of Linaro's developers are employed by member companies and assigned to Linaro, but the number of developers employed by Linaro directly has been growing steadily. All told, there are hundreds of people whose work is related to Linaro in some way.

Given that those people work for a lot of different companies and are spread across the world, it makes sense that they would all want to get together on occasion. That is the purpose of the Linaro Connect events. These conferences are open to any interested attendee, but they are focused on Linaro employees and assignees who otherwise would almost never see each other. The result is that, in some ways, Linaro Connect resembles an internal corporate get-together more than a traditional Linux conference.

So, for example, the opening session was delivered by George Grey, Linaro's CEO; he used it to update attendees on recent developments in the Linaro organization. The Linaro Enterprise Group (LEG) was announced last November; at this point there are 25 engineers working with LEG and 14 member companies. More recently, the Linaro Networking Group was announced as an initiative to support the use of ARM processors in networking equipment. This group has 12 member companies, two of which have yet to decloak and identify themselves.

Life is good in the ARM world, George said; some 8.7 billion ARM chips were shipped in 2012. There are many opportunities for expansion, not the least of which is the data center. He pointed out that, in the US, data centers are responsible for 2.2% of all energy use; ARM provides the opportunity to reduce power costs considerably. The "Internet of things" is also a natural opportunity for ARM, though it brings its own challenges, not the least of which is security: George noted that he really does not want his heart rate to be broadcast to the world as a whole. And, he said, the upcoming 64-bit ARMv8 architecture is "going to change everything."

The event resembled a company meeting in other ways; for example, one of the talks on the first day was an orientation for new employees and assignees. Others were mentoring sessions aimed at helping developers learn how to get code merged upstream. One of the sessions on the final day was for the handing out of awards to the people who have done the most to push Linaro's objectives forward. And a large part of the schedule (every afternoon, essentially) was dedicated to hacking sessions aimed at the solution of specific problems. It was, in summary, a focused, task-oriented gathering meant to help Linaro meet its goals.

There were also traditional talk sessions, though the hope was for them to be highly interactive and task-focused as well. Your editor was amused to hear the standard complaint of conference organizers everywhere: despite their attempts to set up and facilitate discussions, more and more of the sessions seem to be turning into lecture-style presentations with one person talking at the audience. That said, your editor's overall impression was of an event with about 350 focused developers doing their best to get a lot of useful work done.

If there is a complaint to be made about Linaro Connect, it would be that the event, like much in the mobile and embedded communities, is its own world with limited connections to the broader community. Its sessions offered help on how to work with upstream; your editor, in his talk, suggested that Linaro's developers might want to work harder to be the upstream. ARM architecture maintainer Russell King was recently heard to complain about Linaro Connect, saying that it works outside the community and that "It can be viewed as corporate takeover of open source." It is doubtful that many see Linaro in that light; indeed, even Russell might not really view things in such a harsh way. But Linaro Connect does feel just a little bit isolated from the development community as a whole.

In any case, that is a relatively minor quibble. It is clear that the ARM community would like to be less isolated, and Linaro, through its strong focus on getting code upstream, is helping to make that happen. Contributions from the mobile and embedded communities have been increasing steadily for the last few years, to the point that they now make up a significant fraction of the changes going into the kernel. That can be expected to increase further as ARM developers become more confident in their ability to work with the core kernel, and as ARM processors move into new roles. Chances are that, in a few years, we'll have a large set of recently established kernel developers, and that quite a few of them will have gotten their start at events like Linaro Connect.

[Your editor would like to thank Linaro for travel assistance to attend this event.]


LC-Asia: Facebook contemplates ARM servers

By Jonathan Corbet
March 12, 2013
By any reckoning, the ARM architecture is a big success; there are more ARM processors shipping than any other type. But, despite the talk of ARM-based server systems over the last few years, most people still do not take ARM seriously in that role. Jason Taylor, Facebook's Director of Capacity Engineering & Analysis, came to the 2013 Linaro Connect Asia event to say that it may be time for that view to change. His talk was an interesting look into how one large, server-oriented operation thinks ARM may fit into its data centers.

It should come as a surprise to few readers that Facebook is big. The company claims 1 billion users across the planet. Over 350 million photographs are uploaded to Facebook's servers every day; Jason suggested that perhaps 25% of all photos taken end up on Facebook. The company's servers handle 4.2 billion "likes," posts, and comments every day, along with vast numbers of user check-ins. To be able to handle that kind of load, Facebook invests a lot of money in its data centers; that, in turn, has led naturally to a high level of interest in efficiency.

Facebook sees a server rack as its basic unit of computing. Those racks are populated with five standard types of server; each type is optimized for the needs of one of the top five users within the company. Basic web servers offer a lot of CPU power, but not much else, while database servers are loaded with a lot of memory and large amounts of flash storage capable of providing high I/O operation rates. "Hadoop" servers offer medium levels of CPU and memory, but large amounts of rotating storage; "haystack" servers offer lots of storage and not much of anything else. Finally, there are "feed" servers with fast CPUs and a lot of memory; they handle search, advertisements, and related tasks. The fact that these servers run Linux wasn't really even deemed worth mentioning.
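
As a rough illustration of that taxonomy, the five types can be sketched as simple resource profiles. The type names come from the talk; the levels below are invented placeholders, not Facebook's published specifications:

    // Sketch: the five standard server types as resource profiles.
    // Type names are from the talk; the "low/medium/high" levels are
    // illustrative guesses, not published Facebook specifications.
    const serverTypes = {
      web:      { cpu: "high",   ram: "low",    flash: "none", disk: "none" },
      database: { cpu: "medium", ram: "high",   flash: "high", disk: "none" },
      hadoop:   { cpu: "medium", ram: "medium", flash: "none", disk: "high" },
      haystack: { cpu: "low",    ram: "low",    flash: "none", disk: "high" },
      feed:     { cpu: "high",   ram: "high",   flash: "none", disk: "none" },
    };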

There are clear advantages to focusing on a small set of server types. The machines become cheaper as a result of volume pricing; they are also easier to manage and easier to move from one task to another. New servers can be allocated and placed into service in a matter of hours. On the other hand, these servers are optimized for specific internal Facebook users; everybody else just has to make do with servers that might not be ideal for their needs. Those needs also tend to change over time, but the configuration of the servers remains fixed. There would be clear value in the creation of a more flexible alternative.

Facebook's servers are currently all built using large desktop processors made by Intel and AMD. But, Jason noted, interesting things are happening in the area of mobile processors. Those processors will cross a couple of important boundaries in the next year or two: 64-bit versions will be available, and they will start reaching clock speeds of 2.4 GHz or so. As a result, he said, it is becoming reasonable to consider the use of these processors for big, compute-oriented jobs.

That said, there are a couple of significant drawbacks to mobile processors. The number of instructions executed per clock cycle is still relatively low, so, even at a high clock rate, mobile processors cannot get as much computational work done as desktop processors. And that hurts because processors do not run on their own; they need to be placed in racks, provided with power supplies, and connected to memory, storage, networking, and so on. A big processor reduces the relative cost of those other resources, leading to a more cost-effective package overall. In other words, if a "wimpy core" delivers one-third of the computing power, the other fixed costs of building a complete, working system must be paid three times over.
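
To make that arithmetic concrete, consider a deliberately invented comparison (the talk gave no specific figures): if the rack slot, power supply, network interface, and memory cost about the same regardless of which processor occupies the slot, then a processor with one-third the performance causes those fixed costs to be paid three times per unit of computing delivered.

    // Back-of-the-envelope sketch of the "wimpy core" cost problem.
    // All figures are invented for illustration only.
    function costPerUnitCompute(processorCost, fixedCost, relativePerf) {
      return (processorCost + fixedCost) / relativePerf;
    }

    const fixed = 600; // rack slot, power, NIC, memory, ... (assumed equal)
    // Brawny: full performance, so the fixed costs are paid once.
    console.log(costPerUnitCompute(400, fixed, 1.0));  // => 1000
    // Wimpy: one-third the performance, so the fixed costs are paid
    // three times per unit of compute (600 becomes 1800).
    console.log(costPerUnitCompute(50, fixed, 1 / 3)); // => 1950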

Facebook's solution to this problem is a server board called, for better or worse, "Group Hug." This design, being put together and published through Facebook's Open Compute Project, puts ten ARM processor boards onto a single server board; each processor has a 1Gb network interface which is aggregated, at the board level, into a single 10Gb interface. The server boards have no storage or other peripherals. The result is a server board with far more processors than a traditional dual-socket board, but with roughly the same computing power as a server board built with desktop processors.

These ARM server boards can then be used in a related initiative called the "disaggregated rack." The problem Facebook is trying to address here is the mismatch between available server resources and what a particular task may need. A particular server may provide just the right amount of RAM, for example, but the CPU will be idle much of the time, leading to wasted resources. Over time, that task's CPU needs might grow, to the point that, eventually, the CPU power on its servers may be inadequate, slowing things down overall. With Facebook's current server architecture, it is hard to keep up with the changing needs of this kind of task.

In a disaggregated rack, the resources required by a computational task are split apart and provided at the rack level. CPU power is provided by boxes with processors and little else — ARM-based "Group Hug" boards, for example. Other boxes in the rack may provide RAM (in the form of a simple key/value database service), high-speed storage (lots of flash), or high-capacity storage in the form of a pile of rotating drives. Each rack can be configured differently, depending on a specific task's needs. A rack dedicated to the new "graph search" feature will have a lot of compute servers and flash servers, but not much storage. A photo-serving rack, instead, will be dominated by rotating storage. As needs change, the configuration of the rack can change with it.
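
A minimal sketch may help to visualize the composition; the box types mirror the talk's description, but the function and the counts are invented for illustration:

    // Sketch of a disaggregated rack: single-purpose resource boxes
    // ("compute" could be Group Hug boards) assembled per workload.
    // Box names and counts are illustrative, not Facebook's design.
    function buildRack(counts) {
      const rack = [];
      for (const [type, n] of Object.entries(counts)) {
        for (let i = 0; i < n; i++) rack.push({ type, id: `${type}-${i}` });
      }
      return rack;
    }

    // Graph search is compute- and flash-heavy; photo serving is
    // dominated by rotating storage.
    const graphSearchRack = buildRack({ compute: 24, flash: 12, disk: 2 });
    const photoRack       = buildRack({ compute: 2, flash: 2, disk: 34 });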

All of this has become possible because the speed of network interfaces has increased considerably. With networking speeds up to 100Gb/sec within the rack, the local bandwidth begins to look nearly infinite, and the network can become the backplane for computers built at a higher level. The result is a high-performance computing architecture that allows systems to be precisely tuned to specific needs and allows individual components to be depreciated (and upgraded) on independent schedules.

Interestingly, Jason's talk did not mention power consumption — one of ARM's biggest advantages — at all. Facebook is almost certainly concerned about the power costs of its data centers, but Linux-based ARM servers are apparently of interest mostly because they can offer relatively inexpensive and flexible computing power. If the disaggregated rack experiment succeeds, it may well demonstrate one way in which ARM-based servers can take a significant place in the data center.

[Your editor would like to thank Linaro for travel assistance to attend this event.]


SCALE: The life and times of the AGPL

By Nathan Willis
March 13, 2013

At SCALE 11x in Los Angeles, Bradley Kuhn of the Software Freedom Conservancy presented a unique look at the peculiar origin of the Affero GPL (AGPL). The AGPL was created to address application service providers (those offering Web-delivered services, for example) that skirt copyleft while adhering to the letter of licenses like the GPL, but, as Kuhn explained, it is not a perfect solution.

The history of AGPL has an unpleasant beginning, middle, and end, Kuhn said, but the community needs to understand it. Many people think of the AGPL in conjunction with the so-called "Application Service Provider loophole"—but it was not really a loophole at all. Rather, the authors of the GPLv2 did not foresee the dramatic takeoff of web applications—and that was not a failure, strictly speaking, since no one can foresee the future.

In the late 1980s, he noted, client/server applications were not yet the default, and in the early 1990s, client/server applications running over the Internet were still comparatively new. In addition, the entire "copyleft hack" that makes the GPL work is centered around distribution, as that term functions in copyright law. To the creators of copyleft, making private modifications to a work has never required publishing one's changes, he said, and that is the right stance: demanding publication in such cases would violate the user's privacy.

Nevertheless, when web applications took off, the copyleft community did recognize that web services represented a problem. In early 2001, someone at an event told Kuhn "I won’t release my web application code at all, because the GPL is the BSD license of the web." In other words, a service can be built on GPL code, but can incorporate changes that are never shared with the end user, because the end user does not download the software from the server. Henry Poole, who founded the web service company Allseer to assist nonprofits with fundraising, also understood how web applications inhibited user freedom, and observed that "we have no copyleft." Poole approached the Free Software Foundation (FSF) looking for a solution, which touched off the development of what became the AGPL.

Searching for an approach

Allseer eventually changed its name to Affero, after which the AGPL is named, but before that license was written, several other ideas to address the web application problem were tossed back and forth between Poole, Kuhn, and others. The first was the notion of "public performance," which is a concept already well-established in copyright law. If running the software on a public web server is a public performance, then perhaps, the thinking went, a copyleft license's terms could specify that such public performances would require source distribution of the software.

The trouble with this approach is that "public performance" has never been defined for software, so relying on it would be unpredictable: as an undefined term, it would not be clear when it did and did not apply. Without such a definition, it would be difficult to write a public performance clause into (for example) the GPL and guarantee that it was strong enough to address the web application issue. Kuhn has long supported adding a public performance clause anyway, saying it would be at worst a "no op," but so far he has not persuaded anyone else.

The next idea floated was that of the Ouroboros, which in antiquity referred to a serpent eating its own tail, but in classic computer science terminology also meant a program that could generate its own source code as output. The idea is also found in programs known as quines, Kuhn said, although he only encountered the term later. Perhaps the GPL could add a clause requiring that the program be able to generate its source code as output, Kuhn thought. The GPLv2 already requires in §2(c) that an interactive program produce a copyright notice and information about obtaining the license. Thus, there was a precedent that the GPL can require adding a "feature" for the sole purpose of preserving software freedom.
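
For readers who have not seen one, a quine is easy to demonstrate. Here is a minimal JavaScript example (purely an illustration of the concept; it has no connection to any FSF draft language):

    // A minimal quine: evaluating this expression prints its own
    // source text on standard output.
    (function f() { console.log("(" + f.toString() + ")()"); })()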

The long and winding license development path

In September 2002, Kuhn proposed adding the "print your own source code" feature as §2(d) in a new revision of the GPL, which would then be published as version 2.2 (and would serve as Poole's license solution for Affero). Once the lawyers started actually drafting the language, however, they dropped the "computer-sciencey" focus of the print-your-own-source clause and replaced it with the AGPL's now-familiar "download the corresponding source code" feature requirement instead. Poole was happy with the change and incorporated it into the AGPLv1. The initial draft was "buggy," Kuhn said, with flaws such as specifying the use of HTTP, but it was released by the FSF, and was the first officially sanctioned fork of the GPL.

The GPLv2.2 (which would have incorporated the new Affero-style source code download clause) was never released, Kuhn said, even though Richard Stallman agreed to the release in 2003. The reasons the release was never made were mostly bad ones, Kuhn said, including Affero (the company) entering bankruptcy. But there was also internal division within the FSF team. Kuhn chose the "wrong fork," he said, and spent much of his time on license enforcement actions and technical work, which distracted him from other tasks. Meanwhile, other FSF people started working on the GPLv3, and the still-unreleased version 2.2 fell through the cracks.

Kuhn and Poole had both assumed that the Affero clause was safely part of the GPLv3, but those working on the license development project left it out; by the time Kuhn realized what had happened, he said, the first drafts of GPLv3 had already appeared without the clause. Fortunately, however, Poole insisted on upgrading the AGPLv1, and AGPLv3 was written to maintain compatibility with GPLv3. AGPLv3 was not released until 2007, but in the interim Richard Fontana wrote a "transitional" AGPLv2 that projects could use to migrate from AGPLv1 to the freshly-minted AGPLv3. Regrettably, though, the release of AGPLv3 was made with what Kuhn described as a "whimper." A lot of factors—and people—contributed, but ultimately the upshot is that the Affero clause did not revolutionize web development as had been hoped.

The Dark Ages

In the time that elapsed between the Affero clause's first incarnation (in 2002) and the release of AGPLv3 (in 2007), Kuhn said, the computing landscape had changed considerably. Ruby on Rails was born, for example, launching a widely popular web development platform that had no ties to the GPL community. "AJAX"—which is now known simply as JavaScript, but at the time was revolutionary—became one of the most widely adopted ways to deliver services. Finally, he said, the availability of venture-capital funding trained new start-ups to build their businesses on a "release everything but your secret sauce" model.

Open source had become a buzzword-compliance checkbox to tick, but the culture of web development did not pick copyleft licenses, opting instead largely for the MIT License and the three-clause BSD License. The result is what Kuhn called "trade secret software." It is not proprietary in the old sense of the word: since it runs on a server, it is never installed on users' machines, and the user never has any opportunity to get it.

The client side of the equation is no better; web services deliver what they call "minified" JavaScript: obfuscated code that is intentionally compressed. This sort of JavaScript should really be considered a compiled JavaScript binary, Kuhn said, since it is clearly not the "preferred form for modifying" the application. An example snippet he showed illustrated the style:

    try{function e(b){throw b;}var i=void 0,k=null;
    function aa(){return function(b){return b}}
    function m(){return function(){}}
    function ba(b){return function(a){this[b]=a}}
    function o(b){ return function(){return this[b]}}
    function p(b){return function(){return b}}var q;
    function da(b,a,c){b=b.split(".");c=c||ea;
    !(b[0]in c)&&c.execScript&&c.execScript("var "+b[0]);
    for(var d;b.length&&(d=b.shift());)
    function fa(b,a){for(var c=b.split("."),d=a||ea,f;f=c.shift();)
which is not human-readable.
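
To see exactly what minification throws away, compare a readable function with a hand-minified equivalent (a contrived example, not taken from any real site; production minifiers go further still):

    // Readable source: names and layout document the intent.
    function clampToRange(value, minimum, maximum) {
      if (value < minimum) return minimum;
      if (value > maximum) return maximum;
      return value;
    }

    // The same logic after typical minification: the behavior is
    // identical, but names, whitespace, and comments are gone.
    function c(a,b,d){return a<b?b:a>d?d:a}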

Microsoft understands the opportunity in this approach, he added, noting that proprietary JavaScript can be delivered to run even on an entirely free operating system. Today, the "trade secret" server side plus "compiled JavaScript" client side has become the norm, even with services that ostensibly are dedicated to software freedom, like the OpenStack infrastructure or the git-based GitHub and Bitbucket.

In addition to the non-free software deployment itself, Kuhn worries that software freedom advocates risk turning into a "cloistered elite" akin to monks in the Dark Ages. The monks were literate and preserved knowledge, but the masses outside the walls of the monastery suffered. Free software developers, too, can live comfortably in their own world as source code "haves" while the bulk of computer users remain source code "have-nots."

One hundred years out

Repairing such a bifurcation would be a colossal task. Among other factors, the rise of web application development represents a generational change, Kuhn said. How many of today's web developers have chased a bug from the top of the stack all the way down into the kernel? Many of them develop on Mac OS X, which is proprietary but of very good quality (as opposed to Microsoft, he commented, which was never a long-term threat since its software was always terrible...).

Furthermore, if few of today's web developers have chased a bug all the way down the stack, as he suspects, tomorrow's developers may not ever need to. There are so many layers underneath a web application framework that most web developers do not need to know what happens in the lowest layers. Ironically, the success of free software has contributed to this situation as well. Today, the best operating system software in the world is free, and any teenager out there can go download it and run it. Web developers can get "cool, fun jobs" without giving much thought to the OS layer.

Perhaps this shift was inevitable, Kuhn said, and even if GPLv2.2 had rolled out the Affero clause in 2002 and he had done the best possible advocacy, it would not have altered the situation. But the real question is what the software freedom community should do now.

For starters, he said, the community needs to be aware that the AGPL can be—and often is—abused. This is usually done through "up-selling" and license enforcement done with a profit motive, he said. MySQL AB (now owned by Oracle) is the most prominent example; because it holds the copyright to the MySQL code and offers it under both GPL and commercial proprietary licenses, it can pressure businesses into purchasing commercial proprietary licenses by telling them that their usage of the software violates the GPL, even if it does not. This technique is one of the most frequent uses of the AGPL (targeting web services), Kuhn said, and "it makes me sick," because it goes directly against the intent of the license authors.

But although using the AGPL for web applications does not prevent such abuses, it is still the best option. Preserving software freedom on the web demands more, however, including building more federated services. There are a few examples, he said, including MediaGoblin, but the problem that such services face is the "Great Marketing Machine." When everybody else (such as Twitter and Flickr) deploys proprietary web services, the resulting marketing push is not something that licensing alone can overcome.

The upshot, Kuhn said, is that "we’re back to catching up to proprietary software," just as GNU had to catch up to Unix in earlier decades. That game of catch-up took almost 20 years, he said, but then again an immediate solution is not critical. He is resigned to the fact that proprietary software will not disappear within his lifetime, he said, but he still wants to think about 50 or 100 years down the road.

Perhaps there were mistakes made in the creation and deployment of the Affero clause, but as Kuhn's talk illustrated, the job of protecting software freedom in web applications involves a number of discrete challenges. The AGPL is not a magic bullet, nor can it change today's web development culture, but the issues it addresses are vital for the long-term preservation of software freedom. The other wrinkle, of course, is that there is a wide range of opinions about what constitutes software freedom on the web: some draw the line at whatever software runs on the user's local machine (i.e., the JavaScript components), while others insist that public APIs and open access to data are what really matter. The position advocated by the FSF and by Kuhn is the most expansive, but thanks to it, developers now have another licensing option at their disposal.

