January 26, 2011
This article was contributed by Nathan Willis
Free software advocates have been pushing hard against the growing trend of commercial Software-as-a-Service (SaaS) — and the resulting loss of autonomy and software freedom — for several years now. A new project named Unhosted takes a different approach to the issue than that used by better-known examples like Diaspora and StatusNet. Unhosted is building a framework in which all of a web application's code is run on the client-side, and users have the freedom to choose any remote data storage location they like. The storage nodes use strong encryption, and because they are decoupled from the application provider, users always have the freedom to switch between them or to shut off their accounts entirely.
The Unhosted approach
An outline of the service model envisioned by Unhosted can be found on the project's Manifesto page, written by founder Michiel de Jong. "A hosted website provides two things: processing and storage. An unhosted website only hosts its source code (or even just a bootloader for it). Processing is done in the browser, with ajax against encrypted cloud storage."
In other words, the manifesto continues, despite the availability of the Affero GPL (AGPL), which requires making source code available to network end-users, licensing alone is not enough to preserve user freedom because proprietary SaaS sites require users to upload their data to "walled silos" run by the service provider. An Unhosted application is a JavaScript program that runs in the browser, but accesses online storage on a compliant storage node. It does not matter to the application whether the storage node is run by the application provider, the user, or a third party.
Storage nodes are essentially commodity infrastructure, but in order to
preserve user freedom, Unhosted requires that applications encrypt and sign
the data they store.
The project defines an application-layer protocol called Unhosted JSON Juggling
Protocol (UJJP, sometimes referred to as UJ), which applications use to
communicate with storage nodes by requesting and exchanging objects in
JavaScript Object Notation (JSON) format.
As the FAQ explains, this constitutes a distinctly different model from that of most other free software SaaS projects. Most (like StatusNet and Diaspora) focus on federation, which allows each user to run his or her own node, and requires no centralized authority linking all of the user accounts. The downside of federated systems is that they may still require users to entrust their data to a remote server.
Eben Moglen's FreedomBox, on the other hand,
focuses on putting the storage under the direct control of the user
(specifically, stored at home on a self-managed box). This is a greater
degree of freedom, but home-hosting is less accessible from the Internet at
large than most web services because it often depends on Dynamic
DNS. Home-hosting is also vulnerable to limited upstream bandwidth and common ISP restrictions on running servers.
Unhosted, therefore, attempts to preserve the "accessible anywhere" nicety of popular Web 2.0 services, but de-link the application from the siloed data.
Connecting applications to storage
Obviously, writing front-end applications entirely in HTML5 and JavaScript is not a new idea. The secret sauce of Unhosted is the connection method that links the application to the remote storage node or, more precisely, to any user-defined storage node. The system relies on Cross-Origin Resource Sharing (CORS), a W3C Working Draft mechanism by which a server can opt in to making its resources available to requests originating from other origins, that is, from pages served by a different site.
In the canonical "web mail" example, the Unhosted storage node sees a cross-origin request from the webmail application, checks the source, user credentials, and request type against its access control list, and returns the requested data only if the request is deemed valid. UJJP defines the operations an application can perform on the storage node, including creating a new data store, setting and retrieving key-value pairs, importing and exporting data sets, and completely deleting a data store.
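The check described above can be sketched roughly as follows. This is a minimal illustration in Python, not the project's actual storage node code: the ACL layout, the `handle_request` function, the command names, and the token table are all assumptions made for the example. Only the `Access-Control-Allow-Origin` response header comes from the CORS draft itself.

```python
# Hypothetical access control list: which origins may act on which
# users' data stores, and with which commands. Not the real UJJP layout.
ACL = {
    ("https://webmail.example", "alice@example.org"): {"GET", "SET"},
}

# Hypothetical credential table; a real node would not store plain tokens.
TOKENS = {"alice@example.org": "s3cret"}


def handle_request(origin, user, token, command, store):
    """Return (status, headers, body) for one cross-origin request."""
    allowed = ACL.get((origin, user), set())
    if TOKENS.get(user) != token or command not in allowed:
        # No Access-Control-Allow-Origin header is sent, so the browser
        # will refuse to hand the response to the calling script.
        return 403, {}, None
    headers = {"Access-Control-Allow-Origin": origin}
    return 200, headers, store.get(user, {})
```

A request from an unlisted origin, with a bad credential, or with a command outside the allowed set is rejected before any data is read; only a request that passes all three checks gets the user's data back along with the header that lets the browser deliver it.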
Security-wise, each application only has access to its own data store,
not the user's entire storage space, and CORS does allow each storage node
to determine a policy about which origins it will respond to. But beyond
that, the system also relies on the fact that the user has access to all of
the application source code, because it runs in the browser. Thus it is up
to the user to notice if the application does something sinister like relay
user credentials to an untrusted third party. Dealing with potentially
obfuscated JavaScript may be problematic for users, but it is still an improvement over server-side processing, which happens entirely out of sight.
Finally, each application needs a way to discover which storage node a user account is associated with, preferably without prompting the user for the information every time. The current Unhosted project demo code relies on Webfinger-based service discovery, which uniquely associates a user account with an email address. The user would log in to the application with an email address, the application would query the address's Webfinger identity to retrieve a JSON-formatted array of Unhosted resource identifiers, and connect to the appropriate one to find the account's data store.
This is not a perfect solution, however, because it depends on the email service provider supporting Webfinger. Other proposed mechanisms exist, including using Jabber IDs and Freedentity.
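The Webfinger-based lookup described above might look roughly like this. It is a sketch under stated assumptions: the function names are invented, and the JSON layout (an `unhosted` array of `app`/`node` pairs) is purely hypothetical, standing in for whatever the user's email provider would actually return. No network access is performed; the document is passed in as a string.

```python
import json


def host_for(email):
    """Return the domain part of an email-style user address."""
    user, sep, host = email.rpartition("@")
    if not sep or not user or not host:
        raise ValueError(f"not an email-style address: {email!r}")
    return host


def pick_storage_node(webfinger_json, app_origin):
    """Pick the storage node listed for this application, if any.

    The {"unhosted": [{"app": ..., "node": ...}]} layout is invented
    for illustration only.
    """
    doc = json.loads(webfinger_json)
    for entry in doc.get("unhosted", []):
        if entry.get("app") == app_origin:
            return entry["node"]
    return None
```

The application would use `host_for()` to decide which provider to query, then hand the provider's response to `pick_storage_node()`; a `None` result corresponds to the failure case noted above, where the provider offers no usable record.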
The tricky bits
Currently, one of the biggest sticking points in the system is protecting the stored data without making the system arduous for end users. The present model relies on RSA encryption and signing for all data stores. Although the project claims this is virtually transparent for users, it gets more difficult when one Unhosted application user wishes to send a message to another user. Because the other user is on a different storage node, that user's public key needs to be retrieved in order to encrypt the message. But the system cannot blindly trust any remote storage node to authoritatively verify the other user's identity — that would be trivial to hijack. In response, the Unhosted developers are working on a "fabric-based public key infrastructure" that enables users to deterministically traverse through a web-of-trust from one user ID to another. Details on that part of the system are still forthcoming.
It is also an open question what sort of storage engine makes a suitable base for an Unhosted storage node. The demo code includes servers written in PHP, Perl, and Python that all run on top of standard HTTP web servers. On the mailing list, others have discussed a simple way to implement Unhosted storage on top of WebDAV, but there is no reason that a storage node could not be implemented on top of a distributed filesystem like Tahoe, or a decentralized network like BitTorrent.
Perhaps the most fundamental obstacle facing Unhosted is that it eschews server-side processing altogether. Consequently, no processing can take place while the user is logged out of the application. Logged out could simply mean that the page or tab is closed, or an application could provide a logout mechanism that disconnects from the storage node, but continues to perform other functions. This is fine for interactive or message-based applications like instant messaging, but it limits the type of application that can be fit into the Unhosted mold. Judging by the mailing list, the project members have been exploring queuing up operations on the storage node side, which could enable more asynchronous functionality, but Unhosted is still not a replacement for every type of SaaS.
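The queuing idea mentioned above can be illustrated with a toy sketch. The `StorageNode` class and its method names are hypothetical and are not taken from the Unhosted code; the point is only that the node stores pending items and does no processing until a client shows up to collect them.

```python
from collections import defaultdict, deque


class StorageNode:
    """Toy stand-in for a storage node that queues messages per user."""

    def __init__(self):
        self._queues = defaultdict(deque)

    def enqueue(self, recipient, message):
        # The node only stores the (already encrypted) message; it does
        # no processing of its own.
        self._queues[recipient].append(message)

    def drain(self, recipient):
        # Called by the client-side application once the user logs in.
        queue = self._queues[recipient]
        messages = list(queue)
        queue.clear()
        return messages
```

All of the real work still happens in the browser when `drain()` is called, which is why this approach adds asynchronous delivery but cannot substitute for server-side processing in general.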
Actual code and holiday bake-offs
The project has a GitHub repository, which
is home to some demonstration code showing off both parts of the Unhosted platform, although it loudly warns users that it is not meant for production use. The "cloudside" directory includes an example Unhosted storage node implementation, while the "wappside" directory includes three example applications designed to communicate with the storage node.
The storage node module speaks CORS and is written in PHP with a MySQL back-end. It does not contain any server-side user authentication, so it should not be deployed outside the local area network, but it works as a sample back-end for the example applications.
The example application set includes a JavaScript library named unhosted.js that incorporates RSA data signing and signature verification, encryption and decryption, and AJAX communication with the CORS storage node. There is a separate RSA key generation Web utility provided as a convenience, but it is not integrated into the example applications.
The example named "wappblog" is a simple blog-updating application. It creates a client-side post editor that updates the contents of an HTML file on a storage node, which is then retrieved for reading by a separate page. The "wappmail" application is a simple web mail application, which requires you to set up multiple user accounts, but shows off the ability to queue operations — incoming messages are stored and processed when each user logs in.
The third example is an address book, which demonstrates the fabric-based PKI system (although the documentation warns "it's so new that even I don't really understand how it works, and it's mainly there for people who are interested in the geeky details").
A more practical set of example applications comes from the third-party projects written for Unhosted's "Hacky Holidays" competition in December. The winning entry was Nathan Rugg's Scrapbook, which allows users to manipulate text and images on an HTML canvas, and shows how an Unhosted storage node can be used to store more than just plain text. Second place was shared between the instant messenger NezAIM and the note-taking application Notes.
The fourth entry, vCards, was deemed an honorable mention, although it used some client-side security techniques that would not work in a distributed environment in the real world (such as creating access control lists on the client side). The author of vCards was commended by the team for pushing the envelope of the protocol, though — he was one of the first to experiment with queuing operations so that one Unhosted application could pass messages to another.
Hackers wanted
At this stage, Unhosted is still primarily a proof-of-concept. The storage node code is very young, and has not been subjected to much real-world stress testing or security review. The developers are seeking input for the next (0.3) revision of UJJP, in which they hope to define better access control mechanisms for storage nodes (in part to enable inter-application communication) as well as a REST API.
On a bad day, I see "unresponsive script" warnings in Firefox and think
rich client-side JavaScript applications sound like a terrible idea, but
perhaps that is missing the bigger picture. StatusNet, Diaspora, and the
other federated web services all do a good job of freeing users from
reliance on one proprietary application vendor — but none of them are
designed to make the storage underneath a flexible, replaceable commodity.
One of the Unhosted project's liveliest metaphors for its storage
de-coupling design is that it provides "a grease layer" between the hosted software and the servers that host it. That is an original idea, whether or not the top layer is written in JavaScript.
By Jake Edge
January 26, 2011
Just four months after splitting away from the OpenOffice.org
project, LibreOffice has made
its first stable release. While LibreOffice 3.3 and the
(also just released)
OpenOffice.org 3.3 share most of the same code,
LibreOffice has started to differentiate itself from its progenitor. It
has also built an impressive community in that time, and will be included
in the next releases of the major community distributions.
From almost any angle, it looks like LibreOffice is on a roll.
There are quite a few new
features, as well as bug fixes, in the new release. Some of them may
not seem all that new, at least to those who have been using the
OpenOffice.org 3.3 release candidates. For some time, Linux users have
generally been getting a much-enhanced version of OpenOffice.org based on the
builds maintained by the Go-oo project.
Since Go-oo has essentially morphed into the LibreOffice project, much of
the new functionality will be found in both LibreOffice 3.3 and OpenOffice.org 3.3 on many
distributions.
For example, the SVG import feature for Writer and Draw that is listed as a
LibreOffice-only feature also appears in the latest OpenOffice.org for Fedora 14 (which is
based on OpenOffice.org 3.3-rc9 plus the Go-oo patches). It may be that Windows and
Mac OS X users are the most likely to notice big differences, depending on
where they were getting their version of OpenOffice.org (from Sun/Oracle or Go-oo).
It should also be noted that the SVG import feature still has some bugs
to be excised. On an import of the SVG of the LWN penguin, both LibreOffice and OpenOffice.org took many minutes
(on the order of ten) to render the SVG, and the rendering was incorrect. Both GIMP and
Inkscape render it in seconds (or less) and are both in agreement that it
should look much the way it does in the upper left of this page.
I gave LibreOffice 3.3 a try on Fedora 14. Not finding any experimental LibreOffice yum
repository in a quick search, I decided to go ahead and download the tarball, which
provided a handful of (unsigned) RPMs. After installing those, I found an additional "desktop-integration"
RPM to install, which conflicted with the various OpenOffice.org packages that were still
installed. After a moment's thought, I went ahead and removed OpenOffice.org, which
proved uneventful as LibreOffice is a drop-in replacement.
Working with various documents and spreadsheets in LibreOffice was also
uneventful, which is no real surprise. It's not clear what differences
there are between Fedora's OpenOffice.org 3.3 and LibreOffice 3.3, but they were not
particularly evident in the (fairly simple) documents that I worked with.
For power users, perhaps there are more obvious differences. But there is
also no reason to go back to OpenOffice.org that I can see. Apart from the
LibreOffice start-up splash screen, it really isn't apparent that you
aren't running OpenOffice.org.
Lots of Linux users will likely be using LibreOffice soon anyway, as
Ubuntu, openSUSE, and Fedora all plan to ship it in their next release.
openSUSE 11.4 is currently scheduled for March, so it may well be the first
of those to switch over to LibreOffice. But Ubuntu 11.04 ("Natty Narwhal")
and Fedora 15 won't be far behind, with the former scheduled for April and
the latter for May. Debian "Squeeze" (6.0) will still be shipping
OpenOffice.org (3.2.x), which is not surprising given the stability that
Debian likes to bake into its releases.
Looking beyond the 3.3 release, LibreOffice has a fairly aggressive
release schedule, with plans for 3.3.1 in February and 3.3.2 in March,
both of which will consist mostly of bug fixes.
There are also plans for a more major 3.4 release in May. Over time, the
plan is to
do major releases every six months and to try to align them with distribution
release cycles by making those releases in March and September.
The biggest new feature in LibreOffice 3.3 is probably the SVG import
mentioned earlier. Another is the ability to have spreadsheets with up to
a million rows (the previous limit was 64K). Many of the rest seem like
they will be popular with a much smaller subset of LibreOffice users,
though the improved support for importing MS Works, Lotus Word Pro, and
WordPerfect formats will likely strike a chord with folks who have to deal
with documents in those formats.
Many of the new features listed seem like longstanding bugs (or
misfeatures) of OpenOffice.org that are finally being addressed. An
easier-to-use title page dialog box, a tree view for headings, better slide layout
handling for Impress, radio button widgets in the menus, auto-correction
that correctly matches the case of the word replaced, and so on, all seem
like things that have been bugging users for some time but weren't getting
addressed in the OpenOffice.org releases.
The ability to address some of these warts is part of why LibreOffice
exists. The large number of patches that were carried along by the Go-oo
project was not going to be a sustainable model, and the development style
of the OpenOffice.org project made it unable, or unwilling, to incorporate many
of those kinds of changes, at least quickly. The LibreOffice developers have clearly
learned from that experience and are trying to fix these kinds of things as
quickly as reasonable patches are submitted.
One of the goals of the LibreOffice project is to be welcoming to new
developers and their patches. That's part of the reason that there is no
contributor agreement required for LibreOffice patches. But the welcoming
approach goes beyond that. The now semi-famous list of "easy hacks" as an
easy introduction for developers (and others) is a perfect example. Many
projects would probably find it easier to get people involved by
maintaining a similar list.
There is also an active development
mailing list, with discussions about all kinds of
patches, bugs, and features. There are other mailing
lists for users, design, marketing, documentation, and so on, along
with an active #documentfoundation IRC channel on Freenode.
Some friction is to be expected in the formative stages of a new project
and LibreOffice is not immune to that. The OOXML debate was one such
incident. In addition, steering committee member Florian Effenberger alludes
to some unhappiness in the community about the role of that
committee. Project governance is by no means a solved problem, and
community members will often disagree about the direction of the project
and its leadership. That certainly isn't just a problem for new projects, as
the current turmoil in the
FFmpeg project will attest.
OpenOffice.org is still chugging along, but a look at its mailing lists
suggests that there is far less enthusiasm in that community than in
LibreOffice's. That may not be a good way to measure, or even estimate,
a community's fervor, but it definitely seems like the wind has gone
out of OpenOffice.org's sails. Oracle has an interest in continuing
development of Oracle Open Office (formerly StarOffice), the commercial
offshoot of OpenOffice.org, but one has to wonder how long it will
be willing to maintain an
open source edition.
Because of the "corporate" development style and the contributor agreement
requirements for OpenOffice.org—two of the major reasons that
LibreOffice forked—it seems likely that external contributions, such
as they were, will be on the decline. The two projects have the
same LGPLv3
license, so code can theoretically migrate between them, but new features that go into
LibreOffice may not make their way into OpenOffice.org because of the
contributor agreement. That means that LibreOffice can cherry-pick
features from OpenOffice.org, at least as long as the code bases don't
diverge too much, while OpenOffice.org has to either forgo or reimplement
them. Should LibreOffice be successful, it will provide a pretty clear
object lesson on the perils of requiring contributor agreements.
Overall, the progress made by LibreOffice has been very impressive.
Obviously the Go-oo project (and OpenOffice.org itself) gave the
LibreOffice founders a good starting
point—and a lot of lessons and experience—but that doesn't
diminish what has been accomplished at all. One can only imagine the
strides that will be made over the next year or two. It will no doubt be
interesting to see where it goes from here.
By Jonathan Corbet
January 25, 2011
Vint Cerf is widely credited as one of the creators of the Internet.
So, when he stood up at linux.conf.au in Brisbane to say that the net is
currently in need of some "serious evolution," the attendees were more than
prepared to listen. According to Vint, it is not too late to create a
better Internet, despite the fact that we have missed a number of
opportunities to update the net's infrastructure. Quite a few problems
have been discovered over the years, but the solutions are within our
reach.
His talk started back in 1969, when he was hacking
a SIGMA 7 to make it talk to the ARPAnet's first Interface Message
Processor (IMP). The net has
grown a little since then; current numbers suggest that there are around
768 million connected machines - and that doesn't count the vast
numbers of systems with transient connections or which are hiding behind
corporate firewalls. Nearly 2 billion users have access to the net.
But, Vint said, that just means that the majority of the world's population
is still waiting to connect to the net.
From the beginning, the net was designed around the open architecture ideas
laid down by Bob Kahn. Military requirements were at the top of the list
then, so the designers of the net created a system of independent networks
connected via routers with no global control. Crucially, the designers had
no particular application in mind, so there are relatively few assumptions
built into the net's protocols. IP packets have no understanding of what
they carry; they are just hauling loads of bits around. Also important was
the lack of any country-based addressing scheme. That just would not make
sense in the military environment, Vint said, where it can be very
difficult to get an address space allocation from a country which one is
currently attacking.
The openness of the Internet was important: open source, open
access, and open standards. But Vint was also convinced from an early date
that commercialization of the Internet had to happen. There was no way
that governments were going to pay for Internet access for all their
citizens, so a commercial ecosystem had to be established to build that
infrastructure.
The architecture of the network has seen some recent changes. At the top
of the list is IPv6. Vint was, he said, embarrassed to be the
one who decided, in 1977, that 32 bits would be more than enough for
the net's addressing system. Those 32 bits are set to run out any day
now, so, Vint said, if you're not doing IPv6, you should be. We're seeing
the slow adoption of non-Latin domain names and the DNSSEC protocol. And,
of course, there is the increasing prominence of mobile devices on the net.
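The arithmetic behind the address shortage is easy to check. The short sketch below uses Python's standard `ipaddress` module to count every address in each protocol's space; the variable names are arbitrary and the comparison is illustrative, not something presented in the talk.

```python
import ipaddress

# Every possible IPv4 address: a 32-bit space.
ipv4_total = ipaddress.ip_network("0.0.0.0/0").num_addresses
# Every possible IPv6 address: a 128-bit space.
ipv6_total = ipaddress.ip_network("::/0").num_addresses

print(f"IPv4: {ipv4_total:,} addresses")
print(f"IPv6: {ipv6_total:.3e} addresses")
# The ratio is an exact power of two, so bit_length() recovers the exponent.
print(f"IPv6 is 2**{(ipv6_total // ipv4_total).bit_length() - 1} times larger")
```

A 32-bit space holds roughly 4.3 billion addresses, which is not much more than double the number of users already online, even before counting servers, phones, and other devices.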
One of the biggest problems which has emerged from the current Internet
is security. Vint was "most disturbed" that many of the problems are
not technical; they are a matter of suboptimal user behavior - bad
passwords, for example. He'd like to see the widespread use of two-factor
authentication on the net; Google is doing that internally now, and may try
to support it more widely for use with Google services. The worst
problems, he said, come from "dumb mistakes" like configuration errors.
So where are security problems coming from? Weak operating systems are
clearly a part of the problem; Vint hoped that open-source systems would
help to fix that. The biggest problem at the moment, though, is browsers.
Once upon a time, browsers were simple rendering engines which posed little
threat; now, though, they contain interpreters and run programs from the
net. The browser, he said, has too much privilege in the system; we need a
better framework in which to securely run web-based applications. Botnets
are a problem, but they are really just a symptom of easily-penetrated
systems. We all need to work on the search for better solutions.
Another big issue is privacy. User choices are a part of the problem here;
people put information into public places without realizing that it could
come back to haunt them later. Weak protection of information by third
parties is also to blame, though. But, again, technology isn't the
problem; it's more of a policy issue within businesses. Companies like
Google and others have come into possession of a great deal of
privacy-sensitive information; they need to protect it accordingly.
Beyond that, there's the increasing prevalence of "invasive devices,"
including cameras, devices with location sensors, and more. It is going to
be increasingly difficult to protect our privacy in the future; he
expressed worries that it may simply not be possible.
There was some talk about clouds. Cloud computing, he said, has a lot of
appeal. But each cloud is currently isolated; we need to work on how
clouds can talk to each other. Just as the Internet was created through
the connection of independent networks, perhaps we need an "intercloud" (your
editor's term - he did not use it) to
facilitate collaboration between clouds.
Vint had a long list of other research problems which have not been solved;
there was not time to talk about them all. But, he said, we have
"unfinished work" to deal with. This work can be done on the existing
network - we do not need to dump it and start over.
So what is this unfinished work? "Security at all levels" was at the top
of the list; if we can't solve the security problem, it's hard to see that
the net will be sustainable in the long run. We currently have no
equivalent to the Erlang
distribution to describe usage at the edges of the network, making
provisioning and scaling difficult. The quality of service (and network
neutrality) debate, he said, will be going on for a very long time. We
need better distributed algorithms to take advantage of mixed cloud
environments.
There were, he said, some architectural mistakes made which are now making
things harder. When the division was made between the TCP and IP layers,
it was decided that TCP would use the same addressing scheme as IP. That
was seen as a clever design at the time; it eliminated the need to
implement another layer of addressing at the TCP level. But it was a
mistake, because it binds higher-level communications to whatever IP
address was in use when the connection was initiated. There is no way to
move a device to a new address without breaking all of those connections.
In the designers' defense, he noted, the machines at the time, being
approximately room-sized, were not particularly mobile. But he wishes they
had seen mobile computing coming.
Higher-level addressing could still be fixed by separating the address used
by TCP from that used by IP. Phone numbers, he said, once were tied to a
specific location; now they are a high-level identifier which can be
rebound as a phone moves. The same could be done for network-attached
devices. Of course, there are problems to be solved - for example, it must
be possible to rebind a TCP address to a new IP address in a way which does
not expose users to session hijacking. This sort of high-level binding
would also solve the multi-homing and multipath problems; it would be
possible to route a single connection transparently through multiple ISPs.
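The binding problem can be made concrete with a small sketch. The first half models the way a TCP connection is identified by the address and port of each endpoint; the second half is a purely hypothetical alternative in which a stable session identifier is the key and the address is just data that can change. None of the function names correspond to a real protocol or API.

```python
# A TCP connection is identified by its endpoints' (address, port) pairs.
connections = {}


def open_connection(local, remote, state):
    connections[(local, remote)] = state


def lookup(local, remote):
    return connections.get((local, remote))


# Hypothetical alternative: key sessions by a stable identifier and keep
# the current address as mutable data that can be rebound.
sessions = {}


def open_session(session_id, address, state):
    sessions[session_id] = {"address": address, "state": state}


def rebind(session_id, new_address):
    # A real protocol would have to authenticate this step to prevent
    # session hijacking, as the talk noted.
    sessions[session_id]["address"] = new_address
```

In the first model, a device that moves to a new address can no longer be matched to its old table entry, so the connection is effectively lost. In the second, the session survives the move because its identity never depended on the address.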
Vint would also like to see us making better use of the net's broadcast
capabilities. Broadcast makes sense for real-time video, but it could be
applied in any situation where multiple users are interested in the same
content - for software updates, for example. He described the use of
satellites to "rain packets" to receivers; it is, he said, something which
could be done today.
Authentication remains an open issue; we need better standards and some
sort of internationally-recognized indicators of identity. Internet
governance was on the list; he cited the debate over network censorship in
Australia as an example. That sort of approach, he said, is "not very
effective." He said there may be times when we (for some value of "we")
decide that certain things should not be found on the net; in such
situations, it is best to simply remove such materials when they are
found. There is no hope in any attempt to stop the posting of undesirable
material in the first place. Governance, he said, will only become more
important in the future; we need to find a way to run the net which
preserves its fundamental openness and freedom.
Performance: that just gets harder as the net gets bigger; it can be
incredibly difficult to figure out where things are going wrong. He said
that he would like a button marked "WTF" on his devices; that button could
be pressed when the net isn't working to obtain a diagnosis of what the
problem is. But, to do that, we need better ways of identifying
performance problems on the net.
Addressing: what, he asked, should be addressable on the Internet?
Currently we assign addresses to machines, but, perhaps, we should assign
addresses to digital objects as well? A spreadsheet could have its own
address, perhaps. One could argue that a URL is such an address, but URLs
are dependent on the domain name system and can break at any time.
Important objects should have locators which can last over the long
term.
Along those lines, we need to think about the long-term future of complex
digital objects which can only be rendered with computers. If the software
which can interpret such an object goes away, the objects themselves
essentially evaporate. He asked: will Windows 3000 be able to interpret a
1997 PowerPoint file? We should be thinking about how these files will
remain readable over the course of thousands of years. Open source can
help in this regard, but proprietary applications matter too. He suggested
that there should be some way to "absorb" the intellectual property of
companies which fail, making it available so that files created by that
company's software remain readable. Again, Linux and open source have
helped to avoid that problem, but they are not a complete solution. We
need to think harder about how we will preserve our "digital stuff"; he is
not sure what the solution will look like.
Wandering into more fun stuff, Vint talked a bit about the next generation
of devices; a network-attached surfboard featured prominently. He talked
some about the sensor network in his house, including the all-important
temperature sensor which sends him a message if the temperature in his wine
cellar exceeds a threshold. But he'd like more information; he knows about
temperature events, or whether somebody entered the room, but there's no
information about what may have happened in the cellar. So maybe it is
time to put RFIDs on the bottles themselves. But that won't help him to
know if a specific bottle has gotten too warm; maybe it's time to put
sensors into the corks to track the state of the wine. Then he could
unerringly pick out a ruined bottle whenever he had to give a bottle of
wine to somebody who is unable to tell the difference.
The talk concluded with some discussion of the interplanetary network.
There was some amusing talk of alien porn and oversized ovipositors, but
the real problem is one of arranging for network communications within the
solar system. The speed of light is too slow, meaning that the one-way
latency to Mars is, at a minimum, about three minutes (and usually quite a
bit more). Planetary rotation can interrupt communications to specific
nodes; rotation, he says, is a problem we have not yet figured out how to
solve. So we need to build tolerance of delay and disruption deep into our
protocols. Some thoughts on this topic have been set down in RFC 4838, but there is more
to be done.
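The latency figure follows directly from distance divided by the speed of light. The sketch below works it out; the two Earth-Mars distances are approximate round figures assumed for illustration and were not given in the talk.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458

# Approximate Earth-Mars distances; assumed round figures, not from the talk.
CLOSEST_KM = 54.6e6
FARTHEST_KM = 401e6


def one_way_delay_minutes(distance_km):
    """One-way light travel time over the given distance, in minutes."""
    return distance_km / SPEED_OF_LIGHT_KM_S / 60


print(f"closest:  {one_way_delay_minutes(CLOSEST_KM):.1f} min")
print(f"farthest: {one_way_delay_minutes(FARTHEST_KM):.1f} min")
```

With these assumed distances, the one-way delay ranges from about three minutes to a little over twenty, so any protocol that waits for a round-trip handshake before sending data is a poor fit.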
We should also, Vint said, build network nodes into every device we send
out into space. Even after a device ceases to perform its primary
function, it can serve as a relay for communications. Over time, we could
deploy a fair amount of network infrastructure in space with little added
cost. That is a future he does not expect to see in its full form, but he
would be content to see its beginning.
There was a question from the audience about bufferbloat. It is,
Vint said, a "huge problem" that can only be resolved by getting device
manufacturers to fix their products. Ted Ts'o pointed out that LCA
attendees had been advised (via a leaflet in the conference goodie bag) to
increase buffering on their systems as a way of getting better network
performance in Australia; Vint responded that much harm is done by people
who are trying to help.
By Jonathan Corbet
January 26, 2011
Geoff Huston is the Chief Scientist at the Asia Pacific Network Information
Centre. His frank linux.conf.au 2011 keynote took a rather different tack than
Vint Cerf's talk did the day before.
According to Geoff, Vint is "a professional optimist." Geoff was not even
slightly optimistic: he sees a difficult period coming for the net, and, unless
things happen impossibly quickly, the open net that we often take for
granted may be gone forevermore.
The net, Geoff said, is based on two "accidental technologies": Unix and
packet switching. Both were new at their time, and both benefited from
open-source reference implementations. That openness created a network
which was accessible, neutral, extensible, and commercially exploitable.
As a result, proprietary protocols and systems died, and we now have a
"networking monoculture" where TCP/IP dominates everything. Openness was
the key: IPv4 was as mediocre as any other networking technology at that
time. It won not through technical superiority, but because it was open.
But staying open can be a real problem. According to Geoff, we're about to
see "another fight of titans" over the future of the net; it's not at all
clear that we'll still have an open net five years from now. Useful
technologies are not static; they change in response to the world around
them. Uses of technologies change: nobody expected all the mobile
networking users that we now have; otherwise we wouldn't be in the
situation we're in where, among other things, "TCP over wireless is crap."
There are many challenges coming. Network neutrality will be a big fight,
especially in the US. We're seeing more next-generation networks based
around proprietary technologies. Mobile services tend to be based on
patent-encumbered, closed applications. Attempts to bundle multiple types
of services - phone, television, Internet, etc. - are pushing providers
toward closed models.
The real problem
But the biggest single challenge by far is the simple fact that we are out
of IP addresses.
There were 190 million IP addresses allocated in 2009, and 249 million
allocated in 2010. There are very few addresses left at this time: IANA
will run out of IPv4 addresses in early February, and the regional
authorities will start running out in July. The game is over.
Without open addressing, we don't have an open network
that anybody can join. That, he said, is "a bit of a bummer."
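To put the allocation figures in perspective, they can be compared with the size of the whole IPv4 address space. This is a rough sketch: it divides by all 2^32 addresses, including reserved and private ranges, so the real share of the usable space is somewhat larger.

```python
import ipaddress

# Size of the whole IPv4 space and of one /8 block, taken from the
# standard library rather than hard-coded.
IPV4_TOTAL = ipaddress.ip_network("0.0.0.0/0").num_addresses   # 2**32
SLASH_8 = ipaddress.ip_network("10.0.0.0/8").num_addresses     # 2**24

# Yearly allocation figures as quoted in the talk.
ALLOCATED = {2009: 190_000_000, 2010: 249_000_000}


def share_of_ipv4(count: int) -> float:
    """Fraction of the entire 32-bit address space that count represents."""
    return count / IPV4_TOTAL


def slash8_equivalents(count: int) -> float:
    """How many /8 blocks count addresses would fill."""
    return count / SLASH_8


if __name__ == "__main__":
    for year, count in ALLOCATED.items():
        print(f"{year}: {share_of_ipv4(count):.1%} of IPv4, "
              f"about {slash8_equivalents(count):.1f} /8 blocks")
```

At the 2010 rate, a single year consumes close to six percent of every IPv4 address that can exist.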
This problem was foreseen back in 1990; in response, a nice plan - IPv6 - was
developed to ensure that we would never run out of network addresses. That
plan assumed that the transition to IPv6 would be well underway by the time
that IPv4 addresses were exhausted. Now that we're at that point, how is
that plan going? Badly: currently 0.3% of the systems on the net are running
IPv6. So,
Geoff said, we're now in a position where we have to do a full transition
to IPv6 in about seven months - is that feasible?
To make that transition, we'll have to do more than assign IPv6 addresses
to systems. This technology will have to be deployed across something like
1.8 billion people, hundreds of millions of routers, and more.
There's lots of fun system administration work to be done; think about all
of the firewall configuration scripts which need to be rewritten. Geoff's
question to the audience was clear: "you've got 200 days to get this done -
what are you doing here??"
Even if the transition can be done in time, there's another little problem:
the user experience of IPv6 is poor.
It's slow, and often unreliable. Are we really going to go through 200
days of panic to get to a situation which is, from a user point of view,
worse than what we have now? Geoff concludes that IPv6 is simply not the
answer in that time frame - that transition is not going to happen. So
what will we do instead?
One commonly-suggested approach is to make much heavier use of network
address translation (NAT) in routers. A network sitting behind a NAT
router does not have globally-visible addresses; hiding large parts of the
net behind multiple layers of NAT can thus reduce the pressure on the
address space. But it's not quite that simple.
Currently, NAT routers are an externalized cost for Internet service
providers; they are run by customers and ISPs need not worry about them.
Adding more layers of NAT will force ISPs to install those routers. And,
Geoff said, we're not talking about little NAT routers - they have to be
really big NAT routers which cannot fail. They will not be cheap. Even
then, there are problems: multiple levels of NAT will break applications
which have been carefully crafted to work around a single NAT router. How
NAT routers will play together is unclear - the IETF refused to standardize
NAT, so every NAT implementation is creative in its own way.
It gets worse: adding more layers of NAT will break the net in
fundamental ways. Every connection through a NAT router requires a port on
that router; a single web browser can open several connections in an
attempt to speed page loading. A large NAT router will have to handle large
numbers of connections simultaneously, to the point that it will run out of
port numbers. Port numbers are only 16 bits, after all. So ISPs are
going to have to think about how many ports they will make available to
each customer; that number will converge toward "one" as the pressure
grows. Our aperture to the net, Geoff said, is shrinking.
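The port arithmetic behind that shrinking aperture can be sketched as follows. The even split between customers and the block of reserved ports are simplifying assumptions made here for illustration; real carrier-grade NAT deployments allocate ports in other ways.

```python
PORT_BITS = 16
TOTAL_PORTS = 2 ** PORT_BITS   # 65,536 port numbers per address and protocol


def ports_per_customer(customers_per_address: int,
                       reserved_ports: int = 1024) -> int:
    """Ports each customer gets when one public address is shared evenly.

    reserved_ports is a hypothetical allowance for ports the NAT router
    keeps out of the shared pool; it is an assumption of this sketch.
    """
    if customers_per_address < 1:
        raise ValueError("need at least one customer per address")
    return (TOTAL_PORTS - reserved_ports) // customers_per_address


if __name__ == "__main__":
    for customers in (10, 100, 1000, 10000):
        print(f"{customers:>6} customers per address: "
              f"{ports_per_customer(customers)} ports each")
```

A browser that opens several connections per page, multiplied across the devices in a household, can use up a share of a few dozen ports very quickly.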
So perhaps we're back to IPv6, somehow. But there is no compatibility
between IPv4 and IPv6, so systems will have to run both protocols during
the transition. The transition plan, after all, assumed that it would be
completed before IPv4 addresses ran out. But that plan did not work; it
was, Geoff said, "economic rubbish." But we're going to have to live with
the consequences, which include running dual stacks for a transition period
that, he thinks, could easily take ten years.
During that time, we're going to have to figure out how to make our
existing IPv4 addresses last longer. Those addresses, he said, are going
to become quite a bit more expensive. There will be much more use of NAT,
and, perhaps, better use of current private addresses. Rationing policies
will be put into place, and governmental regulation may well come into
play. And, meanwhile, we know very little about the future we're heading
into. TCP/IP is a monoculture; there is nothing to replace it. We don't
know how long the transition will take, we don't know who the winners and
losers will be, and we don't know the cost. We live in interesting times.
An end to openness
Geoff worried that, in the end, we may never get to the point where we have
a new, IPv6 network with the same degree of openness we have now. Instead,
we may be heading toward a world where we have privatized large parts of
the address space. The problem is this: the companies which have lost the
most as the result of the explosion of the Internet - the carriers - are
now the companies which are expected to fund and implement the transition
to IPv6. They are the ones who have to make the investment to bring this
future around; will they really spend their money to make their future
worse? These companies have no motivation to create a new, open network.
So what about the companies which have benefited from the open net:
companies like Google, Amazon, and eBay? They are not going to rescue us
either for one simple reason: they are now incumbents. They have no
incentive to spend money which will serve mainly to enable more
competitors. They are in a position where they can pay whatever it
takes to get the address space they need; a high cost to be on the net is,
for them, a welcome barrier to entry that will keep competition down. We
should not expect help from that direction.
So perhaps it is the consumers who have to pay for this transition. But
Geoff did not see that as being realistic either. Who is going to pay
$20/month more for a dual-stack network which works worse than the one they
have now? If one ISP attempts to impose such a charge, customers will flee
to a competitor which does not. Consumers will not fund the change.
So the future looks dark. The longer we toy with disaster, Geoff said, the
more likely it is that the real loser will be openness. It is not at all
obvious that we'll continue to have an open net in the future. He doesn't
like that scenario; it is, he said, the worst possible outcome. We all
have to get out there, get moving, and fix this problem.
One of the questions asked was: what can we do? His answer was that we
really need to make better software: "IPv6 sucks" currently. Whenever IPv6
is used, performance goes down the drain. Nobody has yet done the work to
make the implementations truly robust and fast. As a result, even systems
which are capable of speaking both protocols will try to use IPv4 first;
otherwise, the user experience is terrible. Until we fix that problem,
it's hard to see how the transition can go ahead.
Page editor: Jonathan Corbet
Security
By Jake Edge
January 26, 2011
Web site visits are increasingly being tracked by advertisers and others,
ostensibly to better target advertising. But recording which sites we
visit as we click our way around the web is not only an invasion of
privacy, but one that has multiple avenues for abuse. Both Mozilla and
Google have recently announced browser features that could reduce or
eliminate tracking, at least for advertisers that comply.
Using a wide variety of techniques (browser or Flash cookies, web "bugs",
JavaScript trickery, browser fingerprinting, and so forth), advertising and
tracking companies are getting a detailed look at the web sites we visit.
Most web advertising also provides a means to track web site visitors on a
wide variety of sites, not just the single site where that particular ad
appears. It is somewhere between difficult and impossible for users to stop
this behavior, if they even know it is taking place. The information is then
stored by these third parties for their use, or to sell to others.
What privacy advocates would like is a way for users to opt out of
tracking. It would be better still if users had to opt in to tracking, but
an initiative like that is vanishingly unlikely because of opposition from
advertising/tracking companies. A subset of advertising companies has come
together in a group called the Network Advertising Initiative (NAI), which
provides an opt-out service to disable tracking by member companies. That
web page gives an eye-opening list of advertisers and the status of their
cookies in your browser. One can then choose which to opt out from (with a
helpful "Select All" button if one is willing to turn on JavaScript for that
site).
There are a number of problems with the NAI approach, as outlined
in a recent Electronic Frontier Foundation (EFF) blog posting. The biggest
problem from a privacy perspective is that some members interpret
opting out differently than others:
Some tracking companies recognize
that an "opt out" should be an opt out from being tracked, others insist on
interpreting the opt out as being an opt out for receiving targeted
advertising. In other words, the NAI allows its members to tell people
that they've opted out, when in fact their web browsing is still being
observed and recorded indefinitely.
Another problem is that the opt-out choice is recorded in a cookie for each
different advertising or tracking company, so one must visit that page
frequently as additional companies join the NAI. Privacy-conscious users
will also periodically delete their cookies, which also necessitates
revisiting that page. Overall, it is a fairly fragile solution.
Google's idea is to provide a Chrome extension ("Keep My Opt-Outs") that
blocks the deletion of the opt-out cookies (both browser and Flash cookies)
so that users can still delete the rest of their cookies without having to
re-up at the NAI web site. It is fundamentally just a list of cookies that
shouldn't be deleted, and that list will need to be updated periodically,
presumably through the extension update mechanism. It is similar to the Beef
TACO (Targeted Advertising Cookie Opt-Out) Firefox extension, though
TACO handles more than just the NAI-listed companies' cookies.
Keep My Opt-Outs and TACO are useful today, though they can't address
the problem of differing interpretations of the opt-out. Mozilla has gone
a step further and implemented a more sweeping change with its "Do Not
Track" HTTP header. Do Not Track is going to require buy-in from other
browsers and the tracking companies before it can even work, but it
"solves" the problem in a much simpler way.
The basic idea is straightforward: a user can indicate that they do not wish to be
tracked and Firefox will send a Do Not Track HTTP header with every
request. That header could be interpreted by the tracking companies as the
equivalent of their opt-out cookies. It would be even better if they
interpreted it to mean what it clearly says and would turn off all tracking,
rather than just turning off targeted (i.e. behavioral) advertising. The
latter will undoubtedly take some major convincing—or regulatory pressure.
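On the receiving side, honoring such a header takes very little code. The following is a minimal sketch, not any tracker's actual implementation. The header names are assumptions: "DNT" is the name Firefox went on to ship, "X-Do-Not-Track" is the name used by existing extensions, and the value "1" is taken to mean that the user has opted out. The handler and its return strings are hypothetical.

```python
from typing import Mapping

# Header names are assumptions of this sketch (see the text above);
# HTTP header names are case-insensitive, so they are compared in lowercase.
DO_NOT_TRACK_HEADERS = ("dnt", "x-do-not-track")


def requests_no_tracking(headers: Mapping[str, str]) -> bool:
    """Return True if the request carries a Do Not Track opt-out of "1"."""
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    return any(normalized.get(name) == "1" for name in DO_NOT_TRACK_HEADERS)


def handle_request(headers: Mapping[str, str]) -> str:
    """Hypothetical handler that turns off all tracking, not just targeting."""
    if requests_no_tracking(headers):
        return "serve page without recording the visit"
    return "serve page and record the visit"


if __name__ == "__main__":
    print(handle_request({"User-Agent": "ExampleBrowser", "DNT": "1"}))
    print(handle_request({"User-Agent": "ExampleBrowser"}))
```

The hard part is not the check itself but getting tracking companies to take the first branch.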
Using an HTTP header for this purpose is a far superior technical solution:
users (or their browsers) don't have to keep track of lists of advertisers
and their cookies, and the header clearly indicates to the web sites that
the user has requested that tracking be disabled. No new cookies need to be
installed or preserved, and violators will be fairly easily spotted. While
the EFF has made it clear that it is backing the Do Not Track header
approach, there are still several groups that will need to be convinced:
advertising networks, tracking companies, and browser makers (some of which,
like Google and Apple, run their own ad networks).
There are already Firefox extensions that implement the X-Do-Not-Track
header (and the related X-Behavioral-Ad-Opt-Out header), like Universal
Behavioral Advertising Opt-out and NoScript, but, for now at least, they are
just "feel good" extensions. It remains to be seen if the NAI and other
advertisers/trackers start to handle these headers. One might guess they
would be resistant, and they probably will be, but there's no real reason to
believe that users would opt out in droves. There are also reasonable
arguments that Do Not Track will have a minimal impact on online advertising.
Of course, even if there were, miraculously, full adoption by advertisers
or, rather less miraculously, regulations from the US Federal Trade
Commission (FTC) and other, similar, agencies
that require advertisers to adopt it, there will still be some amount of
tracking. Whether those violators are outside of the FTC's jurisdiction
or just flying below the radar, clickstream information has value and there
will always be those trying to extract that value. Unfortunately, there doesn't seem to be any
possible technical—or regulatory—solution to that particular problem.
Brief items
Advocates for data retention typically focus narrowly on the benefits
afforded to law enforcement without accounting for the massive costs and
extreme security risks that come with storing significant quantities of
data about every Internet user — databanks that will prove to be
irresistible not only to government investigators but also civil litigants
(read: ex-spouses, insurance companies, disgruntled neighbors) and
malicious hackers of every stripe. A legal obligation to log users'
Internet use, paired with weak federal privacy laws that allow the
government to easily obtain those records, would dangerously expand the
government's ability to surveil its citizens, damage privacy, and chill
freedom of expression.
--
Electronic Frontier Foundation in its Deeplinks blog
We first jumped on the OpenID bandwagon back in 2007 when it was seen as a
promising way to make logging into websites simpler. What we've learned
over the past three years is that it didn't actually make anything any
simpler for the vast majority of our customers. Instead it just made things
harder. Especially when people were having problems with the often flaky
OpenID providers and couldn't log into their account. OpenID has been a
burden on support since the day it was launched.
--
37signals drops OpenID support
The Electronic Frontier Foundation has sent out a release on mobile device
security, noting that open devices can be made more secure even if the
original vendor is not interested. "By contrast,
mobile systems lag far behind the established industry standard for open
disclosure about problems and regular patch distribution. For example,
Google has never made an announcement to its android-security-announce
mailing list, although of course they have released many patches to resolve
many security problems, just like any OS vendor. But Android open source
releases are made only occasionally and contain security fixes unmarked, in
among many other fixes and enhancements."
New vulnerabilities
awstats: arbitrary code injection
| Package(s): | awstats | CVE #(s): | CVE-2010-4369 |
| Created: | January 24, 2011 | Updated: | February 21, 2011 |
| Description: | From the Ubuntu advisory: It was discovered that AWStats did not correctly filter the LoadPlugin configuration option. A local attacker on a shared system could use this to inject arbitrary code into AWStats. |
| Alerts: | |
dpkg: symlink attack
| Package(s): | dpkg | CVE #(s): | CVE-2011-0402 |
| Created: | January 24, 2011 | Updated: | January 26, 2011 |
| Description: | From the CVE entry: dpkg-source in dpkg before 1.14.31 and 1.15.x allows user-assisted remote attackers to modify arbitrary files via a symlink attack on unspecified files in the .pc directory. |
| Alerts: | |
fuse: denial of service
| Package(s): | fuse | CVE #(s): | CVE-2010-3879 |
| Created: | January 20, 2011 | Updated: | April 29, 2013 |
| Description: | From the Ubuntu advisory: It was discovered that FUSE could be tricked into incorrectly updating the mtab file when mounting filesystems. A local attacker, with access to use FUSE, could unmount arbitrary locations, leading to a denial of service. |
| Alerts: | |
libuser: default user password
| Package(s): | libuser | CVE #(s): | CVE-2011-0002 |
| Created: | January 20, 2011 | Updated: | April 21, 2011 |
| Description: | From the Red Hat advisory: It was discovered that libuser did not set the password entry correctly when creating LDAP (Lightweight Directory Access Protocol) users. If an administrator did not assign a password to an LDAP based user account, either at account creation with luseradd, or with lpasswd after account creation, an attacker could use this flaw to log into that account with a default password string that should have been rejected. (CVE-2011-0002) |
| Alerts: | |
openoffice.org: multiple vulnerabilities
| Package(s): | openoffice.org | CVE #(s): | CVE-2010-3450 CVE-2010-3451 CVE-2010-3452 CVE-2010-3453 CVE-2010-3454 CVE-2010-3689 CVE-2010-4253 CVE-2010-4643 |
| Created: | January 26, 2011 | Updated: | May 9, 2011 |
| Description: |
From the Debian advisory:
During an internal security audit within Red Hat, a directory traversal vulnerability has been discovered in the way OpenOffice.org 3.1.1 through 3.2.1 processes XML filter files. If a local user is tricked into opening a specially-crafted OOo XML filters package file, this problem could allow remote attackers to create or overwrite arbitrary files belonging to local user or, potentially, execute arbitrary code. (CVE-2010-3450)
During his work as a consultant at Virtual Security Research (VSR), Dan Rosenberg discovered a vulnerability in OpenOffice.org's RTF parsing functionality. Opening a maliciously crafted RTF document can cause an out-of-bounds memory read into previously allocated heap memory, which may lead to the execution of arbitrary code. (CVE-2010-3451)
Dan Rosenberg discovered a vulnerability in the RTF file parser which can be leveraged by attackers to achieve arbitrary code execution by convincing a victim to open a maliciously crafted RTF file. (CVE-2010-3452)
As part of his work with Virtual Security Research, Dan Rosenberg discovered a vulnerability in the WW8ListManager::WW8ListManager() function of OpenOffice.org that allows a maliciously crafted file to cause the execution of arbitrary code. (CVE-2010-3453)
As part of his work with Virtual Security Research, Dan Rosenberg discovered a vulnerability in the WW8DopTypography::ReadFromMem() function in OpenOffice.org that may be exploited by a maliciously crafted file which allows an attacker to control program flow and potentially execute arbitrary code. (CVE-2010-3454)
Dmitri Gribenko discovered that the soffice script does not treat an empty LD_LIBRARY_PATH variable like an unset one, which may lead to the execution of arbitrary code. (CVE-2010-3689)
A heap based buffer overflow has been discovered with unknown impact. (CVE-2010-4253)
A vulnerability has been discovered in the way OpenOffice.org handles TGA
graphics which can be tricked by a specially crafted TGA file that could
cause the program to crash due to a heap-based buffer overflow with unknown
impact. (CVE-2010-4643) |
| Alerts: |
|
perl-Convert-UUlib: buffer overflow
| Package(s): | perl-Convert-UUlib | CVE #(s): | |
| Created: | January 20, 2011 | Updated: | January 26, 2011 |
| Description: | From the Fedora advisory: Fix a one-byte-past-end-write buffer overflow in UURepairData (reported, analysed and testcase provided by Marco Walther) |
| Alerts: | |
request-tracker: unsalted password hashing
| Package(s): | request-tracker3.6 | CVE #(s): | CVE-2011-0009 |
| Created: | January 24, 2011 | Updated: | May 25, 2012 |
| Description: | From the Debian advisory: It was discovered that Request Tracker, an issue tracking system, stored passwords in its database by using an insufficiently strong hashing method. If an attacker would have access to the password database, he could decode the passwords stored in it. |
| Alerts: | |
tomcat: cross-site scripting
| Package(s): | tomcat6 | CVE #(s): | CVE-2010-4172 |
| Created: | January 24, 2011 | Updated: | May 19, 2011 |
| Description: | From the Ubuntu advisory: It was discovered that Tomcat did not properly escape certain parameters in the Manager application which could result in browsers becoming vulnerable to cross-site scripting attacks when processing the output. With cross-site scripting vulnerabilities, if a user were tricked into viewing server output during a crafted server request, a remote attacker could exploit this to modify the contents, or steal confidential data (such as passwords), within the same domain. |
| Alerts: | |
wordpress: cross-site scripting
| Package(s): | wordpress | CVE #(s): | CVE-2010-4536 |
| Created: | January 20, 2011 | Updated: | January 26, 2011 |
| Description: | From the Red Hat bugzilla entry: A Cross-site scripting (XSS) flaw was found in KSES, which is the wordpress HTML sanitation library. |
| Alerts: | |
wordpress-mu: multiple cross-site scripting vulnerabilities
| Package(s): | wordpress-mu | CVE #(s): | CVE-2010-4536 |
| Created: | January 24, 2011 | Updated: | January 26, 2011 |
| Description: | From the CVE entry: Multiple cross-site scripting (XSS) vulnerabilities in KSES, as used in WordPress before 3.0.4, allow remote attackers to inject arbitrary web script or HTML via vectors related to (1) the & (ampersand) character, (2) the case of an attribute name, (3) a padded entity, and (4) an entity that is not in normalized form. |
| Alerts: | |
Page editor: Jake Edge
Kernel development
Brief items
The current development kernel is 2.6.38-rc2, released on January 21.
"Anyway. -rc2 is out there, and the only reason it's reasonably sized
is that it was a short -rc2 - I think a few of the pull requests I got were
a bit larger than I would have been happy with. And I might as well warn
people that because the laptop I'm bringing with me is pitifully slow, I'm
also planning on going into 'anal' mode, and not even bother pulling from
trees unless they are clearly -rc material. IOW, don't try to push large
pushes on me. I won't take them, and they can wait for 39." See the
announcement for the short changelog, or the full changelog for all the
details.
Stable updates: there have been no stable or longterm updates
released in the last week.
We care about everything. If the objective was to make life easier
for ourselves, we'd all be on the golf course.
--
Andrew Morton
No way in hell do I want the situation of "the system is screwed,
so let's overwrite the disk" to be something the kernel I release
might do. It's crazy.
--
Linus Torvalds
Nice to see it gone - it seemed such a good idea in Linux 1.3
--
Alan Cox won't miss the BKL
By Jonathan Corbet
January 26, 2011
The removal of the big kernel lock (BKL) has been on the kernel
community's "to do" list almost since that lock was first added to make the
kernel work on multiprocessor systems. Over time, the significance of the
lock has diminished as finer-grained locking was added to various kernel
subsystems, but the BKL itself has endured. Getting rid of it for good
remained desirable because the BKL can still cause unwanted latencies at
times. There's also a certain amount of pride involved in completing the
job. That completion has been long in coming, though; once the worst
performance issues associated with the BKL were resolved, interest in doing
the low-level work needed to finish the job declined.
Two years or so ago, though, developers started working on BKL removal
again. Some of this work was motivated by the realtime tree, where
patience with latency sources is rather more limited. Still, it seemed
like completion remained a distant goal; hundreds of BKL call sites
remained in the kernel.
Then Arnd Bergmann took on the task of eliminating the BKL entirely. His
cleanup work has been going on for some time; if he has his way, this patch set (or something derived from it)
will remove the BKL entirely in 2.6.39. To get there, about a dozen
modules need to be addressed. Some of them (i830, autofs3, and smbfs) are
simply to be removed. Others (appletalk and hpfs) are to be moved to the
staging tree for near-term removal, though there is some resistance to that
idea. The remaining modules are to be fixed in some way. Once that's taken
care of, the final patch in the series
removes the lock itself. It will not be missed.
Kernel development news
By Jonathan Corbet
January 26, 2011
Power management is often seen as a concern mostly for embedded and mobile
systems. We worry about power management because we want our phones
to run for longer between recharges and our laptops to not inflict burns on
our thighs. But power management is equally important for data centers,
which are currently responsible for about 3% of the total power consumption
in the US. Keeping the net going in the US requires about 15GW of power -
the dedicated output of about 15 nuclear power plants. Clearly there would
be some real value to saving some of that power. Matthew Garrett's
talk during the Southern Plumbers Miniconf at linux.conf.au 2011 covered
the work that is being done in that area and where Linux stands relative to
other operating systems.
Much of the power consumed by data centers is not directly controllable by
Linux - it is overhead which is consumed outside of the computers
themselves. About one watt of power is consumed by overhead for each watt
consumed by computation. This overhead includes network infrastructure
and power supply loss, but the biggest component is air conditioning. So
the obvious thing to do here is to create more efficient cooling and power
infrastructure. Running at higher ambient temperatures, while
uncomfortable for humans, can also help. The best contemporary data
centers have been able to reduce their overhead to about 20% - a big
improvement. Cogeneration techniques - using heat from data centers to
warm buildings, for example - can reduce that overhead even further.
But we still have trouble. A 48-core system, Matthew says, will draw about
350W when it is idle; a rack full of such systems will still pull a lot of
power. What can be done? Most power management attention has been focused
on the CPU, which is where a lot of the power goes. As a result, an idle
Intel CPU now draws approximately zero watts of power - it is "terrifying"
how well it works. When the CPU is working, though, the situation is a bit
different; the power consumption is about 20W per core, or about 960W for a
busy 48-core system.
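The numbers from the talk can be combined into a rough estimate of what one such machine costs the facility as a whole. This is a back-of-the-envelope sketch: it treats the 960W figure as the draw of the whole busy system and models overhead as a simple multiplier, both of which are simplifications made here.

```python
CORES = 48
IDLE_SYSTEM_WATTS = 350     # idle 48-core system, as quoted in the talk
BUSY_WATTS_PER_CORE = 20    # busy core, as quoted in the talk


def busy_system_watts(cores: int = CORES) -> int:
    """Power drawn when every core is busy, using the per-core figure."""
    return cores * BUSY_WATTS_PER_CORE


def facility_watts(it_watts: float, overhead_ratio: float) -> float:
    """Total facility power for a given IT load.

    overhead_ratio is the overhead per watt of computation: 1.0 models the
    one-watt-per-watt case, 0.2 models the best contemporary data centers.
    """
    return it_watts * (1 + overhead_ratio)


if __name__ == "__main__":
    busy = busy_system_watts()
    for ratio in (1.0, 0.2):
        print(f"overhead {ratio:.0%}: busy {facility_watts(busy, ratio):.0f}W, "
              f"idle {facility_watts(IDLE_SYSTEM_WATTS, ratio):.0f}W")
```

Even in the best case, an idle machine still costs the facility several hundred watts, which is why consolidation and powering machines down remain attractive.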
The clear implication is that we should keep the CPUs idle whenever
possible. That can be tricky, though; it is hard to write software which
does nothing. Or - as Matthew corrected himself - it's hard to write
useful software which does nothing.
There are some trends which can be pointed to in this area. CPU power
management is essentially solved; Linux is quite good at it. In fact,
Linux is better than any other operating system with regard to CPU power;
we have more time in deep idle states and fewer wakeups than others. So
interest is shifting toward memory power management. If all of the CPUs in
a package within the system can be idled, the associated memory controller
will go idle as well. It's also possible to put memory into "self-refresh"
mode if it is idle, reducing power use while preserving the contents. In
other situations, running memory at a lower clock rate can reduce power
usage. There will be a lot of work in this area because, at this point,
memory looks like the biggest, lowest-hanging fruit.
Even more power can be saved by simply turning a system off; that is where
virtualization comes into play. If applications are run on virtualized
servers, those servers can be consolidated onto a small number of machines
during times of low load, allowing the other machines to be powered down.
There is a great deal of customer interest in this capability, but there is
still work to be done; in particular, we need fast guest migration, which
is a hard problem to solve.
The other hard problem is the fact that optimal power behavior may make
tradeoffs which enterprise customers may be unwilling to make. Performance
matters for these people, and, if that means expending more energy, they
are willing to pay that cost.
As an example, consider the gettimeofday() system call which,
while having been ruthlessly optimized, can still be slower than some
people would like. Simply reading the processor's time stamp counter (TSC)
can be faster. The problem is that the TSC can become unreliable in the
presence of power management. Once upon a time, changing the CPU frequency
would change the rate of the TSC, but that problem has been solved by the
CPU vendors for a few years now. So TSC problems are no longer an excuse
to avoid lowering the clock frequency.
Unfortunately, that is not too useful, because it rarely makes sense to run
a CPU at a lower frequency; best results usually come from running at full
speed and spending more time in a sleep state ("C state"). But C
states can stop the TSC altogether, once again creating problems for
performance-sensitive
users. In response, manufacturers have caused the TSC to run even when the
CPU is sleeping. So, while virtualization remains a hassle, systems
running on bare metal can expect the TSC to work properly in all power
management states.
But that still doesn't satisfy some performance-sensitive users because
deep C states create latency. It can take a millisecond to wake a CPU out
of a deep sleep - that is a very long time in some applications. We have
the pm_qos mechanism which can let the
kernel know whether deep sleeps are acceptable at any given time, allowing
power management to happen when latency is not an immediate concern. Not a
perfect solution, but that may be as good as it gets for now.
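As a rough illustration of how a latency-sensitive application might tell the kernel that deep sleeps are not acceptable, the sketch below uses the /dev/cpu_dma_latency device node, the user-space side of PM QoS on Linux. The talk did not name this interface, so its use here is an assumption, and the helper names latency_request() and latency_release() are hypothetical. The path is passed in as a parameter so the helper can be tried against any writable file.

```c
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Request a maximum acceptable wakeup latency, in microseconds, by
 * writing a 32-bit value to a PM QoS device node (normally
 * /dev/cpu_dma_latency, which usually requires root).
 * The request stays in force only while the descriptor is open.
 * Returns the open descriptor, or -1 on failure.
 */
int latency_request(const char *path, int32_t max_usec)
{
	int fd = open(path, O_WRONLY);

	if (fd < 0)
		return -1;
	if (write(fd, &max_usec, sizeof(max_usec)) != (ssize_t)sizeof(max_usec)) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Drop the request, allowing deep C states again. */
void latency_release(int fd)
{
	if (fd >= 0)
		close(fd);
}
```

An application would hold the descriptor open during its latency-critical phase and release it afterward, so that power management can resume when latency is not an immediate concern.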
Another interesting feature of contemporary CPUs is the "turbo" mode, which
can allow a CPU to run in an overclocked mode for a period of time. Using
this mode can get work done faster, allowing longer sleeps and better power
behavior, but it depends on good power management if it is to work at all.
If a core is to run in turbo mode, all other cores on the same die must be
in a sleep state. The end result is that turbo mode can give good results
for single-threaded workloads.
Some effort is going into powering down unused hardware components - I/O
controllers, for example - even though the gains to be had in this area are
relatively small. Many systems have quite a few USB ports, many of which
are entirely unused. Versions 1 and 2 of the USB specification
make powering down those ports hard; even worse, those ports will repeatedly
wake the CPU even if nothing is plugged in. USB 3 is better in this
regard.
Unfortunately, even in this case, it's hard to power down the ports because
it is a feature which is poorly specified, poorly documented, and poorly
implemented. The reliability of the hardware varies; Windows tends not to
use the PCI power management event infrastructure, so it often simply does
not work. This problem has been solved by polling the hardware once every
second; that is "the least bad thing" they could come up with. The result
is better power behavior, but also up to one second of latency before the
system responds to the plugging-in of a new USB device. Since, as Matthew
noted, that one second is probably less than the user already lost while
trying to insert the plug upside-down, it shouldn't be a problem.
Similar things can be done with other types of hardware - FireWire ports,
audio devices, SD ports, etc. It's just a matter of figuring out how to
make it work. There is also some interest in reducing the power
consumption of graphics processors (GPUs), even though enterprise systems
tend not
to have fancy GPUs. The level of support varies from one GPU to the next,
but work is being done to improve power consumption for most of them.
Work for the future includes
better CPU frequency governor development; we need to do better at ramping
up the processor's frequency when there is work to be done. The scheduler
needs tweaks to do a better job of consolidating jobs onto one package,
allowing others to be powered down. And there is the continued
exploitation of other power management features in hardware; there are a
lot of them that we are not using. On the other hand, others are not using
those features either, so they probably do not work.
In summary: Linux is doing pretty well with regard to enterprise-level
power management; the GPU is the only place where we perform worse than
Windows does. But we can always do better, so work will continue in that
direction.
January 26, 2011
This article was contributed by Paul McKenney
Introduction
Symmetric multiprocessing (SMP) code often requires expensive
instructions, including atomic
operations and memory barriers, and often causes expensive cache
misses.
Yet some SMP code can be extremely cheap and fast, using no expensive
instructions at all.
Examples of cheap SMP code
include per-CPU counters and RCU read-side critical sections.
So why can't all SMP code be cheap?
Is it just that we aren't smart enough to spot clever ways of implementing
other algorithms, for example, concurrent stacks and queues?
Is it that we might be able to implement concurrent stacks and queues
without expensive instructions, but only at the cost of mind-bending
complexity?
Or is it simply impossible to implement concurrent stacks and queues
without using expensive instructions?
My traditional approach has been to place my faith
in two observations: (1) if you beat your head against a wall long
enough, one of two things is bound to happen, and (2) I
have a hard head.
Although this approach has worked well, something less painful would
be quite welcome.
And so it was with great interest that I read a paper
entitled "Laws of
Order: Expensive Synchronization in
Concurrent Algorithms Cannot be Eliminated" by Attiya et al., with
the “et al.” including Maged Michael, whom I have had
the privilege of working with for quite some time.
It is important to note that the title overstates the paper's
case somewhat.
Yes, the paper does present some laws requiring that many concurrent
algorithms use expensive instructions.
However, all laws have their loopholes,
including the Laws of Order.
So while we do need to understand the Laws of Order, we most especially
need to understand how to fully exploit their loopholes.
To arrive at the Laws of Order, this paper first expands
the definition of commutativity to include
sequential composition, which in the C language can best be thought
of as the “;” operator.
In this case, commutativity depends not just on the operator, but on
the operands, which for our purposes can be thought of as calls
to arbitrary C functions.
For example, the statements:
atomic_inc(&x);
atomic_inc(&y);
are commutative: the values of x and y are
the same regardless of the order of execution.
In contrast:
atomic_set(&x, 1);
atomic_set(&x, 2);
are non-commutative:
the value of x will be either 1 or 2, depending on which
executes first.
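The two cases can be checked mechanically with a small user-space sketch. It uses C11 atomics as a stand-in for the kernel's atomic_inc() and atomic_set(), and the helper names run_incs() and run_sets() are hypothetical; each helper executes the two statements in one of the two possible orders and reports the final state.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct state {
	int x;
	int y;
};

/* Increment x and y once each; x_first selects the order. */
struct state run_incs(bool x_first)
{
	atomic_int x = 0, y = 0;

	if (x_first) {
		atomic_fetch_add(&x, 1);
		atomic_fetch_add(&y, 1);
	} else {
		atomic_fetch_add(&y, 1);
		atomic_fetch_add(&x, 1);
	}
	return (struct state){ atomic_load(&x), atomic_load(&y) };
}

/* Store 1 and 2 into x; one_first selects the order. */
int run_sets(bool one_first)
{
	atomic_int x = 0;

	if (one_first) {
		atomic_store(&x, 1);
		atomic_store(&x, 2);
	} else {
		atomic_store(&x, 2);
		atomic_store(&x, 1);
	}
	return atomic_load(&x);
}
```

Both orders of run_incs() end with x and y equal to one, while run_sets() ends with whichever value was stored last.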
These examples execute sequentially, but the paper considers
concurrent execution.
To see this, consider a concurrent set that has these operations:
- a set-member-addition
function (call it
set_add())
that returns an indication of whether the element to be
added was already in the set,
- a set-member-test function (call it
set_member()), and
- a set-member-removal function (call it
set_remove())
that returns an indication of whether anything had actually been
removed.
Then concurrently testing two distinct members is commutative: the
order in which set_member(&s, 1) and
set_member(&s, 2)
execute will not affect the return values from either, and the final
value of set s will be the same in either case.
Therefore, it is not necessary for the two invocations to
coordinate with each other.
The fact that coordination is not required means that there is some hope
that expensive instructions are not needed to implement
set_member().
In contrast, concurrent invocation of set_add(&s, 1)
and set_member(&s, 1)
would not be commutative if the set s initially
did not contain the value 1.
The set_member() invocation would return true only if it
executed after the set_add().
Some coordination between the two functions is clearly required.
The most important results of the paper rely on a strong form
of non-commutativity, which is strangely enough termed
“strong non-commutativity”, which we can abbreviate
to “SNC”.
The example in the previous paragraph is not SNC because, while
set_add() can affect set_member(),
the reverse is not the case.
In contrast, an SNC pair of functions would each affect the other's
result.
For example, consider set_add(&s, 1) and
set_remove(&s, 1), where the set s is initially
empty.
If set_add(&s, 1) executes first, then both functions
will indicate success, and set s will be empty.
On the other hand, if set_remove(&s, 1) executes first,
then only the set_add(&s, 1) will indicate success
and the set s will contain 1.
In this case, the return value of set_remove() is affected
by the order of execution.
On the other hand, if the set s initially contains 1,
it will be set_add(&s, 1) whose return value is affected.
Therefore, the order of execution can affect the return value of both
functions, and these functions are therefore SNC.
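The four cases above can be worked through with a purely sequential model of the set. This is only a sketch: the fixed capacity, the struct layout, and the run_pair() helper are assumptions made for illustration, and set_add() is taken to return whether the element was already present, as described earlier.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define SET_MAX 64	/* hypothetical capacity: members are 0..63 */

struct set {
	bool present[SET_MAX];
};

/* Returns true if v was already in the set. */
bool set_add(struct set *s, int v)
{
	assert(v >= 0 && v < SET_MAX);
	bool was = s->present[v];

	s->present[v] = true;
	return was;
}

/* Returns true if v was actually removed. */
bool set_remove(struct set *s, int v)
{
	assert(v >= 0 && v < SET_MAX);
	bool was = s->present[v];

	s->present[v] = false;
	return was;
}

bool set_member(const struct set *s, int v)
{
	assert(v >= 0 && v < SET_MAX);
	return s->present[v];
}

struct outcome {
	bool add_found;		/* set_add() saw the element already there */
	bool remove_did;	/* set_remove() removed something */
	bool final_member;	/* element is in the set afterward */
};

/* Run set_add(&s, 1) and set_remove(&s, 1) in the chosen order. */
struct outcome run_pair(bool initially_present, bool add_first)
{
	struct set s;
	struct outcome o;

	memset(&s, 0, sizeof(s));
	if (initially_present)
		set_add(&s, 1);
	if (add_first) {
		o.add_found = set_add(&s, 1);
		o.remove_did = set_remove(&s, 1);
	} else {
		o.remove_did = set_remove(&s, 1);
		o.add_found = set_add(&s, 1);
	}
	o.final_member = set_member(&s, 1);
	return o;
}
```

Starting from an empty set, remove_did differs between the two orders; starting from a set containing 1, add_found differs. Each function's return value can therefore be changed by the other, which is the SNC property.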
Quick Quiz 1:
Is atomic_add_return() SNC? In other words, are multiple
concurrent calls to this function SNC?
Answer
The key result of the paper is that under certain conditions,
the implementation of
a pair of SNC functions must contain a heavyweight instruction,
where a “heavyweight instruction” can either be an
atomic read-modify-write instruction or a heavyweight memory barrier.
In the Linux kernel, only smp_mb() qualifies as
a heavyweight memory barrier.
The “certain conditions” are:
- Both functions in the pair must be
deterministic, in other words,
the final state (including return values) must be a strict
function of the initial state and order of execution.
- The functions must be
linearizable.
Interestingly enough, although the paper requires that the implementation
of an SNC, deterministic, and linearizable pair of functions each
contain at least one heavyweight instruction, it does
not require that
this instruction be executed on each invocation.
Quick Quiz 2:
Imagine an increment function that is not permitted to lose counts
even when multiple invocations execute concurrently, and that does
not return the value of the counter.
Must the implementation of such a function contain an
atomic read-modify-write instruction or a heavyweight memory barrier?
Answer
So if we want our code to run fast, we have four ways to avoid
heavyweight instructions:
- Formulate the API to be non-SNC.
- Design the implementation so that any required heavyweight
instructions almost never need to actually be executed.
- Accept non-determinism.
- Accept non-linearizability. The paper ignores
this possibility, possibly due to the common academic
view that non-linearizable algorithms are by
definition faulty.
Interestingly enough,
relativistic programming
has long suggested use of several of these approaches to attain
good performance and scalability.
The “Laws of Order” therefore provides a good theoretical
basis for understanding why relativistic programming is both
desirable and necessary.
Let's take a look at some examples, starting with a memory allocator.
Given that concurrent calls to kmalloc() are not supposed
to return a pointer to the same block of memory, we have to conclude that
kmalloc() is SNC and thus might need heavyweight instructions
in its implementation.
Quick Quiz 3:
How can we avoid the use of heavyweight instructions in the implementation
of kmalloc()?
If it turns out to be impossible to completely avoid their use, how can
we reduce the frequency of their execution?
Answer
The second example is of course RCU.
Let's focus on the rcu_read_lock(),
rcu_read_unlock(),
synchronize_rcu(),
rcu_dereference() and
rcu_assign_pointer() API members.
The rcu_read_lock() function is unaffected by any of the
other members, so any pair that includes rcu_read_lock()
is non-SNC, which is why this function need not include any heavyweight
instructions.
The same is true of rcu_read_unlock().
Interestingly enough, synchronize_rcu() is affected
by both rcu_read_lock() and rcu_read_unlock(),
in that the former can prevent synchronize_rcu() from
returning and the latter can enable it to return.
However, neither rcu_read_lock() nor
rcu_read_unlock() is affected by synchronize_rcu().
This means that synchronize_rcu() is non-SNC and might therefore
have an implementation that does not use heavyweight instructions.
However, such an implementation seems quite implausible if you include
the actions of the updater both before and after the call to
synchronize_rcu() in conjunction with the RCU
read-side critical section.
The paper, though, considers only data flowing via
those functions' arguments and return values.
It would be interesting to see a generalization of this work that
includes side effects.
My guess is that for a given code fragment to be non-SNC, any conceivable
API would need to be non-SNC.
If my guess is correct, then the full RCU update is non-SNC with respect
to any RCU read-side critical section containing rcu_dereference().
The reasoning is that the return value from rcu_dereference() can
be affected by the RCU update, and the duration for which
synchronize_rcu() blocks can be affected by
rcu_read_lock() and rcu_read_unlock().
Quick Quiz 4:
Are there any conditions in which rcu_read_unlock() will
be SNC with respect to synchronize_rcu()?
Answer
Finally, let us look at the set implementation that includes
set_add(),
set_member(), and
set_remove().
We saw that set_add() and set_remove()
were SNC.
Quick Quiz 5:
Is there any way to implement set_add() and
set_remove() without using heavyweight instructions?
Answer
Of course, this paper does have a few shortcomings, many of
which fall under the rubric of “future work”:
- The paper describes the theoretical limitations at great length,
but does not describe many ways of avoiding them.
However, I am quite confident that the Linux kernel community will be
more than able to produce good
software engineering solutions that work around
these limitations.
In fact, there is a lot to be said for letting the theoreticians
worry about limitations and letting us hackers worry about
solving problems in spite of those limitations.
- The paper focuses almost exclusively on reordering carried out
by the CPU.
It turns out that reordering due to compiler optimizations can be
at least as “interesting” as CPU reordering.
These sorts of compiler optimizations are allowed by the
current C-language standard, which permits the
compiler to
assume that there is only one thread in the address space.
Within the Linux kernel, the
barrier() directive
restricts the compiler's ability to move code, and this
directive (or its open-coded equivalent) is used in locking
primitives, atomic operations, and memory barriers.
- There is some uncertainty about exactly what properties of code
must be SNC for this paper's results to hold.
The paper focuses almost exclusively on function arguments and return
values, but
my guess is that the list of properties is quite general.
For example, an unconditional lock-acquisition primitive certainly
seems like it should be covered by this paper's result,
but such primitives do not return a value.
Can the fact that the second of two concurrent acquisitions
simply fails to return be considered to be evidence of the
SNC nature of lock acquisition?
If not, exactly why not?
If so, exactly what is the set of effects that must be
taken into account when judging whether or not this code fragment
is SNC?
This seems to be a future-work topic.
- A bit of thought about the results of this paper
gives clear reasons why it is
often so hard to
parallelize existing sequential code.
Sequential code inflicts no penalties for the use of SNC
APIs, so SNC APIs can be expected to appear in sequential
code even when a non-SNC API might have served just as well.
After all, what programmer could resist the
temptation to make
set_add() return an indication
of whether the element was already in the set?
The paper would have done well to state this point clearly.
- The paper fails to call out non-linearizability as a valid
loophole to its laws of order.
- An interesting open question: What are the consequences of
using one of the loopholes of the laws of order?
In my limited personal experience, leveraging non-linearizability
and privatization permits full generality (for example,
parallel memory allocators), while leveraging non-SNC and
non-determinism results in specialized algorithms
(for example, RCU).
It would be quite interesting to better understand any
theoretical and software-engineering limitations imposed
by these loopholes.
- The paper overreaches a bit when it states that:
For synthesis and verification of concurrent algorithms,
our result is potentially useful in the sense that a
synthesizer or a verifier need not generate or attempt
to verify algorithms that do not use RAW
[smp_mb()]
and AWAR [atomic read-modify-write operations] for
they are certainly incorrect.
As we have seen, it is perfectly legal for a concurrent algorithm
to avoid use of these operations as long as that algorithm is either:
(1) non-SNC, (2) non-deterministic, or (3) non-linearizable.
There are a few other places where the limitations on the
main result are not stated as carefully as they should be.
Given that the rest of the paper seems quite accurate and
on-point, I would guess that this sentence is simply an
honest error that slipped through the peer-review process.
We all make mistakes.
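For readers unfamiliar with the barrier() directive mentioned in the compiler-reordering item above, here is a minimal user-space sketch. The macro body is the empty-asm-with-memory-clobber form used for GCC-compatible compilers; the publish() function and its two variables are hypothetical. Note that this constrains only the compiler: it emits no instruction, so it is not a heavyweight memory barrier and does nothing about reordering by the CPU.

```c
/* Compiler-only barrier: an empty asm statement with a "memory" clobber. */
#define barrier() __asm__ __volatile__("" : : : "memory")

int data;
int ready;

/* Store the payload, then set the flag, keeping the two stores in
 * program order as far as the compiler is concerned. */
void publish(int value)
{
	data = value;
	barrier();	/* compiler may not move memory accesses across this point */
	ready = 1;
}
```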
Although I hope that these shortcomings will be addressed, I
hasten to add that they are insignificant compared to the huge
step forward that this paper represents.
In summary, the “Laws of Order” paper shines some
much-needed
light on the question of whether heavyweight instructions are needed to
implement a given concurrent algorithm.
Although I am not going to say that this paper fully captures my
parallel-programming intuition, I am quite happy that it does
land within a timezone or two, which represents a great improvement
over previous academic papers.
But the really good news is that the limitations called out in this
paper have some interesting
loopholes that can be exploited in many cases.
If the Linux kernel community pays careful attention to both the
limitations and the loopholes called out in this paper,
I am confident that the community's
already-impressive parallel-programming capabilities will
become even more formidable.
I owe thanks to Maged Michael, Josh Triplett, and Jon Walpole for illuminating
discussions and for their review of this paper, and to Jim Wasko
for his support of this effort.
This work represents the view of the author and does not necessarily
represent the view of IBM.
Linux is a registered trademark of Linus Torvalds.
Other company, product, and service names may be trademarks or
service marks of others.
Quick Quiz 1:
Is atomic_add_return() SNC? In other words, are multiple
concurrent calls to this function SNC?
Answer: Yes.
Suppose that an atomic_t variable named a
is initially zero
and that a pair of concurrent atomic_add_return(1, &a)
functions execute.
The first one to execute will return one, and the second one will
return two, because atomic_add_return() returns the new value.
Each instance's return value is therefore affected by the order of
execution, which indicates strong non-commutativity.
This may seem strange, given that addition is commutative.
And in fact the final value of a will be two
regardless of order of execution.
To see the reasoning behind the definition of SNC,
consider atomic_inc(&a),
which also adds one to a but does not return a
value.
In this case, because there are no return values, the invocations
of atomic_inc(&a) cannot possibly affect each other's
return values.
Therefore, atomic_inc(&a) is non-SNC.
It is interesting to note that the designers of the Linux
kernel's suite of atomic operations had an intuitive understanding
of the results of this paper.
The atomic operations that return a value (and thus are more likely
to be SNC) are the ones that are required to provide
full memory ordering.
Back to Quick Quiz 1.
Quick Quiz 2:
Imagine an increment function that is not permitted to lose counts
even when multiple invocations execute concurrently, and that does
not return the value of the counter.
Must the implementation of such a function contain an
atomic read-modify-write instruction or a heavyweight memory barrier?
Answer:
No.
Although this function is deterministic and linearizable, it is non-SNC.
And in fact such a function could be implemented via a “split
counter” that uses per-CPU non-atomic variables.
Because each CPU increments only its own variable, counts are never
lost.
To get the aggregate value of the counter, simply sum up the individual
per-CPU variables.
Of course, it might be necessary to disable preemption and/or
interrupts across the increments, but such disabling requires
neither atomic read-modify-write instructions nor heavyweight memory
barriers.
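A split counter of this kind might look like the following sketch. It models CPUs as array indices rather than using real per-CPU variables; the names, the CPU count, and the 64-byte padding (a guess at the cache-line size) are all assumptions. Real kernel code would also need the preemption or interrupt disabling mentioned above.

```c
#include <assert.h>

#define NR_CPUS 4	/* hypothetical CPU count for the sketch */

/* One slot per CPU, padded so that slots do not share a cache line. */
struct split_counter {
	struct {
		unsigned long count;
		char pad[64 - sizeof(unsigned long)];
	} slot[NR_CPUS];
};

/* Called only by the CPU that owns the slot: a plain, non-atomic increment. */
void split_counter_inc(struct split_counter *c, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	c->slot[cpu].count++;
}

/* Aggregate value: sum the individual per-CPU slots. */
unsigned long split_counter_read(const struct split_counter *c)
{
	unsigned long sum = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		sum += c->slot[cpu].count;
	return sum;
}
```

Because no slot ever has more than one writer, the increment path needs neither an atomic read-modify-write instruction nor a memory barrier.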
However, the linearizability of this function depends
on the counter always being incremented by the value 1.
To see this, imagine a counter with an initial value of zero
to which three CPUs are concurrently adding the
values 3, 5, and 7, and that meanwhile three other CPUs are
reading out the counter's value.
Because there are no ordering guarantees, these three other CPUs
might see the additions in any order.
One of these CPUs might add the per-CPU variables and obtain a sum
of 3, another might obtain a sum of 5, and the third might obtain
a sum of 7.
These three results are not consistent with any possible ordering
of the additions, so this counter is not linearizable.
However, for a great many uses, this lack of linearizability
is not a problem.
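The claim that sums of 3, 5, and 7 cannot all come from one ordering can be checked by brute force. The sketch below is a simplified test, not a full linearizability checker: it asks only whether every observed sum appears among the intermediate values of some single ordering of the three additions, and it ignores when each reader ran. The function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* True if value is reachable when the additions are applied in the
 * order a, b, c starting from zero. */
static bool reachable(int a, int b, int c, int value)
{
	return value == 0 || value == a || value == a + b ||
	       value == a + b + c;
}

/* True if some single ordering of the three additions accounts for
 * every observed sum. */
bool sums_consistent(const int add[3], const int *seen, size_t nseen)
{
	static const int order[6][3] = {
		{0, 1, 2}, {0, 2, 1}, {1, 0, 2},
		{1, 2, 0}, {2, 0, 1}, {2, 1, 0},
	};

	for (size_t p = 0; p < 6; p++) {
		bool ok = true;

		for (size_t i = 0; i < nseen && ok; i++)
			ok = reachable(add[order[p][0]], add[order[p][1]],
				       add[order[p][2]], seen[i]);
		if (ok)
			return true;
	}
	return false;
}
```

With additions of 3, 5, and 7, each ordering passes through exactly one of those three values before jumping to at least 8, so no ordering can explain readers that saw all of 3, 5, and 7.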
Back to Quick Quiz 2.
Quick Quiz 3:
How can we avoid the use of heavyweight instructions in the implementation
of kmalloc()?
If it turns out to be impossible to completely avoid their use, how can
we reduce the frequency of their execution?
Answer:
The usual approach is to observe that a given pair of
kmalloc() invocations will be SNC only if it is possible
for them to be satisfied by the same block of memory.
The usual way to greatly reduce the probability of a pair of
kmalloc() invocations fighting over the same block of
memory is to maintain per-CPU pools of memory blocks, which is what
the Linux kernel's implementation of kmalloc() actually
does.
Heavyweight instructions are executed only if a given CPU's pool either
becomes exhausted or overflows.
This approach is related to the paper's suggestion of using
“single-owner” algorithms.
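The per-CPU pool idea can be sketched as follows. This is not the kernel's allocator: the names, sizes, and half-full refill policy are assumptions, CPUs are modeled as array indices, and the code is single-threaded. The refill() and drain() functions mark the rare slow paths where real SMP code would need a lock or atomic operation to touch the shared pool.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS    2	/* hypothetical sizes for the sketch */
#define CACHE_MAX  4
#define NR_BLOCKS  16
#define BLOCK_SIZE 64

static unsigned char arena[NR_BLOCKS][BLOCK_SIZE];
static void *shared[NR_BLOCKS];		/* shared pool of free blocks */
static int shared_n;
static struct {
	void *blk[CACHE_MAX];
	int n;
} cache[NR_CPUS];			/* per-CPU pools */
unsigned long slow_paths;		/* how often the shared pool was touched */

void pools_init(void)
{
	shared_n = 0;
	for (int i = 0; i < NR_BLOCKS; i++)
		shared[shared_n++] = arena[i];
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		cache[cpu].n = 0;
	slow_paths = 0;
}

/* Slow path: this CPU's pool is exhausted, so refill it to half full. */
static void refill(int cpu)
{
	slow_paths++;
	while (cache[cpu].n < CACHE_MAX / 2 && shared_n > 0)
		cache[cpu].blk[cache[cpu].n++] = shared[--shared_n];
}

/* Slow path: this CPU's pool would overflow, so drain it to half full. */
static void drain(int cpu)
{
	slow_paths++;
	while (cache[cpu].n > CACHE_MAX / 2)
		shared[shared_n++] = cache[cpu].blk[--cache[cpu].n];
}

/* Fast path touches only the calling CPU's pool. */
void *sketch_alloc(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	if (cache[cpu].n == 0)
		refill(cpu);
	return cache[cpu].n > 0 ? cache[cpu].blk[--cache[cpu].n] : NULL;
}

void sketch_free(int cpu, void *p)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	if (cache[cpu].n == CACHE_MAX)
		drain(cpu);
	cache[cpu].blk[cache[cpu].n++] = p;
}
```

Two CPUs can never be handed the same block from their own pools, so the common-case pair of allocations does not need to coordinate at all.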
It might be possible to avoid heavyweight instructions by
introducing non-determinism, for example, by making kmalloc()
randomly fail.
This can certainly be accomplished by making kmalloc()
unconditionally return NULL if the CPU's pool was
exhausted, but such an implementation might not prove to be fully
satisfactory to its users.
Coming up with a reasonable implementation that uses non-determinism
to avoid heavyweight instructions is left as an exercise for the
adventurous reader.
Similarly, eliminating heavyweight instructions by introducing
non-linearizability is left as an exercise for the adventurous reader.
Back to Quick Quiz 3.
Quick Quiz 4:
Are there any conditions in which rcu_read_unlock() will
be SNC with respect to synchronize_rcu()?
Answer:
In some implementations of RCU, synchronize_rcu() can
interact directly with rcu_read_unlock() when the
grace period has extended too long, either via
force_quiescent_state()
machinations or via
RCU priority boosting.
In these implementations, rcu_read_unlock() will
be SNC with respect to synchronize_rcu(). The
Linux kernel's Preemptible Tree RCU
is an example of such an implementation, as can be seen by examining
the rcu_read_unlock_special() function in
kernel/rcutree_plugin.h.
This code executes rarely, thus using the second loophole called out
above (“Design the implementation so that any required heavyweight
instructions almost never need to actually be executed”).
Back to Quick Quiz 4.
Quick Quiz 5:
Is there any way to implement set_add() and
set_remove() without using heavyweight instructions?
Answer:
This can be done easily for sets containing small integers if
there is no linearizability requirement.
The set is represented as a dense array of bytes so that each potential
member of the set maps to a specific byte.
The set_add() function would set the corresponding byte to
one, the set_remove() function would clear the corresponding
byte to zero, and the set_member() function would
test the corresponding byte for non-zero.
This implementation is non-linearizable because different CPUs
might well disagree on the order that members were added to and removed
from the set.
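A minimal sketch of the byte-array set follows. The range of 256 members and the byte_set names are assumptions. The add and remove functions return nothing here, which is a simplification made for this sketch: reporting the previous state would mean reading and writing the byte in one step. The volatile qualifier is a user-space stand-in for the kernel's single-access idioms, not a substitute for memory barriers.

```c
#include <assert.h>
#include <stdbool.h>

#define SET_RANGE 256	/* hypothetical: members are 0..255 */

/* One byte per potential member. */
struct byte_set {
	volatile unsigned char present[SET_RANGE];
};

/* Plain store: no atomic instruction, no memory barrier. */
void byte_set_add(struct byte_set *s, unsigned int v)
{
	assert(v < SET_RANGE);
	s->present[v] = 1;
}

void byte_set_remove(struct byte_set *s, unsigned int v)
{
	assert(v < SET_RANGE);
	s->present[v] = 0;
}

bool byte_set_member(const struct byte_set *s, unsigned int v)
{
	assert(v < SET_RANGE);
	return s->present[v] != 0;
}
```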
Back to Quick Quiz 5.
January 22, 2011
This article was contributed by Goldwyn Rodrigues
At the end of 2010,
the LIO
project was chosen to replace STGT as the in-kernel SCSI target
implementation. There were two main contenders (LIO and SCST) which tried
to get their code into the Linux kernel tree. This article will compare
the two projects and try to describe what these implementations have to offer.
What are SCSI targets?
The SCSI subsystem uses a sort of client-server model. Typically a computer
is the client or "initiator," requesting that blocks be written to or read
from a "target," which is usually a data storage device. The SCSI target
subsystem enables a computer node to behave as a SCSI storage device, responding
to storage requests by other SCSI initiator nodes. This opens up the possibility
of creating custom SCSI devices and putting intelligence behind the storage.
An example of an intelligent SCSI target is Data Domain's
online backup appliance, which supports de-duplication (thus saving
space). The appliance, functioning as a SCSI target, is a computer node which
intelligently writes only those blocks which are not already stored, and
increases the reference counts of the blocks which are already present, thus
writing only the blocks which have changed since the last backup. On the
other side of the SCSI link,
the initiator sees the appliance as a normal, shared SCSI storage device and
uses its regular backup application to write to the target.
The most common implementation of the SCSI target subsystem is an iSCSI
server, which uses a standard TCP/IP encapsulation of SCSI to export a SCSI
device over the network. Most SCSI target projects started with the idea
of supporting iSCSI
targets before supporting other protocols. Since only a network interface is needed to act as both an iSCSI
initiator and an iSCSI target, supporting iSCSI doesn't require any
special hardware beyond a network port, which almost every computer has
these days. However, most SCSI
targets can be supported with existing initiator cards, so if you have a
Fibre, SAS, or Parallel SCSI card, it should be possible to use one of
the SCSI target projects to make your computer into a SCSI target for
the particular SCSI bus supported by the card.
Current Status
The Linux kernel SCSI subsystem currently uses STGT to implement the
SCSI target functionality; STGT was introduced into the Linux kernel at the
end of 2006 by
Fujita Tomonori. It has a library in the kernel which assists the in-kernel
target drivers. All target processing happens in user space, which may
lead to performance bottlenecks.
Two out-of-tree kernel SCSI target solutions were contenders to replace
STGT: LIO and SCST. SCST has been pushing to
be included in the Linux kernel since at least 2008. It was decided
then that
the STGT project could serve the kernel for a little longer. As time passed, the
design limitations of STGT were encountered and a replacement sought. The
main criteria for a replacement SCSI target subsystem
defined by James Bottomley, the SCSI maintainer, were:
- That it would be a drop in replacement for STGT (our current in-kernel
target mode driver), since there is room for only one SCSI target
infrastructure.
- That it used a modern sysfs-based control and configuration plane.
- That the code was reviewed as clean enough for inclusion.
The first condition proved to be too restrictive; it was not possible to
avoid breaking the ABI entirely. So the current goal, instead, is to find
a way to gracefully transition STGT users to the new interface.
Hints of LIO replacing the STGT project came in
the 2010 Linux Storage and Filesystem
Summit. Christoph Hellwig volunteered to review and clean
up the code; he managed to reduce the code-base by around 10,000 lines to
make it
ready to merge into the kernel.
Comparison
Both projects have drawn comparison charts of their feature lists which are
available on their respective web sites: LIO and SCST. However, before
exploring the differences, lets compare the similarities. Both projects
implement an in-kernel SCSI target core. They provide local SCSI targets
similar to loop devices, which comes in handy for using targets in virtualized
environments. Both projects support iSCSI, which was one of the initial and main
motivations for both projects.
Back-storage handlers are available on both projects in kernel space as
well as for user space. Back-storage handlers allow target administrators
to control how devices are exported to the initiators. For example, a
pass-through handler allows exporting the SCSI hardware as it is, instead
of masking the details of that hardware, while a virtual-disk handler
allows exporting of files as virtual disks to the initiator.
Both projects support Persistent Reservations (PR), a feature for I/O
fencing and failover/retakeover of storage devices in high-availability
clusters. Using the PR commands, an initiator can establish, preempt,
query, or reset a reservation policy with a specified target. During a
failover takeover, the new virtual resource can reset the reservation
policy of the old virtual resource, making device takeover easier and
faster.
SCST
The main users of the SCSI target subsystem are storage companies providing
storage solutions to the industry. Most of these storage solutions are
plug-and-play appliances which can be attached to the storage network and
used with little or no configuration. SCST boasts of a wider user base, which
probably comes from the fact that they have a wider range of transport support.
SCST supports both Qlogic and Emulex fibre channel cards whereas LIO
supports only Qlogic target drivers for now, and it is still in its beta stages of
development. SCST supports the SCSI RDMA Protocol
(SRP), and claims to be ahead in terms of
development with respect to Fibre
Channel over Ethernet (FCoE), LSI's
Parallel/Wide SCSI Fibre Channel, and Serial Attached SCSI
(SAS). It already has
support for IBM's
pSeries Virtual SCSI. Companies such as Scalable Informatics, Storewize, and Open-e
have developed PnP appliance products which rely on these target
transports based on SCST.
SCST supports notifications of session changes using asynchronous event
notification (AEN). AEN is a protocol feature that may be used
by SCSI targets to notify a SCSI initiator of events that occur in the target,
when the target is not serving a request. This enables
initiators to be notified of changes at the target end, such as devices added,
removed, resized, or media changes. This way the initiators can see any target
changes in a plug-and-play manner.
The SCST developers claim that their design conforms to more SCSI standards
in terms of robustness
and safety. The SCSI protocol requires that if an initiator clears a
reservation
held by another initiator, the reservation holder must be notified about the
reservation clearance or else several initiators could change reservation data,
ultimately corrupting it. SCST is capable of implementing safe RESERVE/RELEASE
operations on devices to avoid such corruption.
According to the SCSI protocol, the initiator and target can communicate with
each other to decide on the transfer size. An incorrect transfer size
communicated by the initiator can lead to target device lockups or a crash.
SCST safeguards against miscommunication of transfer sizes or transfer
directions to avoid such a situation. The code claims to have a good memory
management policy to avoid out-of-memory (OOM) situations. It can also limit the
number of initiators that can connect to the target to avoid resource usage by
too many connections. It also offers per-portal visibility control, which
means that it can be configured in such a way that a target is visible to a
particular subset of initiators only.
LIO
The LIO project began with the iSCSI design as its core objective, and
created a generic SCSI target subsystem to support iSCSI. Simplicity has
been a major
design goal and hence LIO is easier to understand. Beyond that, the LIO
developers have shown more willingness to work with the kernel developers
as James pointed out
to SCST maintainer Vladislav Bolkhovitin:
Look, let me try to make it simple: It's not about the community you
bring to the table, it's about the community you have to join when you
become part of the linux kernel. The interactions in the wider
community are critical to the success of an open source project. You've
had the opportunity to interact with a couple of them: sysfs we've
covered elsewhere, but in the STGT case you basically said, here's our
interface, use it. LIO actually asked what they wanted and constructed
something to fit. Why are you amazed then when the STGT people seem to
prefer LIO?
The LIO project also boasts of features which are either not present in SCST
or are in early development phases. For example, LIO supports asymmetric
logical unit
assignment (ALUA). ALUA allows a target administrator to manage the access
states and path attributes of the targets. This allows the multipath routing
method to select the best possible path to optimize usage of available
bandwidth, depending on the current access states of the targets. In other
words, the path taken by the initiator in a multipath environment can be
manipulated by the target administrator by changing the access states.
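As a rough illustration of how access states can steer path selection, the following is a minimal sketch rather than LIO or multipath code. The state names follow the ALUA access states; the ranking, the Path class, and the portal names are assumptions made for the example.

```python
from dataclasses import dataclass

# Lower rank means more preferred. The ordering is an illustrative assumption.
STATE_RANK = {
    "active/optimized": 0,
    "active/non-optimized": 1,
    "standby": 2,
    "unavailable": 3,
}

# Only active paths are treated as able to carry I/O in this sketch.
USABLE_STATES = ("active/optimized", "active/non-optimized")


@dataclass
class Path:
    name: str   # hypothetical path label, e.g. "portal-a"
    state: str  # one of the STATE_RANK keys


def select_path(paths):
    """Return the most preferred usable path, or None if none is usable."""
    candidates = [p for p in paths if p.state in USABLE_STATES]
    if not candidates:
        return None
    return min(candidates, key=lambda p: STATE_RANK[p.state])
```

In this model, an administrator who moves a path from active/optimized to standby causes the next call to select_path() to pick a different path, which is the kind of steering the paragraph above describes.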
LIO supports a Management
Information Base (MIB), which makes management of SCSI
devices simpler. The SCSI target devices export the management
information values described in the SCSI MIB (RFC 4455), which are
picked up by an SNMP agent. This feature extends to iSCSI devices and is
beneficial in managing a storage network with multiple SCSI devices.
An error in the iSCSI connection can happen at three different levels: the
session, digest, or connection level. Error recovery can be initiated
at each of these levels, which makes sure that the recovery is made at the
current level, and the error does not pass through to the next one. Error
recovery starts with detecting a broken connection. In response, the iSCSI
initiator driver
establishes another TCP connection to the target, then it informs the
target that the SCSI command path is being changed to
the new TCP connection. The target can then continue processing
SCSI commands on the new TCP connection. The upper level SCSI driver remains
unaware that a new TCP connection has been established and that control has
been transferred to the new connection. The iSCSI Session remains active during
the period and does not have to be reinstated. LIO supports a maximum Error
Recovery Level (ERL) of 2, which means that it can recover errors at the
session, digest, or connection levels. SCST supports an ERL of 0, which
means
it can recover from session-level errors only and that all connection
oriented errors are communicated to the SCSI driver.
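The relationship between the recovery level and the kinds of errors that can be recovered can be written down as a small table. This is only a sketch of the description above: the article covers levels 0 and 2, and the code assumes that the intermediate level 1 adds digest recovery, following the order in which the levels are listed. The function names are made up for the example.

```python
# Recovery classes in the order they are added as the ERL increases.
# Level 1 (digest) is an assumption; the text only describes levels 0 and 2.
RECOVERY_CLASSES = ("session", "digest", "connection")


def recoverable_levels(erl):
    """Return the error levels that a given ERL can recover from."""
    if erl not in (0, 1, 2):
        raise ValueError("ERL must be 0, 1, or 2")
    return RECOVERY_CLASSES[: erl + 1]


def can_recover(erl, error_level):
    """True if an error at error_level is recoverable at the given ERL."""
    return error_level in recoverable_levels(erl)
```

With these definitions, can_recover(2, "connection") holds while can_recover(0, "connection") does not, matching the difference between LIO and SCST described above.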
LIO also supports "multiple connections per session" (MC/S). MC/S allows the
initiator to open multiple connections between the initiator and target, either
on the same or a different physical link. Hence, in case of a failure of
one path,
the established session can use another path without terminating the session.
MC/S can also be used for load balancing across all established connections.
Architectural session command ordering is preserved across those communication
paths.
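The idea behind MC/S can be modeled in a few lines. The sketch below is a toy, not iSCSI code: the Session class, its method names, and the round-robin policy are assumptions. It numbers commands per session rather than per connection, so ordering survives both load balancing and the loss of a connection.

```python
from itertools import count


class Session:
    """Toy model of one session spread over several connections."""

    def __init__(self, connections):
        self.connections = list(connections)
        self._cmd_sn = count(1)  # session-wide command sequence number
        self._next = 0           # round-robin cursor

    def fail(self, conn):
        """Drop a failed connection; the session itself stays up."""
        self.connections.remove(conn)

    def send(self, command):
        """Pick a connection round-robin and tag the command with its number."""
        if not self.connections:
            raise RuntimeError("no connection left; session must be reinstated")
        conn = self.connections[self._next % len(self.connections)]
        self._next += 1
        return (next(self._cmd_sn), conn, command)
```

Because the sequence number belongs to the session, a receiver in this model could restore the original command order no matter which connection delivered each command.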
The LIO project also claims that its code is used in a number of appliance products and
deployments, though the user base does not seem to be as varied as that of
SCST.
No comparison can be complete without a performance comparison. SCST developers
have released their performance numbers from time to time. However, all their
numbers were compared against STGT. The SCST comparison
page speaks of SCST performing better than LIO, but the results were drawn from
a source-code study rather than real-world tests. SCST blames LIO for not
releasing performance numbers, and there is no performance data (to my
knowledge) which would compare apples to apples.
The decision has finally been made, though, with quite a bit of opposition.
Now comes the task of getting all the niche features which LIO lacks to
be ported from SCST to LIO. While the decision was contentious, it is yet
another example of the difficulty of getting something merged without
being able to cooperate with the kernel development community.
Comments (8 posted)
Page editor: Jonathan Corbet
Distributions
Fedora is getting closer to defining its long term goals. Board member Máirín Duffy posted a summary of the first draft of goals on January 11, with an invitation to comment on the goals on the blog or on the advisory board mailing list.
One might think that the Fedora Project has sufficiently defined what it is and what it's doing. The project has defined objectives, a mission statement, vision statement, core values, and has identified its target audience.
Fedora's vision statement gives a broad idea of what the project is about,
and the core values (a.k.a. "four foundations") help briefly state
what informs the vision. In this case, "freedom, friends, features, first,"
which (with the vision) helps provide a fairly decent "elevator pitch" to
describe Fedora and for teams within the project to consider when making
decisions.
default offering.
Say what you will about Fedora, but you can't fault the project for
being overly vague. But being precise is the point. All of this is part of
Fedora's strategic planning effort, something that is lacking from many open
source projects. During a discussion of Fedora's mission statement in
October of 2009, Mike McGrath expressed the problem that many were
seeing with Fedora at the time:
Right now Fedora is a place for everyone to just come and do whatever they want which is harming us in the long term. There's plenty of room for everyone in the Linux universe. I understand that by narrowing our focus we might lose some contributors who disagree with our values and mission. But that's better than not having one and having volunteers work against each other because they joined The Fedora Project thinking it was one thing only to find it's something else.
Having a clear mission statement and values also enables the project to
move forward without being distracted with activities that aren't part of
its scope — like worrying
about using Fedora for infrastructure services when long term supported
releases are not an objective. Rather than being drawn into an (overly)
long debate about what Fedora "should be" it's possible to point to the
project's objectives — which do not in any way encourage a long term
support release. It also enables Fedora to prioritize its resources. As former Fedora Project Leader Paul Frields wrote during the target audience discussion, "having an audience in mind, we as a community can prioritize resources, and at the same time make it possible for people who want to concentrate on other audiences to build community around those efforts."
Fedora has been wrestling with these issues for some time, and there has
been some unease expressed by some members of the Fedora community that the
board is espousing
its view of what Fedora should be rather than what the community wishes
Fedora to be. Greg DeKoenigsberg addressed
this by saying "the Fedora leadership should stake out positions
that they believe to be correct, and should work to mobilize resources that
move us in those directions... [while guaranteeing] the freedom for
dissenting community members to move in their own directions."
A long list of goals
With all of the other strategic items in place, it is now up to the
board to define goals for the next few releases; they now have a working
list to consider. The initial
list includes 15 goals that have been culled by the board from
proposals out of the larger Fedora community. Much of the discussion
took place back in November on the advisory-board list.
Initially the call for goals was for the "next 3-4 releases," but that seems to have been cut down to the next two releases over the intervening months. The goals are to move Fedora closer to the vision statement for Fedora, which is:
The Fedora Project creates a world where free culture is welcoming and widespread, collaboration is commonplace, and people control their content and devices.
The final list includes improving and simplifying collaboration in the Fedora community, improving communication within the project, recruiting uncommon skillsets into Fedora, and improving the developer experience within Fedora.
Some of the goals seem to describe things Fedora already does well. Goal
14, "Evaluate
late-breaking technologies for inclusion/interaction with Fedora," for
instance, seems to be well underway already. Others, like the goals around
communication, could be combined into a single goal. It does seem that,
like many FOSS projects, Fedora finds communication within and without the
project to be a continual source of difficulty.
A set of 15 goals, of course, is far too many to be practical, so the
board is trying to reduce the list to five goals for the next two
releases. The board has settled on its five:
- Goal #1: Improve and simplify collaboration in the Fedora Community.
- Goal #2: Improve and encourage high-quality communication in the Fedora Community.
- Goal #4: It is extraordinarily easy to join the Fedora community and quickly find a project to work on.
- Goal #11: Expand global presence of Fedora among users & contributors.
- Goal #12: Improve education & skill sharing in community.
So far, there's been very little discussion on the advisory-board mailing list, but there has been discussion among some of the subgroups in Fedora. For example, the Fedora Ambassadors Steering Committee (FAMSCo) brainstormed ahead of meeting with the board to offer their suggestions on which goals should be chosen. The board and FAMSCo seem to have mind-melded, as they share the exact same list of five goals.
Fedora's Engineering Steering Committee (FESCo) has also met and discussed the goals and agreed on three goals. The first two (Improve and simplify collaboration in the Fedora Community, Improve and encourage high-quality communication in the Fedora Community) mirror the board and FAMSCo. The third, unsurprisingly for the engineering committee, is to improve the developer experience in Fedora.
All of the goals that have been put forward so far seem perfectly reasonable. Goal #1, for example, would put emphasis on improving Fedora's governance structure and carries a suggestion that the Fedora board meet in person at least once per year. Goal #2 overlaps with #1, and both carry a suggestion about creating a calendaring solution for Fedora. (Also, perhaps, highlighting the absence of a decent FOSS calendaring solution.)
More feedback will trickle in before the final goals are set, but it
looks likely that improving communication and collaboration will be the
primary goals for the next two releases. It's important to note that the
goals are only suggestions, meant, as the wiki states,
"to help people who want to work on several things to prioritise their
time."
Coming up with a mission and goals for a large distribution is not easy. This is particularly true of a project with a corporate sponsor shifting from a closed development model to an open one, and a mixture of paid and unpaid contributors. Consider the efforts of the openSUSE Project to define its strategy. The effort has been in process now since 2009, and is still being worked on, with no target date for completion.
The goals will be a hot topic at the upcoming FUDCon in Tempe,
Arizona from January 29 through 31. There will be a session on the
goals led by Duffy and the board members present, and a governance
hackfest where the goals will likely be discussed as well. Fedora,
thankfully, is at the tail end of the process. The question now becomes how
well the various subgroups in Fedora will adhere to the goals — and
whether they'll actually lead to success.
Comments (none posted)
Brief items
The timing of it is very important, as most major distros would like to
adopt some of the features that just became popular in the various new app
markets and stores, such as screenshots, user comments and ratings. It
looks like a lot of new code is about to be written, or a lot of existing
code is about to gain quite a bit of popularity.
-- Enrico Zini
The fact is that it's still much easier to work on things in a corner. The
funny thing is that people are generally not opposed to work together, far
from it. When discussing low-level bits related to packaging systems, many
would expect dpkg/apt developers and rpm/zypp developers to have some
heated discussion just because we always hear confrontational stories here
and there. The truth is that those stories are generally from users, and
developers are generally happy to accept differences.
-- Vincent Untz
Comments (none posted)
The Debian Project has invited representatives of Debian-derived
distributions to participate in a census of Debian derivatives. "By
participating in the census you will increase the visibility of your
derivative within Debian, provide Debian contributors with a contact point
and a set of information that will make it easier for them to interact with
your distribution. Representatives of distributions derived from Ubuntu are
encouraged to get their distribution added to the Ubuntu Derivative Team
wiki page."
Full Story (comments: none)
The Debian Installer team has announced the second release candidate of the
installer for Debian Squeeze. "We need your help to find bugs and
further improve the installer, so please try it."
Full Story (comments: none)
The Extra Packages for Enterprise Linux (EPEL) project has announced the
release of EPEL 6. "EPEL 6 is a collection of add-on packages
available for Red Hat Enterprise Linux (RHEL) 6 and other compatible
systems, maintained by the community under the umbrella of the Fedora
Project. EPEL 6 is designed to supplement RHEL 6 by providing additional
functionality and does not replace any RHEL 6 packages. As a community
project, EPEL is maintained and supported by volunteers via Bugzilla and
mailing lists. EPEL is not commercially supported by Red Hat, Inc."
Full Story (comments: none)
The Fedora IBM System z (s390x) Secondary Arch team has announced the
official release of Fedora 14 for IBM System z 64bit.
Full Story (comments: none)
Foresight Linux 2.5.0 ALPHA 1 GNOME Edition has been released. "Well
known for being a desktop operating system featuring an intuitive user
interface and a showcase of the latest desktop software, this new release
brings you the latest GNOME 2.32 release, a newer Linux kernel 2.6.35.10,
Xorg-Server 1.8, Conary 2.2 and a ton of updated applications!"
Full Story (comments: none)
Distribution News
Debian GNU/Linux
The Debian Security Team had a productive meeting earlier this month.
Topics discussed include improvements to the team workflow, hardening
compiler flags, longer security support for Debian stable, "beta testing"
of security updates, README.test, backports security support, and issues in
specific packages. There is also a call for volunteers.
Full Story (comments: none)
The DebConf team is looking for new people to help with DebConf11.
"DebConf is a huge process, and there are many things we could use
help on. People come and go, and are usually overworked after a year or
two -- so we would love new people to get involved. If you have new
ideas, we'd love to hear about them and we can discuss if they'd work
and how to make them happen. And by the way, if you are looking for a
good way to get involved with Debian and don't know where to start,
this might be among the best options!"
Full Story (comments: none)
The Debian Project will be present at several upcoming events. "The
Debian Project invites all interested persons to said events, ask
questions, take a look at Debian 6.0 "Squeeze", exchange GPG-Fingerprints
to boost the Web of trust and get to know the members and the community
behind the Debian Project."
Full Story (comments: none)
Fedora
Fedora project leader Jared Smith has announced that a Fedora account was compromised. It would appear that the account credentials were somehow compromised externally and that the Fedora infrastructure is not vulnerable to some kind of exploit. "While the user in question had the ability to commit to Fedora SCM, the
Infrastructure Team does not believe that the compromised account was used to
do this, or cause any builds or updates in the Fedora build system. The
Infrastructure Team believes that Fedora users are in no way threatened by this
security breach and we have found no evidence that the compromise extended
beyond this single account."
Full Story (comments: none)
The minutes of
the January 24 meeting of the Fedora Board cover a Fedora strategic goals
discussion with FESCo.
Comments (none posted)
Ubuntu family
During the January 11th meeting the board
discussed a version template for -extras and reorganizing
drivers/owners/release managers permissions in Launchpad.
The January 25th meeting covers default
ntpd configuration and Seamonkey microrelease SRU exception.
Comments (none posted)
Newsletters and articles of interest
Comments (none posted)
On his blog, Vincent Untz reflects on the recently completed cross-distribution App Installer meeting, which by his and others' accounts was definitely a success. In the posting, he also spends some time talking about the need for more cross-distribution collaboration. "To be honest, since I started working on openSUSE, I've kept wondering why all distributions duplicate so much work. Sometimes, there is a good reason, like a radically different technical approach. But sometimes, it looks like we're going different ways just for the sake of doing something ourselves. We should fix this. Cross-distro collaboration is not the way we usually do things, and I believe we're wrong most of the time. Cross-distro collaboration is a cultural shift for us. But it's very well needed."
Comments (72 posted)
Mark Shuttleworth has plans for more Qt
applications in Ubuntu. "System settings and prefs, however,
have long been a cause of friction between Qt and Gtk. Integration with
system settings and preferences is critical to the sense of an application
"belonging" on the system. It affects the ability to manage that
application using the same tools one uses to manage all the other
applications, and the sorts of settings-and-preference experience that
users can have with the app. This has traditionally been a problem with Qt
/ KDE applications on Ubuntu, because Gtk apps all use a
centrally-manageable preferences store, and KDE apps do things
differently. To address this, Canonical is driving the development of
dconf bindings for Qt, so that it is possible to write a Qt app that uses
the same settings framework as everything else in Ubuntu. We've contracted
with Ryan Lortie, who obviously knows dconf very well, and he'll work with
some folks at Canonical who have been using Qt for custom development work
for customers. We're confident the result will be natural for Qt
developers, and a complete expression of dconf's semantics and
style."
Comments (none posted)
Susan Linton takes
a look at Saline OS 1.0, a new distribution based on Debian Squeeze. "Saline OS is delivered as an installable live CD and features Linux 2.6.36, Xorg X Server 1.7.7, and GCC 4.4.5. Chromium Web browser, IceDove mail client, Rhythmbox, Fotoxx photo manager, Parole video player, Osmo organizer, OpenOffice.org, Pidgin, and Xfburn media creator are part of the software stack. Synaptic setup with Debian Squeeze repositores is available to install other software if desired. An icon on the upper panel launches automatic updates, which are pulled in from Debian Squeeze. The lower panel with lots of application launchers hides until hover."
Comments (none posted)
Og Maciel writes about why he
likes Foresight Linux. "Reason 2 - Roll backs: Because the
entire system is kept under a complete version control down to the file
level, It is possible to perform something that other distributions can
only dream of: system roll backs! Don't like the application you've just
installed? Remove it and it will be as if your system never had it
installed! Want to go back to the update you ran 3 weeks or even months
ago? Not a problem! Your system is like a giant Git/Mercurial repository
and you control what to clone and what branch to checkout."
Comments (17 posted)
Joe "Zonker" Brockmeier reviews
a beta of Mepis 11. "Mepis is not one of the best-known Linux distributions, but it does have a loyal following. Though it's never been my distro of choice, it was a favored distribution with some of my colleagues at Linux.com circa 2005 and 2006. In fact, it was favored by a lot of users then - coming in 5th in the DistroWatch listings in 2005, and 4th in 2006. What happened? The Ubuntu/Kubuntu juggernaut, that's what. But user base is not a clear indication of the quality of a distribution. Let's see what Mepis 11 has to offer."
Comments (none posted)
Matt Domsch has been working on
Consistent
Network Device Naming for Fedora 15 (and beyond). "Systems
running Linux have long had ethernet network devices named ethX. Your
desktop likely has one ethernet port, named eth0. This works fine if you
have only one network port, but what if, like on Dell PowerEdge servers,
you have four ethernet ports? They are named eth0, eth1, eth2, eth3,
corresponding to the labels on the back of the chassis, 1, 2, 3, 4,
respectively. Sometimes. Aside from the obvious confusion of names
starting at 0 verses starting at 1, other race conditions can happen such
that each port may not get the same name on every boot, and they may get
named in an arbitrary order. If you add in a network card to a PCI slot,
it gets even worse, as the ports on the motherboard and the ports on the
add-in card may have their names intermixed."
Comments (63 posted)
Page editor: Rebecca Sobol
Development
January 26, 2011
This article was contributed by Robert Fekete
Correlating log messages to get a deeper insight about the actual events
happening on a network or server is an important element of IT security.
Being able to do so is mandated by several security compliance standards, best
practices, and also common sense. However, many common log analyzing and
correlation engines cannot handle high message rates in real time,
requiring administrators to filter the input of the analyzing
engine. Proprietary solutions are often licensed based on the number of
processed messages, which limits their usefulness. The syslog-ng project aims to provide a flexible, real-time correlation solution that scales well even to extreme performance requirements.
Syslog-ng is
an advanced system logging tool, which can be a replacement for the
standard syslogd and rsyslog daemons. The syslog-ng pattern database,
introduced almost two years ago, allows for real-time message
identification and classification by comparing the incoming log messages to
a set of message patterns. The classification engine of syslog-ng is much
faster and more scalable than using regular expressions to identify messages,
and also permits the administrator to extract relevant information from the
message body or to add custom metadata (for example, tags) to log
messages. We looked at
message classification in syslog-ng just over a year ago.
The new message correlation feature extends the syslog-ng pattern database to make it possible to associate related log messages, and to treat the information from those messages as if they were a single event.
Message correlation is one of the foundations of log analysis and
reporting, because log messages tend to be fragmented, and often spread
important information about a single event across several log messages. For
example, the Postfix e-mail server logs the sender and recipient addresses
into separate log messages. For OpenSSH, if there is an unsuccessful login
attempt, the server sends a log message about the authentication failure
with the reason for the failure in the next message. What is interesting,
though, is the event and its exact details, not necessarily the individual
log messages; being able to collect information as events rather than
messages can therefore be a boon for every system administrator.
How correlation works in syslog-ng
Message correlation in syslog-ng operates on the log messages
successfully identified by syslog-ng's pattern database: you can extend the rules describing message patterns with instructions on how to correlate the matching messages.
Correlating log messages involves collecting the messages into message
groups called contexts. A context consists of a series of log messages that
are related to each other in some way, for example, the log messages of an
SSH session can belong to the same context. Messages may be added to a
context as they are processed. The context of a log message can be specified using simple static
strings or with macros and dynamic values. For example, you can group
messages received from the same host ($HOST), application ($HOST$PROGRAM), or process ($HOST$PROGRAM$PID).
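The grouping step can be sketched outside of syslog-ng. The following Python is an illustration of the idea only, not syslog-ng's implementation: messages are plain dictionaries, the helper names are made up, and Python's string.Template stands in for syslog-ng's macro expansion because it happens to use the same $NAME syntax.

```python
from collections import defaultdict
from string import Template


def context_id(template, msg):
    """Expand a template such as "$HOST$PROGRAM$PID" using message fields."""
    return Template(template).substitute(msg)


def group_by_context(messages, template):
    """Collect messages into contexts keyed by the expanded template."""
    contexts = defaultdict(list)
    for msg in messages:
        contexts[context_id(template, msg)].append(msg)
    return dict(contexts)
```

Using "$HOST" as the template puts everything from one host into a single context, while "$HOST$PROGRAM$PID" gives each process its own.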
Messages belonging to the same context are correlated, and can be processed in a number of ways. It is possible to include the information contained in an earlier message of the context in messages that are added later. For example, if a mail server application sends separate log messages about every recipient of an e-mail (like Postfix), you can merge the recipient addresses to the previous log message. Another option is to generate a completely new log message that contains all the important information that was stored previously in the context, for example, the login and logout (or timeout) times of an authenticated session (like SSH or telnet), and so on.
To ensure that a context handles only log messages of related events, a timeout value can be assigned to a context, which determines how long the context accepts related messages. If the timeout expires, the context is closed.
Triggering new messages and external actions
In syslog-ng Open Source Edition (OSE) 3.2, you can automatically generate new messages when a particular message is recognized, or the correlation timeout of a context expires. The generated messages can be configured within the pattern database rules, meaning that if needed, a new message can be generated for every incoming log message. Obviously this is not necessary, unless you take log normalization really seriously.
When used together with message correlation, you can also refer to
fields and values of earlier messages of the context. For example, the
patterns:
<pattern>
Accepted @QSTRING:SSH.AUTH_METHOD: @ for@QSTRING:SSH_USERNAME: \
@from @QSTRING:SSH_CLIENT_ADDRESS: @port @NUMBER:SSH_PORT_NUMBER:@ ssh2
</pattern>
<pattern>
pam_unix(sshd:session): session closed for user @ESTRING:SSH_USERNAME: @
</pattern>
could be used to match OpenSSH's log messages. Then the action:
<value name="MESSAGE">
An SSH session for $SSH_USERNAME from ${SSH_CLIENT_ADDRESS}@1 \
closed. Session lasted from ${DATE}@1 to $DATE.
</value>
would put out a correlated message that included information from both log
messages. The above is just a snippet; consult the full XML rules for all the gory details.
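To make the effect of such an action concrete, here is a minimal Python sketch of the same merge done by hand. It is not how syslog-ng evaluates the rule. It assumes that a context is a list of dictionaries holding the fields extracted by the patterns, with the login message first and the session-closed message last, and the function name is made up.

```python
def summarize_session(context):
    """Build one summary line from the first and last message of a context."""
    first, last = context[0], context[-1]
    return (
        "An SSH session for {user} from {addr} closed. "
        "Session lasted from {start} to {end}."
    ).format(
        user=last["SSH_USERNAME"],
        addr=first["SSH_CLIENT_ADDRESS"],  # only the login message has it
        start=first["DATE"],
        end=last["DATE"],
    )
```

The client address is available only in the login message, which is why the summary has to reach back into the context rather than rely on the closing message alone.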
Sending alerts directly from syslog-ng is currently not supported, but would be a welcome addition to future versions. However, it is reasonably simple to pass the selected messages to an external script that sends out alerts by e-mail or SNMP. And since completely new messages can be created from the information extracted from the correlated messages, all the script has to do is to send out the alerts, for example using sendmail or snmptrap.
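Such a script can be quite small. The sketch below is one possible shape for it, under several assumptions: each selected message arrives as one line of text (for instance on the script's standard input), a sendmail-compatible binary is on the PATH, and the recipient address and subject line are placeholders.

```python
import subprocess
from email.message import EmailMessage


def build_alert(line, recipient="admin@example.com"):
    """Wrap one log line in an e-mail message (recipient is a placeholder)."""
    msg = EmailMessage()
    msg["To"] = recipient
    msg["Subject"] = "syslog-ng alert"
    msg.set_content(line.strip())
    return msg


def send_alert(msg):
    """Hand the message to sendmail; -t takes recipients from the headers."""
    subprocess.run(["sendmail", "-t"], input=msg.as_bytes(), check=True)


def process(stream, send=send_alert):
    """Send one alert per non-empty line and return the number sent."""
    sent = 0
    for line in stream:
        if line.strip():
            send(build_alert(line))
            sent += 1
    return sent
```

Calling process(sys.stdin) would turn every incoming line into one e-mail; the send parameter exists so that the delivery step can be replaced, for example by a wrapper around snmptrap.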
To process already collected log messages, syslog-ng also allows for
correlating log messages from log files. For this reason, the time elapsed
between two log messages is calculated from the actual timestamps of the
log messages instead of using the system time.
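The interplay between context timeouts and message timestamps can be shown with a short sketch. It is a simplification, not syslog-ng's algorithm: timestamps are plain numbers of seconds, and the timeout is assumed to be measured from the most recent message in the context.

```python
def split_into_contexts(timestamps, timeout):
    """Group message timestamps; a gap larger than timeout starts a new context."""
    contexts = []
    for ts in sorted(timestamps):
        if contexts and ts - contexts[-1][-1] <= timeout:
            contexts[-1].append(ts)   # still within the open context
        else:
            contexts.append([ts])     # previous context has timed out
    return contexts
```

Because only the timestamps carried by the messages are compared, replaying an old log file yields the same grouping as processing the messages live.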
Beyond syslog-ng 3.2
Work on syslog-ng OSE 3.3 has already started, and focuses on improving the support for multicore and multithreaded operations to increase the performance of syslog-ng and make it even more suitable for high-message rate environments. Transforming the internal representation of log messages to other, non-syslog outputs like JSON or WELF is also on the roadmap.
As correlating log messages becomes increasingly important for companies
and organizations, it is welcome to see that open source tools are also
focusing on solving this problem. Although the syslog-ng project has had a
sometimes rocky relationship with the open source community in the past,
its OSE is under active development. In fact, the message
correlation feature, among others, is currently available only in the OSE.
[ The author is a technical writer for BalaBit, which developed syslog-ng. ]
Comments (4 posted)
Brief items
As more groups warm to the beauty that is embodied in Qt, I hope
that the message of working together (rather than dictating, for
life or otherwise) also spreads. That mode of operation is what got
Qt and KDE Platform, as high quality developer tools, to where they
are today. It is what motivates us to look at the development
platforms we build for application developers and ask ourselves,
"How can we make this as painless as possible for the developer
while giving them access to as many platforms as seamlessly as
possible?" It's a way of thinking that helps create a superior
result, and we're always looking for new ways to expand the
benefits it brings.
-- Aaron Seigo
...but this caught my eye.
($=[$=[]][(__=!$+$)[_=-~-~-~$]+({}+$)[_/_]+
($$=($_=!''+$)[_/_]+$_[+$])])()[__[_/_]+__
[_+~$]+$_[_]+$$](_/_)
Care to guess what that does?
-- "adamcecc"
Comments (2 posted)
KDE.News announces
the release of KDE 4.6, including KDE Plasma
Workspaces, updated KDE
applications, and the mobile
platform.
Comments (26 posted)
The Document Foundation has announced the release of LibreOffice 3.3, which is the first stable release of the OpenOffice.org fork. "LibreOffice 3.3 brings several unique new features. The 10 most-popular among community members are, in no particular order: the ability to import and work with SVG files; an easy way to format title pages and their numbering in Writer; a more-helpful Navigator Tool for Writer; improved ergonomics in Calc for sheet and cell management; and Microsoft Works and Lotus Word Pro document import filters. In addition, many great extensions are now bundled, providing PDF import, a slide-show presenter console, a much improved report builder, and more besides. A more-complete and detailed list of all the new features offered by LibreOffice 3.3 is viewable on the following web page: http://www.libreoffice.org/download/new-features-and-fixes/".
Comments (11 posted)
The H looks
at the OpenOffice.org 3.3 release. "OpenOffice.org 3.3.0 features an
updated, easier to use, Extension Manager user interface (UI) and several
improvements to Calc spreadsheets, such as an increase in the number of
rows supported from 65,536 to 1,048,576. The print system has been
restructured, the thesaurus dialogue has been redesigned for better
usability and slide layout handling has been improved in the presentation
application, Impress." More information can be found in the OOo New Features
page and the release
notes.
Comments (16 posted)
OpenSSH 5.7 has been released. Some new features in this release include
Elliptic Curve Cryptography modes for key exchange (ECDH) and host/user
keys (ECDSA), a protocol extension to support a hard link operation added
to sftp, new options for scp and ssh, and more. The announcement (click
below) contains additional information.
Full Story (comments: 15)
Sala is a command-line tool for the management of an encrypted password
database. Each password is stored in its own file, which makes
tab-completion for lookups possible. The 1.0 release is available now.
Full Story (comments: 1)
Newsletters and articles
Nathan Willis looks at Blender's new UI over at Linux.com. "And as with every new Blender release, there are indeed new tools in 2.56a. For example, the Solidify tool allows you to select a thin, planar object and automatically extrude thickness into it. There is a new paintbrush system, which lets you modify any brush's size, strength, texture, and low-level behavior curves. Sculpt Mode, in which you modify objects by whittling or squishing them around, was also rewritten, making it easier to do multi-resolution sculpting (for example, sculpting at a rough resolution to define a character's body, but working with much finer detail on its face)."
Comments (none posted)
On his blog, Jesse Barnes has a nice description of how computer displays work in terms of the memory organization and timings, along with some tips on debugging display problems (with photos and links to videos). "There are several variables that apply: bits per pixel, indexed or not, tiling format, and color format (in the Intel case, RGB or YUV), and stride or pitch. Bits per pixel is as simple as it sounds, it simply defines how large each pixel is in bits. Indexed planes, rather than encoding the color directly in the bits for the pixel, use the value as an index into a palette table which contains a value for the color to be displayed. The tiling mode indicates the surface organization of the plane. Tiled surfaces allow for much more efficient rendering, and allowing planes to use them directly can save copies from tiled rendering targets to an un-tiled display plane. Finally, the color format defines what values the pixels represent."
Comments (1 posted)
Page editor: Jonathan Corbet
Announcements
Brief items
Google Summer of Code 2011 was announced
at linux.conf.au. "This will be the 7th year for Google Summer of Code, an innovative program dedicated to introducing students from colleges and universities around the world to open source software development. The program offers student developers stipends to write code for various open source projects with the help of mentoring organizations from all around the globe. Over the past 6 years Google Summer of Code has had 4,500 students from over 85 countries complete the program. We are excited to announce that we will extend the scope of the program this year by targeting a 25% increase in accepted student applications as well as accepting a larger number of mentoring organizations. Our goal is to help these students pursue academic challenges over the summer break while they create and release open source code for the benefit of all."
Comments (1 posted)
The Open Source Lab at Oregon State University has announced
Supercell. "Supercell is a new on demand virtualization and continuous integration resource, made possible by a generous grant from Facebook's Open Source Team. We have created this cluster for use by open source projects who need to run software tests regularly but may not have access to the appropriate hardware or the funds to pay for outsourcing this service. Supercell will also allow projects to do manual testing to verify that a submitted patch has actually fixed the intended bug or to determine that their software package runs correctly on a particular operating system or distribution. The service will also allow projects to test their software in a large cluster using several VMs concurrently. Supercell will also provide temporary space for projects who would like to test drive new features in their code base or on their website."
Comments (none posted)
The Document Foundation has released "the more or less final draft" of its
trademark policy, pending a legal review.
Full Story (comments: none)
The Ubuntu Project has announced the opening of a new exhibition at
London's Design Museum dedicated to the Ubuntu Font, in collaboration with
international typeface designers Dalton Maag. "Entitled "Shape My Language," the exhibition will run from January 28 to February 28, 2011. The exhibition marks a significant milestone for the Ubuntu Project's advance in design and aims to enhance the consumer experience of using open computing platforms, such as Ubuntu."
Full Story (comments: none)
Articles of interest
A second
batch of FOSDEM speaker interviews is available. Martijn Dashorst
(Wicket), David Fetter (PL/Parrot), Andrew Godwin (Django), Soren Hansen
(OpenStack), Lennart Poettering (systemd), Spike Morelli (devops), and
Kenneth Rohde Christiansen (Qt WebKit) are interviewed in this round.
Comments (none posted)
The Fellowship of the Free Software Foundation Europe has an interview with Anne Østergaard. "Anne Østergaard is a veteran of the Free Software community, and attended the first Open Source Days, back in 1998. She holds a Law Degree from The University of Copenhagen, Denmark, and after a decade in government service, international organisations, and private enterprise, she has become a devoted Free Software advocate. Her interests lie in the long-term strategic issues of Free Software; in the social, legal, research, and economic areas of our global society. A former Vice Chairman at GNOME, she's heavily involved in political lobbying, and has been fighting for changes in software patents and copyright for a number of years."
Comments (none posted)
Ars Technica examines
some evidence in the Oracle vs. Google lawsuit. "Patent reform
activist Florian Mueller has published
what he believes to be new evidence of copyright infringement in Google's
Android software platform. He has found files in the Android code
repository that have Sun copyright headers identifying them as proprietary
and confidential. A close look at the actual files and accompanying
documentation, however, suggest that it's not a simple case of copy and
paste."
Comments (86 posted)
On its Deeplinks blog, the EFF has a strongly worded look at the actions taken by Sony against George Hotz for finding and publicizing security holes in its PlayStation 3 console. "Not content with the DMCA hammer, Sony is also bringing a slew of outrageous Computer Fraud and Abuse Act claims. The basic gist of Sony's argument is that the researchers accessed their own PlayStation 3 consoles in a way that violated the agreement that Sony imposes on users of its network (and supposedly enabled others to do the same). But the researchers don't seem to have used Sony's network in their research — they just used the consoles they bought with their own money. Simply put, Sony claims that it's illegal for users to access their own computers in a way that Sony doesn't like. Moreover, because the CFAA has criminal as well as civil penalties, Sony is actually saying that it's a crime for users to access their own computers in a way that Sony doesn't like."
Comments (33 posted)
On his Computerworld UK blog, Simon Phipps writes about the OSI and FSF teaming up to file a request [PDF] to the US Department of Justice (DOJ) to investigate the CPTN patent purchase (i.e. the 882, now 861, Novell patents). "Whatever the outcome of the matter, its importance has done a great service providing the OSI and the FSF with a first public opportunity to continue the positive relationship that has resulted in earlier private collaborations, such as when both organisations endorsed the formation of the Document Foundation. I strongly hope that both organisations will continue to explore ways to act collaboratively from their different perspectives of software freedom in the interests of the overlapping communities."
Comments (1 posted)
Ian Hickson announces the end of
version numbers for the HTML specification. "The WHATWG HTML
spec can now be considered a "living standard". It's more mature than any
version of the HTML specification to date, so it made no sense for us to
keep referring to it as merely a draft. We will no longer be following the
"snapshot" model of spec development, with the occasional "call for
comments", "call for implementations", and so forth." (Thanks to
Paul Wise)
Comments (45 posted)
New Books
The Apache Reference Manual is available as a printed book from Network
Theory Ltd. For each copy of this manual sold, $1 will be donated to the
Apache Software Foundation.
Full Story (comments: none)
Pragmatic Bookshelf has released "HTML5 and CSS3", by Brian Hogan.
Full Story (comments: none)
MAKE Magazine Volume 25 ("Arduino Revolution") features DIY projects using
Arduino microcontrollers.
Full Story (comments: none)
Calls for Presentations
EuroScipy 2011 will be held in Paris, France, August 25-28, 2011. The call
for papers is open until May 8, 2011.
Full Story (comments: none)
The Grace Hopper Celebration of
Women in Computing will take place November 8-12, 2011 in Portland,
Oregon. This year's theme is "What if...?" The call
for participation is open until March 15, 2011.
Comments (none posted)
Upcoming Events
Red Hat has announced
that registration is open for the 2011 Red Hat Summit and JBoss
World. "This marks the seventh year that Red Hat has gathered
customers, partners, visionary thinkers, technologists and open source
enthusiasts to learn, network and explore open source. The event will be
held in Boston at the Seaport World Trade Center, May 3-6, 2011. A full
list of sessions is now posted with talks covering a wide range of topics
from general overviews and roadmaps of Red Hat's cloud, virtualization,
platform and middleware offerings to the more developer-focused sessions
that include tips, tricks and demonstrations."
Comments (none posted)
The Southern California Linux Expo (SCALE) has announced the list of
speakers for this year's conference, which will take place in Los Angeles,
California, February 25-27, 2011.
Full Story (comments: none)
Events: February 3, 2011 to April 4, 2011
The following event listing is taken from the
LWN.net Calendar.
| Date(s) | Event | Location |
| February 2-3 | Cloud Expo Europe | London, UK |
| February 5 | Open Source Conference Kagawa 2011 | Takamatsu, Japan |
| February 5-6 | FOSDEM 2011 | Brussels, Belgium |
| February 7-11 | Global Ignite Week 2011 | several, worldwide |
| February 11-12 | Red Hat Developer Conference 2011 | Brno, Czech Republic |
| February 15 | 2012 Embedded Linux Conference | Redwood Shores, CA, USA |
| February 25 | Build an Open Source Cloud | Los Angeles, CA, USA |
| February 25 | Ubucon | Los Angeles, CA, USA |
| February 25-27 | Southern California Linux Expo | Los Angeles, CA, USA |
| February 26 | Open Source Software in Education | Los Angeles, CA, USA |
| March 1-2 | Linux Foundation End User Summit 2011 | Jersey City, NJ, USA |
| March 5 | Open Source Days 2011 Community Edition | Copenhagen, Denmark |
| March 7-10 | Drupalcon Chicago | Chicago, IL, USA |
| March 9-11 | ConFoo Conference | Montreal, Canada |
| March 9-11 | conf.kde.in 2011 | Bangalore, India |
| March 11-13 | PyCon 2011 | Atlanta, Georgia, USA |
| March 19 | Open Source Conference Oita 2011 | Oita, Japan |
| March 19 | OpenStreetMap Foundation Japan Mappers Symposium | Tokyo, Japan |
| March 19-20 | Chemnitzer Linux-Tage | Chemnitz, Germany |
| March 21-22 | Embedded Technology Conference 2011 | San Jose, Costa Rica |
| March 22-24 | OMG Workshop on Real-time, Embedded and Enterprise-Scale Time-Critical Systems | Washington, DC, USA |
| March 22-24 | UKUUG Spring 2011 Conference | Leeds, UK |
| March 22-25 | Frühjahrsfachgespräch | Weimar, Germany |
| March 22-25 | PgEast PostgreSQL Conference | New York City, NY, USA |
| March 23-25 | Palmetto Open Source Software Conference | Columbia, SC, USA |
| March 26 | 10. Augsburger Linux-Infotag 2011 | Augsburg, Germany |
| March 28 | Perth Linux User Group Quiz Night | Perth, Australia |
| March 28 to April 1 | GNOME 3.0 Bangalore Hackfest / GNOME.ASIA SUMMIT 2011 | Bangalore, India |
| March 29-30 | NASA Open Source Summit | Mountain View, CA, USA |
| April 1-3 | Flourish Conference 2011! | Chicago, IL, USA |
| April 2 | Texas Linux Fest 2011 | Austin, Texas, USA |
| April 2-3 | Workshop on GCC Research Opportunities | Chamonix, France |
If your event does not appear here, please
tell us about it.
Page editor: Rebecca Sobol