By Nathan Willis
March 27, 2013
Evan Prodromou surprised a number of free software microbloggers in
December 2012 when he announced
that he would be closing down Status.Net, the "Twitter-like" software
service he launched in 2008, in favor of his new project, pump.io. But Status.Net's flagship site, Identi.ca, has grown into a
popular social-networking hub for the free and open source
software community, and a number of Identi.ca users took the
announcement to mean that Identi.ca would disappear, much to the
community's detriment. Prodromou has reassured users that Identi.ca will
live on, though it will move from StatusNet (the software package, as
distinguished from Status.Net, the company) over to pump.io. Since then,
pump.io has rolled out to some test sites, but it is still in heavy
development, and remains something of an unknown quantity to users.
Prodromou has
some markedly different goals in mind for pump.io. The underlying
protocol is different, but more importantly, StatusNet never quite
reached its original goal of becoming a decentralized, multi-site
platform—instead, the debut site Identi.ca was quickly branded
as an open source "Twitter replacement." That misconception hampered
StatusNet's adoption as a federated solution, putting the bulk of the
emphasis on Identi.ca as the sole destination, with relatively few
independent StatusNet sites. The pump.io rollout is progressing
more slowly than StatusNet's, but that strategy is designed to avoid
some of the problems encountered by StatusNet and Identi.ca.
The December announcement started off by saying that Status.Net
would stop registering new hosted sites (e.g., foo.status.net) and
was discontinuing its "premium" commercial services. The software
itself would remain available, and site maintainers would be able to
download the full contents of their databases. Evidently, the
announcement concerned a number of Identi.ca
users, though, because Prodromou posted a follow-up
in January, reassuring users that the Identi.ca site would remain
operational.
But there were changes afoot. The January post indicated that
Identi.ca would be migrated over to run on pump.io (which necessarily
would involve some changes in the feature set, given that it was not
the same platform), and that all accounts which had been active in the
past year would be moved, but that at some point no new registrations
would be accepted.
Indeed, Identi.ca stopped accepting new user registrations on March 26. The shutdown was timed so that new
users could be redirected to one of several free, public pump.io sites
instead. Visiting http://pump.io/tryit.html
redirects the browser to a randomly-selected pump.io site, currently
chosen from a pool of ten. Users can set up an account on one of the
public servers, but getting used to pump.io may take some adjustment,
since it presents a distinctly different experience from the
Twitter-like StatusNet.
What is pump.io anyway?
At its core, StatusNet was designed as an implementation of the OStatus
microblogging standard. An OStatus server
produces an Atom feed of status-update messages, which are pushed to
subscribers using PubSubHubbub.
Replies to status updates are sent using the Salmon protocol, while the
other features of Twitter-like microblogging, such as
follower/following relationships and "favoriting" posts, are
implemented as Activity Streams.
The system is straightforward enough, but with a little
contemplation it becomes obvious that the 140-character limit
inherited from Twitter is a completely artificial constraint.
StatusNet did evolve to support longer messages, but ultimately there
is no reason why the same software could not deliver pictures
à la Pinterest or Instagram, too, or handle other types
of Activity Stream.
And that is essentially what pump.io is: a general-purpose
Activity Streams engine. It diverges from OStatus in a few other
respects, of course, such as sending activity messages as JSON rather
than as Atom, and by defining a simple REST inbox API instead of using
PubSubHubbub and Salmon to push messages to other servers. Pump.io
also uses a new database abstraction layer called Databank, which has
drivers for a variety of NoSQL databases, but supports real relational
databases, too. StatusNet, in contrast, was bound closely to MySQL.
But, in the
end, the important thing is the feature set: a pump.io instance can
generate a microblogging feed, an image stream, or essentially any
other type of feed. Activity Streams defines actions (which are called
"verbs") that handle common social networking interactions; pump.io
merely sends and receives them.
The code is available on GitHub; the
wiki explains that the
server currently understands a subset of Activity Streams verbs that
describe common social networking actions: follow,
stop-following, like, unlike,
post, update, and so on. However, pump.io will
process any properly-formatted Activity Streams message, which means
that application authors can write interoperable software simply by
sending compliant JSON objects. There is an example of this as well:
a Facebook-like farming game called Open Farm Game. The game
produces messages with its own set of verbs (for planting, watering,
and harvesting crops); the pump.io test sites will consume and display
these messages in the user's feed with no additional configuration.
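For a sense of what travels over the wire, here is a rough sketch (in Python, and not taken from the pump.io documentation; the field names follow Activity Streams 1.0 conventions and the account identifiers are invented) of a minimal "post a note" activity:

import json

# A minimal Activity Streams-style "post" activity; the actor and
# object identifiers are made-up examples, not real pump.io accounts.
activity = {
    "verb": "post",
    "actor": {
        "objectType": "person",
        "id": "acct:alice@pump.example.org",
    },
    "object": {
        "objectType": "note",
        "content": "Hello from a hypothetical pump.io client",
    },
}

# A client would POST JSON like this to the user's outbox endpoint and
# read new activities from the corresponding inbox endpoint.
print(json.dumps(activity, indent=2))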
The pump.io documentation outlines the other primitives understood
by the server—such as the predefined objects (messages, images,
users, collections, etc.) on which the verbs can act, and the API
endpoints (such as the per-user inbox and outbox). Currently, the demo
servers allow users to send status updates, post images, like or
favorite posts, and reply to updates. Users on the demo servers can
follow one another, although at the moment the UI to do so is
decidedly unintuitive (one must visit the other user's page and click
on the "Log in" link; only then does a "Follow" button become
visible). But Prodromou said in an email that more is still to come.
For those users and developers who genuinely prefer StatusNet, the
good news is that the software will indeed live on. There are
currently two actively-developed forks, GNU social and Free & Social. Prodromou said there
was a strong possibility the two would merge, although there will be a
public announcement with all of the details when and if that happens.
Where to now?
Pump.io itself and its web interface are the focus of
development, but they are not the whole story. Prodromou is keen to
avoid the situation encountered at the StatusNet launch, where the
vast majority of new users joined the first demo site (Identi.ca), and
it became its own social network, which ended up consuming a
significant portion of StatusNet's company resources. Directing new
registrations to a randomly-selected pump.io service is one tactic to
mitigate the risk; another is intentionally limiting what pump.io
itself will do.
For instance, while StatusNet could be linked to Twitter or other
services via server-side plugins, pump.io will rely on third-party applications for bridging to
other services. Prodromou cited TwitterFeed and IFTTT as
examples. "My hope is that hackers find pump.io fun to develop
for," he said, "and that they can 'scratch an itch' with
cool bridges and other apps." The narrow scope of pump.io also
means that a pump.io service only serves up per-user content; that is
to say, each user has an activity stream outbox and an inbox
consisting of the activities the user follows, but there is no site-wide
"public" stream—no tag feeds, no "popular notices."
That may frustrate Identi.ca users at the beginning, Prodromou
said, but he reiterated that the goal is to make such second-tier
services easy for others to develop and deploy, by focusing on the
core pump.io API. For example, the pump.io sites forward all messages
marked as "public" to the ofirehose.com site; any developer could
subscribe to this "fire hose" feed and do something interesting with
it. Ultimately, Prodromou said, he hopes to de-emphasize the
importance of "sites" as entities, in favor of users. Users do not care
much about SMTP servers, he said; they care about the emails sent and
received, not about enumerating all of the accounts on the server.
That is true in the SMTP world (one might argue that the only
people who care to enumerate the user accounts on a server probably
have nefarious goals in mind), but it does present some practical
problems in social networking. Finding other users and searching (both on message content
and on metadata) have yet to be solved in pump.io. Prodromou said he
is working on "find your friend" sites for popular services (like
Facebook and Twitter) where users already have accounts, but that
search will be trickier.
Identi.ca and other things in the future
Eventually, the plan is for Identi.ca to become just one more
pump.io service among many; the decentralization will mean it is no
harder to follow users on another pump.io server or to carry on a
conversation across several servers than it is to interact with others
on a monolithic site like Twitter. But getting to that future will
place a heavier burden on the client applications, be they mobile,
web-based, or desktop.
Prodromou has not set out a firm timeline for the process; he is
working on the pump.io web application (which itself should be
mobile-friendly HTML5) and simple apps for iOS and Android. In the
medium term, the number of public pump.io sites is slated to ramp up
from ten to 15 or 20. But at some point Prodromou will start directing
new registrations to a free Platform-as-a-Service (PaaS) provider that
offers pump.io as a one-click-install instead (AppFog and OpenShift
were both mentioned, but only as hypothetical examples).
Where pump.io goes from there is hard to predict. Prodromou is
focused on building a product developers will like; he deliberately
chose the permissive Apache 2.0 license over the AGPL because the
Node.js and JavaScript development communities prefer it, he said.
Applications, aggregation, and PaaS delivery are in other people's
hands, but that is evidently what he wants. As he explained it,
running Status.Net took considerable resources (both human and server)
to manage hosted instances and public services like Identi.ca, which
slowed down development of the software itself. "I want to get
out of the business of operating social networking sites and into the
business of writing social networking software."
At some point in the next few months, Identi.ca will switch over
from delivering OStatus with StatusNet to running pump.io. That will
be a real watershed moment; as any social-networking theorist will
tell you, the value of a particular site is measured by the community
that uses it, not the software underneath. Identi.ca has grown into a
valued social-networking hub for the free software community;
hopefully that user community survives the changeover, even if it
takes a while to find its bearings again on the new software platform.
Comments (1 posted)
By Jonathan Corbet
March 27, 2013
The Wayland project, which seeks to design and implement next-generation
display management for Linux and beyond, does not lack for challenges. The
project is competing with a well-established system (the X Window System)
that was written by many of the same developers. It is short of
developers, and often seems to have a hard time communicating its reasons
for existence and goals to a somewhat skeptical community. Canonical decided
to create its own display manager for Ubuntu rather than work to help
improve Wayland, and Android has yet another solution of its own. About
the only thing the project lacked was a fork and internal fighting — until
now. The story behind this episode merits a look as an example of the
challenges involved in keeping a development community healthy.
Scott Moreau is an established contributor to both Wayland (the protocol
definition and implementation) and Weston (the reference compositor
implementation for Wayland). A quick search of the project's repositories
shows that he has contributed 84 changes to the project since the beginning of
2012 — about 2% of the
total. Until recently, he was an active and often helpful presence on the
project's mailing lists. So it might come as a surprise to learn that
Scott was recently banned from the Wayland IRC
channel and, subsequently, the project's mailing list. A simple
reading of the story might suggest that the project kicked him out for
creating his own fork of the code; when one looks closer, though, the story
appears to be even simpler than that.
Last October, Wayland project leader Kristian Høgsberg suggested that it might be time to add a
"next" branch to the Weston repository for new feature development. He
listed a few patches that could go there, including "Scott's minimize etc
work." Scott responded favorably at the
time, but suggested that Wayland, too, could use a "next" branch. It does
not appear that any such branch was created in the official repositories,
though. So, for some months, the idea of a playground
repository for new features remained unimplemented.
In mid-March 2013, Scott announced the creation
of staging repositories for both Wayland and Weston, and started responding
to patch postings with statements that they had been merged into
"gh next". Two days later, he complained that "Kristian has
expressed no interest in the gh next series or the benefits that it
might provide" and that Kristian had not merged his latest patches.
He also let
it be known that he thought that Weston could be developed into a full
desktop environment — a goal the Wayland developers, who are busy enough
just getting the display manager implemented properly, do not share.
The series of messages continued with this
lengthy posting comparing the "gh next" work with the Compiz window
manager and its Beryl fork, claiming that, after the two projects merged back together, most of
the interesting development had come from the Beryl side. Similarly, Scott
intends "gh next" to be a place where developers can experiment with shiny
new features, the best of which can eventually be merged back into the
Wayland and Weston repositories. Scott's desire to "run ahead" is seen as
a distraction by many Wayland developers who would rather focus on
delivering a solid platform first, but that is not where the real
discord lies.
There was, for example, a certain amount of disagreement with Scott's
interpretation of the Compiz story. More importantly, he was asked to, if
possible, avoid forking Wayland and making incompatible protocol changes
that would be hard to integrate later. When Scott was shown how his
changes could be made in a more cooperative manner, he responded "This sounds great but this is
not the solution I have come up with." Meanwhile, the lengthy
missives to the mailing list continued. And, evidently, he continued a pattern of behavior
on the project's IRC channel that fell somewhere between "unpleasant" and
"abusive." Things reached a point where other Wayland developers were
quite vocal about their unwillingness to deal with Scott.
What developers in the project are saying now is that the fork had nothing
to do with Scott's banishment from the Wayland project. Even his plans to
make incompatible changes could have been overlooked, and his eventual
results judged on their merits when the time came. But behavior that made
it hard for everybody else to get their work done was not something that
the project could accept.
There is no point in trying to second-guess the project's leadership here
with regard to whether Scott is the sort of "poisonous person" that needs
to be excluded from a development community. But there can be no doubt
that such people can, indeed, have a detrimental effect on how a community
works. When a community's communication channels turn unpleasant or
abusive, most people who do not have a strong desire to be there will find
somewhere else to be — and a different project to work on. Functioning
communities are fragile things; they cannot take that kind of stress
indefinitely.
Did this community truly need to expel one of its members as an act of self
preservation? Expulsion is not an act without cost; Wayland has, in this
case, lost an enthusiastic contributor. So such actions are not to be
taken lightly; the good news is that our community cannot be accused of
doing that. But, as long as our communities are made up of humans, we will
have difficult interactions to deal with. So stories like those outlined
above will be heard again in the future.
Comments (4 posted)
By Jake Edge
March 27, 2013
Python core developer Raymond Hettinger's PyCon 2013 keynote had elements of a revival meeting
sermon, but it was also meant to spread the "religion" well beyond those
inside the meeting tent. Hettinger specifically tasked attendees to use
his "What makes Python awesome?" talk as a sales tool with
management and other Python
skeptics. While he may have used the word "awesome" a few too many times
in the talk,
Hettinger is clearly an excellent advocate of the language from a
technical—not just cheerleading—perspective.
He started the talk by noting that he teaches "Python 140 characters at a
time" on Twitter (@raymondh).
He has been a core developer for twelve years, working on builtins, the
standard library, and a few core language features. For the last year and
a half, Hettinger has had a chance to
"teach a lot of people Python". Teaching has given him a perspective on
what is good and bad in Python.
Context for success
Python has a "context for success", he said, starting with
its license. He and many others would never have heard of Python if it
were not available under an open source license. It is also important for
a "serious language" to have commercial distributions and the support that
comes with those.
Python also has a "Zen", he said, which is
also true of some other languages, like Ruby, but "C++ does not have Zen".
Community is another area
where Python excels. "C is a wonderful language", but it doesn't have a
community, Hettinger said.
The PyPI repository for Python
modules and packages is another important piece of the puzzle. Python also
has a "killer app", in fact it has more than one. Zope, Django, and pandas are all killer apps, he said.
Windows support is another important attribute of Python. While many in the
audience may be "Linux weenies" and look down on Windows users, most of the
computers in the world are running Windows, so it is important for Python
to run there too, he said. There are lots of Python books available,
unlike some other languages. Hettinger is interested in Go, but there
aren't many books on that language.
All of these attributes make up a
context for success, and any language that has them
is poised to succeed. But, he asked, why is he talking about the good
points of
Python at PyCon, where
everyone there is likely to already know much of what he is saying? It is
because attendees will often be in a position to recommend or defend
Python. Hettinger's goal is for attendees to be able to articulate what is
special about the language.
High-level qualities
The Python language itself has certain qualities that make it special, he
said, starting with "ease of learning". He noted that David Beazley runs
classes where students are able to write "amazing code" by the end of the
second day. One of the exercises in those classes is to write a web log
summarizing tool, which shows how quickly non-programmers can learn Python.
Python allows for a rapid development cycle as well. Hettinger used to
work at a high-frequency trading company that could come up with a trading
strategy in the morning and be using it by the afternoon because of
Python. Though he was a good Java programmer, he could never get that
kind of rapid turnaround using Java.
Readability and beauty in a language are important, he said, because they
mean that programmers will want to program in the language. Python
programmers will write code on evenings and weekends, but "I never code C++
on the weekend" because it is "not fun, not beautiful". Python is both, he
said.
The "batteries included" philosophy of Python, where the standard library
is part of the language, is another important quality. Finally, one of
Hettinger's favorite Python qualities is the protocols that it defines,
such as the database
and WSGI
protocols. The database protocol means that you can swap
out the underlying database system, switching to or from MySQL, Oracle,
or PostgreSQL without changing the code to access the database. Once you
know how to access one of them through Python, you know how to access them all.
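To illustrate, here is a minimal sketch using the standard library's sqlite3 module, which implements the same DB-API interface (the table and data are invented for the example, and in practice the parameter placeholder style can vary slightly between driver modules):

import sqlite3   # any DB-API module (psycopg2, MySQLdb, ...) follows the same pattern

conn = sqlite3.connect(':memory:')   # connect() is the DB-API entry point
cur = conn.cursor()
cur.execute('CREATE TABLE port (symbol TEXT, shares INTEGER, price REAL)')
cur.execute('INSERT INTO port VALUES (?, ?, ?)', ('IBM', 50, 91.10))
cur.execute('SELECT symbol, shares, price FROM port')
for symbol, shares, price in cur.fetchall():
    print('%s %.2f' % (symbol, shares * price))
conn.close()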
As an example of the expressiveness and development speed of the language,
Hettinger put up a slide
with a short program. In a class he was teaching, someone asked how he
would deduplicate a disk full of photos, and in five minutes he was able to
come up with a fifteen-line program to do so. It is a real testament to
the language that he could write that program live in class, but even more
importantly, he can teach others to do the same. That one slide shows
"a killer feature of the language: its productivity, and its beauty and
brevity", he said.
But, there is a problem with that example. A similar slide could be
created for Ruby or Perl, with roughly the same brevity. That would be
evidence for the "all scripting languages are basically the same, just with
different syntax" argument that he hears frequently from software
executives. But all scripting languages are not the same, he said.
That may have
been true in 2000, but "we've grown since then"; there are lots of features
that separate Python from the pack.
Winning language features
First up on Hettinger's list of "winning language features" is the required
indentation of the language. It was an "audacious move" to make that choice
for the language, but it contributes to the "clean, uncluttered" appearance
of the code. He claimed that Python was the first to use indentation that
way, though he later received a "Miranda warning" from an audience member
as the Miranda language uses indentation and predates Python. People new to
the language sometimes react negatively to the forced indentation, but it
is a net positive. He showed some standard examples of where C programs
can go wrong because the indentation doesn't actually match the control
flow, which is impossible with Python. Python "never lies with its visual
appearance", which is a winning feature, he said.
The iterator protocol is one of his favorite parts of the language. It is
a "design pattern" that can be replicated in languages like Java and C++,
but it is "effortless to use" in Python. The yield statement can
create iterators everywhere. Because iterators are so deeply wired into
the language, they can be used somewhat like Unix pipes. So the shell
construct:
cat filename | sort | uniq
can be expressed similarly in Python as:
sorted(set(open(filename)))
This shows how iterators can be used as composable filters. In addition,
Python has a level of expressiveness that is
similar to SQL, so:
sum(shares*price for symbol, shares, price in port)
will sum the number of shares times the price for all of the entries in
port, which is much like the SQL equivalent:
SELECT SUM(shares*price) FROM port;
Languages that don't have
for loops that are as powerful as
Python's cannot really compete, he said.
One of his favorite things to teach about Python is list comprehensions.
The idea came from mathematical set building notation. They "profoundly
improve the expressiveness of Python", Hettinger said. While list
comprehensions might at first appear to violate the "don't put too much on
one line" advice given to new programmers, it is actually a way to build up a
higher-level view. The examples he gave can fairly easily be expressed as
natural
language sentences:
[line.lower() for line in open(filename) if 'INFO' in line]
which creates a list of lower-cased lines that contain "INFO". The second seems
directly derived from math notation:
sum([x**3 for x in range(10000)])
which sums a list of the cubes of the first 10,000 integers (starting at zero).
Since list comprehensions can generally be expressed as single sentences,
it is reasonable to write them that way in Python.
The generators feature is a "masterpiece" that was stolen from the Icon language.
Now that Python has generators, other languages are adding them as
well. Generators allow Python functions to "freeze their execution" at a
particular point and to resume execution later. Using generators makes
both iterators
and coroutines easier to implement in a "clean, readable, beautiful" form.
Doing things that way is something that Python has "that others don't".
His simple example
showed some of the power of the feature:
def pager(lines, pagelen=60):
    for lineno, line in enumerate(lines):
        yield line
        if lineno % pagelen == 0:
            yield FORMFEED
Generator expressions come from Hettinger's idea of combining generators
and list comprehensions. Rather than requiring the creation of a list,
generators can be used in expressions directly:
sum(x**3 for x in range(10000))
From that idea, dictionary and set comprehensions
are obvious extensions, he said. Generator expressions are one way to combat
performance problems in Python code because they have a small memory
footprint and are
thus cache friendlier, he said.
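For instance (a quick sketch reusing the portfolio idea from the earlier example; the data is invented, not from the slides), the same bracketed syntax yields dictionaries and sets directly:

port = [('IBM', 50, 91.10), ('AAPL', 100, 456.00), ('RHT', 20, 50.50)]

# dictionary comprehension: market value keyed by symbol
value_by_symbol = {symbol: shares * price for symbol, shares, price in port}

# set comprehension: the distinct symbols held
symbols = {symbol for symbol, shares, price in port}

print(value_by_symbol)
print(symbols)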
But generators have a problem: they are a "bad date". Like a date that can
only talk about themselves, generators can only talk, not listen. That led
to the idea of two-way generators. Now
generators can accept inputs in the form of send(),
throw(), and close() methods. It is a feature that is
unique to Python, he said, and is useful for implementing coroutines. It
also helps "tame" some of the constructs in Twisted.
Decorators have an interesting history in Python. They don't really add
new functionality that can't be done other ways, so the first few times
they were proposed, they were turned down. But they kept being proposed, so
Guido van Rossum (Python's benevolent dictator for life) used a tried and
true strategy to make the problem go away: he said that if everyone could
agree on a syntax for decorators, he would consider adding them. For the
first time ever, the entire community came together and agreed on a
syntax. The community
presented that agreement to Van Rossum, whose response was: "you shall have
decorators, but not the syntax you asked for".
In retrospect,
the resistance to decorators (from Van Rossum and other core developers)
was wrong, Hettinger said, as they have turned out to be a "profound
improvement to the language". He pointed to the lightweight web frameworks
(naming itty, Flask, and CherryPy) as examples of how decorators
can be used to create simple web applications. His one
slide example of an itty-based web service uses decorators for
routing. Each new service is usually a matter of adding three lines or so:
@get('/freespace')
def compute_free_disk_space(request):
    return subprocess.check_output('df')
The code above creates a page at
/freespace that runs
df and
returns its output as a web page.
"Who's digging Python now?", he asked with a big grin, as he did in spots
throughout the
talk—to much applause.
The features he had mentioned are reasons to pick Python over languages
like Ruby, he said. While back in 2000, Python may have been the
equivalent of other scripting languages, that has clearly changed.
There are even more features that make Python compelling, such as the
with
statement. Hettinger thinks that "context managers" using
with may turn
out to be as
important to programming as was the invention of the subroutine. The
with statement is a
tool for making code "clean and beautiful" by setting up a temporary
context where the entry and exit conditions can be ensured (e.g. files
closed or locks unlocked) without sprinkling try/finally
blocks all over. Other languages have a with, but they are
not at all the same as Python's. The best uses for it have not yet
been discovered, he said, and suggested that audience members "prove to the
world that they are awesome", so that other languages get them.
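A small sketch of the pattern (not from the slides): the try/finally boilerplate around a lock disappears, and the exit action is guaranteed even if the body raises an exception:

import threading

counter = 0
lock = threading.Lock()

def increment_with_finally():
    global counter
    lock.acquire()
    try:
        counter += 1
    finally:
        lock.release()

def increment_with_statement():
    global counter
    # The lock is acquired on entry and released on exit, even if the
    # body raises an exception.
    with lock:
        counter += 1

increment_with_finally()
increment_with_statement()
print(counter)    # 2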
The last winning feature that he mentioned was one that he initially didn't
want to be added: abstract base classes. Van Rossum had done six months of
programming in Java and "came back" with abstract base classes. Hettinger
has come to embrace them. Abstract base classes help clarify what a sequence
or a mapping
actually is by defining the interfaces used by those types. They are
also useful for
mixing in different classes to better organize programs and modules.
There is something odd that comes with abstract base classes, though.
Python uses "duck typing", which means that using isinstance() is
frowned upon. In fact, novice Python programmers spend their first six
months adding isinstance() calls, he said, and then spend the next
six months taking them back out.
With abstract base classes, there is an addition to the
usual "looks like a duck, walks like a duck, quacks like a duck" test
because isinstance() can lie. That leads to code that uses:
"well, it said it was
a duck, and that's good enough for me", he said with a laugh. He thought
this was "incredibly weird", but it turns out there are some good use cases
for the feature. He showed an example
of using the collections.Set abstract base class to create a
complete list-based set just by implementing a few basic operations. All
of the
normal set operations (subset and superset tests, set equality, etc.) are
simply inherited from the base class.
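A sketch of what such a class might look like (closely following the pattern shown in the Python documentation for the Set abstract base class, not Hettinger's actual slide): implement __contains__(), __iter__(), and __len__(), and the ABC supplies intersection, union, subset tests, equality, and the rest.

from collections import Set   # collections.abc.Set on newer Pythons

class ListBasedSet(Set):
    """A (slow) set stored in a list; only three methods are defined."""
    def __init__(self, iterable):
        self.elements = elements = []
        for value in iterable:
            if value not in elements:
                elements.append(value)

    def __contains__(self, value):
        return value in self.elements

    def __iter__(self):
        return iter(self.elements)

    def __len__(self):
        return len(self.elements)

s1 = ListBasedSet('abcdef')
s2 = ListBasedSet('defghi')
print(len(s1 & s2))   # intersection, inherited from the ABC: 3
print(s1 <= s1)       # subset test, also inherited: True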
Hettinger wrapped up his keynote with a request: "Please take this
presentation and go be me". He suggested that attendees present it to
explain what Python has that other languages are missing, and thus why Python
should be
chosen over a language like Ruby. He also had "one more thing" to note:
the Python community has a lot of both "established superstars" as
well as "rising young superstars". Other languages have "one or two
stars", he said, but Python has many; just one more thing that Python has
that other languages don't.
Comments (73 posted)
Page editor: Jonathan Corbet
Security
By Nathan Willis
March 27, 2013
Version 6.2 of the OpenSSH package was released on March 22,
bringing with it the usual palette of new encryption and
authentication schemes, administrative options, and bug fixes. The
notable changes include improved granularity for existing
options, but there are also brand new features—such as the
ability to fetch authorized keys via an external command, and the
ability to require multiple authentication methods when users log in
through sshd. Because OpenSSH includes a flexible mechanism to
invoke operating system authentication methods,
sshd can require genuine multi-factor authentication, with
hardware tokens or biometrics.
Answer first these questions three (or two)
The sshd daemon in OpenSSH is configured (by default) through the
/etc/ssh/sshd_config file. 6.2 adds a new keyword,
AuthenticationMethods, which takes one or more
comma-separated lists of authentication methods as its argument (with
the lists themselves separated by spaces). If
only one list is supplied, users must successfully complete all
of the methods—in the order listed—to be granted access.
If several lists are provided, users need complete only one of the
lists. For example, the line
AuthenticationMethods "publickey,password hostbased,publickey"
would require the user to either perform public-key authentication
then enter a password, or to connect from a trusted host (i.e., using
host-based authentication) and perform public-key authentication. The
listed authentication methods are invoked in order, which matters most
visibly when several interactive methods are specified. It is also
important to note that the
AuthenticationMethods feature
applies only to the SSH 2
protocol, and that each authentication method listed must also be
explicitly enabled in the
sshd_config file.
There are just four allowable methods: publickey,
password, hostbased, and
keyboard-interactive. That might sound a tad inflexible, but
keyboard-interactive is a generic method that can trigger
other mechanisms, such as BSD Auth,
S/KEY, or Pluggable Authentication
Modules (PAM). By appending the desired submechanism after a colon,
such as keyboard-interactive:pam, one can add any PAM module
to the mix—including modules that build on an entirely different
authentication model, such as Kerberos, hardware security tokens, or
perhaps face recognition. Naturally, PAM needs
to be configured to specify the module of interest as well, via
/etc/pam.conf or /etc/pam.d/sshd.
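As a rough sketch (the specific PAM module named here is only an example; any correctly configured module would do), a setup requiring a public key plus a PAM-driven one-time password might combine the following sshd_config directives:

UsePAM yes
ChallengeResponseAuthentication yes
AuthenticationMethods publickey,keyboard-interactive:pam

together with a line such as "auth required pam_google_authenticator.so" in /etc/pam.d/sshd.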
The salient point, of course, is that modules are available which
rely on "something you have" or "something you are" authentication.
Otherwise, requiring users to provide a public key and a password is
not genuinely multi-factor authentication, since both factors are
squarely in the "something you know" column. Granted, there is a gray
area where public keys are concerned, since users do not memorize
them, but the fact that exact digital copies can be made separates
them from hardware tokens. Whether widespread
ease-of-deployment through OpenSSH will reinvigorate the development
and maintenance of hardware and biometric PAM modules remains an open
question.
Key lookups
The second addition to sshd's authentication toolkit is the
AuthorizedKeysCommand keyword. The command supplied as the
argument to this keyword is invoked to fetch the key of the user
attempting to authenticate. In previous releases, this key-fetching
step was limited to looking up the user's key on the filesystem (in
the file specified by the AuthorizedKeysFile keyword). By
accepting a command instead, administrators can look up the incoming
user's key in a database, or over (for example) LDAP.
OpenSSH imposes some restrictions on the program specified for
AuthorizedKeysCommand; it will be passed one argument (the
username of the user attempting to connect) and it must return output
on stdout consisting of zero or more lines of public keys,
formatted in the same manner used by the
AuthorizedKeysFile. In addition, the command is run as
the user specified in the AuthorizedKeysCommandUser
directive, which the OpenSSH project advises be a dedicated account
with no other role on the machine.
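As an illustration of that contract, a hypothetical AuthorizedKeysCommand helper might be a short Python script along these lines (the database path and schema are invented; the only fixed parts are the single username argument and the authorized_keys-format output on stdout):

#!/usr/bin/env python
# Hypothetical AuthorizedKeysCommand helper: look up the connecting
# user's public keys in a local SQLite database and print them in
# authorized_keys format on stdout.  sshd passes the username as argv[1].
import sys
import sqlite3

def main():
    if len(sys.argv) != 2:
        return 1
    username = sys.argv[1]
    conn = sqlite3.connect('/var/lib/sshkeys/keys.db')   # invented path
    cur = conn.execute('SELECT key FROM authorized_keys WHERE user = ?',
                       (username,))
    for (key,) in cur:
        print(key)        # e.g. "ssh-rsa AAAA... alice@example.com"
    return 0

if __name__ == '__main__':
    sys.exit(main())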
If AuthorizedKeysCommand fails to return a key that
authenticates the user, sshd falls back to the old-fashioned
local key lookup method, which allows for a safety net in the event
that the command fails or the remote host queried is unavailable.
In a related feature, OpenSSH 6.2 also adds support for Key
Revocation Lists (KRLs). KRLs are compact, binary files that can be
generated by OpenSSH's ssh-keygen utility. The KRL specification
allows for several formats, depending on the type of credential
revoked. The file can list a plain-text public key, a key's SHA1
hash, a certificate's key ID string, or a 64-bit certificate serial
number (in decimal, hexadecimal, or octal). When revoking
certificates via serial number, the KRL can specify a range, which is
what leads to a claim in the OpenSSH 6.2 release notes that a KRL can be so
compact that it takes "as little as one bit per certificate."
The multiple acceptable formats specified in the KRL format can
simplify the task of revoking keys. Among other things, the ability to
revoke a credential by its serial number or key ID alone—without the
original in hand—makes it possible for an administrator to
revoke a compromised key or certificate rapidly when fetching a copy
of the complete credential might cost valuable time. In addition, it
allows administrators to revoke keys or certificates that have been
lost (or are feared to be).
Encryption for all
The new sshd authentication features are not the only changes in
OpenSSH 6.2, of course. There are several new ciphers, modes, and
message authentication codes (MACs) supported, such as Galois/Counter
Mode (GCM) for AES, the 128-bit flavor of Message Authentication Code using Universal
Hashing (UMAC-128), and encrypt-then-MAC (EtM) mode for SSH 2.
The latter alters the packet format, computing the MAC over the packet
length and the entire encrypted packet, rather than over the plaintext
data payload.
There is also a new option available for specifying TCP forwarding
behavior in sshd. In previous releases, the only possible options
were "yes" and "no"; 6.2 adds two more: "local" and "remote", which
allow administrators to limit TCP forwarding to just local or just
remote hosts. In addition, there are several new command-line
switches for logging and interactive help, which should make OpenSSH
easier to work with, even though they do not add new features. The
prospect of multi-factor authentication with ssh may have the most
far-reaching implications, but OpenSSH 6.2 includes plenty of
practical updates as well.
Comments (3 posted)
Brief items
We log every message. We log who sent it, from what IP address, and to
whom. We scan headers and payloads for potentially "spammy" topics. We
even retain binary attachments and other payload data. Whereas before,
the Email system only logged trivial, non-identifiable information, it
now warehouses every message and tags it internally with a customer
account number.
This is the "equal and opposite" reaction to blacklists: Total
Information Awareness applied to our customer's email.
--
Anonymous PRIVACY Forum reader (and ISP
employee)
Paula
Broadwell, who had an affair with CIA director David Petraeus,
similarly took extensive precautions to hide her identity. She never logged
in to her anonymous e-mail service from her home network. Instead, she used
hotel and other public networks when she e-mailed him. The
FBI correlated hotel registration data from several different hotels -- and hers was the common name.
The Internet is a surveillance state. Whether we admit it to ourselves or
not, and whether we like it or not, we're being tracked all the
time. Google tracks us, both on its pages and on other pages it has access
to. Facebook
does the same; it even tracks
non-Facebook users. Apple tracks us on our iPhones and iPads. One
reporter used a tool called Collusion to track who was tracking him; 105
companies tracked his Internet use during one 36-hour period.
--
Bruce
Schneier
So, you know all that talk about things like
Aaron's
Law and how [the US] Congress needs to
fix
the CFAA [Computer Fraud and Abuse Act]? Apparently, the House Judiciary Committee has decided to raise a giant middle finger to folks who are concerned about abuses of the CFAA. Over the weekend, they began circulating a "draft" of a "cyber-security" bill that is so bad that it almost feels like the Judiciary Committee is doing it on purpose as a dig at online activists who have fought back against things like SOPA, CISPA and the CFAA. Rather than fix the CFAA, it expands it. Rather than rein in the worst parts of the bill, it makes them worse. And, from what we've heard, the goal is to try to push this through quickly, with a big effort underway for a "cyberweek" in the middle of April that will force through a bunch of related bills.
--
Mike
Masnick analyzes proposed US legislation
Comments (1 posted)
Over at the grsecurity blog, Brad Spengler and the PaX Team have co-written a
lengthy look at kernel address space layout randomization (KASLR) and its failures. "
KASLR is an easy to understand metaphor. Even non-technical users can make sense of the concept of a moving target being harder to attack. But in this obsession with an acronym outside of any context and consideration of its limitations, we lose sight of the fact that this moving target only moves once and is pretty easy to spot. We forget that the appeal of ASLR was in its cost/benefit ratio, not because of its high benefit, but because of its low cost."
Comments (14 posted)
Matthew Garrett
asserts that people
attacking UEFI secure boot are aiming at the wrong target. "
Those
who argue against Secure Boot risk depriving us of the freedom to make a
personal decision as to who we trust. Those who argue against Secure Boot
while ignoring Restricted Boot risk depriving us of even more. The
traditional PC market is decreasing in importance. Unless we do anything
about it, free software will be limited to a niche group of enthusiasts
who've carefully chosen from a small set of devices that respect user
freedom. We should have been campaigning against Restricted Boot 10 years
ago. Don't delay it even further by fighting against implementations that
already respect user freedom."
Comments (85 posted)
New vulnerabilities
euca2ools: insecure snapshots
Package(s): euca2ools
CVE #(s): CVE-2012-4066
Created: March 25, 2013
Updated: March 28, 2013
Description: From the Red Hat bugzilla:
euca2ools 2.1.3 contains the node controller's end of the fix for bug 916709. We consequently need to roll that version out at least in every branch that contains eucalyptus.
Comments (none posted)
gnome-online-accounts: information disclosure
Package(s): gnome-online-accounts
CVE #(s): CVE-2013-1799
Created: March 25, 2013
Updated: March 27, 2013
Description: From the Ubuntu advisory:
It was discovered that GNOME Online Accounts did not properly check SSL
certificates when configuring online accounts. If a remote attacker were
able to perform a man-in-the-middle attack, this flaw could be exploited to
alter or compromise credentials and confidential information.
Comments (none posted)
kernel: denial of service
Package(s): kernel
CVE #(s): CVE-2013-1819
Created: March 22, 2013
Updated: March 27, 2013
Description: From the Red Hat bugzilla:
Linux kernel built with support for XFS file system is vulnerable to a NULL
pointer dereference flaw. This occurs while accessing blocks beyond the end
of the file system, possibly on a corrupted device.
A user able to mount the file system could use this flaw to crash the kernel, resulting in DoS.
Comments (none posted)
kernel: multiple vulnerabilities
Package(s): kernel
CVE #(s): CVE-2013-1873, CVE-2013-1796, CVE-2013-1797, CVE-2013-1798
Created: March 25, 2013
Updated: April 1, 2013
Description: From the CVE entries:
The kvm_set_msr_common function in arch/x86/kvm/x86.c in the Linux kernel through 3.8.4 does not ensure a required time_page alignment during an MSR_KVM_SYSTEM_TIME operation, which allows guest OS users to cause a denial of service (buffer overflow and host OS memory corruption) or possibly have unspecified other impact via a crafted application. (CVE-2013-1796)
Use-after-free vulnerability in arch/x86/kvm/x86.c in the Linux kernel through 3.8.4 allows guest OS users to cause a denial of service (host OS memory corruption) or possibly have unspecified other impact via a crafted application that triggers use of a guest physical address (GPA) in (1) movable or (2) removable memory during an MSR_KVM_SYSTEM_TIME kvm_set_msr_common operation. (CVE-2013-1797)
The ioapic_read_indirect function in virt/kvm/ioapic.c in the Linux kernel through 3.8.4 does not properly handle a certain combination of invalid IOAPIC_REG_SELECT and IOAPIC_REG_WINDOW operations, which allows guest OS users to obtain sensitive information from host OS memory or cause a denial of service (host OS OOPS) via a crafted application. (CVE-2013-1798)
CVE-2013-1873 is a duplicate of CVE-2013-2634, CVE-2013-2635, and CVE-2013-2636.
net/dcb/dcbnl.c in the Linux kernel before 3.8.4 does not initialize certain structures, which allows local users to obtain sensitive information from kernel stack memory via a crafted application. (CVE-2013-2634)
The rtnl_fill_ifinfo function in net/core/rtnetlink.c in the Linux kernel before 3.8.4 does not initialize a certain structure member, which allows local users to obtain sensitive information from kernel stack memory via a crafted application. (CVE-2013-2635)
net/bridge/br_mdb.c in the Linux kernel before 3.8.4 does not initialize certain structures, which allows local users to obtain sensitive information from kernel memory via a crafted application. (CVE-2013-2636)
Comments (none posted)
keystone: revocation check bypass
Package(s): keystone
CVE #(s): CVE-2013-1865
Created: March 21, 2013
Updated: April 5, 2013
Description: From the Ubuntu advisory:
Guang Yee discovered that Keystone would not always perform all
verification checks when configured to use PKI. If the keystone server was
configured to use PKI and services or users requested online verification,
an attacker could potentially exploit this to bypass revocation checks.
Keystone uses UUID tokens by default in Ubuntu.
Comments (none posted)
libxml2: denial of service
Package(s): libxml2
CVE #(s): CVE-2013-0339
Created: March 26, 2013
Updated: March 27, 2013
Description: From the Debian advisory:
Brad Hill of iSEC Partners discovered that many XML implementations are
vulnerable to external entity expansion issues, which can be used for
various purposes such as firewall circumvention, disguising an IP
address, and denial-of-service. libxml2 was susceptible to these
problems when performing string substitution during entity expansion.
Comments (none posted)
nova: two vulnerabilities
Package(s): nova
CVE #(s): CVE-2013-0335, CVE-2013-1838
Created: March 21, 2013
Updated: April 5, 2013
Description: From the Ubuntu advisory:
Loganathan Parthipan discovered that Nova did not properly validate VNC
tokens after an instance was deleted. An authenticated attacker could
exploit this to access other virtual machines under certain circumstances.
This issue did not affect Ubuntu 11.10. (CVE-2013-0335)
Vish Ishaya discovered that Nova did not always enforce quotas on fixed
IPs. An authenticated attacker could exploit this to cause a denial of
service via resource consumption. Nova will now enforce a quota limit of
10 fixed IPs per instance, which is configurable via 'quota_fixed_ips'
in /etc/nova/nova.conf. (CVE-2013-1838)
Comments (none posted)
openstack-packstack: insecure file handling
Package(s): openstack-packstack
CVE #(s): CVE-2013-1815
Created: March 22, 2013
Updated: March 27, 2013
Description: From the Red Hat advisory:
PackStack is a command line utility that uses Puppet modules to support
rapid deployment of OpenStack on existing servers over an SSH connection.
PackStack is suitable for deploying both single node proof of concept
installations and more complex multi-node installations.
It was found that PackStack did not handle the answer file securely. In
some environments, such as those using a non-default umask, a local
attacker could possibly modify the answer file if PackStack was run in an
attacker controlled directory, or attempted to create the answer file in
"/tmp/", allowing the attacker to modify systems being deployed using
OpenStack. Note: After applying this update, PackStack will create the
answer file in the user's home directory by default. It will no longer
create it in the current working directory or the "/tmp/" directory by
default. (CVE-2013-1815)
The CVE-2013-1815 issue was discovered by Derek Higgins of the Red Hat
OpenStack team.
Comments (none posted)
privoxy: proxy spoofing
Package(s): privoxy
CVE #(s): CVE-2013-2503
Created: March 22, 2013
Updated: April 3, 2013
Description: From the Fedora advisory:
Privoxy before 3.0.21 does not properly handle Proxy-Authenticate and Proxy-Authorization headers in the client-server data stream, which makes it easier for remote HTTP servers to spoof the intended proxy service via a 407 (aka Proxy Authentication Required) HTTP status code.
Comments (none posted)
Page editor: Jake Edge
Kernel development
Brief items
The current development kernel is 3.9-rc4,
released on March 23. Linus says:
"
Another week, another -rc. And things haven't calmed down, meaning
that the nice small and calm -rc2 was definitely the outlier so far.
… While it hasn't been as calm as I'd like things to be, it's not
like things have been hugely exciting either. Most of this really is
pretty trivial. It's all over, with the bulk in drivers (drm, md, net, mtd,
usb, sound), but also some arch updates (powerpc, arm, sparc, x86) and
filesystem work (cifs, ext4)."
Stable updates: 3.2.42 was released
on March 27.
The 3.8.5,
3.4.38,
and 3.0.71 updates are in the review cycle
as of this writing; they can be expected on or after March 28. Also
in review are 3.5.7.9 and
3.6.11.1 (a new, short-term series meant to
support the 3.6-based realtime stable kernels).
Comments (none posted)
Be careful, you've already submitted some kernel patches; keep on
this patch and you might just wake up one morning and find yourself
a kernel developer.
—
Paul Moore
You can't just constantly ignore patches, that's reserved for
kernel developers with more experience :)
—
Greg Kroah-Hartman (Thanks to Thomas Petazzoni)
This patch adds new knob "reclaim under proc/<pid>/" so task
manager can reclaim any target process anytime, anywhere. It could
give another method to platform for using memory efficiently.
It can avoid process killing for getting free memory, which was
really terrible experience because I lost my best score of game I
had ever after I switch the phone call while I enjoyed the game.
—
Minchan Kim
Comments (none posted)
Kernel development news
By Michael Kerrisk
March 27, 2013
Linus Torvalds has railed frequently and loudly against kernel
developers breaking user space. But that rule is not ironclad; there
are exceptions. As Linus once noted:
But the "out" to that rule is that "if nobody notices, it's not
broken" […] So breaking user space is a bit like trees falling
in the forest. If there's nobody around to see it, did it really
break?
The story of how a kernel change caused a GlusterFS breakage shows
that there are sometimes unfortunate twists to those exceptions.
The kernel change and its consequences
GlusterFS is a widely-used, free,
scale-out,
distributed filesystem that is available on Linux and a number of other
UNIX-like systems. GlusterFS was initially developed by Gluster, Inc., but
since acquiring that company in 2011, Red Hat has mainly driven work on
the filesystem.
GlusterFS's problems sprang from an ext4 filesystem patch
by Fan Yong that addressed a long-standing issue in ext4's support for the
readdir() API by widening the "directory offset" values used by
the API from 32 to 64 bits. That change was needed to reliably support
readdir() traversals in large directories; we'll discuss those
changes and the reasons for making them in a
companion article. One point from that discussion is worth making here:
these "offset" values are in truth a kind of cookie, rather than a true
offset within a directory. Thus, for the remainder of this article, we'll
generally refer to them as "cookies". Fan's patch made its way into the
mainline 3.4 kernel (released in May 2012), but appears also to have been
ported into the 3.3.x kernel that was released with Fedora 17 (also
released in May 2012).
Fan's patch solved a problem for ext4, but inadvertently created one
for GlusterFS servers that use ext4 as their underlying storage
mechanism. However, nobody reported problems in time to cause the patch to
be reconsidered. The symptom on affected systems, as noted in a July 2012
Red Hat bug
report, was that using readdir() to scan a directory on a
GlusterFS system would end up in an infinite loop in some cases.
The cause of the problem—as detailed
by Anand Avati in a recent (March 2013) discussion on the ext4 mailing
list—is that GlusterFS makes some assumptions about the "cookies"
used by the readdir() API. In particular, although these values
are 64 bits long, the GlusterFS developers noted that only the lower 32
bits were used, and so decided to encode some additional
information—namely the index of the Gluster server holding the
file—inside their own internal version of the cookie, according to
this formula:
final_d_off = (ext4_d_off * MAX_SERVERS) + server_idx
This GlusterFS internal cookie is exchanged in the 64-bit cookie that
is passed in NFSv3 readdir() requests between GlusterFS clients and
front-end servers. (An ASCII art diagram
posted in the mailing list thread by J. Bruce Fields clarifies the
relationship of the various GlusterFS components.) The GlusterFS internal
cookie allows the server to easily encode the identity of the GlusterFS
storage server that holds a particular directory.
This scheme worked fine as long as only 32 bits were used in the ext4
readdir() cookies (ext4_d_off), but promptly blew up when
the cookies switched to using 64 bits, since the multiplication caused some
bits to be lost from the top end of ext4_d_off.
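A rough demonstration of the arithmetic (this is illustrative Python, not GlusterFS code; MAX_SERVERS and the sample cookie values are invented):

MAX_SERVERS = 16                 # invented value, for illustration only
MASK64 = (1 << 64) - 1

def encode(ext4_d_off, server_idx):
    # GlusterFS-style packing; the result must still fit in 64 bits
    return ((ext4_d_off * MAX_SERVERS) + server_idx) & MASK64

def decode(final_d_off):
    return final_d_off // MAX_SERVERS, final_d_off % MAX_SERVERS

for cookie in (0x12345678,              # only 32 bits used: round-trips fine
               0x123456789abcdef0):     # high bits in use: they fall off the top
    d_off, idx = decode(encode(cookie, 3))
    status = 'ok' if (d_off, idx) == (cookie, 3) else 'corrupted'
    print('%#x -> %#x (server %d): %s' % (cookie, d_off, idx, status))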
An August 2012 gluster.org blog
post by Joe Julian pointed out that the problem affected not only
Fedora 17's 3.3 kernel, but also the kernel in Red Hat's Enterprise Linux
distribution, because the kernel change had been backported into the much
older 2.6.32 distribution kernel supplied in RHEL 6.3 and later.
The recommended workaround was either to downgrade
to an earlier kernel version that did not include the patch or
to reformat the GlusterFS bricks (the fundamental storage unit on a
GlusterFS node) to use XFS instead of ext4. (Using XFS rather than ext4 had
already been recommended practice when using GlusterFS.) Needless to say,
neither of these solutions was easily practicable for some GlusterFS users.
Mitigating GlusterFS's problem
In his March 2013 mail, Anand bemoaned the fact that the manual pages
gave no indication that the readdir() API "offsets" were cookies
rather than something like a conventional file offset whose range might
bounded. Indeed, the manual pages rather hinted towards the latter
interpretation. (That, at least, is a problem that is now addressed.)
Anand went on to request a fix to the problem:
You can always say "this is your fault" for interpreting the man
pages differently and punish us by leaving things as they are (and
unfortunately a big chunk of users who want both ext4 and gluster
jeopardized). Or you can be kind, generous and be considerate to
the legacy apps and users (of which gluster is only a subset) and
only provide a mount option to control the large d_off behavior.
But, as the ext4 maintainer, Ted Ts'o, noted, Fan's patch addressed a real problem
that affected well-behaved applications that did not make mistaken
assumptions about the value returned by telldir(). Adding a mount
option that nullified the effect of that patch would affect all programs
using a filesystem and penalize those well-behaved applications by
exposing them to the problem that the patch was designed to fix.
Ted instead proposed another approach: a per-process setting that
allowed an application to request the older readdir() cookie
semantics. The advantage of that approach is that it provides a solution
for applications that misuse the cookie without penalizing applications
that do the right thing. This solution could, he said, take the form of an ext4-specific
ioctl() operation employed immediately after calling
opendir(). Anand thought that
should be a workable solution for GlusterFS. The requisite patch does not
yet seem to have appeared, but one supposes that it will be written and
submitted during the 3.10 merge window, and possibly backported into
earlier stable kernels.
So, a year after the ext4 kernel change broke GlusterFS, it seems that
a (kernel) solution will be found to address GlusterFS's difficulties. In
passing, it's probably fair to mention that one reason that the (proposed)
fix took so long in coming was that the GlusterFS developers initially
thought they might be able to work around the kernel change by making
changes in GlusterFS. However, it ultimately turned
out to be impossible to exchange both a full 64-bit readdir()
cookie and a GlusterFS storage server ID in the NFS readdir()
requests exchanged between GlusterFS clients and front-end servers.
Summary: the meta-problem
In the end, the GlusterFS breakage might have been
avoided. Ted's proposed fix could have been rolled out at the same time
as Fan's patch, so as to minimize any disruptions for GlusterFS
users. Returning to Linus's quote at the beginning of this article puts us
on the trail of a deeper problem.
"If there's nobody around to see it, did it really break?"
was Linus's rhetorical question. The problem is that this is a test whose
results can be rather arbitrary. Sometimes, as was the case in the implementation
of EPOLLWAKEUP, a kernel change that causes a minor breakage
in a user-space application that is doing strange things will be reverted
or modified because it is fortuitously spotted by someone close to the
development scene—namely, a kernel developer who notices a
misbehavior on their desktop system.
However, other users may be so far from the scene of change that it can
be a considerable time before they see a problem. By the time those users
detect a user-space breakage, the corresponding stable kernel may already
be several release cycles in the past. One can easily imagine that few
kernel developers are running a GlusterFS node on their development
systems. Conversely, one can imagine that most users of GlusterFS are
running production environments where stability and uptime are critical,
and testing an -rc kernel is neither practical nor a high priority.
Thus, a rather important user-space breakage was missed—one that,
if it had been detected, would almost certainly have triggered modification
or reversion of the relevant patches, or stern words from Linus in the face
of any resistance to making such changes. And, certainly, this is not a
one-off case. Your editor did not need to look too far to find another
example, where a change in the way that POSIX
message queue limits are enforced in Linux 3.5 led to a report
of breakage in a database engine nine months later.
The "if there's nobody around to see it" metric requires that someone
is looking. That is of course a strong argument that the developers of
user-space applications such as GlusterFS who want to ensure that their
applications keep working on newer kernels must vigilantly and thoroughly
test -rc kernels. Clearly that did not happen.
However, it seems a little unfair to place the blame solely on user
space. The ext4 modifications that affected GlusterFS clearly represented a
change to the kernel-user-space ABI (and for reasons that we describe in
our follow-up article, that change was clearly necessary). In cases such as
this (and the POSIX message queue change), perhaps even more caution was
warranted when making the change. At the very least, a loud announcement in
the commit message that the kernel changes represented a change to the ABI
would have been helpful; that might have jogged some reviewers to think
about the possible implications and resulted in the ext4 changes
being made in a way that minimized problems for GlusterFS. A greater
commitment on both sides to improving the documentation would also be
helpful. It's notable that even after deficiencies in the documentation
were mentioned as a contributing factor to the GlusterFS problem, no one sent a
patch to improve said documentation. All in all, it seems that parties on
both sides of the ABI could be doing a better job.
Comments (29 posted)
By Michael Kerrisk
March 27, 2013
In a separate article, we explained how
an ext4 change to the kernel-user-space ABI in Linux 3.4 broke the
GlusterFS filesystem; here, we look in detail at the change and why it was
needed. The change in question was a
patch by Fan Yong that widened the readdir() "cookies"
produced by ext4 from 32 to 64 bits. Understanding why Fan's patch was
necessary first requires a bit of background on the readdir() API.
The readdir API consists of a number of functions that allow
an application to walk through the entries in a directory list. The opendir()
function opens a directory stream for a specified directory. The readdir()
function returns the contents of a directory stream, one entry at a
time. The telldir()
and
seekdir() functions provide lseek-style functionality: an
application can remember its current position in a directory stream using
telldir(), scan further entries with readdir(), and then
return to the remembered position using seekdir().
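As a minimal sketch (not from the article; the directory is arbitrary and
error handling is omitted), the three calls combine like this: remember a
position, keep scanning, then jump back to it:

    /* Sketch: remembering and restoring a position in a directory stream. */
    #include <dirent.h>
    #include <stdio.h>

    int main(void)
    {
        DIR *dirp = opendir("/tmp");     /* any existing directory will do */
        struct dirent *de;
        long pos;

        readdir(dirp);                   /* consume the first entry */
        pos = telldir(dirp);             /* remember the current position */

        while ((de = readdir(dirp)) != NULL)
            printf("%s\n", de->d_name);  /* scan the remaining entries */

        seekdir(dirp, pos);              /* return to the remembered position */
        de = readdir(dirp);              /* yields the second entry again */
        if (de != NULL)
            printf("back to: %s\n", de->d_name);

        closedir(dirp);
        return 0;
    }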
It turns out that supporting the readdir API is a source of
considerable pain for filesystem developers. The API was designed in a
simpler age, when directories were essentially linear tables of filenames
plus inode numbers. The first of the widely used Linux filesystems, ext2,
followed that design. In such filesystems, one can meaningfully talk about
an offset within a directory table.
However, in the interests of improving performance and supporting new
features, modern filesystems (such as ext4) have long since adopted more
complex data structures—typically B-trees (PDF)—for
representing directories. The problem with B-tree structures, from the
point of view of implementing the readdir() API, is that the nodes
in a tree can undergo (sometimes drastic) rearrangements as entries are
added to and removed from the tree. This reordering of the tree renders the
concept of a directory "offset" meaningless. The lack of a stable offset
value is obviously a difficulty when implementing telldir() and
seekdir(). However, it is also a problem for the implementation of
readdir(), which must be done in such a way that a loop using
readdir() to scan an entire directory will return a list of all
files in the directory, without duplicates. Consequently,
readdir() must internally also maintain some kind of stable
representation of a position within the directory stream.
Although there is no notion of an offset inside a B-tree, the
implementers of modern filesystems must still support the
readdir API (albeit
reluctantly); indeed, support for the API is a POSIX
requirement. Therefore, it is necessary to find some means of supporting
"directory position" semantics. This is generally done by fudging the
returned offset value, instead returning an internally understood "cookie"
value. The idea is that the kernel computes a hash value that encodes some
notion of the current position in a directory (tree) and returns that value
(the cookie) to user space. A subsequent readdir() or
seekdir() will pass the cookie back to the kernel, at which point
the kernel decodes the cookie to derive a position within the directory.
Encoding the directory position as a cookie works, more or less, but
has some limitations. The cookie has historically been a 31-bit hash
value, because older NFS implementations could handle only 32-bit
cookies. (The hash is 31-bit because the off_t type used to
represent the information is defined as a signed type, and negative offsets
are not allowed.) In earlier times, a 31-bit hash was not too much of a
problem: filesystem limitations meant that directories were usually small, so
the chance that two directory entries would hash to the same value was
small.
However, modern filesystems allow for large directories—so large
that the chance of two files producing the same 31-bit hash is
significant. For example, in a directory with 2000 entries, the chance of a
collision is around 0.1%. In a directory with 32,768 entries (the
historical limit in ext2), the chance is somewhat more than 20%. (For the
math behind these numbers, see the Wikipedia article
on the Birthday Paradox.) Modern filesystems have much higher limits on
the number of files in a directory, with a corresponding increase in the
chance of hash collisions; in a directory with 100,000
entries, the probability is over 90%.
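Those figures follow from the standard birthday-paradox approximation,
p ≈ 1 - e^(-n(n-1)/2d), where n is the number of directory entries and
d = 2^31 is the number of possible cookie values. A quick sketch (not from
the article; compile with -lm) reproduces the numbers quoted above:

    /* Approximate probability of at least one cookie collision among n
     * directory entries hashed into a 31-bit space. */
    #include <math.h>
    #include <stdio.h>

    static double collision_probability(double n, double d)
    {
        return 1.0 - exp(-n * (n - 1.0) / (2.0 * d));
    }

    int main(void)
    {
        const double d = 2147483648.0;   /* 2^31 possible cookie values */
        const double sizes[] = { 2000.0, 32768.0, 100000.0 };

        for (int i = 0; i < 3; i++)
            printf("%7.0f entries: %5.1f%% chance of a collision\n",
                   sizes[i], 100.0 * collision_probability(sizes[i], d));
        return 0;
    }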
Two files that hash to the same cookie value can lead to problems when
using readdir(), especially on NFS. Suppose that we want to scan
all of the files in a directory. And suppose that two files, say
abc and xyz, hash to the same value, and that the
directory is ordered such that abc is scanned first. When a
readdir() on the NFS client later reaches the file xyz, it will
receive a cookie that is exactly the same as for abc. Upon passing
that cookie back to the NFS server, the next readdir() will
commence at the file following abc. The NFS client code has some logic
to detect this situation; that logic causes readdir() to give the
(somewhat counter-intuitive) error ELOOP, "Too many levels of
symbolic links".
This error can be fairly easily reproduced on NFS with older
kernels. One simply has to create an ext4 directory containing enough
files, mount that directory over NFS, and run any program that performs a
readdir() loop over the directory on the NFS client. When working
with a local filesystem (no NFS involved), the same problem exists, but in
a different form. One does not encounter it when using readdir(),
because of the way in which that function is implemented on top of the
getdents() system call. Essentially, opendir() opens a
file descriptor that is used by getdents(); the kernel is able to
internally associate a directory position with that file descriptor, so
cookies play no part in the implementation of readdir(). By
contrast, because NFS is stateless, each
readdir() over NFS requires that the NFS server explicitly locate
the directory position corresponding to the cookie sent by the client.
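To make the opendir()/getdents() relationship concrete, the following sketch
(not from the article) walks a directory using the raw getdents64() system
call; the structure layout is declared by hand, as in the getdents(2) manual
page, because the kernel does not export it to user space. The per-entry
d_off field is the "offset" cookie under discussion:

    /* Sketch: what readdir() does under the hood via getdents64(). */
    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    struct linux_dirent64 {             /* declared by hand; see getdents(2) */
        uint64_t       d_ino;           /* inode number */
        int64_t        d_off;           /* the "offset" cookie for this entry */
        unsigned short d_reclen;        /* length of this record */
        unsigned char  d_type;          /* file type */
        char           d_name[];        /* null-terminated filename */
    };

    int main(void)
    {
        char buf[8192];
        int fd = open(".", O_RDONLY | O_DIRECTORY);
        long nread;

        while ((nread = syscall(SYS_getdents64, fd, buf, sizeof(buf))) > 0) {
            for (long pos = 0; pos < nread; ) {
                struct linux_dirent64 *d = (struct linux_dirent64 *)(buf + pos);
                printf("%20lld  %s\n", (long long)d->d_off, d->d_name);
                pos += d->d_reclen;
            }
        }
        close(fd);
        return 0;
    }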
On the other hand, the problem can be observed with a local ext4
filesystem when using telldir(), because that function explicitly
returns the directory "offset" cookie to the caller. If two directory
entries produce the same "offset" cookie when calling telldir(),
then a call to seekdir() after either of the telldir()
calls will go back to the same location. A user-space loop such as the
following easily reveals the problem, encountering a difficulty analogous
to a readdir() loop over NFS:
    DIR *dirp;
    struct dirent *dirent;

    dirp = opendir("/path/to/ext4/dir");
    while ((dirent = readdir(dirp)) != NULL) {
        ...
        seekdir(dirp, telldir(dirp));
        ...
    }
The seekdir(dirp, telldir(dirp)) call is a seeming no-op,
simply resetting the directory position to its current location. However,
where a directory entry hashes to the same value as an earlier
directory entry, the effect of the call will be to reset the directory
position to the earlier entry with the same hash. An infinite loop thus
results. Real programs would of course not use telldir() and
seekdir() in this manner. However, every now and then programs
that use those calls would obtain a surprising result: a seekdir()
would reposition the directory stream to a completely unexpected location.
Thus, the cookie collision problem needed to be fixed for the benefit
of both ext4 and (especially) NFS. The simplest way of reducing the
likelihood of hash collisions is to increase the size of the hash
space. That was the purpose of Fan's patch, which increased the size of the
hash space for the offset cookies produced by ext4 from 31 bits to 63. (A
similar
change has also been merged for ext3.) With a 63-bit hash space, even a
directory containing one million entries would have less than one chance in
four million of producing a hash collision. Of course, a corresponding
change is required in NFS, so that the NFS server is able to deal with the
larger cookie sizes. That change was provided in a
patch by Bernd Schubert.
Reading this article and the GlusterFS article together, one might
wonder why GlusterFS doesn't have the same problems with XFS that it has
with ext4. The answer, as noted by Dave
Chinner, is that XFS uses a rather different scheme to produce
readdir() cookies. That scheme produces cookies that require only
32 bits, and the cookies are produced in such a way as to guarantee that no
two files can generate the same cookie. XFS is able to produce unique
32-bit cookies due to the virtual mapping it overlays onto the directory
index; adding such a mapping to ext4 (which does not otherwise need it)
would be a large job.
Comments (29 posted)
By Jonathan Corbet
March 26, 2013
The world was a simpler place when the TCP/IP network protocol suite was
first designed. The net was slow and primitive and it was often a triumph
to get a connection to a far-away host at all. The machines at either end
of a TCP session normally did not have to concern themselves with how that
connection was made; such details were left to routers. As a result, TCP
is built around the notion of a (single) connection between two hosts. The
Multipath TCP (MPTCP) project looks
to change that view of networking by adding support for multiple transport
paths to the endpoints; it offers a lot of benefits, but designing a
deployable protocol for today's Internet is surprisingly hard.
Things have gotten rather more complicated in the years since TCP was first
deployed.
Connections to multiple networks, once the province of large server
systems, are now ubiquitous; a smartphone, for example, can have separate,
simultaneous interfaces to a cellular network, a WiFi network, and,
possibly, other networks via Bluetooth or USB ports. Each of those networks
provides a possible way to reach a remote host, but any given
TCP session will use only one of them. That leads to obvious policy
considerations (which interface should be used when) and operational
difficulties: most handset users are familiar with how a WiFi-based TCP
session will be broken if the device moves out of range of the access
point, for example.
What if a TCP session could make use of all of the available paths between
the two endpoints at any given time? There would be performance
improvements, since each of the paths could carry data in parallel, and
congested paths could be avoided in favor of faster paths at any given
time. Sessions could also be more robust. Imagine a video stream that is
established over both WiFi and cellular networks; if the watcher leaves the
house (one hopes somebody else is driving), the stream would shift
transparently to the cellular connection without interruption. Data
centers, where multiple paths between systems and variable congestion are
both common, could also make use of a multipath-capable transport protocol.
The problem is that TCP does not work that way. Enter MPTCP, which
is designed to work that way.
How it works
A TCP session is normally set up by way of a three-way handshake. The
initiating host sends a packet with the SYN flag set; the receiving host,
if it is amenable to the connection, responds with a packet containing both
the SYN and ACK flags. The final ACK packet sent by the initiator puts
the connection into the "established" state; after that, data can be
transferred in either direction.
An MPTCP session starts in the same way, with one change: the initiator
adds the new MP_CAPABLE option to the SYN packet. If the receiving host
supports MPTCP, it will add that option to its SYN-ACK reply; the two hosts
will also include cryptographic keys in these packets for later use. The
final ACK (which must also carry the MP_CAPABLE option) establishes a
multipath session, albeit a session using a single path just like
traditional TCP.
When MPTCP is in use, both sides recognize a distinction between the
session itself and any specific "subflow" used by that session. So, at
any point, either party to the session can initiate another TCP connection
to the other side, with the proviso that the address and/or port at one end or the
other of the connection must differ. So, if a smartphone has initiated an
MPTCP connection to a server using its WiFi interface, it can add another
subflow at any time by connecting to the same server by way of its cellular
interface.
That subflow is added by sending a SYN packet with the MP_JOIN option; it
also includes information on which MPTCP session is to be joined. Needless
to say, the protocol designers are concerned that a hostile party might try
to join somebody else's session; the previously-exchanged cryptographic
keys are used to prevent such attacks from succeeding. If the receiving
server is amenable to adding the subflow, it will allow the establishment
of the new TCP connection and add it to the MPTCP session.
Once a session has more than one subflow, it is up to the systems on each
end to decide how to split traffic between them (though it is possible to
mark a specific subflow for use only when any others no longer work). A
single receive window applies to the session as a whole. Each subflow
looks like a normal TCP connection, with its own sequence numbers, but the
session as a whole has a separate sequence number; there is another TCP
option (DSS, or "Data Sequence Signal") which is used to inform the other
end how data on each subflow fits into the overall stream.
Subflows can come and go over the life of an MPTCP connection. They can be
explicitly closed by either end, or they can simply vanish if one of the
paths becomes unavailable. If the underlying machinery is working well,
applications should not even notice these changes. Just as IP can hide
routing changes, MPTCP can hide the details of which paths it is using at
any given time. It should, from an application's point of view, just work.
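That transparency is easy to illustrate: the client below (a sketch, not from
the article; the address is a documentation placeholder) contains nothing
MPTCP-specific at all, yet with MPTCP-capable kernels on both ends the same
code would negotiate MP_CAPABLE and gain additional subflows without
modification:

    /* An ordinary TCP client; MPTCP, if available, is negotiated below the
     * socket API and is invisible to this code. */
    #include <arpa/inet.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in addr;

        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_port = htons(80);
        inet_pton(AF_INET, "192.0.2.1", &addr.sin_addr);  /* placeholder */

        if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0) {
            const char req[] = "GET / HTTP/1.0\r\n\r\n";
            char buf[4096];
            ssize_t n;

            write(fd, req, sizeof(req) - 1);
            while ((n = read(fd, buf, sizeof(buf))) > 0)
                fwrite(buf, 1, n, stdout);
        }
        close(fd);
        return 0;
    }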
Needless to say, there are vast numbers of details that have been glossed
over here. Making a protocol extension like this work requires thinking
about issues like congestion control, how to manage retransmissions over a
different path, how one party can tell the other about additional addresses
(paths) it could use, how to decide when setting up multiple subflows is
worth the expense,
and so on. The MPTCP designers have done much of that thinking; see
RFC 6824 for the details.
The dreaded middlebox
One set of details merits a closer look, though. The designers of MPTCP
are not interested in going through an idle academic exercise; they want to
create a solution to real problems that will be deployed on the existing
Internet. And that means designing something that will function with the
net as it exists now. At one level, that means making things work
transparently for TCP-based applications. But there is an entire section in
the RFC that is concerned with "middleboxes" and how they can sabotage
any attempt to introduce a new protocol.
Middleboxes are routers that impose some sort of constraint or
transformation on network traffic passing through them. Network address
translation (NAT) boxes are one example: they hide an entire network behind
a translation layer that will change the address and port of a connection
on its way through. NAT boxes can also insert data into a stream — adding
commands to make FTP work, for example. Some boxes will acknowledge data
on its way through, well before it arrives at the real destination, in an
attempt to increase pipelining. Some routers will drop packets with
unknown options; that behavior made the rollout of the selective
acknowledgment (SACK) feature much harder than it needed to be. Firewalls
will kill connections with holes in the sequence number stream; they will
also, sometimes, transform sequence numbers on the way through. Splitting
and coalescing of segments can cause options to be dropped or duplicated.
And so on; the list of potential problems is impressive.
On top of that, anybody trying to introduce an entirely new transport-layer
protocol is likely to discover that it will not make it across the Internet at all.
Much of the routing infrastructure on the net assumes that TCP and UDP are
all there is; anything else has a poor chance of making it through.
Working around these issues drove the design of MPTCP at all levels. TCP
was never designed for multiple subflows; rather than bolting that idea
onto the protocol, it might well have been better to start over. One could
have incorporated the lessons learned from TCP in all ways — including
doing things entirely differently where it made sense. But the resulting
protocol would not work on today's Internet, so the designers had no choice
but to create a protocol that, to almost every middlebox out there, looks
like plain old TCP.
So every subflow is an independent TCP connection in every respect. Since
holes in sequence numbers can cause problems, each subflow has its own
sequence numbers, and a mapping layer must be added on top. That mapping layer uses
relative sequence numbers because some middlebox may have changed those
numbers as they passed through. The two sides assign "address identifiers"
to the IP addresses of their interfaces and use those identifiers when
telling each other about available interfaces, since the addresses
themselves may be changed by a NAT box in the middle. Special checks exist
for subflows that corrupt data, insert
preemptive acknowledgments, or strip unknown options; such subflows will
not be used. And the whole thing is designed to fall back gracefully to
ordinary TCP if the interference is too strong to overcome.
It is all a clever bit of design on the part of the MPTCP developers, but
it also highlights an area of concern: the "dumb" Internet with end-to-end
transparent routing of data is a thing of the distant past. What we have
now is inflexible and somewhat hostile to the deployment of new technologies. The
MPTCP developers have been able to work around these limitations, but the
effort required was considerable. In the future, we may find that the net
is broken in fundamental ways and it simply cannot be fixed; some might say
that the difficulties in moving to IPv6 show that this has already
happened.
Future directions
The current MPTCP code can be found at the MPTCP github
repository; it adds a good 10,000 lines to the mainline kernel's
networking subtree. While it has apparently been the subject of
discussions with various networking developers, it has not, yet,
been posted for public review or inclusion into the mainline. It does,
however, seem to work: the MPTCP developers claim to have implemented the fastest TCP
connection ever by transmitting at a rate of 51.8Gb/s over six 10Gb
links.
MPTCP is still relatively young, so there is almost certainly quite a bit
of work yet to be done before it is ready for mainline merging or
production use. There is also some thinking to be done on the application
side; it may be possible for MPTCP-aware applications to make better use of
the available paths. Projects like this are arguably never finished (we are
still refining TCP, after all), but MPTCP does seem to have reached the
point where more users may want to start experimenting with it.
Anybody wanting to play with this code can grab the project's kernel
repository and build a custom kernel. For those who are not up to that
level of effort, the project offers a number of other
options, including a Debian repository, instructions for running MPTCP
on Amazon's EC2, and kernels for a handful of Android-based handsets.
Needless to say, the developers are highly interested in hearing bug
reports or other testing results.
Comments (70 posted)
Patches and updates
Kernel trees
- Thomas Gleixner: 3.8.4-rt1 (March 23, 2013)
- Sebastian Andrzej Siewior: 3.8.4-rt2 (March 27, 2013)
Page editor: Jonathan Corbet
Distributions
By Nathan Willis
March 27, 2013
A recent debate on the Fedora desktop list shined some light on
the occasionally awkward relationship between user interface design and
open source projects. The original issue was one of visual branding, in
particular where and how the distribution logo should be displayed on
the login screen. But the subsequent discussion revealed just how
quickly such questions can pivot into more substantial
issues—such as end-user support, the selection of system
components, and the easily entangled needs of upstream and downstream
projects.
Leggo my logo
Ryan Lerch wrote to the list on March 18, observing that in Fedora
19 the Fedora logo on the GDM login screen had been moved to the side
and reduced significantly in size. Lerch originally asked only why
the logo had been moved; in reply,
GNOME designer Allan Day said that GNOME had decided that the layout
used in Fedora 18 was causing problems. There was already a bug open
on the topic, and while Day agreed that the layout used in Fedora 19
looked wrong, simply reverting back to the older design was a
non-starter.
The problem with the old layout started with the fact that the
distribution logo sat directly above GDM's list of user accounts,
which put it in the way whenever the list was long and vertical space
ran short. Whether that means that the logo looked weird
if it was pushed to the top of the screen or if it was simply
impossible to place the logo statically (given that the user list can
change size) was not fully explained, but there were other visual
problems at issue, too—such as having the centered logo
sitting on top of the left-justified list of users.
Several ideas were bandied about. Eventually the solution that was
implemented in GNOME 3.8 test builds (and is slated for inclusion in
Fedora 19) moved the logo
to the upper-left-hand corner of the login screen, shrunken down to fit
within the confines of the menu bar. Lerch pointed to a screenshot
(see the Fedora
18 version for comparison). The result is virtually unreadable; the
Fedora logo includes text but it has the "infinity f" bubble floating
above it, too; the upshot is that when scaled down the text is half
the height of the date and time display. In addition to the size,
however, Lerch reported that placing
the logo in the menu bar was confusing, because it looked like an
interactive UI element (which is the case for everything else in the
menu bar).
The look, the feel of GNOME
On the bug report, Day commented
that GNOME's design team had decided that the distribution logo should
be dropped from the GDM login screen entirely, and that the
distribution name should be rendered as a text string in the menu
bar. He opened two additional bugs (695691 and 695692) to discuss where else the
distribution could place its branding elements.
But that solution did not sit well with the Fedora team. Jared
Smith commented that the change hurt
Fedora's branding. Fedora designer Máirín Duffy
asked how often GNOME expected there to be so many users on a system
that the GDM user list would need all of the screen space,
and asked for clarification on how the logo "visually clashes" with
the login screen, as an earlier bug
described it. "Removing the logo completely and replacing it
with a string is completely unacceptable from a Fedora point of view,
and I'm very surprised this is the suggested solution," she
said.
Day replied with additional detail
on the visual problems, explaining:
... the logo was felt to be a distracting presence. We've made an
effort to make sure that the most important elements are the most
visually prominent, and we want the primary interaction points to be
the ones that jump out at you. The logo was a strong visual presence
placed above the user list: this drew the eye to it, making it the
first thing you saw, and distracted you from the parts of the screen
that are actually useful to the user (ie. the user list).
His preference was to move the distribution branding to the corner
so as to "mitigate the negative impact of including a
logo while retaining a visual reference to the distributor," he
said, although he also agreed that Lerch's critique of the solution
deployed was valid.
But therein lies the root of the disagreement. Does the
distribution logo "negatively impact" the user's experience, or not?
The Fedora project members clearly regard branding the login screen to
be an important part of the overall user experience. Those on the GNOME
side argued that branding which grabs the user's attention makes the
user experience worse, and thus hurts the distribution. In fact, they
argued that any prominent logo was problematic—neither
Day nor anyone else from the GNOME team was advocating removing the
Fedora logo and replacing it with a GNOME logo. Cosimo Cecchi even asked why the login screen needs any
branding whatsoever, since he wants to get past the login screen as
quickly as possible, and on to his desktop.
Seth Vidal asked "So the question is this: Is the user installing
Fedora or are they installing Gnome? I think it is Fedora." Duffy
concurred; she responded that as a practical matter, where the user goes
when they encounter a problem is paramount; since Fedora users will
come to the Fedora community (not the GNOME project) for help,
reinforcing the Fedora brand is important.
Complicating the question is the fact that GNOME is the default
desktop environment in Fedora, but historically it has not been the
only option. Changes in the GNOME 3 era have seen
desktop-neutral Fedora components replaced with GNOME-specific ones,
which can marginalize or adversely affect other environments like Xfce.
Adam Williamson noted the replacement
of Fedora firstboot with gnome-initial-setup, and pointed out that GDM was "now a
special instance of GNOME Shell, strongly integrated with
GNOME." Vidal even suggested
that Fedora consider display managers other than GDM, but that idea
was not well received.
Another level of complication stems from the fact that many
developers are active participants in both projects, and many are paid
employees of Red Hat. As Colin Walters observed, even
if Red Hat does not dictate changes to Fedora, its developers must
keep Red Hat Enterprise Linux (RHEL) in mind while they work, since
Fedora serves as RHEL's upstream.
The hidden mysteries of design
Finally, the discussion also reveals how tricky it can be to merge
the work of software developers and user interface designers. At
times, the two camps do not even seem to speak the same language.
Design rarely results in something that can be read, diff'ed,
or checked in, so feedback from designers can at times be
frustratingly terse or opaque. Consider Day's comment
"the design is to have a string with the distributor name in the
top left hand corner." That reads like a final decision; one
could be forgiven for not seeing how to respond to it.
Duffy's comments, however, illustrate that the gap can be bridged.
Design is not the same as engineering, but solutions can be
researched, tested, and evaluated, which is good engineering practice,
and takes design out of the hard-to-grasp "pure aesthetic" realm and
integrates it with developing an actual product. She questioned the
"design" angle of removing the logo, saying:
I would like to see user data backing up the assertion that providing
the vendor logo a minimal amount of space on the login screen is harmful
to the user experience. I have seen remarks that it 'visually clutters'
the login screen, and is 'distracting,' but I'd like to see more than
personal opinions on this. [...]
I always strive to follow a design process that includes user research,
brainstorming, and iteration - user research can help identify problems
to solve; brainstorming and iteration involve coming up with solutions
to those problems; then you research again to see if you actually fixed
them.
Here I see iteration and I don't see user research.
Similarly, Lerch's observations that the Fedora 19 logo was
unreadably small and that its placement in the menu bar was easily
confused with an interactive element are both feedback from a
real-world user test (albeit an informal one). Distributions tend to
put branding in predictable places: boot manager, splash screen, login
screen, desktop wallpaper, system menus, and so forth. There may not
be a quantifiably optimal size and placement for the Fedora
logo (or any other user interface element), but testing is the only
way to adequately compare the imperfect solutions available.
For now the GDM login screen in GNOME 3.8 is a done deal; the
project has entered a freeze in preparation for the release of 3.8.0.
The good news is that Day and the other members of the design team are
open to releasing an update with 3.8.1. Fedora 19 is not scheduled
for release until late June 2013, which should be plenty of time to
try out a variety of possibilities and come up with something that
both upstream and downstream developers are satisfied to see while
they enter their passwords.
Comments (20 posted)
Brief items
Packages without bugs are packages that nobody has bothered to test... :)
--
Rich Freeman
Comments (none posted)
Canonical has
announced
a collaboration with the Chinese government to create a standard operating
system reference architecture based on the Ubuntu distribution. "
The
initial work of the CCN Joint Lab is focused on the development of an
enhanced version of the Ubuntu desktop with features specific to the
Chinese market. The new version is called Ubuntu Kylin and the first
version will be released in April 2013 in conjunction with Ubuntu’s global
release schedule. Future work will extend beyond the desktop to other
platforms."
Comments (15 posted)
Slackware and Arch Linux have announced the removal of MySQL in favor of
MariaDB. See the
Slackware and
Arch
announcements for details.
Comments (none posted)
Distribution News
Debian GNU/Linux
The Debian Systems Administration Team (DSA) has a few bits covering "the
last year or so". Topics include the Five Year Plan for hardware and
hosting, systems management, and account management.
Full Story (comments: none)
Newsletters and articles of interest
Comments (none posted)
The H
looks
at the latest release of FreeNAS. "
FreeNAS 8.3.1 introduces the ability to set up full disk encryption on ZFS volumes and several other smaller improvements. FreeNAS is a FreeBSD-based Network Attached Storage (NAS) distribution that enables users to easily set up and control their own storage and file servers."
Comments (none posted)
Page editor: Rebecca Sobol
Development
By Jake Edge
March 27, 2013
Introduced as needing no introduction, Python's creator and benevolent
dictator for life
(BDFL), Guido van Rossum, took the stage on March 17 for a PyCon 2013 keynote. One might expect
a high-level talk of the language's features and/or future from
the BDFL, but that was not at all the case here. Unlike many keynote
speakers, Van
Rossum launched into
a technical talk about a rather deep subject, while granting permission to
leave to
those recovering from the previous night's party. His topic was a single
feature in the future of Python 3: asynchronous I/O.
Van Rossum started looking into the problem after a post
on the python-ideas mailing list that was "innocently" proposing
changes to the asyncore module in the standard library. The subject of the
posting, "asyncore: included batteries don't fit" piqued his
interest, so he followed the thread, which grew to a "centithread" in two
weeks. He "decided to dive in", because he had done a lot of work recently
on the
asynchronous API for Google App Engine. Unlike previous times that
asynchronous I/O had come up, he now understood why people cared
and why it was so controversial.
Existing approaches
The basic idea behind asynchronous I/O is "as old as computers": the program
can do something else while it is waiting for I/O to complete. That is
unlike the normal operation of Python and other
languages, where doing an I/O operation blocks the program. There have been
lots of approaches to asynchronous I/O over the years,
including interrupts, threads, callbacks, and events.
Asynchronous I/O is desirable because I/O is slow and the CPU is not needed to
handle most of it, so it would be nice to use the CPU while the I/O is
being done.
When clicking a button for a URL, for example, asynchronous I/O would allow the
user interface to stay responsive, rather than giving the user a "beach
ball" until the "other server burps up the last byte" of its response.
The user could initiate another I/O operation by clicking on a new URL, so
there might be multiple outstanding I/O requests.
A common paradigm is to use threads for asynchronous I/O. Threads are
well-understood, and programmers can still write synchronous code because a
thread will just block when waiting for I/O, but the other threads will
still run. However, threads have their
limits, and operating system (OS) threads are somewhat costly. A program with
ten threads is fine, but 100 threads may start to
cause some worry. Once you get up to 1000 threads, you are "already in
trouble".
For example, handling lots of sockets is problematic. The OS
kernel imposes limits on the number of sockets, but those limits are
typically one or two orders of magnitude larger than the number of threads
that can be supported. That means you can't have a thread per connection
if you want to be able to support the maximum number of connections on the
system.
Beyond that, though, a "big problem" with OS threads is that they are
preemptively scheduled, which means
that a thread can be interrupted even if it isn't waiting for I/O. That
leads to problems with variables and data structures shared between
threads. Avoiding race conditions then requires adding
locks, but that can lead to lock contention which slows everything down.
Threads may be a reasonable solution in some cases, but there are tradeoffs.
The way to do asynchronous I/O without threads is by using select() and
poll(), which is the mechanism that asyncore uses. But asyncore
is showing its age, it isn't very extensible, and most people ignore it
entirely and write their own asynchronous code using select()
and poll(). There are various frameworks that can be used,
including Twisted, Tornado, or ZeroMQ. Most of the C
libraries (e.g. libevent, libev, libuv) that handle asynchronous I/O have
Python wrappers available, but
that gives them a "C-like API style". Stackless and gevent (along with a few others) provide
another set of alternatives.
And that is part of the problem: there are too many choices. "Nobody likes
callbacks" as an interface, or at least Van Rossum doesn't, and many of
the choices rely on that. The APIs tend to be complicated partly because
of the callbacks, and the
standard library doesn't cooperate, so it is of "very little use" when
using one of the solutions.
Advocates of gevent would claim that it solves all those problems, but
"somehow it doesn't do it" for him. There are some "scary implementation
details", including CPython- and x86-specific code. It does some "monkey patching" of
the standard library to make it "sort of work". It also does not avoid
the problem of knowing where the scheduler can switch tasks.
There is a specific call that gets made to cause task switches to
happen, but you never know when that may be called. Some library function
could be making that
call (or
may in the future), for example. It essentially is the same situation as with
OS threads. Beyond switching at unexpected times, there is also the
problem of not switching enough, which can cause some tasks to be CPU
starved. He "can't help" wanting to know if a particular line of code could
end up suspending the current task.
A new framework
So, Van Rossum is working on "yet another standard framework that is going
to replace all the other standard frameworks ... seriously", he said with
a chuckle—to applause. The framework will standardize the event loop.
There aren't too many choices for how to implement an event loop, he said.
The ones that exist are all doing more or less the same things.
The event loop is special because it serializes the execution of the
program. It guarantees that while your code is running, nothing else is,
and that the shared state cannot be changed until "you say you are done
with it", Van Rossum said. All of that implies that there should only be one
event loop in a program. You can't really mix a Twisted event loop and a
gevent event loop in the same program, which means that the existing
frameworks do not interoperate.
Van Rossum looked at the existing frameworks and their event loops to look
for commonality. The essential elements of an event loop are fairly
straightforward. There needs to be a way to start and stop the loop. Some
way to schedule a callback in the future (which might be "now") needs to be
available, as well as a way
to schedule repeated, periodic callbacks. The last
piece is a way to associate callbacks with file descriptors (or other OS
objects that represent I/O in some way) of interest. Depending on the OS
paradigm, those callbacks can be made when the file descriptor is "ready"
(Unix-like) or when it is "done" (Windows). There is also the need for the
framework to abstract away choosing the proper I/O multiplexing mechanism
(select(), poll(), epoll(), others) for the
system in an intelligent way.
The existing frameworks do not interoperate today, and each has its
strengths and weaknesses. Twisted is good for esoteric internet protocols, while
Tornado is well-suited for web serving, but making them work together is
difficult. There are various "pairwise" solutions for interoperability, but
there are lots of pairs that are not covered.
So, he has come up with Python Enhancement Proposal
(PEP) 3156 and a
reference implementation called Tulip. Using a slide of the xkcd comic on standards, Van Rossum
noted
that he was solving the problem of too many standards by adding a new
standard. But, he pointed out that PEP 3156 could actually be considered a
standard
because it will eventually end up in the standard library. That was greeted
with some laughter.
"I know this is madness", he said, as everyone has
their favorite framework. Suggestions to put Twisted in the standard
library or to "go back further in history" and adopt Stackless (along with
other
ideas) were floated in the original mailing list thread. He did not
completely make up his own framework, though, instead he looked at the
existing solutions and adopted pieces that he felt made sense. Certain
things from Twisted, particularly its higher-level abstraction for I/O
multiplexing (which works well for Windows), as well as its Transports and
Protocols, were adapted into Tulip.
So PEP 3156 is the interface for a standard event loop, while Tulip is an
experimental prototype that will eventually turn into a reference
implementation. Tulip will be available to use in Python 3.3, even after
it is incorporated "under a better name" into the standard library for
Python 3.4. Tulip will also serve as a repository for extra functionality
that doesn't belong in the standard library going forward.
PEP 3156 is not just an event loop API proposal, it also proposes an interface
to completely swap out the event loop. That means that
other frameworks could plug in their event loop using a conforming adaptor
and the user code would still work
because it makes Tulip/3156 calls. The hope is that eventually the frameworks
switch to using the standard event loop.
Callbacks without callbacks
There is even more to the PEP, to the point where some have suggested he
split it into two pieces, which he may still do.
The second major piece is a new way to write callbacks. Futures, a
mechanism for running asynchronous computations, was introduced in PEP 3148 and added in
Python 3.2. Tulip/3156 has adapted
Futures to be used with
coroutines as a way to specify callbacks, without actually using callbacks.
A Future is an abstraction for a value that has not yet been computed. The
Futures class used in Tulip is not exactly the same as the Python 3.2
version, because
instead of blocking when a result is required, as the earlier version does,
it must use ("drum roll please") the yield from statement
that came from PEP 380, which got
added in Python 3.3. It is "an incredibly cool, but also brain-exploding
thing", Van Rossum said.
While he wanted to emphasize the
importance of yield from, he wasn't going to teach it in the
talk, he said. The best way to think about it is that
yield from is a magic way to block your coroutine without
blocking the application. The coroutine will unblock and unsuspend when
the event it is waiting on completes (or is ready). The way to think about
Futures is to "try to forget they are there". A yield from
and a Future just kind of cancel out and the value is
what would be returned from the equivalent blocking function.
That is the "best I
can say it without bursting into tears", he said.
The fact that Futures have an API, with result() and
exception() methods, as well as callbacks, can largely be
ignored. One just calls a function that returns a Future and does a
yield from on the result. Error handling is simplified
compared to using callbacks because a normal try/except
block can be used around the yield from.
Coroutines are basically just generators, and the @coroutine
decorator is empty in the current Tulip code. It is purely there for the
human reader of the code, though there may be some debug code added
eventually. Coroutines by themselves do not give concurrency, it is the
yield from that drives the coroutine execution.
Van Rossum was running low on time, and said there was "lots more" he
could talk about. If the interoperability story fails, the xkcd comic comes
true, he said, but he is hopeful that over time the new framework "will
help us move
to a world where we can actually all get along". So that if someone finds
some code that uses Twisted and other code that uses gevent, both of which
are needed in their application, they will be able to use both.
"When can you have it?", he asked. The code and the PEP are very much in
flux right now. He is pushing hard to have something complete by November
23, which is the cutoff for Python 3.4. By then, Tulip should be available
in the PyPI repository. Once 3.4 is out the door, the rest of the standard
library can be looked at with an eye toward making them work with the
asynchronous framework. Some pieces (e.g. urllib, socketserver) will
likely need to be deprecated or will be emulated on top of PEP 3156.
Older Pythons (i.e. 2.7) are "out of luck". He has no plans to support
them, and hopes that the new framework serves as a carrot to move people to
3.3 (and beyond)—there are so many "silly things in older versions of the
language". After a round of acknowledgments, Van Rossum left the stage,
heading off to some code sprints scheduled as part of PyCon over the next
two days.
Comments (3 posted)
Brief items
The code itself was like a magnificent ASCII waterfall or those
Incan irrigation systems I often saw in AP Spanish. It worked, and
nothing I read on Google told me that things were actually
terribly wrong and that any programmer who read my code would
never let me nor my progeny near the web again.
—
Michelle
Bu (courtesy of a re-tweet by Garrett LeSage)
4. WAKE UP IN A PANIC TO THE SOUND OF MONSTERS IN YOUR ROOM.
5. After getting out of bed and changing your pants, realize
that after your computer restarted, Chrome helpfully re-opened all
of your tabs, including Netflix, and so it restarted playing the
episode of Supernatural that you watched before bed.
—
Michael
Lehenbauer, detailing the final two "steps to reproduce the
problem" in his Chrome bug report.
Comments (1 posted)
OpenSSH 6.2 is out. New features include some new encryption modes, the
ability to require multiple authentication protocols (requiring both public
key and a password, for example), key revocation list support, better
seccomp-filter sandbox support, and more.
Full Story (comments: 24)
The GCC 4.8.0 release is out. "
Extending the widest support for hardware architectures in the industry,
GCC 4.8 has gained support for the upcoming 64-bit ARM instruction set
architecture, AArch64. GCC 4.8 also features support for Hardware
Transactional Memory on the upcoming Intel Haswell CPU
architecture." There's a lot of new stuff in this release; see
the changes file and
LWN's GCC 4.8.0 coverage for details.
Full Story (comments: none)
Sebastian Sauer has
announced
the availability of the first version of the
Calligra office suite for Android
systems. For now, the focus is on providing a viewer for ODT documents.
"
Since bringing a whole Office suite to another platform is a huge
task and I am a small team I had to focus. Later on I plan to add doc/docx
support, editing, saving and Calligra Sheets (spreadsheets) and Calligra
Stage (presentations)." The application can be installed from
the
Play Store.
Comments (7 posted)
GTK+ 3.8.0 has been released. This version includes support for Wayland
1.0, and contains many new features and performance improvements.
Full Story (comments: 36)
Carsten "Rasterman" Haitzler has released version 0.3 of Terminology, an EFL-based terminal emulator billed as "the fanciest terminal emulator out there." The newest additions to Terminology's fanciful lineup include tabs, split mode, and the ability to play multimedia in the background via escape codes. Which does sound pretty fancy after all.
Full Story (comments: 10)
James Hunt has released version 1.8 of the Upstart init system. This version adds two new features: upstart-file-bridge, "a new bridge that allows jobs to react to file events," and upstart-monitor, a tool for watching event flows (and which includes both GUI and CLI modes).
Full Story (comments: none)
The GNOME 3.8 release is out. "
The exciting new features and
improvements in this release include a integrated application search,
privacy and sharing settings, notification filtering, a new classic
mode, OwnCloud integration, previews of clocks, notes, photos and
weather applications, and many more." See
the release notes
for details.
Full Story (comments: 120)
Newsletters and articles
Comments (none posted)
Rusty Russell
ran an
investigation to determine whether code compiled with the GCC C++
compiler is slower than code from the C compiler. "
With this in
mind, and Ian Taylor’s bold assertion that 'The C subset of C++ is as
efficient as C', I wanted to test what had changed with some actual
measurements. So I grabbed gcc 4.7.2 (the last release which could do
this), and built it with C and C++ compilers." His conclusion is
that the speed of the compiler is the same regardless of how it was built;
using C++ does not slow things down.
Comments (24 posted)
John Regehr
explains how
new optimizations in GCC 4.8.0 can break code making use of undefined
behavior. "
A C compiler, upon seeing d[++k], is permitted to assume
that the incremented value of k is within the array bounds, since otherwise
undefined behavior occurs. For the code here, GCC can infer that k is in
the range 0..15. A bit later, when GCC sees k<16, it says to itself: 'Aha--
that expression is always true, so we have an infinite loop.'"
Comments (71 posted)
The H has
an
extensive survey of available RSS reader applications, both open source
and proprietary. "
ownCloud is a complete self-hosted service
platform that provides file sharing and collaboration features including
calendaring, to do lists, a document viewer, and integration with Active
Directory and LDAP. The software also includes a feed reader application,
which started as a Google Summer of Code effort and takes many design cues
from Google Reader."
Comments (13 posted)
Page editor: Nathan Willis
Announcements
Brief items
Syrian software engineer Bassel Khartabil is the winner of this year's
Index on Censorship Digital Freedom Award, sponsored by Google.
"
Khartabil is a free internet pioneer who has spent his career
advancing open source technologies. On March 15, 2012, he was illegally
imprisoned in Syria. His family were given no official information about
why or where he was detained but have since learnt that he is being held at
the security branch of Kafer Sousa, Damascus."
Full Story (comments: none)
The Linux Foundation has
announced
the availability of its annual Enterprise End User Report. "
Because this is the third year we've surveyed the world's largest enterprises and The Linux Foundation's End User Council about Linux adoption, we're able to share some interesting trending data.
But perhaps most interesting is the opportunity to understand how Linux is outpacing Windows in server revenue in the enterprise. IDC's latest quarterly tracker shows Linux growing at 12.7 percent year-over-year while Windows is stagnating at 3.2 percent year-over-year growth. In fact, the quarter prior (Q312), Windows was actually in decline while Linux was growing."
Comments (1 posted)
The German newspaper taz.die tageszeitung (TAZ) has received this year's
Document Freedom award, presented by the Free Software Foundation Europe
(FSFE) and the Foundation for a Free Information Infrastructure (FFII).
"
The TAZ receives the Document Freedom award because it delivers its
electronic paper to its subscribers in a choice of open formats, and
without digital restrictions (DRM). "We are awarding the TAZ with the
Document Freedom Award for their longstanding commitment to Open Standards
and continuos efforts in offering their newspaper without restrictions"
says Erik Albers, Fellowship Coordinator Berlin."
Full Story (comments: none)
For those who could not attend PyCon US 2013,
videos from the
talks are now available.
Comments (1 posted)
Videos from the Red Hat/Fedora devconf.cz are
available on YouTube.
(Thanks to Scott Dowdle)
Comments (none posted)
Articles of interest
Perhaps the best description and analysis of the unfortunate events at
PyCon can be found in
this post from
Amanda Blum. In short, she concludes that everybody lost in this
incident.
Any comments posted should, please, have something new to say and
demonstrate the highest level of respect for others, whether or not you
agree with them.
See also: What
really happened at PyCon.
Comments (151 posted)
The Spanish association Hispalinux has filed a complaint against Microsoft
to the European Commission, Reuters
reports.
"
In its 14-page complaint, Hispalinux said Windows 8 contained an
"obstruction mechanism" called UEFI Secure Boot that controls the start-up
of the computer and means users must seek keys from Microsoft to install
another operating system. The group said it was "a de facto technological
jail for computer booting systems ... making Microsoft's Windows platform
less neutral than ever"." (Thanks to Pat Read)
Comments (18 posted)
Upcoming Events
Red Hat has
announced
the agenda for the Red Hat Summit, which takes place June 11-14 in
Boston, Massachusetts.
Comments (none posted)
Events: March 28, 2013 to May 27, 2013
The following event listing is taken from the
LWN.net Calendar.
| Date(s) | Event | Location |
| March 30 | Emacsconf | London, UK |
| March 30 | NYC Open Tech Conference | Queens, NY, USA |
| April 1-April 5 | Scientific Software Engineering Conference | Boulder, CO, USA |
| April 4-April 5 | Distro Recipes | Paris, France |
| April 4-April 7 | OsmoDevCon 2013 | Berlin, Germany |
| April 6-April 7 | international Openmobility conference 2013 | Bratislava, Slovakia |
| April 8 | The CentOS Dojo 2013 | Antwerp, Belgium |
| April 8-April 9 | Write The Docs | Portland, OR, USA |
| April 10-April 13 | Libre Graphics Meeting | Madrid, Spain |
| April 10-April 13 | Evergreen ILS 2013 | Vancouver, Canada |
| April 14 | OpenShift Origin Community Day | Portland, OR, USA |
| April 15-April 17 | Open Networking Summit | Santa Clara, CA, USA |
| April 15-April 17 | LF Collaboration Summit | San Francisco, CA, USA |
| April 15-April 18 | OpenStack Summit | Portland, OR, USA |
| April 16-April 18 | Lustre User Group 13 | San Diego, USA |
| April 17-April 18 | Open Source Data Center Conference | Nuremberg, Germany |
| April 17-April 19 | IPv6 Summit | Denver, CO, USA |
| April 18-April 19 | Linux Storage, Filesystem and MM Summit | San Francisco, CA, USA |
| April 19 | Puppet Camp | Nürnberg, Germany |
| April 22-April 25 | Percona Live MySQL Conference and Expo | Santa Clara, CA, USA |
| April 26 | MySQL® & Cloud Database Solutions Day | Santa Clara, CA, USA |
| April 27-April 28 | LinuxFest Northwest | Bellingham, WA, USA |
| April 27-April 28 | WordCamp Melbourne 2013 | Melbourne, Australia |
| April 29-April 30 | Open Source Business Conference | San Francisco, CA, USA |
| April 29-April 30 | 2013 European LLVM Conference | Paris, France |
| May 1-May 3 | DConf 2013 | Menlo Park, CA, USA |
| May 9-May 12 | Linux Audio Conference 2013 | Graz, Austria |
| May 14-May 15 | LF Enterprise End User Summit | New York, NY, USA |
| May 14-May 17 | SambaXP 2013 | Göttingen, Germany |
| May 15-May 19 | DjangoCon Europe | Warsaw, Poland |
| May 16 | NLUUG Spring Conference 2013 | Maarssen, Netherlands |
| May 22-May 25 | LinuxTag 2013 | Berlin, Germany |
| May 23-May 24 | PGCon 2013 | Ottawa, Canada |
| May 24-May 25 | GNOME.Asia Summit 2013 | Seoul, Korea |
If your event does not appear here, please
tell us about it.
Page editor: Rebecca Sobol