The GTK+ application toolkit is most closely associated with the
GNOME desktop, but it is used by a variety of non-GNOME environments
and applications as well. It even runs on non-Linux operating systems.
That level of diversity has at times fostered an unease about the
nature and direction of GTK+: is it a GNOME-only technology, or is it
a system-neutral tool with GNOME as its largest consumer? The subject
came up in several talks at GUADEC 2013, mingled in with other
discussions of the toolkit's immediate and long-term direction.
GTK+ 3.10
Matthias Clasen delivered a talk on the new features that are set
to debut in GTK+ 3.10 later this year, and he did so with an unusual
approach. Rather than build a deck of slides outlining the new
widgets and properties of the 3.10 release—slides which would
live in isolation from the canonical developer's
documentation—he wrote a tutorial in the GNOME
documentation and provided code samples in the GTK+
source repository.
The tutorial walked through the process of creating a "GNOME 3–style"
application using several of the newer classes, widgets, and support
frameworks. Clasen's example application was a straightforward
text-file viewer (essentially it could open files and count the number
of lines in each), but it integrated with GNOME 3's session bus, newer APIs (such as
the global "App Menu"), the GSettings configuration framework, animated
transition effects, and more.
Many of the widgets shown in the tutorial are new for 3.10, and up until now have
only been seen by users in previews of the new "core" GNOME
applications, like Clocks,
Maps,
or Web.
These new applications tend to be design-driven utilities, focusing on
presenting information (usually just one type of information) simply,
without much use for the hierarchical menus and grids of buttons one
might see in a complex editor.
But the stripped-down design approach has given rise to several new
user interface widgets, such as GtkHeaderBar,
the tall header/title bar with centered content that is visible in all of
the new core applications. Also new is the strip of toggle-buttons
that lets
the user switch between documents and views. These toggle-buttons are
a GtkStackSwitcher,
which is akin to the document tabs common in older applications. The
switcher is bound to a GtkStack,
which is a container widget that can hold multiple child widgets, but
shows only one at a time. In his example, Clasen showed how to open
multiple files, each as a child of the GtkStack. The
GtkStack–GtkStackSwitcher pair is not all that different from the
tabbed GtkNotebook of earlier GTK+ releases, but it has far fewer
properties to manage, and it can take advantage of new animated
transitions when switching between children. Clasen showed sliding
and crossfading transitions, and commented that they were made
possible by Owen Taylor's work on frame
synchronization.
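For readers who have not yet used the new classes, a minimal sketch of how a
GtkStack and GtkStackSwitcher fit together might look like the following. It
is not taken from Clasen's tutorial; the child widgets, names, and titles are
invented for illustration.

    /* Build with: gcc stack-demo.c $(pkg-config --cflags --libs gtk+-3.0) */
    #include <gtk/gtk.h>

    int main(int argc, char *argv[])
    {
        gtk_init(&argc, &argv);

        GtkWidget *window = gtk_window_new(GTK_WINDOW_TOPLEVEL);
        GtkWidget *box = gtk_box_new(GTK_ORIENTATION_VERTICAL, 0);
        GtkWidget *stack = gtk_stack_new();
        GtkWidget *switcher = gtk_stack_switcher_new();

        /* Animate changes between children with a sliding transition */
        gtk_stack_set_transition_type(GTK_STACK(stack),
                                      GTK_STACK_TRANSITION_TYPE_SLIDE_LEFT_RIGHT);

        /* Each opened "file" becomes a named, titled child of the stack */
        gtk_stack_add_titled(GTK_STACK(stack),
                             gtk_label_new("contents of the first file"),
                             "file1", "notes.txt");
        gtk_stack_add_titled(GTK_STACK(stack),
                             gtk_label_new("contents of the second file"),
                             "file2", "todo.txt");

        /* The switcher displays one toggle button per stack child */
        gtk_stack_switcher_set_stack(GTK_STACK_SWITCHER(switcher),
                                     GTK_STACK(stack));

        gtk_box_pack_start(GTK_BOX(box), switcher, FALSE, FALSE, 0);
        gtk_box_pack_start(GTK_BOX(box), stack, TRUE, TRUE, 0);
        gtk_container_add(GTK_CONTAINER(window), box);

        g_signal_connect(window, "destroy", G_CALLBACK(gtk_main_quit), NULL);
        gtk_widget_show_all(window);
        gtk_main();
        return 0;
    }

Swapping the transition type for GTK_STACK_TRANSITION_TYPE_CROSSFADE gives the
crossfading effect Clasen demonstrated.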
He also showed the new GtkSearchBar
widget, which implements a prefabricated search tool that drops down
from the GtkHeaderBar, making it simpler to add search functionality to an
application (and more uniform across the spectrum of GTK+
applications). He then added a sidebar to his example, essentially just to show off
the other two new widgets, GtkRevealer
and GtkListBox.
GtkRevealer is the animated container that slides (or fades) in to
show the sidebar, while GtkListBox is the sortable list container
widget.
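A similarly hypothetical fragment shows the GtkRevealer pattern for such a
sidebar: wrap the sidebar widget in a revealer, then toggle its "reveal-child"
property to trigger the animated slide.

    #include <gtk/gtk.h>

    /* Build a sidebar wrapped in a revealer; the list content is invented. */
    static GtkWidget *build_sidebar(void)
    {
        GtkWidget *revealer = gtk_revealer_new();
        GtkWidget *list = gtk_list_box_new();

        gtk_list_box_insert(GTK_LIST_BOX(list), gtk_label_new("Lines: 42"), -1);
        gtk_revealer_set_transition_type(GTK_REVEALER(revealer),
                                         GTK_REVEALER_TRANSITION_TYPE_SLIDE_RIGHT);
        gtk_container_add(GTK_CONTAINER(revealer), list);
        return revealer;
    }

    /* Called from, say, a toggle button: slide the sidebar in or out */
    static void toggle_sidebar(GtkRevealer *revealer)
    {
        gboolean shown = gtk_revealer_get_reveal_child(revealer);
        gtk_revealer_set_reveal_child(revealer, !shown);
    }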
The talk was not all widgets, though; Clasen also demonstrated how
the GSettings preferences system works, setting font parameters for
his example application, then illustrating how they could be changed
from within the UI of the application itself, or with the
gsettings command-line tool. He also showed how
glib-compile-resources can be used to bundle application
resources (such as icons and auxiliary files) into the binary, and how
GtkBuilder templates can simplify the creation of user interfaces.
All in all, the application he created from scratch was a simple one,
but it was well-integrated with GNOME's latest features and, he said,
only about 500 lines of C in total, with an additional 200 lines (of
XML) describing the user interface.
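As a rough illustration of the GSettings portion, binding a stored key to a
widget property is a single call. The schema id and key name below are
invented; a real application would install a matching schema compiled with
glib-compile-schemas.

    #include <gtk/gtk.h>

    /* Keep a font button in sync with a (hypothetical) "font" key in the
     * (hypothetical) org.example.textviewer schema. */
    static void bind_font_setting(GtkWidget *font_button)
    {
        GSettings *settings = g_settings_new("org.example.textviewer");

        /* Two-way binding: changing the button updates the stored value,
         * and external changes update the button. */
        g_settings_bind(settings, "font",
                        font_button, "font-name",
                        G_SETTINGS_BIND_DEFAULT);
    }

The same value could then be changed from a terminal with something like
"gsettings set org.example.textviewer font 'Monospace 12'", much like the
round trip Clasen demonstrated in his talk.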
What about Bob?
Clasen's talk brought application developers up to speed on the
latest additions to GTK+ itself, while two other sessions looked
further out, to the 3.12 development cycle and beyond. Emmanuele
Bassi is the maintainer of the Clutter toolkit, which is
used in conjunction with GTK+ by a few key projects, most notably
GNOME Shell and the Totem video player. His session dealt with the
recurring suggestions he hears from users and developers: either what
"Clutter 2.0" should do, or that Clutter should be merged into GTK+.
"This talk is less of a presentation, and more of an intervention" he
said.
Clutter uses OpenGL or OpenGL ES to render a scene graph;
interface elements are "actors" on the application's
ClutterStage. Actors can be easily (even implicitly)
animated, with full hardware acceleration. Clutter has been able to
embed GTK+ widgets, and GTK+ applications have been able to embed
Clutter stages, for several years. Nevertheless, as Bassi explained,
Clutter was never meant to be a generic toolkit, and he is not even
sure there should ever be a Clutter 2.0.
Originally, he said, Clutter was designed as the toolkit for a
full-screen media center application; it got adapted for other
purposes over its seven-year history, but most people who have used it
have ended up writing their own widget toolkit on top of Clutter
itself. That should tell you that it isn't done right, he said.
Today Clutter is really only used by GNOME Shell, he said. But
being "the Shell toolkit" is not a great position to be in, since
GNOME Shell moves so quickly and "plays fast and loose with the APIs."
There are two other Clutter users in GNOME, he added, but they use it
for specific reasons. Totem uses Clutter to display GStreamer videos
only "because putting a video on screen with GStreamer is such a
pain"—but that really just means that GStreamer needs to get its
act together. The virtual machine manager Boxes also uses Clutter, to
do animation of widgets.
So when it comes to Clutter's future, Bassi is not too interested
in creating a Clutter 2.0, because the current series already
implements all of the scene graph and animation features he wants it to (and the things it doesn't do yet would require
breaking the API). But the most common alternative
proposal—merging Clutter into GTK+—is not all that
appealing to him either. As he pointed out, other applications have implemented
their own widget toolkits on top of Clutter with little in the way of widespread success, using
libraries to "paper over" Clutter's problems. If you want to do
another, he said, "be my guest." At the same time, compositors like
GNOME Shell's Mutter have to "strip out a bunch of stuff" like the
layout engine. In addition, GTK+ already has its own layout, event
handling, and several other pieces that are duplicated in Clutter.
Offering both systems to developers would send a decidedly mixed
message.
Ultimately, though, Bassi does think that GTK+ needs to start
adding a scene graph library, which is the piece of Clutter that
everyone seems to want. But, he said, there is no reason it needs to be
called Clutter. "We can call it Bob," he suggested. But Bob needs
design work before it can be implemented, and he had several
suggestions to make. It should have some constraints, such as being
confined to GDK ("which sucks, but is still better than Qt," he
commented) as the backend. It should avoid input and event
handling, which do not belong in the scene graph. It should be 2D
offscreen surfaces blended in 3D space—using OpenGL "since
that's all we've got." It should not have a top-level actor (the
ClutterStage), since that was an implementation decision made
for purely historical reasons. And it must not introduce API breaks.
Considering those constraints separately, Bassi said, the scene
graph in Clutter is actually okay. Porting it over would require some
changes, but is possible. He has already started laying the
groundwork, he said, since the April 2013 GTK+ hackfest. He implemented a
GtkSceneGraph-3.0 library in GTK+, which passed its initial tests but
was not really doing anything. He also implemented the first steps of
adding OpenGL support to GDK: creating a GLContext and passing it down
to Cairo. There is much more to come, of course; several other core
GNOME developers had questions about Bassi's proposal, including how it
would impact scrolling, input events, custom widgets, and GTK+ support
on older GPU hardware. Bassi explains a bit more
on the GNOME wiki, but the project is certain to remain a hot topic
for some time to come.
Whose toolkit is it anyway?
Last but definitely not least, Benjamin Otte presented a session on
the long-term direction of GTK+, in particular the technical features
it needs to add and the all-important question of how it defines its
scope. That is, what kind of toolkit is GTK+ going to be? How will
it differ from Qt, HTML5, and all of the other toolkits?
On the technical front, he cited two features which are
repeatedly requested by developers: the Clutter-based scene graph
based on Bassi's work mentioned above, and gesture support. The scene
graph is important because GTK+'s current drawing functions make it
difficult to tell what element the cursor is over at any moment.
Making each GTK+ widget a Clutter-based actor would make that
determination trivial, and provide other features like making widgets
CSS-themable. Gesture support involves touch detection and gesture
recognition itself (i.e., defining a directional "swipe" that can be
bound to an action); Otte noted that GTK+'s existing input support is
essentially just XInput.
The bigger part of the talk was spent examining what Otte called
the "practical" questions: defining what GTK+ is meant to be and what
it is not. His points, he stated at the outset, do not
represent what he personally likes, but are the result of many
conversations with others. They already form the de facto
guidance for GTK+ development, he said; he was simply putting them out
there.
The first point is OS and backend support: which OSes will GTK+
support, and how well? The answer is that GTK+ is primarily intended
to be used on the GNOME desktop, using X11 as the backend. Obviously
it is transitioning to Wayland while supporting X11, which has forced
developers to work in a more abstract manner than they might have
otherwise. That makes this a good time for any
interested parties to write their own backends (say, for Android or
for something experimental). But the fact remains that in the absence of new developers, the
project will make sure that features work right on X11 and Wayland,
and will do its best to support them on other platforms. For
example, Taylor's frame synchronization is implemented natively for X11, and
the timer mechanism can only be approximated on Mac OS X, but it
should work well enough.
Similarly, he continued, GTK+ is targeting laptops as the device
form factor, with other form factors (such as phones, or development
boards without FPUs) often requiring some level of compromise.
Desktops are "laptop-like," he said, particularly when it comes to CPU
power and screen size.
Laptops also dictate that "keyboard and mouse" are the target input
devices. Touchscreen support will hopefully arrive in the future, he
said, but only as touchscreens become more common on laptops.
These decisions lead into the bigger question of whether GTK+ seeks
to be its own platform or to be a neutral, "integrated" toolkit. For
example, he said, should a GTK+ app running on KDE be expected to look
like a native KDE app? His answer was that GTK+ must focus on being
the toolkit of the GNOME platform first, and tackle integration
second. The project has tried to keep cross-platform compatibility,
he said. For example, the same menus will work in GNOME, Unity, and
KDE, but the primary target platform is GNOME.
Finally, he said, people ask whether GTK+ is focused on creating
"small apps" or "large applications," and his answer is "small apps."
In other words, GTK+ widgets are designed to make it easy and fast to
write small apps for GNOME: apps like Clocks, rather than GIMP
or Inkscape. The reality of it is, he said, that large applications
like GIMP, Inkscape, Firefox, and LibreOffice typically write large
numbers of custom widgets to suit their particular needs. If GTK+
tried to write a "docking toolbar" widget, the odds are that GIMP
developers would complain that it did not meet their needs, Inkscape
developers would complain that it did not meet their needs either, and
no one else would use it at all.
An audience member asked what Otte's definitions of "small" and
"large" are, to which he replied that it is obviously a spectrum and
not a bright line. As a general rule, he said, if the most
time-consuming part of porting an application to a different platform
is porting all of the dialog boxes, then it probably qualifies as
"large." Then again, he added, this is primarily a matter of
developer time: if a bunch of new volunteers showed up this year
wanting to extend GTK+ to improve the PiTiVi video editor, then a year
from now GTK+ would probably have all sorts of timeline widgets.
People often ask why they should port an application from GTK2 to
GTK3, Otte said. His answer historically was that GTK3 is awesome and
everyone should port, but he said he has begun to doubt that. The
truth is that GTK2 is stable and unchanging, even boring—but
that is what some projects need. He cited one project that targets
RHEL5 as its platform, which ships a very old version of GTK2.
Creating a GTK3 port would just cost them time, he said. The real
reason someone should port to GTK3 today, he concluded, is to take
advantage of the new features that integrate the application with
GNOME 3—but doing so means committing to keeping up with GNOME
3's pace of change, which is intentionally bold in introducing new features.
Eventually, he said, he hopes that GTK+ will reach a point where
the bold experiments are done. This will be after the scene graph and
gesture support, but it is hard to say when it will be. Afterward,
however, Otte hopes to make a GTK4 major release, removing all of the
deprecated APIs, and settling on a GTK2-like stable and unchanging
platform. The project is not there yet, he said, and notably it will
keep trying to be bold and add new things until application developers
"throw enough rocks" to convince them to stop. The rapidly-changing
nature of GTK3 is a headache for many developers, he said, but it has
to be balanced with those same developers' requests for new features
like gesture recognition and Clutter integration.
Otte's statements that GTK+ was a "GNOME first" project were
a frequent topic of debate during the rest of GUADEC. One audience
member even asked him during his talk whether this stance left out
other major GTK+-based projects like LXDE and Xfce. Otte replied that
he was not trying to keep those projects out; rather, since GNOME
developers do the majority of the GTK+ coding, their decisions push
GTK+ in their direction. If other projects want to influence GTK+, he
said, they need to "participate in GTK+ somehow," at the very least by
engaging with the development team to communicate what the projects want.
"What is GTK+" is an ongoing question, which is true of most free
software projects (particularly of libraries). There is no simple
answer, of course, but the frank discussion has benefits of its own,
for the project and for GTK+ developers. As the 3.10 releases of GTK+
and GNOME approach, both projects are at least still assessing how
their work can prove useful to other application developers.
[The author wishes to thank the GNOME Foundation for assistance
with travel to GUADEC 2013.]
August 8, 2013
This article was contributed by Josh Berkus
Through me pass into the site of downtime,
Through me pass into eternal overtime
Through me pass and moan ye in fear
All updates abandon, ye who enter here.
A decade ago, software deployments were something you did fairly
infrequently; at most monthly, more commonly quarterly, or even
annually. As such, pushing new and updated software was not something
developers, operations (ops) staff, or database administrators (DBAs) got much practice with. Generally, a deployment was a major downtime event, requiring the kind of planning and personnel NASA takes to land a robot on Mars ... and with about as many missed attempts.
Not anymore. Now we deploy software weekly, daily, even
continuously. And that means that a software push needs to become a
non-event, notable only for the exceptional disaster. This means that
everyone on the development staff needs to become accustomed to and familiar with the deployment drill and their part in it. However, many developers and ops staff — including, on occasion, me — have been slow to make the adjustment from one way of deployment to another.
That's why I presented "The Seven Deadly Sins of
Software Deployment [YouTube]" at OSCON Ignite on July 22. Each of the "sins" below is a chronic bad habit I've seen in practice, which turns what should be a routine exercise into a periodic catastrophe. While a couple of the sins aren't an exact match to their medieval counterparts, they're still a good check list for "am I doing this wrong?".
Sloth
Why do you need deployment scripts?
That's too much work to get done.
I'll just run the steps by hand,
I know I won't forget one.
And the same for change docs;
wherefore do you task me.
For info on how each step works,
when you need it you just ask me.
Scripting and documenting every step of a software deployment process
are, let's face it, a lot of work. It's extremely tempting to simply
"improvise" it, or just go from a small set of notes on a desktop
sticky. This works fine — until it doesn't.
Many people find out the hard way that nobody can remember a 13-step process in their head. Nor can they remember whether or not it's critical to the deployment that step four succeed, or whether step nine is supposed to return anything on success or not. If your code push needs to happen at 2:00AM in order to avoid customer traffic, it can be hard to even remember a three-step procedure.
There is no more common time for your home internet to fail, the VPN server to lose your key, or your pet to need an emergency veterinary visit than ten minutes before a nighttime software update. If the steps for the next deployment are well-scripted, well-documented, and checked into a common repository, one of your coworkers can just take it and run it. If not, well, you'll be up late two nights in a row after a very uncomfortable staff meeting.
Requiring full scripting and documentation has another benefit; it makes developers and staff think more about what they're doing during the deployment than they would otherwise. Has this been tested? Do we know how long the database update actually takes? Should the ActiveRecord update come before or after we patch Apache?
Greed
Buy cheap staging servers, no one will know:
they're not production, they can be slow.
They need not RAM, nor disks nor updates.
Ignore your QA; those greedy ingrates.
There's a surprising number of "agile" software shops out there who either lack staging servers entirely, or who use the old former production servers from two or three generations ago. Sometimes these staging servers will have known, recurring hardware issues. Other times they will be so old, or so unmaintained, they can't run the same OS version and libraries which are run in production.
In cases where "staging" means "developer laptops", there is no way to check for performance issues or for how long a change will take. Modifying a database column on an 8MB test database is a fundamentally different proposition from doing it on the 4 terabyte production database. Changes which cause new blocking actions between threads or processes also tend not to show up in developer tests.
Even when issues do show up during testing, nobody can tell for certain
if the issues are caused by the inadequate staging setup or by new
bugs. Eventually, QA staff start to habitually ignore certain kinds of
errors, especially performance problems, which makes doing QA at all an
exercise of dubious utility. Why bother to run response time tests if
you're going to ignore the results because the staging database is known to
be 20 times slower than production?
The ideal staging system is, of course, a full replica of your production setup. This isn't necessarily feasible for companies whose production includes dozens or hundreds of servers (or devices), but a scaled-down staging environment should be scaled down in an intelligent way that keeps performance at a known ratio to production. And definitely keep those staging servers running the exact same versions of your platform that production is running.
Yes, having a good staging setup is expensive; you're looking at spending at least ¼ as much as you spent on production, maybe as much. On the other hand, how expensive is unexpected downtime?
Gluttony
Install it! Update it! Do it ASAP!
I'll have Kernel upgrades,
a new shared lib or three,
a fat Python update
and four new applications!
And then for dessert:
Sixteen DB migrations.
If you work at the kind of organization where deployments happen relatively infrequently, or at least scheduled downtimes are once-in-a-blue-moon, there is an enormous temptation to "pile on" updates which have been waiting for weeks or months into one enormous deployment. The logic behind this usually is, "as long as the service is down for version 10.5, let's apply those kernel patches." This is inevitably a mistake.
As you add additional changes to a particular deployment, each change increases the chances it will fail somehow, both because each change has a chance of failure, and because layered application and system changes can mess each other up (for example, a Python update can cause an update to your Django application to fail due to API changes). Additional changes also make the deployment procedure itself more complicated and thus increase the chances of an administrator or scripting error, and you make it harder and more time-consuming to test all of the changes both in isolation or together. To make this into a rule:
The odds of deployment failure approach 100% as the number of distinct change sets approaches seven.
Obviously, the count of seven is somewhat dependent on your
infrastructure, nature of the application, and testing setup. However, even
if you have an extremely well-trained crew and an unmatched staging
platform, you're really not going to be able to tolerate many more distinct
changes to your production system before making failure all but certain.
Worse, if you have many separate "things" in your deployment, you've also made rollback longer and more difficult — and more likely to fail. This means, potentially, a serious catch-22, where you can't proceed because deployment is failing, and you can't roll back because rollback is failing. That's the start of a really long night.
The solution to this is to make deployments as small and as frequent as possible. The ideal change is only one item. While that goal is often unachievable, doing three separate deployments which change three things each is actually much easier than trying to change nine things in one. If the size of your update list is becoming unmanageable, you should think in terms of doing more deployments instead of larger ones.
Pride
Need I no tests, nor verification.
Behold my code! Kneel in adulation.
Rollback scripts are meant for lesser men;
my deployments perfect, as ever, again.
Possibly the most common critical deployment failure is when developers and administrators don't create a rollback procedure at all, let alone rollback scripts. A variety of excuses are given for this, including: "I don't have time", "it's such a small change", or "all tests passed and it looks good on staging". Writing rollback procedures and scripts is also a bald admission that your code might be faulty or that you might not have thought of everything, which is hard for anyone to admit to themselves.
Software deployments fail for all sorts of random reasons, up to and including sunspots and cosmic rays. One cannot plan for the unanticipated, by definition. So you should be ready for it to fail; you should plan for it to fail. Because when you're ready for something to fail, most of the time, it succeeds. Besides, the alternative is improvising a solution or calling an emergency staff meeting at midnight.
The rollback plan doesn't need to be complicated or comprehensive. If the deployment is
simple, the rollback may be as simple as a numbered list of steps on a
shared wiki page. There are two stages to planning to roll back properly:
- write a rollback procedure and/or scripts
- test that the rollback succeeds on staging
Many people forget to test their rollback procedure just like they test the original deployment. In fact, it's more important to test the rollback, because if it fails, you're out of other options.
Lust
On production servers,
These wretches had deployed
all of the most updated
platforms and tools they enjoyed:
new releases, alpha versions,
compiled from source.
No packages, no documentation,
and untested, of course.
The essence of successful software deployments is repeatability. When
you can run the exact same steps several times in a row on both development
and staging systems, you're in good shape for the actual deployment, and if it fails, you can roll back and try again. The cutting edge is the opposite of repeatability. If your deployment procedure includes "check out latest commit from git HEAD for library_dependency", then something has already gone wrong, and the chances of a successful deployment are very, very low.
This is why system administrators prefer known, mainstream packages and
are correct to do so, even though this often leads to battles with
developers. "But I need feature new_new_xyz, which is only in the
current beta!" is a whine which often precipitates a tumultuous staff
meeting. The developer only needs to make their stack work once (on their
laptop) and can take several days to make it work; the system administrator
or devops staff needs to make it work within minutes — several times.
In most cases, the developers don't really need the
latest-source-version of the platform software being updated, and this can be settled in the staff meeting or scrum. If they really do need it, then the best answer is usually to create your own packages and documentation internally for the exact version to be deployed in production. This seems like a lot of extra work, but if your organization isn't able to put in the time for it, it's probably not as important to get that most recent version as people thought.
Envy
I cannot stand meetings,
I will not do chat
my scripts all are perfect,
you can count on that.
I care only to keep clean my name
if my teammates fail,
then they'll take the blame.
In every enterprise, some staff members got into computers so that they wouldn't have to deal with other people. These antisocial folks will be a constant trial to your team management, especially around deployment time. They want to do their piece of the large job without helping, or even interacting with, anyone else on the team.
For a notable failed deployment at one company, we needed a network administrator to change some network settings as the first step of the deployment. The administrator did just that: he logged in, changed the settings, logged back out, and told nobody what he'd done. He then went home. When it came time for step two, the devops staff could not contact the administrator, and nobody still online had the permissions to check whether the network settings had been changed. Accordingly, the whole deployment had to be rolled back and tried again the following week.
Many software deployment failures can be put down to poor communication between team members. The QA people don't know what things they're supposed to test. The DBA doesn't know to disable replication. The developers don't know that both features are being rolled out. Nobody knows how to check if things are working. This can cause a disastrously bad deployment even when every single step would have succeeded.
The answer to this is lots of communication. Make doubly sure that everyone knows what's going to happen during the deployment, who's going to do it, when they're going to do it, and how they'll know when they're done. Go over this in a meeting, follow it up with an email, and have everyone on chat or VoIP conference during the deployment itself. You can work around your antisocial staff by giving them other ways to keep team members updated, such as wikis and status boards, but ultimately you need to impress on them how important coordination is. Or encourage them to switch to a job which doesn't require teamwork.
Wrath
When failed the deployment,
again and again and again they would try,
frantically debugging
each failing step on the fly.
They would not roll back,
but ground on all night,
"the very next time we run it
the upgrade will be all right."
I've seen (and been part of) teams which did everything else right. They scripted and documented, communicated and packaged, and had valid and working rollback scripts. Then, something unexpected went wrong in the middle of the deployment. The team had to make a decision whether to try to fix it, or to roll back; in the heat of the moment, they chose to press on. The next dawn found the devops staff still at work, trying to fix error after error, now so deep into ad-hoc patches that the rollback procedure wouldn't work if they tried to follow it. Generally, this is followed by several days of cleaning up the mess.
It's very easy to get sucked into the trap of "if I fix one more thing, I can go to bed and I don't have to do this over again tomorrow." As you get more and more into overtime, your ability to judge when you need to turn back gets worse and worse. Nobody can make a rational decision at two in the morning after a 15-hour day.
To fight this, Laura Thompson at Mozilla introduced the "three strikes" rule. This rule says: "If three or more things have gone wrong, roll back." While I was working with Mozilla, this saved us from bad decisions about fixing deployments on the fly at least twice; it was a clear rule which could be easily applied even by very tired staff. I recommend it.
Conclusion
To escape DevOps hell
avoid sin; keep to heart
these seven virtues
of an agile software art.
Just as the medieval seven deadly sins have seven virtues to counterbalance them, here are seven rules for successful software deployments:
- Diligence: write change scripts and documentation
- Benevolence: get a good staging environment
- Temperance: make small deployments
- Humility: write rollback procedures
- Purity: use stable platforms
- Compassion: communicate often
- Patience: know when to roll back
You can do daily, or even "continuous", deployments if you
develop good practices and stick to them. While not the totality of what
you need to do for more rapid, reliable, and trouble-free updates and
pushes, following the seven rules of good practice will help you avoid some of the common pitfalls which turn routine deployments into hellish nights.
For more information, see the video of my "The Seven Deadly Sins of
Software Deployment" talk, the slides
[PDF],
and verses. See
also the slides
[PDF] from Laura Thompson's excellent talk "Practicing Deployment", and
Selena Deckelmann's related talk: Mistakes Were Made [YouTube].
Security
At GUADEC 2013 in Brno, Czech
Republic, Stef Walter presented his
recent work to improve the security features of GNOME by removing
problematic—and frequently ignored—"security features."
The gist of Walter's approach is that interrupting users to force them
to make a security decision produces the wrong result most of the
time; far better is to try and determine the user's intent for the
task at hand, and design the application to work correctly without
intervention. This is a fairly abstract notion, but Walter presented
three concrete examples of it in action.
The users and the humans
He started off the session by tweaking the standard security
developer's notion of "the user." A "user," he said, is someone that
security people frequently get annoyed by; users click on the wrong
things, they fall for phishing attacks, and make plenty of other
mistakes. It is better to think of users in terms of "human beings,"
because "human beings" are active, creative, and use their computers
to do things—although they also get overwhelmed when faced with
too much information at once.
This is where security design enters into the picture. Humans' brains
filter out extraneous information, on a constant basis, as part of
making sense of the world. So developers should not be surprised when
those humans tune out or dismiss dialog boxes, for example. This means that
"if you force the user to be part of the security
system,"—primarily by forcing the user to make security
decisions—"you're gonna have a really bad time." He likened the
problem to a doctor that gives the patient all of the possible
treatment options: the patient will get frustrated and ask "what would
you do?" Software developers need to be prepared to make a strong
recommendation, rather than presenting all of the choices to the user.
Walter then had a few bits of wisdom to share from this approach to
security design. First, he said, the full extent of the humans'
involvement in security should be to identify themselves. You can ask them
for a password to prove who they are, but after that they should not
be interrupted with questions about security policy. Next, it is
important to remember that "professional users" are not different in
this regard. By "professionals" he seemed to mean developers, system
administrators, and others with knowledge of security systems. But
just because they have this knowledge does not mean they should be
interrupted.
That is because the worst possible time to ask the user to make a
risky decision is when they are in the middle of trying to do
something else, he said. "You're going to get results that are worse
than random chance."
Application to applications
For developers, Walter offered two design maxims. First:
Prompts are dubious, he said. If you are refactoring your
code and you see a user prompt, regard it with suspicion, asking if
you really need to prompt the user for a response. The end goal, he
said, should be to get rid of Yes/No prompts.
The second maxim follows from the first: Security prompts are
wrong. Or at least they are wrong 99% of the time or more, he
said. Sure, you ask for a password, but that is an identification
prompt, and passwords are an unfortunate fact of life. But prompts
that ask questions about security, like "Do you want to continue?" or
"Do you want to ignore this bad certificate?" are wrong. Furthermore,
he added, if you then make the user's choice permanent, you add insult
to injury.
He gave several examples of this bad design pattern, including the
all-too-familiar untrusted-certificate prompt from the web browser,
the "this software is signed by an untrusted provider" prompt from a
package manager, and an "a new update is available that fixes your
problem, please run the following command" prompt from Fedora's automatic bug reporting
tool.
The correct approach, he said, is instead to stop interrupting the user, let
the user take some action that expresses their intent, and then make a
decision based on that intent. In other words, figure out what the
user is trying to do, and design the software so that they can express
that intent while working.
A positive example in this regard is Android's Intents system,
which he said is ripe with potential for getting things wrong but which actually
gets it right. So, for example, the "file open" Intent could
prompt the user with a bad dialog of the form "Application X has
requested read/write access to file /foo/bar/baz. Continue? Disallow?"
But, instead, it just opens up the file chooser and lets the user
select the desired file. Thus the user gets asked to take a clear
action, rather than asked a security-policy question.
A second, theoretical example would be the potentially private
information in the Exif tags of a photo. If the user starts to upload
a photo, the wrong approach would be to interrupt with a dialog asking
if the user is aware that there is private information in the Exif
tags. The better approach is simply to show the information (e.g.,
geographic location and a detailed timestamp) with the photo and make it
easy to clear out the information with a button click.
The fix is in
Walter then showed off three new pieces of work he is developing to
improve just such security-interruption problems. The first is the
removal of untrusted-certificate prompts. This garnered a round of
applause from the audience, although they were a bit more skeptical of
Walter's solution, which is to simply drop the connection.
Dropping the connection is usually the correct behavior on the
browser's part, he said, since the certificate problem is either an attack or a
server-side misconfiguration. But there is one major class of
exception, he added: enterprise certificate authorities (CAs). In
these situations, an enterprise deploys an "anchor" certificate for
its network which is not known to browsers out of the box. By adding
support for managing enterprise CAs, GNOME can handle these situations
without bringing back the untrusted certificate prompt.
Walter's solution is p11-kit-trust,
which implements a shared "Trust Store" where any crypto library can
store certificates, blacklists, credentials, or other information, and
they will automatically be accessible to all applications. So far,
NSS and GnuTLS support the Trust Store already, with a temporary
workaround in place for OpenSSL and Java. Packages are already
available for Debian and Fedora. There are command-line tools for
administrators to add new certificates to the store, but there are not
yet GUI tools or documentation. The same tools, he said, should be
used for installing test certificates, personal or self-signed
certificates, and other use-cases encountered by "professional" users.
The second new project is a change to how applications store
passwords. Right now, gnome-keyring stores all passwords for
all applications, but Walter noted that this is really surprising to
users, particularly when they learn that any application can request
any other application's stored passwords. The user's expectation, he
said, is that passwords are "account data" and would be stored with
other account information for the application. That is true, he
observed, but it has not been done in practice because there is not a
reliable way to encrypt all of this per-application storage.
The solution is libsecret, which
applications can use to encrypt and store passwords with their other
account information. Libsecret uses the Linux kernel keyring to hold
a session key that the applications request to use for encrypting
their saved passwords. Normally this session key is derived at the
start of the session from the user's login password, but other values
can also be returned to applications for policy reasons. Returning a
blank key, Walter said, means "store your data in the clear," while
not returning any value means the application is not permitted to save
data.
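What that looks like from the application side is sketched below, using
libsecret's synchronous convenience API; the schema name and attribute are
invented for illustration, and real code would handle errors more carefully.

    #include <libsecret/secret.h>

    /* An application-defined schema; the name and attribute are illustrative. */
    static const SecretSchema example_schema = {
        "org.example.Mailer", SECRET_SCHEMA_NONE,
        {
            { "account", SECRET_SCHEMA_ATTRIBUTE_STRING },
            { NULL, 0 },
        }
    };

    static void save_password(const char *account, const char *password)
    {
        GError *error = NULL;

        /* Store in the default collection, keyed by the "account" attribute */
        secret_password_store_sync(&example_schema, SECRET_COLLECTION_DEFAULT,
                                   "Mail account password", password,
                                   NULL, &error,
                                   "account", account,
                                   NULL);
        if (error != NULL) {
            g_warning("storing password failed: %s", error->message);
            g_error_free(error);
        }
    }

    /* Returns a newly allocated password (free with secret_password_free())
     * or NULL if nothing matches. */
    static char *lookup_password(const char *account)
    {
        return secret_password_lookup_sync(&example_schema, NULL, NULL,
                                           "account", account,
                                           NULL);
    }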
The third new feature Walter is working on is the solution to a
GNOME annoyance, in which the user is prompted at login time for the
password, even if they have logged in via another method (such as
fingerprint, PIN, or auto-login). The cause of this re-authentication
is that GNOME needs the user password to decrypt secret data; the same
double-step occurs when a user is prompted once for their password
when unlocking an encrypted hard disk, and again when logging in to
the session.
Walter's solution is a pluggable authentication module (PAM) called
pam_unsuck that, again, relies on the kernel keyring. The
kernel keyring will hold the user's password after login so it can be
reused. If an account does not use any password to log in, a password
will be created for it and saved in hardware-protected storage (where
possible). He noted that the decision to use auto-login,
fingerprints, or PINs already constitutes the user's conscious choice
to use an authentication method less secure than a password. This
scheme allows them to make that decision; it just removes the
nuisance of being prompted for a password anyway.
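The module itself was not shown, but the kernel keyring facility it relies on
is the ordinary add_key()/keyctl() interface from libkeyutils. The program
below is a rough illustration of stashing a secret in the session keyring and
reading it back; it is not Walter's code.

    /* Build with: gcc keyring-demo.c -lkeyutils */
    #include <keyutils.h>
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        const char *secret = "correct horse battery staple";
        char buf[256];
        long len;

        /* Stash a value in the session keyring, much as a PAM module might
         * stash the login password for later reuse. */
        key_serial_t key = add_key("user", "demo:login-password",
                                   secret, strlen(secret),
                                   KEY_SPEC_SESSION_KEYRING);
        if (key < 0) {
            perror("add_key");
            return 1;
        }

        /* Later, another component in the session can read it back. */
        len = keyctl_read(key, buf, sizeof(buf) - 1);
        if (len < 0) {
            perror("keyctl_read");
            return 1;
        }
        if (len > (long)(sizeof(buf) - 1))
            len = sizeof(buf) - 1;
        buf[len] = '\0';
        printf("recovered %ld bytes from the keyring\n", len);
        return 0;
    }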
Walter ended the session by imploring developers to "go forth and
kill ... prompts." There are many more places where changing the
user-interruption paradigm can help GNOME craft a more secure system
overall, he said, by putting fewer security decisions in front of the
user.
[The author wishes to thank the GNOME Foundation for assistance
with travel to GUADEC 2013.]
Brief items
This whole issue of privacy is utterly fascinating to me. Who's ever heard
of this information being misused by the government? In what way?
— Larry Ellison, as quoted in The Register
You, an executive in one of those companies, can fight. You'll probably
lose, but you need to take the stand. And you might
win. It's time we
called the government's actions what it really is:
commandeering. Commandeering is a practice we're used to in wartime, where
commercial ships are taken for military use, or production lines are
converted to military production. But now it's happening in peacetime. Vast
swaths of the Internet are being commandeered to support this surveillance
state.
— Bruce Schneier has advice for internet company executives
This experience has taught me one very important lesson: without
congressional action or a strong judicial precedent, I would _strongly_
recommend against anyone trusting their private data to a company with
physical ties to the United States.
— Ladar Levison shuts down the Lavabit email service
Today, another secure email provider, Lavabit, shut down their system lest
they "be complicit in crimes against the American people." We see the
writing on the wall, and we have decided that it is best for us to shut down
Silent Mail now. We have not received subpoenas, warrants, security
letters, or anything else by any government, and this is why we are acting
now.
— Silent Circle shuts down its email service
On his blog, KDE hacker Martin Gräßlin issues a
call to action for free software developers to have their projects default to privacy-preserving operation.
"With informational self-determination every user has to be always aware of which data is sent to where. By default no application may send data to any service without the users consent. Of course it doesn't make sense to ask the user each time a software wants to connect to the Internet. We need to find a balance between a good usability and still protecting the most important private data.
Therefore I suggest that the FLOSS community designs a new specification which applications can use to tell in machine readable way with which services they interact and which data is submitted to the service. Also such a specification should include ways on how users can easily tell that they don't want to use this service any more."
New vulnerabilities
chrony: two vulnerabilities
| Package(s): | chrony |
| CVE #(s): | CVE-2012-4502, CVE-2012-4503 |
| Created: | August 12, 2013 |
| Updated: | September 18, 2013 |
Description (from the Red Hat bugzilla):
Chrony upstream has released 1.29 version correcting the following two security flaws:
* CVE-2012-4502: Buffer overflow when processing crafted command packets
When the length of the REQ_SUBNETS_ACCESSED, REQ_CLIENT_ACCESSES
command requests and the RPY_SUBNETS_ACCESSED, RPY_CLIENT_ACCESSES,
RPY_CLIENT_ACCESSES_BY_INDEX, RPY_MANUAL_LIST command replies is
calculated, the number of items stored in the packet is not validated.
A crafted command request/reply can be used to crash the server/client.
Only clients allowed by cmdallow (by default only localhost) can crash
the server.
With chrony versions 1.25 and 1.26 this bug has a smaller security
impact as the server requires the clients to be authenticated in order
to process the subnet and client accesses commands. In 1.27 and 1.28,
however, the invalid calculated length is included also in the
authentication check which may cause another crash.
* CVE-2012-4503: Uninitialized data in command replies
The RPY_SUBNETS_ACCESSED and RPY_CLIENT_ACCESSES command replies can
contain uninitialized data from stack when the client logging is disabled
or a bad subnet is requested. These commands were never used by chronyc
and they require the client to be authenticated since version 1.25.
cxf: denial of service
| Package(s): | cxf |
| CVE #(s): | CVE-2013-2160 |
| Created: | August 12, 2013 |
| Updated: | August 14, 2013 |
Description (from the Red Hat bugzilla):
Multiple denial of service flaws were found in the way StAX parser implementation of Apache CXF, an open-source web services framework, performed processing of certain XML files. If a web service application utilized the services of the StAX parser, a remote attacker could provide a specially-crafted XML file that, when processed by the application would lead to excessive system resources (CPU cycles, memory) consumption by that application.
mozilla: multiple vulnerabilities
| Package(s): | firefox |
| CVE #(s): | CVE-2013-1706, CVE-2013-1707, CVE-2013-1712 |
| Created: | August 14, 2013 |
| Updated: | August 14, 2013 |
Description (from the CVE entries):
Stack-based buffer overflow in maintenanceservice.exe in the Mozilla Maintenance Service in Mozilla Firefox before 23.0, Firefox ESR 17.x before 17.0.8, Thunderbird before 17.0.8, and Thunderbird ESR 17.x before 17.0.8 allows local users to gain privileges via a long pathname on the command line. (CVE-2013-1706)
Stack-based buffer overflow in Mozilla Updater in Mozilla Firefox before 23.0, Firefox ESR 17.x before 17.0.8, Thunderbird before 17.0.8, and Thunderbird ESR 17.x before 17.0.8 allows local users to gain privileges via a long pathname on the command line to the Mozilla Maintenance Service. (CVE-2013-1707)
Multiple untrusted search path vulnerabilities in updater.exe in Mozilla Updater in Mozilla Firefox before 23.0, Firefox ESR 17.x before 17.0.8, Thunderbird before 17.0.8, and Thunderbird ESR 17.x before 17.0.8 on Windows 7, Windows Server 2008 R2, Windows 8, and Windows Server 2012 allow local users to gain privileges via a Trojan horse DLL in (1) the update directory or (2) the current working directory. (CVE-2013-1712)
phpMyAdmin: multiple vulnerabilities
putty: multiple vulnerabilities
| Package(s): | putty |
| CVE #(s): | CVE-2013-4206, CVE-2013-4207, CVE-2013-4208, CVE-2013-4852 |
| Created: | August 12, 2013 |
| Updated: | September 30, 2013 |
Description (from the Debian advisory):
CVE-2013-4206:
Mark Wooding discovered a heap-corrupting buffer underrun bug in the
modmul function which performs modular multiplication. As the modmul
function is called during validation of any DSA signature received
by PuTTY, including during the initial key exchange phase, a
malicious server could exploit this vulnerability before the client
has received and verified a host key signature. An attack to this
vulnerability can thus be performed by a man-in-the-middle between
the SSH client and server, and the normal host key protections
against man-in-the-middle attacks are bypassed.
CVE-2013-4207:
It was discovered that non-coprime values in DSA signatures can
cause a buffer overflow in the calculation code of modular inverses
when verifying a DSA signature. Such a signature is invalid. This
bug however applies to any DSA signature received by PuTTY,
including during the initial key exchange phase and thus it can be
exploited by a malicious server before the client has received and
verified a host key signature.
CVE-2013-4208:
It was discovered that private keys were left in memory after being
used by PuTTY tools.
CVE-2013-4852:
Gergely Eberhardt from SEARCH-LAB Ltd. discovered that PuTTY is
vulnerable to an integer overflow leading to heap overflow during
the SSH handshake before authentication due to improper bounds
checking of the length parameter received from the SSH server. A
remote attacker could use this vulnerability to mount a local denial
of service attack by crashing the putty client.
python-glanceclient: incorrect SSL certificate CNAME checking
| Package(s): | python-glanceclient |
| CVE #(s): | CVE-2013-4111 |
| Created: | August 14, 2013 |
| Updated: | September 4, 2013 |
Description (from the openSUSE advisory):
This update of python-glanceclient fixed SSL certificate
CNAME checking.
ReviewBoard, python-djblets: multiple vulnerabilities
| Package(s): | ReviewBoard, python-djblets |
| CVE #(s): | |
| Created: | August 8, 2013 |
| Updated: | October 2, 2013 |
Description (from the Fedora advisory):
* Function names in diff headers are no longer rendered as HTML.
* If a user’s full name contained HTML, the Submitters list would render it as HTML, without
escaping it. This was an XSS vulnerability.
* The default Apache configuration is now more strict with how it serves up file attachments.
This does not apply to existing installations. See
http://support.beanbaginc.com/support/solutions/articles/... for
details.
* Uploaded files are now renamed to include a hash, preventing users from uploading malicious
filenames, and making filenames unguessable.
* Recaptcha support has been updated to use the new URLs provided by Google.
spice: denial of service
| Package(s): | spice |
| CVE #(s): | CVE-2013-4130 |
| Created: | August 12, 2013 |
| Updated: | September 4, 2013 |
Description (from the Red Hat bugzilla):
Currently, both red_channel_pipes_add_type() and red_channel_pipes_add_empty_msg() use plain RING_FOREACH() which is not safe versus removals from the ring within the loop body. Yet, when (network) error does occur, the current item could be removed from the ring down the road and the assertion in RING_FOREACH()'s ring_next() could trip, causing the process containing the spice server to abort.
A user able to initiate spice connection to the guest could use this flaw to crash the guest.
strongswan: denial of service
| Package(s): | strongswan |
| CVE #(s): | CVE-2013-5018 |
| Created: | August 14, 2013 |
| Updated: | August 23, 2013 |
Description (from the openSUSE advisory):
This update of strongswan fixed a denial-of-service
vulnerability that could be triggered by special XAuth
usernames and EAP identities.
swift: denial of service
| Package(s): | swift |
| CVE #(s): | CVE-2013-4155 |
| Created: | August 13, 2013 |
| Updated: | September 4, 2013 |
Description (from the Debian advisory):
Peter Portante from Red Hat reported a vulnerability in Swift.
By issuing requests with an old X-Timestamp value, an
authenticated attacker can fill an object server with superfluous
object tombstones, which may significantly slow down subsequent
requests to that object server, facilitating a Denial of Service
attack against Swift clusters.
vlc: unspecified vulnerability
| Package(s): | vlc |
| CVE #(s): | CVE-2013-3565 |
| Created: | August 12, 2013 |
| Updated: | August 14, 2013 |
Description (from the vlc 2.0.8 announcement):
2.0.8 is a small update that fixes some regressions of the 2.0.x branch of VLC.
2.0.8 fixes numerous crashes and dangerous behaviors.
xymon: unauthorized file deletion
| Package(s): | xymon |
| CVE #(s): | CVE-2013-4173 |
| Created: | August 12, 2013 |
| Updated: | August 14, 2013 |
Description (from the Mageia advisory):
A security vulnerability has been found in version 4.x of the
Xymon Systems & Network Monitor tool.
The error permits a remote attacker to delete files on the server
running the Xymon trend-data daemon "xymond_rrd".
File deletion is done with the privileges of the user that Xymon is
running with, so it is limited to files available to the userid
running the Xymon service. This includes all historical data stored
by the Xymon monitoring system.
Kernel development
Brief items
The current development kernel is 3.11-rc5,
released on August 11. Linus said: "Sadly, the numerology doesn't
quite work out, and while releasing
the final 3.11 today would be a lovely coincidence (Windows 3.11 was
released twenty years ago today), it is not to be. Instead, we have
3.11-rc5." Along with the usual fixes, this prepatch contains the
linkat() permissions change discussed
in the August 8 Kernel Page.
Stable updates: 3.10.6,
3.4.57, and 3.0.90 were released on August 11.
The 3.10.7,
3.4.58, and
3.0.91 updates are in the review process as
of this writing; they can be expected sometime on or after August 15.
All companies end up in the Open Source Internet Beam Of Hate at
some point or another, not always for good reason. I've felt that
heat myself a few times in the last few years, I know all too well
what it's like to be hated by the people you're trying to help.
— Jean-Baptiste Quéru
One of the properties that π is conjectured to have is that it is
normal, which is to say that its digits are all distributed evenly,
with the implication that it is a disjunctive sequence, meaning
that all possible finite sequences of digits will be present
somewhere in it. If we consider π in base 16 (hexadecimal), it is
trivial to see that if this conjecture is true, then all possible
finite files must exist within π. The first record of this
observation dates back to 2001.
From here, it is a small leap to see that if π contains all
possible files, why are we wasting exabytes of space storing those
files, when we could just look them up in π!
— The π filesystem
We seem to have reached the point in kernel development where
"security" is the magic word to escape from any kind of due process
(it is, in fact, starting to be used in much the same way the
phrase "war on terror" is used to abrogate due process usually
required by the US constitution).
— James Bottomley
It's disturbing to me that there are almost as many addresses from
people like Lockheed Martin, Raytheon Missile, various govt
agencies from various countries with access to the coverity db as
there are people who actually have contributed something to the
kernel in the past. (The mix is even more skewed when you factor in
other non-contrib companies like anti-virus vendors).
There's a whole industry of buying/selling vulnerabilities, and our
response is basically "oh well, we'll figure it out when an exploit
goes public".
— Dave Jones
Dan Siemon has posted
a
detailed overview of how the Linux network stack queues packets. "As of
Linux 3.6.0 (2012-09-30), the Linux kernel has a new feature
called TCP Small Queues which aims to solve this problem for TCP. TCP Small
Queues adds a per TCP flow limit on the number of bytes which can be queued
in the QDisc and driver queue at any one time. This has the interesting
side effect of causing the kernel to push back on the application earlier
which allows the application to more effectively prioritize writes to the
socket."
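The per-flow limit that TCP Small Queues enforces is exposed as a sysctl,
net.ipv4.tcp_limit_output_bytes. As a small illustration (not taken from
Siemon's article), the current value can be read straight out of /proc:

    #include <stdio.h>

    int main(void)
    {
        /* The tunable added along with TCP Small Queues in Linux 3.6 */
        FILE *f = fopen("/proc/sys/net/ipv4/tcp_limit_output_bytes", "r");
        unsigned long limit;

        if (f == NULL || fscanf(f, "%lu", &limit) != 1) {
            perror("tcp_limit_output_bytes");
            return 1;
        }
        fclose(f);
        printf("TCP Small Queues limit: %lu bytes per flow\n", limit);
        return 0;
    }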
Kernel development news
By Jonathan Corbet
August 14, 2013
Many LWN readers have been in the field long enough to remember the
year-2000 problem, caused by widespread use of two decimal digits to store
the year. Said problem was certainly overhyped, but the frantic effort to
fix it was also not entirely wasted; plenty of systems would, indeed, have
misbehaved had all those COBOL programmers not come out of retirement to
fix things up. Part of the problem was that the owners of the affected systems
waited until almost too late to address the issue, despite the fact that it
was highly predictable and had been well understood decades ahead of time.
One would hope that, in the free software world, we would not repeat this
history with another, equally predictable problem.
We'll have the opportunity to find out, since one such problem lurks over
the horizon. The classic Unix representation for time is a signed 32-bit
integer containing the number of seconds since January 1, 1970. This
value will overflow on January 19, 2038, less than 25 years from now.
One might think that the time remaining is enough to approach a fix in a
relaxed manner, and one would be right. But, given the longevity of many
installed systems, including hard-to-update embedded systems, there may be
less time for a truly relaxed fix than one might think.
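The arithmetic behind that date is easy to check. The sketch below is an
illustration, not anything from the kernel; it prints the last second a
signed 32-bit counter can represent, after which the value wraps around
to December 1901.

    /* When does a signed 32-bit time_t run out?  2^31 - 1 seconds after
     * the 1970 epoch, which gmtime() places at 03:14:07 UTC on
     * January 19, 2038. */
    #include <stdio.h>
    #include <stdint.h>
    #include <time.h>

    int main(void)
    {
        time_t last = (time_t)INT32_MAX;
        printf("Last 32-bit second: %s", asctime(gmtime(&last)));
        return 0;
    }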
It is thus interesting to note that, on August 12, OpenBSD developer Philip
Guenther checked in a patch to the OpenBSD
system changing the types of most time values to 64-bit quantities. With
64 bits, there is more than enough room to store time values far past
the foreseeable future, even if high-resolution (nanosecond-based) time
values are used. Once the issues are shaken out, OpenBSD will likely have
left the year-2038 problem behind; one could thus argue that they are well
ahead of Linux on this score. And perhaps that is true, but there are some
good reasons for Linux to proceed relatively slowly with regard to this
problem.
The OpenBSD patch changes types like time_t and clock_t
to 64-bit quantities. Such changes ripple outward quickly; for example,
standard types like struct timeval and
struct timespec contain time_t fields, so those
structures change as well. The struct stat passed to the
stat() system call also contains a set of time_t values.
In other words, the changes made by OpenBSD add up to one huge,
incompatible ABI change. As a
result, OpenBSD kernels with this change will generally not run binaries
that predate the change; anybody updating to the new code is advised to do
so with a great deal of care.
OpenBSD can do this because it is a self-contained system, with the kernel
and user space built together out of a single repository. There is little
concern for users with outside binaries; one is expected to update the
system as a whole and rebuild programs from source if need be. As a
result, OpenBSD developers are much less reluctant to break the kernel ABI
than Linux developers are. Indeed, Philip went ahead and expanded
ino_t (used to represent inode numbers) as well while he was at
it, even though that type is not affected by this problem. As long as
users testing this code follow the recommendations and start fresh with a full
snapshot, everything will still work. Users attempting to update an
installed system will need
to be a bit more careful.
In the Linux world, we are unable to simply drag all of user space forward
with the kernel, so we cannot make incompatible ABI changes in this way.
That is going to complicate the year-2038 transition considerably — all the
more reason why it needs to be thought out ahead of time. That said, not
all systems are at risk. As a general
rule, users of 64-bit systems will not have problems in 2038, since 64-bit
values are already the norm on such machines. The 32-bit x32 ABI was also designed with 64-bit time
values. So many Linux users are already well taken care of.
But users of the pure 32-bit ABI will run into trouble. Of course, there
is a possibility that there
will be no 32-bit systems in the wild 25 years from now, but history argues
otherwise. Even with its memory addressing limitations (a 32-bit processor
with the physical address extension feature will struggle to work with 16GB of
memory which, one assumes, will barely be enough to hold a "hello world"
program in 2038), a 32-bit system can perform a lot of useful tasks. There
may well be large numbers of embedded 32-bit systems running in 2038 that
were deployed many years prior. There will almost certainly be 32-bit
systems running in 2038 that will need to be made to work properly.
During a brief discussion on the topic last June, Thomas Gleixner described a possible approach to the problem:
If we really want to survive 2038, then we need to get rid of the
timespec based representation of time in the kernel altogether and
switch all related code over to a scalar nsec 64bit storage. [...]
Though even if we fix that we still need to twist our brains around
the timespec/timeval based user space interfaces. That's going to
be the way more interesting challenge.
In other words, if a new ABI needs to be created anyway, it would make
sense to get rid of structures like timespec (which split times
into two fields, representing seconds and nanoseconds) and use a simple
nanosecond count. Software could then migrate over to the new system calls
at leisure. Thomas suggested keeping the
older system call infrastructure in place for five years, meaning that
operations using the older time formats would continue to be directly
implemented by the kernel; that would prevent unconverted code from
suffering performance regressions. After that period passed, the
compatibility code would be replaced by wrappers around the new system
calls, possibly slowing the emulated calls down and providing an incentive for
developers to update their code. Then, after about ten years, the old
system calls could be deprecated.
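To illustrate what Thomas is suggesting, the fragment below folds a
seconds/nanoseconds pair into a single signed 64-bit nanosecond count,
much as the kernel's ktime_t already does internally; the structure and
helper names here are invented for the example. A signed 64-bit
nanosecond value covers roughly ±292 years, so it reaches comfortably
past 2038.

    /* Sketch of the "scalar nanoseconds" idea; ts64 and ts64_to_ns()
     * are illustration-only names, not kernel interfaces. */
    #include <stdint.h>
    #include <stdio.h>

    #define NSEC_PER_SEC 1000000000LL

    struct ts64 {
        int64_t tv_sec;
        long    tv_nsec;
    };

    static int64_t ts64_to_ns(const struct ts64 *ts)
    {
        return ts->tv_sec * NSEC_PER_SEC + ts->tv_nsec;
    }

    int main(void)
    {
        struct ts64 ts = { .tv_sec = 2, .tv_nsec = 500000000 };
        printf("%lld ns\n", (long long)ts64_to_ns(&ts));  /* 2500000000 */
        return 0;
    }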
Removal of those system calls could be an interesting challenge, though;
even Thomas suggested keeping them for 100 years to avoid making Linus
grumpy. If the system calls are to be kept up to (and past) 2038, some way
will need to be found to make them work in some fashion. John Stultz had
an interesting suggestion toward that end:
turn time_t into an unsigned value, sacrificing the ability to
represent dates before 1970 to gain some breathing room in the future.
There are some interesting challenges to deal with, and some software would
surely break, but, without a change, all software using 32-bit
time_t values will break in 2038. So this change may well be
worth considering.
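The breathing room on offer is easy to quantify: an unsigned 32-bit count
of seconds runs out early in 2106 rather than in 2038. The sketch below
(again, just an illustration) prints that date; it needs a 64-bit time_t
to hold the value being converted.

    /* How far does an unsigned 32-bit time_t reach?  2^32 - 1 seconds
     * after the epoch falls on February 7, 2106. */
    #include <stdio.h>
    #include <stdint.h>
    #include <time.h>

    int main(void)
    {
        time_t last = (time_t)UINT32_MAX;   /* needs a 64-bit time_t */
        printf("Last unsigned 32-bit second: %s", asctime(gmtime(&last)));
        return 0;
    }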
Even without legacy applications to worry about, making 32-bit Linux
year-2038 safe would be a significant challenge. The ABI constraints make
the job harder yet. Given that some parts of any migration simply cannot
be rushed, and given that some deployed systems run for many years, it
would make sense to be thinking about a solution to this problem now.
Then, perhaps, we'll all be able to enjoy our retirement without having to
respond to a long-predicted time_t mess.
Comments (61 posted)
By Jake Edge
August 14, 2013
Network port numbers are a finite resource, and each port number can only
be used by one application at a time. Ensuring that the "right"
application gets a particular port number is important because that
number is required by remote programs trying to connect to the program.
Various methods exist to reserve specific ports, but there are still ways
for an application to lose "its" port. Enter KPortReserve, a Linux Security Module (LSM)
that allows an administrator to ensure that a program gets its reservation.
One could argue that KPortReserve does not really make sense as an LSM—in
fact, Tetsuo Handa asked just that question in his RFC post proposing it.
So far, no one has argued that way, and Casey Schaufler took the opposite view, but the RFC has only
been posted to the LSM and kernel hardening mailing lists. The level of
opposition might rise if and when the patch set heads toward the mainline.
But KPortReserve does solve a real problem. Administrators can keep
specific ports out of the pool used for automatic port assignment
(i.e. the ports handed out when the bind() port number is zero) by
listing those ports, or ranges of ports, in the
/proc/sys/net/ipv4/ip_local_reserved_ports file. But that reservation
only guards against applications that let the kernel choose a port for
them. Programs that ask for a particular port will be allowed to grab
it—possibly at the expense of the administrator's choice. Furthermore,
if the port number is outside the privileged range (i.e. 1024 or
above), even unprivileged programs can allocate it.
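The problem is easy to demonstrate from user space. The sketch below is
a standalone illustration, not part of KPortReserve: it claims port
10000 with an explicit bind() call, and nothing in a stock kernel stops
an unprivileged process from doing so before the intended server starts.

    /* Any unprivileged process can claim an unused, unprivileged port
     * simply by asking for it. */
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <arpa/inet.h>
    #include <sys/socket.h>

    int main(void)
    {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in addr;

        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(10000);   /* a specific, unprivileged port */

        if (fd < 0 || bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
            perror("bind");
            return 1;
        }
        puts("port 10000 is now ours, intended for us or not");
        pause();                        /* hold the port */
        return 0;
    }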
There is at least one existing
user-space solution using portreserve, but it still suffers from race
conditions. Systemd has a race-free way to reserve ports, but it requires
changes to programs that will listen on those ports and is not
available everywhere, which is why Handa turned to a kernel-based solution.
The solution itself is fairly straightforward. It provides a
socket_bind() method in its struct security_operations to
intercept bind() calls, which checks the reserved list. An
administrator can write
some values to a control file (where, exactly, that control file
would live and the syntax it would use were being discussed in the thread) to
determine which ports are reserved and what program should be allowed to
allocate them. For example:
echo '10000 /path/to/server' >/path/to/control/file
That would restrict port 10,000 to only being used by the server program
indicated by the path. A
special "
<kernel>" string could be used to specify that
the port is reserved for kernel threads.
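For readers unfamiliar with how such a hook plugs in, the fragment below
sketches the general shape of a socket_bind() handler under the 3.x
struct security_operations interface. It is not Handa's code:
kpr_port_allowed() is a made-up stand-in for the lookup against the
administrator's list, and the check of which program is doing the
binding (the whole point of KPortReserve) is omitted.

    /* Hedged sketch of a socket_bind() hook; not the actual
     * KPortReserve implementation. */
    #include <linux/security.h>
    #include <linux/socket.h>
    #include <linux/in.h>

    /* Stand-in for a lookup against the administrator's reservations. */
    static bool kpr_port_allowed(u16 port)
    {
            return port != 10000;   /* pretend port 10000 is reserved */
    }

    static int kpr_socket_bind(struct socket *sock,
                               struct sockaddr *address, int addrlen)
    {
            u16 port = 0;

            if (address->sa_family == AF_INET &&
                addrlen >= sizeof(struct sockaddr_in))
                    port = ntohs(((struct sockaddr_in *)address)->sin_port);

            if (port && !kpr_port_allowed(port))
                    return -EACCES;         /* reserved for someone else */
            return 0;                       /* let the bind proceed */
    }

    static struct security_operations kpr_ops = {
            .name        = "kportreserve",
            .socket_bind = kpr_socket_bind,
    };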
Vasily Kulikov
objected to
specifying that certain programs could bind the port, rather than a user ID
or some LSM security context, but Schaufler disagreed, calling it "very 21st century
thinking". His argument is that using unrelated attributes to
govern port reservation could interfere with the normal uses of those
attributes:
[...] Android used (co-opted, hijacked) the
UID to accomplish this. Some (but not all) aspects of SELinux policy
in Fedora identify the program and its standing within the system.
Both of these systems abuse security attributes that are not intended
to identify programs to do just that. This limits the legitimate use
of those attributes for their original purpose.
What Tetsuo is proposing is using the information he really cares
about (the program) rather than an attribute (UID, SELinux context,
Smack label) that can be associated with the program. Further, he
is using it in a way that does not interfere with the intended use
of UIDs, labels or any other existing security attribute.
Beyond that, Handa noted that all of the
programs he is interested in for this feature are likely running as root.
While it would seem that root-controlled processes could be coordinated so
that they didn't step on each other's ports, there are, evidently,
situations where that is not so easy to arrange.
In his initial RFC, Handa wondered if the KPortReserve functionality should
simply be added to the Yama LSM. At the 2011 Linux Security Summit, Yama
was targeted as an LSM to hold
discretionary access control (DAC) enhancements, which port reservations
might be shoehorned into—maybe. But, then and since, there has been a
concern that Yama not become a "dumping ground" for unrelated
security patches. Thus, Schaufler argued, Yama is not the right place for
KPortReserve.
However, there is the well-known problem
for smaller, targeted LSMs: there is
currently no way to have more than one LSM active on any given boot of
the system. Handa's interest in Yama may partly be because it has, over
time, changed from a "normal" LSM to one that can be unconditionally
stacked, which means that it will be called regardless of which LSM is
currently active. Obviously, if KPortReserve were added to Yama, it would
likewise evade the single-LSM restriction.
But, of course, Schaufler has been working on another way around that
restriction for some time now. There have been attempts to stack (or chain
or compose) LSMs for nearly as long as they have existed, but none has ever
reached the mainline. The latest entrant, Schaufler's "multiple
concurrent LSMs" patch set, is now up to version 14. Unlike some
earlier versions, any of the existing LSMs (SELinux, AppArmor, TOMOYO, or
Smack) can now be arbitrarily combined using the technique. One would
guess it wouldn't be difficult to incorporate a single-hook LSM like
KPortReserve into the mix.
While there was some discussion of Schaufler's patches when they were
posted at the end of July—and no objections to the idea—it still is unclear
when (or if) we will see this capability in a mainline kernel. One senses
that we are getting closer to that point, and new single-purpose LSM ideas
crop up fairly regularly, but we aren't there yet. Schaufler will be
presenting his ideas at the Linux
Security Summit in September. Perhaps the discussion there will help
clarify the future of this feature.
Comments (4 posted)
By Jonathan Corbet
August 14, 2013
The kernel's lowest-level primitives can be called thousands of times (or
more) every second, so, as one might expect, they have been ruthlessly
optimized over the years. To do otherwise would be to sacrifice some of
the system's performance needlessly. But, as it happens, hard-won
performance can slip away over the years as the code is changed and gains
new features. Often, such performance loss goes unnoticed until a
developer decides to take a closer look at a specific kernel subsystem.
That would appear to have just happened with regard to how the kernel
handles preemption.
User-space access and voluntary preemption
In this case, things got started when Andi Kleen decided to make the
user-space data access routines — copy_from_user() and friends —
go a little faster. As he explained in the
resulting patch set, those functions were once precisely tuned
for performance on x86 systems. But then they were augmented with calls to
functions like might_sleep() and might_fault(). These
functions initially served in a debugging role; they scream loudly if they are
called in a situation where sleeping or page faults are not welcome. Since
these checks are for debugging, they can be turned off in a production kernel,
so the addition of these calls should not affect performance in situations
where performance really matters.
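In outline, the arrangement looks something like the fragment below;
this is a simplified sketch rather than the real x86 code, and
arch_copy_from_user() is a hypothetical stand-in for the hand-tuned
copy routine.

    /* Simplified sketch: the check runs before the optimized copy, so
     * whatever might_fault() costs is paid on every call. */
    static inline unsigned long
    sketch_copy_from_user(void *to, const void __user *from, unsigned long n)
    {
            might_fault();          /* debugging check; under voluntary
                                       preemption, also a scheduling point */
            return arch_copy_from_user(to, from, n);  /* hypothetical */
    }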
But, then, in 2004, core kernel developers started to take latency issues a
bit more seriously, and that led to an interest in preempting execution of
kernel code if a higher-priority process needed the CPU. The problem was
that, at that time, it was not exactly clear when it would be safe to preempt
a thread in kernel space. But, as Ingo Molnar and Arjan van de Ven
noticed, calls to might_sleep() were, by definition,
placed in locations where the code was prepared to sleep. So a
might_sleep() call had to be a safe place to preempt a thread
running in kernel mode. The result was the voluntary preemption patch set, adding a
limited preemption mode that is still in use today.
The problem, as Andi saw it, is that this change turned
might_sleep() and might_fault() into a part of the
scheduler; they are no longer
compiled out of the kernel if voluntary preemption is enabled. That, in
turn, has slowed down user-space access functions by (on his system) about
2.5µs for each call. His patch set does a few things to try to make the
situation better. Some functions (should_resched(), which is
called from might_sleep(), for example) are
marked __always_inline to remove the function calling overhead.
A new might_fault_debug_only() function goes back to the original
intent of might_fault(); it disappears entirely when it is not
needed. And so on.
Linus had no real objection to these patches, but they clearly raised a
couple of questions in his mind. One of his first comments was a suggestion that, rather than optimizing the
might_fault() call in functions like copy_from_user(), it
would be better to omit the check altogether. Voluntary preemption points are
normally used to switch between kernel threads when an expensive operation
is being performed. If a user-space access succeeds without faulting, it
is not expensive at all; it is really just another memory fetch. If,
instead, it causes a page fault, there will already be opportunities for
preemption. So, Linus reasoned, there is little point in slowing down
user-space accesses with additional preemption checks.
The problem with full preemption
To this point, the discussion was mostly concerned about voluntary
preemption, where a thread running in the kernel can lose access to the
processor, but only at specific spots. But the kernel also supports "full
preemption," which allows preemption almost anywhere that preemption has
not been explicitly disabled.
In the early days of kernel preemption, many users shied away
from the full preemption option, fearing subtle bugs. They may have been
right at the time, but, in the intervening years, the fully preemptible
kernel has become much more solid. Years of experience, helped by tools
like the locking validator, can work wonders that way. So there is little
reason to be afraid to enable full preemption at this point.
With that history presumably in mind, H. Peter Anvin entered the
conversation with a question: should
voluntary preemption be phased out entirely in favor of full kernel
preemption?
It turns out that there is still one reason to avoid turning on full
preemption: as Mike Galbraith put it, "PREEMPT munches
throughput." Complaints about the cost of full preemption have been
scarce over the years, but, evidently, it does hurt in some cases. As long
as there is a performance penalty to the use of full preemption, it is
going to be hard to convince throughput-oriented users to switch to it.
There would not seem to be any fundamental reason why full preemption
should adversely affect throughput. If the rate of preemption were high, there
could be some associated cache effects, but preemption should be a
relatively rare event in
a throughput-sensitive system. That suggests that something else is going
on. A clue about that "something else" can be found in Linus's observation that the testing of
the preemption count — which happens far more often in a fully preemptible
kernel — is causing the compiler to generate slower code.
The thing is, even if that is almost never taken, just the fact
that there is a conditional function call very often makes code
generation *much* worse. A function that is a leaf function with no
stack frame with no preemption often turns into a non-leaf function
with stack frames when you enable preemption, just because it had a
RCU read region which disabled preemption.
So configuring full preemption into the kernel can make
performance-sensitive code slower. Users who are concerned about latency may
well be willing to make that tradeoff, but those who want throughput will
not be so agreeable. The
good news is that it might be possible to do something about this problem
and keep both camps happy.
Optimizing full preemption
The root of the problem is accesses to the variable known as the
"preemption count," which can be found in the
thread_info structure, which, in turn
lives at the bottom of the kernel stack. It is not just a counter, though;
instead it is a 32-bit quantity that has been divided up into several
subfields:
- The actual preemption count, indicating how many times kernel code has
disabled preemption. This counter allows calls like
preempt_disable() to be nested and still do the right thing
(eight bits).
- The software interrupt count, indicating how many nested software
interrupts are being handled at the moment (eight bits).
- The hardware interrupt count (ten bits on most architectures).
- The PREEMPT_ACTIVE bit indicating that the current thread
is being (or just has been) preempted.
This may seem like a complicated combination of fields, but it has one
useful feature: the preemptability of the currently-running thread can be
tested by comparing the entire preemption count against zero. If any of
the counters has been incremented (or the PREEMPT_ACTIVE bit set),
preemption will be disabled.
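In C terms, the layout described above looks roughly like the following;
the exact bit positions vary a little between architectures and kernel
versions, so treat this as a sketch rather than a copy of the kernel's
headers.

    /* Approximate preempt_count layout: preemption count in the low
     * eight bits, softirq count above it, then the hardirq count. */
    #define PREEMPT_BITS    8
    #define SOFTIRQ_BITS    8
    #define HARDIRQ_BITS    10

    #define PREEMPT_SHIFT   0
    #define SOFTIRQ_SHIFT   (PREEMPT_SHIFT + PREEMPT_BITS)   /* 8 */
    #define HARDIRQ_SHIFT   (SOFTIRQ_SHIFT + SOFTIRQ_BITS)   /* 16 */

    #define PREEMPT_MASK    (((1UL << PREEMPT_BITS) - 1) << PREEMPT_SHIFT)
    #define SOFTIRQ_MASK    (((1UL << SOFTIRQ_BITS) - 1) << SOFTIRQ_SHIFT)
    #define HARDIRQ_MASK    (((1UL << HARDIRQ_BITS) - 1) << HARDIRQ_SHIFT)

    /* The fast path: preemption is possible only when every field (and
     * PREEMPT_ACTIVE) is zero, i.e. the whole word compares as zero. */
    #define preempt_count_is_zero(count)    ((count) == 0)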
It seems that the cost of testing this count might be reduced significantly
with some tricky assembly language work; that is being hashed out as of
this writing. But there's another aspect of the preemption count that
turns out to be costly: its placement in the thread_info
structure. The location of that structure must be derived from the kernel
stack pointer, making the whole test significantly more expensive.
The important realization here is that there is (almost) nothing about the
preemption count that is specific to any given thread. It will be zero for
every non-executing thread; and no executing thread will be preempted if
the count is nonzero. It is, in truth, more of an attribute of the CPU
than of the running process. And that suggests that it would be naturally
stored as a per-CPU variable. Peter Zijlstra has posted a patch that changes things in just that way.
The patch turned out to be relatively straightforward; the only twist is
that the PREEMPT_ACTIVE flag, being a true per-thread attribute,
must be saved in the thread_info structure when preemption occurs.
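The flavor of the change is captured by the sketch below; the names are
invented for illustration and the real patch is more involved, but it
shows why a per-CPU counter is attractive: it can be bumped with cheap
this_cpu operations instead of a walk through thread_info on the kernel
stack.

    /* Illustration only; not Peter Zijlstra's actual patch. */
    #include <linux/percpu.h>
    #include <linux/compiler.h>

    DEFINE_PER_CPU(int, demo_preempt_count);

    static inline void demo_preempt_disable(void)
    {
            this_cpu_inc(demo_preempt_count);
            barrier();      /* keep the critical section after the bump */
    }

    static inline void demo_preempt_enable(void)
    {
            barrier();      /* keep the critical section before the drop */
            this_cpu_dec(demo_preempt_count);
    }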
Peter's first patch didn't quite solve the entire problem, though: there is
still the matter of the TIF_NEED_RESCHED flag that is set in the
thread_info structure when kernel code (possibly running in an
interrupt handler or on another CPU) determines that the currently-running
task should be preempted. That flag must be tested whenever the preemption
count returns to zero, and in a number of other situations as well; as long
as that test must be done, there will still be a cost to enabling full
preemption.
Naturally enough, Linus has a solution to this
problem in mind as well. The "need rescheduling" flag would move to
the per-CPU preemption count as well, probably in the uppermost bit. That
raises an interesting problem, though. The preemption count, as a per-CPU
variable, can be manipulated without locks or the use of expensive atomic
operations. This new flag, though, could well be set by another CPU
entirely; putting it into the preemption count would thus wreck that
count's per-CPU
nature. But Linus has a scheme for dancing around this problem. The "need
rescheduling" flag would only be changed using atomic operations,
but the remainder of the preemption count
would be updated locklessly as before.
Mixing atomic and non-atomic operations is normally a way to generate
headaches for everybody involved. In this case, though, things might just
work out. The use of atomic operations for the "need rescheduling" bit
means that any CPU can set that bit without corrupting the counters. On
the other hand, when a CPU changes its preemption count, there is a small
chance that it will race with another CPU that is trying to set the "need
rescheduling" flag, causing that flag to be lost.
That, in turn, means that the currently executing thread will
not be preempted when it should be. That result is unfortunate, in that it
will increase latency for the higher-priority task that is trying to run,
but it will not generate incorrect results. It is a minor bit of
sloppiness that the kernel can get away with if the performance benefits
are large enough.
In this case, though, there appears to be a better solution to the problem.
Peter came back with an alternative
approach that keeps the TIF_NEED_RESCHED flag in the
thread_info structure, but also adds a copy of that flag in the
preemption count. In current kernels, when the kernel sets
TIF_NEED_RESCHED, it also
signals an inter-processor interrupt (IPI) to inform the relevant CPU that
preemption is required. Peter's patch makes the IPI handler copy the flag
from the thread_info structure to the per-CPU
preemption count; since that copy is done by the processor that owns
the count variable, the per-CPU nature of that count is preserved and the
race conditions go away. As of this writing, that approach seems like the
best of all worlds — fast testing of the "need rescheduling" flag without
race conditions.
Needless to say, this kind of low-level tweaking needs to be done carefully
and well benchmarked. It could be that, once all the details are taken
care of, the performance gained does not justify the trickiness and
complexity of the changes. So this work is almost certainly not 3.12
material. But, if it works out, it may be that much of the throughput cost
associated with enabling full preemption will go away, with the eventual
result that the voluntary preemption mode could be phased out.
Comments (10 posted)
Patches and updates
Kernel trees
- Sebastian Andrzej Siewior: 3.10.6-rt3 (August 12, 2013)
Build system
Core kernel code
Development tools
Device drivers
Documentation
Filesystems and block I/O
Memory management
Networking
Architecture-specific
Virtualization and containers
Miscellaneous
Page editor: Jonathan Corbet
Distributions
By Jake Edge
August 14, 2013
The idea of organizing a distribution into "rings" seems to be resonating
right now. Last week, we reported on some
openSUSE discussions surrounding some of the problems the distribution is
experiencing, along with possible solutions. One of those solutions
involved separating openSUSE into cohesive chunks (i.e. the rings), each of
which is built on the capabilities of the lower rings. As it turns out,
Matthew Miller, Fedora's Cloud Architect, posted a similar idea to the
fedora-devel mailing list in July. It would seem that Fedora and openSUSE
are both experiencing many of the same problems—and perhaps coming to some
of the
same conclusions.
Miller not only posted his ideas to the mailing list, he also presented
them at the recently concluded Flock conference for Fedora
contributors. The slides, video, and transcript
from that talk are all available for those interested. In essence, the
mailing list post was a preview of what he presented.
After starting by outlining the obligatory good attributes of Fedora,
Miller pointed out some of the
same problem areas that SUSE VP of Engineering Ralf Flaxa noted in his openSUSE
conference keynote: Fedora is not as widely used as it could be, including
by users of RHEL, the distribution is not seen as particularly relevant (or
exciting), and it isn't a good base for others to build upon. To solve
those problems, Miller suggested the idea of breaking the distribution up
into rings.
Miller starts by describing Ring 1 as "Fedora Core"—a name that predictably
raised some hackles on the list. The original Core was
determined based on who maintained the package. Those handled by Red
Hat employees went into Core, while those maintained by volunteers went
into "Extras". There wasn't any way for the community to participate in
the development or maintenance of Core. In addition: "the quality standards for Fedora Extras, the collection of packages
built around the core, were much, much higher. Upside-down!", he
said. Those mistakes would not be repeated, he stressed.
So, Ring 1 contains the "base functionality and behavior that
everyone can expect of any Fedora system". It is, in effect, the
foundation for the higher levels. Underneath Ring 1 is Ring 0, which is
"Just Enough Fedora". It would be based on the current @core,
but would be slimmed down from there.
Ring 2 is less of a ring, really, and more of a collection of what Miller
calls "environments and stacks". Environments are
"where you run the code you care about", and he gave examples
like desktop environments, as well as cloud and virtual machine images.
Stacks are collections of tools used by other software, such as languages,
database systems, web frameworks, and so on. Perhaps X and Wayland would be
considered stacks in that model, he said.
The idea behind the rings is to give the Fedora special interest groups
(SIGs) a place where their customizations fit into the Fedora picture.
Each ring would have less restrictive policies as you move toward the
higher levels, so changes to Ring 0 for a Spin (which is the end product
of a SIG) would likely not be possible and Ring 1 changes would be strongly
discouraged (or disallowed), but Ring 2 would be more open.
Some of the kinds of policies that SIGs might want to override include
packaging type (e.g. not RPM), changing software versions from lower rings,
allowing
some library
bundling, and the lifecycle. So a SIG could, for example, create a Spin
with a longer life than the 13-month Fedora norm, or one in which certain
package versions (a language, say) are supported longer than they are
elsewhere in the Fedora ecosystem.
That "elsewhere" is what Miller calls the "Fedora Commons". It would contain
the packages outside of the Core, which would be
maintained in the same
way that Fedora packages are today. In fact, any of the packages that aren't
incorporated into Rings 0 or 1 would automatically become members of the
Commons. These
are the packages that SIGs could choose to maintain separately in order to
differentiate their Spins from the rest of Fedora.
Miller's proposal is quite lengthy and detailed; the description here
largely just hits the high points. There has been, unsurprisingly, quite a
bit of discussion on the list and it can only be characterized as "mixed".
That's not much of a surprise either—it's rare that a radical
reshaping of anything is met with immediate near-universal acclaim
(or condemnation for that matter). The transcript of Miller's talk
indicates that people are certainly interested in the topic, as does the
mailing list thread.
It is, of course, just a proposal, and one that Miller makes clear is not
set in stone (how could it be?) at all. It is an interesting rethinking of
what a distribution is and how it might be structured. It is also
completely different than what other Linux distributions are doing, which
might make it fairly risky. Except that openSUSE may be headed in a
similar direction.
Perhaps that's the most interesting piece: two distributions looking to
grow their user and contributor bases are both considering fairly
radical—but similar—changes to their structure. Where either distribution
goes is anyone's guess at this point, but it will be worth keeping an eye
on the discussions and, if any should materialize, plans. Stay tuned ...
Comments (8 posted)
Brief items
I like it when you violently agree with me
--
Patrick Lauer
Comments (1 posted)
The "Luna" release of the
elementary
OS distribution is now available; see
this blog entry
for more information on this release. LWN
looked at elementary OS in 2011.
Comments (10 posted)
Distribution News
Debian GNU/Linux
Debian will be celebrating its 20th birthday on August 16 at this year's
DebConf in Vaumarcus, Switzerland. "
During the Debian Birthday, the
Debian conference will open its doors to anyone interested in finding out
more about Debian and Free Software, inviting enthusiasts, users, and
developers to a half day of talks relating to Free Software, the Debian
Project, and the Debian operating system."
Full Story (comments: none)
Lucas Nussbaum presents his monthly report on his Debian Project Leader
activities. Topics include a survey of new contributors, an ITWire
interview, logo registration as a trademark, delegations, and more.
Full Story (comments: none)
Fedora
Máirín Duffy has posted recaps of Fedora's Flock event. Even though she was
not physically present, Flock was available to remote participants. The
recaps include links to slides and transcripts of the talks. Not all
videos are available yet, but you'll find links to those that have been
released. See the posts for
day
1,
day
2, and
day
3. (Thanks to Matthew Miller)
Comments (6 posted)
Newsletters and articles of interest
Comments (none posted)
Ars technica has posted
a
lengthy look at the business side of Canonical. "
What may
surprise some people is that Canonical could be profitable today if
Shuttleworth was willing to give up his dream of revolutionizing end user
computing and focus solely on business customers. Most people who know
Ubuntu are familiar with it because of the free desktop operating system,
but Canonical also has a respectable business delivering server software
and the OpenStack cloud infrastructure platform to data
centers. Canonical's clearest path to profitability would be dumping the
desktop and mobile businesses altogether and focusing on the data center
alone."
Comments (28 posted)
BinaryTides
reviews the security-oriented
Kali Linux distribution. Kali is the latest version of the BackTrack distribution (we
looked at BackTrack 4 in 2010), which is now based on Debian rather than Ubuntu. The review looks at the distribution itself and the "top ten" security packages that come with it. "
The "Applications > Kali Linux" menu has a separate list for the top 10 security tools. These are the most useful, popular and featureful tools that find immense application in various kinds of tasks related to security like penetration testing, security analysis, application testing etc. Most of the tools are the best in their fields with no other similar equivalent or alternative."
Comments (none posted)
Wired
talks
with FreeBSD co-founder Jordan Hubbard. "
And Hubbard believes
FreeBSD can still hold its own against Linux. 'It has greater provenance,'
he says. 'If I’m going to buy a car, I want to buy one from someone well
established.' He also says the project is more transparent and holistic
than most Linux distributions. 'You want a single source tree with
everything that goes into the system? You have that with FreeBSD. It’s
clear what parts go into it.'"
Comments (61 posted)
Page editor: Rebecca Sobol
Development
August 14, 2013
This article was contributed by Bruce Byfield
GNOME's last formal usability testing was
conducted by Sun Microsystems in 2001—before the release of
GNOME 2. Since then, any usability testing that has occurred has been
informal, usually carried out by individual developers with a variety of
methods. However, the situation shows signs of improvement for GNOME 3, thanks to the work of Aakanksha Gaur, a graduate student currently completing her thesis at the National Institute of Design in Bangalore, India.
Gaur's interest in testing GNOME began when her computer crashed, and she
replaced her proprietary operating system with Ubuntu and GNOME. Looking
around for help, she realized that she was one of the few students at the
Institute who was using free software, and that GNOME could be the focus
she needed for her thesis on usability. Later, when she successfully
applied for a mentorship with the Outreach Program for
Women (OPW), it also became a source of funding for four months. In turn, Gaur has become one of the OPW's success stories.
In the three months since the conclusion of the mentorship, Gaur has continued to work with her mentor, GNOME design lead Allan Day, filing bugs and suggestions for improvements. At the start of Gaur's research, Day commented in an email that it "will be one of the first opportunities we have had to do an extended research study."
GNOME executive director Karen Sandler has helped to arrange token payments for Gaur's usability testers — a standard practice in academic research involving test subjects.
Designing the tests
With this support, Gaur has focused her thesis on usability testing of GNOME — mainly, utilities and configuration tools — and on users' perceptions of GNOME.
Unfortunately the blog notes on her early work became unavailable when her
provider closed down, and Gaur has yet to repost them. For now, the
clearest record of them is from an article
I did when she was beginning her work. These blog notes included Gaur's early research into the desktop metaphor and her earliest informal testing, together with some rough suggestions for redesign of the To Do and Character Map utilities.
This early work also shows Gaur learning how to conduct her research. After
trying to study usability by constantly asking questions as people worked,
she wrote, "I was under the confident assumption that I shall take long interviews of users and magically they will reveal the design mistakes which we shall fix and hence, rule the world."
In practice, though, she immediately found the technique lacking. The
feedback was "very vague and very unfocused" and she realized that, "I
ended up putting words in the mouth of the interviewee." The end result was
a complete lack of "data that challenged my existing beliefs about the
system in any way," and was therefore of minimal use.
Gaur's work beyond this point is documented on her current blog. Instead of micro-managing interviewees' experience, she opted for a test script in which interviewees are given a dozen basic tasks, such as changing the desktop wallpaper, searching for documents, and managing virtual workspaces. Meanwhile, Gaur observed how efficiently interviewees did each task, what mistakes they made, whether they could recall tasks later, and their emotional states after finishing a task.
The blog includes transcripts
of pilot test sessions, as well as recordings of the sessions. Like her
earlier blog entries, the available ones show Gaur making mistakes and
improving her methodology, a degree of transparency that she suggests is appropriate for a free software project. After the pilot sessions, Gaur went on to interview eight testers with her revised methodology.
The first observations
Gaur is still finalizing her results. However, she does have a few general
observations about both her methods and the state of GNOME
usability. First, based on her research, Gaur concludes that "GNOME suffers
from the issue of discoverability." That is, the tools users want are
available, but may not be easy to discover. "The problem to crack is how to
make them visible," she said, adding that "utilities like the Tweak tool
and the rich set of extensions" might be the most immediate way to deliver
improvements.
Second, people's expectations of GNOME are based heavily on the operating
systems with which they are familiar, and the web applications that they
use. While she has not finished assembling her research, she suspects that
web and
mobile applications have become more important than the operating system
for people who spend more time on the web.
Never forgetting the self-criticism, Gaur also observed that her work would be improved if she made greater efforts towards "making a user comfortable in the first few seconds of interaction."
Her preliminary conclusion? "GNOME is doing the best it can," especially
since free software development has traditionally been driven by developers
rather than users.
"However, integrating usability testing and continuous efforts to monitor
and measure the impact of new design will help immensely. Having said that,
I really want to be the volunteer to take this up seriously."
Integrating Usability into GNOME Practice
These tentative conclusions are hardly startling. However, the point is that they have not been systematically recorded for GNOME 3. Instead, like all free software projects, GNOME has relied on bug reports and personal impressions, both of which are considerably better than nothing, but do not necessarily provide accurate views of the average user's experience.
Bug reports, for example, are likely to describe the experience of those
with enough knowledge to know how to file them, and not
newcomers. Similarly, for all the controversy over GNOME 3, all available
records indicate that the designers believed that they were providing simple and practical solutions to major problems. By putting matters on a more impartial basis, usability testing like Gaur's may act as a reality check to design proposals.
Certainly, GNOME is taking Gaur's work, as preliminary as it is, seriously. "I'm hopeful that this work will serve as a template for user testing exercises in the future," Day said. "There are certainly challenges involved in doing usability testing without a dedicated lab and equipment, so having a publicly accessible account of a successful open research exercise will be valuable."
Sandler agreed, adding, "The GNOME Foundation whole-heartedly supports this work. GNOME 3.8 has had a really good response, but employing systematic tests will help us improve further."
For now, Gaur is focusing on completing her thesis. Once it is accepted,
her first concern will be to make her work as widely available as possible,
especially outside of GNOME. Then, she plans to go into more detail:
Looking at specific apps and making processes for usability testing in
GNOME. GNOME has got a lifelong contributor in me, and I have a list of
nifty UX [User Experience] research tributaries that have emerged from my
current study that I would like to continue. Outside of GNOME, I will be
looking to collaborate with more FOSS projects and make my career as an
Open Source UX Researcher.
Gaur's work is just beginning. Yet the degree to which it has been accepted in
eight months speaks highly of its quality and transparency, to say nothing
of its
originality. Perhaps in another twelve years, usability testing won't have to be
re-introduced, but will have long ago become a routine concern for GNOME
and
other free software
projects.
Comments (7 posted)
Brief items
There's some magic here that I'm not going to get into right now. It's even better than I'm telling you.
—
Matthias
Clasen, explaining ... at least to a degree ... how to develop applications with GTK+ 3.10.
Kind of disappointed that Subversion (the project) hasn't moved its source control to git. Get with the program guys.
—
Geoffrey Thomas
Comments (none posted)
Version 2.18 of the GNU C library is out. It contains a lot of bug fixes,
a number of architecture-specific optimizations, a new benchmark framework,
and partial support for
hardware lock
elision, among many other changes.
Full Story (comments: 8)
Version 1.2 of tig, the ncurses-based front-end for git, is now available. Among the numerous new features in this release are "the ability to jump directly from diff to the corresponding line in the changed file, a stash view, and improvements to the log view."
Full Story (comments: none)
Version 6.14 of the TestDisk disk recovery application and PhotoRec, its sibling geared toward recovering files from flash storage cards, have been released. This update incorporates a lengthy list of improvements, including presenting more disk information in the user interface, support for the Nintendo Wii's backup format, and recognition of 15 additional file formats.
Full Story (comments: none)
The KDE Community has
announced
the release of major updates to the Plasma Workspaces, Applications and
Development Platform. "
The Plasma Workspaces 4.11 will receive long term support as the team focuses on the technical transition to Frameworks 5. This then presents the last combined release of the Workspaces, Applications and Platform under the same version number."
Comments (none posted)
Newsletters and articles
Comments (none posted)
The Linux Journal has
an overview
of development in the graphics area. "
By providing a complete
direct rendering solution for OpenGL, the DRI3 and Present extension
provide applications with the key benefits promised by Wayland, without the
need of replacing X. Initial implementation of these extensions has been
done, but they've not yet landed in the Xserver repository. If it lands
soon, we may see this functionality in Xserver 1.15, scheduled for release
in September or October; otherwise it'll need to wait for 1.16 this
spring."
Comments (40 posted)
Øyvind "pippin" Kolås of GIMP and GEGL fame has posted an extremely compact, "spatially stable" dithering algorithm that he believes will prove useful in applications like creating animated GIFs and e-ink videos. Spatial stability means that the color of a particular pixel will not change if the original image pixel is static beneath it; this property can cut the file size of animated GIFs considerably and saves on e-ink refreshes. Kolås notes that the small code size "makes it suited for micro-controller or direct hardware implementations," and believes the algorithm used is in the public domain.
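Kolås's posted code is worth reading directly, but the property itself is
easy to illustrate. The fragment below is not his algorithm; it is a plain
ordered (Bayer) dither, which is spatially stable in exactly the sense
described: the threshold applied to a pixel depends only on its
coordinates, so a static source pixel always produces the same output.

    /* Ordered dithering with a 4x4 Bayer matrix: output depends only on
     * a pixel's value and position, never on neighbors or past frames. */
    #include <stdint.h>

    static const uint8_t bayer4[4][4] = {
        {  0,  8,  2, 10 },
        { 12,  4, 14,  6 },
        {  3, 11,  1,  9 },
        { 15,  7, 13,  5 },
    };

    /* Quantize an 8-bit gray value to black or white at pixel (x, y). */
    static uint8_t dither_pixel(uint8_t gray, int x, int y)
    {
        uint8_t threshold = bayer4[y & 3][x & 3] * 16 + 8;
        return gray > threshold ? 255 : 0;
    }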
Comments (5 posted)
Page editor: Nathan Willis
Announcements
Brief items
The Free Software Foundation has
announced
the availability of the first set of videos from the 2013 LibrePlanet
conference. This set includes talks by Marina Zhurakhinskaya, Benjamin
Mako Hill, Stefano Zacchiroli, and more.
Comments (1 posted)
DebConf13 runs August 11-18. Videos of some of the talks
are
available now. There are also
live
streams of some talks.
Comments (6 posted)
Articles of interest
GitHub has been criticized for hosting code with no explicit software
license. Richard Fontana
looks into the
situation on opensource.com. "
Few would deny that the rise of GitHub as a popular hosting service for software projects is one of the most significant developments to affect open source during the past five years. GitHub's extraordinary success is necessary context for understanding the criticism leveled at it during the past year from some within or close to the open source world. This criticism has focused on licensing, or rather the lack of it: it is claimed that GitHub hosts an enormous amount of code with no explicit software license. Some critics have suggested that this situation results from a combination of the ignorance of younger developers about legal matters and willful inaction by GitHub's management."
Comments (13 posted)
Calls for Presentations
CFP Deadlines: August 15, 2013 to October 14, 2013
The following listing of CFP deadlines is taken from the
LWN.net CFP Calendar.
| Deadline | Event Dates | Event | Location |
| August 15 | August 22–August 25 | GNU Hackers Meeting 2013 | Paris, France |
| August 18 | October 19 | Hong Kong Open Source Conference 2013 | Hong Kong, China |
| August 19 | September 20–September 22 | PyCon UK 2013 | Coventry, UK |
| August 21 | October 23 | TracingSummit2013 | Edinburgh, UK |
| August 22 | September 25–September 27 | LibreOffice Conference 2013 | Milan, Italy |
| August 30 | October 24–October 25 | Xen Project Developer Summit | Edinburgh, UK |
| August 31 | October 26–October 27 | T-DOSE Conference 2013 | Eindhoven, Netherlands |
| August 31 | September 24–September 25 | Kernel Recipes 2013 | Paris, France |
| September 1 | November 18–November 21 | 2013 Linux Symposium | Ottawa, Canada |
| September 6 | October 4–October 5 | Open Source Developers Conference France | Paris, France |
| September 15 | November 8 | PGConf.DE 2013 | Oberhausen, Germany |
| September 15 | November 15–November 16 | Linux Informationstage Oldenburg | Oldenburg, Germany |
| September 15 | October 3–October 4 | PyConZA 2013 | Cape Town, South Africa |
| September 15 | November 22–November 24 | Python Conference Spain 2013 | Madrid, Spain |
| September 15 | April 9–April 17 | PyCon 2014 | Montreal, Canada |
| September 15 | February 1–February 2 | FOSDEM 2014 | Brussels, Belgium |
| October 1 | November 28 | Puppet Camp | Munich, Germany |
If the CFP deadline for your event does not appear here, please
tell us about it.
Upcoming Events
Ohio LinuxFest has announced that Mark Spencer will be a keynote speaker at
this year's event (September 13-15 in Columbus, Ohio). "
Mark Spencer is the creator of Asterisk, a Linux-based open-sourced PBX in
software, and is the founder, Chairman and CTO of Digium, an open-source
telecommunications supplier most notable for its development and
sponsorship of Asterisk. Previously he achieved notice as the original
author of the GTK+-based instant messaging client Gaim (which has since
been renamed to Pidgin), of the L2TP daemon l2tpd, and of the Cheops
Network User Interface."
Full Story (comments: none)
Events: August 15, 2013 to October 14, 2013
The following event listing is taken from the
LWN.net Calendar.
| Date(s) | Event | Location |
| August 11–August 18 | DebConf13 | Vaumarcus, Switzerland |
| August 16–August 18 | PyTexas 2013 | College Station, TX, USA |
| August 22–August 25 | GNU Hackers Meeting 2013 | Paris, France |
| August 23–August 24 | Barcamp GR | Grand Rapids, MI, USA |
| August 24–August 25 | Free and Open Source Software Conference | St.Augustin, Germany |
| August 30–September 1 | Pycon India 2013 | Bangalore, India |
| September 3–September 5 | GanetiCon | Athens, Greece |
| September 6–September 8 | State Of The Map 2013 | Birmingham, UK |
| September 6–September 8 | Kiwi PyCon 2013 | Auckland, New Zealand |
| September 10–September 11 | Malaysia Open Source Conference 2013 | Kuala Lumpur, Malaysia |
| September 12–September 14 | SmartDevCon | Katowice, Poland |
| September 13 | CentOS Dojo and Community Day | London, UK |
| September 16–September 18 | CloudOpen | New Orleans, LA, USA |
| September 16–September 18 | LinuxCon North America | New Orleans, LA, USA |
| September 18–September 20 | Linux Plumbers Conference | New Orleans, LA, USA |
| September 19–September 20 | UEFI Plugfest | New Orleans, LA, USA |
| September 19–September 20 | Open Source Software for Business | Prato, Italy |
| September 19–September 20 | Linux Security Summit | New Orleans, LA, USA |
| September 20–September 22 | PyCon UK 2013 | Coventry, UK |
| September 23–September 25 | X Developer's Conference | Portland, OR, USA |
| September 23–September 27 | Tcl/Tk Conference | New Orleans, LA, USA |
| September 24–September 25 | Kernel Recipes 2013 | Paris, France |
| September 24–September 26 | OpenNebula Conf | Berlin, Germany |
| September 25–September 27 | LibreOffice Conference 2013 | Milan, Italy |
| September 26–September 29 | EuroBSDcon | St Julian's area, Malta |
| September 27–September 29 | GNU 30th anniversary | Cambridge, MA, USA |
| September 30 | CentOS Dojo and Community Day | New Orleans, LA, USA |
| October 3–October 4 | PyConZA 2013 | Cape Town, South Africa |
| October 4–October 5 | Open Source Developers Conference France | Paris, France |
| October 7–October 9 | Qt Developer Days | Berlin, Germany |
| October 12–October 13 | PyCon Ireland | Dublin, Ireland |
If your event does not appear here, please
tell us about it.
Page editor: Rebecca Sobol