LinuxConf.eu wrapup
By Jonathan Corbet
September 12, 2007
The very first LinuxConf Europe
event was held in Cambridge, UK, in the first week of September. This
conference is the result of a cooperation between the UK Unix User Group and the German Unix User Group; it is, in a sense, a
combination of the UKUUG and Linux-Kongress events held in previous years.
Talks by Dirk Hohndel and Michael Kerrisk were published
last week. Here is a summary of some other LCE events.
Power management remains the focus of a great deal of attention. Arjan van
de Ven started off a set of power-related talks with an overview of where
the problems are. His biggest point is that software is a critical part of
the power consumption picture; contemporary hardware provides a number of
power-saving features, but software has a tendency to defeat them. Many of
the ways in which this happens have been covered here before, so there is
no need to repeat them. The core lesson here is that transitions between
power states are expensive, so it is important that hardware components,
once put into a power-saving state, be allowed to stay there for some time.
In the case of the CPU, idle periods of 20ms to 50ms are needed for
effective power savings. Past kernels have rather defeated that goal,
though, by receiving a clock interrupt every 1-10ms. The dynamic tick
patches have finally fixed that problem, making it possible for longer
sleeps to happen. But then user space comes along and ruins things. Since
the advent of PowerTop, though,
improvements have been coming quickly. Many distributions now consume at
least 30% less power in typical laptop use.
Things may be getting better, but Matthew Garrett started the following
session by noting that Linux still sucks - at least, it sucks power. This
is a problem, he says, because getting half the battery life of Windows
on the same hardware is really embarrassing. Systems
are still waking up far too much; the problems exist in both kernel and
user space.
On the kernel side, the usual culprits - device drivers - are a big part of
the problem. There are quite a few drivers which poll their hardware -
sometimes up to 100 times every second. In some cases this cannot be
avoided; the hardware may be broken in a way which requires this kind of
polling. But in other cases the polling can be made smarter - such as
turning it off when the device is not in use. There is still work to be
done in this area.
User-space applications remain a problem. People tracking down wakeups
often blame the X server, but the real trouble is usually the applications
which are causing X to wake up. There is a tool in the works which will
identify the real source of X wakeups; this is a good thing: once problems
are identified they are usually fixed pretty quickly. Polling for vertical
retrace periods (so that the display can be updated without artifacts)
seems to be a particular problem; some API work is being done to make it
easier to avoid this polling. Evidently there are also some applications
which repeatedly ask the server if a particular extension is available;
since the set of extensions does not change while the server is running,
there is little point in doing this.
There are some interesting things which can be done to better use the
power-saving features of the hardware. For example, some framebuffers can
compress the video data into a dedicated memory area, then drive the video
from the compressed data. This technique reduces video memory bandwidth,
saving power (up to half a watt) in the process. An interesting
consequence is that the amount of power saved is dependent on how well the
screen's contents compress - a user's choice of background wallpaper will
affect their power usage.
Finally, there is a lot to be gained if device drivers can communicate more
information to user space, making polling unnecessary. Applications which
poll for changes to the audio volume are an example here; if the sound
system simply told them that the volume had been adjusted, they could
update their displays and go back to sleep.
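The difference between polling and being told can be sketched in user space. The example below is only an analogy, assuming a hypothetical volume_changed notification modeled with Python's threading.Event; it does not use any real sound-system API:

```python
import threading

volume_changed = threading.Event()   # hypothetical "volume adjusted" notification
state = {"volume": 50, "wakeups": 0}


def display_updater():
    # Block until notified instead of waking up periodically to check.
    volume_changed.wait()
    state["wakeups"] += 1
    state["shown"] = state["volume"]


worker = threading.Thread(target=display_updater)
worker.start()

state["volume"] = 70      # the "driver" changes the volume...
volume_changed.set()      # ...and tells the listener about it
worker.join()
```

The updater thread wakes exactly once, when there is something to display, rather than many times per second to discover that nothing has changed.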
Jörn Engel gave a talk on the death of hard disks. His core point is
that flash-based storage is faster, requires less power, makes less noise,
and is more robust than rotating storage. It is also more expensive, for
now, but flash is getting cheaper much more quickly. Jörn projects
that flash-based drives will become more economical than hard drives
between 2012 and 2019, depending on which drives one looks at.
Flash makes life easier in a number of ways; the lack of seek delays, for
example, means that much of the trouble the kernel goes to in scheduling of
block I/O operations can be eliminated. On the other hand, flash has
challenges of its own: it is not quite the random-access array of blocks
that one would like. In particular, writing to flash requires dealing with
wear-leveling issues, erase operations, and more.
Manufacturers have done their best to paper over these issues through the
use of translation layers which make a flash array look like a simple disk
drive. These layers make it easier to use flash with existing software,
but there are problems: performance is not always what one would like, and
there can be hidden caches which delay the persistent storage of data. So
Jörn has a request to the flash manufacturers: give us direct access
to the flash array, without translation layers, and let us figure out how
to best support it.
Chris Mason is not waiting for flash to take over; instead, he is working
on the next-generation Linux filesystem for rotating disks. The result, Btrfs, was the subject of
Chris's talk at LCE. LWN covered
Btrfs last June.
Chris's motivation is the fact that disks are, for all practical purposes,
getting slower - the time required to read an entire disk is growing. Most
systems still store large numbers of small files, leading to a lot of
wasted space. Btrfs tries to address these issues and provide a number of
interesting features as well. It is extent-based, resulting in more
efficient storage of larger files. Small files are packed into the
filesystem tree itself, eliminating the internal fragmentation experienced
by a number of other filesystems. It has indexed directories, data and
metadata checksums, efficient snapshots, sequence numbers in objects
(facilitating quick and easy incremental backups), an online filesystem
checker in the works, and more.
The directories are actually indexed twice. One index is there for fast
filename lookup; the other one, instead, lets the readdir() system
call return files in inode-number order, speeding filesystem traversals.
Extended attributes are stored as directory entries. Every file has a
backpointer to its containing directory - and, yes, multiply-linked files
have backpointers to all of the directories in which they are found.
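The benefit of inode-number ordering can be approximated from user space on filesystems that do not provide it. The following is a rough sketch, not anything taken from Btrfs: it sorts directory entries by inode number before calling stat() on them, and the helper name stat_in_inode_order() is made up for illustration.

```python
import os
import tempfile


def stat_in_inode_order(directory):
    """Return (name, size) pairs, stat()ing entries in inode-number order."""
    with os.scandir(directory) as entries:
        ordered = sorted(entries, key=lambda entry: entry.inode())
    return [(e.name, e.stat(follow_symlinks=False).st_size) for e in ordered]


# Small demonstration on a scratch directory.
demo = tempfile.mkdtemp()
for name in ("a", "b", "c"):
    with open(os.path.join(demo, name), "w") as f:
        f.write(name)

listing = stat_in_inode_order(demo)
```

Whether this helps depends on how the filesystem lays out its inodes; a filesystem whose readdir() already returns entries in that order makes the sort unnecessary.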
Perhaps the most fun part of the talk was the plots Chris has generated
from various benchmark runs. The limiting factor on filesystem performance
is generally disk seeks; it is important to minimize disk head movement.
In general, ext3 tends to move the disk head all over the platter during
benchmark runs while Btrfs and XFS do better. Chris noted that better
writeback clustering in the virtual memory subsystem would help ext3.
More benchmark plots (some animated) can be found in the Btrfs
benchmark and Seekwatcher pages.
Toward the end, Chris was asked whether performance slows down when the
disk gets full. The answer was "no" because the system crashes instead.
That's a good reminder that Btrfs remains an early-stage development; the
on-disk format has not even been finalized yet. But the production version
of Btrfs is certainly something to look forward to.
Back in 2000, the British Computer Society awarded its Lovelace Medal to
Linus Torvalds. In 2007, the society finally caught up with him to deliver
the medal - though, as speaker Dr. David Hartley noted, they probably were
almost as quick as the post office would have been. As is typically the
case, Linus seemed somewhat embarrassed by the attention.
LinuxConf Europe intends to be a conference on a truly European scale. To
that end, next year's event will likely move to Germany; the details were
not yet finalized to the point that the location could be announced at this
year's conference, though. LCE, helped by the kernel summit, has gotten
this institution off to a good start; your editor is looking forward to
next year's edition.
Changes ahead for Python
By Jake Edge
September 12, 2007
With its first
alpha just released, Python 3.0 (aka Python 3000 or Py3k) is
making progress, though a final release is still a year off. Py3k overhauls
the language core, removing inconsistencies and other "warts", without
maintaining compatibility with the 2.x version. Various standard Python
idioms go by the wayside and it will take some getting used to.
One of the driving forces for Py3k is to handle Unicode strings in a uniform
way. In the 2.x series, Unicode handling is prone to bugs, especially when mixing
encoded and unencoded text. The Py3k solution is to separate strings,
which contain decoded text, from byte-strings, which hold binary data, giving two
distinct types, str and bytes. Those types cannot be
combined without converting one via the encode() and decode()
methods. The drawback to this change is explained in the
What's New in
Python 3.0 document:
This means that pretty much all code that
uses Unicode, encodings or binary data in any way has to change.
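A minimal sketch of the str/bytes split, assuming a Python 3 interpreter; the sample text and the choice of UTF-8 are illustrative only:

```python
text = "J\u00f6rn"               # str: decoded text
data = text.encode("utf-8")      # bytes: binary data


def try_concat(s, b):
    """Return True if str + bytes is accepted, False if it raises TypeError."""
    try:
        s + b
        return True
    except TypeError:
        return False


mixed_ok = try_concat(text, data)            # mixing the two types is refused
round_trip = data.decode("utf-8") == text    # explicit conversion works
```

The explicit encode() and decode() calls are where 2.x code that silently mixed the two kinds of string will need attention.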
This also leads to a distinction that needs to be made when handling
files. Files are either binary or text files, with text files requiring an
encoding to be specified when they are opened. If the wrong type or
encoding is given, I/O to the file may fail.
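The text/binary distinction might look like the sketch below, which uses a scratch file and sample text chosen for illustration. Note that in released versions of Python 3 the encoding argument to open() is optional and falls back to a platform-dependent default; it is spelled out here for clarity.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "sample.txt")

# Text mode: str goes in, encoded with the given encoding.
with open(path, "w", encoding="utf-8") as f:
    f.write("caf\u00e9")

# Binary mode: the same file comes back as raw bytes.
with open(path, "rb") as f:
    raw = f.read()

# Text mode with the wrong encoding fails when the data is decoded.
try:
    with open(path, "r", encoding="ascii") as f:
        f.read()
    wrong_encoding_failed = False
except UnicodeDecodeError:
    wrong_encoding_failed = True
```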
One very visible change – perhaps the most controversial –
is the elimination of
the print statement, which becomes a function.
The change is being made
mostly for consistency, as there is no other language statement like
print, but it also adds features. One can now specify
a separator, line ending, and file directly; there is no need for the
print >>sys.stderr, "error" syntax, which becomes
print("error", file=sys.stderr).
As the "What's new" document points out:
Initially, you'll be finding yourself typing the old print x a lot in
interactive mode. Time to retrain your fingers to type print(x) instead!
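A short sketch of the print() function and its keyword arguments, assuming Python 3; the io.StringIO buffer is only there so that the output can be inspected:

```python
import io
import sys

# Replaces the 2.x form: print >>sys.stderr, "error"
print("error", file=sys.stderr)

# Separator, line ending, and destination are ordinary keyword arguments.
buf = io.StringIO()
print("a", "b", "c", sep=", ", end="!\n", file=buf)
line = buf.getvalue()
```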
Another area that has changed significantly is the dict methods.
The keys(), items(), and values() methods no longer
return lists, so code that treats them that way will fail. They now return
something called a "view" that references the dict directly,
producing values as they are needed, much like an iterator. In addition, the
has_key() boolean method has been removed; the in operator
should be used instead.
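The dict changes can be seen in a few lines. This sketch assumes Python 3, and the dictionary contents are arbitrary:

```python
d = {"a": 1, "b": 2}

keys = d.keys()            # a view onto d, not a list
d["c"] = 3                 # the view reflects later changes to the dict

as_list = sorted(keys)     # build a real list explicitly when one is needed
has_a = "a" in d           # replaces d.has_key("a")
```

Code that indexes or sorts the result of keys() in place needs an explicit list() or sorted() call.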
There are lots of smaller changes that will catch the unwary. Many of the
features removed have been deprecated for some time, but, for programmers who
don't follow Python language development closely, they may come as a surprise. The
raise statement has different syntax; integer division no longer
truncates, returning a float instead (with // used to get the old
behavior); xrange() has been removed; and so on. It adds up to a
substantial pile of things to deal with when moving existing code to Python 3.
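A few of these smaller changes, sketched for Python 3 with the removed 2.x spellings shown in comments:

```python
half = 1 / 2          # true division: a float, even for two integers
floor = 1 // 2        # floor division: the old integer behavior
negative = -7 // 2    # floors toward negative infinity

# range() is now lazy, as xrange() was; xrange() itself is gone.
squares = [n * n for n in range(4)]

try:
    raise ValueError("bad value")   # 2.x also allowed: raise ValueError, "bad value"
except ValueError as exc:           # 2.x also allowed: except ValueError, exc
    message = str(exc)
```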
The migration from 2.x is being assisted by the development of Python
2.6, which is slated for release in April 2008. It will provide a Py3k
warnings mode that complains at runtime when a feature is being used in a
way that is incompatible. It will also have many of the new features enabled,
either as __future__ imports or just added into the language if it
doesn't conflict with 2.x syntax. The 2to3 tool is also being
developed to translate 2.6 constructs into their 3.0 equivalents. The
Python Enhancement Proposal (PEP) governing the Py3k plan (PEP 3000) gives an overview of how code
can be maintained to run on both 2.6 and 3.0. It sounds somewhat painful,
but incompatible language changes are never easy.
There is still plenty of work to be done; the final release of 3.0 is
currently scheduled for August 2008. One of the bigger remaining chunks is
a reorganization of the standard library namespace.
PEP 3108 lays out the
changes to be made, including removing older, unsupported, or rarely used
modules, renaming modules to conform to the naming standard, and merging the C
and Python implementations of modules (for example, cPickle goes away and is
replaced with pickle). It cleans up what had become a bit of a mess
over time.
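Code that must run on both sides of the cPickle change can use a guarded import. This is a common idiom rather than anything prescribed by PEP 3108, and the payload is sample data:

```python
try:
    import cPickle as pickle   # Python 2.x: the separate C implementation
except ImportError:
    import pickle              # Python 3: one module, accelerated when possible

payload = pickle.dumps({"release": "3.0"})
restored = pickle.loads(payload)
```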
These changes have not come about without some objections, both
from those who think another incompatible "upgrade" is not warranted and
from those who think Py3k
doesn't go far enough. One area that is not being changed, but is a source of frustration for some,
is the "global interpreter lock" (GIL), which only allows one thread at a
time to operate on any Python objects or call out to C language extensions.
Especially with the advent of multi-core and multi-CPU systems, the lock is
very restrictive, serializing most of the core language processing.
Guido van Rossum, Benevolent Dictator for Life (BDFL) of the Python
language, has been very open about addressing these concerns on his All Things
Pythonic weblog. That doesn't mean he plans to change things,
especially with regard to the GIL, but he puts together a well-reasoned
defense, mostly concerning the performance of the language
with finer-grained locks. He is clearly not much of a fan of
multi-threaded programming with its attendant race conditions, deadlocks,
and other issues, but he is not opposed to efforts to remove the GIL
either. As he points out, it is not inherent in the Python language, but
is an attribute of the current language implementation; other
implementations (Jython, IronPython) do not have the GIL.
There are fundamental changes in Python 3; it will be interesting to see
how quickly it is adopted after being released. People learning Python
won't need to learn Py3k for another two years or so, according to van
Rossum, and should, instead, concentrate on 2.x (which means 2.5 until April).
The Unicode handling rework will probably be enough to get the increasing
number of localized programs updated, but the rest of the changes are not
terribly compelling. It is likely that there will be Python 2.x programs
around for a long time to come.
Fedora reaching out to new niches
By Jake Edge
September 12, 2007
Purpose-built Fedora distributions, called "spins", are a recent
addition to that community, made in an attempt to reach additional users. The
idea is to use tools like Revisor to create a custom
collection of software that works well together for a particular set of
tasks. This collection can then be installed or run from a live CD,
providing an easy means to have the right collection of tools immediately,
rather than after a lengthy yum install pass.
The concept itself is not new; there are many distributions targeted at a
particular subset of users. Typically, other popular distributions (Debian
and Ubuntu in particular) have been used as the basis for them. The Fedora
project is embracing the idea, pulling together a list of the spins and
elevating at least two to the status of "official spins". The idea is to
appeal to those who don't want to be bothered with tracking down,
installing, and configuring the tools needed for their task; instead it is
all packaged for them.
Starting with Fedora 7, two official releases of the distribution are
available, one for each of the dominant desktops. For Fedora 8, there will
also be a developer
spin, which has the explicit goal of attracting more Fedora
developers. It will include Eclipse, perhaps other integrated development
environments (IDEs), gcc and friends, emacs, SystemTap,
and other developer tools. Other ideas, such as a working Xen virtual
machine and targeting web developers, have been discussed as well.
The other official spin for Fedora 8 is the Fedora
Electronic Lab (FEL). This project pulls together the tools for
electronic design and configures them to work well together. A wide variety
of software for circuit simulation, hardware development in VHDL and
Verilog, Very Large Scale Integration (VLSI) design, and embedded systems
development is included. Universities are high on the list of target
audiences, with the FEL website claiming 250 universities already using
Fedora; attracting more is one of the goals.
Several other spins are being worked on as well; they are not "official", but
there does seem to be some serious work going into them. The Security
LiveCD is a Fedora 7-based spin for security auditing and testing. It
contains all of the tools that an administrator or security researcher
might need to do forensic analysis of a rooted machine, check a network for
vulnerable hosts, or do penetration testing. Since it can be booted
directly from a read-only device, risks of infection from any malware are
eliminated. Any machine can be quickly turned into a security workstation
by using a distribution like this.
Another ambitious project is the Fedora
Art Studio. This spin not only collects the tools into one package, it
also pulls in content likely to be useful to artists, desktop publishers,
animators, and other creative folks. There are collections of clip art,
fonts, textures, brushes, and so on, all with free licenses. There are
also tutorials included to get people up to speed on the various packages.
Plans are to include default Firefox bookmarks for useful sites as well.
Other spins are listed on the site, ranging from the Creative
Commons LiveContent spin (covered by LWN here) to a SystemTap live CD.
The Fedora wiki has various Howtos on remixing
and rebranding
Fedora, as well as using the Live CD
tools. Most people who want to build a custom spin will start by using
the Revisor GUI
tool, which can build installation, live, or virtualization
(for Xen or KVM virtual machines) media for CDs, DVDs, USB thumb drives, and
more. The project has clearly put a lot of time and effort into making it
as easy as possible to create new spins from the large repository of Fedora
software.
It remains to be seen if any of these spins become popular, but it may be a
good way to introduce new users to Fedora. It is unlikely that power users
will find a spin that covers all of what they use, but they just might find
one that serves as a good starting point. They can either customize their
own spin from there or use the usual repository tools to grab whatever
extras they need. For a distribution that, until recently, had a
reputation for not working with the community, this effort may go a long
way towards erasing that history.
LWN advertising update II
LWN recently tried a new (for us) form of advertising, known as
"in-text" advertising – ads that pop up from highlighted keywords in
an article. When we announced
the change, it was obvious from the comments that it was a tad
unpopular. Truth to tell, the ads started getting on our nerves more as
time went on; they didn't seem quite so annoying when running on our
development systems. We have discontinued the ads; they will not be coming back.
A lot of good points were made in the comments; we appreciate the time you
took to make them. Our readers are (obviously) very important to us; your
opinions on what works and what doesn't are always carefully considered.
There were also several interesting suggestions made; we will be pondering
those as we make plans.
We do want to dispel one concern that we heard. We are not under an
imminent threat of going under. We are proceeding with
the plan we laid out in May: working on the revenue side of the business
while producing the same quality of content you have come to expect. There
will be other experiments along the way; some will fail, hopefully some
will succeed as well.
Page editor: Jonathan Corbet
Inside this week's LWN.net Weekly Edition
- Security: Eavesdropping on Tor traffic; New vulnerabilities in lighttpd, openssh, samba, xorg-server...
- Kernel: The 2007 Kernel Summit; Exported symbols and internal API stability; Who wrote 2.6.23.
- Distributions: RPM Fusion; new releases of Fedora Electronic Lab, Mandriva Linux 2008 RC 1, OpenSolaris 10 update and openSUSE 10.3 Beta3
- Development: Edit image metadata with ExifTool,
KDE 4.0.0 release schedule, nova filters as ladspa plugins,
GCC status reports, GCC as an incremental compile server,
new versions of Samba, NagVis, PIKT, Apache, lighttpd, Zenoss,
Librepos, opentaps, Sailcut CAD, GNOME, GARNOME, KDE 4.0 beta,
X11, gEDA/gaf, GnuPG, Robocode, Widelands, wxWidgets, Chandler Desktop,
Geotag, KnowledgeTree, EJBCA, Free Pascal, Anjuta DevStudio, Pydev.
- Press: Tor Anonymizer used for eavesdropping, Red Hat's latest business deals,
Richard Stallman interview, female-friendly Python, webcams under Linux,
reviews of SLUG, Snort and Trolltech Greenphone SDK, working for standards.
- Announcements: Open Group certifies Apache Directory Server, EFF hires crime attorney,
IBM joins OO.o community, Mandriva Australia launched, Microsoft launches
Silverlight Flash competitor, OpenMoko Neo phones by Christmas,
Sun acquires ClusterFS, VMware Tools open-sourced, Les Trophees du Libre,
LCA miniconf CFPs, Django sprint, KDE-EDU 4.0 polishing event,
Linux Foundation legal summits, Sun Tech Days.