By Jonathan Corbet
August 31, 2011
Last week's Edition contained
an interview with
Kovid Goyal, the maintainer of the
Calibre electronic book manager, but
it did not look deeply at the application itself. Coincidentally, your
editor has been playing with Calibre with renewed interest recently. This
application has made considerable progress since
your editor's last look at it, so a look at
where it has gone is called for. Calibre is not perfect, but it is a
useful tool for a somewhat unwilling newcomer to electronic books.
Your editor is not one to resist progress; the transition from vinyl to
compact discs was handled with no reservations, despite the fact that there
is still something special about those old analog platters. How can one
resist a medium that is bulky and heavy, limited to 30 minutes of play
time, and which degrades with every playing? When CDs, in turn, started to
go away barely a tear was shed. There has been no pining for 8"
floppies, eight-track tapes - or for tape in general. Technology moves
forward, and things get better.
But books are special. They represent an old technology well optimized for
its intended task and are a thing of beauty. Your editor's love of books
has swallowed up vast amounts of time and crowded the house with dead
trees; Cicero's classic proverb ("a room without books is like a body
without a soul") is well respected here. That occasionally leads to
certain amounts of marital stress, but we digress. The point is that the
movement of the written word to an increasingly digital-only form is
something that has been resisted in these parts for some time.
But the writing is on the wall (or, rather, the cloud-based persistent
storage device): books as physical objects are on their way out. Once your
editor got past the denial phase, it became clear that there were even some
advantages to ebooks. They are often cheaper, which is nice. Certain
science fiction authors would appear to be paid by the kilogram; reading
them in electronic form yields the same entertainment value without the
kilograms. Electronic books are especially advantageous when traveling;
the weight involved in carrying sufficient reading material for a family
vacation was, once again, a source of a certain amount of familial
disagreement. Searchability can be a useful feature at times.
There is still nothing like a real book, but the electronic
version is not entirely without its charms.
One does not need to accumulate many ebooks before it becomes clear that
some sort of management scheme is required. Simply keeping books on a
reader device is not really an option; that device may well not be entirely
under the owner's control, its capacity is limited, it can be lost or
damaged, and it eventually will need to be replaced. Just dumping them on a
disk somewhere has problems of its own; some sort of management tool is
needed. For now, in the free software world, Calibre seems to be that tool.
Calibre
As of this writing, the current version of Calibre is 0.8.16. Releases are
frequent - about one week apart - and each release adds bug fixes and new
features. The web site recommends installing binary releases directly from
there because distributors tend to fall behind that schedule; Rawhide did
not let your editor down, though. Interestingly, those looking for the
source on the Calibre web site can search for a long time; there are no
easily-found pointers to the SourceForge
directory where the source can be found. The program is written in
Python.
One thing first-time users won't necessarily notice is that Calibre phones
home when it starts. The ostensible purpose is to check for new releases,
but, in the process, it reports the current running version, the operating
system it is running under (Linux is reported as "oth") and a unique ID
generated when the program is installed - along with the IP address,
naturally. It is not a huge amount of information to report - users of
proprietary reader devices have much larger information disclosure issues
to be concerned about - but it's
still a bit of a privacy violation. Programs that communicate with the
mother ship in this way should really inform their users of the fact and
give them the opportunity to opt out.
The main Calibre window provides a list of books in the library, an
animated "cover browser," a list of metadata types, and a pane for
information about the selected book. By default, somebody just wanting to
look through the books in the library will find less than 1/4 of the
available space dedicated to that task. However, one cannot fault Calibre
for lacking configurability; there are few aspects of the interface that
cannot be tweaked at will. Unwanted stuff is easily gotten rid of.
There is a toolbar across the top with a large number of entries; they do
not all fit unless the window is made quite wide. Some of them can be a
bit confusing; should one import a book with "Add books" or "Get books"?
The icon labeled "N books" (for some value of N) is actually the way to
switch between libraries. "Save to disk" is a bit strange for books in the
library, which are already on disk; it seems to be a way to save a book in
a different format, though how that relates to the "convert books"
operation is not entirely clear. With a bit of time and experimentation,
though, it's not too hard to figure out how things work.
There is a basic reader application built into Calibre; it works well
enough, though, likely as not, few users actually read their books in this
application. Some of its more obnoxious behaviors (the 1/2 second animated
page flip, for example) can be disabled. One thing that cannot be turned
off, though, is the obnoxious "tooltips" that show up on everything. Your
editor has noticed a trend toward these annoyances in a number of
applications; when one can't see the interface through the tips, something
is wrong. As can be seen in the associated screenshot, the "next page"
tooltip obscures the text of the book itself.
Calibre's management of devices seems to work well; when a recognized
device is plugged in, a separate pane showing the contents of that device
is created. Books can be copied between the library and the device at
will; if needed, Calibre will convert the book to a different format on the
way. Your editor's Kindle device Just Works with Calibre; all that was
needed was to plug it in. Android devices also work nicely. The Calibre
site recommends installing WordPlayer on Android, but interoperation with the
open-source FBReader application works well. Aldiko can also be used,
though it is necessary to manually import the book files into the
application after Calibre has placed them on the device.
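For readers who prefer to script such conversions, the same conversion engine
is available from the command line through Calibre's ebook-convert tool. A
minimal Python sketch (the file names here are purely hypothetical) might look
like:

    import subprocess

    # ebook-convert is Calibre's command-line conversion tool; the output
    # format is inferred from the output file's extension.  The file names
    # are hypothetical.
    subprocess.check_call(["ebook-convert", "some-novel.epub", "some-novel.mobi"])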
Naturally, when working with a Kindle, one quickly runs into DRM issues;
Calibre will put up a dialog saying that it cannot work with a locked file
and wish you luck. As it happens, there is a plugin out there that can
decrypt books from a Kindle and store them in a more accessible format.
The Calibre project itself won't go near such plugins, but they are not
hard to find. Whether one sees unlocking an ebook as an exercise of
fair-use rights on a text that one has purchased or as an act of piracy
will depend on one's viewpoint and, perhaps, local law. Your editor can
only say that, if he were able to store his purchased ebooks in a format
that does not require a functioning Kindle or Amazon's continuing
cooperation, he would be much more inclined to buy more such books in the
future.
(The Kindle, incidentally, will eventually be replaced with a more open
device; selecting that device is likely to be the topic of a future
article.)
The "Get books" option pops up a dialog that, seemingly, will search every
bookstore on the planet for a given string. Results are listed with their
price, format, and DRM status. The process tends to be slow - not
surprising, given how many
sites must be queried; one will want to trim down the number of sites to
speed things up and eliminate results in undesired languages.
The Calibre developers have clearly been
busy setting up affiliate arrangements with as many of those bookstores as
possible.
The proceeds support ongoing development of the code, which
seems like a good cause, certainly.
Another interesting feature is the ability to download articles from
various news sources, format them appropriately, and send them to the
device. In the case of the Kindle, that sending happens immediately over
the Kindle's cellular connection; there is no need to plug the device into
the computer first. Over 1,000 different news sources are supported at
this point. If Calibre is left running, downloads can be scheduled to run
at regular intervals. The value of this feature arguably drops as
always-connected devices take over, but it's easy to see how it could be
indispensable for those who do a fair amount of offline reading.
Wishlist and conclusion
There is a fairly well developed search feature clearly designed with the
idea that there will be thousands of books in the library. Searches can
key on almost any metadata, but there does not seem to be any way to search
for books based on their contents. If you cannot remember which book
introduced the concept of "thalience," Calibre, it seems, will not be able
to help you find it. Indexing a large library to the point where it can be
efficiently searched is not a small task, of course, but there are times
when it would be nice.
Closer integration between Calibre and the reader devices would be useful.
For example, all readers have a concept of bookmarks, or, at least, the
current position within a given book. Imagine having a copy of one's
current book on a phone handset; it would always be at hand when one finds
oneself with an unexpected half hour to kill somewhere. Later, when
curling up by the fire with the spouse, the dog, a glass of wine, and the
real reader, said reader would already know the new position to start
from. No such luck with the reader, alas; even the spouse and the dog
can't always be counted upon. Calibre can't fix the latter, but it could
convey that kind of information between reader devices.
Even nicer, someday, might be to run an application like Calibre directly
on the reader devices, backed up by a library found in
personally-controlled storage on
the net somewhere. Google's Books offering is aiming at that sort of
functionality, without the "personally-controlled" part, but books are too
important to leave in some company's hands. Until such a time arrives,
we'll be left managing our books on a local system and copying them to
devices as needed. Calibre seems
to be the best option there is for that management; it is a capable tool
that does almost everything a reader could want. It definitely helps to
make the transition away from real books a bit less painful.
Comments (28 posted)
By Jake Edge
August 31, 2011
For handling network management tasks, at least on the desktop, most
distributions (and thus Linux users) rely on NetworkManager. But, there is
an alternative, called ConnMan, that was
originally created as part of Intel's Moblin effort. ConnMan has found its way into
Moblin's successor, MeeGo, which is no surprise, but it is also suited to
smaller embedded Linux systems. ConnMan's creator,
Marcel Holtmann, gave a talk at LinuxCon to describe ConnMan, along with
some of the challenges faced in creating a compact network management tool
that is targeted at mobile devices.
Holtmann started out by describing the "wishlist" for mobile devices that
he came up with when he started working on ConnMan three years ago.
Mobile phones were the first use-case he considered because they are
complex devices with limited battery life. Also, "if you solve the
phone case, you solve the tablet and netbook problem as well", he
said. In addition, the needs of televisions are a subset of those required
for mobile phones.
But, other use-cases are different. Cars have different requirements, as
do robots, sensors, and medical devices. The only use-case that was left
out was the data center because it is "simple and pretty much
static", he said.
So, after considering those use-cases, Holtmann came up with a wishlist
that consisted of a handful of
high-level constraints. It should be a simple and small solution,
"but at the same time, really powerful". It should be
automated so that it didn't have to ask the user what to do "over and
over again if we knew what to do". It should have minimal
dependencies so that it could run on memory-constrained devices. It should
also support customization so that vendors could create their own UI on top
of the core. ConnMan sprang out of that wishlist.
There is also a common set of problems that a network management
application needs to deal with, including IP address assignment, which is
"quite complicated actually", especially when considering
IPv6, he said. Dealing with network proxy support is another problem area
because the "settings are really hard to explain", so there is
a need to handle it automatically. DNS handling can be problematic as well.
There are interaction problems too. "Are we on the internet?"
is a surprisingly hard question to answer as the system could be connected
to a hotspot that requires a login for example. There is a need to make
applications aware of internet connectivity—gain or loss—as
well. In addition, time synchronization is important, but even more
important sometimes is determining the correct timezone. All of this stuff
needs to be sorted out before telling applications that the internet is
available, Holtmann said.
Beyond that, there are some additional features required for mobile devices
today, including a flight (airplane) mode, tethering, and statistics
gathering. The latter is particularly important for devices where
different kinds of connections have different costs.
Design principles
Holtmann said that he looked at how other network management applications
are designed and saw a fundamental problem. Instead of keeping the policy
and configuration in the low-level connection manager, those parts live in the UI,
he said. He thinks this is the "wrong approach" and that
policy and configuration should live in the connection manager so that
experts can deal with the hard problems, rather than making users figure
them out. In addition, it allows UI developers to change the interface
easily, because they don't have to change the policy/configuration handling
as part of that.
There are three principles that governed the design of ConnMan from a user
interaction perspective. The first is to ask users "only questions
they can answer". If there are questions that a user will have
trouble answering, the program should try to figure them out for itself.
For example, users don't know or care about the various kinds of wireless
keys required, so don't ask a bunch of technical questions about WEP
vs. WPA vs. pre-shared keys, just ask for the password for the wireless
network. The underlying connection manager should recognize what type is
required by the wireless network automatically.
The second is to only show current and valid information to the user so
that they aren't overwhelmed with useless information. Don't tell them
that the Ethernet cable is not plugged in, he said, "they are sitting
in front of it". Hide WiFi networks that have too weak of a signal
rather than showing a bunch of "<unknown>" access points. The emphasis
should be on previous connections that the user has made, because the
chances that "the user wants to use it again are really high".
Lastly, interact with the user only when it's needed. Part of the solution
is to remember things from previous connections and configurations, but
there is more. If ConnMan doesn't make connections quickly enough, users
will start to think something is wrong and start "messing with
things". Also, use error notifications to tell the user something
useful, not just propagating the error message from the underlying code.
It was difficult to keep to the principles, Holtmann said, but that was the
goal.
Reducing time-to-connect
In keeping with the last principle, ConnMan has done some things
differently to try to reduce the time it takes to establish a
connection.
As Holtmann mentioned earlier, connection establishment for IP is rather
complicated with multiple pieces required to get to a point where
applications can start using the internet. First there is the low-level
IPv4 and IPv6 address and proxy configuration (which includes DHCP, web
proxy auto-discovery (WPAD), and IPv6 auto-configuration). After that,
it may need to do a WISPr (wireless internet service provider roaming)
hotspot login, followed by time synchronization.
That all takes a fair amount of time largely because of various
inefficiencies in the current implementations. For one thing, IPv4 and
IPv6 discovery and configuration should be done in parallel, rather than
serially. Arbitrary timeouts should also be eliminated.
One of the biggest problem areas was DHCP. In current Linux systems, there
are multiple levels of D-Bus messages and callout scripts to handle DHCP.
There are at least three script/program executions and two D-Bus
messages, sometimes
with arbitrary waits between them, to handle getting an address via
DHCP. "Time is just wasted",
Holtmann said.
But DHCP requires only four UDP messages
of 300-600 bytes each, which "can be done a lot faster than you
think", he said. ConnMan implemented its own DHCP library that
significantly reduced the amount of time it took to get an IP address,
while also reducing memory consumption. The time reduction results in
"approximately 1-2 seconds that can be given back to users",
while the runtime memory savings is very important for some embedded devices.
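To give a sense of how little data is actually involved, here is a rough
Python sketch of the first of those four messages, a DHCPDISCOVER. It
illustrates the size of the protocol exchange, not ConnMan's actual C
implementation, and the MAC address is made up:

    import os
    import socket
    import struct

    def build_discover(mac, xid):
        # Fixed BOOTP header: op=1 (request), htype=1 (Ethernet), hlen=6,
        # hops=0, transaction ID, secs=0, flags=0x8000 (broadcast reply).
        pkt = struct.pack("!BBBBIHH", 1, 1, 6, 0, xid, 0, 0x8000)
        pkt += b"\x00" * 16                # ciaddr, yiaddr, siaddr, giaddr
        pkt += mac + b"\x00" * 10          # chaddr, padded to 16 bytes
        pkt += b"\x00" * 192               # legacy sname and file fields
        pkt += bytes([99, 130, 83, 99])    # DHCP magic cookie
        pkt += bytes([53, 1, 1])           # option 53: message type = DISCOVER
        pkt += bytes([255])                # end of options
        return pkt

    xid = int.from_bytes(os.urandom(4), "big")
    discover = build_discover(b"\x02\x00\x00\x00\x00\x01", xid)  # made-up MAC
    print("DHCPDISCOVER is only %d bytes" % len(discover))

    # Actually sending it needs a broadcast socket bound to the DHCP client
    # port (68), which requires privileges:
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    sock.bind(("0.0.0.0", 68))
    sock.sendto(discover, ("255.255.255.255", 67))

The remaining messages (offer, request, and acknowledgment) are similarly
small, which is why an in-process DHCP client can be so much quicker than a
chain of external helpers.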
ConnMan features
The feature list for ConnMan is quite large already, and Holtmann went
through a laundry list of them. Obviously, WiFi, Bluetooth, Ethernet, and
WiMAX are "have to have" features, he said, but ConnMan provides quite a
bit more than just the connectivity options.
There are various low-level features that were mentioned earlier like DHCP
(both client and server), WPAD and WISPr, and support for timezone
switching. In addition, support for iptables, 6to4 tunnels, DNS
resolver and proxy/cache, an HTTP client, tethering support, and more are
available. There is also a "personal firewall" feature that is
"under discussion right now", he said.
Beyond that, there are two different APIs available for different kinds of
applications to use. The Service API is for configuration and is used by
the ConnMan UI. It unifies the configuration for all of the different connection
options (WiFi, Bluetooth, etc.) as well as providing a single
signal-strength API.
The Session API is meant for applications to use to
monitor the internet connection. Each application can have one or more
sessions that correspond to the different connections it is making. The API
provides IP address change notifications, so that the application can
transparently reconnect. It also allows applications to give priority
information regarding how quickly their data needs to be handled (for "realtime" audio
vs. background network syncing for example). It was designed with
handset and in-vehicle-infotainment (IVI) needs in mind, Holtmann said.
Hotspot login "drove me crazy for a really long time", he
said, but ConnMan now has WISPr 1.0 support that works correctly, unlike
many other implementations (including the iPhone). It doesn't use a browser but does require an
HTTP client. With ConnMan, a device can roam between different
WISPr-supporting hotspots using a password agent to provide the proper
credentials.
ConnMan also supports WISPr 2.0, but none of the hotspot providers do, so
he has been unable to test it. This will support "real" WiFi offloading,
and it doesn't require a username/password because the SIM card credentials
are used to authenticate.
Proxy support is another problem area that has been solved in ConnMan,
Holtmann said. A user's device may need a proxy to reach the internet when
they are at work and either need a different proxy or none at all when at
home. Changing proxies is difficult to do under Linux, he said. Proxy
auto-configuration (PAC) is one solution, but it is JavaScript-based.
Since they don't want a JavaScript interpreter in every system, a separate
solution was needed.
PAC files can be large (he mentioned Intel's being 4M) and must be
downloaded each time a connection is made on networks that use it. To
avoid each application requiring its own PAC support, ConnMan centralizes
that information, but calls out over D-Bus to the pacrunner daemon
to get the required configuration. The implementation is "a little
bit nasty, but it works pretty well" in practice, he said, and it
saves users from having to fiddle with proxy configuration.
Full Network Time Protocol (NTP) support is really only needed for
data centers, so ConnMan uses Simple NTP (SNTP) instead. That reduced the
footprint and external dependencies required while still providing
reasonable time synchronization.
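As an illustration of just how light that dependency is, a complete SNTP
query needs one 48-byte packet in each direction. A rough Python sketch (the
server name is only an example) looks like this:

    import socket
    import struct
    import time

    NTP_TO_UNIX = 2208988800       # seconds between the 1900 and 1970 epochs

    # 48-byte SNTP request: LI=0, version=3, mode=3 (client); the rest is zero.
    request = b"\x1b" + b"\x00" * 47

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(5)
    sock.sendto(request, ("pool.ntp.org", 123))   # example server
    reply, _ = sock.recvfrom(48)

    # The transmit timestamp begins at byte 40; its first word is whole seconds.
    seconds = struct.unpack("!I", reply[40:44])[0]
    print(time.ctime(seconds - NTP_TO_UNIX))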
ConnMan had full support for tethering via USB, WiFi, and Bluetooth before
either Android or iOS, Holtmann said. It integrates with wpa_supplicant,
BlueZ, and the USB gadget subsystem. In addition,
there is internal iptables handling
as well as DHCP and DNS proxy handling support for ConnMan tethering.
The final feature that Holtmann described is the statistics gathering for
ConnMan. Different connection types have different limits, especially when
roaming, so it is important for users to know what they have used. Also,
some connection types should only be allowed for certain applications, he
said. IVI systems may have a SIM, but the car manufacturer may only want
that used for certain critical functions (navigation, system updates,
etc.). There is Session API support for per-application statistics, but
there is still more work to do on that.
In answer to audience
questions, Holtmann noted that MeeGo is the driving force behind ConnMan,
but that others use it too, including the GENIVI Alliance for IVI
applications as well
as manufacturers of other small embedded Linux devices. ChromeOS uses a
version of
ConnMan that was forked over a year ago—which is a bit surprising:
"do they want the new features or not?". For those who
want to try it on the desktop, he said that Ubuntu is the
only regular distribution that currently has ConnMan packages.
In summary, ConnMan is "fast and simple" and does what users
expect it to do, Holtmann said. The Session API is unique to ConnMan as
far as he knows, and will be very useful to applications. There will be more advanced
features coming in the future, he said. Overall, Holtmann made a pretty
compelling argument for looking at ConnMan as an alternative to
NetworkManager (though he largely avoided talking about the latter),
especially for the mobile device use-cases that it targets.
[ I would like to thank the Linux Foundation for travel assistance to
attend LinuxCon. ]
Comments (17 posted)
By Jake Edge
August 31, 2011
It will come as no surprise to regular LWN readers that the patent situation
for mobile Linux (and mobile devices in general) is an enormous mess. Open
Invention Network CEO Keith Bergelt spoke at LinuxCon to outline how he
sees the current landscape and to impart some thoughts on where he sees
things going from here. In addition, he described several ways that the
community can get involved to help beat back the patent threat, which is
most prominent in the mobile space, but certainly not limited to that
particular sphere.
Android rising
Bergelt said that his talk would center around Android, because it is the
"focus of a lot of ire from Microsoft", but that the same
threats exist against any mobile Linux system that becomes popular. The
"threat landscape" is very dynamic, he said, because it is constantly
changing as various players acquire more patents. Google's move to acquire
Motorola Mobility is a "very significant" move that could
also change things.
Clearly, Linux is on the rise in the mobile space. Right now it is Android
that is leading the way, but he is "hopeful there will be
more", citing webOS, LiMo, and MeeGo as possibilities. It is
"really a two-horse race" at the moment, between iOS and
Android, but others may come along. That would be good because it would
offer more freedom of choice, he said.
The rise of Android has been "unprecedented". If you were
looking forward from 18 months ago, you couldn't imagine that something
would displace iOS on mobile devices, but that's what Android has done.
Android now has an "irreversible position in the mobile
space", he said.
He put up a famous (or infamous) graphic that circulated earlier this year
showing all of the different patent lawsuits currently
pending against Android devices. While many may have seen that graphic elsewhere,
Bergelt said, he credits Microsoft for it. We should credit who
created the graphic "rather than who is pushing it", he said.
When something is successful, it attracts attention; that is what is
happening with Android right now, and graphics like this one are evidence
of that.
Are the current lawsuits a Linux concern or just an Android concern, he
asked. It would be easy to see them as only a problem for Android itself,
because, other than the kernel, Android shares little with a traditional
Linux platform. But you rarely will see an actual Linux lawsuit, Bergelt
said, because it has been developed for 20 years in the open. Instead,
opponents have "patents on adjacent technologies" that are
used to go after Linux-based systems.
Until MeeGo or webOS mature and get significant market share, "mobile
Linux is Android". Whether one thinks that Android is the
"perfect implementation" of Linux or not, the community needs
to "be in support of the mobile Linux that's out there", he
said. When other mobile Linux systems mature, "we should support
them equally as well".
It is important to "ensure that Android is not pulled off the
shelves", Bergelt said. Microsoft and Apple would like to see
Android pulled and are using their patent portfolios to "slow or stall the
commercial success of Linux". Until other mobile platforms emerge,
threats against Android equate to threats against Linux. Android's
viability is needed to prove that there is a market for Linux-based
platforms, he said.
Secondary market for patents
The stakes are so high that the secondary market for patents is
"overheated", Bergelt said. The "per patent price has
risen to astronomical levels", which is well beyond any reasonable
level for acquiring purely defensive patents. It is not about acquiring
patents for licensing revenue either: "You are not going to get your
money back from licensing them; that's ridiculous", he said.
Companies are recognizing that this "land grab for patents"
provides an opportunity to get more for their patents than they would be
able to otherwise, which is putting more patents on the market.
The Nortel patents (which recently sold for $4.5 billion to a consortium
including Apple and Microsoft) are particularly worrisome, Bergelt said,
because they cover mobile communications and network management. The US
Department of Justice (DoJ) is looking into that transaction, and members
of the community can help inform the agency that there are concerns about
those patents being used in anti-competitive ways. A resolution like what
occurred with the Novell patents, where OIN can license them indefinitely,
would be good. That particular outcome deprived Microsoft of the ability
to own the Novell patents because of its history of anti-competitive
behavior, he said.
Bergelt said that he has some empathy for Microsoft, because the company's
history is weighing it down. "If the only thing you've known is
being a monopolist, that's how you are going to work", he said. But
the DoJ needs accurate information about previous and current behaviors
of the purchasers of the Nortel patents. He encouraged those in the
audience who knew of such behaviors to report them to the agency so that it
could have a "balanced view" of the situation. The DoJ
employees are "bright and accomplished", but patent-based
anti-competitive behavior is not something they normally consider, he said.
Companies that are pursuing the strategy of using patents to slow or stall
competitors aren't trying to educate anyone; they are, instead,
"interested in threatening people". But, "illegal
behavior is illegal behavior, and that's what they're practicing",
he said. Microsoft and Apple would much rather have it be a duopoly,
rather than dealing with the "disruptive situation" that Linux
brings. Neither of those two companies "have the ability to compete
with Linux in the long term", Bergelt said.
The idea is to "tax" Linux heavily with licensing fees. Microsoft has
pursued a "totem-building strategy", where it gets companies
to license its "Linux patents", often by throwing those patent licenses
into other, unrelated deals. This "creates a presumption"
that the licenses have value. There is also a more targeted component
where the company uses the other licensees—who may be price-insensitive and thus
willing to sign suboptimal agreements—as a weapon against
smaller, more price-sensitive companies. Microsoft will also use its
patents on a particular technology as the centerpiece and throw in other
patent licenses as part of any deal. The FAT filesystem patents, which
expire soon, have been used that way. More recently, "active sync" is
being used as a centerpiece, and the company claims ten patents on that
technology.
But Microsoft cannot use the Novell patents in this way, and that's what
Bergelt would like to see happen with the Nortel patents as well. Right now, the
licensing fee that is being charged is $15 per mobile device, but Microsoft
would like to get that up to $30-40 by adding on other patents. Apple's
"touch" patents—which were mostly acquired, not developed by
Apple—are being used in this way as well. This can change the
decisions that vendors and mobile carriers make because at some point it
becomes uneconomical to pay higher per unit royalties, he said.
There is also the problem of "opportunistic patent
aggressors", which are typically "non-practicing entities" (NPEs),
also known as "patent trolls". These organizations are focused on
generating a return. He pointed to Intellectual Ventures (IV) as the
"largest patent troll in the world". IV has used investment
from universities and foundations—fooled by misleading
information into investing in the
company—to amass an enormous patent portfolio of 34,000 worldwide
patents in 9,000 patent families, he said. IV is
"not an innovation company", but is, instead, a "business
designed to use patents to drive return".
The market for patents has led companies like InterDigital to put
themselves on sale, Bergelt said. That company has 2500+ patents that
"almost exclusively relate to mobile communication", and have
generated billions of dollars in traditional licensing revenue. Its
patents still have life left, but the overheated market provides a way to
"cash out" its portfolio.
In addition, financial services firms are pouring "billions"
into patent companies, and they are looking for a return on those
investments, he said.
Fighting the good fight
"Things are going to get worse before they get better",
Bergelt said, which echoes numerous observers of the patent mess. He sees
a need for "more people to work together" to try to,
eventually, fix the problem. There are so many patents that shouldn't have
been issued, "free radicals" he called them, that it will take
a long time to undo that. Part of the problem is that "code is not
searchable in a way that's useful" to determine "prior art", so
patent examiners don't have an easy way to disallow patents based on
earlier implementations of the idea.
There are several defensive patent pools that have spent "billions to
acquire patents". These include RPX, which has 100 members, and AlliedSecurityTrust (AST),
which has 22 members, as well as OIN itself. OIN is a "very peculiar
company" in that has six members but is "tasked with
protecting the community". OIN and its members know that the
community is "where new innovation is coming from", Bergelt
said, and those innovations can be used to build billion dollar companies.
There is work to be done on mobilizing the open source software community
to help fight these patents, he said. There is a "tremendous amount
of prior art" that has not been identified, so OIN and others have
been working on "structures" where developers can document
their ideas in ways that can be used by the patent office. One of those is
the "defensive publication", which is like a "patent without
claims". OIN has spent "tens of thousands of dollars"
to try to educate developers on how to defensively publish their ideas.
In addition, there are opportunities for the community to identify existing
prior art that can limit the claims or possibly invalidate patents that
are in the examination process.
Unlike a technology platform that can be "overtaken by
events", open source is a social phenomenon that is unstoppable,
Bergelt said; we are not going back to the siloed world. Collaboration
"low in the stack", while competing high in the stack, where
companies may have intellectual property interests, is the way new systems
will be developed.
Bergelt also gave an overview of the work that the Linux Defenders project is doing with
help from the community. It
is highlighting patent applications that shouldn't go forward by pointing
out prior art. That means that the community "saves us the problem
of having another free radical". After patents are issued, the
Post-Issue Peer
to Patent initiative allows the community to potentially invalidate or
limit the scope of bad patents. But in order for those projects to work,
more community involvement is needed, he said.
The "stakes have been raised", Bergelt said, and the computing
landscape is being reshaped by smartphones. New technologies are going to
allow these devices to go way beyond where they are today, he said, and
that's why he's excited to see things like MeeGo and webOS. Microsoft is
now recognizing that personal computing is undergoing a major shift, and
it (and others) are fighting the competition in the mobile space with any
weapons they can find.
Community engagement is needed in several areas, but identifying and
codifying prior art is the biggest piece. We will see lots of
bidding for the InterDigital portfolio over the next several months, there
will be more IP speculators and trolls trying to cash in, and
anti-competitive actions from larger companies will take place.
We should support Android and the platforms that come after it and remember
that our opponents
"are going to fight like hell",
Bergelt said.
After Bergelt finished, the Linux Foundation's legal counsel, Karen
Copenhaver, amplified one part of Bergelt's message. The DoJ, she said, is
waiting to let things play out with the Nortel patents to see if there is a
big lawsuit or International Trade Commission (ITC) action using those
patents. But the impact of the patents happens "long before
that" in meetings between the patent consortium and vendors. So it
is imperative that we provide the DoJ information on how these patents
affect Linux well before any litigation occurs, she said. Both Copenhaver and
Bergelt were clearly reaching out to vendors and others who have been
threatened with patent actions by Microsoft, Apple, or other members of
the patent-purchasing consortium.
[ I would like to thank the Linux Foundation for travel assistance to
attend LinuxCon. ]
Comments (9 posted)
The US Labor Day holiday is September 5. In celebration, we will attempt
to labor a bit less than usual, with the result that the Weekly Edition
that would normally come out on September 8 will be published on the
9th instead.
Comments (1 posted)
Page editor: Jonathan Corbet
Security
By Jake Edge
August 31, 2011
A rather potent denial of service (DoS) vulnerability in the Apache HTTP
server has dominated the security news over the last week. It was first
reported by way of a proof-of-concept
posted to the full-disclosure mailing list on August 20. The problem
itself is due to a bug in the way that Apache implemented HTTP Range
headers, but there is also an underlying problem in the way those HTTP
headers are defined.
The proof-of-concept (posted by "Kingcope") is a fairly simple Perl script
that makes multiple connections to a web host, each with a long, highly
redundant, and arguably invalid Range header.
Only a small number
of these connections can cause enormous memory and CPU usage on the web
host. This is a classic "amplification" attack, where a small amount of
resources on the attacker side can lead to consuming many more victim-side
resources. In some sense, normal HTTP requests are amplifications, because
a few bytes of request can lead to a multi-megabyte response (a large PDF
for example), but this attack is different. A single request can lead to
multiple partial responses, each with their own header and, importantly,
server overhead. It is the resources required to create the responses that
lead to the DoS.
The Range header is meant to be used to request just a portion of
the resource, but, as we see here, it can be abused. The idea is that a
streaming client or other application can request chunks of the resource as
it plays or displays them. An HTTP request with the following:
Range: bytes=512-1023
would be asking for 512 bytes starting at the 513th (it is zero-based).
But it is not just a single range that can be specified:
Range: bytes=512-1023,1024-2047
would request two ranges, each of which would be sent in either a separate
response (each with a Content-Range header) or in a multipart response
(i.e. with a multipart/byteranges Content-Type).
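A well-behaved multi-range request is easy to try out; the following Python
sketch (the host and path are placeholders, and a server is free to ignore
the header and simply return the whole resource with a 200) shows what the
client side looks like:

    import http.client

    # Host and path are placeholders.  A server that honors the header
    # answers with "206 Partial Content"; one that does not just sends
    # the whole resource.
    conn = http.client.HTTPConnection("www.example.com")
    conn.request("GET", "/index.html",
                 headers={"Range": "bytes=512-1023,1024-2047"})
    resp = conn.getresponse()
    print(resp.status, resp.reason)
    print(resp.getheader("Content-Type"))  # multipart/byteranges for multi-range replies
    body = resp.read()
    print(len(body), "bytes returned")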
Each of those examples is fairly benign. The problem stems from requests
that look like:
Range: bytes=0-,5-0,5-1,...,5-6,5-7,5-8,5-9,5-10,...
which requests the whole file (0-) along with several nonsensical ranges
(5-0, 5-1, ...) as well as a bunch of overlapping ranges. The example is
taken from the proof-of-concept code (which creates 1300 ranges for each
request), and other kinds of lengthy range requests will also cause the DoS.
When it receives range requests, Apache dutifully creates pieces of a multipart
response for each range specified. That eats up both memory and CPU on the
server, and doing so tens or hundreds of times for multiple attacker
connections is enough to exhaust the server resources and cause the DoS.
The range requests in the attack are obviously not reasonable, but they are
legal according to the HTTP specification. There is discussion
of changing the specification, but that doesn't help right now.
An obvious solution would be to sort the ranges (which need not appear in
any specific order in the request) and coalesce those that are adjacent or
overlap. If the coalesced result turned out to cover the entire file, the
server could just send that instead. Unfortunately, (arguably) badly written applications or
browsers may be expecting to get multipart responses in the same order, and
with the same lengths, as
specified in the request. In addition, the HTTP specification does not allow
that kind of reordering.
So the Apache solution is to look at the
sum of the lengths of the ranges in a request
and, if that's greater than the size of the requested file, just send the
whole file. That will defeat the proof-of-concept and drastically reduce
the amplification factor that any particular request can cause. It doesn't
completely solve the problem, but it alleviates the worst part. Attackers
can still craft nonsensical range requests, but the number of responses
they can generate is vastly reduced.
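The idea behind that check is simple enough to fit in a few lines. Here is a
rough Python sketch of the logic; it is not Apache's actual code, and it
ignores malformed ranges (such as "5-0") that a real server must also deal
with:

    def ranges_exceed_file(range_header, file_size):
        # Sum the lengths of the requested ranges; if the total is larger
        # than the file itself, treat the request as abusive and send the
        # whole file instead.
        spec = range_header.split("=", 1)[1]
        total = 0
        for part in spec.split(","):
            start, _, end = part.strip().partition("-")
            if start == "":                  # suffix range: "-500" = last 500 bytes
                total += min(int(end), file_size)
            elif end == "":                  # open-ended range: "512-"
                total += file_size - int(start)
            else:
                total += int(end) - int(start) + 1
        return total > file_size

    print(ranges_exceed_file("bytes=512-1023,1024-2047", 100000))    # False
    print(ranges_exceed_file("bytes=0-," +
                             ",".join("5-%d" % i for i in range(1300)),
                             100000))                                # True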
While Apache's implementation of range requests is fairly resource-hungry,
which makes it easier to cause the DoS,
the HTTP protocol bears some of the blame here too. Allowing multiple
overlapping ranges does not really make any sense, and not allowing servers
to reorder and coalesce adjacent ranges seems poorly thought-out as well.
No matter how efficiently implemented, allowing arbitrary ranges of that
sort is going to lead to some sort of amplification effect.
Apache's fix is for the stable 2.2 series, so it is necessarily fairly
conservative. There are ongoing discussions in the Apache dev
mailing list that indicate a more robust fix—probably following the
proposed changes to the HTTP specification—is in the works for the 2.3
development branch (which will eventually become the stable 2.4 release).
As of this writing, only Debian has released a fix for the problem (which
it did based on the patch while it was being tested and before Apache announced
its fix). Other distributions are sure to follow. Since it is trivially
easy to make an unpatched server unresponsive, it probably makes sense to
use one of the mitigation techniques
suggested by
Apache until a server update is available.
[ A word of warning to those who may be tempted to try the proof-of-concept
code: while limiting the number-of-forks command-line parameter to 1 may
seem like a good idea for testing purposes, it doesn't actually work in
practice. If that parameter is <= 1, the code sets it to 50, which is
enough to DoS a server—trust me on that last part. ]
Comments (4 posted)
Brief items
On July 19th 2011, DigiNotar detected an intrusion into its Certificate Authority (CA) infrastructure, which resulted in the fraudulent issuance of public key certificate requests for a number of domains, including Google.com.
Once it detected the intrusion, DigiNotar has acted in accordance with all relevant rules and procedures.
At that time, an external security audit concluded that all fraudulently issued certificates were revoked. Recently, it was discovered that at least one fraudulent certificate had not been revoked at the time. After being notified by Dutch government organization Govcert, DigiNotar took immediate action and revoked the fraudulent certificate.
--
DigiNotar 'fesses up
Diginotar indeed was hacked, on the 19th of July, 2011. The attackers were able to generate several fraudulent certificates, including possibly also EVSSL certificates. But while Diginotar revoked the other rogue certificates, they missed the one issued to Google. Didn't Diginotar think it's a tad weird that Google would suddenly renew their SSL certificate, and decide to do it with a mid-sized Dutch CA, of all places? And when Diginotar was auditing their systems after the breach, how on earth did they miss the Iranian defacement discussed above?
--
F-Secure is not so sure we have the full DigiNotar story
None of the recipients were people who would normally be considered high-profile or high-value targets, such as an executive or an IT administrator with special network privileges. But that didn't matter. When one of the four recipients clicked on the attachment, the attachment used a zero-day exploit targeting a vulnerability in Adobe Flash to drop another malicious file — a backdoor — onto the recipient's desktop computer. This gave the attackers a foothold to burrow farther into the network and gain the access they needed.
--
Wired
on an RSA phishing attack that may have led to the SecurID disclosure
I remember back at the government fear mongering after 9/11. How there were hundreds of sleeper cells in the U.S. How terrorism would become the new normal unless we implemented all sorts of Draconian security measures. You'd think that -- if this were even remotely true -- we would have seen more attempted terrorism in the U.S. over the past decade.
--
Bruce
Schneier
Comments (1 posted)
The Mozilla Security Blog carries
an
advisory that DigiNotar has revoked a fake digital certificate it
issued for Google's domain. "
Users on a compromised network could be
directed to sites using a fraudulent certificate and mistake them for the
legitimate sites. This could deceive them into revealing personal
information such as usernames and passwords. It may also deceive users into
downloading malware if they believe it's coming from a trusted site. We
have received reports of these certificates being used in the wild."
Updates to Firefox, Thunderbird, and SeaMonkey are being released in response.
Update: see this
EFF release for a lot more information; it does not look good.
"Certificate authorities have been caught issuing fraudulent
certificates in at least half a dozen high-profile cases in the past two
years and EFF has voiced concerns that the problem may be even more
widespread. But this is the first time that a fake certificate is known to
have been successfully used in the wild. Even worse, the certificate in
this attack was issued on July 10th 2011, almost two months ago, and may
well have been used to spy on an unknown number of Internet users in Iran
from the moment of its issuance until it was revoked earlier today."
Comments (89 posted)
The Apache project has updated its advisory on the recently-disclosed
denial-of-service vulnerability. The news is not good: the scope of the
vulnerability has grown, the workarounds have become more complex, and
there is still no fix available. "
There are two aspects to this vulnerability. One is new, is Apache specific;
and resolved with this server side fix. The other issue is fundamentally a
protocol design issue dating back to 2007."
Full Story (comments: 7)
Apache has released an update to its HTTP server that fixes the denial of service problem that was
reported on August 24 (and
updated on August 26). We should see updates from distributions soon, though it should be noted that Debian put out an
update on August 29. "
Fix handling of byte-range requests to use less memory, to avoid
denial of service. If the sum of all ranges in a request is larger than
the original file, ignore the ranges and send the complete file."
Full Story (comments: 3)
New vulnerabilities
apache2: denial of service
Package(s): apache2
CVE #(s): CVE-2011-3192
Created: August 30, 2011
Updated: October 14, 2011
Description: From the Debian advisory:
A vulnerability has been found in the way the multiple overlapping
ranges are handled by the Apache HTTPD server. This vulnerability
allows an attacker to cause Apache HTTPD to use an excessive amount of
memory, causing a denial of service.
Comments (none posted)
apache-commons-daemon: remote access to superuser files/directories
Package(s): apache-commons-daemon
CVE #(s): CVE-2011-2729
Created: August 29, 2011
Updated: December 12, 2011
Description: From the CVE entry:
native/unix/native/jsvc-unix.c in jsvc in the Daemon component 1.0.3 through 1.0.6 in Apache Commons, as used in Apache Tomcat 5.5.32 through 5.5.33, 6.0.30 through 6.0.32, and 7.0.x before 7.0.20 on Linux, does not drop capabilities, which allows remote attackers to bypass read permissions for files via a request to an application.
Comments (none posted)
hplip: remote code execution
Package(s): hplip
CVE #(s): CVE-2004-0801
Created: August 25, 2011
Updated: August 31, 2011
Description: From the Novell vulnerability entry:
Unknown vulnerability in foomatic-rip in Foomatic before 3.0.2 allows local users or remote attackers with access to CUPS to execute arbitrary commands.
Comments (none posted)
pidgin: possible buffer overflows
Package(s): pidgin
CVE #(s): (none)
Created: August 31, 2011
Updated: August 31, 2011
Description: The pidgin 2.10.0 release features the removal of a lot of calls to unsafe string functions, closing a number of potential buffer overflows. See the changelog for details.
Comments (none posted)
selinux-policy: policy updates
Package(s): selinux-policy
CVE #(s): (none)
Created: August 25, 2011
Updated: August 31, 2011
Description: From the Scientific Linux advisory:
* Prior to this update, the SELinux policy package did not allow the
RHEV agent to execute. This update adds the policy for RHEV agents, so
that they can be executed as expected.
* Previously, several labels were incorrect and rules for creating new
389-ds instances were missing. As a result, access vector caches (AVC)
appeared when a new 389-ds instance was created through the 389-console.
This update fixes the labels and adds the missing rules. Now, new 389-ds
instances are created without further errors.
* Prior to this update, AVC error messages occurred in the audit.log
file. With this update, the labels causing the error messages have been
fixed, thus preventing this bug.
Comments (none posted)
vpnc: remote command injection
Package(s): vpnc
CVE #(s): CVE-2011-2660
Created: August 31, 2011
Updated: August 31, 2011
Description: The modify_resolvconf_suse script packaged with vpnc contains a flaw that could enable command injection attacks via specially-crafted DNS entries.
Comments (none posted)
xen: denial of service
Package(s): xen
CVE #(s): CVE-2011-3131
Created: August 31, 2011
Updated: September 1, 2011
Description: A xen virtual machine given control of a PCI device can cause it to issue invalid DMA requests, potentially overwhelming the host with interrupts from the IOMMU. See this advisory for details.
Comments (none posted)
Page editor: Jake Edge
Kernel development
Brief items
The current development kernel is 3.1-rc4,
released on August 28. "
Anyway, go
wild and please do test. The appended shortlog gives a reasonable idea of
the changes, but they really aren't that big. So I definitely *am* hoping
that -rc5 will be smaller. But at the same time, I continue to be pretty
happy about the state of 3.1 so far. But maybe it's just that my meds are
finally working." See
the
full changelog for all the details.
Stable updates: the 2.6.32.46, 2.6.33.19, and 3.0.4 updates were released on August 29
with the usual set of important fixes.
Comments (none posted)
POSIX has been wrong before. Sometimes the solution really is to
say "sorry, you wrote that 20 years ago, and things have changed".
--
Linus Torvalds
It only takes a little multicast to mess up your whole day.
--
Dave Täht
For the last ~6 months the Broadcom team has been working on
getting their driver out of staging. I have to believe that they
would have rather been working on updating device support during
that time. I can only presume that they would make that a priority
in the long run.
How many times has b43 been > ~6 months behind on its hardware
support? Despite Rafał's recent heroic efforts at improving that,
I can't help but wonder how long will it be before b43 is again
dreadfully behind?
--
John Linville
Bad English and a github address makes me unhappy.
--
Linus Torvalds
Comments (4 posted)
The
main kernel.org page is currently
carrying a notice that the site has suffered a security breach.
"
Earlier this month, a number of servers in the kernel.org
infrastructure were compromised. We discovered this August 28th. While we
currently believe that the source code repositories were unaffected, we are
in the process of verifying this and taking steps to enhance security
across the kernel.org infrastructure." As the update mentions,
there's little to be gained by tampering with the git repositories there
anyway.
Comments (71 posted)
Kernel development news
By Jonathan Corbet
August 29, 2011
The 32-bit x86 architecture has a number of well-known shortcomings. Many
of these were addressed when this architecture was extended to 64 bits by
AMD, but running in 64-bit mode is not without problems either. For this
reason, a group of GCC, kernel, and library developers has been working on
a new machine model known as the "x32 ABI." This ABI is getting close to
ready, but, as a recent discussion shows, wider exposure of x32 is bringing
some new issues to the surface.
Classic 32-bit x86 has easily-understood problems: it can only address 4GB
of memory and its tiny set of registers slows things considerably. Running
a current processor in the 64-bit mode fixes both of those problems nicely,
but at a cost: expanding variables and pointers to 64 bits leads to
expanded memory use and a larger cache footprint. It's also not uncommon
(still) to find programs that simply do not work properly on a 64-bit
system. Most programs do not
actually need 64-bit variables or the ability to address massive amounts of
memory; for that code, the larger data types are a cost without an
associated benefit. It would be really nice if those programs could take
advantage of the 64-bit architecture's additional registers and instructions
without simultaneously paying the price of increased memory use.
That best-of-both-worlds situation is exactly what the x32 ABI is trying to
provide. A program compiled to this ABI will run in native 64-bit mode,
but with 32-bit pointers and data values. The full register set will be
available, as will other advantages of the 64-bit architecture like the
faster SYSCALL64 instruction. If all goes according to plan, this
ABI should be the fastest mode available on 64-bit machines for a wide
range of programs; it is easy to see x32 widely displacing the 32-bit
compatibility mode.
One should note that the "if" above is still somewhat unproven: actual
benchmarks showing the differences between x32 and the existing pure modes
are hard to come by.
One outstanding question - and the spark for
the current discussion - has
to do with the system call ABI. For the most part, this ABI looks similar
to what is used by the legacy 32-bit mode: the 32-bit-compatible versions
of the system calls and associated data structures are used. But there is
one difference: the x32 developers want to use the SYSCALL64
instruction just like
native 64-bit applications do for the performance benefits. That
complicates things a bit, since, to know what data size to expect, the
kernel needs to be able to distinguish
system calls made by true 64-bit applications from those running in the x32
mode, regardless of the fact that the processor is running in the same mode in
both cases. As an added challenge, this distinction needs to be made
without slowing down native 64-bit applications.
The solution involves using an expanded version of the 64-bit system call
table. Many system calls can be called directly with no compatibility
issues at all - a call to fork() needs little in the way of data
structure translation. Others do need the compatibility layer, though. Each
of those system calls (92 of them) is assigned a new number starting at
512. That leaves a gap above the native system calls for additions over
time. Bit 30 in the system call number is also set
whenever an x32 binary calls into the kernel; that enables kernel code that
cares to implement "compatibility mode" behavior.
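For the curious, here is a short Python sketch of how such a flag bit works;
the constant simply encodes "bit 30" as described above, and the call number
520 is an arbitrary example from the range starting at 512:

    # Bit 30 marks a system call as coming from an x32 process.
    X32_SYSCALL_BIT = 1 << 30        # 0x40000000

    def decode_syscall_nr(nr):
        is_x32 = bool(nr & X32_SYSCALL_BIT)
        return is_x32, nr & ~X32_SYSCALL_BIT

    print(decode_syscall_nr(X32_SYSCALL_BIT | 520))   # (True, 520)
    print(decode_syscall_nr(39))                      # (False, 39): getpid() on x86-64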
Linus didn't seem to mind the mechanism used to distinguish x32 system
calls in general, but he hated the use of
compatibility mode for the x32 ABI. He asked:
I think the real question is "why?". I think we're missing a lot of
background for why we'd want yet another set of system calls at
all, and why we'd want another state flag. Why can't the x32 code
just use the native 64-bit system calls entirely?
There are legitimate reasons why some of the system calls cannot be shared
between the x32 and 64-bit modes. Situations where user space passes
structures containing pointers to the kernel (ioctl() and
readv() being simple examples) will require special handling since
those pointers will be 32-bit. Signal handling will always be special.
Many of the other system calls done specially for x32, though, are there to
minimize the differences between x32 and the legacy 32-bit mode. And those
calls are the ones that Linus objects to
most strongly.
It comes down, for the most part, to the format of integer values passed to
the kernel in structures. The legacy 32-bit mode, naturally, uses 32-bit
values in most cases; the x32 mode follows that lead. Linus is saying,
though, that the 64-bit versions of the structures - with 64-bit integer
values - should be used instead. At a minimum, doing things that way would
minimize the differences between the x32 and native 64-bit modes. But
there is also a correctness issue involved.
One place where the 32- and 64-bit modes differ is in their representation
of time values; in the 32-bit world, types like time_t, struct
timespec, and struct timeval are 32-bit quantities. And
32-bit time values will overflow in the year 2038. If the year-2000 issue
showed anything, it's that long-term drop-dead days arrive sooner than one
tends to think. So it's not surprising that Linus is unwilling to add a new ABI that would suffer
from the 2038 issue:
2038 is a long time away for legacy binaries. It's *not* all that
long away if you are introducing a new 32-bit mode for performance.
The width of time_t cannot change for legacy 32-bit binaries. But
x32 is an entirely new ABI with no legacy users at all; it does not have to
retain any sort of past compatibility at this point. Now is the only time
that this kind of issue can be fixed. So it is probably entirely safe to
say that an x32 ABI will not make it into the mainline as long as it has
problems like the year-2038 bug.
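For those who want to check the math, the deadline is easy to compute.
This quick Python calculation (not part of any patch, just arithmetic on
the limits of a signed 32-bit counter of seconds since the Unix epoch)
shows where the ceiling falls:

    from datetime import datetime, timezone

    # A signed 32-bit time_t can count at most 2**31 - 1 seconds past the
    # epoch; one second later, the value wraps negative.
    last_second = 2**31 - 1
    print(datetime.fromtimestamp(last_second, tz=timezone.utc))
    # 2038-01-19 03:14:07+00:00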
At this point, the x32 developers need to review their proposed system call
ABI and find a way to rework it into something closer to Linus's taste;
that process is already underway.
Then developers can get into the serious business of building systems under
that ABI and running benchmarks to see whether it is all worth the effort.
Convincing distributors (other than Gentoo, of course) to support this ABI
will take a compelling story but, if this mode lives up to its
potential, that story might just be there.
Comments (58 posted)
By Jonathan Corbet
August 31, 2011
"Writeback" is the process of writing dirty pages back to persistent
storage, allowing those pages to be reclaimed for other uses. Making
writeback work properly has been one of the more challenging problems faced
by kernel developers in the last few years; systems can bog down completely
(or even lock up) when writeback gets out of control. Various approaches
to improving the situation have been discussed; one of those is Fengguang
Wu's I/O-less throttling patch set. These changes have been circulating
for some time; they are seen as having potential - if only others could
actually understand them. Your editor doesn't understand them either, but
that has never stopped him before.
One aspect to getting a handle on writeback, clearly, is slowing down
processes that are creating more dirty pages than the system can handle.
In current kernels, that is done through a call to
balance_dirty_pages(), which sets the offending process to work
writing pages back to disk. This "direct reclaim" has the effect of
cleaning some pages; it also keeps the process from dirtying more pages
while the writeback is happening. Unfortunately, direct reclaim also tends
to create terrible I/O patterns, reducing the bandwidth of data going to
disk and making the problem worse than it was before. Getting rid of
direct reclaim has been on the "to do" list for a while, but it needs to be
replaced by another means for throttling producers of dirty pages.
That is where Fengguang's patch set comes
in. He is attempting to create a control loop capable of determining how
many pages each process should be allowed to dirty at any given time.
Processes exceeding their limit are simply put to sleep for a while to
allow the writeback system to catch up with them. The concept is simple
enough, but the implementation is less so. Throttling is easy; performing
throttling in a way that keeps the number of dirty pages within reasonable
bounds and maximizes backing store utilization while not imposing
unreasonable latencies on processes is a bit more difficult.
If all pages in the system are dirty, the
system is probably dead, so that is a good situation to avoid. Zero dirty
pages is almost as bad; performance in that situation will be exceedingly
poor. The virtual memory subsystem thus aims for a spot in the middle
where the ratio of dirty to clean pages is deemed to be optimal; that
"setpoint" varies, but comes down to tunable parameters in the end.
Current code sets a simple threshold, with throttling happening when the
number of dirty pages exceeds that threshold; Fengguang is trying to do
something more subtle.
Since developers have complained that his work is hard to understand,
Fengguang
has filled out the code with lots of documentation and diagrams. This is
how he depicts the goal of the patch set:
    ^ task rate limit
    |
    |            *
    |             *
    |              *
    |[free run]      *      [smooth throttled]
    |                  *
    |                     *
    |                         *
    ..bdi->dirty_ratelimit..........*
    |                               .     *
    |                               .          *
    |                               .              *
    |                               .                 *
    |                               .                    *
    +-------------------------------.-----------------------*------------>
                            setpoint^                  limit^  dirty pages
The goal of the system is to keep the number of dirty pages at the
setpoint; if things get out of line, increasing amounts of force will be
applied to bring things back to where they should be. So the first order
of business is to figure out the current status; that is done in two
steps. The first is to look at the global situation: how many dirty pages
are there in the system relative to the setpoint and to the hard limit that
we never want to exceed? Using a cubic polynomial function (see the code
for the grungy details), Fengguang calculates a global "pos_ratio" to
describe how strongly the system needs to adjust the number of dirty
pages.
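The exact polynomial lives in the patch set; the sketch below (in Python,
with made-up numbers) is only meant to show the shape such a cubic feedback
function has: exactly 1.0 at the setpoint, above 1.0 when the system is
cleaner than it needs to be, and falling toward zero as the number of dirty
pages approaches the hard limit.

    # Illustrative only: a cubic curve with the properties described above.
    # The real code uses fixed-point arithmetic and differs in detail.
    def global_pos_ratio(dirty, setpoint, limit):
        # x is 0 at the setpoint and -1 at the hard limit
        x = (setpoint - dirty) / (limit - setpoint)
        return max(0.0, 1.0 + x ** 3)

    limit, setpoint = 120_000, 80_000            # pages; arbitrary values
    for dirty in (40_000, 80_000, 100_000, 119_000):
        print(dirty, round(global_pos_ratio(dirty, setpoint, limit), 3))
    # Fewer dirty pages than the setpoint yields a ratio above 1.0 (throttle
    # less); approaching the limit drives the ratio toward zero (throttle hard).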
This ratio cannot really be calculated, though, without taking the backing
device (BDI) into account. A process may be dirtying pages stored on a
given BDI, and the system may have a surfeit of dirty pages at the moment,
but the wisdom of throttling that process depends also on how many dirty
pages exist for that BDI. If a given BDI is swamped with dirty pages, it
may make sense to throttle a dirtying process even if the system as a whole
is doing OK. On the other hand, a BDI with few dirty pages can clear its
backlog quickly, so it can probably afford to have a few more, even if the
system is somewhat more dirty than one might like. So the patch set tweaks
the calculated pos_ratio for a specific BDI using a complicated formula
looking at how far that specific BDI is from its own setpoint and its
observed bandwidth. The end result is a modified pos_ratio describing whether the
system should be dirtying more or fewer pages backed by the given BDI, and
by how much.
In an ideal world, throttling would match the rate at which pages are being
dirtied to the rate that each device can write those pages back; a process
dirtying pages backed by a fast SSD would be able to dirty more pages more
quickly than
a process writing to pages backed by a cheap thumb drive. The idea is simple:
if N processes are dirtying pages on a BDI with a given bandwidth, each
process should be throttled to the extent that it dirties 1/N of that
bandwidth. The problem is that processes do not register with the kernel
and declare that they intend to dirty lots of pages on a given BDI, so the
kernel does not really know the value of N. That is handled by
carrying a running estimate of N. An initial per-task bandwidth limit is
established; after a period of time, the kernel looks at the number of
pages actually dirtied for a given BDI and divides it by that bandwidth limit to
come up with the number of active processes. From that estimate, a new
rate limit can be applied; this calculation is repeated over time.
That rate limit is fine if the system wants to keep the number of dirty
pages on that BDI at its current level. If the number of dirty pages (for
the BDI or for the system as a whole) is out of line, though, the per-BDI
rate limit will be tweaked accordingly. That is done through a simple
multiplication by the pos_ratio calculated above. So if the number of
dirty pages is low, the applied rate limit will be a bit higher than what
the BDI can handle; if there are too many dirty pages, the per-BDI limit
will be lower. There is some additional logic to keep the per-BDI limit
from changing too quickly.
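Put together, the estimation loop looks something like the following
sketch. The names and units (pages and seconds) are invented for
illustration, and the smoothing logic just mentioned is omitted, but the
structure matches the description: divide the observed dirty rate by the
limit that was in force to estimate N, split the device's writeback
bandwidth N ways, then scale the result by pos_ratio.

    def next_task_ratelimit(pages_dirtied, interval, prev_task_ratelimit,
                            bdi_write_bandwidth, pos_ratio):
        # How fast were pages being dirtied on this BDI over the interval?
        dirty_rate = pages_dirtied / interval                 # pages/second
        # Running estimate of how many tasks were doing the dirtying.
        n_tasks = max(1.0, dirty_rate / prev_task_ratelimit)
        # Each task gets 1/N of the device's measured writeback bandwidth...
        balanced = bdi_write_bandwidth / n_tasks
        # ...nudged up or down by the global/per-BDI position ratio.
        return balanced * pos_ratio

    # Two tasks saturating a device that writes back 10,000 pages per second,
    # with the system a little too dirty (pos_ratio 0.8): each task is limited
    # to about 4,000 pages per second in the next interval.
    print(next_task_ratelimit(20_000, 1.0, 10_000, 10_000, 0.8))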
Once all that machinery is in place, fixing up
balance_dirty_pages() is mostly a matter of deleting the old
direct reclaim code. If neither the global nor the per-BDI dirty limits have
been exceeded, there is nothing to be done. Otherwise the code calculates
a pause time based on the current rate limit, the pos_ratio, and number of
pages recently dirtied by the current task and sleeps for that long. The maximum
sleep time is currently set to 200ms. A final tweak tries to account for
"think time" to even out the pauses seen by any given process. The end
result is said to be a system which operates much more smoothly when lots
of pages are being dirtied.
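The throttling step itself reduces to a small calculation; the sketch below
uses invented names, with pos_ratio already folded into the task's rate
limit as in the previous sketch. The pause is simply how long the task
would need, at its current limit, to pay for the pages it has just dirtied,
capped at the 200ms maximum.

    MAX_PAUSE = 0.2    # seconds; the 200ms cap mentioned above

    def dirty_pause(pages_dirtied, task_ratelimit):
        # Time needed to "earn back" the pages just dirtied at the current limit.
        return min(pages_dirtied / task_ratelimit, MAX_PAUSE)

    print(dirty_pause(pages_dirtied=400, task_ratelimit=4_000))    # 0.1 seconds
    print(dirty_pause(pages_dirtied=4_000, task_ratelimit=4_000))  # capped at 0.2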
Fengguang has been working on these patches for some time and would
doubtless like to see them merged. That may yet happen, but adding core
memory management code is never an easy thing to do, even when others can
easily understand the work. Introducing regressions in obscure workloads
is just too easy to do. That suggests that, among other things, a lot of
testing will be required before confidence in these changes will be up to
the required level. But, with any luck, this work will eventually result
in better-performing systems for us all.
Comments (9 posted)
By Jonathan Corbet
August 29, 2011
On September 9, 2010, Broadcom
announced
that it was releasing an open source driver for its wireless networking
chipsets. Broadcom had long resisted calls for free drivers for this
hardware, so this announcement was quite well received in the community,
despite the fact that the quality of the code itself was not quite up to
contemporary kernel standards. One year later, this driver is again under
discussion, but the tone of the conversation has changed somewhat.
After a year of work, Broadcom's driver may never see a mainline release.
Broadcom's submission was actually two drivers: brcmsmac for software-MAC
chipsets, and brcmfmac for "FullMAC" chipsets with hardware MAC support.
These drivers were immediately pulled into the staging tree with the understanding that
there were a lot of things needing fixing before they could make the
move to the mainline proper. In the following year, developers dedicated
themselves to the task of cleaning up the drivers; nearly 900 changes have
been merged in this time. The bulk of the changes came from Broadcom
employees, but quite a few others have contributed fixes to the drivers as
well.
This work has produced a driver that is free of checkpatch warnings, works
on both little-endian and big-endian platforms, uses kernel libraries where
appropriate, and generally looks much better than it originally did. On
August 24, Broadcom developer Henry Ptasinski decided that the time had
come: he posted a patch moving the Broadcom
drivers into the mainline. Greg Kroah-Hartman, maintainer of the staging
tree, said that he was fine with the driver
moving out of staging if the networking developers agreed. Some of those
developers raised technical issues, but it appeared that these
drivers were getting close to ready for the big move out of staging.
That was when Rafał Miłecki chimed
in: "Henry: a simple question, please explain it to me, what
brcmsmac does provide that b43 doesn't?" Rafał, naturally, is
the maintainer of the b43 driver; b43, which has been in the mainline for
years, is a driver for Broadcom chipsets developed primarily from
reverse-engineered information. It has reached a point where, Rafał
claims, it supports everything Broadcom's driver does with one exception
(BCM4313 chipsets) that will be fixed
soon. Rafał also claims that the b43 driver performs better, supports hardware that brcmsmac does not, and
is, in general, a better piece of code.
So Rafał was clear on what he thought of the brcmsmac driver (brcmfmac
didn't really enter into the discussion); he was
also quite clear on what he would like to
see happen:
We would like to have b43 supported by Broadcom. It sounds much
better, I've shown you a lot of advantages of such a
choice. Switching to brcmsmac on the other hand needs a lot of work
and improvements.
On one hand, Rafał is in a reasonably strong position. The b43 driver
is in the mainline now; there is, in general, a strong resistance to the
merging of duplicate drivers for the same hardware. Code quality is, to
some extent, in the eye of the beholder, but there have been few beholders
who have gone on record to say that they like Broadcom's driver better.
Looking at the situation with an eye purely on the quality of the kernel's
code base in the long term, it is hard to make an argument that the
brcmsmac driver should move into the mainline.
On the other hand, if one considers the feelings of developers and the
desire to have more hardware companies supporting their products with
drivers in the Linux kernel, one has to ask why Broadcom was allowed to put
this driver into staging and merge hundreds of patches improving it if that
driver was never going to make it into the mainline. Letting Broadcom
invest that much work into its driver, then asking it to drop everything
and support the reverse-engineered driver that it declined to support one
year ago seems harsh. It's not a story that is likely to prove
inspirational for developers in other companies who are considering trying
to work more closely with the kernel community.
What seems to have happened here (according mainly to a history posted by Rafał, who is not a
disinterested observer) is that, one year ago, the brcmsmac driver
supported hardware that had no support in b43. Since then, b43 has gained
support for that additional hardware; nobody objected to the addition of
duplicated driver support at that time (as one would probably expect, given
that the Broadcom driver was in staging). Rafał doesn't say whether
the brcmsmac driver was helpful to him in filling out hardware support in
the b43 driver. In the end, it doesn't matter; it would appear that the
need for brcmsmac has passed.
One of the most important lessons for kernel developers to learn is that
one should focus on the end result rather than on the merging of a specific
piece of code. One can argue that Broadcom has what it wanted now: free
driver support for its hardware in the mainline kernel. One could also
argue that Broadcom should have worked on adding support to b43 from the
beginning rather than posting a competing driver. Or, failing that, one
could say that the Broadcom developers should have noticed the improvements
going into b43 and thought about the implications for their own work.
But none of that is
going to make the Broadcom developers feel any better about how things have
turned out. They might come around to working on b43, but one expects that
it is not a hugely appealing alternative at the moment.
The kernel process can be quite wasteful - in terms of code and developer
time lost - at times. Any kernel developer who has been in the community
for any length of time will have invested significant effort into code that
went straight into the bit bucket at some point. But that doesn't
mean that this waste is good or always necessary. There would be value in
finding more reliable ways to warn developers when they are working on code
that is unlikely to be merged. Kernel development is distributed, and
there are no managers paying attention to how developers spend their time;
it works well in general, but it can lead to situations where
developers work at cross purposes and somebody, eventually, has to lose
out.
That would appear to have happened here. In the short term, the kernel and
its users have come out ahead: we have a choice of drivers for Broadcom
wireless chipsets and can pick the best one to carry forward. Even
Broadcom can be said to have come out ahead if it gains better support for
its hardware under Linux. Whether we will pay a longer-term cost in
developers who conclude that the kernel community is too hard to work with
is harder to measure. But it remains true that the kernel community can,
at times, be a challenging place to work.
Comments (73 posted)
Patches and updates
Kernel trees
Architecture-specific
Core kernel code
Development tools
Device drivers
Filesystems and block I/O
Memory management
Networking
Security-related
Benchmarks and bugs
Miscellaneous
Page editor: Jonathan Corbet
Distributions
August 31, 2011
This article was contributed by Nathan Willis
In recent weeks I have begun to delve into the exciting world of packaging — mostly to work on font packages, which are among the simplest bits of software that one can install. The dependencies and possible conflicts are few, the payload is small, and getting the package metadata right is arguably the most important part of the process. When I spotted a news item in my feed reader about a "GUI package creator," I was curious enough to take it for a test drive. The experience itself was underwhelming, but after some digging around, what I found truly perplexing was that this is an application category that seems to continually spawn new, independent development efforts, all of which result in short-lived projects.
First run
The application that I first took notice of is called Ubucompilator, a GTK+ front end for creating and modifying Debian packages. In spite of the name, it is not Ubuntu-specific, and can help you create packages for any Debian-based system. Ubucompilator has been in development since December 2009, but it only made its 1.0 release in May 2011, which sparked a flurry of news items in Ubuntu blog circles.
The bulk of those stories linked to a short YouTube video showing the program in action at the hands of its creator. Unfortunately, however, the video includes no narration, which makes it difficult to follow the on-screen action (shrunk down to YouTube size as it is). There is also no documentation available, either at the main project domain or at the Who's Who of hosting services where Ubucompilator maintains a presence: Google Code, Launchpad, SourceForge, Softpedia, and WordPress.com.
The Launchpad project pages have the most recent builds of the code, the 1.0 source package from May. It requires a runtime for Gambas2, the Visual-Basic-like language, and compiles cleanly with GNU autotools. The interface consists of a filesystem browser on the left, a textarea on the right, and a series of labeled buttons beneath, reading: "Autogen," "./Configure," "Make," "Make install," "Dh_make," "edit control file," and "Make .deb." Just above the bottom row sits a text-entry field labeled "email."
The aesthetics of the interface received more than their fair share of criticism from commenters in the blog stories. Setting that aside, what Ubucompilator appears to actually do is allow you to navigate to a directory containing buildable source code (such as an unpacked upstream package), and punch through the basic build steps with button clicks rather than shell commands. The output of each command is piped into the textarea on the right-hand side of the window.
From the perspective of a packaging-newbie, I can see some value to this approach. Obviously, the configuration and make steps are there for convenience's sake only, because if the package fails to compile, you must debug and fix it using other tools. But it ought to be assumed that any user who knows how to use GNU autotools is already familiar with these steps. On the other hand, packaging is another process entirely, so how (and, more importantly, when) to make use of dh_make may not be intuitive.
Unfortunately, this is the stage of the process where Ubucompilator breaks down. Using dh_make to convert the working source code directory into proper Debian form requires making some changes and supplying some package-dependent options, and Ubucompilator does not expose any of this functionality. In addition, the "edit control file" button simply opens the ./debian/control file (only if it already exists) in an external editor. Building the actual Debian package (for which Ubucompilator calls dpkg-buildpackage) may also require passing arguments, which is not configurable in the interface.
The result is that Ubucompilator is useful for making modifications to a source package that is already in canonical Debian form, but it does not simplify the process of building anything new, nor help with the potentially-tricky steps like determining dependencies and fixing permissions. However, as commenter mikeru observed on one blog post, the bigger problem is that the interface does not reflect the ordered steps that a user must walk through in order to build a valid package. In Ubucompilator's toolbar-like layout, all of the commands are accessible at the same time, although in order to function, they must be executed in a fixed sequence. A "wizard"-like approach would better reflect the workflow and be more effective.
Other players
Interestingly enough, the "guided steps" approach that mikeru advocated is taken by at least two other GUI Debian-package-creator tools: Debreate and Packin. However, neither of those projects is still actively developed. Both are still installable and at least partially usable on a modern desktop, however, so I decided to explore them to see how they compared.
Debreate is a Python application that wraps a GUI around most of the same functionality as Ubucompilator, with optional support for problem-detection using Lintian, and Debian-to-RPM conversion with Alien. The last release was 0.7alpha6 from September 2010.
Debreate splits the package-construction process into discrete steps, with one page devoted to each and forward/back buttons to navigate between them. Its visuals border on the headache-inducing thanks to some odd color choices (alternating red and blue colored text fields and widgets), and not every page fits onto the default window size, but it does at least present all of the necessary options to the user, clearly labeled. It even allows you to preview how settings will be saved in the control file, and provides a decent interface for maintaining a package changelog, adding scripts, and creating .desktop menu entries.
You still need to have a basic understanding of Debian packaging in order to use Debreate, because there is little in the way of built-in help. I admit, it is probably impossible to provide a "help" menu entry that explains how to write pre- and post-install scripts, but some terminology (such as "pre-depends") would benefit from tooltips.
The same goes for the interface itself: it generally makes sense at first glance, but there are scattered checkboxes and radio-buttons whose immediate effects are unclear, particularly on those pages where there is a split-pane. For example, take the editable script window and the separate add/remove script buttons: by experimentation you learn to paste your script into the window and then click the "add" button, but there are plenty of GUI applications where you would be required to add the script before you could enter its contents; both are defensible.
Still, what Debreate does a good job of is visually tracking the settings that you make to a package-in-the-rough. For example, the "Files," "Scripts," and "Changelog" pages all include editors with which you can directly make changes. The other GUI packager-builders I looked at rely on calling external editing applications, and as a result, the changes that you make (including accidents) are not visible within the interface itself.
Packin is a Java application. The last release was in July 2009, and in an excellent display of irony, the Debian package on the download site is not installable, but I had no trouble extracting the .jar file within, and coaxing it to run under the OpenJDK runtime. Like Debreate, it adopts a step-by-step "wizard" approach to package construction, although it is significantly more compact. Not as many control file fields are supported, and you get only a single line to enter dependencies and conflicts (as opposed to Debreate's complete graphical tool with its fancy pull-down menu to select between =, >=, <=, and the other version-number operators).
It also does not have an interface to attach change-logs or scripts. Unlike Debreate, you do not add individual files or folders to the package — instead you point Packin to a properly-formatted directory. To take care of the other steps, you would still need to call on an external tool like debhelper. Between the two, Debreate is definitely the more complete tool, provided that you can read through the oddly-colored UI widgets well enough to use it.
Simplifying what can't be simplified
Debreate and Packin both require you to configure and successfully
compile your source code before you begin building it into a Debian
package. In that sense, the two of them and Ubucompilator stake out
opposite ends of the process, and none of them covers it completely. There are potential uses for both. With Ubucompilator you could quickly patch an existing Debian package and rebuild it. With Debreate you can create a basic skeleton structure for a new package, taking advantage of the built-in guidance provided by the "wizard" approach. But neither one will work in a vacuum.
The naïve solution might be to suggest combining the two, but having spent some time with them, I am not convinced that a start-to-finish Debian package builder would have a market. The root of the problem is that even with an amalgam of Debreate and Ubucompilator's GUI tools, you will always need to drop out of the point-and-click environment to troubleshoot — either to debug the code or to edit scripts, permissions, and other package errors — and troubleshooting is inherently an un-simplifiable task. It is un-simplifiable because it consists primarily of the developer's own mental energy, and not repetitive processes that can be automated, or deterministic problems that can be caught and solved.
This is not news to anyone in the software engineering field, of course, but I was left wondering why so many people over the years have set out to build GUI interfaces for the task of package creation. Since the complete process cannot be reduced to point-and-click simplicity, eventually users become comfortable enough with the CLI and editor-based tools involved in the rest of the activity, and the GUI loses its appeal. It appears that each of the applications has a two-or-three year active lifespan before it slips into obscurity, yet new ones are always in development.
My guess is that packaging is perceived by people just starting out as having a significant barrier to entry, thanks to its box of specialized tools and its distinct nomenclature. To a new user, that can seem intimidating, but at the same time it is clearly a task that involves clear logic and understandable rules. Consequently, wrapping a GUI around them probably seems like building a much-needed access ramp over that barrier — even though everyone who takes the ramp eventually no longer needs it. A similar argument could be made for other recurring "simplify it with a GUI" projects, such as graphical front-ends to version control systems, or graphical SQL explorers.
Ubucompilator, Debreate, and Packin could use some improvement, but
ultimately I do not see anyone who gets into packaging using them for very
long. Thus, rather than "how can we build the perfect GUI packaging tool?"
perhaps better questions to ask are "how can we better integrate
package-building into the development environment" and "how can we best
cultivate new packagers?" The first question is up to the IDE- and
SDK-makers to answer; the second is up to everyone, but the answer is
probably a hard one — time, documentation, and an endless supply of
patience to answer questions.
Comments (18 posted)
Brief items
My interest is less in what the DPL is doing during his DPL term. I am
much more interested in what the DPL, commonly a strong personality
and well connected, is doing afterwards. Luckily, that "afterwards"
time is much longer than his/her active duty. The profile of a DPL
certainly helps a reach out, like "ah, one of us is strong in Debian,
so Debian must be good for us". This will then help our development.
So, I want the DPLs to change reasonably often. Annually sounds
good to me.
--
Steffen Möller
Comments (none posted)
The Mandriva 2011 "Hydrogen" release has been
announced.
There are lots of changes in this release, including the switch to RPM5.
See
the release
notes and
the
Mandriva 2011 Tour page for more information.
Comments (2 posted)
Distribution News
Debian GNU/Linux
The debian-services-admin list has been created. "
The intent of the
list is to be a single point of contact for issues with services deployed
on Debian infrastructure. It is intended as a reasonably low-traffic list
where everyone who's working on any such services should hang out."
Full Story (comments: none)
Red Hat Enterprise Linux
Red Hat has issued an
invitation
to Red Hat Enterprise Linux users to help discuss features for Red Hat
Enterprise Linux 7. "
The Red Hat Enterprise Linux 7 Ideas discussion group on the Red Hat Customer Portal is now open to all Red Hat subscribers - users and partners - to share use cases and discuss features.
The Red Hat Enterprise Linux 7 Ideas discussion group is an extension to the interactive processes already underway with partners, customers and contributors in the open source community. It provides a venue for sharing thoughts and use cases and is invaluable to Red Hat engineering development groups. Access to the Red Hat Customer Portal, which includes a wealth of Red Hat Enterprise Linux information and knowledge, is provided as an important component of every Red Hat subscription."
Comments (none posted)
Ubuntu family
The Ubuntu Developer Membership Board has started a vote to fill a vacant
position. The candidates are Stefano Rivera, Dave Walker, Micah Gersten,
and Charlie Smotherman. Voting closes September 6.
Full Story (comments: none)
Other distributions
Troy Dawson has been very active in the Scientific Linux community. That
will be coming to a close soon as
Troy
has accepted a job at Red Hat.
"
Thank you to everyone who has encouraged, thanked, and helped me
over the past 8 years that I have worked on Scientific Linux. I have said
it before, and I'll say it now, The Scientific Linux community is one of
the best communities there is."
Comments (none posted)
Newsletters and articles of interest
The H has a fairly in-depth
history of Gentoo Linux. It goes back to Gentoo's origins in Stampede Linux and Enoch, notes some GCC woes, and looks at the development of Portage as well as some of the more recent struggles for the distribution. "
In contrast, most of the key Gentoo packages are compiled from source to the specification of the user and the hardware, and every installation is unique. A well-driven installation will result in faster code with less fluff and bloat. The installation may take hours or days but the pay off is that it only happens once. Gentoo is a rolling release and 'version-less', and package updates are 'emerged' from the portage system on the fly."
Comments (33 posted)
Jono Bacon has put up
a tour of the Ubuntu 11.10 desktop for those who would like to see what it is going to look like. "
As you can see, Unity provides a lot of workable space and the shell just wraps around the app in the most minimal way possible to give you as much space as possible for the app. You can also see that when maximized the window buttons and menus are not shown; they only appear if you hover the window title with the mouse. This actually makes the desktop feel much nicer and less cluttered."
Comments (34 posted)
Page editor: Rebecca Sobol
Development
August 31, 2011
This article was contributed by Nathan Willis
Although most software and hardware ebook readers can cope with standard
PDF documents, text-based formats like HTML and EPUB make life easier for
the author and reader due to their flexibility and editability. They can
re-flow text to fit varying screen sizes and orientations, for example, and
better cope with missing fonts. But historically ebook authors and editors
have not had a good open source option, which meant anyone wishing to pitch
in at Project Gutenberg or start an
original book had to stick with a generic text editor or word processor.
The creators of Sigil are
attempting to change that by building a free book editor tailored to the recurring tasks in preparing a text for ebook publication.
Sigil has been in development since late 2009, and the current release is version 0.4.1. Releases are made simultaneously for Linux, Mac OS X, and Windows, with 32-bit and 64-bit versions for Linux and Windows. John Schember assumed maintainership of the project in July 2011 after founder Strahinja Marković decided to step down.
Schember pushed out the 0.4 release that had been in beta-and-RC limbo since Marković finished graduate school and work began consuming his time. There is also a degree of overlap (including Schember himself) between the Sigil development community and the Calibre ebook-manager project — which is good, considering their shared and complementary concerns. The 0.4.1 release dropped on August 26. As Schember explained it on the development blog, in his numbering scheme an N.0 release is unstable and the N.1 release indicates that the application is now ready for public use.
Sigil uses Qt as its application framework and embeds WebKit to render
HTML book contents. However, the Linux downloads are InstallJammer installers rather
than standard RPM or Debian packages, and they bundle their own copy of the
necessary .so files, placing everything in /opt/sigil/.
Presumably this installation choice is the result of structuring the code
to produce cross-platform packages for those OSes that cannot simply fetch
the necessary dependencies through their package managers. As a result of
that choice, though, the installer weighs in at 20.5MB and you end up with some
duplicate libraries. Hopefully distributions will eventually start
packaging Sigil themselves to avoid that duplication.
The installer does create .desktop launcher files on the
Desktop and in the system's "Applications -> Office" menu, however. On the
down side, it also registers itself as the default file-helper for EPUB
downloads in Firefox, which may not be the desired behavior.
Getting around the ebook
Sigil's interface resembles a cross between a lightweight HTML editor and an ebook reader — the main content is in the center pane of the window, and you can toggle between rendered, source, and split views on the text, while navigation tools sit in side panes to either side.
On the left is a "book browser." The EPUB format consists of a ZIP file with nested folders containing XHTML text, images, CSS stylesheets, embedded fonts, and two metadata files. The .opf file includes publication metadata (author, copyright, etc.) and a manifest of the book's other content files, and the .ncx file contains an XML version of the table of contents (TOC). Sigil can edit any of the text-based files; double-clicking on one in the book browser opens it in a new tab. But you also use the book browser to add, remove, and manipulate extra media, such as images and fonts. On the right is a table-of-contents list, which automatically loads the TOC from an EPUB's .ncx file. You can jump between TOC entries by clicking on them in the list.
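Since an EPUB is just a ZIP archive, it is easy to peek inside one and see
the layout described above. The following Python snippet (the file name is
hypothetical) lists the contents of a book; typical output shows an .opf
file, an .ncx file, and the XHTML, CSS, image, and font files alongside
them.

    import zipfile

    # List everything inside an EPUB container; it is an ordinary ZIP file.
    with zipfile.ZipFile("some-book.epub") as book:
        for name in book.namelist():
            print(name)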
Basic text editing is self-explanatory. Sigil provides a word-processor-like toolbar for cutting, pasting, and basic formatting, plus shortcuts to validate the HTML and insert chapter break markers. The edit menus allow you to jump to specific lines by number, and the find-and-replace function is beefy enough to accept standard wildcard characters and Perl-like regular expressions.
When it comes to formatting, however, the need for the source code and split views becomes more clear. Basic HTML tags are good enough for text formatting, but the structure of EPUB files depends on its own set of conventions. For example, I tested several Project Gutenberg EPUBs in Sigil, and noticed that it showed numerous erroneous chapter breaks. Investigating the source, I found that the .ncx file flagged every headline HTML tag as a chapter break.
According to the documentation, Sigil evidently expects
headline tags only to be used to mark structural divisions in the text, and
also interprets them as a nested tree: H2 tags are subdivisions of H1 tags,
and so on. But Project Gutenberg uses headline tags to mark other
elements, such as the author's byline on the title page, and occasionally to
mark up text itself, such as centered text or "inscriptions" that are
separated from the main copy. When brought through to Sigil, these tags are inserted into the TOC. If the text merely uses different levels of headline tag for different display styles, they get nested into a hierarchical TOC anyway.
Editing — and grappling — with EPUB
Correcting this sort of problem requires re-formatting the HTML, perhaps even munging about in the CSS — such as using <DIV> or <SPAN> tags to apply styles within the text. While paragraph indentation and text weight are simple enough for WYSIWYG editing, both of those more complex tasks are better done in a proper code editor. Fortunately Sigil's code view includes syntax highlighting, parenthesis matching, and cleanup courtesy of HTML Tidy.
Changes you make to the CSS files are instantly picked up in the
rendered HTML view, which is essential. What seems to be missing, however,
is an interface to work with the other structural elements, starting with
element ID attributes. Sigil's TOC generator creates these IDs
automatically and inserts them into the text. Element IDs
are the preferred method for linking to content within a page
(replacing older anchor tags), but Sigil assigns them automatically, using
a mnemonically unfriendly number scheme. It would be helpful to allow some control over this process, particularly for those working on very long texts.
Sigil even actively discourages you from directly editing the .ncx and .opf files in the editor, popping up a scary warning dialog that you must dismiss to continue. Perhaps that is wise, and without it too many users would foul up their TOC. But considering that Sigil can re-generate a new TOC with one button click, it seems like an unnecessary red flag.
It is also possible to break your text up into multiple XHTML files, and use multiple CSS files, but the Sigil editor offers little in the way of managing them. The convention is to use a separate XHTML file for each chapter in a long work, and while you can right-click in the book editor and create a new file, the only management options are "merge with previous" and an "Add Semantics" sub-menu where you can check special pages such as introductions or glossaries.
Beyond the text itself, Sigil allows you to edit an EPUB's metadata (which is stored inside the book's .opf file). You do this with the "Tools -> Meta Editor" window, which provides a structured editor for key:value metadata pairs. The list of supported metadata properties is long; the editor breaks it into "basic" and "advanced" groupings. ISBN, Publisher, Author, Subject and so forth are "basic," while "advanced" contains several hundred possibilities, from "Actor" and "Adapter" all the way to "Woodcutter" and "Writer of accompanying material."
The tool is simple enough to use, particularly since each metadata
property comes with a brief explanation. Otherwise I might not have
guessed my way to the correct definitions of obscure properties like
"Respondent-appellee" or "Electrotyper." Scanning through the list, it is
clear that the EPUB community has far more than generic novels in mind for
its document format: everything from screenplays to academic dissertations
to legal filings are supported. In my experience, though, most ebooks take little advantage of the rich metadata options available — and free ebooks even less so than commercial ones.
You can also add images and fonts to the ebook by right-clicking in the
book browser, although doing so only adds the files to the browser. You
still must manually insert images into the text editor, and reference the
fonts in the CSS in order to take advantage of them. When your editing
session is complete, Sigil's "Validate Epub" tool will look for errors
using a side-project validator called FlightCrew (although exactly which
errors it looks for is not documented), and "File -> Save As" will generate an EPUB output file.
Old books and new books
At the moment EPUB is the only output format supported, although Sigil can also import HTML text and PDFs. The use cases I looked at first were editing existing ebooks — such as fixing formatting problems in Project Gutenberg books. Ideally, Sigil will someday be useful for other tasks, such as converting an HTML or MOBI-formatted work into a proper EPUB, or writing a new ebook from scratch.
Calibre can perform format conversions, although it does so automatically. In order to edit the result, you must convert it first and then open the result in Sigil. That is not too time-consuming, although it would help matters if Calibre's conversion function could be called from within Sigil at import-time — as it stands, Calibre's main focus is indexing one's ebook library, which makes converting a single file tedious.
More importantly, Sigil allows you to create a "new" ebook project, and will automatically generate empty .ncx and .opf skeleton files. But here the book-management shortcomings mentioned earlier come right to the forefront. Working on any kind of multi-file document is awkward, as is changing CSS formatting to perfect the layout and look. Hopefully future revisions of the application will build more tools for authors, taking care of some of the predictable but still manual-only tasks. For example, if you add a font to an EPUB package, it would be nice to be able to highlight text in the editor and choose that font from the Format menu, or even to set it as the default for the entire book.
Colophon
Perhaps dedicated EPUB editors are not the wave of the future, and in a few years managing an ebook text will be a simple task in normal word processors, or simply a "print-to-file" style option in every application. For now, however, if you want e-reader output, you need an application that understands the formats explicitly. In that regard, Sigil is vastly superior to using an HTML editor, or generating PDFs and hoping for the best.
Now that a new maintainer is driving development, the pace at which it advances should pick up, including the weak points mentioned above. There are other nitpicks in 0.4.1, such as the number of windows that require re-sizing in order for the user to read their contents, and the fact that opening a file from the File menu closes the current editing session (although it warns about this first with a dialog box). I was greatly relieved to find that Sigil uses standard system icons and widgets throughout its UI, unlike its cousin Calibre, which tends towards the garish.
I spent some time reading about the EPUB format itself, and although it is free, a major new revision is due shortly (for some value of "shortly"). The Sigil team is following the specification process, which is good, although there does not seem to be much interest in supporting other ebook formats, such as FictionBook, Plucker, or the Kindle format.
No doubt EPUB has plenty of life left in it, but as electronic publishing consumes more and more of the publishing industry, other formats are going to become just as important (if not more so). EPUB is not well suited for documents that rely on controlled layout, such as magazines or textbooks, nor does it handle math. Closed and proprietary formats are going to make a play for those documents; with luck Sigil — and other open source tools — will be ready. If you need to edit an EPUB today, Sigil is just what you want. If you are writing from scratch, however, you might find it easier to crank out your tome in another editor, and turn to Sigil for the final formatting.
Comments (5 posted)
Brief items
In hindsight however, I think the complexity of Swig has exceeded
anyone's ability to fully understand it (including my own). For
example, to even make sense of what's happening, you have to have a
pretty solid grasp of the C/C++ type system (easier said than
done). Couple that with all sorts of crazy pattern matching,
low-level code fragments, and a ton of macro definitions, your head
will literally explode if you try to figure out what's happening.
So far as I know, recent versions of Swig have even combined all of
this type-pattern matching with regular expressions. I can't even
fathom it.
--
David Beazley
But if you want to be taken seriously as a researcher, you should
publish your code! Without publication of your *code* research in
your area cannot be reproduced by others, so it is not science.
--
Guido van Rossum
It's downright absurd for there to be a known and understood
crasher bug, affecting all users, in such a critical component for
so long without any acknowledgment or response by upstream or the
Fedora maintainers. This and the Flash audio corruption mess make
it fairly clear that glibc maintenance is not what it should be for
such a crucial package. Given that, the only sensible approach
seems to be to go ahead and Just Fix It.
--
Adam Williamson
Comments (none posted)
Version 2.4.0 of the bzr version control system is out. "
This is a bugfix and polish release over the 2.3 series, with a large number
of bugs fixed (>150 for the 2.4 series alone), and some performance
improvements. Support for python 2.4 and 2.5 has been dropped, many large
working tree operations have been optimized as well as some stacked branches
operations."
Full Story (comments: none)
A new version of GNOME Shell is available. "
While there are many
substantial features in this release, it's particular worth pointing out
the changes contributed by our Summer of Code Students: Nohemi Fernandez's
onscreen keyboard, Morten Mjelva's contact search, and Neha Doijode's work
on getting cover art and other images to display in notifications."
Full Story (comments: 27)
The
sparse
C parser has long been used to run certain types of static analysis checks
on the kernel source. It has been a slow-moving project for some time.
Now, however, Pekka Enberg and Jeff Garzik have announced a project to
couple sparse and the LLVM backend to produce a working C compiler. The
eventual goal is to compile the kernel; for now, they seem to be reasonably
happy with a working "hello world" example.
Full Story (comments: 33)
The
Opa language project has
announced its
existence. "
Opa is a new member in the family of languages
aiming to make web programming transparent by automatically generating
client-side Javascript and handling communication and session control. Opa
is written in OCaml. A hierarchical database and web server are integrated
with the language. The distribution model is based on a notion of a
session, a construct roughly comparable to process definitions in the
join-calculus or to concurrent objects in a number of formalisms."
See
this
introduction for lots of information about Opa.
Comments (7 posted)
Newsletters and articles
Over at opensource.com, Red Hat's Mike McLean
looks at the history of the Koji build system, starting from when it was an internal tool through the freeing of the code (at the Fedora Core 6 to Fedora 7 transition) to the present. "
Of course, this newly unified Fedora would need a build system and it quickly became apparent that Koji was the right tool for the job. While Fedora Extras was already using Plague, it did not satisfy the requirements for building the entire distribution. So, after much discussion, Red Hat released Koji under an open source license. Koji became a key part of Fedora's new end-to-end, free and open infrastructure."
Comments (none posted)
Michael Reed
looks
at GIMP 2.7.3 which now has Single Window Mode. "
GIMP 2.7.3 has added one of the most
requested features in the program's history: a single window mode. Version
2.7 is part of the development branch, so unfortunately, the feature wont
hit most distro repositories for a while. If you want to have a sneak peek
at the new development features, you'll probably have to compile from
scratch."
Comments (21 posted)
NetworkManager hacker Dan Williams has an
overview of the new features in NetworkManager 0.9 on his blog. Among them: "
When connected to a large unified WiFi network, like a workplace, university, or hotel, NetworkManager 0.9 enhances roaming behavior as you move between locations. By using the background scanning and nl80211 features in wpa_supplicant 0.7 and later, you'll notice fewer drops in connectivity and better signal quality in large networks. Most kernel drivers will now provide automatic updates of new access points and enhanced connection quality reporting, allowing wpa_supplicant to quickly roam to the best access point when the current access point's quality degrades and not before."
Comments (26 posted)
Page editor: Jonathan Corbet
Announcements
Brief items
On August 25, 1991, Linus Torvalds made a now-famous
post to the comp.os.minix newsgroup that announced a new free operating system. We have certainly come a long way since then (note the "bang" (!) path in the headers in the posting for instance). "
I'm doing a (free) operating system (just a hobby, won't be big and
professional like gnu) for 386(486) AT clones. This has been brewing
since april, and is starting to get ready. I'd like any feedback on
things people like/dislike in minix, as my OS resembles it somewhat
(same physical layout of the file-system (due to practical reasons)
among other things)."
Comments (28 posted)
Articles of interest
In case anybody was worried about the SCO Group's appeal with regard to
ownership of the Unix copyrights: Groklaw
carries
the news that the US 10th Circuit Court of Appeals has affirmed the
lower court judgment in all respects. "
So, SCO loses again, and
likely this is as far as it will go. Technically, SCO can ask the US
Supreme Court to hear a further appeal, but that is very unlikely to happen
and even less likely to be granted were it to happen." After all
these years, it's hard to say that it is over, but, just maybe, it's over.
Comments (1 posted)
Education and Certification
The Linux Professional Institute (LPI) will be hosting exams at Ohio
LinuxFest on September 11, 2011.
Full Story (comments: none)
Calls for Presentations
The Linux Exposition of Southern California has announced the 10th annual
Southern California Linux Expo (
SCALE 10x) will be held January
20-22, 2012 in Los Angeles, California. November 17 is the deadline for
abstract and proposal submissions.
Full Story (comments: none)
Upcoming Events
O'Reilly's
Android Open
Conference will be held October 9-11, 2011 in San Francisco,
California. Click below for a list of confirmed speakers.
Full Story (comments: none)
The Linux Foundation has announced the keynotes for LinuxCon Europe which
will take place alongside the Embedded Linux Conference Europe in Prague,
Czech Republic October 26-28, 2011.
Full Story (comments: none)
The Linux Users' Group of Davis (LUGOD) will be meeting on September 19
with a presentation on "Full Scale Flight Simulators" [rescheduled from
March 2011].
Full Story (comments: none)
Events: September 8, 2011 to November 7, 2011
The following event listing is taken from the
LWN.net Calendar.
| Date(s) | Event | Location |
| September 6 - September 8 | Conference on Domain-Specific Languages | Bordeaux, France |
| September 7 - September 9 | Linux Plumbers' Conference | Santa Rosa, CA, USA |
| September 8 | Linux Security Summit 2011 | Santa Rosa, CA, USA |
| September 8 - September 9 | Italian Perl Workshop 2011 | Turin, Italy |
| September 8 - September 9 | Lua Workshop 2011 | Frick, Switzerland |
| September 9 - September 11 | State of the Map 2011 | Denver, Colorado, USA |
| September 9 - September 11 | Ohio LinuxFest 2011 | Columbus, OH, USA |
| September 10 - September 11 | PyTexas 2011 | College Station, Texas, USA |
| September 10 - September 11 | SugarCamp Paris 2011 - "Fix Sugar Documentation!" | Paris, France |
| September 11 - September 14 | openSUSE Conference | Nuremberg, Germany |
| September 12 - September 14 | X.Org Developers' Conference | Chicago, Illinois, USA |
| September 14 - September 16 | Postgres Open | Chicago, IL, USA |
| September 14 - September 16 | GNU Radio Conference 2011 | Philadelphia, PA, USA |
| September 15 | Open Hardware Summit | New York, NY, USA |
| September 16 | LLVM European User Group Meeting | London, United Kingdom |
| September 16 - September 18 | Creative Commons Global Summit 2011 | Warsaw, Poland |
| September 16 - September 18 | Pycon India 2011 | Pune, India |
| September 18 - September 20 | Strange Loop | St. Louis, MO, USA |
| September 19 - September 22 | BruCON 2011 | Brussels, Belgium |
| September 22 - September 25 | Pycon Poland 2011 | Kielce, Poland |
| September 23 - September 24 | Open Source Developers Conference France 2011 | Paris, France |
| September 23 - September 24 | PyCon Argentina 2011 | Buenos Aires, Argentina |
| September 24 - September 25 | PyCon UK 2011 | Coventry, UK |
| September 27 - September 29 | Nagios World Conference North America 2011 | Saint Paul, MN, USA |
| September 27 - September 30 | PostgreSQL Conference West | San Jose, CA, USA |
| September 29 - October 1 | Python Brasil [7] | São Paulo, Brazil |
| September 30 - October 3 | Fedora Users and Developers Conference: Milan 2011 | Milan, Italy |
| October 1 - October 2 | WineConf 2011 | Minneapolis, MN, USA |
| October 1 - October 2 | Big Android BBQ | Austin, TX, USA |
| October 3 - October 5 | OpenStack "Essex" Design Summit | Boston, MA, USA |
| October 4 - October 9 | PyCon DE | Leipzig, Germany |
| October 6 - October 9 | EuroBSDCon 2011 | Netherlands |
| October 7 - October 9 | Linux Autumn 2011 | Kielce, Poland |
| October 7 - October 10 | Open Source Week 2011 | Malang, Indonesia |
| October 8 | PHP North West Conference | Manchester, UK |
| October 8 | FLOSSUK / UKUUG's 2011 Unconference | Manchester, UK |
| October 8 - October 9 | PyCon Ireland 2011 | Dublin, Ireland |
| October 8 - October 9 | Pittsburgh Perl Workshop 2011 | Pittsburgh, PA, USA |
| October 8 - October 10 | GNOME "Boston" Fall Summit 2011 | Montreal, QC, Canada |
| October 9 - October 11 | Android Open | San Francisco, CA, USA |
| October 11 | PLUG Talk: Rusty Russell | Perth, Australia |
| October 12 - October 15 | LibreOffice Conference | Paris, France |
| October 14 | Workshop Packaging BlankOn | Jakarta, Indonesia |
| October 14 - October 16 | MediaWiki Hackathon New Orleans | New Orleans, Louisiana, USA |
| October 15 | Packaging Debian Class BlankOn | Surabaya, Indonesia |
| October 17 - October 18 | PyCon Finland 2011 | Turku, Finland |
| October 18 - October 21 | PostgreSQL Conference Europe | Amsterdam, The Netherlands |
| October 19 - October 21 | 13th German Perl Workshop | Frankfurt/Main, Germany |
| October 19 - October 21 | Latinoware 2011 | Foz do Iguaçu, Brazil |
| October 20 - October 22 | 13th Real-Time Linux Workshop | Prague, Czech Republic |
| October 21 | PG-Day Denver 2011 | Denver, CO, USA |
| October 21 - October 23 | PHPCon Poland 2011 | Kielce, Poland |
| October 23 - October 25 | Kernel Summit | Prague, Czech Republic |
| October 24 - October 25 | GitTogether 2011 | Mountain View, CA, USA |
| October 24 - October 25 | GStreamer Conference 2011 | Prague, Czech Republic |
| October 24 - October 28 | 18th Annual Tcl/Tk Conference (Tcl'2011) | Manassas, Virginia, USA |
| October 26 - October 28 | Embedded Linux Conference Europe | Prague, Czech Republic |
| October 26 - October 28 | LinuxCon Europe 2011 | Prague, Czech Republic |
| October 28 - October 30 | MiniDebConf Mangalore India | Mangalore, India |
| October 29 | buildroot + crosstool-NG Developers' Day | Prague, Czech Republic |
| October 31 - November 4 | Ubuntu Developer Summit | Orlando, FL, USA |
| October 31 - November 4 | Linux on ARM: Linaro Connect Q4.11 | Orlando, FL, USA |
| November 1 - November 3 | oVirt Workshop and Initial Code Release | San Jose, CA, USA |
| November 1 - November 8 | 2011 Plone Conference | San Francisco, CA, USA |
| November 4 - November 6 | Fedora Users and Developer's Conference: India 2011 | Pune, India |
| November 4 - November 6 | Mozilla Festival -- Media, Freedom and the Web | London, United Kingdom |
| November 5 - November 6 | Technical Dutch Open Source Event | Eindhoven, The Netherlands |
| November 5 - November 6 | OpenFest 2011 | Sofia, Bulgaria |
If your event does not appear here, please
tell us about it.
Page editor: Rebecca Sobol