By Jonathan Corbet
November 9, 2010
Back when the 2010 Linux Plumbers Conference was looking for presentations,
the LibreOffice project had not yet announced its existence. So Michael
Meeks put in a vague proposal for a talk having to do with OpenOffice.org
and promised the organizers it would be worth their time. Fortunately,
they believed him; in an energetic closing keynote, Michael talked at
length about what is going on with LibreOffice - and with the free software
development community as a whole. According to Michael, both good and
bad things are afoot. (Michael's slides [PDF] are available for those who
would like to follow along.)
Naturally enough, LibreOffice is one of the good things; it's going to be
"awesome." It seems that there are some widely diverging views on the
awesomeness of OpenOffice.org; those who are based near Hamburg (where
StarDivision was based) think it is a wonderful tool. People in the rest
of the world tend to have a rather less enthusiastic view. The purpose of
the new LibreOffice project is to produce a system that we can all be proud
of.
Michael started by posing a couple of questions and answering them, the
first of which was "why not rewrite into C# or HTML5?" He noted with a
straight face that going to a web-based approach might not succeed in
improving the program's well-known performance problems. He also said that
he has yet to go to a conference where he did not get kicked off the
network at some point. For now, he just doesn't buy the concept of doing
everything on the web.
Why LibreOffice? Ten years ago, Sun promised the community that an
independent foundation would be created for OpenOffice.org. That
foundation still does not exist. So, quite simply, members of the
community got frustrated and created one of their own. The result, he
says, is a great opportunity for the improvement of the system; LibreOffice
is now a vendor-neutral project with no copyright assignment requirements.
The project, he says, has received great support. It is pleasing to
have both the Open Source Initiative and the Free Software Foundation
express their support, but it's even more fun to see Novell and
BoycottNovell on the same page.
Since LibreOffice launched, the project has seen 50 new code contributors
and 27 new translators, all of whom had never contributed to the project
before. These folks are working, for now, on paying down the vast pile of
"technical debt" accumulated by OpenOffice.org over the years. They are
trying to clean up an ancient, gnarled code base which has grown
organically over many years with no review and no refactoring. They are
targeting problems like memory leaks which result, Michael said, from the
"opt-in approach to lifecycle management" used in the past. After ten
years, the code still has over 100,000 lines of German-language comments;
those are now being targeted with the help of a script which repurposes the
built-in language-guessing code which is part of the spelling checker.
OpenOffice.org has a somewhat checkered history when it comes to revision
control. CVS was used for some years, resulting in a fair amount of pain;
simply tagging a release would take about two hours to run. Still, they
lived with CVS for some time until OpenOffice.org launched into a study to
determine which alternative revision control system would be best to move
to. The study came back recommending Git, but that wasn't what the
managers wanted to hear, so they moved to Subversion instead - losing most
of the project's history in the process. Later, the project moved to Mercurial,
losing history again. The result is a code base littered with
commented-out code; nobody ever felt confident actually deleting anything
because they never knew if they would be able to get it back. Many code
changes are essentially changelogged within the code itself as well. Now
LibreOffice is using Git and a determined effort is being made to clean
that stuff up.
LibreOffice is also doing its best to make contribution easy. "Easy hacks"
are documented online. The project is making a point of saying:
"we want your changes." Unit tests are being developed. The crufty old
virtual object system - deprecated for ten years - is being removed. The
extensive pile of distributor patches is being merged. And they are
starting to see the addition of interesting new features, such as inline
interactive formula editing. There will be a new mechanism whereby
adventurous users will be able to enable experimental features at run time.
What I really came to talk about was...
There is a point in "Alice's Restaurant" where Arlo Guthrie, at the
conclusion of a long-winded tall tale, informs the audience that he was
actually there to talk about something completely different. Michael did
something similar after putting up a plot showing the increase in outside
contributions over time. He wasn't really there to talk about a desktop
productivity application; instead, he wanted to talk about a threat
he sees looming over the free software development community.
That threat, of course, comes from the growing debate about the ownership
structure of free software projects. As a community, Michael said, we are
simply too nice. We have adopted licenses for our code which are entirely
reasonable, and we expect others to be nice in the same way. But any
project which requires copyright assignment (or an equivalent full-license
grant) changes the equation; it is not being nice. There is some
behind-the-scenes activity going on now which may well make things worse.
Copyright assignment does not normally deprive a contributor of the right
to use the contributed software as he or she may wish. But it
reserves to the corporation receiving the assignments the right to make
decisions regarding the complete work. We as a community have
traditionally cared a lot about licenses, but we have been less concerned
about the conditions that others have to accept. Copyright assignment
policies are a barrier to entry to anybody else who would work with the
software in question. These policies also disrupt the balance between
developers and "suit wearers," and they create FUD around free software
license practices.
Many people draw a distinction between projects owned by for-profit
corporations and those owned by foundations. But even assignment policies
of the variety used by the Free Software Foundation have their problems.
Consider, Michael said, the split between emacs and xemacs; why does xemacs
continue to exist? One reason is that a good chunk of xemacs code is owned
by Sun, and Sun (along with its successor) is unwilling to assign copyright
to the FSF. But there is also a group of developers out there who think
that it's a good thing to have a version of emacs for which copyright
assignment is not required. Michael also said that the FSF policy sets a
bad example, one which companies pushing assignment policies have been
quick to take advantage of.
Michael mentioned a study entitled "The Best
of Strangers" which focused on the willingness to give out personal
information. All participants were given a questionnaire with a long list
of increasingly invasive questions; the researchers cared little about the
answers, but were quite interested in how far participants got before
deciding they were not willing to answer anymore. Some
participants received, at the outset, a strongly-worded policy full of
privacy assurances; they provided very little information. Participants
who did not receive that policy got rather further through the
questionnaire, while those who were pointed to a questionnaire on a web
site filled it in completely. Starting with the legalese ruined the
participants' trust and made them unwilling to talk about themselves.
Michael said that a similar dynamic applies to contributors to a free
software project; if they are confronted with a document full of legalese
on the first day, their trust in the project will suffer and they may just
walk away. He pointed out the recently-created systemd project's policy,
paraphrased as "because we value your contributions, we require no
copyright assignments," as the way to encourage contributors and earn their
trust.
Assignment agreements are harmful to the hacker/suit balance. If you work
for a company, Michael said, your pet project is already probably owned by
the boss. This can be a problem; as managers work their way into the
system, they tend to lose track of the impact of what they do. They also
tend to deal with other companies in unpleasant ways which we do not
normally see at
the development level; the last thing we want to do is to let these
managers import "corporate aggression" into our community. If suits start
making collaboration decisions, the results are not always going to be a
positive thing for our community; they can also introduce a great deal of
delay into the process. Inter-corporation agreements tend to be
confidential and can pop up in strange ways; the freedom to fork a specific
project may well be compromised by an agreement involving the company which
owns the code. When somebody starts pushing inter-corporation agreements
regarding code contributions and ownership, we need to be concerned.
Michael cited the agreements around the open-sourcing of the openSPARC
architecture as one example of how things can go wrong. Another is the
flurry of lawsuits in the mobile area; those are likely to divide companies
into competing camps and destroy the solidarity we have at the development
level.
Given all this, he asked, why would anybody sign such an agreement? The
freedom to change the license is one often-cited reason; Michael says that
using permissive licenses or "plus licenses" (those which allow "any later
version") is a better way of addressing that problem. The ability to offer
indemnification is another reason, but indemnification is entirely
orthogonal to ownership. One still hears the claim that full ownership is
required to be able to go after infringers, but that has been decisively
proved to be false at this point. There is also an occasional appeal to
weird local laws; Michael dismissed those as silly and self-serving. There
is, he says, something else going on.
What works best, he says, is when the license itself is the contributor
agreement. "Inbound" and "outbound" licensing, where everybody has the
same rights, is best.
But not everybody is convinced of that. Michael warned that there is "a
sustained marketing drive coming" to push the copyright-assignment agenda.
While we were sitting in the audience, he said, somebody was calling our
bosses. They'll be saying that copyright assignment policies are required
for companies to be willing to invest in non-sexy projects. But the fact
of the matter is that almost all of the stack, many parts of which lack
sexiness, is not owned by corporations. "All cleanly-written software,"
Michael says, "is sexy." Our bosses will hear that copyright assignment is
required for companies to get outside investment; it's the only way they
can pursue the famous MySQL model. But we should not let monopolistic
companies claim that their business plans are good for free software;
beyond that, Michael suggested that the MySQL model may not look as good as
it did a year or two ago. Managers will be told that only assignment-based
projects are successful. One need only look at the list of successful
projects, starting with the Linux kernel, to see the falseness of that
claim.
Instead, Michael says, having a single company doing all of the heavy
lifting is the sign of a project without a real community. It is an
indicator of risk. People are figuring this out; that is why we're seeing
an increasing number of single-company projects being forked and
rewritten. Examples include xpdf and poppler, libart_lgpl and cairo, MySQL
and Maria. There are a number of companies, Novell and Red Hat included,
which are dismantling the copyright-assignment policies they used to
maintain.
At this point, Michael decided that we'd had enough and needed a brief
technical break. So he talked about Git: the LibreOffice project likes to
work with shallow clones because the full history is so huge. But it's not
possible to push patches from a shallow clone, which is a pain. Michael
also noted that git am is obnoxious to use. On the other
hand, he says, the valgrind DHAT
tool is a wonderful way of analyzing heap memory usage patterns and finding
bugs. Valgrind, he says, does not get anywhere near enough attention.
There was also some brief talk of "component-based everything" architecture
and some work the project is doing to facilitate parallel contribution.
The conclusion, though, came back to copyright assignment. We need to
prepare for the marketing push, which could cause well-meaning people to do
dumb things. It's time for developers to talk to their bosses and make it
clear that copyright assignment policies are not the way toward successful
projects. Before we contribute to a project, he said, we need to check
more than the license; we need to look at what others will be able to do
with the code. We should be more ungrateful toward corporations which seek
to dominate development projects, and we should get involved with more open
alternatives.
One of those alternatives, it went without saying, is the LibreOffice
project. LibreOffice is trying to build a vibrant community which
resembles the kernel community. But it will be more fun: the kernel,
Michael said, "is done" while LibreOffice is far from done. There is a lot
of low-hanging fruit and many opportunities for interesting projects. And,
if that's not enough, developers should consider that every bit of memory
saved will be multiplied across millions of LibreOffice users; what better
way can there be to offset one's carbon footprint? So, he said, please
come and help; it's an exciting time to be working with LibreOffice.
By Jonathan Corbet
November 5, 2010
Keith Packard has probably done more work to put the X Window System onto
our desks than just about anybody else. With some 25 years of history, X
has had a good run, but nothing is forever. Is that run coming to an end,
and what might come after? In his Linux Plumbers Conference talk, Keith
claimed to have no control over how things might go, but he did have some
ideas. Those ideas add up to an interesting vision of our graphical
future.
We have reached a point where we are running graphical applications on a
wide variety of systems. There is the classic desktop environment that X
was born into, but that is just the beginning. Mobile systems have become
increasingly powerful and are displacing desktops in a number of
situations. Media-specific devices have display requirements of their
own. We are seeing graphical applications in vehicles, and in a number of
other embedded situations.
Keith asked: how many of these applications care about network transparency,
which was one of the original headline features of X? How many of them care about
ICCCM compliance? How many
of them care about X at all? The answer to all of those questions, of
course, is "very few." Instead, developers designing these systems are
more likely to resent X for its complexity, for its memory and CPU footprint,
and for its contribution to lengthy boot times. They would happily get rid
of it. Keith says that he means to accommodate them without wrecking things
for the rest of us.
Toward a non-X future
For better or for worse, there is currently a wide variety of rendering
APIs to choose from when writing graphical libraries. According to Keith,
only two of them are interesting. For video rendering, there's the
VDPAU/VAAPI pair; for everything else, there's OpenGL. Nothing else really
matters going forward.
In the era of direct rendering, neither of those APIs really depends on X.
So what is X good for? There is still a lot which is done in the X server,
starting with video mode setting. Much of that work has been moved into
the kernel, at least for graphics chipsets from the "big three," but X
still does it
for the rest. If you still want to do boring 2D graphics, X is there for
you - as Keith put it, we all love ugly lines and lumpy text. Input is
still very much handled in X; the kernel's evdev interface does some of it
but falls far short of doing the whole job. Key mapping is done in X;
again, what's provided by the kernel in this area is "primitive." X
handles clipping when application windows overlap each other; it also takes
care of 3D object management via the GLX extension.
These tasks have a lot to do with why the X server is still in charge of
our screens. Traditionally mode setting has been a big and hairy task,
with the requisite code being buried deep within the X server; that has put
up a big barrier to entry to any competing window systems. The clipping
job had to be done somewhere. The management of video memory was done in
the X server, leading to a situation where only the server gets to take advantage of
any sort of persistent video memory. X is also there to make external
window managers (and, later, compositing managers) work.
But things have changed in the 25 years or so since work began on X. Back
in 1985, Unix systems did not support shared libraries; if the user ran two
applications linked to the same library, there would be two copies of that
library in memory, which was a scarce resource in those days. So it made a
lot of sense to put graphics code into a central server (X), where it could
be shared among applications. We no longer need to do things that way; our
systems have gotten much better at sharing code which appears in different
address spaces.
We also have much more complex applications - back then xterm was just
about all there was. These applications manipulate a lot more graphical
data, and almost every operation involves images. Remote applications are
implemented with protocols like HTTP; there is little need to use the X
protocol for that purpose anymore. We have graphical toolkits which can
implement dynamic themes, so it is no longer necessary to run a separate
window manager to impose a theme on the system. It is a lot easier to make
the system respond "quickly enough"; a lot of hackery in the X server (such
as the "mouse ahead" feature) was designed for a time when systems were
much less responsive. And we have color screens now; they were scarce and
expensive in the early days of X.
Over time, the window system has been split apart into multiple pieces -
the X server, the window manager, the compositing manager, etc. All of
these pieces are linked by complex, asynchronous protocols. Performance
suffers as a result; for example, every keystroke must pass through at least three
processes: the application,
the X server, and the compositing manager. But we don't
need to do things that way any more; we can simplify the architecture and
improve responsiveness. There are some unsolved problems associated with
removing all these processes - it's not clear how all of the fancy 3D bling
provided by window/compositing managers like compiz can be implemented - but
maybe we don't need all of that.
What about remote applications in an X-free world? Keith suggests that
there is little need for X-style network transparency anymore. One of the
early uses for network transparency was applications oriented around forms
and dialog boxes; those are all implemented with web browsers now. For
other applications, tools like VNC and rdesktop work and perform better
than native X. Technologies like WiDi (Intel's Wireless
Display) can also handle remote display needs
in some situations.
Work to do
So maybe we can get rid of X, but, as described above, there are still a
number of important things done by the X server. If X goes, those
functions need to be handled elsewhere. Mode setting is going into the
kernel, but there are still a lot of devices without kernel mode setting
(KMS) support. Somebody
will have to implement KMS drivers for those devices, or they may
eventually stop working. Input device support is partly handled by evdev.
Graphical memory management is now handled in the kernel by GEM in a number
of cases. In other words, things are moving into the kernel - Keith seemed
pleased at the notion of making all of the functionality be somebody else's
problem.
Some things are missing, though. Proper key mapping is one of them; that
cannot (or should not) all be done in the kernel. Work is afoot to create
a "libxkbcommon" library so that key mapping could be incorporated into
applications directly. Accessibility work - mouse keys and sticky keys,
for example - also needs to be handled in user space somewhere. The input
driver problem is not completely solved; complicated devices (like
touchpads) need user-space support. Some things need to be made cheaper, a
task that can mostly be accomplished by replacing APIs with more efficient
variants. So GLX can be replaced by EGL; in many cases, GLES can be
used instead of OpenGL; and VDPAU is an improvement over Xv. There is also
the little problem of mixing X and non-X applications while providing a
unified user experience.
Keith reflected on some of the unintended benefits that
have come from the development work done in recent years; many of these
will prove helpful going forward. Compositing, for
example, was added as a way of adding fancy effects to 2D applications.
Once the X developers had compositing, though, they realized that it enabled the
rendering of windows without clipping, simplifying things considerably. It
also separated rendering from changing on-screen content - two tasks which
had been tightly tied before - making rendering more broadly useful. The GEM code had
a number of goals, including making video memory pageable, enabling
zero-copy texture creation from pixmaps, and the management of persistent
3D objects. Along with GEM came lockless direct rendering, improving
performance and making it possible to run multiple window systems with no
performance hit. Kernel mode setting was designed to make graphical setup
more reliable and to enable the display of kernel panic messages, but KMS
also made it easy to implement alternative window systems - or to run
applications with no window system at all. EGL was designed to enable
porting of applications between platforms; it also enabled running those
applications on non-X window systems and the dumping of the expensive GLX
buffer sharing scheme.
Keith put up two pictures showing the organization of graphics on Linux.
In the "before" picture, a pile of rendering interfaces can be seen all
talking to the X server, which is at the center of the universe. In the
"after" scene, instead, the Linux kernel sits in the middle, and window
systems like X and Wayland
are off in the corner, little more than special applications. When we get
to "after," we'll have a much-simplified graphics system offering more
flexibility and better performance.
Getting there will require getting a few more things done, naturally.
There is still work to be done to fully integrate GL and VDPAU into the
system. The input driver problem needs to be solved, as does the question
of KMS support for video adaptors from vendors other than the "big three."
If we get rid of window managers, somebody else has to do that work; Windows
and Mac OS push that task into applications, and maybe we should too. But,
otherwise, this future is already mostly here. It is possible, for
example, to run X as a client of Wayland - or vice versa. The
post-X era is beginning.
November 4, 2010
This article was contributed by Neil Brown
In the first article in this
series, we commenced our historical search for design patterns in
Linux and Unix by illuminating the "Full exploitation" pattern which
provides a significant contribution to the strength of Unix. In this
second part we will look at the first of three patterns which characterize
some design decisions that didn't work out so well.
The fact that these design decisions are still with us and worth
talking about shows that their weaknesses were not immediately obvious
and, additionally, that these designs lasted long enough to become
sufficiently entrenched that
simply replacing them would cause more harm than good. With
these types of design issues, early warning is vitally important. The
study of these patterns can only serve if it helps us to avoid
similar mistakes early enough. If they only allow us to classify that
which we cannot avoid, there would be little point in studying them at
all.
These three patterns are ordered from the one which seems to give most
predictive power to that which is least valuable as an early warning.
But hopefully the ending note will not be one of complete despair -
any guidance in preparing for the future is surely better than none.
Conflated Designs
This week's pattern is exposed using two design decisions which were
present in early Unix and have been followed by a series of fixes
which have addressed most of the resulting difficulties. By understanding the
underlying reason that the fixes were needed, we can hope to avoid
future designs which would need such fixing.
The first of these design decisions is taken from the implementation of the
single namespace discussed in part 1.
The mount command
The central tool for implementing a single namespace is the 'mount'
command, which makes the contents of a disk drive available as a
filesystem and attaches that filesystem to the existing
namespace. The flaw in this design which exemplifies this pattern is
the word 'and' in that description. The 'mount' command performs two
separate actions in one command. Firstly it makes the contents of a
storage device appear as a filesystem, and secondly it binds that
filesystem into the namespace. These two steps must always be done
together, and cannot be separated. Similarly the unmount command
performs the two reverse actions of unbinding from the namespace and
deactivating the filesystem. These are, or at least were,
inextricably combined and if one failed for some reason, the other
would not be attempted.
It may seem at first that it is perfectly natural to combine these two
operations and there is no value in separating them. History,
however, suggests otherwise. Considerable effort has gone into separating
these operations from each other.
Since version 2.4.11 (released in 2001), Linux has had a 'lazy' version of unmount.
This unbinds a filesystem from the namespace without insisting on
deactivating it at the same time. This goes some way to splitting out
the two functional aspects of the original unmount.
The 'lazy' unmount is particularly useful when a filesystem has
started to fail for some reason, a common example being an NFS filesystem
from a server which is no longer accessible. It may not be
possible to deactivate the filesystem as there could well be
processes with open files on the filesystem. But at least with a lazy
unmount it can be removed from the namespace so new processes won't
be able to try to open files and so get stuck.
As well as 'lazy' unmounts, Linux developers have found it useful to
add 'bind' mounts and 'move' mounts. These allow one part of the
namespace to be bound to another part of the namespace (so it appears
twice) or a filesystem to be moved from one location to another —
effectively a 'bind' mount followed by a 'lazy' unmount.
Finally we have a pivot_root() system call which performs a slightly
complicated dance between two filesystems, starting out with the first
being the root filesystem and the second being a normal mounted
filesystem, and ending with the second being the root and the first being
mounted somewhere else in that root.
It might seem that all of the issues with combining the two
functions into a single 'mount' operation have been adequately resolved in the
natural course of development, but it is hard to be convinced of this.
The collection of namespace manipulation functions that we now have
is quite ad hoc and so, while it seems to meet current needs, there
can be no certainty that it is in any sense complete. A hint of this
incompleteness can be seen in the fact that, once you perform a lazy
unmount, the filesystem may well still exist, but it is no longer
possible to manipulate it as it does not have a name in the global
namespace, and all current manipulation operations require such a
name. This makes it difficult to perform a 'forced' unmount after a
'lazy' unmount.
To see what a complete interface would look like we would need to
exploit the design concept discussed last week: "everything can have a
file descriptor". Had that pattern been imposed on the design of the
mount system call we would likely have:
- A mount call that simply returned a file descriptor for the file
system.
- A bind call that connected a file descriptor into the namespace, and
- An unmount call that disconnected a filesystem and returned a file
descriptor.
This simple set would easily provide all the functionality that we
currently have in an arguably more natural way. For example the
functionality currently provided by the special-purpose pivot_root()
system call could be achieved with the above with at most the addition of
fchroot(), an obvious analogue of fchdir() and chroot().
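How the three proposed calls might compose can be sketched with a toy
in-memory model. Nothing below is a real kernel interface: the Namespace
class and its mount, bind, unmount and close methods are names invented for
this illustration, and plain integers stand in for the file descriptors that
the hypothetical calls would return.

```python
# Toy model only: these are NOT real system calls. Integer handles stand in
# for the file descriptors that the hypothetical calls would return.
import itertools


class Namespace:
    def __init__(self):
        self._next = itertools.count(3)   # pretend 0-2 are already taken
        self.active = {}                  # handle -> device backing the filesystem
        self.bound = {}                   # path -> handle

    def mount(self, device):
        """Activate a filesystem; return a handle without binding it anywhere."""
        handle = next(self._next)
        self.active[handle] = device
        return handle

    def bind(self, handle, path):
        """Attach an active filesystem to the namespace at path."""
        if handle not in self.active:
            raise ValueError("no such filesystem handle")
        self.bound[path] = handle

    def unmount(self, path):
        """Detach whatever is bound at path; return its handle, still active."""
        return self.bound.pop(path)

    def close(self, handle):
        """Deactivate a filesystem once nothing binds it."""
        if handle in self.bound.values():
            raise OSError("filesystem is still bound")
        del self.active[handle]


def move(ns, old_path, new_path):
    """A 'move' mount is just unmount followed by bind."""
    ns.bind(ns.unmount(old_path), new_path)


ns = Namespace()
fs = ns.mount("/dev/sdb1")        # step 1: activate, nothing visible yet
ns.bind(fs, "/mnt/data")          # step 2: make it visible
ns.bind(fs, "/srv/data")          # a 'bind' mount: same filesystem, second name
move(ns, "/mnt/data", "/backup")

lazy = ns.unmount("/backup")      # 'lazy' unmount: detached but still active
ns.unmount("/srv/data")
ns.close(lazy)                    # the handle still names it, so it can be shut down
```

In this model the difficulty noted earlier, that a filesystem cannot be
manipulated after a lazy unmount, does not arise: the handle returned by
unmount keeps the filesystem addressable even though it has no name in the
namespace.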
One of the many strengths of Unix - particularly seen in the set of tools
that came with the kernel - is the principle of building and then
combining tools. Each tool should do one thing and do it well. These
tools can then be combined in various ways, often to achieve ends that
the tool developer could not have foreseen. Unfortunately the same
discipline was not maintained with the mount() system call.
So this pattern is to some extent the opposite of the 'tools
approach'. It needs a better name than that, though; a good choice
seems to be to call it a "conflated design". One dictionary (PJC)
defines "conflate" as "to ignore distinctions between, by treating two or
more distinguishable objects or ideas as one", which seems to sum up
the pattern quite well.
The open() system call
Our second example of a conflated design is found in the open() system
call. This system call (in Linux) takes 13 distinct flags which
modify its behavior, adding or removing elements of functionality -
multiple concepts are thus combined in the one system call.
Much of this combination does not imply a conflated design. Several
of the flags can be set or cleared independently of the open() using
the F_SETFL option to fcntl(). Thus while they are commonly combined,
they are easily separated and so need not be considered to be conflated.
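As a small illustration of that separability, the following Python sketch
toggles one such flag, O_APPEND, on a descriptor that was opened without it.
The choice of O_APPEND and the use of a temporary file are assumptions made
for the example; the fcntl() commands F_GETFL and F_SETFL are the standard
ones.

```python
import fcntl
import os
import tempfile

fd, path = tempfile.mkstemp()          # an ordinary open, without O_APPEND
try:
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    append_before = bool(flags & os.O_APPEND)

    # Turn O_APPEND on after the fact, leaving the other status flags alone.
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_APPEND)
    append_after = bool(fcntl.fcntl(fd, fcntl.F_GETFL) & os.O_APPEND)

    # ...and clear it again.
    fcntl.fcntl(fd, fcntl.F_SETFL, flags & ~os.O_APPEND)
    append_cleared = not (fcntl.fcntl(fd, fcntl.F_GETFL) & os.O_APPEND)
finally:
    os.close(fd)
    os.unlink(path)
```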
Three elements of the open() call are worthy of particular attention in
the current context. They are O_TRUNC, O_CLOEXEC and O_NONBLOCK.
In early versions of Unix, up to and including Version 7, opening with
O_TRUNC was the only way to truncate a file and, consequently, it could
only be truncated to become empty. Partial truncation was not
possible.
Having truncation intrinsically tied to open() is exactly the sort of
conflated design that should be avoided and, fortunately, it is easy to
recognize. BSD Unix introduced the ftruncate() system call which
allows a file to be truncated after it has been opened and, additionally, allows the
new size to be any arbitrary value, including values greater than the
current file size. Thus that conflation was easily resolved.
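A brief Python sketch of what ftruncate() allows, using a temporary file
chosen purely for illustration: the file is first cut down to a non-zero
size and then extended beyond its old end, neither of which truncate-on-open
could do.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"0123456789")

    os.ftruncate(fd, 4)                 # partial truncation: keep 4 bytes
    size_after_shrink = os.fstat(fd).st_size

    os.ftruncate(fd, 16)                # extend past the current end
    size_after_grow = os.fstat(fd).st_size
    contents = os.pread(fd, 16, 0)      # the extended part reads back as zero bytes
finally:
    os.close(fd)
    os.unlink(path)
```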
O_CLOEXEC has a more subtle story. The standard behavior of the
exec() system call (which causes a process to stop running one program
and to start running another) is that all file descriptors available
before the exec() are equally available afterward. This behavior can
be changed, quite separately from the open() call which created the
file descriptor, with another fcntl() call. For a long time this
appeared to be a perfectly satisfactory arrangement.
However, the advent of threads, where multiple processes could share
their file descriptors (so when one thread or process opens a file, all
threads in the group can see the file descriptor immediately), made
room for a potential race. If one process opens a file with the
intent of setting the close-on-exec flag immediately, and another
process performs an exec() (which causes the file table to not be shared
any more) before that flag is set, the new program in the second process
will inherit a file descriptor which it should not.
In response to this problem,
the recently-added O_CLOEXEC flag causes open() to mark the file
descriptor as close-on-exec atomically with the open so there can be
no leakage.
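The difference between the two approaches can be sketched in C as follows. The three function names are hypothetical; the comment marking the race window simply restates the problem described above.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/* Two-step form: the descriptor exists without close-on-exec for a
 * short time, so a new program started during that window inherits it. */
int open_then_mark_cloexec(const char *path)
{
    int fd = open(path, O_RDONLY);

    if (fd == -1)
        return -1;
    /* race window: fd is open but not yet marked close-on-exec */
    if (fcntl(fd, F_SETFD, FD_CLOEXEC) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}

/* One-step form: the flag is set atomically as part of the open(). */
int open_cloexec(const char *path)
{
    return open(path, O_RDONLY | O_CLOEXEC);
}

/* Report the close-on-exec state: 1 if set, 0 if clear, -1 on error. */
int is_cloexec(int fd)
{
    int fdflags = fcntl(fd, F_GETFD);

    return fdflags == -1 ? -1 : (fdflags & FD_CLOEXEC) != 0;
}
```

Both helpers end up with the same flag set; the only difference is whether there is an interval in which the descriptor can leak.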
It could be argued that creating a file descriptor and allowing it to be
preserved across an exec() should be two separate operations. That is, the
default should have been to not keep a file descriptor open across exec(),
and a special request would be needed to preserve it. However foreseeing
the problems of threads when first designing open() would be beyond
reasonable expectations, and even to have considered the effects on open()
when adding the ability to share file tables would be a bit much to ask.
The main point of the O_CLOEXEC example then is to acknowledge that
recognizing a conflated design early can be very hard, which hopefully
will be an encouragement to put more effort into reviewing a design for
these sorts of problems.
The third flag of interest is O_NONBLOCK. This flag is itself
conflated, but also shows conflation within open().
In Linux, O_NONBLOCK has two quite separate, though superficially
similar, meanings.
Firstly, O_NONBLOCK affects all read or write operations on the file
descriptor, allowing them to return immediately after processing less
data than requested, or even none at all. This functionality can
separately be enabled or disabled with fcntl() and so is of little
further interest.
The other function of O_NONBLOCK is to cause the open() itself not to
block. This has a variety of different effects depending on the
circumstances. When opening a named pipe for write, the open will
fail rather than block if there are no readers. When opening a
named pipe for read, the open will succeed rather than block, and
reads will then return no data (end-of-file while no process has the
pipe open for writing, or an EAGAIN error once one does) until some
process writes something into the pipe.
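The effect on the open() itself can be seen with a short C sketch. The helper names fifo_has_reader() and open_fifo_reader_nowait() are invented for illustration, and the FIFO is assumed to have been created beforehand with mkfifo().

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Probe a FIFO without blocking.  Returns 1 if it could be opened for
 * writing (some process has it open for reading), 0 if open() failed
 * with ENXIO because there is no reader, and -1 on any other error. */
int fifo_has_reader(const char *path)
{
    int fd = open(path, O_WRONLY | O_NONBLOCK);

    if (fd >= 0) {
        close(fd);
        return 1;
    }
    return errno == ENXIO ? 0 : -1;
}

/* Open the read end immediately, whether or not a writer exists yet. */
int open_fifo_reader_nowait(const char *path)
{
    return open(path, O_RDONLY | O_NONBLOCK);
}
```

Without O_NONBLOCK both opens would simply wait for the other end to appear, so the one flag changes the behavior of the open() as well as that of later reads and writes.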
On CDROM devices, an open for read with O_NONBLOCK will also
succeed, but no disc checks will be performed and so no reads will be
possible. Rather, the file
descriptor can only be used for ioctl() commands, such as to poll for the
presence of media or to open or close the CDROM tray.
The last gives a hint concerning another aspect of open() which is
conflated. Allocating a file descriptor to refer to a file and
preparing that file for I/O are conceptually two separate operations.
They certainly are often combined and including them both in the one
system call can make sense. Requiring them to be combined is where
the problem lies.
If it were possible to get a file descriptor on a given file (or
device) without waiting for or triggering any action within that file,
and, subsequently, to request the file be readied for I/O, then a number
of subtle issues would be resolved. In particular there are various
races possible between checking that a file is of a particular type
and opening that file. If the file was renamed between these two
operations, the program might suffer unexpected consequences of the
open. The O_DIRECTORY flag was created precisely to avoid this sort
of race, but it only serves when the program is expecting to open a
directory. This race could be simply and universally avoided if these
two stages of opening a file were easily separable.
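The directory case can be sketched in C as below. Both function names are hypothetical; the first shows the racy check-then-open sequence and the second the single-step form that O_DIRECTORY makes possible.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>

/* Racy form: the name could be made to refer to something else
 * between the stat() and the open(). */
int open_dir_check_then_open(const char *path)
{
    struct stat st;

    if (stat(path, &st) == -1)
        return -1;
    if (!S_ISDIR(st.st_mode)) {
        errno = ENOTDIR;
        return -1;
    }
    /* race window: the check above may no longer hold */
    return open(path, O_RDONLY);
}

/* Single-step form: the kernel performs the type check as part of the
 * open() and fails with ENOTDIR if the target is not a directory. */
int open_dir_atomic(const char *path)
{
    return open(path, O_RDONLY | O_DIRECTORY);
}
```

No equivalent single-step form exists for, say, "open this only if it is a regular file", which is the gap the paragraph above points at.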
A strong parallel can be seen between this issue and the 'socket' API
for creating network connections. Sockets are created almost
completely uninitialized; thereafter a number of aspects of the socket
can be tuned (with e.g. bind() or setsockopt()) before the socket is
finally connected.
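For comparison, the staged setup of a socket might look like the following C sketch. The function names, the use of the loopback address, and the particular options chosen (SO_REUSEADDR and SO_KEEPALIVE) are assumptions made for the example, not something the socket API requires.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Build a loopback IPv4 address for the given port (host byte order). */
static struct sockaddr_in loopback_addr(unsigned short port)
{
    struct sockaddr_in addr;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    return addr;
}

/* Create, tune, bind, and only then start listening.  Port 0 asks the
 * kernel to pick a free port, which is reported back through *port. */
int make_loopback_listener(unsigned short *port)
{
    int one = 1;
    struct sockaddr_in addr = loopback_addr(0);
    socklen_t len = sizeof addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);   /* bare, unconnected */

    if (fd == -1)
        return -1;
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof one) == -1 ||
        bind(fd, (struct sockaddr *)&addr, sizeof addr) == -1 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) == -1 ||
        listen(fd, 1) == -1) {
        close(fd);
        return -1;
    }
    *port = ntohs(addr.sin_port);
    return fd;
}

/* Create and tune a client socket; connecting is the final step. */
int connect_loopback(unsigned short port)
{
    int one = 1;
    struct sockaddr_in addr = loopback_addr(port);
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd == -1)
        return -1;
    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &one, sizeof one) == -1 ||
        connect(fd, (struct sockaddr *)&addr, sizeof addr) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}
```

In each function the descriptor exists, and can be inspected or adjusted, before anything is done that involves the other end of the connection.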
In both the file and socket cases there is sometimes value in being able to set up or
verify some aspects of a connection before the connection is
effected. However with open() it is not really possible in general to
separate the two.
It is worth noting here that opening a file with the 'flags' set to
'3' (which is normally an invalid value) can sometimes have a similar
meaning to O_NONBLOCK in that no particular read or write access is
requested. Clearly developers see a need here but we still don't have
a uniform way to be certain of getting a file descriptor without causing
any access to the device, or a way to upgrade a file descriptor from
having no read/write access to having that access.
As we saw, most of the difficulties caused by conflated design, at
least in these two examples, have been addressed over time. It could
therefore be argued that as there is minimal ongoing pain, the pattern
should not be a serious concern. That argument though would miss two
important points. Firstly, they have already caused pain over many
years. This could well have discouraged people from using the whole
system and so reduced the overall involvement in, and growth of, the
Unix ecosystem.
Secondly, though the worst offenses have largely been fixed, the
result is not as neat and orthogonal as it could be. As we saw during
the exploration, there are some elements of functionality that have
not yet been separated out. This is largely because there is no clear
need for them. However we often find that a use for a particular
element of functionality only presents itself once the functionality
is already available. So by not having all the elements cleanly
separated we might be missing out on some particularly useful tools
without realizing it.
There are undoubtedly other areas of Unix or Linux design where
multiple concepts have been conflated into a single operation; however,
the point here is not to enumerate all of the flaws in Unix. Rather
it is to illustrate the ease with which separate concepts can be
combined without even noticing it, and the difficulty (in some cases)
of separating them after the fact. This hopefully will be an
encouragement to future designers to be aware of the separate steps
involved in a complex operation and to allow - where meaningful -
those steps to be performed separately if desired.
Next week we will continue this exploration and describe a pattern of
misdesign that is significantly harder to detect early, and appears
to be significantly harder to fix late. Meanwhile, following are
some exercises that may be used to explore conflated design more deeply.
Exercises.
- Explain why open() with O_CREAT benefits from an O_EXCL flag, but
  other system calls which create filesystem entries (mkdir(), mknod(),
  link(), etc) do not need such a flag. Determine if there is any
  conflation implied by this difference.
- Explore the possibilities of the hypothetical bind() call that
  attaches a file descriptor to a location in the namespace. What
  other file descriptor types might this make sense for, and what
  might the result mean in each case?
- Identify one or more design aspects in the IP protocol suite which
  show conflated design and explain the negative consequences of this
  conflation.
Page editor: Jonathan Corbet