Leading items
LPC: Michael Meeks on LibreOffice and code ownership
Back when the 2010 Linux Plumbers Conference was looking for presentations, the LibreOffice project had not yet announced its existence. So Michael Meeks put in a vague proposal for a talk having to do with OpenOffice.org and promised the organizers it would be worth their time. Fortunately, they believed him; in an energetic closing keynote, Michael talked at length about what is going on with LibreOffice - and with the free software development community as a whole. According to Michael, both good and bad things are afoot. (Michael's slides [PDF] are available for those who would like to follow along.)
Naturally enough, LibreOffice is one of the good things; it's going to be "awesome." It seems that there are some widely diverging views on the awesomeness of OpenOffice.org; those who are based near Hamburg (where StarDivision was based) think it is a wonderful tool. People in the rest of the world tend to have a rather less enthusiastic view. The purpose of the new LibreOffice project is to produce a system that we can all be proud of.
Michael started by posing a couple of questions and answering them, the
first of which was "why not rewrite it in C# or HTML5?" He noted with a
straight face that going to a web-based approach might not succeed in
improving the program's well-known performance problems. He also said that
he has yet to go to a conference where he did not get kicked off the
network at some point. For now, he just doesn't buy the concept of doing
everything on the web.
Why LibreOffice? Ten years ago, Sun promised the community that an independent foundation would be created for OpenOffice.org. That foundation still does not exist. So, quite simply, members of the community got frustrated and created one of their own. The result, he says, is a great opportunity for the improvement of the system; LibreOffice is now a vendor-neutral project with no copyright assignment requirements. The project, he says, has received great support. It is pleasing to have both the Open Source Initiative and the Free Software Foundation express their support, but it's even more fun to see Novell and BoycottNovell on the same page.
Since LibreOffice launched, the project has gained 50 new code contributors and 27 new translators, none of whom had contributed to the project before. These folks are working, for now, on paying down the vast pile of "technical debt" accumulated by OpenOffice.org over the years. They are trying to clean up an ancient, gnarled code base which has grown organically over many years with no review and no refactoring. They are targeting problems like memory leaks which result, Michael said, from the "opt-in approach to lifecycle management" used in the past. After ten years, the code still has over 100,000 lines of German-language comments; those are now being targeted with the help of a script which repurposes the built-in language-guessing code from the spelling checker.
OpenOffice.org has a somewhat checkered history when it comes to revision control. CVS was used for some years, resulting in a fair amount of pain; simply tagging a release would take about two hours to run. Still, they lived with CVS for some time until OpenOffice.org launched into a study to determine which alternative revision control system would be best to move to. The study came back recommending Git, but that wasn't what the managers wanted to hear, so they moved to Subversion instead - losing most of the project's history in the process. Later the project moved to Mercurial, again losing history. The result is a code base littered with commented-out code; nobody ever felt confident actually deleting anything because they never knew if they would be able to get it back. Many code changes are essentially changelogged within the code itself as well. Now LibreOffice is using Git and a determined effort is being made to clean that stuff up.
LibreOffice is also doing its best to make contribution easy. "Easy hacks" are documented online. The project is making a point of saying: "we want your changes." Unit tests are being developed. The crufty old virtual object system - deprecated for ten years - is being removed. The extensive pile of distributor patches is being merged. And they are starting to see the addition of interesting new features, such as inline interactive formula editing. There will be a new mechanism whereby adventurous users will be able to enable experimental features at run time.
What I really came to talk about was...
There is a point in "Alice's Restaurant" where Arlo Guthrie, at the conclusion of a long-winded tall tale, informs the audience that he was actually there to talk about something completely different. Michael did something similar after putting up a plot showing the increase in outside contributions over time. He wasn't really there to talk about a desktop productivity application; instead, he wanted to talk about a threat he sees looming over the free software development community.
That threat, of course, comes from the growing debate about the ownership structure of free software projects. As a community, Michael said, we are simply too nice. We have adopted licenses for our code which are entirely reasonable, and we expect others to be nice in the same way. But any project which requires copyright assignment (or an equivalent full-license grant) changes the equation; it is not being nice. There is some behind-the-scenes activity going on now which may well make things worse.
Copyright assignment does not normally deprive a contributor of the right to use the contributed software as he or she may wish. But it reserves to the corporation receiving the assignments the right to make decisions regarding the complete work. We as a community have traditionally cared a lot about licenses, but we have been less concerned about the conditions that others have to accept. Copyright assignment policies are a barrier to entry to anybody else who would work with the software in question. These policies also disrupt the balance between developers and "suit wearers," and they create FUD around free software licensing practices.
Many people draw a distinction between projects owned by for-profit corporations and those owned by foundations. But even assignment policies of the variety used by the Free Software Foundation have their problems. Consider, Michael said, the split between emacs and xemacs; why does xemacs continue to exist? One reason is that a good chunk of xemacs code is owned by Sun, and Sun (along with its successor) is unwilling to assign copyright to the FSF. But there is also a group of developers out there who think that it's a good thing to have a version of emacs for which copyright assignment is not required. Michael also said that the FSF policy sets a bad example, one which companies pushing assignment policies have been quick to take advantage of.
Michael mentioned a study entitled "The Best of Strangers" which focused on the willingness to give out personal information. All participants were given a questionnaire with a long list of increasingly invasive questions; the researchers cared little about the answers, but were quite interested in how far participants got before deciding they were not willing to answer anymore. Some participants received, at the outset, a strongly-worded policy full of privacy assurances; they provided very little information. Participants who did not receive that policy got rather further through the questionnaire, while those who were pointed to a questionnaire on a web site filled it in completely. Starting with the legalese ruined the participants' trust and made them unwilling to talk about themselves.
Michael said that a similar dynamic applies to contributors to a free software project; if they are confronted with a document full of legalese on the first day, their trust in the project will suffer and they may just walk away. He pointed out the recently-created systemd project's policy, paraphrased as "because we value your contributions, we require no copyright assignments," as the way to encourage contributors and earn their trust.
Assignment agreements are harmful to the hacker/suit balance. If you work for a company, Michael said, your pet project is already probably owned by the boss. This can be a problem; as managers work their way into the system, they tend to lose track of the impact of what they do. They also tend to deal with other companies in unpleasant ways which we do not normally see at the development level; the last thing we want to do is to let these managers import "corporate aggression" into our community. If suits start making collaboration decisions, the results are not always going to be a positive thing for our community; they can also introduce a great deal of delay into the process. Inter-corporation agreements tend to be confidential and can pop up in strange ways; the freedom to fork a specific project may well be compromised by an agreement involving the company which owns the code. When somebody starts pushing inter-corporation agreements regarding code contributions and ownership, we need to be concerned.
Michael cited the agreements around the open-sourcing of the openSPARC architecture as one example of how things can go wrong. Another is the flurry of lawsuits in the mobile area; those are likely to divide companies into competing camps and destroy the solidarity we have at the development level.
Given all this, he asked, why would anybody sign such an agreement? The
freedom to change the license is one often-cited reason; Michael pointed to
using permissive licenses or "plus licenses" (those which allow "any later
version") as a better way of addressing that problem. The ability to offer
indemnification is another reason, but indemnification is entirely
orthogonal to ownership. One still hears the claim that full ownership is
required to be able to go after infringers, but that claim has been decisively
disproved at this point. There is also an occasional appeal to
weird local laws; Michael dismissed those as silly and self-serving. There
is, he says, something else going on.
What works best, he says, is when the license itself is the contributor agreement. "Inbound" and "outbound" licensing, where everybody has the same rights, is best.
But not everybody is convinced of that. Michael warned that there is "a sustained marketing drive coming" to push the copyright-assignment agenda. While we were sitting in the audience, he said, somebody was calling our bosses. They'll be saying that copyright assignment policies are required for companies to be willing to invest in non-sexy projects. But the fact of the matter is that almost all of the stack, many parts of which lack sexiness, is not owned by corporations. "All cleanly-written software," Michael says, "is sexy." Our bosses will hear that copyright assignment is required for companies to get outside investment; it's the only way they can pursue the famous MySQL model. But we should not let monopolistic companies claim that their business plans are good for free software; beyond that, Michael suggested that the MySQL model may not look as good as it did a year or two ago. Managers will be told that only assignment-based projects are successful. One need only look at the list of successful projects, starting with the Linux kernel, to see the falseness of that claim.
Instead, Michael says, having a single company doing all of the heavy lifting is the sign of a project without a real community. It is an indicator of risk. People are figuring this out; that is why we're seeing an increasing number of single-company projects being forked and rewritten. Examples include xpdf and poppler, libart_lgpl and cairo, MySQL and Maria. There are a number of companies, Novell and Red Hat included, which are dismantling the copyright-assignment policies they used to maintain.
At this point, Michael decided that we'd had enough and needed a brief technical break. So he talked about Git: the LibreOffice project likes to work with shallow clones because the full history is so huge. But it's not possible to push patches from a shallow clone, which is a pain. Michael also noted that git am is obnoxious to use. On the other hand, he says, the valgrind DHAT tool is a wonderful way of analyzing heap memory usage patterns and finding bugs. Valgrind, he says, does not get anywhere near enough attention. There was also some brief talk of "component-based everything" architecture and some work the project is doing to facilitate parallel contribution.
The conclusion, though, came back to copyright assignment. We need to prepare for the marketing push, which could cause well-meaning people to do dumb things. It's time for developers to talk to their bosses and make it clear that copyright assignment policies are not the way toward successful projects. Before we contribute to a project, he said, we need to check more than the license; we need to look at what others will be able to do with the code. We should be more ungrateful toward corporations which seek to dominate development projects and get involved with more open alternatives.
One of those alternatives, it went without saying, is the LibreOffice project. LibreOffice is trying to build a vibrant community which resembles the kernel community. But it will be more fun: the kernel, Michael said, "is done" while LibreOffice is far from done. There is a lot of low-hanging fruit and many opportunities for interesting projects. And, if that's not enough, developers should consider that every bit of memory saved will be multiplied across millions of LibreOffice users; what better way can there be to offset one's carbon footprint? So, he said, please come and help; it's an exciting time to be working with LibreOffice.
LPC: Life after X
Keith Packard has probably done more work to put the X Window System onto our desks than just about anybody else. With some 25 years of history, X has had a good run, but nothing is forever. Is that run coming to an end, and what might come after? In his Linux Plumbers Conference talk, Keith claimed to have no control over how things might go, but he did have some ideas. Those ideas add up to an interesting vision of our graphical future.
We have reached a point where we are running graphical applications on a wide variety of systems. There is the classic desktop environment that X was born into, but that is just the beginning. Mobile systems have become increasingly powerful and are displacing desktops in a number of situations. Media-specific devices have display requirements of their own. We are seeing graphical applications in vehicles, and in a number of other embedded situations.
Keith asked: how many of these applications care about network transparency, which was one of the original headline features of X? How many of them care about ICCCM compliance? How many of them care about X at all? The answer to all of those questions, of course, is "very few." Instead, developers designing these systems are more likely to resent X for its complexity, for its memory and CPU footprint, and for its contribution to lengthy boot times. They would happily get rid of it. Keith says that he means to accommodate them without wrecking things for the rest of us.
Toward a non-X future
For better or for worse, there is currently a wide variety of rendering APIs to choose from when writing graphical libraries. According to Keith, only two of them are interesting. For video rendering, there's the VDPAU/VAAPI pair; for everything else, there's OpenGL. Nothing else really matters going forward.
In the era of direct rendering, neither of those APIs really depends on X.
So what is X good for? There is still a lot which is done in the X server,
starting with video mode setting. Much of that work has been moved into
the kernel, at least for graphics chipsets from the "big three," but X
still does it
for the rest. If you still want to do boring 2D graphics, X is there for
you - as Keith put it, we all love ugly lines and lumpy text. Input is
still very much handled in X; the kernel's evdev interface does some of it
but falls far short of doing the whole job. Key mapping is done in X;
again, what's provided by the kernel in this area is "primitive." X
handles clipping when application windows overlap each other; it also takes
care of 3D object management via the GLX extension.
These tasks have a lot to do with why the X server is still in charge of our screens. Traditionally mode setting has been a big and hairy task, with the requisite code being buried deep within the X server; that has put up a big barrier to entry to any competing window systems. The clipping job had to be done somewhere. The management of video memory was done in the X server, leading to a situation where only the server gets to take advantage of any sort of persistent video memory. X is also there to make external window managers (and, later, compositing managers) work.
But things have changed in the 25 years or so since work began on X. Back in 1985, Unix systems did not support shared libraries; if the user ran two applications linked to the same library, there would be two copies of that library in memory, which was a scarce resource in those days. So it made a lot of sense to put graphics code into a central server (X), where it could be shared among applications. We no longer need to do things that way; our systems have gotten much better at sharing code which appears in different address spaces.
We also have much more complex applications - back then xterm was just about all there was. These applications manipulate a lot more graphical data, and almost every operation involves images. Remote applications are implemented with protocols like HTTP; there is little need to use the X protocol for that purpose anymore. We have graphical toolkits which can implement dynamic themes, so it is no longer necessary to run a separate window manager to impose a theme on the system. It is a lot easier to make the system respond "quickly enough"; a lot of hackery in the X server (such as the "mouse ahead" feature) was designed for a time when systems were much less responsive. And we have color screens now; they were scarce and expensive in the early days of X.
Over time, the window system has been split apart into multiple pieces - the X server, the window manager, the compositing manager, etc. All of these pieces are linked by complex, asynchronous protocols. Performance suffers as a result; for example, every keystroke must pass through at least three processes: the application, the X server, and the compositing manager. But we don't need to do things that way any more; we can simplify the architecture and improve responsiveness. There are some unsolved problems associated with removing all these processes - it's not clear how all of the fancy 3D bling provided by window/compositing managers like compiz can be implemented - but maybe we don't need all of that.
What about remote applications in an X-free world? Keith suggests that there is little need for X-style network transparency anymore. One of the early uses for network transparency was applications oriented around forms and dialog boxes; those are all implemented with web browsers now. For other applications, tools like VNC and rdesktop work and perform better than native X. Technologies like WiDi (Intel's Wireless Display) can also handle remote display needs in some situations.
Work to do
So maybe we can get rid of X, but, as described above, there are still a number of important things done by the X server. If X goes, those functions need to be handled elsewhere. Mode setting is moving into the kernel, but there are still a lot of devices without kernel mode setting (KMS) support. Somebody will have to implement KMS drivers for those devices, or they may eventually stop working. Input device support is partly handled by evdev. Graphical memory management is now handled in the kernel by GEM in a number of cases. In other words, things are moving into the kernel - Keith seemed pleased at the notion of making all of the functionality be somebody else's problem.
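For a rough idea of what talking to KMS directly looks like - this is not something Keith showed, just an illustrative sketch using libdrm, with the device node chosen as an example and error handling abbreviated - a program can enumerate the connected outputs and their modes without any X server at all:
    /* kms-modes.c: query display modes through KMS with libdrm (sketch) */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>
    #include <xf86drm.h>
    #include <xf86drmMode.h>

    int main(void)
    {
        /* The device node is an example; systems may have more than one. */
        int fd = open("/dev/dri/card0", O_RDWR);
        if (fd < 0) {
            perror("open");
            return 1;
        }

        drmModeRes *res = drmModeGetResources(fd);
        if (!res) {
            fprintf(stderr, "no KMS support on this device\n");
            return 1;
        }

        for (int i = 0; i < res->count_connectors; i++) {
            drmModeConnector *conn = drmModeGetConnector(fd, res->connectors[i]);
            if (conn && conn->connection == DRM_MODE_CONNECTED &&
                conn->count_modes > 0)
                printf("connector %u: first mode %dx%d\n",
                       conn->connector_id,
                       conn->modes[0].hdisplay, conn->modes[0].vdisplay);
            if (conn)
                drmModeFreeConnector(conn);
        }
        drmModeFreeResources(res);
        close(fd);
        return 0;
    }
(Build against libdrm, e.g. with pkg-config --cflags --libs libdrm.)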
Some things are missing, though. Proper key mapping is one of them; that cannot (or should not) all be done in the kernel. Work is afoot to create a "libxkbcommon" library so that key mapping could be incorporated into applications directly. Accessibility work - mouse keys and sticky keys, for example - also needs to be handled in user space somewhere. The input driver problem is not completely solved; complicated devices (like touchpads) need user-space support. Some things need to be made cheaper, a task that can mostly be accomplished by replacing APIs with more efficient variants: GLX can be replaced by EGL, GLES can be used instead of OpenGL in many cases, and VDPAU is an improvement over Xv. There is also the little problem of mixing X and non-X applications while providing a unified user experience.
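For a flavor of the EGL side of that shift, here is a minimal context-creation sketch - again not from the talk - that brings up an off-screen GLES2 context with no X or GLX involved:
    /* egl-init.c: minimal EGL + GLES2 context setup without GLX (sketch) */
    #include <EGL/egl.h>
    #include <stdio.h>

    int main(void)
    {
        EGLDisplay dpy = eglGetDisplay(EGL_DEFAULT_DISPLAY);
        EGLint major, minor;
        if (dpy == EGL_NO_DISPLAY || !eglInitialize(dpy, &major, &minor)) {
            fprintf(stderr, "no EGL display available\n");
            return 1;
        }

        static const EGLint cfg_attribs[] = {
            EGL_SURFACE_TYPE,    EGL_PBUFFER_BIT,
            EGL_RENDERABLE_TYPE, EGL_OPENGL_ES2_BIT,
            EGL_NONE
        };
        EGLConfig cfg;
        EGLint ncfg;
        eglChooseConfig(dpy, cfg_attribs, &cfg, 1, &ncfg);

        static const EGLint ctx_attribs[] = {
            EGL_CONTEXT_CLIENT_VERSION, 2, EGL_NONE
        };
        eglBindAPI(EGL_OPENGL_ES_API);
        EGLContext ctx = eglCreateContext(dpy, cfg, EGL_NO_CONTEXT, ctx_attribs);

        /* A small off-screen surface is enough to make the context current. */
        static const EGLint pb_attribs[] = {
            EGL_WIDTH, 64, EGL_HEIGHT, 64, EGL_NONE
        };
        EGLSurface surf = eglCreatePbufferSurface(dpy, cfg, pb_attribs);
        eglMakeCurrent(dpy, surf, surf, ctx);

        printf("EGL %d.%d context ready\n", major, minor);
        eglTerminate(dpy);
        return 0;
    }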
Keith reflected on some of the unintended benefits that have come from the development work done in recent years; many of these will prove helpful going forward. Compositing, for example, was added as a way of adding fancy effects to 2D applications. Once the X developers had compositing, though, they realized that it enabled the rendering of windows without clipping, simplifying things considerably. It also separated rendering from changing on-screen content - two tasks which had been tightly tied before - making rendering more broadly useful. The GEM code had a number of goals, including making video memory pageable, enabling zero-copy texture creation from pixmaps, and the management of persistent 3D objects. Along with GEM came lockless direct rendering, improving performance and making it possible to run multiple window systems with no performance hit. Kernel mode setting was designed to make graphical setup more reliable and to enable the display of kernel panic messages, but KMS also made it easy to implement alternative window systems - or to run applications with no window system at all. EGL was designed to enable porting of applications between platforms; it also enabled running those application on non-X window systems and the dumping of the expensive GLX buffer sharing scheme.
Keith put up two pictures showing the organization of graphics on Linux. In the "before" picture, a pile of rendering interfaces can be seen all talking to the X server, which is at the center of the universe. In the "after" scene, instead, the Linux kernel sits in the middle, and window systems like X and Wayland are off in the corner, little more than special applications. When we get to "after," we'll have a much-simplified graphics system offering more flexibility and better performance.
Getting there will require getting a few more things done, naturally. There is still work to be done to fully integrate GL and VDPAU into the system. The input driver problem needs to be solved, as does the question of KMS support for video adaptors from other than the "big three" vendors. If we get rid of window managers, somebody else has to do that work; Windows and Mac OS push that task into applications, and maybe we should too. But, otherwise, this future is already mostly here. It is possible, for example, to run X as a client of Wayland - or vice versa. The post-X era is beginning.
Ghosts of Unix past, part 2: Conflated designs
In the first article in this series, we commenced our historical search for design patterns in Linux and Unix by illuminating the "Full exploitation" pattern which provides a significant contribution to the strength of Unix. In this second part we will look at the first of three patterns which characterize some design decisions that didn't work out so well.
The fact that these design decisions are still with us and worth talking about shows that their weaknesses were not immediately obvious and, additionally, that these designs lasted long enough to become sufficiently entrenched that simply replacing them would cause more harm than good. With these types of design issues, early warning is vitally important. The study of these patterns is only of value if it helps us to avoid similar mistakes early enough; if it only allowed us to classify mistakes that we cannot avoid, there would be little point in studying them at all.
These three patterns are ordered from the one which seems to give most predictive power to that which is least valuable as an early warning. But hopefully the ending note will not be one of complete despair - any guidance in preparing for the future is surely better than none.
Conflated Designs
This week's pattern is exposed using two design decisions which were present in early Unix and have been followed by a series of fixes which have addressed most of the resulting difficulties. By understanding the underlying reason that the fixes were needed, we can hope to avoid future designs which would need such fixing. The first of these design decisions is taken from the implementation of the single namespace discussed in part 1.
The mount command
The central tool for implementing a single namespace is the 'mount' command, which makes the contents of a disk drive available as a filesystem and attaches that filesystem to the existing namespace. The flaw in this design which exemplifies this pattern is the word 'and' in that description. The 'mount' command performs two separate actions in one command. Firstly it makes the contents of a storage device appear as a filesystem, and secondly it binds that filesystem into the namespace. These two steps must always be done together, and cannot be separated. Similarly the unmount command performs the two reverse actions of unbinding from the namespace and deactivating the filesystem. These are, or at least were, inextricably combined and if one failed for some reason, the other would not be attempted.
It may seem at first that it is perfectly natural to combine these two operations and there is no value in separating them. History, however, suggests otherwise. Considerable effort has gone into separating these operations from each other.
Since version 2.4.11 (released in 2001), Linux has a 'lazy' version of unmount. This unbinds a filesystem from the namespace without insisting on deactivating it at the same time. This goes some way to splitting out the two functional aspects of the original unmount. The 'lazy' unmount is particularly useful when a filesystem has started to fail for some reason, a common example being an NFS filesystem from a server which is no longer accessible. It may not be possible to deactivate the filesystem as there could well be processes with open files on the filesystem. But at least with a lazy unmount it can be removed from the namespace so that new processes won't be able to try to open files and so get stuck.
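For the curious, a minimal sketch of what a lazy unmount looks like from C follows; the mount point used here is only an example, and the call requires appropriate privileges:
    /* lazy-detach.c: detach a mount point lazily (illustrative sketch) */
    #include <stdio.h>
    #include <sys/mount.h>

    int main(void)
    {
        /* MNT_DETACH removes "/mnt/nfs" from the namespace immediately,
         * but the filesystem itself is only deactivated once the last
         * open file on it goes away. */
        if (umount2("/mnt/nfs", MNT_DETACH) != 0)
            perror("umount2");
        return 0;
    }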
As well as 'lazy' unmounts, Linux developers have found it useful to add 'bind' mounts and 'move' mounts. These allow one part of the namespace to be bound to another part of the namespace (so it appears twice) or a filesystem to be moved from one location to another — effectively a 'bind' mount followed by a 'lazy' unmount. Finally we have a pivot_root() system call which performs a slightly complicated dance between two filesystems, starting out with the first being the root filesystem and the second being a normal mounted filesystem, and ending with the second being the root and the first being mounted somewhere else in that root.
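Bind and move mounts are both expressed through the same mount() system call, selected by flags; a rough sketch (the paths are examples, the target directories must already exist, and root privileges are required) might look like:
    /* bind-and-move.c: sketch of bind and move mounts (paths are examples) */
    #include <stdio.h>
    #include <sys/mount.h>

    int main(void)
    {
        /* Make /srv/data visible at a second location in the namespace. */
        if (mount("/srv/data", "/mnt/alias", NULL, MS_BIND, NULL) != 0)
            perror("bind mount");

        /* Move an existing mount point to a new location; the source must
         * itself be a mount point, and this can fail with EINVAL if the
         * parent mount is marked shared. */
        if (mount("/mnt/alias", "/mnt/elsewhere", NULL, MS_MOVE, NULL) != 0)
            perror("move mount");
        return 0;
    }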
It might seem that all of the issues with combining the two functions into a single 'mount' operation have been adequately resolved in the natural course of development, but it is hard to be convinced of this. The collection of namespace manipulation functions that we now have is quite ad hoc and so, while it seems to meet current needs, there can be no certainty that it is in any sense complete. A hint of this incompleteness can be seen in the fact that, once you perform a lazy unmount, the filesystem may well still exist, but it is no longer possible to manipulate it as it does not have a name in the global namespace, and all current manipulation operations require such a name. This makes it difficult to perform a 'forced' unmount after a 'lazy' unmount.
To see what a complete interface would look like we would need to exploit the design concept discussed last week: "everything can have a file descriptor". Had that pattern been imposed on the design of the mount system call we would likely have:
- A mount call that simply returned a file descriptor for the file system.
- A bind call that connected a file descriptor into the namespace, and
- An unmount call that disconnected a filesystem and returned a file descriptor.
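None of these calls exist, of course; the following prototypes are purely a hypothetical sketch, with invented names, of how the three operations just listed might be declared if the two halves of mount() and unmount() were separated:
    /*
     * Hypothetical interface sketch only - none of these calls exist;
     * the names are invented to illustrate the separation described above.
     */

    /* Activate the filesystem on a device, returning a file descriptor
     * for it without giving it a name in any namespace. */
    int fsmount_fd(const char *device, const char *fstype, int flags);

    /* Attach the filesystem behind 'fsfd' at a path in the namespace. */
    int fsbind(int fsfd, const char *path);

    /* Detach a filesystem from the namespace, returning a descriptor
     * through which it can still be manipulated (forced unmount, etc.). */
    int fsunbind(const char *path);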
One of the many strengths of Unix - particularly seen in the set of tools that came with the kernel - is the principle of building and then combining tools. Each tool should do one thing and do it well. These tools can then be combined in various ways, often to achieve ends that the tool developer could not have foreseen. Unfortunately the same discipline was not maintained with the mount() system call.
So this pattern is to some extent the opposite of the 'tools approach'. It needs a better name than that, though; a good choice seems to be to call it a "conflated design". One dictionary (PJC) defines "conflate" as "to ignore distinctions between, by treating two or more distinguishable objects or ideas as one", which seems to sum up the pattern quite well.
The open() system call.
Our second example of a conflated design is found in the open() system call. This system call (in Linux) takes 13 distinct flags which modify its behavior, adding or removing elements of functionality - multiple concepts are thus combined in the one system call. Much of this combination does not imply a conflated design. Several of the flags can be set or cleared independently of the open() using the F_SETFL option to fcntl(). Thus while they are commonly combined, they are easily separated and so need not be considered to be conflated.
Three elements of the open() call are worthy of particular attention in the current context. They are O_TRUNC, O_CLOEXEC and O_NONBLOCK.
In early versions of Unix, up to and including Version 7, opening with O_TRUNC was the only way to truncate a file and, consequently, it could only be truncated to become empty. Partial truncation was not possible. Having truncation intrinsically tied to open() is exactly the sort of conflated design that should be avoided and, fortunately, it is easy to recognize. BSD Unix introduced the ftruncate() system call which allows a file to be truncated after it has been opened and, additionally, allows the new size to be any arbitrary value, including values greater than the current file size. Thus that conflation was easily resolved.
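A brief sketch (the file name is arbitrary) shows how the BSD interface keeps the two operations separate:
    /* truncate-demo.c: ftruncate() separated from open() (illustrative) */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("example.dat", O_RDWR | O_CREAT, 0644);
        if (fd < 0) {
            perror("open");
            return 1;
        }
        /* Truncation is now a separate operation, and the new size can be
         * anything - including larger than the current size. */
        if (ftruncate(fd, 4096) != 0)
            perror("ftruncate");
        close(fd);
        return 0;
    }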
O_CLOEXEC has a more subtle story. The standard behavior of the exec() system call (which causes a process to stop running one program and to start running another) is that all file descriptors available before the exec() are equally available afterward. This behavior can be changed, quite separately from the open() call which created the file descriptor, with another fcntl() call. For a long time this appeared to be a perfectly satisfactory arrangement.
However, the advent of threads, which share a single file descriptor table (so when one thread opens a file, all of the threads in the group see the new descriptor immediately), made room for a potential race. If one thread opens a file intending to set the close-on-exec flag immediately afterward, and another thread calls fork() and exec() in between (the new process no longer shares the file table), the program started by that exec() will inherit a file descriptor which it should not. In response to this problem, the recently-added O_CLOEXEC flag causes open() to mark the file descriptor as close-on-exec atomically with the open so there can be no leakage.
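A small comparison of the two approaches makes the difference concrete; the file opened here is merely an example:
    /* cloexec-demo.c: atomic close-on-exec versus the racy two-step approach */
    #define _GNU_SOURCE             /* for O_CLOEXEC on older glibc */
    #include <fcntl.h>
    #include <unistd.h>

    int main(void)
    {
        /* Racy in a threaded program: another thread could fork() and
         * exec() between these two calls, leaking the descriptor. */
        int fd1 = open("/etc/hostname", O_RDONLY);
        if (fd1 >= 0)
            fcntl(fd1, F_SETFD, FD_CLOEXEC);

        /* Atomic: the descriptor is marked close-on-exec by open() itself. */
        int fd2 = open("/etc/hostname", O_RDONLY | O_CLOEXEC);

        if (fd1 >= 0) close(fd1);
        if (fd2 >= 0) close(fd2);
        return 0;
    }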
It could be argued that creating a file descriptor and allowing it to be preserved across an exec() should be two separate operations. That is, the default should have been to not keep a file descriptor open across exec(), and a special request would be needed to preserve it. However foreseeing the problems of threads when first designing open() would be beyond reasonable expectations, and even to have considered the effects on open() when adding the ability to share file tables would be a bit much to ask.
The main point of the O_CLOEXEC example then is to acknowledge that recognizing a conflated design early can be very hard, which hopefully will be an encouragement to put more effort into reviewing a design for these sorts of problems.
The third flag of interest is O_NONBLOCK. This flag is itself conflated, but also shows conflation within open(). In Linux, O_NONBLOCK has two quite separate, though superficially similar, meanings.
Firstly, O_NONBLOCK affects all read or write operations on the file descriptor, allowing them to return immediately after processing less data than requested, or even none at all. This functionality can separately be enabled or disabled with fcntl() and so is of little further interest.
The other function of O_NONBLOCK is to cause the open() itself not to block. This has a variety of different effects depending on the circumstances. When opening a named pipe for write, the open will fail rather than block if there are no readers. When opening a named pipe for read, the open will succeed rather than block, and reads will then return an error until some process writes something into the pipe. On CDROM devices an open for read with O_NONBLOCK will also succeed but no disk checks will be performed and so no reads will be possible. Rather the file descriptor can only be used for ioctl() commands such as to poll for the presence of media or to open or close the CDROM tray.
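As a rough illustration of the open()-time meaning of the flag, consider opening a FIFO for reading; on Linux the result of a subsequent read() depends on whether any writer currently has the FIFO open (the path below is only an example):
    /* fifo-nonblock.c: the open()-time meaning of O_NONBLOCK (sketch) */
    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        const char *path = "/tmp/demo-fifo";    /* example path */
        if (mkfifo(path, 0600) != 0 && errno != EEXIST) {
            perror("mkfifo");
            return 1;
        }

        /* Without O_NONBLOCK this open() would block until a writer
         * appeared; with it, the open succeeds immediately. */
        int fd = open(path, O_RDONLY | O_NONBLOCK);
        if (fd < 0) {
            perror("open");
            return 1;
        }

        char buf[64];
        ssize_t n = read(fd, buf, sizeof(buf));
        if (n == 0)
            printf("no writer has the FIFO open\n");
        else if (n < 0 && errno == EAGAIN)
            printf("a writer is present, but has written nothing yet\n");

        /* The per-I/O meaning of the flag can be changed separately. */
        fcntl(fd, F_SETFL, fcntl(fd, F_GETFL) & ~O_NONBLOCK);
        close(fd);
        return 0;
    }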
The last gives a hint concerning another aspect of open() which is conflated. Allocating a file descriptor to refer to a file and preparing that file for I/O are conceptually two separate operations. They certainly are often combined and including them both in the one system call can make sense. Requiring them to be combined is where the problem lies.
If it were possible to get a file descriptor on a given file (or device) without waiting for or triggering any action within that file, and, subsequently, to request the file be readied for I/O, then a number of subtle issues would be resolved. In particular there are various races possible between checking that a file is of a particular type and opening that file. If the file was renamed between these two operations, the program might suffer unexpected consequences of the open. The O_DIRECTORY flag was created precisely to avoid this sort of race, but it only serves when the program is expecting to open a directory. This race could be simply and universally avoided if these two stages of opening a file were easily separable.
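A tiny example of the directory case (the path is arbitrary) shows how the check and the open become a single operation:
    /* odirectory-demo.c: avoiding the check-then-open race for directories */
    #define _GNU_SOURCE             /* for O_DIRECTORY on older glibc */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        /* A stat() followed by a plain open() could be fooled if the path
         * were renamed in between; O_DIRECTORY makes the check part of the
         * open itself, which fails with ENOTDIR for anything else. */
        int fd = open("/tmp", O_RDONLY | O_DIRECTORY);
        if (fd < 0) {
            perror("open");
            return 1;
        }
        close(fd);
        return 0;
    }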
A strong parallel can be seen between this issue and the 'socket' API for creating network connections. Sockets are created almost completely uninitialized; thereafter a number of aspects of the socket can be tuned (with e.g. bind() or setsockopt()) before the socket is finally connected.
In both the file and socket cases there is sometimes value in being able to set up or verify some aspects of a connection before the connection is effected. However with open() it is not really possible in general to separate the two.
It is worth noting here that opening a file with the 'flags' set to '3' (which is normally an invalid value) can sometimes have a similar meaning to O_NONBLOCK in that no particular read or write access is requested. Clearly developers see a need here but we still don't have a uniform way to be certain of getting a file descriptor without causing any access to the device, or a way to upgrade a file descriptor from having no read/write access to having that access.
As we saw, most of the difficulties caused by conflated design, at least in these two examples, have been addressed over time. It could therefore be argued that, as there is minimal ongoing pain, the pattern should not be a serious concern. That argument, though, would miss two important points. Firstly, these designs have already caused pain over many years. This could well have discouraged people from using the whole system and so reduced the overall involvement in, and growth of, the Unix ecosystem.
Secondly, though the worst offenses have largely been fixed, the result is not as neat and orthogonal as it could be. As we saw during the exploration, there are some elements of functionality that have not yet been separated out. This is largely because there is no clear need for them. However we often find that a use for a particular element of functionality only presents itself once the functionality is already available. So by not having all the elements cleanly separated we might be missing out on some particular useful tools without realizing it.
There are undoubtedly other areas of Unix or Linux design where multiple concepts have been conflated into a single operation, however the point here is not to enumerate all of the flaws in Unix. Rather it is to illustrate the ease with which separate concepts can be combined without even noticing it, and the difficulty (in some cases) of separating them after the fact. This hopefully will be an encouragement to future designers to be aware of the separate steps involved in a complex operation and to allow - where meaningful - those steps to be performed separately if desired.
Next week we will continue this exploration and describe a pattern of misdesign that is significantly harder to detect early, and appears to be significantly harder to fix late. Meanwhile, following are some exercises that may be used to explore conflated designs more deeply.
Exercises.
- Explain why open() with O_CREAT benefits from an O_EXCL flag, but other system calls which create filesystem entries (mkdir(), mknod(), link(), etc.) do not need such a flag. Determine if there is any conflation implied by this difference.
- Explore the possibilities of the hypothetical bind() call that attaches a file descriptor to a location in the namespace. What other file descriptor types might this make sense for, and what might the result mean in each case?
- Identify one or more design aspects in the IP protocol suite which show conflated design and explain the negative consequences of this conflation.
