
Leading items

MeeGo conference: Intel's and Nokia's visions of MeeGo

By Jake Edge
November 17, 2010

We are just at the beginning of a massive change in the way we use computers, and traditional desktops and laptops will be giving way to more and more internet-connected devices—that's the vision presented in two keynotes at the first ever MeeGo conference. But in order for that vision to come about, there needs to be an open environment, where both hardware and software developers can create new devices and applications, without the innovation being controlled—often stifled—by a single vendor's wishes. Doug Fisher, Intel's VP of the Software and Services Group, and Nokia's Alberto Torres, Executive VP for MeeGo Computers, took different approaches to delivering that message, but their talks were promoting the same theme.

The conference was held November 15-17 at Aviva Stadium in Dublin, Ireland and hosted many more developers than conference organizers originally expected. It was very well put on, and at an eye-opening venue, which bodes well for future conferences. One that is more industry-focused is currently planned for May in San Francisco, while another developer-focused event is tentatively scheduled for November 2011 for somewhere other than the US.

"Strategic Freedom with MeeGo"

After an introduction by conference program committee chair Dirk Hohndel, Fisher kicked off his talk with a rueful reminiscence of his talk at the 2005 Ottawa Linux Symposium, where the person running the slide deck exited his presentation at the end, which put up a Windows desktop on the screen. That wasn't particularly popular with the assembled Linux crowd, so he was careful to show that he was presenting his slides using OpenOffice.org Impress on MeeGo this time.

Over the next few years, there will be one billion new internet-connected users and 15 billion connected devices, Fisher said. Intel and MeeGo want to ensure that they meet the needs of that growing market. It is these new devices that will be the main mechanism for connecting with the internet. They will "surpass the traditional way you interact with the internet". And we are "just at the beginning of where this device environment is going to go".

There are two models that are being proposed for this new environment, one that is controlled versus one that is open. The controlled environment is one where a "single vendor provides the whole solution". But lots of people that want to innovate are outside of the box that the vendor has set up. In these closed environments, business models and the implementation of business models are controlled.

But, "the only way you can scale to all of those devices is to have an open environment", Fisher said. In the book Where Good Ideas Come From, author Steven Johnson "debunks the myth that great ideas come from a single person". Instead, it is a "social process as much as a technology process" to come up with these great ideas. Because we don't have any time to waste to build this new device environment, "we have to be able to work together".

"A controlled environment with a box around it will not be able to scale", to the vast array of devices and device types that are coming. But, Fisher cautioned, an open environment should not lead to fragmentation. There is a responsibility to make the platform consistent, so that companies can depend on it and make investments in it.

That is why MeeGo was moved under the Linux Foundation (LF), so that the LF can be "the steward of MeeGo". The governance of MeeGo is modeled after how Linux is governed; there is no membership required and it is architected in an open way. Both Intel and ARM chips are supported, and MeeGo is constructed to "ensure we meet the needs of a broad type of platforms".

Inclusion, meritocracy, transparency, and upstream first

[Aviva pitch]

Fisher then turned the stage over to Carsten Munk, who is known for his work on Nokia's Maemo and on the MeeGo N900 port. MeeGo "is trying to do something that has never been done before", Munk said, and there are four key elements to making it work: inclusion, meritocracy, transparency, and upstream first. The inclusive nature of MeeGo was embodied in the fact that he was on-stage with an Intel executive, as an independent developer who works on MeeGo ARM. "The MeeGo way is to include people", he said.

When asked by Fisher if the project had been living up to the four ideals, Munk said that it was "getting better over the last 8-9 months", but that "not everything is perfect". There have been arguments over governance and the like over that time, but the community is still figuring things out. In addition to developing MeeGo as an OS and MeeGo applications, the project is developing "the MeeGo way of working".

The upstream-first policy is "really important to avoid fragmentation", Fisher said after Munk left the stage. Avoiding fragmentation is critical for users and developers. Users want to be able to run their applications consistently on multiple devices, while developers want to be sure they can move to different vendors without rewriting their applications.

MeeGo is an OS that vendors can take and do what they want with it, but in order to call it MeeGo, it must be compliant with the MeeGo requirements. That ensures there is a single environment for developers. They can move their code from vendor to vendor, while avoiding the rework and revalidation that currently is required for embedded and other applications.

Intel wants to deliver the best operating environment for MeeGo, and power the best devices, which is why it has invested in the low-power Atom chip. As an example, he pointed to netbooks that are just getting better, some of which have MeeGo on them. There will be more and more of those in 2011 and 2012, Fisher said. In addition, Intel worked closely with Amino Communications on a MeeGo-based television set top box. What would normally take Amino 18 months to deliver was done in six using MeeGo.

One of the strengths of MeeGo is that in addition to allowing multiple vendors to use it, it also enables multiple device types. Intel was involved in helping with the MeeGo netbooks and set-top box that he mentioned, but he also listed two other vendors using MeeGo, where Intel wasn't involved at all. A German company that made a MeeGo-based tablet, and a Chinese company whose in-vehicle-infotainment (IVI) systems are now shipping in cars, are examples of the "power of open source", he said. They took the code and made it work for their devices and customers without having to ask for permission. The MeeGo community is going to be responsible for keeping that kind of innovation happening, he said.

One of the visions for MeeGo devices that was presented in a video at the beginning of the talk was the ability to move audio and video content between these devices. The idea is that someone can be watching a movie or listening to some music and move it to other devices, share it with their friends, and so on. Fisher had someone from Intel demonstrate a prototype of that functionality, where a video was paused on a netbook, restarted on a TV, then moved from there to a tablet.

That is an example of "the kind of innovation we need to drive into MeeGo", Fisher said. It's not just something that is unique and innovative on a single device but, because it is MeeGo, it can move between various devices from multiple vendors. It is a "compelling and challenging opportunity". Though it is an exciting vision for the future, there is still a potentially insurmountable challenge which Fisher left unsaid: finding a way to get the content industries on board with that kind of ubiquitous playback and sharing.

It turned out that the MeeGo tablet used in the demo was a Lenovo IdeaPad—an Atom-powered tablet/netbook. Fisher said that one lucky developer in attendance would be receiving one. When the envelope was opened, though, the name on the inside was "Everyone", so Intel would be giving each conference attendee an IdeaPad. He left it to Hohndel to later deliver the bad news to the roughly 200 Intel and Nokia employees in attendance; there would be no tablets for those folks.

"MeeGo Momentum and the Qt App Advantage"

[Alberto Torres]

Torres started his talk by "dispelling rumors" that Nokia might not be committed to MeeGo. He pointed to comments made by new CEO Stephen Elop that reiterated Nokia's commitment. Nokia plans to deliver a "new user experience" using MeeGo, Torres said. Furthermore, he believes that we are "redefining the future of computing" with the advent of widespread internet-connected mobile devices, and MeeGo has all the elements to foster that redefinition.

He looked back at some of the history of computers, noting that in the 1940s IBM's Thomas Watson suggested there was a total worldwide market for five computers. Since that time, the market has grown a bit, but the command line limited the use of computers to fairly technical users. In the 1970s, when Xerox PARC adopted the mouse and an interface with windows and icons, things really changed. That interface is a far more human way to interact with a computer, and it is largely the same interface that we have today.

Moving away from the command line meant that you didn't have to be an expert to use a computer and got people "starting to think about every home having a computer". Today, almost every home in the developed world does have a computer. Beyond that, smartphones are computers in our pockets, which allows computers to go places they never went before. But we haven't figured out major new ways to interact with those devices. That is good, because it allows us to define it, he said.

There are advances being made in touch devices using gestures and in motion-sensing gaming interfaces, both of which are more natural to use. He said that his daughter, who is not yet 2 years old, can do things with his smartphone, like use the photo gallery application. Gestures are "bringing computing to a level that is far more intuitive", which is leading to the idea of even more computers in the home. We may not call them computers, he said, but instead they will be called cars or TVs.

All of these different devices need to work together in an integrated way, with interfaces that work in a "human way". One of the strengths of MeeGo is that it was created from the start to go on all of these different kinds of devices. He believes we are going to see a proliferation of devices with MeeGo, and with many different interaction models: driving a car, playing a game or video in the back of the car, at home watching TV, and so on.

Qt for application development

Torres then shifted gears a bit to talk about Qt. It is much more than just a library, he said, it is a development platform incorporating things like database access, network connectivity, inter-object communication, WebKit integration, and more. He said that Qt enables C++ programmers to be four times more productive in developing code, and he expects the addition of Qt Declarative UI to increase that, perhaps as far as a 10x productivity increase.

Qt is also multi-platform and is used "everywhere". It started out as a desktop platform, but is on "all kinds of devices today". As an example of that, he had another Nokia employee demonstrate the same application running on MeeGo, Windows, Symbian, and embedded Linux. The animated photo browsing application was developed using Qt Quick, and could be run, unmodified, on each of the platforms. A Qt Quick application can be placed on a USB stick and moved between the various devices.

Nokia is a company that makes devices, and it "wants to put devices into people's hands that they fall in love with". MeeGo offers them a great opportunity to do that because of its "unique innovation model", which includes both openness and differentiation. Companies like Nokia, mobile phone carriers, TV makers, and so on can add things on top of the MeeGo platform to make themselves stand out. It might be a different user experience or add-on services that are added to differentiate the device, but that can be done on top of a non-fragmented platform with stable APIs. This allows those companies to express their creativity and brand without fragmentation.

The plan for Nokia is to provide "delicious hardware", with great connectivity, and a "fantastic user experience" on top. He again noted Nokia CEO Elop's statement that Nokia would be delivering a new standard for user experience on mobile devices. There are those who think that the user experience for devices has already been decided, but he pointed out that it took decades to decide on the standard interface for driving a car—"and we may not be done", noting that alternatives for car interfaces may be on the horizon.

"Creating a set of devices that are so cool that developers want to develop for them" is the approach Nokia and others are taking with MeeGo, Torres said. Some of those devices will be announced by Nokia in 2011. Given the growth in the MeeGo community, Torres joked that next year's MeeGo developer conference might need to use the outdoor part of the stadium to hold all of the attendees.

While there was much of interest in the visions presented, it is still an open question how many hackable MeeGo devices will become available. There wasn't anything said in the keynotes about devices that can be altered by users with their own ideas of how their MeeGo device should work. Instead, the focus was clearly on the kinds of things that MeeGo enables device manufacturers to do, without any real nod toward user freedoms. With luck, there will be some device makers who recognize the importance of free devices and will deliver some with MeeGo.

Comments (46 posted)

MeeGo beyond the mobile device

November 17, 2010

This article was contributed by Nathan Willis

The majority of the sessions (and indeed, attendees) at the MeeGo Conference in Dublin were focused on the handheld and netbook form factors, because the project emerged from the union of Intel's netbook-oriented Moblin and Nokia's handheld Maemo distributions. As a result it is easy to overlook the fact that the project has added several significantly different target platforms since its inception in February. The "connected TV" and "in-vehicle infotainment" (IVI) platforms share a few common factors with handheld devices, such as near-instant-on boot requirements and remote-management capabilities, but as Monday's talks explained, they also stretch the MeeGo software stack at almost every level, from non-PC hardware support, to different audio and video middleware, to different user interfaces and I/O devices.

Set-top Linux

[Dominique Le Foll]

Dominique Le Foll of Cambridge, UK-based Amino Communications presented two talks about his company's work on the connected TV user experience (UX) for MeeGo. Amino builds MeeGo-based set-top boxes for Europe and North America, generally tailored for television service providers. Le Foll's first talk was one of the Monday-morning keynotes, and focused on Amino's decision to build its products on a "full Linux distribution" rather than a stripped-down embedded Linux platform.

MeeGo's structure as a full distribution lowers the company's development costs, he said, because it permits the team to automatically stay compatible with upstream projects. In contrast, typical embedded distributions tend to use a reduced set of packages and libraries, and usually take the freeze-and-fork approach to what they do include, thus forcing the developers to spend time backporting bug fixes and major updates. In addition, he said, building the company's products — which include custom applications written for each customer — takes less development time, because they can use the standard desktop Linux development tools, and easily build on top of desktop projects that are rarely included in embedded distributions, such as VoIP, video conferencing, and social networking.

Le Foll's second talk focused more in-depth on the MeeGo software stack and what it needs to become a ready-to-deploy set-top box platform. The five "required" services all set-top devices need to support, he said, are live, broadcast television (based on DVB or ATSC program delivery), Internet video, access to home content (including video, audio, and other media), video-on-demand (VOD) service, and third-party, easy-to-install "apps" of the kind currently popular on consumer smartphones. Each brings its share of challenges to the MeeGo platform.

Broadcast television and VOD services both require some security mechanism with which service providers can implement mandatory access control on specific content streams. This includes DRM and a hardware chain of trust, as well as software modules that can prevent unauthorized applications from accessing protected content or driving special-purpose hardware. Internet video requires, yes, Adobe Flash support — specifically Flash support capable of running on the lower-resource system-on-chip hardware typically used to build set-top boxes.

Access to home content entails seamless playback of a glut of different, often unpredictable video and audio formats, which Le Foll suggested would best be handled by a single unified media-playback application that is decoupled from the content sources. The player, he argued, should not need to know whether the video is coming in live from an antenna, being streamed over IPTV or RTP, or is stored on a network drive. Amino uses GStreamer in its products, and says that it is capable of playing all of the necessary codecs, including broadcast HDTV, but that it lacks a few critical pieces, such as hardware video acceleration and integrated multi-language and subtitle/caption support. Here again, he said, the real need is for a simple playback application that can play back European teletext, US-style closed captioning, and DVD subtitles, without caring which format the underlying source originated in.
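
For readers unfamiliar with GStreamer, its high-level playbin element already behaves in roughly that source-agnostic way: an application hands it a URI and the library assembles the demuxing and decoding chain itself. A minimal sketch, using the GStreamer 0.10 API of the day and a purely illustrative file URI, might look like this:

    #include <gst/gst.h>

    int main(int argc, char *argv[])
    {
        gst_init(&argc, &argv);

        /* playbin2 accepts any URI (file, HTTP, RTP, ...) and builds the
         * decode pipeline itself; the player need not know the source. */
        GstElement *player = gst_element_factory_make("playbin2", "player");
        g_object_set(G_OBJECT(player), "uri",
                     "file:///home/user/example.mkv", NULL);  /* illustrative path */
        gst_element_set_state(player, GST_STATE_PLAYING);

        /* Block until playback finishes or an error occurs. */
        GstBus *bus = gst_element_get_bus(player);
        GstMessage *msg = gst_bus_timed_pop_filtered(bus, GST_CLOCK_TIME_NONE,
                              GST_MESSAGE_ERROR | GST_MESSAGE_EOS);
        if (msg)
            gst_message_unref(msg);
        gst_element_set_state(player, GST_STATE_NULL);
        gst_object_unref(bus);
        gst_object_unref(player);
        return 0;
    }

The pieces Le Foll listed as missing, such as hardware acceleration and unified subtitle handling, would sit below or alongside an application of this kind.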

Regarding the access control measures and Flash, Le Foll was considering MeeGo set-top boxes as commercial products, of course, to be built by OS-integrators like Amino and sold and deployed by cable companies, satellite TV providers, IPTV distributors, and other content service providers. Do-it-yourself types with an aversion to Flash and no interest in DRM might bristle at the thought of adding them to a Linux distribution, but they would be under no obligation to make use of them, a point which Le Foll clarified in response to an audience question.

On top of the low-level media support, he added, there are several "invisible" things that MeeGo needs to add in order to be a robust connected-TV platform. These include support for remote software updates, automatic backup-and-recovery, and other management tasks that would be infeasible to require non-technical users to perform on their own, and difficult to execute with an infrared remote control. In many countries, he continued, there are legal certification requirements for set-top boxes that entail technical features, such as interfacing with the local emergency broadcast services. Support for infrared remotes is another area in which MeeGo needs significant development, he added, a feature that touches on both hardware drivers and the user interface. Set-top box products demand IR remotes and easy-to-decipher interfaces that can be used from ten feet ("or three meters") away on the couch. Though touch-screen support and gesture interfaces are all the rage in mobile MeeGo device development, he said, they are useless in the set-top environment.

Perhaps the most interesting feature in Le Foll's list of five required services is support for end user "apps." This, he explained, is the feature most often requested by the television service providers, who have watched the success of Apple's App Store on the iPhone with envy. In recent years, service providers have tried a number of means to dissuade customers from switching services, including (most recently) "bundling" television service with phone service and Internet access, and all have failed. They are now looking to differentiate their service from the competition with apps on set-top boxes, Le Foll said, which leaves MeeGo exceptionally well positioned to meet their needs. For open source developers, this opens up the possibility of developing MeeGo applications for handsets and netbooks that will also run, unaltered, on the next generation of set-top boxes.

Vehicular MeeGo

Another challenging difference in the set-top box environment that Le Foll touched on in his talks is that netbooks and handhelds are essentially single-user devices — while the TV and home theater are shared by the entire household. This distinction has an effect on all sorts of applications, from privacy concerns to customization issues, that developers need to consider when porting their code to the new environment.

[Rudolph Streif]

The same is true of the IVI platform; not only can one vehicle be driven by many members of a household, but an IVI system often needs to consider many users at once. The driver may be using navigation while passengers in the back seat watch rear-seat-entertainment (RSE) consoles, each displaying different content; yet the IVI system also needs to be able to override all of the separate audio zones to sound an alert if the car's proximity sensor detects it is about to back into the curb.

Rudolf Streif, from the Linux Foundation's MeeGo IVI Working Group, presented an overview of the MeeGo IVI platform on Monday afternoon, including the missing pieces needed to build MeeGo into a solid IVI base. In addition to multi-zone audio and video, an IVI system also needs to support split-screen and layered video — for example to permit alerts or hands-free phone call messages to pop up as higher-priority overlays on top of an existing video layer. But the human-machine-interface (HMI) layer in a vehicle system also has to cope with a different set of user input devices, such as physical buttons and knobs on dash units and steering wheels, and simple integration with consumer electronic devices like MP3 players and phones.

The hardware layer also needs to support a variety of device buses used to connect data sensors (speed, fuel level, etc.). There are several industry standards in wide deployment, Streif said, including Controller Area Network (CAN) and Media Oriented Systems Transport (MOST). Supporting them in open source is challenging, he added, because many car-makers have implemented their own brand-specific variations of the standard, and some (like MOST) are not freely or publicly available. For application developers, of course, MeeGo would also need to provide a bus-neutral common API to access this sensor data and (where applicable) to control vehicle hardware.

There are several areas of the middleware stack where MeeGo — and even Linux and open source in general — currently falls short. One (mentioned in Streif's talk and also raised in the IVI birds-of-a-feather session held later that afternoon) is voice control, specifically speech recognition and speech synthesis. There are few open source projects tackling these tasks, and most of those are academic in nature and not easily integrated with upstream projects. Because hands-free phone operation is critical (even a legal requirement in many areas), there is a need for good acoustic echo cancellation and noise suppression, neither of which is currently well supported in an open source project.

IVI devices are even more sensitive to fast boot times and fast application start-up than are entertainment devices, plus they must be prepared to cope with unregulated DC power from batteries and shut down safely and quickly when power is cut off. Like the situation with entertainment devices, most end users are not prepared to or interested in performing system updates, so remote management is a must. But unlike set-top boxes or even phones, car IVI systems are generally designed to have a ten-year lifespan. That poses a challenge not only for hardware makers, but for the MeeGo project itself and its application compliance program.

The IVI Working Group includes a diverse group of collaborators, including silicon vendors like Intel, car makers and Tier 1 automotive suppliers, industry consortia like GENIVI, and automotive software developers like Pelagicore AB. Involvement by the existing MeeGo development community has been slow to build, owing in no small part to the long product development cycle of the auto industry, but Streif and other members of the project were actively seeking input and participation from community members.

Where else can MeeGo go

At first blush, vehicle computing and set-top boxes sound like a radical departure from MeeGo's portable-device beginnings. Listening to the talks, however, it becomes clear that in both cases, there is an industry that up until now had been dominated by traditional embedded systems — and often proprietary operating systems and software stacks — which sees the success of Linux in smartphones and wants to emulate it. Open source software on smartphones took decades to arrive; at the very least the opportunity presented by MeeGo on the set-top box and IVI fronts is one where open source software can make a strong showing from the beginning. Beyond that, it may allow free software advocates to push back on some issues like closed and royalty-bearing standards that currently inhibit development.

The first big bullet point made in all of Monday morning's keynotes was that MeeGo is designed to present a unified Linux-based stack for the embedded market, averting the fragmentation that dogged early Linux smartphone development. That is clearly welcome news to the device makers. But the second big bullet point was that MeeGo presents a unified Linux distribution that is compatible with upstream projects and desktop distributions — which ought to be welcome news to open source developers. Le Foll and Streif both discussed examples of how industry product vendors (television service providers and car-makers, respectively) were eager to get on board with the mobile application craze; having those platforms be compatible with Linux desktops is a clear win. Don't think that it stops there, either — although there were no talks on the program about them, more MeeGo platforms kept cropping up in the middle of people's sessions, including everything from desktop video-phones to digital signage.

Comments (8 posted)

Ghosts of Unix past, part 3: Unfixable designs

November 16, 2010

This article was contributed by Neil Brown

In the second installment of this series, we documented two designs that were found to be imperfect and have largely (though not completely) been fixed through ongoing development. Though there was some evidence that the result was not as elegant as we might have achieved had the original mistakes not been made, it appears that the current design is at least adequate and on a path towards being good.

However, there are some design mistakes that are not so easily corrected. Sometimes a design is of such a character that fixing it is never going to produce something usable. In such cases it can be argued that the best way forward is to stop using the old design and to create something completely different that meets the same need. In this episode we will explore two designs in Unix which have seen multiple attempts at fixes but for which it isn't clear that the result is even heading towards "good". In one case a significant change in approach has produced a design which is both simpler and more functional than the original. In the other case, we are still waiting for a suitable replacement to emerge. After exploring these two "unfixable designs" we will try to address the question of how to distinguish an unfixable design from a poor design which can, as we saw last time, be fixed.

Unix signals

Our first unfixable design involves the delivery of signals to processes. In particular it is the registration of a function as a "signal handler" which gets called asynchronously when the signal is delivered. That this design was in some way broken is clear from the fact that the developers at UCB (The University of California at Berkeley, home of BSD Unix) found the need to introduce the sigvec() system call, along with a few other calls, to allow individual signals to be temporarily blocked. They also changed the semantics of some system calls so that they would restart rather than abort if a signal arrived while the system call was active.

It seems there were two particular problems that these changes tried to address. Firstly there is the question of when to re-arm a signal handler. In the original Unix design a signal handler was one-shot - it would only respond the first time a signal arrived. If you wanted to catch a subsequent signal you would need to make the signal handler explicitly re-enable itself. This can lead to races: if a signal is delivered before the signal handler is re-enabled, it can be lost forever. Closing these races involved creating a facility for keeping the signal handler always available, and blocking new deliveries while the signal was being processed.
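
A sketch of that one-shot style makes the window obvious. It assumes the historical semantics in which delivery resets the disposition to the default; on modern Linux, signal() typically provides the BSD behavior and does not reset the handler.

    #include <signal.h>
    #include <unistd.h>

    static void on_int(int sig)
    {
        /* Re-arm the handler. Under the original semantics, a second SIGINT
         * arriving before this call completes takes the default action and
         * kills the process. */
        signal(SIGINT, on_int);
        write(2, "caught SIGINT\n", 14);
    }

    int main(void)
    {
        signal(SIGINT, on_int);
        for (;;)
            pause();    /* wait for signals */
    }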

The other problem involves exactly what to do if a signal arrives while a system call is active. Options include waiting for the system call to complete, aborting it completely, allowing it to return partial results, or allowing it to restart after the signal has been handled. Each of these can be the right answer in different contexts; sigvec() tried to provide more control so the programmer could choose between them.

Even these changes, however, were not enough to make signals really usable, so the developers of System V (at AT&T) found the need for a sigaction() call which adds some extra flags to control the fine details of signal delivery. This call also allows a signal handler to be passed a "siginfo_t" data structure with information about the cause of the signal, such as the UID of the process which sent the signal.
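
As a rough illustration of the interface (not tied to any particular program), a handler registered with SA_SIGINFO receives that structure, and sa_flags is where details such as restarting interrupted system calls are chosen:

    #include <signal.h>
    #include <stdio.h>
    #include <unistd.h>

    static void on_usr1(int sig, siginfo_t *info, void *ucontext)
    {
        /* printf() is not async-signal-safe; it is used here only to keep
         * the example short. */
        printf("SIGUSR1 from pid %ld, uid %ld\n",
               (long)info->si_pid, (long)info->si_uid);
    }

    int main(void)
    {
        struct sigaction sa = { 0 };
        sa.sa_sigaction = on_usr1;
        sa.sa_flags = SA_SIGINFO | SA_RESTART;  /* restart interrupted system calls */
        sigemptyset(&sa.sa_mask);               /* block nothing extra in the handler */
        sigaction(SIGUSR1, &sa, NULL);
        for (;;)
            pause();
    }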

As these changes, particularly those from UCB, were focused on providing "reliable" signal delivery, one might expect that at least the reliability issues would be resolved. Not so it seems. The select() system call (and related poll()) did not play well with signals so pselect() and ppoll() had to be invented and eventually implemented. The interested reader is encouraged to explore their history. Along with these semantic "enhancements" to signal delivery, both teams of developers chose to define more signals generated by different events. Though signal delivery was already problematic before these were added, it is likely that these new demands stretched the design towards breaking point.
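
The problem pselect() addresses is the gap between testing a flag set by a handler and entering select(): a signal arriving in that gap is not noticed until some other event wakes the process. The usual pattern, sketched below with an assumed SIGTERM handler, keeps the signal blocked except while pselect() itself is waiting:

    #include <signal.h>
    #include <sys/select.h>
    #include <unistd.h>

    static volatile sig_atomic_t got_sigterm;
    static void on_term(int sig) { got_sigterm = 1; }

    int main(void)
    {
        struct sigaction sa = { 0 };
        sa.sa_handler = on_term;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGTERM, &sa, NULL);

        /* Keep SIGTERM blocked so it can only be delivered inside pselect(). */
        sigset_t blocked, orig;
        sigemptyset(&blocked);
        sigaddset(&blocked, SIGTERM);
        sigprocmask(SIG_BLOCK, &blocked, &orig);

        while (!got_sigterm) {          /* checked with the signal blocked: no race */
            fd_set rfds;
            FD_ZERO(&rfds);
            FD_SET(STDIN_FILENO, &rfds);
            /* pselect() atomically unblocks SIGTERM while it waits. */
            if (pselect(STDIN_FILENO + 1, &rfds, NULL, NULL, NULL, &orig) > 0) {
                char buf[256];
                read(STDIN_FILENO, buf, sizeof(buf));
            }
        }
        return 0;
    }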

An interesting example is SIGCHLD and SIGCLD, which are sent when a child exits or is otherwise ready for the parent to wait() for it. The difference between these two (apart from the letter "H" and different originating team) is that SIGCHLD is delivered once per event (as is the case with other signals) while SIGCLD would be delivered constantly (unless blocked) while any child is ready to be waited for. In the language of hardware interrupts, SIGCHLD is edge triggered while SIGCLD is level triggered. The choice of a level-triggered signal might have been an alternate attempt to try to improve reliability. Adding SIGCLD was more than just defining a new number and sending the signal at the right time. Two of the new flags added for sigaction() are specifically for tuning the details of handling this signal. This is extra complexity that signals didn't need and which arguably did not belong there.

In more recent years the collection of signal types has been extended to include "realtime" signals. These signals are user-defined signals (like SIGUSR1 and SIGUSR2) which are only delivered if explicitly requested in some way. They have two particular properties. Firstly, realtime signals are queued so the handler in the target process is called exactly as many times as the signal was sent. This contrasts with regular signals which simply set a flag on delivery. If a process has a given (regular) signal blocked and the signal is sent several times, then, when the process unblocks the signal, it will still only see a single delivery event. With realtime signals it will see several. This is a nice idea, but introduced new reliability issues as the depth of the queue was limited, so signals could still be lost. Secondly (and this property requires the first), a realtime signal can carry a small datum, typically a number or a pointer. This can be sent explicitly with sigqueue() or less directly with, e.g., timer_create().
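
The sending side is the simplest place to see the extra datum; this sketch (the target pid comes from the command line, and the value 42 is arbitrary) queues one realtime signal:

    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>

    int main(int argc, char **argv)
    {
        if (argc < 2) {
            fprintf(stderr, "usage: %s <pid>\n", argv[0]);
            return 1;
        }
        union sigval value;
        value.sival_int = 42;    /* the small datum carried with the signal */
        if (sigqueue((pid_t)atoi(argv[1]), SIGRTMIN, value) != 0) {
            perror("sigqueue");
            return 1;
        }
        return 0;
    }

On the receiving side, a handler registered with SA_SIGINFO finds the datum in the si_value field of its siginfo_t argument.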

It could be thought that this addition of more signals for more events is a good example of the "full exploitation" pattern that was discussed at the start of this series. However, when adding new signal types requires significant changes to the original design, it could equally seem that the original design wasn't really strong enough to be so fully exploited. As can be seen from this retrospective, though the original signal design was quite simple and elegant, it was fatally flawed. The need to re-arm signals made them hard to use reliably, the exact semantics of interrupting a system call were hard to get right, and developers repeatedly needed to significantly extend the design to make it work with new types of signals.

The most recent step in the saga of signals is the signalfd() system call which was introduced to Linux in 2007 for 2.6.22. This system call extends "everything has a file descriptor" to work for signals too. Using this new type of descriptor returned by signalfd(), events that would normally be handled asynchronously via signal handlers can now be handled synchronously just like all I/O events. This approach makes many of the traditional difficulties with signals disappear. Queuing becomes natural so re-arming becomes a non-issue. Interaction with system calls ceases to be interesting and an obvious way is provided for extra data to be carried with a signal. Rather than trying to fix a problematic asynchronous delivery mechanism, signalfd() replaces it with a synchronous mechanism that is much easier to work with and which integrates well into other aspects of the Unix design - particularly the universality of file descriptors.
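
A minimal sketch of the synchronous style on Linux (2.6.22 or later) blocks the signals of interest, then reads signalfd_siginfo structures from the descriptor like any other input; the descriptor could equally be handed to poll() or select():

    #include <signal.h>
    #include <stdio.h>
    #include <sys/signalfd.h>
    #include <unistd.h>

    int main(void)
    {
        sigset_t mask;
        sigemptyset(&mask);
        sigaddset(&mask, SIGINT);
        sigaddset(&mask, SIGTERM);

        /* Block normal delivery so the signals are reported only via the fd. */
        sigprocmask(SIG_BLOCK, &mask, NULL);

        int fd = signalfd(-1, &mask, 0);
        if (fd < 0) {
            perror("signalfd");
            return 1;
        }

        struct signalfd_siginfo si;
        while (read(fd, &si, sizeof(si)) == sizeof(si)) {
            printf("got signal %u from pid %u\n", si.ssi_signo, si.ssi_pid);
            if (si.ssi_signo == SIGTERM)
                break;
        }
        close(fd);
        return 0;
    }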

It is a fun, though probably pointless, exercise to imagine what the result might have been had this approach been taken to signals when problems were first observed. Instead of adding new signal types we might have new file descriptor types, and the set of signals that were actually used could have diminished rather than grown. Realtime signals might instead be a general and useful form of interprocess communication based on file descriptors.

It should be noted that there are some signals which signalfd() cannot be used for. These include SIGSEGV, SIGILL, and other signals that are generated because the process tried to do something impossible. Just queueing these signals to be processed later cannot work; the only alternatives are switching control to a signal handler or aborting the process. These cases are handled perfectly by the original signal design. They cannot occur while a system call is active (system calls return EFAULT rather than raising a signal) and issues with when to re-arm the signal handler are also less relevant.

So while signal handlers are perfectly workable for some of the early use cases (e.g. SIGSEGV) it seems that they were pushed beyond their competence very early, thus producing a broken design for which there have been repeated attempts at repair. While it may now be possible to write code that handles signal delivery reliably, it is still very easy to get it wrong. The replacement that we find in signalfd() promises to make event handling significantly easier and so more reliable.

The Unix permission model

Our second example of an unfixable design which is best replaced is the owner/permission model for controlling access to files. A well known quote attributed to H. L. Mencken is "there is always a well-known solution to every human problem - neat, plausible, and wrong." This is equally true of computing problems, and the Unix permissions model could be just such a solution. The initial idea is deceptively simple: six bytes per file gives simple and broad access control. When designing an operating system to fit in 32 kilobytes of RAM (or less), such simplicity is very appealing, and thinking about how it might one day be extended is not a high priority, which is understandable though unfortunate.

The main problem with this permission model is that it is both too simple and too broad. The breadth of the model is seen in the fact that every file stores its own owner, group owner, and permission bits. Thus every file can have distinct ownership or access permissions. This is much more flexibility than is needed. In most cases, all the files in a given directory, or even directory tree have the same ownership and much the same permissions. This fact was leveraged by the Andrew filesystem which only stores ownership and permissions on a per-directory basis, with little real loss of functionality.
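
All of that per-file state fits in the uid, gid, and mode fields that stat() returns; a trivial sketch (using /etc/passwd only as a convenient example file):

    #include <stdio.h>
    #include <sys/stat.h>

    int main(void)
    {
        struct stat st;
        /* One owner, one group, and a handful of mode bits per file. */
        if (stat("/etc/passwd", &st) == 0)
            printf("uid=%ld gid=%ld mode=%o\n",
                   (long)st.st_uid, (long)st.st_gid,
                   (unsigned)(st.st_mode & 07777));
        return 0;
    }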

When this only costs six bytes per file it might seem a small price to pay for the flexibility. However once more than 65,536 different owners are wanted, or more permission bits and more groups are needed, storing this information begins to become a real cost. The bigger cost, however, is in usability.

While a computer may be able to easily remember six bytes per file, a human cannot easily remember why various different settings might have been assigned and so is very likely to create sets of permission settings which are inconsistent, inappropriate, and hence not particularly secure. Your author has memories from University days of often seeing home directories given "0777" permissions (everyone has any access) simply because a student wanted to share one file with a friend, but didn't understand the security model.

The excessive simplicity of the Unix permission model is seen in the fixed, small number of permission bits, and, particularly, that there is only one "group" that can have privileged access. Another maxim from computer engineering, attributed to Alan Kay, is that "Simple things should be simple, complex things should be possible." The Unix permission model makes most use cases quite simple but once the need exceeds that common set of cases, further refinement becomes impossible. The simple is certainly simple, but the complex is truly impossible.

It is here that we start to see real efforts to try to "fix" the model. The original design gave each process a "user" and a "group" corresponding to the "owner" and "group owner" in each file, and they were used to determine access. The "only one group" limit is limiting on both sides; the Unix developers at UCB saw that, for the process side at least, this limit was easy to extend. They allowed a process to have a list of groups for checking filesystem access against. (Unfortunately this list originally had a firm upper limit of 16, and that limit made its way into the NFS protocol where it was hard to change and is still biting us today.)

Changing the per-file side of this limit is harder as that requires changing the way data is encoded in a filesystem to allow multiple groups per file. As each group would also need its own set of permission bits a file would need a list of groups and permission bits and these became known quite reasonably as "access control lists" or ACLs. The POSIX standardization effort made a couple of attempts to create a standard for ACLs, but never got past draft stage. Some Unix implementations have implemented these drafts, but they have not been widely successful.

The NFSv4 working group (under the IETF umbrella) was tasked with creating a network filesystem which, among other goals, would provide interoperability between POSIX and WIN32 systems. As part of this effort they developed yet another standard for ACLs which aimed to support the access model of WIN32 while still being usable on POSIX. Whether this will be more successful remains to be seen, but it seems to have a reasonable amount of momentum with an active project trying to integrate it into Linux (under the banner of "richacls") and various Linux filesystems.

One consequence of using ACLs is that the per-file storage space needed to store the permission information is not only larger than six bytes, it is not of a fixed length. This is, in general, more challenging than any fixed size. Those filesystems which implement these ACLs do so using "extended attributes" and most impose some limit on the size of these - each filesystem choosing a different limit. Hopefully most ACLs that are actually used will fit within all these arbitrary limits.

Some filesystems - ext3 at least - attempt to notice when multiple files have the same extended attributes and just store a single copy of those attributes, rather than one copy for each file. This goes some way to reduce the space cost (and access-time cost) of larger ACLs that can be (but often aren't) unique per file, but does nothing to address the usability concerns mentioned earlier. In that context, it is worth quoting Jeremy Allison, one of the main developers of Samba, and so with quite a bit of experience with ACLs from WIN32 systems and related interoperability issues. He writes: "But Windows ACLs are a nightmare beyond human comprehension :-). In the 'too complex to be usable' camp." It is worth reading the context and follow up to get a proper picture, and remembering that richacls, like NFSv4 ACLs, are largely based on WIN32 ACLs.

Unfortunately it is not possible to present any real example of replacing rather than fixing the Unix permission model. One contender might be that part of "SELinux" that deals with file access. This doesn't really aim to replace regular permissions but rather tries to enhance them with mandatory access controls. SELinux follows much the same model of Unix permissions, associating a security context with every file of interest, and does nothing to improve the usability issues.

There are however two partial approaches that might provide some perspective. One partial approach began to appear in Version 7 Unix with the chroot() system call. It appears that chroot() wasn't originally created for access control but rather to have a separate namespace in which to create a clean filesystem for distribution. However it has since been used to provide some level of access control, particularly for anonymous FTP servers. This is done by simply hiding all the files that the FTP server shouldn't access. Anything that cannot be named cannot be accessed.
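
The anonymous-FTP use of chroot() amounts to only a few lines; this sketch assumes a hypothetical /srv/ftp tree and a process with the privilege to call chroot():

    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        /* After this, "/" for this process is /srv/ftp; files outside it
         * cannot even be named, let alone accessed. */
        if (chroot("/srv/ftp") != 0 || chdir("/") != 0) {
            perror("chroot");
            return 1;
        }
        /* ... drop privileges and serve files ... */
        return 0;
    }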

This concept has been enhanced in Linux with the possibility for each process not just to have its own filesystem root, but also to have a private set of mount points with which to build a completely customized namespace. Further it is possible for a given filesystem to be mounted read-write in one namespace and read-only in another namespace, and, obviously, not at all in a third. This functionality is suggestive of a very different approach to controlling access permissions. Rather than access control being per-file, it allows it to be per-mount. This leads to the location of a file being a very significant part of determining how it can be accessed. Though this removes some flexibility, it seems to be a concept that human experience better prepares us to understand. If we want to keep a paper document private we might put it in a locked drawer. If we want to make it publicly readable, we distribute copies. If we want it to be writable by anyone in our team, we pin it to the notice board in the tea room.
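
On Linux the building blocks for this are unshare() and bind mounts. The following privileged sketch (the /shared/docs path is purely illustrative, and on systems with shared mount propagation the new namespace would first need to be made private) makes a directory read-only in this process's namespace while leaving it writable elsewhere:

    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>
    #include <sys/mount.h>

    int main(void)
    {
        /* Give this process a private copy of the mount table. */
        if (unshare(CLONE_NEWNS) != 0) {
            perror("unshare");
            return 1;
        }
        /* Bind the directory over itself, then remount that binding read-only. */
        if (mount("/shared/docs", "/shared/docs", NULL, MS_BIND, NULL) != 0 ||
            mount(NULL, "/shared/docs", NULL,
                  MS_REMOUNT | MS_BIND | MS_RDONLY, NULL) != 0) {
            perror("mount");
            return 1;
        }
        return 0;
    }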

This approach is clearly less flexible than the Unix model as the control of permissions is less fine grained, but it could well make up for that in being easier to understand. Certainly by itself it would not form a complete replacement, but it does appear to be functionality that is growing - though it is too early yet to tell if it will need to grow beyond its strength. One encouraging observation is that it is based on one of those particular Unix strengths observed in our first pattern, that of "a hierarchical namespace" which would be exploited more fully.

A different partial approach can be seen in the access controls used by the Apache web server. These are encoded in a domain-specific language and stored in centralized files or in ".htaccess" files near the files that are being controlled. This method of access control has a number of real strengths that would be a challenge to encode into anything based on the Unix permission model:

  • The permission model is hierarchical, matching the filesystem model. Thus controls can be set at whichever point makes most sense, and can be easily reviewed in their entirety. When the controls set at higher levels are not allowed to be relaxed at lower levels it becomes easy to implement mandatory access controls.

  • The identity of the actor requesting access can be arbitrary, rather than just from the set of identities that are known to the kernel. Apache allows control based on source IP address or username plus password. Using plug-in modules, almost anything else that might be available can be used as well.

  • Access can be provided indirectly through a CGI program. Thus, rather than trying to second-guess all possible access restrictions that might be desirable and define permission bits for them in a new ACL, the model can allow any arbitrary action to be controlled by writing a suitable script to mediate that access.

It should be fairly obvious that this model would not be an easy fit with kernel-based access checking and, in any case, would have a higher performance cost than a simpler model. As such it would not be suitable to apply universally. However it could be that such a model would be suitable for that small percentage of needs that do not fit in a simple namespace based approach. There the cost might be a reasonable price for the flexibility.

While alternative approaches such as these might be appealing, they would face a much bigger barrier to introduction than signalfd() did. signalfd() could be added as a simple alternative to signal handlers. Programs could continue to use the old model with no loss, while new programs could make use of the new functionality. With permission models, it is not so easy to have two schemes running in parallel. People who make serious use of ACLs will probably already have a bunch of ACLs carefully tuned to their needs and enabling an alternate parallel access mechanism is very likely to break something. So this is the sort of thing that would best be trialed in a new installation rather than imposed on an existing user-base.

Discerning the pattern

If we are to have a convincing pattern of "unfixable designs" it must be possible to distinguish them from fixable designs such as those that we found last time. In both cases, each individual fix appears to be a good idea addressing a real problem without obviously introducing more problems. In some cases this series of small steps leads to a good result; in others, these steps only help you get past the small problems enough to be able to see the bigger problem.

We could use mathematical terminology to note that a local maximum can be very different from a global maximum. Or, using mountain-climbing terminology, it is hard to know the true summit from a false summit which just gives you a better view of the mountain. In each case the missing piece is a large scale perspective. If we can see the big picture we can more easily decide if a particular path will lead anywhere useful or if it is best to head back to base and start again.

Trying to move this discussion back to the realm of software engineering, it is clear that we can only head off unfixable designs if we can find a position that can give us a clear and broad perspective. We need to be able to look beyond the immediate problem, to see the big picture and be willing to tackle it. The only known source of perspective we have for engineering is experience, and few of us have enough experience to see clearly into the multiple facets and the multiple levels of abstraction that are needed to make right decisions. Whether we look for such experience by consulting elders, by researching multiple related efforts, or finding documented patterns that encapsulate the experience of others, it is vitally important to leverage any experience that is available rather than run the risk of simply adding bandaids to an unfixable design.

So there is no easy way to distinguish an unfixable design from a fixable one. It requires leveraging the broad perspective that is only available through experience. Having seen the difficulty of identifying unfixable designs early we can look forward to the final part of this series, where we will explore a pernicious pattern in problematic design. While unfixable designs give a hint of deeper problems by appearing to need fixing, these next designs do not even provide that hint. The hints that there is a deeper problem must be found elsewhere.

Exercises

  1. Though we found that signal handlers had been pushed well beyond their competence, we also found at least one area (i.e. SIGSEGV) where they were still the right tool for the job. Determine if there are other use cases that avoid the observed problems, and so provide a balanced assessment of where signal handlers are effective, and where they are unfixable.

  2. Research problems with "/tmp", attempts to fix them, any unresolved issues, and any known attempts to replace rather than fix this design.

  3. Describe an aspect of the IP protocol suite that fits the pattern of an "Unfixable design".

  4. It has been suggested that dnotify, inotify, fanotify are all broken. Research and describe the problems and provide an alternate design that avoids all of those issues.

  5. Explore the possibility of using fanotify to implement an "apache-like" access control scheme with decisions made in user-space. Identify the enhancements required to fanotify for this to be practical.

Next article

Ghosts of Unix past, part 4: High-maintenance designs

Comments (110 posted)

Page editor: Jonathan Corbet


Copyright © 2010, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds