LWN.net Weekly Edition for February 3, 2011

LCA 2011

By Jonathan Corbet
February 1, 2011
Our community has a number of volunteer-organized events; some of them are rather more organized than others. Anybody who thinks that volunteers cannot produce a professional-quality (or better) event, though, has never been to linux.conf.au. LCA is not just run by volunteers; it is organized by a completely different group of volunteers every year, but it still comes off reliably, every time, as a top-quality conference. Each year's organizers clearly deserve a lot of credit, but there is also a lot of value in the LCA "ghosts" institution, whereby organizers from previous years give advice and keep a watch for red flags as the planning and preparations go forward. Without the ghosts, LCA would not be what it is.

Now imagine that you have been planning an event for over a year. Two weeks before the conference, venues, equipment, accommodations, transportation, social events, and more are all in place. Then the host city is hit by catastrophic floods, the venues for both the conference and the social events are taken out of commission, and the routers for the wireless network are soaking at the wrong end of a flooded warehouse. Even if a new venue can be found, it will no longer be within walking distance of the accommodations, so transportation must be arranged on short notice.

That is the point where the ghosts run out of useful experience to share. It is also the point where an insufficiently determined group would simply give up.

The organizers of LCA 2011, held in Brisbane, would appear to be a determined bunch indeed. They found a new venue, reprinted the conference maps, found new locations for the social events, swam through the warehouse to recover the routers, arranged new transportation for the attendees, and, beyond any doubt, did a thousand things that nobody else saw. The end result was a conference which, barring knowledge to the contrary, would have seemed like they had planned it that way all along. LCA 2011 didn't just work - it worked just as well as its predecessors. One easily runs out of superlatives when describing the job this group did; your editor only hopes that, after they have slept for a solid week or so, they have arranged a major party to celebrate what they accomplished.

There were a number of interesting sessions at this conference, many of which have been covered in these pages. Here, your editor will summarize some of the talks which, for various reasons (including simple time) were not discussed in a separate article.

Andrew 'Tridge' Tridgell has developed a reputation for energetic LCA talks focused on the simple joy of hacking; his LCA 2011 talk did not disappoint. Tridge, it seems, has become something of a coffee snob, so he has taken to roasting his own beans. That turns out to be an attention-intensive process which takes too much time away from the hacking that coffee is meant to support, so he built a Linux-powered coffee roaster out of an old bread maker, a temperature sensor, a heat gun, and a hand-made circuit for power regulation.

While demonstrating the device and hoping the fire alarms did not sound, he went into the specifics of coffee roasting and the details of how one uses LD_PRELOAD to reverse engineer a Windows temperature driver running under VirtualBox on Linux. A good time was clearly had by all. Bdale Garbee's session on the creation of a large, Linux-powered milling machine had a similar feel. Both talks will be well worth watching once the videos become available.

Daniel Bentley and Daniel Nadasi talked about the challenges that go with opening up code at Google. Internal programs tend to be heavily used and have a lot of internal contributors; these people often have a lot of worries when they are approached about releasing their code to the world. They have to be sold on the business case for opening the code, and they have to be talked past worries that their code is too ugly to see the light of day. There are also some real concerns that opening code might reveal internal information and that working with the community might slow the project down. Changing source control and build systems can also be a challenge; apparently few people at Google still remember how to write a classic makefile.

An important question is: where is the home for the code's further development? If it's developed internally, the internal folks are happy because things are working as they were before. Outsiders, who see a series of code dumps, may be less impressed. If development happens publicly then outside developers will be happier, but it can be harder for internal developers. An added factor is that any project, no matter how successfully it is opened, will be dominated by internal developers during the first part of its open existence; that tends to drive the internal development model, but that, in turn, can slow (or prevent) the development of a community around the code.

Daniel and Daniel's response to this problem is a tool called "make open easy," or "moe." With moe, internal developers can mark sections of code which should not be visible to the outside world; markings can take the form of function annotations or preprocessor-like directives. The tool can then extract the code from the internal repository, edit it according to the directives, and load it into a public repository. Importantly, it can also move code in the other direction, merging external changes while retaining the scrubbing directives. Moe makes lives easier on both sides of the wall, and is in active use with a number of projects; it can be obtained from code.google.com.
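The scrubbing step described above can be pictured with a small sketch. Everything here is an assumption made for illustration: the `INTERNAL-BEGIN` and `INTERNAL-END` markers and the `scrub()` function are hypothetical and are not moe's actual directive syntax, which the talk summary does not spell out.

```python
# Hypothetical directive markers, invented for this sketch (not moe's syntax).
BEGIN_MARK = "# INTERNAL-BEGIN"
END_MARK = "# INTERNAL-END"


def scrub(source: str) -> str:
    """Return source with every marked internal-only region removed."""
    public_lines = []
    hidden = False
    for line in source.splitlines():
        stripped = line.strip()
        if stripped == BEGIN_MARK:
            if hidden:
                raise ValueError("nested INTERNAL-BEGIN")
            hidden = True
            continue
        if stripped == END_MARK:
            if not hidden:
                raise ValueError("INTERNAL-END without INTERNAL-BEGIN")
            hidden = False
            continue
        if not hidden:
            public_lines.append(line)
    if hidden:
        raise ValueError("unterminated INTERNAL-BEGIN")
    return "\n".join(public_lines)


# The example is only text to be scrubbed; it is never executed.
example = """def fetch(url):
    # INTERNAL-BEGIN
    url = rewrite_for_internal_proxy(url)
    # INTERNAL-END
    return download(url)"""

print(scrub(example))
```

A sketch like this only covers the outbound direction. Merging external changes back while keeping the directives in place, which the speakers described as an important part of the tool, is the harder problem and is not attempted here.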

Carl Worth gave a well-attended session on the notmuch mail system. Notmuch has been reviewed here in the past; your editor was mostly interested in the current and future state of this search-oriented mail tool. Recent changes include the ability to search on mail folder names - useful for migrating from a folder-based mail client. There is also synchronization with maildir flags, which is helpful for people using both notmuch and a more traditional client. There are now a few supported output styles for search operations, which should make it easier to create a web-based notmuch front-end, among other things.

In the near future, notmuch users should expect the ability to search on arbitrary mail headers and some relief from the rather inflexible date format which must be used now. Further ahead, there will be more work toward synchronization with remote mail spools; the hard part here is moving tags back and forth. Options for a solution include the addition of a special header to the messages themselves (but that could be problematic if the header leaks in a forwarded message, revealing to all the tags one uses for mail from the special people in one's life), the use of custom maildir flags, or the addition of some sort of journal replay mechanism. There is also talk of storing mail in git packs and using the git protocol to move messages (and tags) around. Even further ahead might be a notmuch backend for mutt.

Meanwhile, the project has a number of interested users but, by Carl's admission, it could benefit from a more present maintainer.

Kirk McKusick is one of the creators of BSD Unix. His fast-paced session in an overflowing room covered much of the history of the Berkeley Software Distribution, the ups and downs of hacking with Bill Joy, the AT&T lawsuit, his refusal to work for just-starting Sun Microsystems (because Apollo had the workstation market completely sewed up), and much more. The talk should eventually appear with the rest of the conference videos; there is also apparently a DVD available on Kirk's web page for those who want more.

There were far more interesting talks than your editor could possibly attend, much less write up. The good news is that the conference organizers are making the videos available quickly; they can be found (in several formats) on this blip.tv page, but this wiki page has them in a much better-organized fashion.

In summary: LCA 2011 was another great success; it would have been judged favorably against its predecessors even in the absence of natural disasters. LCA 2012 will, perhaps surprisingly, be held in Ballarat, a small city outside Melbourne. The Ballarat organizers have a hard act to follow, but history suggests they will be up to the task.

Debugging conference anti-harassment policies

By Jonathan Corbet
January 31, 2011
Linux.conf.au 2011 distinguished itself in a number of ways, one of which was the uniformly interesting and thought-provoking nature of its keynote talks, two of which have already been covered on LWN. Mark Pesce's keynote was no exception, but this talk also stood out as the only one at the conference to trigger the newly-adopted anti-harassment policy, leading to apologies from the organizers and the speaker. This action was controversial on all fronts; perhaps the only clear conclusion is that we have not yet come to a real consensus on what harassment means or on the best way to prevent it.

The source: As of this writing, the talk is not available on the LCA 2011 videos page. Mark has posted the text of the talk and the slides [ODP] for the curious.
The talk itself was about freedom and privacy on the net. There was much discussion of the evils of sites like Facebook and a bit of talk about the Plexus project which is trying to create alternatives which are more free. To your editor, the most chilling point was that the net itself is not free; the crusade against Wikileaks and the Internet shutdown in Egypt were given as examples. The net, he said, functions at the whim of government; we need to build alternative transports - using smoke signals, if necessary - to ensure our right to communicate. We are at war for our freedom, he said, and we need to start approaching the problem that way.

The message clearly resonated with many people in the audience, but the presentation of that message was less than pleasing to many. The speaker aimed for a high level of drama, made heavy use of profanity, and put up some slides that struck some attendees as overtly sexual in nature. In your editor's opinion, the presentation style, which was clearly intended to shock and disturb, detracted from the message which was being delivered. It also ensured that much of the subsequent talk would be about the slides and the language, and not about what was really said. Your editor, who, at the outset, wondered if he could learn something from the speaker to spice up his own talks (which are notably less dramatic), concluded at the end that there was indeed something to learn, but the lessons were all negative.

A number of attendees complained, and the organizers, in response, apologized (to applause) at the closing session. Mark later posted an apology of his own. It seemed like a reasonable handling of the situation, and the discussion could have stopped there - but it didn't.

The lca-chat mailing list, which had mostly occupied itself with (1) making Brisbane's public transportation system seem much more complicated than it really is and (2) discussing the lack of toilet paper in one of the lodging choices, hosted several threads on whether the response to the talk was right. Interested parties are encouraged to read through the threads - which remained civil throughout - for the full discussion. But there are a few things which can be summarized:

  • Some attendees were upset by the talk and fully supported the apology.

  • Some participants, while supporting the posted anti-harassment policy, felt that the talk did not violate that policy.

  • Others went further, saying that the language and imagery used were effective and necessary for the talk to attain its objective of making attendees uncomfortable with the current state of affairs.

  • Others yet objected to the entire conversation, claiming that a discussion of whether the policy applied made them feel unsafe and asking people to stop.

Your editor disagrees with the last group and feels that the discussion is absolutely necessary. We are partway through a process - likely to take years - aimed at making our community and its gatherings more welcoming for all those we would like to have attend. LCA 2011 adopted a new style of policy on harassment which had not been used before, and Mark Pesce's talk was the first time it was invoked. The idea that we have everything right and that no further discussion is required is, frankly, laughable. Some debugging will certainly be necessary - once we are sure we have the core design right.

While evaluating the design and pondering debugging, there are a couple of viewpoints from LCA organizers that warrant reading in full. The first is from LCA 2011 organizer Russell Stuart, who opposed the policy from the outset - though, having lost that battle, he argued for apologizing when the policy was violated. He says:

One of the roles of LCA organisers is to bring popular, enlightening and if we get very lucky even inspiring talks. By two measure's Mark Pesce's talk was one of those. It received one of the longest, it not the longest acclamation of any talk at LCA 2011. And if the chatter on our lists is any guide, it caused more people to stop, think and act than any other talk. And yet we have a small minority of people who evidently take offence at images and words that would be perfectly acceptable on Australia broadcast TV, and are now suggesting the vast bulk of the LCA attendees who enjoyed the talk should not have been allowed to see it because they object to it. And they got very close to achieving just that.

Russell fears that the policy heads toward outright censorship and should not be used by other conferences until it has been "substantially reworked." He found agreement from Susanne Ruthven, one of the lead organizers of LCA 2010 and the author of that conference's anti-harassment policy. That policy was aimed at preventing broadly-described "harassment or discrimination" and, seemingly, would not have been invoked for this talk:

As organisers of LCA2010, Andrew and I have discussed this current situation and think some of Mark's slides could be inappropriate and considered bad taste, but they have certainly achieved their purpose of making us all sit up and think, and more importantly, to question. In our view, Mark's talk was not discriminatory or harassment. It obviously offended some people, but then he is entitled to shock, horrify and offend under his right to freedom of expression (as long as his actions aren't breaking any laws, like discrimination laws etc).

Clearly there is a balance to be found here; outright harassment is not a freedom of speech issue, but the desire to create a more welcoming environment in general will almost certainly require curtailing certain types of speech. Those who see speech freedom as fundamental will resist such moves. Those who have suffered assault, or who simply do not want to circulate in a highly sexualized environment, will push in the other direction. Conference organizers - and speakers - may find themselves caught in the middle.

The problems addressed by anti-harassment policies are real. Conference attendees have had to put up with some horrifying experiences which - hopefully! - do not reflect what our community is about. Practices like the employment of booth babes or the use of women as sexually-charged attention magnets on slides do not create an environment which is conducive to the acceptance of women as equal participants. We absolutely need to clean up our act. But doing so will be an iterative process which must also respect other, equally fundamental freedoms. It's a design and debugging problem, and we are far from the final release on this bit of code.

LCA: Lessons from 30 years of Sendmail

By Jonathan Corbet
February 2, 2011
The Sendmail mail transfer agent tends to be one of those programs that one either loves or hates. Both its supporters and its detractors will agree, though, that Sendmail played a crucial role in the development of electronic mail before, during, and after the explosion of the Internet. Sendmail creator Eric Allman took a trip to Brisbane to talk to the LCA 2011 audience about the history of this project. Sendmail is, he said, 30 years old now; in those three decades it has thrived without corporate support, changed the world, and prospered in a world which was changing rapidly around it.

The history

Sendmail had its start at the University of California, Berkeley, in 1980; it was initially something Eric did while he was supposed to be working on the Ingres relational database management system. In those days, the Computer Science department had a dozen machines, but the main system was "Ernie CoVAX," which was accessed via ASCII terminals. There was a limited number of ports, so users had to connect via a patch panel in the mail room; contention for available ports was often intense.

Things got more interesting when the Ingres project got an ARPAnet connection; a single PDP-11 machine, with two ports, was the only way to access the net at that time. There was no way the entire department was going to share those two ports without somebody getting hurt, so another solution was required. Eric looked at the problem, concluded that what everybody really wanted was the ability to send mail through the gateway machine, and decided that he would make a way to access email from other machines on campus. From this beginning delivermail was born.

There was a set of design principles that Eric adopted at that time. There was only one of him, so programming time was a truly finite resource. Redesigning user agents and mail stores was out of the question. Delivermail had to adapt to the world around it, not the other way around. The resulting program worked, but was not without its problems. The compiled-in configuration lacked flexibility, there was no address translation as messages moved between networks, and the parsing was simple and opaque. But it succeeded in moving mail around and giving the entire department access to the net.

Then the department got the BSD contract. Bill Joy needed a mail transfer agent to connect to the network, so he talked Eric into taking on the job. After all, how hard could it be? Among other things, the new MTA needed to support the SMTP mail protocol - which wasn't specified yet. Supporting SMTP also forced the addition of a mail queue, a job which turned out to be much harder than it looked. Eric hacked away, and Sendmail was shipped with 4.1BSD in 1982 with support for SMTP, header rewriting, queueing, and runtime configuration.

After that, Eric left Berkeley for a "lucrative" (heavy on the quotes) career in industry. Sendmail, meanwhile, was picked up by the Unix vendors. The Unix wars were in full force at that time; the inevitable result was a proliferation of different versions of Sendmail. The program became balkanized and incompatible across systems.

Eric returned to Berkeley in 1989 and started hacking on Sendmail again; the immediate need was support for the ".cs" subdomain at the university. That work snowballed into a major rewrite culminating in Sendmail 8; this version integrated a great deal of code from both the industry and the community. It added support for ESMTP, a number of new protocols, delivery status notifications, LDAP integration, eight-bit mail, and a new configuration package. Uptake increased after the Sendmail 8 release as a result of these features, but also as the result of the publication of the O'Reilly "bat" book. Documentation, it turns out, really matters.

Sendmail Inc. was created in 1998 with the fantasy that it would let Eric get back to coding. In reality, starting a company is more about marketing, sales, and money than about technology - a lesson many of us have learned. It was one of the first companies trying to mix open source and proprietary offerings; in those days, the prevailing wisdom was that a company needed proprietary lock-in to have any chance of success. Over time, though, functionality migrated to the free version; thus Sendmail gained support for encryption, authentication, milters (mail filters), virtual hosting, spam filtering, and more. And that's where things stand today.

Lessons learned

As one might expect, 30 years of experience have led to a number of lessons worth passing on. Eric shared a few of them.

One is that requirements change all the time. The original delivermail program had reliability as its primary focus - few things are more hazardous to one's academic career than losing a professor's grant proposal. Over time, the requirements shifted toward functionality and performance; Sendmail had to scale up in speed and features as the Internet took off. Then users were demanding protection from spam and malware; that shifted Sendmail development toward keeping mail out. We have, Eric noted, gone full circle toward unreliable mail service. After that came requirements around legal and regulatory compliance - that is where a great deal of Sendmail Inc.'s business lies. There is currently an increasing focus on controlling costs, mobility, and social network integration. Without the ability to adapt to meet these shifting requirements, Sendmail would not have thrived through all these years.

With regard to Sendmail's design decisions, Eric said that some turned out to be right, some were wrong, and some were right at the time but are wrong now. One criticism that has been made is that Sendmail is an overly general solution; it can route and rewrite messages in ways which are generally unneeded in these days of Internet monoculture. Eric defended that generality by saying that the world was in great flux when Sendmail was designed; there was no way to really know how things were going to turn out. And, he said, he would do it again: "the world is still ugly."

Rewriting rules for addresses are a part of that generality; even at the time, it seemed like overkill, but he couldn't come up with anything better. It was, he said, probably the right thing to do. That said, the decision to use tabs as active characters was the stupidest thing he has ever done. That's how makefiles did it, and it seemed cool at the time. As a whole, he said, the concept was right, but the syntax and flow control could have been a lot better. Even so, he's glad he did matching based on tokens; basing Sendmail configuration around regular expressions would have been far worse.

If he were doing the configuration system now, it would look a lot more like the Apache scheme.

The message munging feature was needed for the rewriting of headers; it facilitated interoperability between different networks. It is still used a lot, he said, though it's arguably not necessary. Sendmail could benefit from a pass-through mode which shorts out the message munging, but that leaves open the question of what should be done with non-compliant messages. Should they be fixed, rejected, or just dropped? There is, he said, no obvious answer.

The embedding of SMTP and queueing in the mail daemon was the right thing to do; he does not agree with the Postfix approach of proliferating lots of small daemons. The queue structure itself involves two files for every message: one with the envelope, and one with the body. That forces the system to scan large numbers of small files on a busy system, which is not always optimal. At the time it was the right way to go; now he would probably use some sort of database for the envelopes. The decision to use plain text for all internal files was right, though; it makes debugging much easier.
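To make the tradeoff concrete, here is a minimal sketch of a queue that stores each message as two plain-text files, one for the envelope and one for the body. The `env-` and `body-` file names, the `key: value` envelope layout, and both function names are assumptions invented for this sketch; they are not Sendmail's actual on-disk format.

```python
import os
import tempfile


def enqueue(queue_dir, msg_id, sender, recipients, body):
    """Write one message as two plain-text files: envelope and body."""
    body_path = os.path.join(queue_dir, f"body-{msg_id}")
    env_path = os.path.join(queue_dir, f"env-{msg_id}")
    # Write the body first, so an envelope never refers to a missing body.
    with open(body_path, "w", encoding="utf-8") as f:
        f.write(body)
    with open(env_path, "w", encoding="utf-8") as f:
        f.write(f"sender: {sender}\n")
        for rcpt in recipients:
            f.write(f"recipient: {rcpt}\n")
    return env_path, body_path


def scan_queue(queue_dir):
    """Yield (msg_id, sender, recipients) by opening every envelope file."""
    for name in sorted(os.listdir(queue_dir)):
        if not name.startswith("env-"):
            continue
        sender, recipients = None, []
        with open(os.path.join(queue_dir, name), encoding="utf-8") as f:
            for line in f:
                key, _, value = line.rstrip("\n").partition(": ")
                if key == "sender":
                    sender = value
                elif key == "recipient":
                    recipients.append(value)
        yield name[len("env-"):], sender, recipients


# Usage: queue one message in a throwaway directory, then scan the queue.
with tempfile.TemporaryDirectory() as queue_dir:
    enqueue(queue_dir, "0001", "eric@example.org", ["bill@example.org"], "Hello\n")
    for msg_id, sender, recipients in scan_queue(queue_dir):
        print(msg_id, sender, recipients)
```

The scan opens one small file per queued message, which is the cost Eric described on a busy system. The plain-text envelopes, on the other hand, can be inspected with nothing more than a pager, which is the debugging benefit he defended.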

With regard to the use of the m4 macro preprocessor for configuration, Eric admitted that the syntax is painful. But he needed a macro facility and didn't want to reinvent the wheel. The "damned dnl lines" for comments were a mistake, though, and completely unnecessary. In summary, some sort of tool was needed; m4 might not have been the best choice, but it's not clear what would have been.

With regard to extending or changing features: Sendmail has tended toward extending features and maintaining compatibility, and that has not always been the right thing to do. The hostname masquerading facility was one example; that feature was simply done wrong the first time around. Rather than fixing it, though, Eric papered over the problems with new features. It would have been better to inflict some short-term pain on users, perhaps aided by a migration tool, and be done with it. The unwillingness to replace mistaken features has a lot to do with why Sendmail is difficult to configure.

Sendmail goes out of its way to accept and fix bogus input; that was in compliance with the robustness principle ("be conservative in what you send but liberal in what you accept") that was widely accepted at the time. It increases interoperability, but at the cost of allowing broken software to persist indefinitely, leading to large costs down the road. Nonetheless, it was the right idea at the time for the simple reason that everything was broken then. But he should have tightened things up later on.

What would he have done differently? At the top of the list is trying to fix problems as soon as possible. These include tabs in the configuration file and the V7 mailbox format. He's really tired of seeing ">From" in messages; he said he could have fixed it and expressed his apologies for not having taken the opportunity. He would make more use of modern tools; Sendmail has its own build script, which is not something he would do today. He would use more privilege separation, though he would not go as far as Postfix. He would have made a proper string abstraction; strings are by far the weakest part of the C language.
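For readers who have not met the ">From" problem: traditional Unix mailbox files separate messages with a line beginning with "From ", so a body line that starts the same way has to be altered before it is stored. The sketch below assumes the classic convention of prefixing such lines with ">"; the function names are illustrative and do not come from Sendmail or any mail library.

```python
def escape_from_lines(body: str) -> str:
    """Prefix '>' to body lines that would look like a message separator."""
    out = []
    for line in body.split("\n"):
        if line.startswith("From "):
            line = ">" + line
        out.append(line)
    return "\n".join(out)


def unescape_from_lines(body: str) -> str:
    """Naive inverse: strip one '>' from lines that start with '>From '."""
    out = []
    for line in body.split("\n"):
        if line.startswith(">From "):
            line = line[1:]
        out.append(line)
    return "\n".join(out)


original = "From here on, things change.\n>From was already quoted by the author."
stored = escape_from_lines(original)
restored = unescape_from_lines(stored)

print(stored)
print(restored == original)
```

In this simple form the quoting cannot be undone reliably: once stored, a line the author really did begin with ">From " is indistinguishable from one that was escaped, so the round trip changes the message. That is one reason the stray ">" tends to be left visible to readers.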

There are also a number of things he would do the same, starting with the use of C as the implementation language. It is, he said, a dangerous language, but the programmer always knows what is going on. Object-oriented programming, he said, is a mistake; it hides too much. Beyond that, he would continue to do things in small chunks. The creation of syslog (initially as a way of getting debugging information out) was obviously the right thing to do; he was surprised that there was no centralized way of dealing with logging data on Unix systems. He would still implement rewriting rules, albeit with a different syntax. And he would continue not to rely too heavily on outside tools. There is a cost to adding dependencies on tools; sometimes it's better to just build what you need. There are, he said, projects using lex when all they really need is strtok().

There were a number of "takeaways" to summarize the talk:

  • The KISS (keep it simple, stupid) principle works.
  • If you don't know what you are doing, advance designs will not help.
  • The world is messy; just plan on it.
  • Flexibility trumps performance when the world changes every day.
  • Fix things early; your installed base will only get larger if you succeed, and the pain of not fixing things will only get worse.
  • Use plain text for internal files and protocols.
  • Good documentation is the key to broad acceptance; most projects, he said, have not yet figured this out.

The talk was evidently based on a chapter from an upcoming book on the architecture of open-source applications.

One member of the audience asked Eric which MTA he would recommend for new installations today. His possibly surprising answer was Postfix. He talked a lot with Postfix author Wietse Venema during its creation, and was impressed. Postfix is, he said, nice work, even if he doesn't agree with all of the design decisions that were made.

Page editor: Jonathan Corbet

Copyright © 2011, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds