By Jonathan Corbet
February 1, 2011
Our community has a number of volunteer-organized events; some of them are
rather more organized than others. Anybody who thinks that volunteers
cannot produce a professional-quality (or better) event, though, has never
been to linux.conf.au. LCA is not just run by volunteers; it is organized
by a completely different group of volunteers every year, but it still
comes off reliably, every time, as a top-quality conference. Each year's
organizers clearly deserve a lot of credit, but there is also a lot of
value in the LCA "ghosts" institution, whereby organizers from previous
years give advice and keep a watch for red flags as the planning and
preparations go forward. Without the ghosts, LCA would not be what it is.
Now imagine that you have been planning an event for over a year. Two
weeks before the conference, venues, equipment, accommodations,
transportation, social events, and more are all in place. Then the host
city is hit by catastrophic floods, the venues for both the conference and
the social events are taken out of commission, and the routers for the
wireless network are soaking at the wrong end of a flooded warehouse. Even
if a new venue can be found, it will no longer be within walking distance
of the accommodations, so transportation must be arranged on short notice.
That is the point where the ghosts run out of useful experience to share.
It is also the point where an insufficiently determined group would simply
give up.
The organizers of LCA 2011, held in Brisbane, would appear to be a
determined bunch indeed. They found a new venue, reprinted the conference
maps, found new locations for the social events, swam through the warehouse
to recover the routers, arranged new transportation for the attendees, and,
beyond any doubt, did a thousand things that nobody else saw. The end
result was a conference which, to anyone unaware of the floods, would
have seemed to have been planned that way all along. LCA 2011 didn't
just work - it worked just as well as its predecessors. One easily runs
out of superlatives when describing the job this group did; your editor
only hopes that, after they have slept for a solid week or so, they have
arranged a major party to celebrate what they accomplished.
There were a number of interesting sessions at this conference, many of
which have been covered in these pages. Here, your editor will summarize
some of the talks which, for various reasons (including a simple lack of time), were
not discussed in a separate article.
Andrew 'Tridge' Tridgell has developed a reputation for energetic
LCA talks focused on the simple joy of hacking; his LCA 2011 talk did not
disappoint. Tridge, it seems, has become something of a coffee snob, so he
has taken to roasting his own beans. That turns out to be an
attention-intensive process which takes too much time away from the hacking
that coffee is meant to support, so he built a Linux-powered coffee roaster
out of an old bread maker, a temperature sensor, a heat gun, and a
hand-made circuit for power regulation.
While demonstrating the device and hoping the fire alarms did not sound, he
went into the specifics of coffee roasting and the details of how one uses
LD_PRELOAD to reverse engineer a Windows temperature driver
running under VirtualBox on Linux. A good time was clearly had by all.
Bdale Garbee's session on the creation of a large, Linux-powered
milling machine had a similar feel. Both talks will be well worth watching
once the videos become available.
Daniel Bentley and Daniel Nadasi talked about the challenges that go
with opening up code at Google. Internal programs tend to be heavily used
and have a lot of internal contributors; these people often have a lot of
worries when they are approached about releasing their code to the world.
They have to be sold on the business case for opening the code, and they
have to be talked past worries that their code is too ugly to see the light
of day. There are also some real concerns that opening code might reveal
internal information and that working with the community might slow the
project down. Changing source control and build systems can also be a
challenge; apparently few people at Google still remember how to write a
classic makefile.
An important question is: where is the home for the code's further
development? If it's developed internally, the internal folks are happy
because things are working as they were before. Outsiders, who see a
series of code dumps, may be less impressed. If development happens
publicly then outside developers will be happier, but it can be harder for
internal developers. An added factor is that any project, no matter how
successfully it is opened, will be dominated by internal developers during
the first
part of its open existence; that tends to drive the internal development
model, but that, in turn, can slow (or prevent) the development of a
community around the code.
Daniel and Daniel's response to this problem is a tool called "make open
easy," or "moe." With moe, internal developers can mark sections of code
which should not be visible to the outside world; markings can take the
form of function annotations or preprocessor-like directives. The tool can
then extract the code from the internal repository, edit it according to
the directives, and load it into a public repository. Importantly, it can
also move code in the other direction, merging external changes while
retaining the scrubbing directives. Moe makes lives easier on both sides
of the wall, and is in active use with a number of projects; it can be
obtained from code.google.com.
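The scrubbing step can be pictured with a minimal sketch. The marker strings and the scrub() function below are hypothetical, invented purely for illustration; they are not moe's actual directive syntax or API. The sketch covers only the outbound direction, removing marked regions before code is published; the harder inbound merge described above is not attempted.

```python
# Hypothetical markers for illustration only; not moe's real directive syntax.
BEGIN = "# INTERNAL-BEGIN"
END = "# INTERNAL-END"


def scrub(source):
    """Return source text with every marked internal-only region removed."""
    kept = []
    hidden = False
    for line in source.splitlines():
        stripped = line.strip()
        if stripped == BEGIN:
            if hidden:
                raise ValueError("nested internal region")
            hidden = True
        elif stripped == END:
            if not hidden:
                raise ValueError("end marker without a begin marker")
            hidden = False
        elif not hidden:
            kept.append(line)
    if hidden:
        raise ValueError("unterminated internal region")
    return "\n".join(kept) + "\n"
```

Refusing to proceed on unbalanced markers is a deliberate choice in this sketch: a scrubber that guesses is a scrubber that may leak.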
Carl Worth gave a well-attended session on the notmuch mail system. Notmuch has been reviewed here in the past; your editor was
mostly interested in the current and future state of this search-oriented
mail tool. Recent changes include the ability to search on mail folder
names - useful for migrating from a folder-based mail client. There is
also synchronization with maildir flags, which is helpful for people using
both notmuch and a more traditional client. There are now a few supported
output styles for search operations, which should make it easier to create
a web-based notmuch front-end, among other things.
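For readers unfamiliar with maildir flags, here is a rough sketch of what such a synchronization has to compute. Maildir stores per-message state as single letters after ":2," in the file name. The letter-to-tag table and the function name below are assumptions made for illustration; this is not notmuch's code, and its real mapping may differ in detail.

```python
# Assumed mapping from maildir flag letters to tag names (illustrative only).
FLAG_TO_TAG = {"D": "draft", "F": "flagged", "P": "passed", "R": "replied"}


def tags_from_maildir_name(filename):
    """Derive a set of tags from the flag letters in a maildir file name."""
    _, sep, flags = filename.rpartition(":2,")
    if not sep:
        # No ":2," info section at all, so no flags are set.
        flags = ""
    tags = {FLAG_TO_TAG[f] for f in flags if f in FLAG_TO_TAG}
    # "S" means seen; in this sketch its absence becomes an "unread" tag.
    if "S" not in flags:
        tags.add("unread")
    return tags
```

With this mapping, a file named "1296.M1.host:2,FS" yields only the "flagged" tag, while a name with no flag section yields "unread".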
In the near future, notmuch users should expect the ability to search on
arbitrary mail headers and some relief from the rather inflexible date
format which must be used now. Further ahead, there will be more work
toward synchronization with remote mail spools; the hard part here is
moving tags back and forth. Options for a solution include the addition of
a special header to the messages themselves (but that could be problematic
if the header leaks in a forwarded message, revealing to all the tags one
uses for mail from the special people in one's life), the use of custom
maildir flags, or the addition of some sort of journal replay mechanism.
There is also talk of storing mail in git packs and using the git protocol
to move messages (and tags) around. Even further ahead might be a notmuch
backend for mutt.
Meanwhile, the project has a number of interested users but, by Carl's
admission, it could benefit from a more present maintainer.
Kirk McKusick is one of the creators of BSD Unix. His fast-paced
session in an overflowing room covered much of the history of the Berkeley
Software Distribution, the ups and downs of hacking with Bill Joy, the AT&T
lawsuit, his refusal to work for just-starting Sun Microsystems (because
Apollo had the workstation market completely sewed up), and much more. The
talk should eventually appear with the rest of the conference videos; there
is also apparently a DVD available on Kirk's web page for those who want
more.
There were far more interesting talks than your editor could possibly
attend, much less write up. The good news is that the conference
organizers are making the videos available quickly; they can be found (in
several formats) on this blip.tv page, but this wiki page has them in
a much better-organized fashion.
In summary: LCA 2011 was another great success; it would have been judged
favorably against its predecessors even in the absence of natural
disasters. LCA 2012 will,
perhaps surprisingly, be held in Ballarat, a
small city outside Melbourne. The Ballarat organizers have a hard act to
follow, but history suggests they will be up to the task.
By Jonathan Corbet
January 31, 2011
Linux.conf.au 2011 distinguished itself in a number of ways, one of which
was the uniformly interesting and thought-provoking nature of its keynote
talks, two of which have already been covered on LWN. Mark Pesce's keynote
was no exception, but this talk also stood out
as the only one at the conference to trigger the newly-adopted
anti-harassment policy, leading to apologies from the organizers and the
speaker. This action was controversial on all fronts; perhaps the only
clear conclusion is that we have not yet come to a real consensus on what
harassment means or on the best way to prevent it.
The talk itself was about freedom and privacy on the net. There was much
discussion of the evils of sites like Facebook and a bit of talk about the
Plexus project which is
trying to create alternatives which are more free. To your editor, the
most chilling point was that the net itself is not free; the crusade
against Wikileaks and the Internet shutdown in Egypt were given as
examples. The net, he said, functions at the whim of government; we need
to build alternative transports - using smoke signals, if necessary - to
ensure our right to communicate. We are at war for our freedom, he said,
and we need to start approaching the problem that way.
The message clearly resonated with many people in the audience, but the
presentation of that message was less than pleasing to many. The speaker
aimed for a high level of drama, made heavy use of profanity, and put up
some slides that struck some attendees as overtly sexual in nature.
In your editor's opinion, the presentation style, which was clearly intended
to shock and disturb, detracted from the message which was being
delivered. It also ensured that much of the subsequent talk would be about
the slides and the language, and not about what was really said. Your
editor, who, at the outset, wondered if he could learn something from the
speaker to spice up his own talks (which are notably less dramatic),
concluded at the end that there was indeed something to learn, but the
lessons were all negative.
A number of attendees complained, and the organizers, in response,
apologized (to applause) at the closing session. Mark later posted an
apology of his own. It seemed like a reasonable handling of the situation,
and the discussion could have stopped there - but it didn't.
The lca-chat
mailing list, which had mostly occupied itself with (1) making Brisbane's
public transportation system seem much more complicated than it really is
and (2) discussing the lack of toilet paper in one of the lodging
choices, hosted several
threads on whether the response to the talk was right. Interested parties
are encouraged to read through the threads - which remained civil
throughout - for the full discussion. But there are a few things which can
be summarized:
- Some attendees were upset by the talk and fully supported the
apology.
- Some participants, while supporting the posted anti-harassment
policy, felt that the talk did not violate that policy.
- Others went further, saying that the language and imagery used were
effective and necessary for the talk to attain its objective of making
attendees uncomfortable with the current state of affairs.
- Others yet objected to the entire conversation, claiming that a
discussion of whether the policy applied made them feel unsafe and
asking people to stop.
Your editor disagrees with the last group and feels that the discussion is
absolutely necessary. We are partway through a process - likely to take
years - aimed at making our community and its gatherings more welcoming for
all those we would like to have attend. LCA 2011 adopted a new style of
policy on harassment which had not been used before, and Mark Pesce's
talk was the first time it was invoked. The idea that we have everything
right and that no further discussion is required is, frankly, laughable. Some
debugging will certainly be necessary - once we are sure we have the core
design right.
While evaluating the design and pondering debugging, there are a couple of
viewpoints from LCA
organizers that warrant reading in full. The
first is from LCA 2011 organizer Russell Stuart, who opposed the policy
from the outset - though, having lost that battle, he argued for
apologizing when the policy was violated. He says:
One of the roles of LCA organisers is to bring popular,
enlightening and if we get very lucky even inspiring talks. By two
measure's Mark Pesce's talk was one of those. It received one of
the longest, it not the longest acclamation of any talk at LCA
2011. And if the chatter on our lists is any guide, it caused more
people to stop, think and act than any other talk. And yet we have
a small minority of people who evidently take offence at images and
words that would be perfectly acceptable on Australia broadcast TV,
and are now suggesting the vast bulk of the LCA attendees who
enjoyed the talk should not have been allowed to see it because
they object to it. And they got very close to achieving just that.
Russell fears that the policy heads toward outright censorship and should
not be used by other conferences until it has been "substantially
reworked." He found agreement from Susanne
Ruthven, one of the lead organizers of LCA 2010 and the author of that
conference's anti-harassment policy. That policy was aimed at
preventing broadly-described "harassment or discrimination" and,
seemingly, would not have been invoked for this talk:
As organisers of LCA2010, Andrew and I have discussed this current
situation and think some of Mark's slides could be inappropriate
and considered bad taste, but they have certainly achieved their
purpose of making us all sit up and think, and more importantly, to
question. In our view, Mark's talk was not discriminatory or
harassment. It obviously offended some people, but then he is
entitled to shock, horrify and offend under his right to freedom of
expression (as long as his actions aren't breaking any laws, like
discrimination laws etc).
Clearly there is a balance to be found here; outright harassment is not a
freedom of speech issue, but the desire to create a more welcoming
environment in general will almost certainly require curtailing certain
types of speech. Those who see speech freedom as fundamental will resist
such moves. Those who have suffered assault, or who simply do not want to
circulate in a highly sexualized environment, will push in the other
direction. Conference organizers - and speakers - may find themselves
caught in the middle.
The problems addressed by anti-harassment policies are real. Conference
attendees have had to put up with some
horrifying experiences which - hopefully! - do not reflect what our
community is about. Practices like the employment of booth babes or the
use of women as sexually-charged attention magnets on slides do not create
an environment which is conducive to the acceptance of women as equal
participants. We absolutely need to clean up our act. But doing so will
be an iterative process which must also respect other, equally fundamental
freedoms. It's a design and debugging problem, and we are far from the
final release on this bit of code.
By Jonathan Corbet
February 2, 2011
The Sendmail mail transfer agent tends to be one of those programs that one
either loves or hates. Both its supporters and its detractors will agree,
though, that Sendmail played a crucial role in the development of
electronic mail before, during, and after the explosion of the Internet.
Sendmail creator Eric Allman took a trip to Brisbane to talk to the LCA
2011 audience about the history of this project. Sendmail is, he said, 30 years old
now; in those three decades it has thrived without corporate support,
changed the world, and thrived in a world which was changing rapidly around
it.
The history
Sendmail had its start at the University of California, Berkeley, in 1980;
it was initially something Eric did while he was supposed to be working on
the Ingres relational database management system. In those days, the
Computer Science department had a dozen machines, but the main system was
"Ernie CoVAX," which was accessed via ASCII terminals. There was a limited
number of ports, so users had to connect via a patch panel in the mail
room; contention for available ports was often intense.
Things got more interesting when the Ingres project got an ARPAnet
connection; a single PDP-11 machine, with two ports, was the only way to
access the net at that time. There was no way the entire department was
going to share those two ports without somebody getting hurt, so another
solution was required. Eric looked at the problem, concluded that what
everybody really wanted was the ability to send mail through the gateway
machine, and decided that he would make a way to access email from other
machines on campus. From this beginning delivermail was born.
There was a set of design principles that Eric adopted at that time. There
was only one of him, so programming time was a truly finite resource.
Redesigning user agents and mail stores was out of the question.
Delivermail had to adapt to the world around it, not the other way around.
The resulting program worked, but was not without its problems. The
compiled-in configuration lacked flexibility, there was no address
translation as messages moved between networks, and the parsing was simple
and opaque. But it succeeded in moving mail around and giving the entire
department access to the net.
Then the department got the BSD contract. Bill Joy needed a mail transfer
agent to connect to the network, so he talked Eric into taking on the job.
After all, how hard could it be? Among other things, the new MTA needed to
support the SMTP mail protocol - which wasn't specified yet. Supporting
SMTP also forced the addition of a mail queue, a job which turned out to be
much harder than it looked. Eric hacked away, and Sendmail was shipped
with 4.1BSD in 1982 with support for SMTP, header rewriting, queueing, and
runtime configuration.
After that, Eric left Berkeley for a "lucrative" (heavy on the quotes)
career in industry. Sendmail, meanwhile, was picked up by the Unix
vendors. The Unix wars were in full force at that time; the
inevitable result was a proliferation of different versions of Sendmail.
The program became balkanized and incompatible across systems.
Eric returned to Berkeley in 1989 and started hacking on Sendmail again;
the immediate need was support for the ".cs" subdomain at the university.
That work snowballed into a major rewrite culminating in Sendmail 8;
this version integrated a great deal of code from both the industry and the
community. It added support for ESMTP, a number of new protocols, delivery
status notifications, LDAP integration, eight-bit mail, and a new
configuration package. Uptake increased after the Sendmail 8 release
as a result of these features, but also as the result of the publication of
the O'Reilly "bat" book. Documentation, it turns out, really matters.
Sendmail Inc. was created in 1998 with the fantasy that it would let Eric
get back to coding. In reality, starting a company is more about
marketing, sales, and money than about technology - a lesson many of us
have learned. It was one of the first companies trying to mix open source
and proprietary offerings; in those days, the prevailing wisdom was that a
company needed proprietary lock-in to have any chance of success. Over
time, though, functionality migrated to the free version; thus Sendmail
gained support for encryption, authentication, milters (mail filters),
virtual hosting, spam filtering, and more. And that's where things stand
today.
Lessons learned
As one might expect, 30 years of experience have led to a number of lessons
worth passing on. Eric shared a few of them.
One is that requirements change all the time. The original delivermail
program had reliability as its primary focus - few things are more
hazardous to one's academic career than losing a professor's grant
proposal. Over time, the requirements shifted toward functionality and
performance; Sendmail had to scale up in speed and features as the Internet
took off. Then users were demanding protection from spam and malware; that
shifted Sendmail development toward keeping mail out. We have, Eric noted,
gone full circle toward unreliable mail service. After that came
requirements around legal and regulatory compliance - that is where a great
deal of Sendmail Inc.'s business lies. There is currently an increasing
focus on controlling costs, mobility, and social network integration.
Without the ability to adapt to meet these shifting requirements, Sendmail
would not have thrived through all these years.
With regard to Sendmail's design decisions, Eric said that some turned out
to be right, some were wrong, and some were right at the time but are wrong
now. One criticism that has been made is that Sendmail is an overly
general solution; it can route and rewrite messages in ways which are
generally unneeded in these days of Internet monoculture. Eric defended
that generality by saying that the world was in great flux when Sendmail
was designed; there was no way to really know how things were going to turn
out. And, he said, he would do it again: "the world is still ugly."
Rewriting rules for addresses are a part of that generality; even at the
time, it seemed like overkill, but he couldn't come up with anything
better. It was, he said, probably the right thing to do. That said, the
decision to use tabs as active characters was the stupidest thing he has
ever done. That's how makefiles did it, and it seemed cool at the time.
As a whole, he said, the concept was right, but the syntax and flow control
could have been a lot better. Even so, he's glad he did matching based on
tokens; basing Sendmail configuration around regular expressions would have
been far worse.
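To make the token-versus-regular-expression distinction concrete, here is a small sketch of token-based matching. It borrows three wildcard spellings from Sendmail's rule syntax ("$*" for zero or more tokens, "$+" for one or more, "$-" for exactly one), but the separator set, the function names, and the shortest-match-first search order are assumptions of this sketch rather than a description of Sendmail's implementation.

```python
import re


def tokenize(address):
    """Split an address into tokens, keeping separator characters as tokens."""
    # The separator set here is an assumption made for this sketch.
    return [tok for tok in re.split(r"([@.!%:<>])", address) if tok]


def match(pattern, tokens):
    """Match a list of pattern elements against a list of tokens.

    Returns one captured token list per wildcard, or None if there is
    no match. Literal pattern elements are compared case-insensitively.
    """
    if not pattern:
        return [] if not tokens else None
    head, rest = pattern[0], pattern[1:]
    if head in ("$*", "$+", "$-"):
        low = 0 if head == "$*" else 1
        high = 1 if head == "$-" else len(tokens)
        # Try the shortest span first, then back up to longer ones.
        for n in range(low, high + 1):
            if n > len(tokens):
                break
            tail = match(rest, tokens[n:])
            if tail is not None:
                return [tokens[:n]] + tail
        return None
    if tokens and tokens[0].lower() == head.lower():
        return match(rest, tokens[1:])
    return None
```

Matching the pattern ["$+", "@", "$+"] against the tokens of "eric@cs.berkeley.edu" captures ["eric"] and the five tokens of the domain, while the same pattern returns None for the UUCP-style "ucbvax!eric". The pattern is a list of whole tokens, so there is no quoting or escaping to get wrong, which is arguably the appeal over regular expressions.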
If he were doing the configuration system now, it would look a lot more
like the Apache scheme.
The message munging feature was needed for the rewriting of headers; it
facilitated interoperability between different networks. It is still used
a lot, he said, though it's arguably not necessary. Sendmail could benefit
from a pass-through mode which shorts out the message munging, but that
leaves open the question of what should be done with non-compliant
messages. Should they be fixed, rejected, or just dropped? There is, he
said, no obvious answer.
The embedding of SMTP and queueing in the mail daemon was the right thing
to do; he does not agree with the Postfix approach of proliferating lots of
small daemons. The queue structure itself involves two files for every
message: one with the envelope, and one with the body. That forces the
system to scan large numbers of small files on a busy system, which is not
always optimal. At the time it was the right way to go; now he would
probably use some sort of database for the envelopes. The decision to use
plain text for all internal files was right, though; it makes debugging
much easier.
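The two-files-per-message layout, and the cost of scanning it, can be sketched in a few lines. The file-name prefixes, the one-letter line codes, and both function names below are illustrative assumptions; this is not Sendmail's actual on-disk format.

```python
import os
from pathlib import Path


def enqueue(queue_dir, queue_id, sender, recipients, body):
    """Write one message as two plain-text files: a body and an envelope."""
    queue = Path(queue_dir)
    body_path = queue / f"df{queue_id}"
    env_path = queue / f"qf{queue_id}"
    body_path.write_text(body)
    lines = [f"S{sender}"] + [f"R{rcpt}" for rcpt in recipients]
    # Write the envelope under a temporary name and rename it into place,
    # so a queue scan never sees a half-written envelope.
    tmp_path = queue / f"tf{queue_id}"
    tmp_path.write_text("\n".join(lines) + "\n")
    os.replace(tmp_path, env_path)
    return env_path, body_path


def scan(queue_dir):
    """Yield (queue_id, sender, recipients) for every queued envelope."""
    for env_path in sorted(Path(queue_dir).glob("qf*")):
        sender, recipients = None, []
        for line in env_path.read_text().splitlines():
            if line.startswith("S"):
                sender = line[1:]
            elif line.startswith("R"):
                recipients.append(line[1:])
        yield env_path.name[2:], sender, recipients
```

The scan opens and reads one small file per queued message, which is the cost a database of envelopes would avoid. In exchange, any queue entry can be inspected with nothing more than a pager.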
With regard to the use of the m4 macro preprocessor for configuration, Eric
admitted that the syntax is painful. But he needed a macro facility and
didn't want to reinvent the wheel. The "damned dnl lines" for comments
were a mistake, though, and completely unnecessary. In summary, some sort
of tool was needed; m4 might not have been the best choice, but it's not
clear what would have been.
With regard to extending or changing features: Sendmail has tended toward
extending features and maintaining compatibility, and that has not always
been the right thing to do. The hostname masquerading facility was one
example; that feature was simply done wrong the first time around. Rather
than fixing it, though, Eric papered over the problems with new
features. It would have been better to inflict some short-term pain on
users, perhaps aided by a migration tool, and be done with it. The
unwillingness to replace
mistaken features has a lot to do with why Sendmail is difficult to
configure.
Sendmail goes out of its way to accept and fix bogus input; that was in
compliance with the robustness principle ("be conservative in what you send
but liberal in what you accept") that was widely accepted at the time. It
increases interoperability, but at the cost of allowing broken software to
persist indefinitely, leading to large costs down the road. Nonetheless,
it was the right idea at the time for the simple reason that
everything was broken then. But he should have tightened things up
later on.
What would he have done differently? At the top of the list is trying to
fix problems as soon as possible. These include tabs in the configuration
file and the V7 mailbox format. He's really tired of seeing
">From" in messages; he said he could have fixed it and
expressed his apologies for not having taken the opportunity. He would
make more use of modern tools; Sendmail has its own build script, which is
not something he would do today. He would use more privilege separation,
though he would not go as far as Postfix. He would have made a proper
string abstraction; strings are by far the weakest part of the C language.
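The ">From" problem follows directly from the mailbox format: messages in a single file are separated by lines beginning with "From ", so any body line that starts the same way must be altered on delivery. A minimal sketch of that escaping (the function name is made up for this example) looks like:

```python
def escape_body(body):
    """Prefix '>' to body lines that would look like a message separator."""
    out = []
    for line in body.splitlines(keepends=True):
        # Only an exact, case-sensitive "From " at the start of a line counts.
        if line.startswith("From "):
            line = ">" + line
        out.append(line)
    return "".join(out)
```

The transformation cannot be undone reliably, since a reader cannot tell an escaped line from one that contained ">From " in the first place. That is why the mangled text remains visible in delivered mail.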
There are also a number of things he would do the same, starting with the
use of C as the implementation language. It is, he said, a dangerous
language, but the programmer always knows what is going on.
Object-oriented programming, he said, is a mistake; it hides too much.
Beyond that, he would continue to do things in small chunks. The
creation of syslog (initially as a way of getting debugging
information out) was obviously the right thing to do; he was surprised that
there was no centralized way of dealing with logging data on Unix systems.
He would still
implement rewriting rules, albeit with a different syntax. And he would
continue not to rely too heavily on outside tools. There is a cost to
adding dependencies on tools; sometimes it's better to just build what you
need. There are, he said, projects using lex when all they really
need is strtok().
There were a number of "takeaways" to summarize the talk:
- The KISS (keep it simple, stupid) principle works.
- If you don't know what you are doing, advance designs will
not help.
- The world is messy; just plan on it.
- Flexibility trumps performance when the world changes every day.
- Fix things early; your installed base will only get larger if
you succeed, and the pain of not fixing things will only get worse.
- Use plain text for internal files and protocols.
- Good documentation is the key to broad acceptance; most projects, he
said, have not yet figured this out.
The talk was evidently based on a chapter from an upcoming book on the
architecture of open-source applications.
One member of the audience asked Eric which MTA he would recommend for new
installations today. His possibly surprising answer was Postfix. He talked
a lot with Postfix author Wietse Venema during its creation, and was
impressed. Postfix is, he said, nice work, even if he doesn't agree with
all of the design decisions that were made.
Page editor: Jonathan Corbet