An open letter to Evgeniy Polyakov

By Jonathan Corbet
November 25, 2008
[Editor's note: the following article may look like a message to a specific kernel developer, but it is really about the development process in general. Over the years, your editor has seen too many worthy hackers run into development process problems; the end result is often that we lose that person's contributions. We are not so rich that we can afford that sort of loss. The desire to prevent such problems was the motivation behind your editor's recently-written development process document - and this letter.]

Dear Evgeniy,

Your editor has chosen to write to you in a public manner because he hates to see talented developers get frustrated with the kernel process and storm off. We do not have an excess of capable hackers, especially those who can work at your level. Losing one hurts. Your editor hopes that this eventuality can be avoided in this case - for you, and for others who may be encountering the same sort of frustrations you are. Getting code into the kernel can be a pain, sometimes. That said, some 1160 developers have managed it since the opening of the 2.6.28 merge window in October. It is possible to get code merged with sufficient care.

You first posted your distributed storage (DST) patch back in 2007; LWN took a look at it at that time. Since then, this code has come a long way. Beyond the basic task of exporting (and accessing) storage volumes across the net, this code claims "bullet-proof memory allocations," zero-copy transport, failover recovery with full transaction support, support for IPv6 and beyond, and a number of features including encrypted data channels. And, it is said, this code is fast. In general, it looks like good stuff.

You have posted the DST code on the mailing lists a number of times - too many, apparently, for your tastes. Frustration with the process appears to have led to the behavior described in your recent weblog post:

To understand the roots of this issue, I made a simple experiment with the previous DST release. I added following lines into the patch to catch reviewer's eyes:

    ass licker
    static char dst_name[] = "Successful erackliss screwing into";

As you may expect, this does not compile and thus was never read by the people who are subscribed to the appropriate mail lists. I got one private mail about this fact for the whole week. The same DST code (without above lines) was sent public first time more than month ago and was resent 3 times after that.

That's why I do not care about DST inclusion anymore. I do not care about its linux-kernel@ feedback.

So, because the fourth posting of identical code in one month received little attention, DST now risks joining Kevents, network channels, network tree memory management, asynchronous crypto, and more in that place where dusty, out-of-tree stuff lives. This would not be a good outcome. So let us look at what can be done to avoid that - for your sake, for DST users' sake, and for the sake of other developers who may follow.

One way to get more reviews for your code is to pay attention to what those reviewers are saying. Andrew Morton spent some time on DST back in October. He had a number of concrete requests - such as documenting the user-space ABI and the network protocol - which have not been satisfied. He also asked for better code documentation in general:

So please. Go through all the code and make it tell a story. Ask yourself "how would I explain all this to a kernel developer who is sitting next to me". It's important, and it's an important skill.

The November 25, 2008 version of DST still does not tell that story, and that makes it very hard for other developers to understand. Code review, as you know, is in critically short supply in most free software projects. Getting reviews for difficult-to-understand code is hard, especially when it is a large body of complex code which occupies a niche in which relatively few developers have expertise. So it's not surprising that your most recent comment involved white space - anybody can make that kind of review without any need to actually understand what's going on.

Not only does your patch not tell a story, but the individual pieces of it do not even contain changelogs. For a patch set marked "consider for inclusion," that is a fatal error. Playing along with the system on things like that can seem like a waste of time, especially if you hold out no real hope of the patch being merged, but it is a necessary sign of respect for the people you are asking to consider the patch. No maintainer will accept a patch without a changelog.

While we're on the topic of documentation, your kernel configuration help text reads, in its entirety:

This driver allows to create a distributed storage block device.

You owe your users a little bit more than that. Why might they want to use DST? Where can they get the associated tools? This, too, is a fatal error for any substantive kernel change.

And, while we're still somewhat on the subject of reviews: Andrew naturally called out the generic-looking thread pool implementation buried deep within DST; shouldn't it be pulled out and made more generic? Your response can be paraphrased as "I can't be bothered to get the API past the review process, which, in any case, is biased toward those who are 'closer to the high end'." But pulling out this code and merging it separately might be the ideal starting point for getting the larger patch set into the kernel. A generic thread pool hiding within a storage device driver, instead, will be an ongoing impediment to inclusion.

Then there is the issue of motivation: why should the kernel developers want to merge this patch? Who are the users of it - do you have users now? How does it compare to other distributed storage technologies already in the kernel? What's the performance like - can you post some benchmark results? As it stands, DST looks like a nice piece of technology, but its benefits are still unclear. Tell that story, and the level of interest may well go up.

Finally, your editor would like to counsel patience. Some patches just take longer than others to find their way into the kernel. That is especially true of complex patches which touch on issues like memory management and which add new user-space ABIs. As a close-to-home example, look at David Howells's FS-Cache code, recently reposted for consideration. The first LWN article on this code was published more than four years ago. David is probably getting a little tired of maintaining this code out-of-tree, but he sticks with it, responds to reviews, and appears to be getting closer to inclusion.

Evgeniy, you appear to be a brilliant and productive hacker. You charge into places that scare off most kernel developers, and you always come back out with something interesting. We need developers like you. But we need developers like you who can work with the process - no matter how frustrating it gets. The kernel process is certainly far from perfect, but it is built around a set of principles which have served us well for many years. You could easily rise up through that process to become one of the "high end" developers who, you say, have an easier time getting code merged. Or you could take your marbles and storm home, making snide comments about reviewers on the way. But that would not be good for anybody involved.

(See also: Evgeniy's response to this article.)


An open letter to Evgeniy Polyakov

Posted Nov 26, 2008 20:20 UTC (Wed) by tetromino (subscriber, #33846) [Link]

Evgeniy Polyakov responds:
http://www.ioremap.net/node/74

An open letter to Evgeniy Polyakov

Posted Dec 1, 2008 3:21 UTC (Mon) by giraffedata (subscriber, #1954) [Link]

The most important point to take from Evgeniy's response is a contradiction of something the article seems to imply: that Evgeniy got fed up with the kernel development process, performed an experiment to prove the LKML community isn't good enough for him, and decided to quit.

In fact, it appears that Evgeniy wasn't trying to get his code into the standard kernel and isn't mad at anybody. He was trying to get review of his code, did an experiment that showed he wasn't succeeding in getting it, so stopped trying to get it that way.

An open letter to Evgeniy Polyakov

Posted Nov 26, 2008 21:37 UTC (Wed) by leonov (subscriber, #6295) [Link]

Please stick with it Evgeniy! Your work sounds really interesting, and I'm sure there are many of us who would love to see it hit mainline sometime soon... :-)

Changelogs?

Posted Nov 27, 2008 0:22 UTC (Thu) by ikm (subscriber, #493) [Link]

What made me wonder is the 'changelogs' clause by our editor. 'No maintainer will accept a patch without a changelog.' Why is that? What are the reasons that make changelogs that important?

Changelogs?

Posted Nov 27, 2008 0:52 UTC (Thu) by JoeBuck (subscriber, #2330) [Link]

Many may not know that the GPL requires something like a change log if you modify and distribute a GPL program. From the GPL (v2):
"You must cause the modified files to carry prominent notices stating that you changed the files and the date of any change."

The simplest way to meet this requirement is with a standard changelog entry. But even if this often-ignored requirement weren't there, it's good practice to document just what changed.

Changelogs?

Posted Nov 27, 2008 1:04 UTC (Thu) by ikm (subscriber, #493) [Link]

Sure you need to stick in some changelog entries if you're modifying the others' code, but I thought it was a requirement to maintain your own changelogs for every file, including the ones the patch just adds (since DST is mostly a chunk of brand new code, right?) If the original statement was merely about the fact that you need to acknowledge changing the work of others with a brief description of what you've changed there, then that's surely understandable and is of course required.

Changelogs?

Posted Nov 27, 2008 13:30 UTC (Thu) by sbergman27 (subscriber, #10767) [Link]

So... if that requirement is not met, all rights to use and distribute the GPL'd work are automatically revoked under GPLv2, and can be revoked at the option of the copyright holder under GPLv3? And just slapping a changelog on after the fact does not restore the rights?

Changelogs?

Posted Nov 27, 2008 7:53 UTC (Thu) by gouyou (subscriber, #30290) [Link]

Simple: if you've already reviewed the code and you get a new version, you do not want to start again from scratch. You want to know at a glance whether your earlier issues were fixed and what to review in the latest revision.

Re: An open letter to Evgeniy Polyakov

Posted Nov 27, 2008 4:01 UTC (Thu) by lipak (guest, #43911) [Link]

One of the things that encourages people to write code but not worry
about documentation is statements like:

Talk is cheap - show me the code

As a pithy slogan it may be good, but it _could_ give the impression
that writing code alone is enough. This is especially tempting
when writing code comes easier to you than writing documentation.
On top of this uber-hackers tend to dismiss code cleanup,
documentation etc., as low-hanging fruit.

The point that Harald Welte made in his FOSS.in talk this year is
worth recalling here. (It seems to me to also be an undercurrent in
the article being commented on.)

The best FOSS code is written to be read by other humans

Of course it _should_ also run efficiently on a real computer
and so on, but FOSS code that is not a good read is not reviewable
or maintainable.

To make this easier for those for whom English is not a comfortable
language, we should ask the following question:

Will reviewers be willing to accept documentation in languages other
than English?

Kapil.

Re: An open letter to Evgeniy Polyakov

Posted Nov 27, 2008 7:09 UTC (Thu) by i3839 (guest, #31386) [Link]

The question isn't whether reviewers are willing to accept non-English documentation; the question is whether people will review something when they don't know what it is or what it can be used for. You need to spark some interest before you get some attention.

The next step is to make it as easy as possible for people to try the stuff out. High-quality documentation that explains briefly how it works and how to get it working is crucial here. The shorter the better, as that generally means it's easier to set up.

In the case of distributed storage it should be made clear what problem is solved and what the advantages are compared to similar solutions, like distributed filesystems.

At this point you'll get people interested enough to help you with the code, either writing parts of it or reviewing it. Or at least a bunch of users.

Re: An open letter to Evgeniy Polyakov

Posted Nov 27, 2008 9:24 UTC (Thu) by rvfh (subscriber, #31018) [Link]

> The best FOSS code is written to be read by other humans

Correct code (including comments) is written to be read by others, because at some point someone will want to add a feature or fix a bug or make it more efficient.

Also because, in two months/weeks/years time (depends on people), the writer will have gone to some other activities and won't be able to understand that ugly clever trick they made to fix that strange case they don't quite remember all the details of.

Also because compilers will have trouble reading it too if it's badly written, and may optimize it in unexpected ways...

Re: An open letter to Evgeniy Polyakov

Posted Nov 27, 2008 11:05 UTC (Thu) by rsidd (subscriber, #2582) [Link]

The best FOSS code is written to be read by other humans

Or, as Abelson and Sussman put it (in the preface to SICP): "Programs must be written for people to read, and only incidentally for machines to execute."

Re: An open letter to Evgeniy Polyakov

Posted Nov 27, 2008 11:13 UTC (Thu) by hmh (subscriber, #3838) [Link]

"Will reviewers be willing to accept documentation in languages other
than English?"

Probably, but what would be the point? The code won't be accepted in that state on mainline anyway... so you're more than likely to just get a "translate to english first, please", or to be completely ignored.

Not that I like the "completely ignored" thing, but...

An open letter to Evgeniy Polyakov

Posted Nov 27, 2008 12:41 UTC (Thu) by alfille (subscriber, #1631) [Link]

I've worked extensively with Evgeniy Polyakov (on an unrelated module -- the w1 system), and found him responsive, accommodating, and very eager to work with developers.

Documentation -- pretty sparse. But he will certainly answer and explain. In the corporate world, you'd assign a documentation specialist to work with him and have a very successful combination. Perhaps we need a role in kernel development for documenters.

An open letter to Evgeniy Polyakov

Posted Nov 27, 2008 16:05 UTC (Thu) by ebirdie (subscriber, #512) [Link]

"Perhaps we need a role in kernel development for documenters."

A person to be taken as example:

Michael Kerrisk
http://lwn.net/Articles/247788/
A search with "michael kerrisk" gives 140 hits in Kernel content on lwn.net.

An open letter to Evgeniy Polyakov

Posted Nov 29, 2008 2:07 UTC (Sat) by docwhat (subscriber, #40373) [Link]

Or perhaps developer facilitators. People who can essentially collect the issues from LKML and other devs, turn them into a bullet list, and help the submitting developer meet that list.

Sort of a Dev. Personal Assistant...

Ciao!

An open letter to Evgeniy Polyakov

Posted Dec 1, 2008 3:13 UTC (Mon) by giraffedata (subscriber, #1954) [Link]

Well, the problem is quite simple: there aren't many people who enjoy doing that. In the proprietary development world, you have to pay people to do that. Same with testing, reviewing, planning, and release control.

So the challenge isn't to get people to do the non-coding work; it's to find a way to use the fun coding work that is in great supply without also having the non-coding work.

An open letter to Evgeniy Polyakov

Posted Dec 1, 2008 15:08 UTC (Mon) by cde (guest, #46554) [Link]

I don't mean to be trolling, but who needs distributed storage when we have cheap 3.5" 1TB drives now? Just sayin'

An open letter to Evgeniy Polyakov

Posted Dec 1, 2008 19:10 UTC (Mon) by vmole (guest, #111) [Link]

Consider the problem of installing and maintaining and backing up cheap 3.5" drives on 1,000 corporate desktops vs. doing the same in a bank of servers in the datacenter.

An open letter to Evgeniy Polyakov

Posted Dec 1, 2008 19:43 UTC (Mon) by martinfick (subscriber, #4455) [Link]

The larger the disk, the more I want to ensure that it is replicated on multiple machines and multiple sites. Don't you?

"why"

Posted Dec 4, 2008 9:32 UTC (Thu) by gvy (guest, #11981) [Link]

Now try to get, say, 1 Gb/s transfer so that it takes a somewhat reasonable time to get that terabyte off the platters. Then think of parallel access.

All of this makes spindles, and there are practical limits (including reliability and commodity ones) to how many hard drives can be reasonably stuffed into a box.

This makes boxes... and one suddenly feels the need for a distributed filesystem or block device.

Copyright © 2008, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds