
Improving communities through documentation

By Jonathan Corbet
July 19, 2019

OSS Japan
Documentation, said Riona MacNamara at the beginning of her Open Source Summit Japan 2019 talk, is the superpower that we can use to energize users and developers; it is an important part of the creation of a vibrant and inclusive community. While there are a number of roadblocks that can impede participation in a development community, many of those can be addressed with better documentation. The talk was a call for all projects to think about what they are trying to accomplish and to ensure that their documentation is helping to get there.

First, though, MacNamara started with a story. The fifth of Euclid's postulates holds that non-parallel lines must eventually cross. Not only has this postulate never been proved, but there are geometries where it is known to be false; these include hyperbolic spaces. Those spaces, though, are hard to visualize and hard to explain, a fact that inhibited research into them for two centuries.

In 1997, Daina Taimina realized that crochet could be used to create a model of a hyperbolic space; this model is now the standard way of explaining the whole idea. The visualization of hyperbolic spaces is, MacNamara said, a problem that had gone unsolved for centuries for a simple reason: the field of mathematics was closed to women. What else are we losing, she asked, when we exclude the talents held by large parts of our population?

Open-source software has just such a problem, even if one looks only at gender and ignores (for now) many other potential diversity issues. Surveys have shown that only about 3% of the open-source development community is female; the situation is very much like the 18th-century mathematics field. She would like to improve that situation, increase diversity in our communities, help the process of inclusion, and thereby create equity in the field.

Contributor types

Toward that end, Google commissioned a study looking at how people enter open-source communities, with the idea of finding ways to attract more contributions. It asked who the users are, what their journey to becoming a contributor is like, what their information needs are, and how we tend to lose them. The group interviewed 18 subjects, some of whom were contributors and others not, and came up with five archetypes to describe them:

  • Leaders, like "Rory", who are active in their communities; their activity is generally supported at work. Rory's biggest need is usually help in growing the community.
  • "Avery", who is convinced about open source, but whose participation is held back. Avery knows the value of contributing and feels safe in the community, but is unable to fully participate, perhaps because of a lack of support at work.
  • Competent people like "Taylor", who are not yet convinced of the value of participating in the community. Taylor does contribute, generally in the form of bug fixes, but doesn't see the point of doing more. Taylor may be skeptical about the owners of a project and is not sure what benefit would be gained from participating more fully.
  • Curious people like "Parker" who do not yet contribute — we all began as Parker, she said. Parker is curious about how to get involved but faces a number of barriers. They may not be technically advanced or understand how open-source development works; they almost certainly do not have support.
  • Lurkers; MacNamara didn't really talk about this unnamed group at all.

Looking at Parker, MacNamara delved into how such a person would pick a project. There are a few phases involved, the first of which is assessment, where the objective is to eliminate any unsuitable projects from consideration. What language does the project use? What are the requirements for participation? This phase also involves a look at the reliability of the project — is it likely to be around for some time? Parker will probably start by looking at commit activity, but then will move on to the project's documentation. They might not read the documentation at this point, but its presence indicates that the project is healthy.

After that, though, a potential contributor will want to obtain a better understanding of how the project works, and good documentation becomes critical. What is the overall architecture of the project? What are its use cases? They will want to be sure that this information is there and accessible.

The next phase is adoption and, at some point, Parker will certainly run into some sort of problem. It's software, after all. That is the make-or-break point for a future contributor, she said. They may succeed in fixing the problem and will want to contribute the fix back, but that's where the roadblocks show up. Sending a contribution to a project can be scary; one has to make oneself vulnerable to do so. An employer may stop the process, or hostile feedback may drive people away; either way, they don't contribute, and that is bad for everybody involved.

What can help with this process? Documentation (or the lack thereof) is often the biggest barrier that potential contributors encounter.

Documentation for an inclusive project

But is documentation really necessary? Established members of a project will often say that contributors should be able to just read the code to get the answers they need. There are a number of problems with that answer, but MacNamara had one in particular to point out: women have less leisure time available for activities like "just reading the code" than men do. The numbers vary, but women all over the world still do significantly more unpaid work than men. Having the time to contribute to a software project is a privilege, and it's one that women tend not to have the way men do. So they can't just go read the code to get the answers they need.

Another suggested alternative is to simply ask for help. But not everybody experiences the community in the same way. People who have had bad experiences, harassment for example, are going to be reluctant to ask for help publicly. For them, the lack of documentation is a significant barrier to contribution. Others may not have strong English skills, which makes it harder to ask for help; once again, documentation can help.

MacNamara repeated that contributing to an open-source project requires making oneself vulnerable. In many projects, people are afraid to ask questions — or even to answer them. One of the best things to do to create a more inclusive community is to build psychological safety for participants. Creating clear and effective documentation is a big step in that direction.

In fact, she said, good documentation can help people at all four of the stages listed above. It can help people decide whether a given project is right for them by describing the languages used, what the use cases are, what platforms the project is built on, etc. New contributors can start more quickly with "getting started" guides, tutorials, and documentation with good and well-designed navigation. Documentation can help to convince managers to support the work by covering use cases, contribution requirements, success stories, and so on. Leaders will benefit from the ability to bring contributors in more easily.

The required documentation takes a few different forms. Those considering adopting a project will need "standard engineering documentation" like API references, tutorials, and troubleshooting guides. There should be information about getting help and pointers to resources like a project's social media presence. Good navigation, and especially good search features, are critical.

Contributors need the above documents and more. There should be documentation for moderators and reviewers describing how the project responds to people. There should be welcoming documents for new users and contributors. And, of course, there need to be guidelines for the documents themselves.

To get there, she said, a project should recognize and reward good work on documentation. Non-code contributions are vital to a project's health; they provide a project with a lot of value and stability. When the documentation is good, everybody wins.

To summarize, she said that the lack of good documentation is a barrier to contributions, and to contributions from women in particular. Psychological safety is paramount in an inclusive project, and documentation helps to create that. Documents that clearly describe a project empower contributions. Tribal knowledge concentrates power in the hands of a privileged few, while good documentation gets knowledge out of people's heads and makes it more widely available. Knowledge, she said, is power; when we make knowledge accessible, we build equality.

[Your editor thanks the Linux Foundation for supporting his travel to the event.]

Index entries for this article
Conference: Open Source Summit Japan/2019



Improving communities through documentation

Posted Jul 20, 2019 11:48 UTC (Sat) by alfille (subscriber, #1631) [Link] (1 responses)

Documentation certainly helps, but it's only part.

An easy way to make code contributions (git, helpful feedback and review), good internal architecture and a modular structure that allows adding components at the edges without having to grok the entire codebase are also important.

Diversity and attracting new populations is also great, but seems orthogonal to the documentation and code structure barriers. Perhaps it should be grouped under "marketing" (to developers).

Improving communities through documentation - and debug solutions

Posted Jul 20, 2019 22:50 UTC (Sat) by marcH (subscriber, #57642) [Link]

While experts may not need one, or even shouldn't use one as I think someone famous said, a good debugger story is by far the most effective way to ramp up new developers. I recently went from regular user but totally ignorant of the code to accepted bug fix on first pass for some relatively complex Python project in _less than a few hours_. How? Entirely thanks to PuDB (any even better alternative please share).

The problem with even very well designed and commented source code: you can still spend hours or even days reading many parts until you understand them well enough to realize they're not actually relevant to your current issue. With a debugger you can often understand very localized issues and fix them while being in complete ignorance of the overall picture. This is not just solving your current issue and contributing back at an extremely low cost, this is also extremely rewarding / addictive which gives a higher chance you will come back and contribute more. Or conversely, this is the fastest way to gauge poor project maintainership and minimize wasted time.

This is IMHO one of the reasons why tools like QEMU are popular: they massively boost debuggability. In a completely different domain, look at how popular Chrome Developer Tools are.

Improving communities through documentation

Posted Jul 20, 2019 19:47 UTC (Sat) by mb (subscriber, #50428) [Link] (1 responses)

> women have less leisure time available for activities like "just reading the code" than men do.
> The numbers vary, but women all over the world still do significantly more unpaid work than men.

That doesn't sound reasonable to me.

Yes, having good documentation is always good for dozens of reasons.
But we only have 3% women in Open Source, because they don't have time to read the code?
Having less time would just reduce the "code output" per developer. Not the number of developers.

I also don't think that doing unpaid work is directly linked to having less leisure time. That highly depends on other factors. E.g. the way the family is organized. That can't be fixed by writing documentation.

Improving communities through documentation

Posted Jul 22, 2019 16:47 UTC (Mon) by k8to (guest, #15413) [Link]

What's unreasonable about it? The facts or the implied conclusion? Women doing unpaid work does definitely reduce their leisure time. This unpaid work comes from expectations set to some extent by employers and to a lesser extent by families. I'm not sure if that implies that we have untapped documentation work hours left on the table due to these factors or not.

It's definitely true that in some projects, it's assumed that documentation should be built from code-reading, and these projects can't benefit from technical writers who aren't code experts. That is the vast majority of technical writers.

A shift in focus towards an explicit handoff from technical folks to doc writers can help projects both open and corporate, and I've seen it in practice. So overall I think the ideas being supported here are legitimate.

Improving communities through documentation

Posted Jul 20, 2019 22:28 UTC (Sat) by mfuzzey (subscriber, #57966) [Link]

> women have less leisure time available for activities like "just reading the code" than men do.

Not sure how true that is.
But if it is true, it would also be true of contributing code.

Documentation, while good, is not free.
It comes with a significant maintenance burden of its own: ensuring it keeps up with the code itself.
Often out of date documentation is worse than no documentation.

So while I agree a short architectural overview document and a short "project goals and non goals" document are well worthwhile (and have less maintenance requirements since the goals and architecture change more slowly than the code) I am not convinced more developer documents are the best use of effort in many cases.

Improving communities through documentation

Posted Jul 21, 2019 12:32 UTC (Sun) by ebiederm (subscriber, #35028) [Link] (11 responses)

I feel like I am missing something in the article. How is it established that documentation helps?

My intuition would tend to agree that certain forms of documentation make the learning curve of figuring out a new piece of software much easier. I can see similar things for setting expectations when contributing code. AKA a document that says this is the project's process.

Still, it says Google had commissioned a study; as such, it feels like the bar for recommendations should be set higher than someone's intuition.

The practical challenge is that not all of us have sufficient experience for our intuition to be calibrated for what other groups need. Which means sometimes we are very much surprised by the results. Which means we need more than intuition to overcome our implicit biases. So is this recommendation for more documentation based on more than intuition?

Improving communities through documentation

Posted Jul 21, 2019 16:48 UTC (Sun) by LtWorf (subscriber, #124958) [Link] (1 responses)

In the study they interviewed 18 people, some of whom were not contributors. I don't even know how you can start drawing any meaningful conclusion from 18 people.

I think for those things, it's easier to do a standardised questionnaire to 100-200 people.

Improving communities through documentation

Posted Jul 22, 2019 17:03 UTC (Mon) by k8to (guest, #15413) [Link]

Detailed interviews shed light in a way that a narrow questionnaire to many people doesn't.

However I agree that it's hard to generalize from 18 people.

Improving communities through documentation

Posted Jul 22, 2019 22:56 UTC (Mon) by shemminger (subscriber, #5739) [Link] (5 responses)

I would like to see some empirical data about project size/lines of code versus documentation size. Then use these metrics to compare against number of contributors and number of new contributors per year etc. This kind of data can be collected against a wide range of projects on some known repos (kernel.org, github, etc).

With this data some of these speculations could be validated.
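As a rough illustration of the kind of measurement proposed here, the following Python sketch walks a checked-out project tree and compares documentation lines to code lines. The extension sets and the function names `count_lines` and `doc_to_code_ratio` are assumptions made up for this example, not an established metric; a real survey would need a far better way of telling documentation from code.

```python
from pathlib import Path

# Assumed, illustrative extension sets; real projects vary widely.
DOC_SUFFIXES = {".md", ".rst", ".txt"}
CODE_SUFFIXES = {".c", ".h", ".py", ".go", ".rs"}


def count_lines(path):
    """Count the lines in one file, ignoring undecodable bytes."""
    with open(path, encoding="utf-8", errors="ignore") as handle:
        return sum(1 for _ in handle)


def doc_to_code_ratio(root):
    """Return (doc_lines, code_lines, ratio) for the tree under root.

    The ratio is None when no code lines were found.
    """
    doc_lines = 0
    code_lines = 0
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        if path.suffix in DOC_SUFFIXES:
            doc_lines += count_lines(path)
        elif path.suffix in CODE_SUFFIXES:
            code_lines += count_lines(path)
    ratio = doc_lines / code_lines if code_lines else None
    return doc_lines, code_lines, ratio
```

Calling `doc_to_code_ratio(".")` in a repository checkout gives the three numbers for that project. The result measures only volume, not quality, so it would at best be one input alongside contributor counts.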

Improving communities through documentation

Posted Jul 23, 2019 14:55 UTC (Tue) by anselm (subscriber, #2796) [Link] (4 responses)

> I would like to see some empirical data about project size/lines of code versus documentation size.

I don't think documentation size is the metric we're looking for here. A project could have documentation that is very terse but to the point and tells you exactly what you need to know, while another could have reams and reams of unintelligible garbage that isn't helpful at all. This is especially true in a world where project maintainers run a program on their code that automatically extracts the names and types of functions and their parameters, and then have the audacity to call the result “API documentation”.

Improving communities through documentation

Posted Jul 23, 2019 18:53 UTC (Tue) by shemminger (subscriber, #5739) [Link] (3 responses)

I have no complaints about well done docbook style documentation. It can be more helpful than simple comments.
The bigger question is how much documentation is enough? and how do you keep it up to date?
To use an example, the fd.io project has lots of documentation and tutorial videos.

In my experience, videos are better than documentation because a good presenter will often give background and express the intention of the code and project. Documentation seems to be written in an absolute and cold de facto style which doesn't help learning. It is like the difference between classroom and just reading a textbook.

Improving communities through documentation

Posted Jul 23, 2019 22:58 UTC (Tue) by anselm (subscriber, #2796) [Link]

> In my experience, videos are better than documentation because a good presenter will often give background and express the intention of the code and project. Documentation seems to be written in an absolute and cold de facto style which doesn't help learning. It is like the difference between classroom and just reading a textbook.

As somebody who used to write Linux training materials for a living (I'm only doing it as a hobby these days), I honestly don't think videos are all they're cracked up to be. Personally I will almost always prefer written documentation to a video, simply because I tend to read a lot faster than most video presenters talk, and it is easier to skip back and forth in a written manual. I've seen videos that were so boring and dry they would make you cough, and well-written documentation that was not only accurate but also totally entertaining.

In general, good videos are a lot more effort to produce and keep current than good written documentation, so what are the chances that a project that can't manage good written documentation will be able to put up good videos? (More power to projects that can do both.)

Improving communities through documentation

Posted Jul 24, 2019 18:03 UTC (Wed) by nilsmeyer (guest, #122604) [Link]

That of course requires a good presenter. Personally I prefer text; if there is video I probably won't watch it. But some people learn differently.

Improving communities through documentation

Posted Jul 26, 2019 12:37 UTC (Fri) by NAR (subscriber, #1313) [Link]

I don't know how to search in video. Also writing usually hides accents and dialects (well, mostly), but in videos many times it's really hard to understand what the presenter says.

Improving communities through documentation

Posted Jul 23, 2019 12:48 UTC (Tue) by nilsmeyer (guest, #122604) [Link] (1 responses)

> My intuition would tend to agree that certain forms of documentation make the learning curve of figuring out a new piece of software much easier. I can see similar things for setting expectations when contributing code. AKA a document that says this is the project's process.

That makes a lot of sense intuitively, especially there should be clear requirements on what you need to do to get your code reviewed and merged. I see far too many open pull requests on many projects that are just stalling out. There is a lot that can be improved with tooling here which makes the process scalable.

I suppose one might also want documentation to be a first class citizen in your code base: any feature without sufficient documentation shouldn't be merged until that documentation is produced. This is the approach I take professionally. At a certain point I would like the documentation to also be testable, to see if there is something missing and to see if examples actually work, though I don't really have a very good idea how this would work.

Improving communities through documentation

Posted Jul 23, 2019 18:00 UTC (Tue) by cwitty (guest, #4600) [Link]

> At a certain point I would like the documentation to also be testable, to see if there is something missing and to see if examples actually work, though I don't really have a very good idea how this would work.

Many scripting languages have "doctest" testers; if you put examples in your documentation in a certain stylized format, then the doctest framework will run the examples and make sure the current output still matches the output in the documentation. I've used this a lot for Python; it works very well, as long as you're testing APIs on objects that have reasonable textual I/O. (And you can also combine this with coverage tests, if you like, to track how much of your code is tested by doctests.)
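For readers who have not seen the technique, here is a minimal sketch using Python's standard `doctest` module. The function `word_count` and the helper `run_examples` are hypothetical names invented for this illustration; only the `doctest` calls themselves come from the standard library.

```python
import doctest


def word_count(text):
    """Return the number of whitespace-separated words in text.

    The examples below are documentation and tests at the same time.

    >>> word_count("good documentation helps")
    3
    >>> word_count("")
    0
    """
    return len(text.split())


def run_examples():
    """Run the docstring examples of word_count.

    Returns a (failures, attempted) pair.
    """
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    tests = finder.find(word_count, "word_count",
                        globs={"word_count": word_count})
    for test in tests:
        runner.run(test)
    return runner.failures, runner.tries


if __name__ == "__main__":
    failed, attempted = run_examples()
    print(f"{attempted} examples run, {failed} failed")
```

If the implementation changes so that an example's output no longer matches, the run reports a failure, which is how stale documentation gets caught. The same examples can also be checked from the command line with `python -m doctest -v` followed by the file name.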

Improving communities through documentation

Posted Jul 23, 2019 15:18 UTC (Tue) by mathstuf (subscriber, #69389) [Link]

My process is usually to set other developers (usually my intended audience for documentation I write) down the path of using what is meant to be documented. I then solicit feedback on what was confusing (or make such a list based on questions I receive) for improvement. The reason is that, as the author, I'm a pretty bad judge of what might be confusing to someone not familiar with the code who is trying to use it. (I do document the interfaces and tricky internal bits, but usage documentation is harder without feedback.)

Anecdotally, I have found that better documentation leads to fewer private emails asking how something does/should work :) .

Improving communities through documentation

Posted Jul 23, 2019 12:34 UTC (Tue) by nilsmeyer (guest, #122604) [Link]

> women have less leisure time available for activities like "just reading the code" than men do. The numbers vary, but women all over the world still do significantly more unpaid work than men.

That only really tracks when you don't consider working on open source software as "unpaid work", which I think diminishes the voluntary contributions many men make. If you classify either as unpaid work then it suddenly becomes a matter of preference, much like it is in the workplace.


Copyright © 2019, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds