Improving communities through documentation
First, though, MacNamara started with a story. The fifth of Euclid's postulates holds that non-parallel lines must eventually cross. Not only has this postulate never been proved, but there are geometries where it is known to be false; these include hyperbolic spaces. Those spaces, though, are hard to visualize and hard to explain, a fact that inhibited research into them for two centuries.
In 1997, Daina Taimina realized that crochet could be used to create a model of a hyperbolic space; this model is now the standard way of explaining the whole idea. The visualization of hyperbolic spaces is, MacNamara said, a problem that had gone unsolved for centuries for a simple reason: the field of mathematics was closed to women. What else are we losing, she asked, when we exclude the talents held by large parts of our population?
Open-source software has just such a problem, even if one looks only at gender and ignores (for now) many other potential diversity issues. Surveys have shown that only about 3% of the open-source development community is female; the situation is very much like the 18th-century mathematics field. She would like to improve that situation, increase diversity in our communities, help the process of inclusion, and thereby create equity in the field.
Contributor types
Toward that end, Google commissioned a study looking at how people enter open-source communities, with the idea of finding ways to attract more contributions. It asked who the users are, what their journey to becoming a contributor is like, what their information needs are, and how we tend to lose them. The group interviewed 18 subjects, some of whom were contributors and others not, and came up with five archetypes to describe them:
- Leaders, like "Rory", who are active in their communities; their activity is generally supported at work. Rory's biggest need is usually help in growing the community.
- "Avery", who is convinced about open source, but whose participation is held back. Avery knows the value of contributing and feels safe in the community, but is unable to fully participate, perhaps because of a lack of support at work.
- Competent people like "Taylor", who are not yet convinced of the value of participating in the community. Taylor does contribute, generally in the form of bug fixes, but doesn't see the point of doing more. Taylor may be skeptical about the owners of a project and is not sure what benefit would be gained from participating more fully.
- Curious people like "Parker" who do not yet contribute — we all began as Parker, she said. Parker is curious about how to get involved but faces a number of barriers. They may not be technically advanced or understand how open-source development works; they almost certainly do not have support.
- Lurkers; MacNamara didn't really talk about this unnamed group at all.
Looking at Parker, MacNamara delved into how such a person would pick a project. There are a few phases involved, the first of which is assessment, where the objective is to eliminate any unsuitable projects from consideration. What language does the project use? What are the requirements for participation? This phase also involves a look at the reliability of the project — is it likely to be around for some time? Parker will probably start by looking at commit activity, but then will move on to the project's documentation. They might not read the documentation at this point, but its presence indicates that the project is healthy.
After that, though, a potential contributor will want to obtain a better understanding of how the project works, and good documentation becomes critical. What is the overall architecture of the project? What are its use cases? They will want to be sure that this information is there and accessible.
The next phase is adoption and, at some point, Parker will certainly run into some sort of problem. It's software, after all. That is the make-or-break point for a future contributor, she said. They may succeed in fixing the problem and will want to contribute the fix back, but that's where the roadblocks show up. Sending a contribution to a project can be scary; one has to make oneself vulnerable to do so. An employer may stop the process, or hostile feedback may drive people away; either way, they don't contribute, and that is bad for everybody involved.
What can help with this process? Documentation (or the lack thereof) is often the biggest barrier that potential contributors encounter.
Documentation for an inclusive project
But is documentation really necessary? Established members of a project will often say that contributors should be able to just read the code to get the answers they need. There are a number of problems with that answer, but MacNamara had one in particular to point out: women have less leisure time available for activities like "just reading the code" than men do. The numbers vary, but women all over the world still do significantly more unpaid work than men. Having the time to contribute to a software project is a privilege, and it's one that women tend not to have the way men do. So they can't just go read the code to get the answers they need.
Another suggested alternative is to simply ask for help. But not everybody experiences the community in the same way. People who have had bad experiences, harassment for example, are going to be reluctant to ask for help publicly. For them, the lack of documentation is a significant barrier to contribution. Others may not have strong English skills, which makes it harder to ask for help; once again, documentation can help.
MacNamara repeated that contributing to an open-source project requires making oneself vulnerable. In many projects, people are afraid to ask questions — or even to answer them. One of the best things to do to create a more inclusive community is to build psychological safety for participants. Creating clear and effective documentation is a big step in that direction.
In fact, she said, good documentation can help people at all four of the stages listed above. It can help people decide whether a given project is right for them by describing the languages used, what the use cases are, what platforms the project is built on, etc. New contributors can start more quickly with "getting started" guides, tutorials, and documentation with good and well-designed navigation. Documentation can help to convince managers to support the work by covering use cases, contribution requirements, success stories, and so on. Leaders will benefit from the ability to bring contributors in more easily.
The required documentation takes a few different forms. Those considering adopting a project will need "standard engineering documentation" like API references, tutorials, and troubleshooting guides. There should be information about getting help and pointers to resources like a project's social media presence. Good navigation, and especially good search features, are critical.
Contributors need the above documents and more. There should be documentation for moderators and reviewers describing how the project responds to people. There should be welcoming documents for new users and contributors. And, of course, there need to be guidelines for the documents themselves.
To get there, she said, a project should recognize and reward good work on documentation. Non-code contributions are vital to a project's health; they provide a project with a lot of value and stability. When the documentation is good, everybody wins.
To summarize, she said that the lack of good documentation is a barrier to contributions, and to contributions from women in particular. Psychological safety is paramount in an inclusive project, and documentation helps to create that. Documents that clearly describe a project empower contributions. Tribal knowledge concentrates power in the hands of a privileged few, while good documentation gets knowledge out of people's heads and makes it more widely available. Knowledge, she said, is power; when we make knowledge accessible, we build equality.
[Your editor thanks the Linux Foundation for supporting his travel to the event.]
| Index entries for this article | |
|---|---|
| Conference | Open Source Summit Japan/2019 |
Posted Jul 20, 2019 11:48 UTC (Sat)
by alfille (subscriber, #1631)
[Link] (1 responses)
An easy way to make code contributions (git, helpful feedback and review), good internal architecture and a modular structure that allows adding components at the edges without having to grok the entire codebase are also important.
Diversity and attracting new populations is also great, but seems orthogonal to the documentation and code structure barriers. Perhaps it should be grouped under "marketing" (to developers).
Posted Jul 20, 2019 22:50 UTC (Sat)
by marcH (subscriber, #57642)
[Link]
The problem with even very well designed and commented source code: you can still spend hours or even days reading many parts until you understand them well enough to realize they're not actually relevant to your current issue. With a debugger you can often understand very localized issues and fix them while being in complete ignorance of the overall picture. This is not just solving your current issue and contributing back at an extremely low cost, this is also extremely rewarding / addictive which gives a higher chance you will come back and contribute more. Or conversely, this is the fastest way to gauge poor project maintainership and minimize wasted time.
This is IMHO one of the reasons why tools like QEMU are popular: they massively boost debuggability. In a completely different domain, look at how popular Chrome Developer Tools are.
Posted Jul 20, 2019 19:47 UTC (Sat)
by mb (subscriber, #50428)
[Link] (1 responses)
That doesn't sound reasonable to me.
Yes, having good documentation is always good for dozens of reasons.
I also don't think that doing unpaid work is directly linked to having less leisure time. That highly depends on other factors. E.g. the way the family is organized. That can't be fixed by writing documentation.
Posted Jul 22, 2019 16:47 UTC (Mon)
by k8to (guest, #15413)
[Link]
It's definitely true that in some projects, it's assumed that documentation should be built from code-reading, and these projects can't benefit from technical writers who aren't code experts. That is the vast majority of technical writers.
A shift in focus towards an explicit handoff from technical folks to doc writers can help projects both open and corporate, and I've seen it in practice. So overall I think the ideas being supported here are legitimate.
Posted Jul 20, 2019 22:28 UTC (Sat)
by mfuzzey (subscriber, #57966)
[Link]
Not sure how true that is.
Documentation, while good, is not free.
So while I agree a short architectural overview document and a short "project goals and non goals" document are well worthwhile (and have less maintenance requirements since the goals and architecture change more slowly than the code) I am not convinced more developer documents are the best use of effort in many cases.
Posted Jul 21, 2019 12:32 UTC (Sun)
by ebiederm (subscriber, #35028)
[Link] (11 responses)
My intuition would tend to agree that certain forms of documentation make the learning curve of figuring out a new piece of software much easier. I can see similar things for setting expectations when contributing code. AKA a document that says this is the project's process.
Still, it says Google had commissioned a study; as such, it feels like the bar for recommendations should be set higher than someone's intuition.
The practical challenge is that not all of us have sufficient experience for our intuition to be calibrated for what other groups need. Which means sometimes we are very much surprised by the results. Which means we need more than intuition to overcome our implicit biases. So is this recommendation for more documentation based on more than intuition?
Posted Jul 21, 2019 16:48 UTC (Sun)
by LtWorf (subscriber, #124958)
[Link] (1 responses)
I think for those things, it's easier to do a standardised questionnaire to 100-200 people.
Posted Jul 22, 2019 17:03 UTC (Mon)
by k8to (guest, #15413)
[Link]
However I agree that it's hard to generalize from 18 people.
Posted Jul 22, 2019 22:56 UTC (Mon)
by shemminger (subscriber, #5739)
[Link] (5 responses)
With this data some of these speculations could be validated.
Posted Jul 23, 2019 14:55 UTC (Tue)
by anselm (subscriber, #2796)
[Link] (4 responses)
I don't think documentation size is the metric we're looking for here. A project could have documentation that is very terse but to the point and tells you exactly what you need to know, while another could have reams and reams of unintelligible garbage that isn't helpful at all. This is especially true in a world where project maintainers run a program on their code that automatically extracts the names and types of functions and their parameters, and then have the audacity to call the result “API documentation”.
Posted Jul 23, 2019 18:53 UTC (Tue)
by shemminger (subscriber, #5739)
[Link] (3 responses)
In my experience, videos are better than documentation because a good presenter will often give background and express the intention of the code and project. Documentation seems to be written in an absolute and cold defacto style which doesn't help learning. It is like the difference between classroom and just reading a textbook.
Posted Jul 23, 2019 22:58 UTC (Tue)
by anselm (subscriber, #2796)
[Link]
As somebody who used to write Linux training materials for a living (I'm only doing it as a hobby these days), I honestly don't think videos are all they're cracked up to be. Personally I will almost always prefer written documentation to a video, simply because I tend to read a lot faster than most video presenters talk, and it is easier to skip back and forth in a written manual. I've seen videos that were so boring and dry they would make you cough, and well-written documentation that was not only accurate but also totally entertaining.
In general, good videos are a lot more effort to produce and keep current than good written documentation, so what are the chances that a project that can't manage good written documentation will be able to put up good videos? (More power to projects that can do both.)
Posted Jul 24, 2019 18:03 UTC (Wed)
by nilsmeyer (guest, #122604)
[Link]
Posted Jul 26, 2019 12:37 UTC (Fri)
by NAR (subscriber, #1313)
[Link]
Posted Jul 23, 2019 12:48 UTC (Tue)
by nilsmeyer (guest, #122604)
[Link] (1 responses)
That makes a lot of sense intuitively, especially there should be clear requirements on what you need to do to get your code reviewed and merged. I see far too many open pull requests on many projects that are just stalling out. There is a lot that can be improved with tooling here which makes the process scalable.
I suppose one might also want documentation to be a first-class citizen in the code base: any feature without sufficient documentation shouldn't be merged until that documentation is produced. This is the approach I take professionally. At a certain point I would like the documentation to also be testable, to see if there is something missing and to see if examples actually work, though I don't really have a very good idea how this would work.
Posted Jul 23, 2019 18:00 UTC (Tue)
by cwitty (guest, #4600)
[Link]
Many scripting languages have "doctest" testers; if you put examples in your documentation in a certain stylized format, then the doctest framework will run the examples and make sure the current output still matches the output in the documentation. I've used this a lot for Python; it works very well, as long as you're testing APIs on objects that have reasonable textual I/O. (And you can also combine this with coverage tests, if you like, to track how much of your code is tested by doctests.)
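As a rough illustration of the doctest approach described above, here is a minimal Python sketch. The `slugify` helper is hypothetical, invented only for this example; the part that matters is the `>>>` examples in the docstring, which the standard-library `doctest` module runs and compares against the output written beneath them.

```python
import doctest


def slugify(title):
    """Turn a heading into a lowercase, hyphen-separated anchor.

    (Hypothetical helper, used only to illustrate doctest.)

    >>> slugify("Getting Started")
    'getting-started'
    >>> slugify("  API   Reference ")
    'api-reference'
    """
    # split() with no arguments collapses runs of whitespace and
    # drops leading/trailing whitespace before the words are joined.
    return "-".join(title.lower().split())


if __name__ == "__main__":
    # testmod() collects the examples from every docstring in this
    # module, runs them, and reports any output that no longer matches.
    print(doctest.testmod())
```

If the behavior of `slugify` changes and the docstring is not updated, the run reports a failure, which is one way of noticing that documented examples have gone stale.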
Posted Jul 23, 2019 15:18 UTC (Tue)
by mathstuf (subscriber, #69389)
[Link]
Anecdotally, I have found that better documentation leads to fewer private emails asking how something does/should work :) .
Posted Jul 23, 2019 12:34 UTC (Tue)
by nilsmeyer (guest, #122604)
[Link]
That only really tracks when you don't consider working on open source software as "unpaid work", which I think diminishes the voluntary contributions many men make. If you classify either as unpaid work then it suddenly becomes a matter of preference, much like it is in the workplace.
Improving communities through documentation - and debug solutions
> The numbers vary, but women all over the world still do significantly more unpaid work than men.
But we only have 3% women in Open Source, because they don't have time to read the code?
Having less time would just reduce the "code output" per developer. Not the number of developers.
But if it is true, it would also be true of contributing code.
It comes with a significant maintenance burden of its own: ensuring it keeps up with the code itself.
Often, out-of-date documentation is worse than no documentation.
I would like to have seen some empirical data about project size/lines of code versus documentation size.
The bigger questions are how much documentation is enough, and how do you keep it up to date?
To use an example, the fd.io project has lots of documentation and tutorial videos.
