
WOS4: Quality management in free content

One problem which must be faced by any cooperative project is that of quality management. If anybody can contribute to a work, how can a project ensure that its output is up to the standards it has set for itself? A Wizards of OS 4 panel session on this topic highlighted three very different approaches to this issue.

Ullrich Pöschl, a researcher at the Max Planck Institute for Chemistry, is trying to address a number of problems with the scientific publishing world. Publication is crucial to scientists - it is, in the end, the one concrete result from their work which matters. But the process to publication is long and frustrating, and can often be hampered by personal agendas and scientific conservatism. Your editor, who in a previous life actually published a paper in a refereed journal, can attest to what a painful process it can be. There are also problems with scientific fraud and (much more often) plain old carelessness. Scientists, in their rush to get their work out, will often not take the time to produce work of the needed quality. Quite a few papers are published which contribute little and actually dilute the pool of scientific knowledge.

On the other side, scientific journals are tremendously expensive, and they publish last year's work. There are a lot of pressures for faster - and more open - access to scientific results. It seems that a more open approach would benefit everybody, but only if the quality level can be maintained.

Ullrich is a founder of a relatively new journal (Atmospheric Chemistry and Physics) which has set out to demonstrate a new approach to scientific publication. This journal has retained much of the classic scientific publication process - every paper is still reviewed by anonymous referees whose questions must be answered to the editor's satisfaction. Where things differ is in the openness of the process.

When a paper is submitted, as long as it's not complete junk, it will be immediately published as a "discussion paper" on the journal's web site. It is clearly marked as an unreviewed paper, not to be taken as definitive results at that time. While the referees are reviewing the paper, others can post comments and questions as well. These others are limited to "registered scientists," since the desire is to keep the conversation at a high level. The comments become part of the permanent record stored with the paper, and they can, at times, be cited by others in their own right. The editor will consider outside comments when deciding whether the paper is to be accepted and what revisions are to be required.

After using this process for five years, Atmospheric Chemistry and Physics has the highest level of citations in the field. Citations are important in the scientific world: they are an indication that a given set of research results has helped and inspired discoveries elsewhere. The high level of citations here indicates that this publication process is succeeding in attracting high-level papers and filtering out the less useful submissions.

Things are at an early stage - out of approximately 7,000 scientific journals, about five are currently publishing with this sort of technique. Others are interested, however, and that number can be expected to grow in the future.

Martin Haase then took the podium to talk about quality management in Wikipedia. While Wikipedia is a useful resource, there have been a number of well-reported problems. Some articles can be flat-out wrong, or, sometimes, distorted to meet somebody's political goals. Maintaining and improving Wikipedia's reputation will require getting a handle on these problems.

Some measures being taken by Wikipedia are:

  • Putting restrictions on anonymous access. In particular, anonymous editors cannot create new articles.

  • Getting a better handle on attribution of work. Wikipedia maintains an article editing history now, and has lists of contributors. Some people, it seems, have been surprised to learn this, and have changed the style of their contributions afterward.

  • A two-level reviewing process. Articles which have been heavily reviewed and deemed to be correct can be designated as "featured" articles. This process, however, turns out to be slow, so a new, less rigorous "good article" designation has been created as well.

  • Specific metadata about validation is being added to articles.

  • There is a mechanism for creating permanent links to specific versions of articles. These links can be used by outside sites to link to a "known good" version of an article with no need to worry about what subsequent changes could bring.
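
As a concrete illustration of the last point: a MediaWiki permanent link embeds the numeric revision ID directly in the URL, so it keeps resolving to that exact revision regardless of later edits. The article title and revision number below are hypothetical:

```
https://en.wikipedia.org/w/index.php?title=Example&oldid=12345
```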

While agreeing that improving the quality of Wikipedia articles will be a never-ending process, Martin seems to think that the measures being taken will move things in the right direction. He warned explicitly about "expertism" - requiring that articles be written by experts in the field. It can be hard for experts to write articles for people who are unfamiliar with the field - their work tends to be jargon-heavy and written at the wrong level. They also tend to run in schools, and expert-written articles tend to reflect the views of one school only. Limiting contributions to experts would, in Mr. Haase's view, rob Wikipedia of much of its usefulness.

The third panelist, Larry Sanger, disagrees. Larry was a part of the creation of Wikipedia, but has since fallen out with that project. So, while claiming to be a "big fan of Wikipedia," he spent much time criticizing it. Wikipedia, he says, was meant to be the wild side of Nupedia; it was never supposed to be the whole thing. With only half of the original design, he says, it is not surprising that things have gone wrong.

So what has gone wrong? According to Larry, the Wikipedia rules are not enforced uniformly, leading to lots of abuses. Anonymous editing attracts trolls and other people whose main purpose is not the creation of a top-quality encyclopedia. The Wikipedia community is insular and hard to join. And there is no place for academics, people who are experts in their field. Wikipedia people may fear expertism, but Larry, instead, is on a campaign against amateurism. This amateurism, he says, is behind many of the problems with Wikipedia, but the community will not recognize these problems, and, thus, he says, will never fix them.

So Larry is going to fork Wikipedia. His project, called The Citizendium, will, he says, be very different. It will start out very much the same, however: the same software, and copies of all the Wikipedia articles. Those articles will track changes to their Wikipedia equivalents until they are changed locally, at which point they will become a hard fork. There are no plans to fork the software. In essence, the Citizendium intends to make full use of Wikipedia's free licensing (as is its right) to bootstrap the new site, and only move away from Wikipedia content when and where it feels it has something better to offer.

There will be some distinct roles for members of the Citizendium project. People who are deemed to be sufficiently expert in a given field will be called "editors"; regular contributors will be expected to defer to the editors in their field of expertise. These editors will be self-selecting, but they must publicly state their credentials. Editors can mark an article as being "approved," indicating that, in their opinion, it has reached a certain level of quality.

There will be no anonymous editing allowed in the Citizendium, and no pseudonyms either. All contributors must work under their own names. There will be a number of rules on how contributors and editors are supposed to work, with quick expulsion from the project for those who do not follow them. To that end, there will also be "constables," whose job is to enforce these rules.

There are vague plans for a meeting to draft and approve the charter under which the project operates. For now, however, the Citizendium is very much Larry Sanger's project, with goals and processes set by him. Whether it will be able to build a community and maintain it while keeping quality high remains to be seen.



The Citizendium

Posted Sep 19, 2006 17:14 UTC (Tue) by cventers (guest, #31465) [Link] (4 responses)

I definitely agree that there is a right to fork Wikipedia and use its
content; that much is encompassed in the spirit of the GNU Free
Documentation License that covers the work.

In the past, I've grown tired of severe misinterpretation and I've
personally blacklisted two publications for publishing blatant and
offensive attacks on Wikipedia. What worries me about The Citizendium is
that this opens up further opportunity for the press to misunderstand
Wikipedia.

In essence, they are starting from the already excellent Wikipedia
content. I believe that the argument made by those against expertism is
that some of this content would not exist in this quality if expertism
had taken hold. So The Citizendium, then, would use Wikipedia content
created without expertism to uphold the idea of expertism, and then
one-up Wikipedia by using the good side of expertism to correct
Wikipedia blemishes. This may mislead people into believing that
expertism is the only operable strategy, when in fact the bulk of the
content flying the expertism flag was created through a process of
evolution in a community where expertism is rejected.

I hope that doesn't happen, but I fear that it will.

The Citizendium

Posted Sep 19, 2006 17:42 UTC (Tue) by trutkin (guest, #3919) [Link] (1 responses)

Eh, I wouldn't worry about it. Most efforts like these fail.

The Citizendium

Posted Sep 20, 2006 1:50 UTC (Wed) by xoddam (subscriber, #2322) [Link]

Indeed. The problems with Wikipedia already largely derive from the fact
that its editors are all self-selected. This new project doesn't seem to
correct that in any way.

Forking Wikipedia content and tracking changes to those articles which
have *not* been edited by members of the forking project looks to me like
a recipe for mediocrity. Articles which are cleaned up by Citizendium
editors may look better for a short while, but it seems inevitable that

(a) interested Wikipedians (perhaps assisted by a daemon, or even the
same people who produce the Citizendium content) will promptly merge any
decent changes back into the parent project, so they may as well be made
there in the first place.

(b) the articles which suffer from a lack of informed attention on
Wikipedia will suffer even more in an elitist environment. The worst
irrelevancies may be deleted, but then the articles will sit and rot.

Unless the editors of Citizendium want to review *every* change made to
Wikipedia for quality, they're bound to duplicate mediocre changes and to
overlook excellent ones.

The Citizendium

Posted Sep 19, 2006 20:21 UTC (Tue) by jstAusr (guest, #27224) [Link]

I agree, and think it would be useful if Citizendium would separate their expert viewpoints from the original: not necessarily in separate places, but perhaps shaded or colored differently, with information on why they think the changes are needed.

The Citizendium

Posted Sep 21, 2006 4:04 UTC (Thu) by moxfyre (guest, #13847) [Link]

I personally think that the Citizendium idea sounds really neat, and can't wait to see how well it works in practice.

However, I believe there are major downsides to the Citizendium model: there are lots of Wikipedia articles that would never get created by "experts". For example, Wikipedia has many articles analyzing "Simpsons" episodes in exhaustive detail. Who would be considered an "expert" for this type of article?

Wikipedia is filled with wonderful articles created by enthusiasts of things that have no recognized experts! Furthermore, in many technological fields the experts have no incentive to create and maintain free content. For example, I recently created the article "Lugged steel" on a method of bicycle frame construction. This article includes many links to resources for amateur bicycle frame builders. If an expert had written this article, I doubt those links would be there. Nearly all the true experts on steel frame building do it for a living, and many jealously guard their methods and tools!

It would be like asking Microsoft to write an article on C compilers... then wondering why there's no mention of GCC. Or asking Larry Ellison to write an article on database software, then wondering why there's no mention of MySQL. How will Citizendium ensure that it finds experts who don't have a vested interest in presenting the topic one way or another?

Larry Sanger has good points, but some of his articles were a bit impenetrable

Posted Sep 19, 2006 18:10 UTC (Tue) by emk (subscriber, #1128) [Link] (1 responses)

I've read a number of Larry Sanger's old Wikipedia articles on philosophy. While his articles contained a lot of good, expert level material, they also tended to read more like lecture notes from a graduate-level class than like encyclopedia articles. His writing style was informal, didactic, opinionated, and occasionally inaccessible to a lay reader.

In the right place, there is nothing wrong with being informal, opinionated, or highly technical. (After all, I'm frequently accused of these traits myself.) But Sanger's style perhaps wasn't the best fit for Wikipedia.

In my experience, Wikipedia isn't hostile to experts, per se. Like any open project, it favors certain personality traits, and requires some patience with trolls and the terminally confused.

Flawed Wikipedia articles generally have several causes:

a) Nobody has ever been interested enough to write a good article. This is an inevitable fact of life, especially for obscure topics.

b) The article has an infestation of "difficult" editors--this happens particularly on cultural "hot button" topics. Some difficult people aren't worth messing with unless you're prepared to go all the way to the Arbitration Committee, and that's a big pain.

c) The article has bit-rotted into oblivion, because nobody can be bothered to keep it fixed.

There are a few other pathologies which can produce well-written, competent articles with a lot of subtle errors. These generally occur when domain experts are scarce, or accurate sources are hard to find.

Larry Sanger has good points, but some of his articles were a bit impenetrable

Posted Sep 21, 2006 14:10 UTC (Thu) by mchristensen (guest, #4955) [Link]

In Larry's defense, many of these articles were ported wholesale to Wikipedia from old lecture notes. The intention was always to fix those articles, but intention isn't action, so that never got done.

Larry is a good guy, with interesting ideas, who could have been a great leader for Wikipedia, but turned out to be something else. Not everybody can be great at every job, so there's no shame in that.

With that said, I'm not particularly hopeful that he will turn out to be a great leader for this next project either.

I think it would be great to have a more reliable version of Wikipedia, and I hope Larry will be successful. And I think it would be fantastic if credible educational institutions could get involved in improving the reliability of key areas of Wikipedia.

I just don't think this project is designed correctly, or has the right leadership to make either of those things happen.

WOS4: Quality management in free content

Posted Sep 20, 2006 8:06 UTC (Wed) by ortalo (guest, #4654) [Link] (3 responses)

NB: A high level of citations may not necessarily mean a high level of scientific quality (at least for the initial years or for a wide public); it may also simply mean a higher level of diffusion of the paper with respect to other publications. Especially when comparing a web-based scientific journal with a conventional one, I am not sure citations are decisive.

More generally, it is always very difficult to assess the scientific "level" of a publication or a person. Everyone tries to use quantitative indicators to do this but, in fact, there are only two means: factual evidence (like measuring star light distortion near Mercury to prove some theory - initially highly controversial) or scientists' qualitative opinions (the latter technique being frequently a source of infinite recursion and the associated bugs).

OT: GR

Posted Sep 21, 2006 5:35 UTC (Thu) by roelofs (guest, #2599) [Link] (2 responses)

... factual evidence (like measuring star light distortion near Mercury to prove some theory - initially highly controversial)

Minor quibble: the theory to which you refer, if I'm not mistaken, is General Relativity, in which case I believe you've conflated two separate tests/predictions: the bending of starlight near the sun (first measured during the total solar eclipse of 1919, IIRC), and the anomalous precession of Mercury's orbit (by an extra 43 arc-seconds per century, again assuming my memory isn't playing tricks on me). Mercury itself has far too little mass to cause lensing measurable even with today's instruments, I believe.

Greg

OT: GR

Posted Sep 21, 2006 9:19 UTC (Thu) by ortalo (guest, #4654) [Link] (1 responses)

Thanks for correcting (and enlightening) me.
Furthermore, it sounds like a very practical illustration of why, nearly all the time, such writings need quality review(ers).

OT: GR

Posted Sep 22, 2006 19:53 UTC (Fri) by smurf (subscriber, #17840) [Link]

Good point, but don't forget that "quality" != "expert".

Case in point: I could have written the same correction, but my knowledge of relativity stops roughly at the point where things get interesting (i.e. mathematical).

The "expert bias" problem shouldn't be neglected either. As a somewhat extreme example, if you only allow astrologers to edit the article on astrology, then Wikipedia's much-vaunted neutral PoV is lost -- none of these people will admit that the stuff doesn't work in the first place.

WOS4: Quality management in free content

Posted Sep 20, 2006 17:52 UTC (Wed) by zooko (guest, #2589) [Link] (2 responses)

Sounds like a job for a revision control tool. The Citizendium can put their copy of wikipedia under control of such a tool, then use that tool to merge their edits with wikipedia edits, as well as to include/exclude wikipedia edits on a "cherry picking" basis.

darcs would be perfect for the job, were it not for its fatal performance flaws (i.e. it always locks up when you try certain kinds of merges).

Free content and version control

Posted Sep 23, 2006 10:05 UTC (Sat) by ddaa (guest, #5338) [Link] (1 responses)

> Sounds like a job for a revision control tool.

I had just the same impression. The "hard-fork on first edit" approach just sounds like "we do not have the right tool to track upstream, so we fork".

> darcs would be perfect for the job were it not for its fatal performance flaws

Huh... and CVS would be perfect for the job were it not for its being rubbish?

I take it you actually meant: there is not, at this time, a practical free software tool to deal with cherry-picking inclusion and exclusion; although the free distributed version control folks have been thinking a lot about it, it does not seem that anybody has been able to come up with something usable so far.

Free content and version control

Posted Sep 27, 2006 5:29 UTC (Wed) by njs (subscriber, #40338) [Link]

Cherrypicking per se -- i.e., taking some changes from another version and applying them to your version -- is pretty easy. (I mean, these are just text files, and diff/patch can handle those well enough for most purposes, though you can also do somewhat better with real merge tech.) Where darcs runs into trouble, and where there are unsolved problems, is when you want later merges to be smart about respecting those past cherrypicks. But since they don't want to re-merge the two forks anyway, that doesn't matter. I don't see why you couldn't perfectly well import future edits from Wikipedia into a "possible changes to review and click on the ones that you want to apply" interface.

(Some of us have made a similar argument about cherrypicking for software too, since the most important use cases are things like cherrypicking to stable branches that are never going to be merged back either.)
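
The one-way, paragraph-level cherry-picking described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not any existing tool: the function name is hypothetical, and paragraphs are paired positionally, where a real tool would align them with difflib.SequenceMatcher.

```python
def cherry_pick(upstream_old, upstream_new, local):
    """Offer an upstream edit (old -> new) to a diverged local copy,
    accepting each changed paragraph only where the local text still
    matches the old upstream text."""
    old_paras = upstream_old.split("\n\n")
    new_paras = upstream_new.split("\n\n")
    local_paras = local.split("\n\n")
    result = []
    for i, para in enumerate(local_paras):
        if i < len(old_paras) and i < len(new_paras) and para == old_paras[i]:
            # Local paragraph is untouched: accept the upstream revision.
            result.append(new_paras[i])
        else:
            # Local paragraph has diverged: keep the local text.
            result.append(para)
    return "\n\n".join(result)
```

An interface like the one described in the comment would present each accepted paragraph to a reviewer before applying it, rather than merging blindly.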

evolution of editing rule sets and multiple article class inheritance

Posted Sep 28, 2006 10:38 UTC (Thu) by copsewood (subscriber, #199) [Link]

Having a number of Wikipedia trees or forks might turn out to allow the evolution of a more fit set of rules for compiling the content. The forks that choose the most effective editing rules will attract the best content, and others will increasingly merge good articles from the site with the better editing. It seems probable that different classes of article will benefit from different rule sets, and the fork that produces the best in class will be able to prove it has the best rule set for that classification from evidence that others merge from its articles in this area more than the other way round. Clearly the original Wikipedia rule set has proved effective at creating a good encyclopedia from scratch. As is mentioned elsewhere, there are problem areas slowing progress from good to excellent, particularly with topics attracting vandalism, political agendas and self-advertising.

One way of classifying an article will be based on its newness; other ways will include frequency of edits and subject area. I think we have a multiple class inheritance system here.

Might it lead to better results for those working on different knowledge classifications (e.g. physics, economics, sci-fi, fantasy literature, maths etc.) to be enabled to evolve their own editing rule sets, inheriting from other classes based on newness and edit frequency? A possible rule might be that an article which has stabilised (based on quantitative content change over a number of edits) will reject edits of more than a particular size from a single contributor until reviewed by some previous editors of the same article.


Copyright © 2006, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds