
Wiki markup isn't too bad

Posted Aug 13, 2025 22:58 UTC (Wed) by KJ7RRV (subscriber, #153595)
Parent article: Arch shares its wiki strategy with Debian

It's different from Markdown, but the description of wiki markup seems a bit overly negative. I've never tried writing a parser for it, but I would not describe it as "very weird and hard to understand ... for humans." Like any markup language, it must be learned, but it doesn't seem to be any harder than Markdown for comparable features.

The fact that it's less familiar than Markdown is certainly a valid consideration; it's quite reasonable to prefer a system that will be more familiar to its users. The quotes, though, suggest that wiki markup is inherently hard to use, which doesn't seem to be the case.

Are there any specific examples of "changing a single token ... completely break[ing] a page"? I've never seen that happen on either of the MediaWiki sites I've been an active user of (one as an administrator), so I'm curious to see a case where it did happen.



Wiki markup isn't too bad

Posted Aug 13, 2025 23:43 UTC (Wed) by tux3 (subscriber, #101245) (1 response)

I like wikitext, but on enwiki it's fairly common to see new users accidentally destroying the infobox, or adorning the page with a snazzy bright red error message after mangling its references.

Then there's the very elaborate template system. At this point the comparison to Markdown stops: it's easy to have lighter syntax if you simply don't have the feature, but you can't really avoid running into MediaWiki templates even as a casual user. Templates are used in almost all MediaWiki installs I've seen; they're really great for automating repetitive elements on a page, but on some wikis they're sprinkled into practically every paragraph of text.
And a small mistake with these can have a very large blast radius: it's surprisingly easy to accidentally break tens of thousands of pages at once, or to bring the servers to their knees, with reasonable-looking changes.

Wikimedia spent a lot of effort on the visual editor, and I think casual users really benefit from it. It's significantly harder to edit the average enwiki article in source form than a Markdown page.
Take today's enwiki featured article, for example: the page starts with a screenful of templates for the infobox; it is sprinkled with inline <ref>{{cite ... }}</ref> calls that interrupt the flow of text with anywhere between a single line and half a screen of inline cite data; and even simple formatting uses '''''some unusual''''' syntax, like those five single quotes that control bold and italics.
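
To make that concrete, here is a rough sketch of the kind of source being described. {{Infobox person}}, {{birth date}}, and {{cite web}} are real enwiki templates, but the values are made up:

    {{Infobox person
    | name       = Example Person
    | birth_date = {{birth date|1970|1|1}}
    }}
    Example Person was born in '''''Exampletown'''''.<ref>{{cite web |url=https://example.org/bio |title=Example biography |access-date=2025-08-13}}</ref>

A single stray token is enough to cause trouble: dropping the infobox's closing braces can swallow the text that follows it, and an unclosed <ref> tag produces the red "Cite error" message mentioned above.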

Wiki markup isn't too bad

Posted Aug 14, 2025 12:04 UTC (Thu) by AdamW (subscriber, #48457)

It's fine as markup. I have a love/hate relationship with using the really exotic bits. You can do some crazy stuff with it (like, ahem, https://fedoraproject.org/wiki/Wikitcms ). But it sure is fun getting the kinks out, and remembering how it all works when you come back to it a year later.

Significant whitespace! Oy.
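
For readers who haven't hit it: in wikitext, a single leading space is significant, and list markers only count at the start of a line. A quick illustration (standard wikitext behavior, nothing specific to Wikitcms):

    A line starting at the margin renders as a normal paragraph.
     A line starting with one space renders as a preformatted, boxed block.
    * An asterisk starts a list item, but only at the very beginning of a line.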

Wiki markup isn't too bad

Posted Aug 14, 2025 12:23 UTC (Thu) by t-8ch (subscriber, #90907) (3 responses)

> It's different from Markdown, but the description of wiki markup seems a bit overly negative.

It's very complex and internally inconsistent, having grown through ad-hoc changes to the parsing logic over the years.
When I researched wikitext parsers a few years ago, the only comprehensive and robust one was Parsoid.
It has been developed by the Wikimedia Foundation since 2012 and ships with newer MediaWiki installations out of the box. It can convert between wikitext and annotated HTML, which is then very easy to handle with any HTML library.
As an example of the complexity of wikitext, Parsoid needs multiple parsing phases; around five, if I recall correctly.

https://www.mediawiki.org/wiki/Parsoid
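
To illustrate how easy the annotated HTML is to work with, here is a minimal Python sketch. It assumes the requests and beautifulsoup4 packages and uses the Wikimedia REST API endpoint that serves Parsoid-generated HTML; the typeof/data-mw attribute names come from the MediaWiki DOM spec:

    import json
    import requests
    from bs4 import BeautifulSoup

    # Fetch Parsoid-generated HTML for a page from the Wikimedia REST API.
    resp = requests.get("https://en.wikipedia.org/api/rest_v1/page/html/Linux")
    resp.raise_for_status()
    soup = BeautifulSoup(resp.text, "html.parser")

    # Parsoid wraps template output in elements marked typeof="mw:Transclusion"
    # and records the original call as JSON in the data-mw attribute.
    for node in soup.select('[typeof~="mw:Transclusion"]')[:5]:
        data_mw = json.loads(node.get("data-mw", "{}"))
        for part in data_mw.get("parts", []):
            if isinstance(part, dict) and "template" in part:
                # The template name as it appeared in the original wikitext.
                print(part["template"]["target"]["wt"])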

Wiki markup isn't too bad

Posted Aug 18, 2025 14:27 UTC (Mon) by paravoid (subscriber, #32869) (2 responses)

Parsoid is indeed the next-generation parser, is being actively developed by a fully staffed team, and has been for ~14 years now, if I'm not mistaken. It's *still* not the default for read views, which speaks to the complexity of the problem (among other reasons). The Wikimedia Parser/Parsoid team will be the first to tell you how difficult wikitext can be in terms of semantics, so I don't think the criticism here is undeserved.

Parsoid is supposed to become the default for read view on Wikipedias by July 2026, and the default for MediaWiki 1.47 LTS (Nov 2026), with the legacy parser to be ultimately deprecated in 2028. These timelines have slipped multiple times before, and the language the WMF folks use to announce them is... careful ("tentatively scheduled", "we hope", etc.), so don't hold your breath. Fortunately, one can already benefit from it by using VisualEditor, including on the new Debian wiki.

https://wikimedia.eventyay.com/talk/wikimania2025/talk/UU... and, linked from there, https://docs.google.com/presentation/d/198_UG5VmHYMoO_38s... is probably the most recent update from the project.

Wiki markup isn't too bad

Posted Aug 18, 2025 21:04 UTC (Mon) by smurf (subscriber, #17840) (1 response)

Counterpoint: if you need 15 years to write a correct parser for a markup language, that markup language *is* bad, almost by definition.

Wiki markup isn't too bad

Posted Aug 22, 2025 18:50 UTC (Fri) by cscott (guest, #178938)

It's also taken nine years *just to write a spec* for Markdown (CommonMark started in 2014; the latest version was published 2024-01-28). Things which look simple on the surface can be surprisingly hard to nail *all the way down*.

And that's the case for wikitext, for sure. Parsoid has been in production use since 2012, and has powered all the mobile apps for almost as long. Many, many other WMF projects have been using Parsoid HTML for a decade. Before we completely ditch the old legacy parser, however, we need to make sure that 99.<some number of nines>% of Wikipedia's >100M pages are bug-for-bug compatible, because we take seriously our duty as custodians of the knowledge base which is Wikipedia and the WMF projects. At this point you might consider our work more 'archivist' than 'engineer', in that our main effort isn't the parser, per se, but preserving the rendering of existing articles.

The goal of Parsoid is to render pages into well-specified semantic HTML, which preserves all the meaningful information (template boundaries, template arguments, invisible constructs, etc.) of the original wikitext. This isn't *just* to allow us to use an HTML editor and round-trip back to the original wikitext; it also paves the way for other editors and markup languages in the future: as long as it can round-trip to and from "MediaWiki DOM Spec" HTML, you can use it to edit Wikipedia.
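
For a sense of what that annotated HTML looks like, here is a hand-written, simplified fragment in the style of the MediaWiki DOM Spec (real data-mw payloads carry more fields; the values here are illustrative):

    <p about="#mwt1" typeof="mw:Transclusion"
       data-mw='{"parts":[{"template":{"target":{"wt":"cite web","href":"./Template:Cite_web"},
                 "params":{"url":{"wt":"https://example.org"}},"i":0}}]}'>
      ... rendered template output ...
    </p>

Because the template target and its parameters ride along in data-mw, an editor that leaves those attributes intact lets Parsoid reconstruct the original {{cite web|url=...}} call exactly, which is what makes the round trip work.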

More information: https://en.wikipedia.org/wiki/User:Cscott/Ideas/A_Dozen_V...

Wiki markup isn't too bad

Posted Sep 21, 2025 5:53 UTC (Sun) by sl2c (subscriber, #179455)

Honestly, from my reading of Wikipedia internals lately, its being different from Markdown is actually an advantage: it makes it that much more obvious that someone copy-pasted from an LLM when their text is covered in double-asterisked words.
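
Concretely: LLM output tends to arrive as Markdown, so a pasted sentence betrays itself at a glance.

    Markdown:  This is **important**.
    Wikitext:  This is '''important'''.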

