Semantic MediaWiki: Toward smarter wikis
Interactive Knowledge Stack (IKS) is an open source project focused on building an open and flexible technology platform for semantically enhanced Content Management Systems (CMS). It is a collaboration between academia, industry, and open source developers, co-funded with €6.58 million by the European Union. The goal is to enrich content management systems with semantic content in order to let the users benefit from more intelligent extraction and linking of their content. This could solve part of the chicken-and-egg problem for the semantic web that arises because end users don't have easy-to-use semantic web tools.
At the recent IKS workshop in Paris, one of the keynote speakers was Mark Greaves, who spoke about the possibilities of the semantic web in the wiki setting. His speech looked at the limits of traditional wikis, the promise of semantic wikis, and the birth of Semantic MediaWiki (SMW), an extension to MediaWiki, the wiki software that powers Wikipedia.
Wikis have become a powerful instrument for crowdsourcing, but they're not the only types of content management systems that tap into the potential of the crowd. Greaves, who is working as Director of Knowledge Systems at Paul Allen's asset management company Vulcan, emphasized that bulletin boards, forums, and newsgroups are the antecedents of wikis and even the beginnings of social networks. Now we have many websites that crowdsource their content from their users:
A critical property of wikis is consensus, which comes from collaboration and custom policies. For instance, one of the core content policies of Wikipedia is that each article should be written from a neutral point of view (NPOV). This discourages authors from writing from their own point of view, which helps them reach consensus with authors who see the topic differently. The MediaWiki software also has features that facilitate reaching consensus, such as talk pages and change tracking.
But traditional wikis have their limits, as most knowledge is locked inside text and cannot be queried in a smart way. Wikipedia has an answer for this with thousands of lists, for instance lists of countries (which is itself a list of lists). But these are all manually maintained, each ordered by a different property, such as birth rate, literacy rate, population, or income equality. So Greaves asked the logical question: "Why don't we give Wikipedia authors a way to add structure to their content?"
Semantic MediaWiki
That's where semantic wikis come in, and according to Greaves they hold a lot of promise:
One project working to add semantics to wiki systems is Semantic MediaWiki (SMW), a GPL-licensed extension to MediaWiki that allows annotating semantic data within wiki pages. A MediaWiki installation that incorporates the extension becomes a semantic wiki: content that has been enriched with semantic information can be used in specialized searches, used for aggregation of pages, displayed in alternate formats like maps, calendars, or graphs, and exported to formats like RDF (Resource Description Framework) and CSV (Comma-Separated Values).
How does this work?
Some examples will make it clear what SMW adds. For instance, on the normal Wikipedia page of France, there's a link to its capital city, Paris:
... the capital city is [[Paris]] ...
The [[Paris]] code is a link to a wiki page about Paris, but there's no information encoded about the specific relationship between France and Paris.
In contrast to this classical approach, the semantic web is all about interlinking data in a machine-readable way. The core technology under the hood is RDF, which is used to describe entities and their relationships. Each RDF statement comes in the form of a triple: subject - predicate - object. Each subject and predicate is identified by a URI, while an object can be represented by a URI or be a literal value such as a string or a number. So, a Semantic MediaWiki version of the sentence about Paris could be:
... the capital city is [[Has capital::Paris]] ...
The [[Has capital::Paris]] code not only adds a link to a wiki page about Paris, it also specifies the nature of the relationship between France and Paris: France has Paris as its capital. Or to translate it into an RDF triple: "France" (which is implicit as it is the topic of the current page) is the subject, "has capital" is the predicate, and "Paris" is the object.
This is an example where the object can be represented by a URI, but there are also other examples where the object is represented by a literal value such as a number:
... its population is [[Has population::65,821,885]] ...
When this code is on the page about France, it represents an RDF triple with "France" as its subject, "has population" as its predicate, and "65,821,885" as its object. These typed links (with the predicate as the type) give SMW an out-of-the-box mechanism to automatically generate lists. With SMW's inline queries feature, it's easy to re-use this structured information to generate lists and tables which are automatically updated and cached. For instance, users can easily generate a page with a list of all countries ordered by their population, or a list of all countries with a population greater than 20 million, or a table of all countries with their capitals, and so on.
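To make the idea of typed links concrete, here is a minimal Python sketch of how annotations could be turned into triples and then queried. It is not how SMW itself is implemented: the regular expression, the function names `extract_triples` and `countries_by_population`, and the page "Examplestan" with its population figure are all made up for illustration. Only the France annotations come from the examples above.

```python
import re

# Matches annotations of the form [[Property::Value]]; plain links such as
# [[Paris]] contain no "::" and are ignored. This is a simplification and
# does not cover the full SMW annotation syntax.
ANNOTATION = re.compile(r"\[\[([^\[\]:|]+)::([^\[\]|]+)\]\]")


def extract_triples(page_title, wikitext):
    """Return one (subject, predicate, object) tuple per annotation."""
    return [
        (page_title, prop.strip(), value.strip())
        for prop, value in ANNOTATION.findall(wikitext)
    ]


def countries_by_population(triples, minimum=0):
    """List (country, population) pairs above a threshold, largest first."""
    rows = [
        (subject, int(value.replace(",", "")))
        for subject, prop, value in triples
        if prop == "Has population"
    ]
    return sorted(
        (row for row in rows if row[1] > minimum),
        key=lambda row: row[1],
        reverse=True,
    )


# Hypothetical page contents; "Examplestan" and its figure are invented.
pages = {
    "France": "... the capital city is [[Has capital::Paris]] ... "
              "its population is [[Has population::65,821,885]] ...",
    "Examplestan": "... its population is [[Has population::1,000,000]] ...",
}

triples = [t for title, text in pages.items() for t in extract_triples(title, text)]
print(triples[0])
print(countries_by_population(triples, minimum=20_000_000))
```

The page title plays the role of the implicit subject, and the second function mimics a "population greater than 20 million" list. In a real wiki, SMW's inline queries would do this work on the stored data rather than by re-parsing page text.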
Automatically-generated lists are not the only possibility when you start adding semantic links. You can also display the information in various formats, you can have different language versions of a wiki using the same data, you can integrate and mash-up your wiki's data and export it for external re-use, and more.
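As a rough illustration of exporting data for external re-use, the following Python sketch writes a handful of triples out as CSV. The column names and the function `triples_to_csv` are assumptions chosen for this example; they are not the layout that SMW's own export produces.

```python
import csv
import io


def triples_to_csv(triples):
    """Serialize (subject, predicate, object) tuples as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["subject", "predicate", "object"])
    writer.writerows(triples)
    return buffer.getvalue()


# The two statements from the France examples, written out by hand.
triples = [
    ("France", "Has capital", "Paris"),
    ("France", "Has population", "65,821,885"),
]
print(triples_to_csv(triples))
```

Because the population value contains commas, the `csv` module quotes that field automatically, so the result can be loaded into a spreadsheet or another tool without ambiguity.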
Ecosystem
The development of Semantic MediaWiki was initially funded by the EU project SEKT (Semantically Enabled Knowledge Technologies), and after this supported in part by the University of Karlsruhe in Germany. The first release was version 0.1 in 2005. In 2007, Vulcan started sponsoring the German company Ontoprise to develop a commercial version of the extension, Semantic MediaWiki+ (SMW+).
According to Greaves, there are 50 open source MediaWiki extensions that use the semantic information provided by SMW. Examples include Halo, funded by Vulcan, which facilitates the creation, retrieval, navigation, and organization of semantic data through intuitive graphical user interfaces; Semantic Drilldown, which provides a faceted browser interface for viewing semantic data by filtering; and Semantic Result Formats, which provides a large number of display formats, including maps, calendars, graphs, and charts.
If you want to install SMW on your own wiki, there's an extensive administrator manual with installation instructions and a list of the configuration options. For users who will be entering the semantic markup, the project also has a user manual.
Some semantic wikis
Semantic MediaWiki is already used by over 300 public active wikis around the world. Greaves called these semantic wiki applications "the icing on the cake", as they really show the flexibility of adding semantics to a wiki. Some notable examples are Open Energy Information, a crowdsourced wiki with information about energy resources, including real-time data and visualizations; SKYbrary, a wiki created by several European aviation organizations to create a comprehensive source of aviation safety information; Familypedia, a wiki on family history and genealogy; SNPedia, a wiki investigating human genetics; Oh Internet, a wiki to track internet memes; and Ultrapedia, a search engine for OCR'd books.
Many organizations also use SMW internally, including Pfizer, Johnson & Johnson Pharmaceutical Research and Development, and the U.S. Department of Defense. Greaves added that Vulcan is eating its own dog food:
Towards a semantic Wikipedia
Some academics have already proposed using SMW on Wikipedia to tackle the problem of the many lists that have to be created manually, but according to Wikimedia Foundation Deputy Director Erik Möller it's still unclear whether SMW is up to the task of supporting a web site on the scale of Wikipedia. So while Semantic MediaWiki already powers a lot of web sites and is quite user-friendly, it remains to be seen whether it will eventually bring semantics to the ultimate wiki, Wikipedia.
The SMW project has a fairly detailed roadmap. Some of the interesting tasks are an improvement of the usability of the semantic search features (part of Google Summer of Code 2011), a light version of SMW without query capabilities, improvements for the Semantic Drilldown extension, and so on. It's already quite usable, as many of the active SMW wikis show, but to really reach the vision of the semantic web and be able to link various semantic wikis and other content management systems, Semantic MediaWiki needs to become as easy to use as Wikipedia.
| Index entries for this article | |
|---|---|
| GuestArticles | Vervloesem, Koen |
