Semantic Wikipedia: some issues

Just went to a talk by Denny Vrandecic, one of the people who developed the Semantic MediaWiki.
A little description:

Within only a few years, the free encyclopedia Wikipedia has become one of the most important online knowledge sources. The project “Semantic MediaWiki” engages in the conception and development of semantic extensions of MediaWiki – the software underlying Wikipedia. The goal is to enable simple, machine-based processing of Wiki-content by allowing users to provide suitable semantic annotations. However, the special Wiki environment and the multitude of envisaged applications impose a number of additional requirements.

The overall objective of the project is to develop a single solution for semantic annotation that fits the needs of most Wikimedia projects and still meets the Wiki-specific requirements of usability and performance. It is understood that ad hoc implementations (i.e. “hacks”) may sometimes solve single problems, but agreeing on common editing syntax, underlying technology, exchange formats, etc. bears huge advantages for all participants.

The importance and greatness of Wikipedia is not in question (12,000 hits per second, a million and a half articles in English alone… more statistics here). Making it “queryable” through a classification schema, i.e. an ontology (or more than one), sounds pretty useful, but I’d just like to lay down a couple of thoughts to inspire them and make their life harder :-)

  • what’s the issue with metadata consistency? We could choose a “lightweight” and pretty simple ontology, so as to reach an easy agreement between the parties involved (who are they, by the way? the whole lot of Wikipedia users?), but of course you’d like to get more from any knowledge modeling enterprise. So I guess there are serious consistency issues, both “internal” (since what’s needed is a powerful model that incorporates various subtle perspectives, in the form of classes and relations) and “external” (I guess people won’t agree easily on metadata, will they? So how to support or solve this problem?)
  • the classic Knowledge Acquisition problem: who will “tag” the wiki articles, and why? Is automatic KA an answer, maybe? Can an average Wikipedia user be bothered with levels of abstraction, and the manual hassle of adding parentheses and categories? Maybe not, but I guess quite a lot of hard-core Wikipedians would.
  • Reasoning: what is the added value then, beyond a simple string search or an inconsistency check? This is the interesting stuff, I believe: the whole of Wikipedia’s knowledge being reorganized depending on perspective.
  • Argumentation: I believe one of the strengths (if not the main one) of Wikipedia is the collaborative work behind it. And the collaboration is guaranteed by a solid (and simple) infrastructure which supports debating, arguing, and in general reaching consensus through interaction. Is this now totally forgotten? I think there’s loads of metadata to be extracted there, and one fundamental research question still unanswered: how do discourse semantics interact with and relate to content semantics? KMi’s work on discourse representation, mainly around the ScholOnto project, could be of great help here…

2 Responses to “Semantic Wikipedia: some issues”

Ciao Mikele,
thanks for your SecondThoughts. Being the chatterbox I am, I’d like to address them. I’d come over to your desk, but you left already (come on, it’s not even 7pm yet!) ;)

Metadata consistency: a major problem here is how to make a wiki-like user interface that lets people add constraints and the like. Wiki users don’t like being told “No, you can’t do that”; you would have to at least offer an explanation of why they can’t. So I guess this is a tough problem. Our solution? None: we just ignore anything that has more semantics than the subsumption relation, really.

KA bottleneck: we hope that people will enter it by hand. This is the advantage of the *many* people Wikipedia has. But you are right: we need to provide strong incentives. And we hope that the autogenerated lists really will provide such incentives. Would you rather annotate your country once and then create the country lists automatically, or update all the different country lists?
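To make the annotate-once idea concrete, here is a minimal Python sketch. It is not Semantic MediaWiki's actual annotation syntax or API: the `annotations` dictionary, the property names and the `list_by` helper are all hypothetical, made up purely for illustration.

```python
from collections import defaultdict

# Hypothetical annotations: each country article is annotated exactly once.
annotations = {
    "France": {"continent": "Europe", "capital": "Paris"},
    "Italy": {"continent": "Europe", "capital": "Rome"},
    "Japan": {"continent": "Asia", "capital": "Tokyo"},
}


def list_by(prop):
    """Group article titles by the value of one annotated property."""
    groups = defaultdict(list)
    # Iterate in title order so each generated list comes out sorted.
    for title, props in sorted(annotations.items()):
        if prop in props:
            groups[props[prop]].append(title)
    return dict(groups)


# Every "countries by X" list is derived from the same annotations,
# so nobody has to maintain the lists by hand.
countries_by_continent = list_by("continent")
countries_by_capital = list_by("capital")
```

Correcting one annotation in `annotations` changes every derived list the next time it is generated, which is the incentive described above.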

Reasoning: we don’t need reasoning for many of the advantages we are aiming for. Having a structured knowledge base can offer you a lot already. But the idea of reorganizing is — wow — really cool! Never thought about that. In that case, I guess, the reasoning happens offline, on your machine? So you tell your personal Wikipediabrowser “Hey, listen, this is my point of view on the world (aka ontology), now organize your stuff according to that”. Sounds very cool.
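A rough sketch of what such a personal browser could do, assuming the only semantics available is subsumption (category A is a kind of category B). The article categories, the `my_ontology` mapping and both helper functions are hypothetical; nothing here is taken from the actual Semantic MediaWiki code.

```python
# Hypothetical annotations shared by everyone: article title -> category.
article_category = {
    "Paris": "City",
    "France": "Country",
    "Danube": "River",
}

# One reader's personal ontology, held locally: category -> parent category.
my_ontology = {
    "City": "Settlement",
    "Settlement": "Place",
    "Country": "Place",
    "River": "Natural feature",
}


def ancestors(category, ontology):
    """Return the category followed by everything that subsumes it."""
    chain = [category]
    seen = {category}
    while chain[-1] in ontology:
        parent = ontology[chain[-1]]
        if parent in seen:  # guard against accidental cycles
            break
        chain.append(parent)
        seen.add(parent)
    return chain


def articles_under(top, ontology):
    """List the articles whose category is subsumed by `top`."""
    return sorted(
        title
        for title, category in article_category.items()
        if top in ancestors(category, ontology)
    )
```

Swapping in a different `my_ontology` regroups the same articles without touching the shared annotations, so the reasoning can indeed stay on the reader's machine.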

Argumentation: You are right :) And I need to talk to you about this one. Especially since the whole Semantic MediaWiki was actually inspired by argumentation-related work done in the SEKT project.

Hope to catch you tomorrow!
denny

denny added these pithy words on Oct 23 06 at 5:58 pm

Thanks Denny, I’ve actually been playing around with the SemWiki quite a lot over the last two days, and I must say that as a Knowledge Acquisition tool, it has a lot of value and potential. Keep going!

Mikele added these pithy words on Oct 26 06 at 7:07 pm