Sunday, April 10, 2005

SW Enabled Website Builder

Definition snippet:

"...Semantic web will be formed by web pages comprehensible for both humans and machines..."

You may wonder why semantic annotation of web pages is so important. There are a number of reasons: powerful and precise searches, advanced (predicate-based) hyper navigation, and ultimate integration. Put simply, it is about using information incomparably more efficiently than we do today.

Consider this:

  • There are billions of pages on the Internet today that are not semantically annotated.

  • There are several proposals describing how to annotate web pages with semantic information.

  • Regrettably, (almost) nobody builds SW enabled web sites - mainly because there are no widely used tools enabling semantic annotation...

  • ... but the time is now! Start building the SW today!

Currently we are in a "transitional phase". Research teams all around the world are trying to improve existing methods, or invent new ones, for extracting semantic information from common web pages (documents). There are various data mining algorithms, and applications built on top of them, like Grokker. Their effort is meritorious since it enables processing of the billions of web pages that are already in place - unfortunately only to some extent. Since there is no explicit semantic hint, they can only guess what is going on.

In this post I would like to describe another aspect of this - what semantically enriched pages should contain, and what the tools suitable for creating them should look like. Every day thousands of pages are created that could be semantically enriched - if convenient tools were in place. Unfortunately there is nothing major...

For example, imagine an SW enabled DreamWeaver that would work like this:

  • On creation of a new page, both its embedded and external Dublin Core annotations would be created.

  • On creation of a link, it would be annotated in two ways. As usual you would specify the URL, but also the URN of the resource (object) whose representation is referenced by the target URL, and the predicate which the link represents.

  • Since DreamWeaver is WYSIWYG, you would be able to use (e.g. Eclipse-like) Ctrl-Enter to trigger auto-complete. Auto-complete would suggest appropriate resources using your personal as well as common semantic ontologies. Resources' labels (rdfs:label) and comments (rdfs:comment) would be used to build the WYSIWYG editor's search index.

    As you type the name of the resource, it would be compared with the labels/comments of known resources, so in the end there would be only a few resources to select from.

    Thus everybody would be able to build web pages with semantically enriched links.

  • The same applies to abbreviations, images, emphasized text snippets, etc.

  • Every embedded page component (like an image, applet or video) should also have a Dublin Core annotation.

  • ... and more ;-)

  • I'm now working on the XHTML & RDF Symbiosis post, which will describe the technical details of this feature along with a few examples. Expect it soon.
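
The auto-complete idea from the list above can be illustrated with a minimal Python sketch. It is only an assumption of how such a lookup might work, not a description of any existing tool: the Resource class, the suggest function, the ranking rule and the urn:example: identifiers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    uri: str      # identifier of the resource (hypothetical URNs below)
    label: str    # value of rdfs:label
    comment: str  # value of rdfs:comment


def suggest(typed: str, resources: list[Resource], limit: int = 5) -> list[Resource]:
    """Return resources whose label or comment matches the typed text.

    Label prefix matches rank first, then other label matches,
    then comment matches. Ties are ordered by label.
    """
    needle = typed.strip().lower()
    if not needle:
        return []
    ranked = []
    for res in resources:
        label = res.label.lower()
        if label.startswith(needle):
            rank = 0
        elif needle in label:
            rank = 1
        elif needle in res.comment.lower():
            rank = 2
        else:
            continue
        ranked.append((rank, res.label, res))
    ranked.sort(key=lambda item: item[:2])
    return [res for _, _, res in ranked[:limit]]


# A tiny made-up ontology, for illustration only.
ONTOLOGY = [
    Resource("urn:example:semantic-web", "Semantic Web",
             "Web of machine-readable data"),
    Resource("urn:example:semantic-annotation", "Semantic annotation",
             "Metadata attached to a page"),
    Resource("urn:example:dublin-core", "Dublin Core",
             "Vocabulary for semantic description of documents"),
]
```

Typing "sem" matches all three resources (two by label, one by comment), while typing "semantic a" leaves only the "Semantic annotation" resource, which is the narrowing behaviour described above.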

An important part of the Vodyanoi analysis describes the creation of such semantically enriched pages from MindRaider's notebooks. If you are curious how it will work, read on:

  • In MR you build your mind map, which is in fact a bunch of URI-identified Concepts.

  • Each Concept is identified by a (globally) unique URI.

  • Every URI has both an associated label (rdfs:label) and comment (rdfs:comment).

The Vodyanoi server will then serve XHTML pages enriched with embedded RDF, so every link will be annotated with the URI of the resource (e.g. the Concept it refers to) and the corresponding predicate.
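
To make the idea of an annotated link more concrete, here is a small Python sketch that builds such an XHTML anchor. The concrete markup is an assumption: the attribute names (rel for the predicate, resource for the resource URI) are loosely modelled on RDFa and are not necessarily the format Vodyanoi will use. The function name and the example URLs/URNs are hypothetical as well.

```python
from html import escape


def annotated_link(href: str, resource_uri: str, predicate: str, text: str) -> str:
    """Build an XHTML anchor carrying three pieces of information:

    href         - URL of the representation (the page to open)
    resource_uri - URI/URN of the resource the page represents
    predicate    - URI of the relation the link expresses
    """
    # Attribute layout is an assumption, see the paragraph above.
    return '<a href="{}" resource="{}" rel="{}">{}</a>'.format(
        escape(href, quote=True),
        escape(resource_uri, quote=True),
        escape(predicate, quote=True),
        escape(text),
    )


if __name__ == "__main__":
    print(annotated_link(
        "http://example.org/concepts/rdf.html",      # hypothetical URL
        "urn:example:concept:rdf",                   # hypothetical URN
        "http://purl.org/dc/elements/1.1/relation",  # Dublin Core "relation"
        "RDF & friends",
    ))
```

A browser would treat the result as an ordinary link, while a SW-aware agent could read the extra attributes to learn what the link is about.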

MindRaider's Concept annotation editor will be improved. While you are writing a Concept annotation, whenever the text matches the label of some Concept, the matching Concept will be linked directly to the annotation. You will also be able to reference URI-identified entities explicitly, e.g. I plan to integrate FOAF, which will serve as a social map/identity resource.

There is a lot of work ahead, but that is what the e-mentality project is about - original ideas validated by prototypes that are able to work even in the environment of the existing Internet, SW applications and associated technologies.
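
The label matching in the annotation editor could be sketched roughly as follows. This is a simplified assumption and not MindRaider code: the function name, the whole-word and case-insensitive matching rule, and the urn:example: URIs are hypothetical.

```python
import re


def find_concept_mentions(text: str, labels: dict[str, str]) -> list[tuple[int, int, str]]:
    """Return (start, end, uri) for every occurrence of a known label.

    labels maps a Concept label (rdfs:label) to the Concept URI.
    Matching is whole-word and case-insensitive.
    """
    mentions = []
    for label, uri in labels.items():
        # Lookarounds keep "RDF" from matching inside a longer word.
        pattern = r"(?<!\w)" + re.escape(label) + r"(?!\w)"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            mentions.append((match.start(), match.end(), uri))
    return sorted(mentions)


if __name__ == "__main__":
    known = {
        "RDF": "urn:example:concept:rdf",    # hypothetical Concept URIs
        "FOAF": "urn:example:concept:foaf",
    }
    for start, end, uri in find_concept_mentions(
            "FOAF describes people; RDF describes anything.", known):
        print(start, end, uri)
```

Each returned span could then be turned into a link to the matching Concept. Overlapping labels are not resolved in this sketch; a real editor would need a rule for that, e.g. preferring the longest match.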