Questioning the Semantic Web's history

January 22nd, 2009, By talk

In the beginning Tim created the Web, and the Web was Semantic, and the Web was good…

I’ll go out on a limb and say that the Semantic Web precedes the World Wide Web. Perhaps it’s more accurate to say that the Semantic Web is the original vision of the web, as can be seen in this diagram from the original proposal of the WWW created in 1989 by Sir Tim Berners-Lee, the British scientist credited with inventing it:

Tim Berners-Lee originally envisioned a semantic web

In hindsight we all know the early web didn’t quite evolve the way Berners-Lee envisioned it. Never one to despair, Sir Tim held on to his vision and continued to publish materials and make statements regarding the evolution of the Semantic Web. In 1998 he started defining a road map for the semantic web, and in 1999 he is quoted as saying:

“I have a dream for the Web in which computers become capable of analyzing all the data on the Web – the content, links, and transactions between people and computers. A ‘Semantic Web’, which should make this possible, has yet to emerge, but when it does, the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines. The ‘intelligent agents’ people have touted for ages will finally materialize.”

His position as director of the World Wide Web Consortium, which oversees the Web’s development (a position he continues to hold to this day alongside many others no less prestigious) meant that Berners-Lee was uniquely positioned to help transform his vision into a reality.

The Semantic Web evolves

The early years of the millennium saw increased activity by the W3C to promote and advance the Semantic Web. After securing generous funding from the EU and other sources, the W3C launched numerous workshops, events, and projects and provided backing for research into the Semantic Web. The result of these efforts was a veritable smorgasbord of specifications and guidelines which were meant to be developed into the principal technologies of the Semantic Web. The current components are:

  1. The Resource Description Framework (RDF) Core Model.
  2. The RDF Schema language.
  3. The Web Ontology Language (OWL).
  4. SPARQL – The standardized query language for RDF, which enables joining decentralized collections of RDF data.
  5. The GRDDL Recommendation meant to create bridges between the RDF model and various XML formats, like XHTML.
  6. POWDER, which is not a specification but rather a working group that develops technologies to find resource descriptions for specific resources on the Web that can be joined to other RDF data.
  7. The SKOS model – an RDF vocabulary for expressing the basic structure and content of concept schemes (thesauri, classification schemes, subject heading lists, taxonomies, ‘folksonomies’, etc.).
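To make the list above a little more concrete, here is a minimal sketch of the idea behind RDF and SPARQL: data is stored as subject-predicate-object triples, and a query is a pattern in which some positions are left open. This is an illustration only. It uses plain Python tuples rather than a real RDF library or real SPARQL syntax, and the `ex:` names and the `match` helper are hypothetical, invented for this example.

```python
# Illustrative only: plain tuples stand in for RDF triples; no real RDF library is used.
# All identifiers below (the "ex:" names and match) are made up for this sketch.

triples = [
    ("ex:TimBernersLee", "ex:invented", "ex:WorldWideWeb"),
    ("ex:TimBernersLee", "ex:directs", "ex:W3C"),
    ("ex:W3C", "ex:publishes", "ex:RDF"),
    ("ex:W3C", "ex:publishes", "ex:SPARQL"),
]


def match(pattern, data):
    """Return every triple matching the pattern; None acts as a wildcard."""
    return [
        t for t in data
        if all(p is None or p == v for p, v in zip(pattern, t))
    ]


# "What does the W3C publish?" expressed as a pattern with an open object slot.
published = [obj for _, _, obj in match(("ex:W3C", "ex:publishes", None), triples)]
print(published)  # ['ex:RDF', 'ex:SPARQL']
```

A real SPARQL query does roughly the same thing with variables in place of `None`, and it can join patterns across data sets published by different parties.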

The activities of all these groups are documented and, as of 2006, can be viewed on the W3C’s aptly named “Semantic Web Activity News”.

The Semantic Web - the Layer cake as of 2007

The Semantic Web Paradigm

When faced with this impressive body of work, which spans nearly a decade, it is difficult to avoid asking one simple question:

How come, after nearly a decade of work by some of the brightest minds on the planet, we, the countless masses who browse the web daily, are still largely unaware of the Semantic Web and have yet to experience the promise it holds?

The paradigm explained and resolved

The way I see it, the evolution of a Semantic Web, even as defined by the work done under the auspices of the W3C, is largely dependent on two factors:

Condition 1 – The existence of LARGE amounts of data online - For computers to ‘do the work for us’ and provide us with the boons that Berners-Lee envisioned for the semantic web, they must have the data required available to them. During most of the time that has passed since Berners-Lee published his vision, not only was the required data missing, it was also unclear who would collect, structure, validate and publish it.

Resolution – The explosion of the ‘Web 2.0’ phenomenon, starting with Wikis in 2000 and later evolving into the huge variety of social networking sites we enjoy today, resolved this issue. It was us, all of us, who through our massive engagement with countless dedicated social networks provided, and continue to provide, the Semantic Web with the data it requires to function.

Condition 2 – The structuring of data in formats that are understandable by machines - As the listing and diagram above clearly show, much of the work the W3C has done is related to resolving this issue. However, the simple fact is that it has largely been ignored by developers and commercial enterprises, and not without cause:

  1. Many of the formats developed by the W3C are inconvenient to work with, and implementing them is time-consuming.
  2. The commercial enterprises, web developers and countless individuals involved in the day-to-day work of building and expanding the web tend to resist formats dictated from above. The Web, by the very nature of the structure of the Internet as a net of connected yet independent computers, is an anarchic medium.
  3. Despite all the time that has passed, the W3C still has a great deal of work to do before the semantic web formats it advocates are structured and defined enough to be ready for wide-scale commercial use.

Resolution – Nature and business both abhor a vacuum. While the W3C sit and deliberate, enterprise has de facto provided the means for resolving the 2nd condition required for the Semantic Web via the APIs published by a rapidly growing number of web services. An API, by its very definition, is a method for one web-service/computer to speak to another using predefined structured calls.

The logic driving this mushrooming of “data-giveaways” is fourfold:

  1. APIs serve as splendid ecosystems, presenting an opportunity for an industry runner-up to harass the leader of the pack.
  2. APIs allow web services that sell products to automate and thus greatly enhance their affiliate business.
  3. There is a growing demand from developers for APIs and a demand is always met. Eventually…
  4. Paid APIs present an opportunity for additional direct revenue.
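The notion of an API as “predefined structured calls” between machines can be sketched in a few lines. The example below is a hypothetical illustration, not any real service’s API: the `CATALOGUE` data, the price, the `handle_request` function and the request format are all made up, and the “service” runs in memory instead of over the network. It only shows the shape of the exchange, where one program sends a structured request and another returns a structured reply that needs no human to interpret it.

```python
import json

# Hypothetical in-memory "catalogue service"; it stands in for a real web API.
# The entry and its price are invented for illustration.
CATALOGUE = {"1001": {"title": "Weaving the Web", "price": 15.0}}


def handle_request(raw_request: str) -> str:
    """Answer a JSON request of the form {"action": "lookup", "id": "..."}."""
    request = json.loads(raw_request)
    if request.get("action") != "lookup":
        return json.dumps({"status": "error", "reason": "unknown action"})
    item = CATALOGUE.get(request.get("id"))
    if item is None:
        return json.dumps({"status": "error", "reason": "not found"})
    return json.dumps({"status": "ok", "item": item})


# The "client" side: another program builds the call and reads the structured reply.
reply = json.loads(handle_request(json.dumps({"action": "lookup", "id": "1001"})))
print(reply["item"]["title"])  # Weaving the Web
```

Because both sides agree on the format in advance, an affiliate site or a competitor’s mash-up could consume such replies automatically, which is the point the list above makes.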

What’s coming next?

By now (January 2009) product and download based web-services like Amazon and iTunes have made it obvious that there is money to be made on the semantic web, and even though less product-oriented services are still struggling to monetize the data they’ve aggregated, few would argue that this data is worthless. It remains to be seen how user trends, coupled with the laws of economics, will continue to shape the evolution of the web into the Semantic Web it was originally meant to be.