Visualize locations mined from the latest news

February 3rd, 2011, By eitanb

Check out this quick way to extract places from the news and put them on a map.

Locations mentioned in the news regarding Natalie Portman:

Locations mentioned in the news regarding Global Warming:

Locations mentioned in the news regarding Weapons of Mass Destruction:

This is extremely easy to do for ANY collection of words, using a query like this one:
http://api.headup.com/v1?raw=true&q=Wikileaks/yahooboss:allabstractstext/x:entity/aspage("map")

If we break the query apart, here’s what we get:

  • ANY_COLLECTION_OF_WORDS_YOU_LIKE/yahooboss:allabstractstext – fetches the text of abstracts from Yahoo! BOSS for that collection of words.
  • /x:entity – extracts the entities from this text, using SemantiNet’s entity extraction engine.
  • /aspage("map") – renders the entities on a map. The API tries to infer the location of each entity and places a marker for it on a simple Google Map.
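If you'd rather build these URLs from a script than type them by hand, here is a minimal Python sketch that glues the three steps together. The endpoint and the path steps come from the query above; the function name and the choice to percent-encode only spaces and quotes are our own assumptions, and we have not checked how the API treats encoded characters.

```python
from urllib.parse import quote, urlencode

API_BASE = "http://api.headup.com/v1"  # endpoint as shown in the post


def build_map_query_url(topic: str) -> str:
    """Join the three path steps into one request URL (hypothetical helper)."""
    steps = [
        topic,                         # any collection of words
        "yahooboss:allabstractstext",  # text of abstracts for the topic
        "x:entity",                    # entity extraction
        'aspage("map")',               # render the entities on a map
    ]
    params = {"raw": "true", "q": "/".join(steps)}
    # Assumption: slashes, colons and parentheses may stay unescaped,
    # while spaces and quotes are percent-encoded.
    return API_BASE + "?" + urlencode(params, safe="/:()", quote_via=quote)


print(build_map_query_url("global warming"))
```

Swapping the topic string is all it takes to map a different set of words.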

We’ve quickly devised another template, to see the text right by the map, for easy playing around:

http://api.headup.com/v1?raw=true&q=summer olympics/aspage("example/text_and_entities_on_map_1.html")

As can be seen – the extracted entities from the news are sometimes only remotely related to the topic itself. Still, it’s quite fun for playing around :)


Keith Richards’ guitar gallery in 4 lines of code

January 24th, 2011, By eitanb

The first line of “code” is this query, which returns a list of Keith Richards’s guitars (click to check it out):
http://api.headup.com/v1?raw=true&q=Keith Richards/popularmeaning/`instrument`/render("videolist.html")
Let’s break the query down into its parts:

  • Keith Richards/popularmeaning – gives us the URI (a unique ID) for Keith Richards, dbpedia:Keith_Richards.
  • `instrument` – a fuzzy match of the free-form text “instrument” against the predicates of dbpedia:Keith_Richards. We get a list of the instruments that Keith played.
  • render("videolist.html") – renders this list of instruments using a template (that we’ve prepared in advance) called videolist.html.
This is actually a more “fuzzy” way of querying the graph. The strict way of querying it would have been:
http://api.headup.com/v1?raw=true&q=dbpedia:Keith_Richards/dbpedia-owl:instrument/render("videolist.html")
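To get a feel for the difference between the fuzzy and the strict form, here is a small Python sketch of one way free-form text could be resolved to a strict predicate. It is only an illustration: we are not describing how the API does its matching, the predicate list is a made-up sample, and the helper name is hypothetical. It uses difflib from the standard library.

```python
from difflib import get_close_matches

# Hypothetical sample of predicates attached to dbpedia:Keith_Richards.
PREDICATES = [
    "dbpedia-owl:instrument",
    "dbpedia-owl:birthPlace",
    "dbpedia-owl:genre",
]


def resolve_predicate(free_text, predicates):
    """Return the predicate whose local name is closest to free_text, or None."""
    # Compare against the part after the prefix, ignoring case.
    by_local_name = {p.split(":", 1)[1].lower(): p for p in predicates}
    matches = get_close_matches(
        free_text.lower(), list(by_local_name), n=1, cutoff=0.6
    )
    return by_local_name[matches[0]] if matches else None


print(resolve_predicate("instruments", PREDICATES))
```

Text that resembles no predicate name gives None instead of a guess, which is roughly the trade-off between quick discovery and strict querying.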

The rest of the code resides inside the videolist.html template. Let’s have a look inside:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title>SemantiNet's Video Portal</title>
  <link rel="stylesheet" href="http://www.blueprintcss.org/blueprint/screen.css">
  <style type="text/css" media="screen">
  body {background-color: #FAFAFF}
  </style>
</head>
<body>
 <div class="container">
 <div class="span-24">
  <table style="font-family: Consolas; font-size: x-small; background-color: white" class="span-24">
   <%foreach ./take(20)%>
  <tr width="100%">
   <td>
    <%select keytermsforquery/youtube:getplayers/first%>
   <iframe title="YouTube video player" class="youtube-player" type="text/html" width="300" height="230" src="http://www.youtube.com/embed/<%= f:v1/split('/')/*/at(4)%/>" frameborder="0">
   </iframe>
    </%select%>
   </td>
   <td>
   <h2><%= label%/></h2>
    <%= abstract/str:unescapeunicode%/>
   </td>
  </tr>
   </%foreach%>
  </table>
 </div>
 </div>
</body>
</html>

Most of the template is made of simple HTML markup. The query lines are:

  • keytermsforquery/youtube:getplayers/first
    • keytermsforquery – query refinements for calls to search APIs (YouTube in this case); it adds semantic “cues” to the API query to get more accurate results.
    • youtube:getplayers – gets YouTube players for the entities.
    • first – takes only the first video; we want one for each instrument.
  • label – brings a nice readable name.
  • abstract/str:unescapeunicode – brings the Wikipedia abstract – and removes Unicode escaping.
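The iframe line in the template also contains f:v1/split('/')/*/at(4), which is not in the list above. Our reading is that it splits the player URL on “/” and takes one piece as the video ID. Here is a Python sketch of that reading, assuming the player URL looks like http://www.youtube.com/v/VIDEO_ID and that at(4) counts from zero; both are assumptions, and the function names are ours.

```python
EMBED_PREFIX = "http://www.youtube.com/embed/"  # prefix used in the template


def video_id_from_player_url(player_url: str) -> str:
    """Mimic split('/') followed by at(4), assuming zero-based indexing."""
    pieces = player_url.split("/")
    # "http://host/v/ID" splits into ['http:', '', 'host', 'v', 'ID'].
    return pieces[4]


def embed_url(player_url: str) -> str:
    """Build the iframe src the same way the template concatenates it."""
    return EMBED_PREFIX + video_id_from_player_url(player_url)


print(embed_url("http://www.youtube.com/v/abc123"))
```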
In the same way, we can display a nice YouTube gallery of any list we’d like.

Queries of the day – some simple ones to start with

January 19th, 2011, By eitanb

Here are some quick ways to query “World Knowledge” information using SemantiNet’s API, in a very basic way.

Let’s break it down a bit:

  1. dbpedia:New_Zealand – brings the node in the graph that represents New Zealand. How did we know to put an “_”? DBPedia matches Wikipedia’s URLs, and New Zealand’s Wikipedia entry resides under http://en.wikipedia.org/wiki/New_Zealand. We used Google to find how New Zealand appears in Wikipedia.
  2. We added a “label” to it, in order to get a nice representation of it.
  3. dbpedia-owl:capital – gets New Zealand’s capital.
  4. latlong – gets the coordinates of the capital.
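One way to picture the steps above is as a walk over a graph, where each slash-separated step follows one edge. The toy Python model below shows the idea. The graph is a tiny hand-made sample with approximate coordinates, and the follow helper is hypothetical; it is a sketch, not the API’s implementation.

```python
# Toy graph: node -> {predicate: value}. Sample data for illustration only.
GRAPH = {
    "dbpedia:New_Zealand": {
        "label": "New Zealand",
        "dbpedia-owl:capital": "dbpedia:Wellington",
    },
    "dbpedia:Wellington": {
        "label": "Wellington",
        "latlong": (-41.29, 174.78),  # approximate coordinates
    },
}


def follow(graph, start, *steps):
    """Walk from start through each step, like a slash-separated query path."""
    current = start
    for step in steps:
        current = graph[current][step]
    return current


print(follow(GRAPH, "dbpedia:New_Zealand", "label"))
print(follow(GRAPH, "dbpedia:New_Zealand", "dbpedia-owl:capital", "label"))
print(follow(GRAPH, "dbpedia:New_Zealand", "dbpedia-owl:capital", "latlong"))
```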

Of course, we can run similar ones:

Note that Linked Data (including Wikipedia) contains over 1 billion(!) facts, so these simple queries actually give access to quite a lot of data.

Want to play around? Give it a spin: http://www.headup.com/playground.php

Or, check out the documentation in the wiki: http://wiki.headup.com/index.php?title=Knowledge_Graph_API


Introducing SemantiNet’s API

January 16th, 2011, By eitanb

We’re excited to release an alpha version of our new API. This is the first in a series of blog posts describing ways you can rapidly build impressive data-intensive applications.
SemantiNet’s new API is designed for easy querying of a selected collection of useful Web Services, Wikipedia, Linked-Data, and the unstructured web. The API also provides a flexible templating language, for easy creation of semantic mashups and data-intensive applications, directly from your browser.

To see what we mean, let’s start with a very simple DBPedia query (click the link to see this query in the playground):
The API returns a node that represents Bar Refaeli in DBPedia, with some of the data that is connected to this node. Quite simple so far. So, let’s devise a template that takes this node and presents information from it as HTML:
<html>
  <body>
    <!-- label provides a nice representation of the node's name -->
    <h1><%= label%/></h1>
    <!-- personage calculates the age, based on information we have from the birth-date and the current time -->
    Age: <%= personage/round(1)%/><br/>
    <!-- We want a nice representation of the birthPlace, so we take dbpedia-owl:birthPlace/label -->
    Born in: <%= dbpedia-owl:birthPlace/label%/><br/>
    <!-- Take the first image returned from YahooBoss's images search -->
    <img src="<%= yahooboss:images/first%/>" width="100px">
  </body>
</html>
Click here to see it live in the playground
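The personage/round(1) step in the template turns a birth date into an age with one decimal. A rough Python equivalent is sketched below. The formula (days divided by 365.25) is our assumption about what such a step might do, not something the API documents here, and the function name is ours.

```python
from datetime import date


def person_age(birth_date, today=None, digits=1):
    """Age in years, rounded, based on the birth date and the current date."""
    today = today or date.today()
    days_alive = (today - birth_date).days
    # Assumption: an average year length of 365.25 days is precise enough.
    return round(days_alive / 365.25, digits)


# Example with a made-up birth date and a fixed "today" for repeatability.
print(person_age(date(1985, 1, 1), today=date(2011, 1, 1)))
```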


What’s returned from a couple of examples:

Nice. We’ve seen how to query both LinkedData (from DBPedia in this case) and the web (through Yahoo Boss) – to get the picture.
Just to get a feeling of what’s possible, let’s play with these models’ place of birth.
This query will return the 3 birthplaces:
/multy('dbpedia:Carolyn_Murphy','dbpedia:Marisa_Miller','dbpedia:Bar_Refaeli')/dbpedia-owl:birthPlace
Using this template – we will take the places, and put them on a map:
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <script src="http://maps.google.com/maps?file=api&amp;v=2&amp;key=ABQIAAAAkMzYJpqzT4X0Hj0W-xMFIhTkBdPb1_Y7shJWGA4g7zFU4DbUwRSRxPPUjb7uuS8U3pAZlGMUGn5Vww" type="text/javascript"></script>
  </head>
  <body>
    <div id="mapdivid" style="float: left; height: 100%; width: 100%"></div>
    <script type="text/javascript">
    if (GBrowserIsCompatible()) {
     var map = new GMap2(document.getElementById("mapdivid"));
     map.setCenter(new GLatLng(0, 0), 1);
     <%foreach .[location]%>
       map.addOverlay(new GMarker(new GLatLng(<%= /location/geo:lat%/>, <%= /location/geo:long%/>),{title:"<%= /label%/>"}));
     </%foreach%>
    }
    </script>
  </body>
</html>
Check it out “live” here. In this example, we iterate over the birthplaces using a ‘foreach’ directive, and for each place we take the location’s latitude, longitude and label.

The following query returns a list of female models, keeps only those whose height we have data for, and orders them by height, tallest first:
category:Female_models/deepinstances(3)[dbpedia-owl:height]/orderdesc(dbpedia-owl:height)
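If the bracket filter and orderdesc look unfamiliar, the same idea in plain Python is a filter followed by a descending sort, with an optional limit on how many items to keep. The sketch below runs on made-up sample records; the names and heights are placeholders, not data from the API.

```python
# Made-up sample records; one of them has no height on purpose.
MODELS = [
    {"label": "Model A", "height": 1.80},
    {"label": "Model B"},
    {"label": "Model C", "height": 1.75},
    {"label": "Model D", "height": 1.83},
]


def tallest(items, limit=10):
    """Keep items that have a height, sort tallest first, return the top ones."""
    with_height = [item for item in items if "height" in item]
    ranked = sorted(with_height, key=lambda item: item["height"], reverse=True)
    return ranked[:limit]


for model in tallest(MODELS):
    print(model["label"], model["height"])
```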
To show a table of the tallest 10 models, we’ll use this template:
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body>
    <table style="float: left; clear: both">
      <%foreach ./take(10)%>
        <tr>
          <td><img width='100px' src='<%= /yahooboss:images/first%/>'
          onerror='this.onerror = null; this.src="http://static1.headup.com/images/placeholder.jpg"' />
          </td>
          <td><%= /label%/><br/>
          <%= /dbpedia-owl:height%/>
          </td>
        </tr>
      </%foreach%>
    </table>
  </body>
</html>
Check it out in the playground here, or – for a more generic version of this table, check this out:

Scraping pages from across the web is quite sweet as well. Let’s take this page:
http://sportsillustrated.cnn.com/2010_swimsuit/models/
And now, let’s scrape the images from this page, using SemantiNet’s API:
/fetch('http://sportsillustrated.cnn.com/2010_swimsuit/models/')/htmlxpath('//div[@class="cnnIndexList"]//img/@src')/*
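For comparison, here is roughly what the XPath part of that query selects, written with nothing but Python’s standard library. The sketch parses an inline, made-up HTML snippet instead of fetching the real page, and the class name ImageSrcCollector is ours. It collects the src of every img nested inside a div whose class is exactly cnnIndexList.

```python
from html.parser import HTMLParser

# Made-up markup that stands in for the fetched page.
SAMPLE_HTML = """
<div class="header"><img src="logo.png"></div>
<div class="cnnIndexList">
  <div><img src="model_a.jpg"></div>
  <div><img src="model_b.jpg"/></div>
</div>
<img src="footer.png">
"""


class ImageSrcCollector(HTMLParser):
    """Collect img src values found inside <div class="target_class">."""

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.depth = 0     # greater than 0 while inside the target div
        self.sources = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "div":
            if self.depth > 0:
                self.depth += 1   # nested div inside the target
            elif attributes.get("class") == self.target_class:
                self.depth = 1    # entering the target div
        elif tag == "img" and self.depth > 0 and attributes.get("src"):
            self.sources.append(attributes["src"])

    def handle_endtag(self, tag):
        if tag == "div" and self.depth > 0:
            self.depth -= 1


collector = ImageSrcCollector("cnnIndexList")
collector.feed(SAMPLE_HTML)
print(collector.sources)
```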
This gets us a list of photos. And if we put them in this template:
<%foreach .%>
  <img src='<%= .%/>'/>
</%foreach%>
The result of this query is a scrape of the site’s images:


In the following blog posts we’ll write about more characteristics of SemantiNet’s API and language:

  • Extensible – anyone can easily extend this language (we internally call it CSlang), building on top of existing predicates
  • A rich NLP library for Entity extraction, contextual disambiguation, access to WordNet
  • Powerful first-class-citizens of the language – graph querying primitives, lambda calculus
  • Very short development feedback loop
  • Piping
  • Fuzzy ontology support – freeform queries allow quick trials and discovery
  • Inference rules one-liners
  • Enrich the ontology based on web-links
  • Data mining primitives – map-reduce, grouping, histograms, order-by, max on lists

Want to get started? Take a spin in the playground, and check out our wiki: http://wiki.headup.com/index.php?title=Knowledge_Graph_API
