Disambiguation

May 30th, 2010, By talk

Which is Which?

As you probably know by now, Headup specializes in understanding the meaning of text in web pages, extracting the important objects, and bringing complementary content about them from around the web.  One of the most important requirements in carrying out this process is correctly identifying the meaning of words on the page, especially those that have more than one meaning.  If you fail to do this, you can fetch a lot of high-quality, real-time and personalized information – about the wrong topic…

In technical jargon, identifying the correct meaning of words is called “Word Sense Disambiguation”.  Or in the words of Led Zeppelin (from “Stairway to Heaven”):

There’s a sign on the wall

But she wants to be sure

‘Cause you know sometimes words

Have two meanings

Some words have different meanings depending on their context.  For example, the word “Apple” can mean the fruit, the technology company, the record label or the band.  “John Mack” can refer to the Chairman of Morgan Stanley, the musician, or the psychiatrist who specialized in alien abduction experiences.  The term “Enterprise” can mean a company, a city, a ship, a starship, a space shuttle, and much more.

On the other hand, the same term can appear in the text in many different formats.  For example, if you write a blog post that mentions Barack Obama, you might refer to him as Barack Obama, President Obama, the President, the U.S. President, President of the U.S.A., Barack, Obama, Mr. Obama, etc.  All of these terms refer to the same person, so an automated system seeking to understand the text should resolve all of them to the same entity.
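To make the idea concrete, here is a minimal sketch of resolving several surface forms to one entity.  The alias table, the entity ID format and the resolve function are all assumptions made up for illustration; this is not Headup's actual implementation.

```python
# Hypothetical alias table: every surface form maps to one canonical entity ID.
# The IDs and the table itself are illustrative, not Headup's actual data.
ALIASES = {
    "barack obama": "person/barack_obama",
    "president obama": "person/barack_obama",
    "the u.s. president": "person/barack_obama",
    "mr. obama": "person/barack_obama",
    "obama": "person/barack_obama",
}


def resolve(mention):
    """Return the canonical entity ID for a mention, or None if unknown."""
    key = " ".join(mention.lower().split())  # normalize case and whitespace
    return ALIASES.get(key)


mentions = ["Barack Obama", "President  Obama", "Mr. Obama", "Obama"]
entities = {resolve(m) for m in mentions}  # a single entity ID for all four
```

A real system would also need context to decide whether a bare "the President" refers to this entity at all, which is where disambiguation comes in.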

There are various approaches to word sense disambiguation.  Some rely on the statistics of surrounding words; others require a training stage utilizing large pieces of text in which the meaning of words has been marked manually.  Headup’s approach to disambiguation is based on its knowledge graph, the ever-expanding collection of topics, attributes, and semantic relationships between them.  Combining information derived from the knowledge graph with analysis of syntax (the way words are combined into sentences) enables Headup to reach a very high rate of precision (the percentage of terms that are correctly identified) in its disambiguation process.
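As a rough illustration of graph-based disambiguation, the sketch below scores each candidate sense of “Apple” by how many of its related topics also appear in the surrounding text.  The toy graph, the overlap scoring and the function name are assumptions for this example only; Headup's actual method also uses syntax analysis and is not described in enough detail here to reproduce.

```python
import re

# Toy knowledge graph (hypothetical): each candidate sense of "Apple"
# lists a few related topics.
KNOWLEDGE_GRAPH = {
    "Apple (fruit)": {"orchard", "pie", "tree", "juice"},
    "Apple Inc.": {"iphone", "mac", "ipad", "software"},
    "Apple Records": {"beatles", "label", "album", "records"},
}


def disambiguate(context, candidates):
    """Pick the candidate whose related topics overlap most with the context."""
    words = set(re.findall(r"[a-z0-9]+", context.lower()))
    scores = {name: len(related & words) for name, related in candidates.items()}
    best = max(scores, key=scores.get)
    # With no overlapping topic at all, refuse to guess.
    return best if scores[best] > 0 else None


text = "Apple unveiled a new iPhone and updated its Mac software today."
sense = disambiguate(text, KNOWLEDGE_GRAPH)
```

Returning None when nothing overlaps favors precision over coverage: the sketch would rather skip a term than attach content about the wrong topic.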

The examples below show how Headup can correctly identify terms that appear in plain text, even when the term has more than one meaning, or appears only partially in the text.  All of the examples are based on actual posts in blogs that are using Headup.

First, let’s take a look at an example from the film blog “HeyUGuys”.  Here you can see how Headup correctly identifies the word “Abrams” as referring to the writer and producer J. J. Abrams.

Disambiguation Image

And here’s another example: It is typical in gossip media to refer to celebrities using their first name only, to induce a sense of familiarity.  In the post below, taken from the blog “HitDanBack”, singer Mariah Carey is referred to only as “Mariah”.  This doesn’t stop Headup from correctly identifying her, based on understanding the topic of the blog post and the context in which the name appears.

Disambiguation Image

And in the final example below, you can see how Headup interprets the term “European Championship” in an article from the blog Jewlicious.  This term can mean a championship in any sport, but based on the context of the article and related terms that are identified in the text, Headup correctly interprets “European Championship” as referring to the European Figure Skating Championship.

Disambiguation Image

If you want to see more examples of Headup in action, visit www.headup.com and explore the various blogs that are already using Headup.  You can also test drive the engine for yourself in our Entity Extraction Playground.  Enjoy!

Smart Search Finding Things in Groups

May 16th, 2010, By talk

Searching for stuff is sometimes tough.  If you know what you’re looking for, and you phrase your search term just right, then you usually get good results.  But if not, you’re in big trouble, doomed to endless sifting through the results, page by page, until you find the thing that you were really looking for.

Search engines are good at finding terms, expressions, and pieces of text.  But that’s where their world ends: They don’t understand the meaning of the text they are searching for, and they know nothing about objects, entities or relationships.  In addition, they are not designed to find stuff in groups, but search for a single object each time.

For example, let’s say you are interested in seeing video clips of songs from the Dire Straits album “Brothers in Arms”.  If you search for “Dire Straits Brothers in Arms Album” on YouTube, you will get many links to video clips of the song “Brothers in Arms”, and some links to other songs in that album (if the album name appears in the clip description).  If you are lucky, you’ll get a link to a playlist called “Dire Straits Brother in Arms Album” prepared by some user who manually searched for these tracks by name.

YouTube search results for "Dire Straits Brothers in Arms Album"

But now look what happens if you execute the same query through Headup: Headup automatically digs into its database to find the tracks in the album, and searches for specific video clips of these tracks.  Then, it returns a nice “video wall” where each thumbnail links to a different track in the “Brothers in Arms” album.  The key here is that Headup “knows” what an album is, associates it with its tracks, and is smart enough to understand that YouTube hosts mainly videos of tracks, not full albums.  This type of reasoning and “smart search” implementation is way beyond the power of other “topic search” engines that do nothing more than search forwarding.
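The album-to-tracks expansion can be sketched roughly as follows.  The lookup table stands in for the knowledge graph and lists only a few of the album's tracks; the table, the function name and the query format are assumptions for illustration, not Headup's actual code.

```python
# Hypothetical album-to-tracks table standing in for the knowledge graph.
# Only a few tracks are listed to keep the example short.
ALBUM_TRACKS = {
    ("Dire Straits", "Brothers in Arms"): [
        "So Far Away",
        "Money for Nothing",
        "Walk of Life",
        "Brothers in Arms",
    ],
}


def expand_album_query(artist, album):
    """Turn one album-level query into one video query per track."""
    tracks = ALBUM_TRACKS.get((artist, album), [])
    return [f"{artist} {track}" for track in tracks]


queries = expand_album_query("Dire Straits", "Brothers in Arms")
# Each query would then be sent to the video site separately,
# and the top result for each one becomes a thumbnail on the "video wall".
```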

Headup video results for "Dire Straits Brothers in Arms Album"

Let’s take another example.  What if you are searching for a certain type of product by a certain brand – such as Samsung LED-backlit LCD TVs, or Sony Flash-based HD camcorders?  If you try these search terms in a regular search engine, you will get scattered results of news announcements, product reviews, and maybe a link to a specific product page.  But you’ll never get a list of actual TVs or camcorders that match these criteria, since search engines can only search for the text you supplied, but don’t understand it.

When such a search is conducted through Headup, it queries its knowledge graph for items that match the requested criteria.  Since in Headup objects have meaning, properties and relations to other objects, it is quite easy to go through all the “Products” by the “Company” Sony, find the “Camcorder” type products, and filter only those items that have “Memory Type” equal to “Flash” and “Resolution” equal to “HD”.  So executing such a query through Headup may result, for example, in a neat list of links to specific product pages, which may include media reviews, user reviews and price comparison with purchasing links.
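Here is a minimal sketch of that kind of property-based filtering.  The product records, the property names and the find_products helper are hypothetical and made up for illustration; they are not real products or Headup's actual schema.

```python
# Hypothetical product records standing in for knowledge graph objects.
# Product names and property values are made up for illustration.
PRODUCTS = [
    {"name": "Camcorder A", "company": "Sony", "type": "Camcorder",
     "memory_type": "Flash", "resolution": "HD"},
    {"name": "Camcorder B", "company": "Sony", "type": "Camcorder",
     "memory_type": "Tape", "resolution": "SD"},
    {"name": "TV C", "company": "Samsung", "type": "TV",
     "memory_type": None, "resolution": "HD"},
]


def find_products(products, **criteria):
    """Keep only the objects whose properties match every requested value."""
    return [
        p for p in products
        if all(p.get(prop) == value for prop, value in criteria.items())
    ]


matches = find_products(
    PRODUCTS,
    company="Sony",
    type="Camcorder",
    memory_type="Flash",
    resolution="HD",
)
names = [p["name"] for p in matches]
```

Because the filter works on structured properties rather than on text, it returns actual matching items instead of pages that merely mention the search words.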

Note that even though Headup currently does not support direct search, the “smart search” method is already implemented in the current pop-up widget and topic pages.  When you look at images, news or videos of a certain object or topic, Headup’s “smart search” works behind the scenes to bring you the most relevant content for that object, by understanding and utilizing its relationship to other objects.

Headup v1.10 – The Beta: New widgets, Topic Pages & Usage Analytics

March 24th, 2010, By talk

During my time here at Headup I’ve come to realize that one of the things that characterizes our progress as a startup company is that it isn’t linear.

In other words, similar spans of time don’t necessarily manifest comparable progress.

In some periods the work we invest shows up as minor evolutions; at other times our toil manifests itself as a major revolution.

Today marks the fruition of such a period, & what a revolution it is…

Headup – The News

I’m happy to announce the release of Headup v1.10, which marks the launch of our official Beta & introduces the following improvements & features:

  1. Headup Snippet – The new slim Headup widget.
  2. Headup Topic Pages – Topic Pages you can customize to display your content & match your design.
  3. Analytics – Now you can follow exactly how your widget is performing & which topics are the most popular on your site.

Please note this is a Beta, which means we’re still tweaking and testing these features. We’d love for you to try them out, and would appreciate any and all feedback you have for us. If you run across something you feel is broken or silly, please let us know, so we can fix it.

Thanks!

Headup Snippet – A New Widget is Born

The new Headup “Snippet” is a lightweight widget that displays a short summary & related articles from your site for the topics it identifies in your content. It’s the default widget on this blog & it looks like this:

Headup Snippets - Introducing the Diet Widget

To activate Snippet widgets instead of your default Tabbed widget, enter your dashboard & select “Snippet” from the “Widget Mode” options in the new “General Settings” box:

Want Snippets? Choose your widget type

Want Snippets? Set your Widget Mode to "Snippet"

Snippet widgets are linked to Headup Topic Pages via the “View Topic” button.

Headup Topic Pages – All the Freshest Content about your Topics in 1 place

Headup Topic Pages show in-depth dynamic coverage for your topics & offer related activities & topic-to-topic browsing. They can be accessed via Snippet widgets or directly by setting your “Widget Mode” to “Link”.

All the Freshest Content about your Topics in 1 place

Topic & Usage Analytics

By popular demand we’ve added a brand new “Analytics” tab to the publisher dashboard. Use it to learn how many times your visitors use Headup, for how long, & which topics are the most popular on your site.

Topic & Usage Analytics

A Word of Thanks

As you can see, over these past few months we’ve been working intensively on improving Headup & developing exciting new features for your enjoyment. However, had it not been for a wonderful & select group of supportive bloggers, some of whom you’ve met here in our weekly blogger interviews, we wouldn’t have had the opportunity to check these new features & test their value.

Before I sign off I’d like to take the opportunity to say a heartfelt “Thank You” to Dave, David & Ruhani for all their support. Thanks guys! We couldn’t have pulled this one off & launched our Beta without you…

Hey You Guys! – A chat with David Sztypuljak about film blogging

January 22nd, 2010, By talk

This week’s guest blogger is a child of the ’80s, self-proclaimed geek, and avid Goonies fan. It gives me great pleasure to introduce Dave Sztypuljak – founder and editor extraordinaire of HeyUGuys.co.uk – England’s most popular independent resource for all things film.

Mike:

Hi Dave – it’s marvelous speaking to you.

Dave:

Hi Mike!

Mike:

Dave, who are the guys at HeyUGuys?

Dave:

The blog was founded by Jon Lyus and myself in November 2008 and is now entering its 2nd year. The name is a reference to “The Goonies” because we both love the film and originally thought we’d be posting only about ’80s movies. About two days after we started I think we realized the scope was too narrow and we wanted to write about film in general. BTW we nearly called the blog 88MPH – probably a less esoteric reference…

Mike:

I’ll admit the “Hey you guys!” eluded me but I’d have caught on to the 88MPH. How did Jon and you connect?

Dave:

We worked at the same investment house. He was in HR and I was in IT.
Jon would send out film quizzes routinely to everyone. I always aced them…
We soon became close friends and a while later, when I decided I’d like to start a film blog, I emailed him and asked him if he wanted to team up. The rest is history…

Mike:

You know Dave, in preparation for this interview I spent a fair amount of time researching film blogs.
It seems to me as if there are very few serious film blogs outside the US. Am I wrong?

Dave:

Actually, Mike, you’re absolutely right. We’re pretty unique insofar as we’re a non-American, English-language film blog. Honestly I think it’s given us an advantage in terms of exclusive content and audience.

Mike:

Do you feel cut off?

Dave:

On the contrary! I feel unique…
Seriously though, being situated in England as we are has provided us with some amazing opportunities that were very important for building HeyUGuys as a brand.

Mike:

Do tell…

Dave:

About seven months ago we realized Ridley Scott was filming the new Robin Hood movie, with Russell Crowe and Cate Blanchett, in a wood near Farnham in Surrey. Farnham is only about 20 minutes’ drive from where I live, so one day we headed down there to visit and film the set. We didn’t quite realize it at the time but we’d created an amazing asset for ourselves – original and exclusive content that none of the “big” channels had. The post achieved popularity quickly and was linked to from everywhere.

The Tron poster we posted before anyone else is another example of a scoop that got us a fair deal of attention:

Dave Sztypuljak posted this Tron poster originally on December 9, 2009

Mike:

How has the success of HeyUGuys affected your lives?

Dave:

It’s funny really. The first time we received validation that we were being picked up by the mainstream media, it was such a surprise I thought it was a hoax!

We got invited to the Star Trek premiere by Sky News. One day I opened my inbox and the invitation was just waiting there for me. We got full VIP treatment. Throughout the premiere I was waiting for someone to grab me, tell me the whole thing was a mistake, and kick me out the door.

Mike:

From my experience, as long as you keep a smooth face and act as if you belong, these things go smoothly. Seems you can even gate-crash the White House nowadays as long as you keep your cool.

Dave:

Believe me, when the second premiere invitation came in, both Jon and I were already far more suave about the whole thing, although we still get very excited!

Mike:

So now that you’re all cool with being a film blogger what are the plans for the future?

Dave:

I’ve recently quit my “day job” to work for HeyUGuys full time. The plan for 2010 is to make the blog profitable enough to sustain itself and us. I’d be happy to live my life watching films, mingling with celebrities and writing about it.

Mike:

Sounds like a plan to me. Good luck Dave and thanks for talking to me and for supporting Headup!

Dave:

It’s my pleasure. I actually like what Headup is doing quite a bit, so much so that I’ve been chatting to some blogger friends getting them to try it out.
