The challenge of ambiguity

Fellow Nestorese,

as is our habit, we like to give you a peek behind the curtain of a vertical search engine and show you some of the challenges we face. Today I thought I’d explain the difficulty of ambiguity.

We humans have fallen into the lazy habit of using the same word to mean many things, and place names are no exception. A shockingly high number of places in the UK share the same name. For locals this is typically no problem, because it’s clear which place is being referred to, but it can make life very difficult for a property search engine developer struggling night and day to help you find homes to buy and rent as easily as possible. For example, if you just search for properties for sale in Rushton, how do we know which of the three Rushtons in the UK you mean?

[Screenshot: Rushton]

But of course there are some place names that, though they exist in multiple locations around the country, have a clear winner. Take, for example, properties for sale in Waterloo. In a flurry of post-Napoleonic War celebration, many areas were named after the historic battleground. There are five Waterloos around the UK, but when most people say they want to rent a flat near Waterloo, they mean near the south London train station. In those cases we take you, our dear flat-searching friend, directly to what we believe is the dominant result, but also give you the option to change your search to the more obscure locations:

[Screenshot: Waterloo]

Differentiating between the locations which are truly ‘ambiguous’ and those like Waterloo where there is a clear winner is the challenge. These are the subtle tweaks that lead to a product that ‘just works’. Please let us know if you’ve found any locations you think we’re not quite getting right.
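
To make the distinction concrete, here is a minimal sketch in Python. It is not the production logic: the `Place` type, the `weight` popularity signal, the 0.6 threshold and every number in it are assumptions invented purely for illustration. The idea is simply that a name has a "clear winner" when one candidate accounts for a large enough share of the total weight, and is treated as truly ambiguous otherwise.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Place:
    name: str
    region: str
    weight: float  # hypothetical popularity signal, e.g. share of past searches


def resolve(candidates, dominance=0.6):
    """Return (winner, alternatives).

    winner is None when no candidate reaches the (assumed) dominance share,
    i.e. the name is treated as truly ambiguous.
    """
    ranked = sorted(candidates, key=lambda p: p.weight, reverse=True)
    total = sum(p.weight for p in ranked)
    if not ranked or total <= 0:
        return None, ranked
    top = ranked[0]
    if top.weight / total >= dominance:
        # Clear winner: go straight to it, but keep the others on offer.
        return top, ranked[1:]
    # No clear winner: ask the searcher to choose.
    return None, ranked


# Made-up weights: one dominant Waterloo among five, three evenly matched Rushtons.
waterloo = [Place("Waterloo", "south London", 80)] + [
    Place("Waterloo", f"other location {i}", 5) for i in range(1, 5)
]
rushton = [
    Place("Rushton", "location A", 35),
    Place("Rushton", "location B", 33),
    Place("Rushton", "location C", 32),
]

for query in (waterloo, rushton):
    winner, others = resolve(query)
    if winner:
        print(f"{winner.name}: showing {winner.region}, {len(others)} alternatives offered")
    else:
        print(f"{others[0].name}: ambiguous, asking the user to pick from {len(others)}")
```

The hard part in practice is choosing the signal and the threshold, which is exactly the kind of subtle tweaking described above.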

One final note: this problem is in no way isolated to the English language. In fact the UK isn’t bad; I’ll leave it to our Spanish blog to rant about how many San Sebastians there are in Spain.

  • Housereview
    Hi Ed,

    Thanks for your thoughts and input. Couldn't agree with you more, a single query would not be good enough.

    To that end we set up a table within the wiki to record the results of the Geography Name Clash Index should folks be interested to contribute (as many times as they wish).

    http://www.housereview.com/wiki/Geography_Name_...

    Always happy for thought leaders in this space, such as yourself, to contribute more suggestions or comment on the underlying issues.

    Screenshots are all well and good for impact but offer no way to evaluate, compare, discuss and improve. A single result, as you note, is very easy to manipulate. That said, looking at even a small sample set will hopefully encourage some folks to manipulate. Possibly better for the industry as a whole to raise and address (no pun intended) some of these issues.

    Looking forward to your presentation at AGI2007.

    David
  • Ed
    Hi David,

    great to see that you understand the difficulty, and Kingston is indeed a good example of an ambiguous location name.

    Nevertheless, I'm always hesitant to endorse any comparison of quality that focuses on a single query. It's very easy to manipulate such a test via the choice of the query to produce a desired outcome. The only real way to test is to evaluate a large number of queries (and to consider how popular those queries are).
  • Housereview
    Ed

    Great post.

    Know you and the gang enjoy a competition, so put together a Geography Name Clash Index based on "kingston".

    http://www.housereview.com/forum/viewtopic.php?...

    If there are better name searches, let me know.

    David