10 Most Important SEO Patents: Part 8 – Assigning Geographic Relevance to Web Pages

How much might one page on a website influence the rankings of other pages? When I joined an agency in 2005, our focus was on rankings for individual pages – optimizing their content for specific terms and phrases, and making sure that they had links from other pages, both onsite and off. I found myself unable to color just within those lines. It was impossible to ignore the impact of global issues on a website when trying to optimize individual pages for terms. Every page on a site has the ability to impact how each page might be crawled, indexed, and displayed by search engines.

For example, if the home page of a site was accessible at multiple URLs, there was the very real risk that PageRank for that page could be split multiple ways, such as amongst:

  • http://www.example.com/
  • http://example.com/
  • http://www.example.com/index.htm
  • http://example.com/index.htm
  • http://www.example.com/Index.htm
  • http://example.com/Index.htm

If the pages of a website are accessible both with and without a “www,” then it’s possible that all pages of the site could be indexed by a search engine with each version getting some share of PageRank. For years, you could visit the homepage of the New York Times, and see a PageRank of 9 in a Google Toolbar for the version at “http://www.nytimes.com,” and a PageRank of 7 for the version at “http://nytimes.com.”
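
One way to picture the fix for split URLs is a small normalization routine that maps every variant of the home page to one preferred address. The Python sketch below is only an illustration: the preferred host, the list of default document names, and the choice to treat “Index.htm” and “index.htm” as the same file are all assumptions, not something a search engine or a particular server is known to do.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumptions for illustration: the "www" host is the preferred one, and
# these file names are the site's default documents (matched ignoring case).
HOST_ALIASES = {"example.com": "www.example.com"}
DEFAULT_DOCUMENTS = {"index.htm", "index.html"}


def canonical_url(url: str) -> str:
    """Map duplicate versions of a URL to a single preferred URL."""
    parts = urlsplit(url)

    # Fold the non-www host into the preferred www host.
    host = parts.netloc.lower()
    host = HOST_ALIASES.get(host, host)

    # Treat an empty path and a default document as the directory itself.
    path = parts.path or "/"
    directory, _, filename = path.rpartition("/")
    if filename.lower() in DEFAULT_DOCUMENTS:
        path = directory + "/"

    return urlunsplit((parts.scheme.lower(), host, path, parts.query, ""))


variants = [
    "http://www.example.com/",
    "http://example.com/",
    "http://www.example.com/index.htm",
    "http://example.com/index.htm",
    "http://www.example.com/Index.htm",
    "http://example.com/Index.htm",
]
unique_targets = {canonical_url(u) for u in variants}
```

All six variants listed above collapse to one address in this sketch. In practice, the same mapping is usually enforced on the server with permanent (301) redirects, so that visitors and crawlers only ever see one version.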

Those are some examples of harm that might be caused by ignoring global issues on a site, but there are also some benefits to paying attention to other pages on a site and how they might impact an individual page. Choosing meaningful anchor text to link to one page from another page is a good example. Another came out in a patent application originally published in 2005 that could have an impact on how well a page might rank for geographically related terms, even if those terms didn’t appear upon a page, as long as those terms might appear on another page on the same site.

Assigning Geographic Identifiers to Web Pages

When brothers Lars and Jens Rasmussen had their Australian-based company, Where 2 Technologies, acquired by Google in October of 2004, they became involved in the team that created Google Maps. They also worked on a process involving how Google might handle geographic location on web pages. They put together the following patent application describing a process to assign geographical information to web pages.

Assigning geographic location identifiers to web pages
Inventors: Lars Eilstrup Rasmussen, and Jens Eilstrup Rasmussen
US Patent Application 20050182770
Published August 18, 2005
Filed November 26, 2004

Abstract

A system and method for assigning geographic location identifiers to web documents may include identifying a set of web documents. A geographic location identifier included within a first web document in the set of web documents may be identified. The identified geographic location identifier may be assigned to a second web document in the set of web documents based on a relevancy of the first web document to the second web document.

The patent filing describes how a search engine might look for location information on web pages, assign locations to pages which do include geographic information, and then assign locations to pages “relevant” to those pages.

The problem that this patent application was intended to address was that keyword-based search engines (as Google was at the time, and often still is) would fail to geographically define web pages when trying to use:

  • Manual assignments by a search engine of locations to pages
  • Manual assignment by a site owner of locations to pages
  • Use of geographic meta tags
  • Search engine assignment of location when looking at postal addresses appearing on the same pages as the keywords.

Assignment of geographic location identifiers

Under the process in the patent filing, geographic location identifiers on pages can be assigned to other pages which might or might not include geographic identifiers, after relevancy factors are considered, allowing pages without location information to be included in a geography based search. The relevancy factors might include:

  • Relative distance between documents
  • Terminology used, and
  • Whether or not the page is on the same site

The patent application points out a number of geographic location identifiers, such as:

  1. A partial or complete postal address
  2. Telephone number
  3. Area code
  4. Airport codes
  5. Landmark identifiers
  6. Other values tied to physical locations, such as longitude and latitude
  7. Hyperlinks connecting pages without geo information to related pages that do have location information

We’re also told that other documents, such as directories, might be useful in associating location identifiers. A search engine might also use a pattern matching approach that examines text on pages to find standard formats for addresses and other information that tends to describe locations.
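
The pattern matching approach might look something like the following Python sketch. The regular expressions are illustrative guesses at US-style formats (a phone number with an area code, and a “City, ST 12345” address ending). The patent filing doesn’t publish actual patterns, and the function name is hypothetical.

```python
import re

# Illustrative patterns only; these are assumptions, not taken from the filing.
# US-style phone number such as (555) 123-4567 or 555-123-4567.
PHONE = re.compile(r"\(?\b(\d{3})\)?[-. ]\d{3}[-. ]\d{4}\b")
# Address ending such as "Anytown, CA 90210" (optionally with ZIP+4).
CITY_STATE_ZIP = re.compile(
    r"\b([A-Z][A-Za-z .]+),\s*([A-Z]{2})\s+(\d{5})(?:-\d{4})?\b"
)


def find_location_identifiers(text: str) -> dict:
    """Collect candidate geographic location identifiers from page text."""
    phones = []
    area_codes = []
    for match in PHONE.finditer(text):
        phones.append(match.group(0))
        area_codes.append(match.group(1))

    # Each entry is a (city, state, zip) tuple.
    addresses = [m.group(1, 2, 3) for m in CITY_STATE_ZIP.finditer(text)]

    return {
        "phones": phones,
        "area_codes": area_codes,
        "city_state_zip": addresses,
    }


sample = "Visit us at 123 Main Street, Anytown, CA 90210 or call (555) 123-4567."
found = find_location_identifiers(sample)
```

A real system would need many more formats (international addresses, airport codes, landmarks), which is presumably why the filing also mentions directories as a supporting source.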

One assumption that this process follows is that if a page has some kind of location information on it, it is associated with that location.

Following that assumption, if other pages are within a certain number of clicks from that page, they might also be assumed to be associated with the location if they are “relevant” to that first page.

Geographically Relevant Pages

A document might be defined as geographically relevant where:

  1. The pages are on the same web site, and
  2. The anchor text on the page with location information leading to the other page contains one or more terms from a small rule-based set of terms.

The relevant terms might include terms such as:

  • Locations
  • Directions
  • Find
  • Finder
  • Locate
  • Locater
  • Store(s)
  • Branch(es)
  • About
  • Company
  • Contact
  • Information
  • etc.

Another approach would be to consider a page relevant to a location if the anchor text to it includes a complete or partial postal address.

For images or videos or other non-text anchors, linked pages might be considered relevant if the URL in the link includes either a complete or partial postal address or one of the above relevant terms.

Pages could also be considered relevant by examining the contents of the page directly.

A link that doesn’t include relevant terms like those might still be considered relevant if the HTML <title> of the linked-to document includes any of those relevant terms, or a complete or partial postal address.
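
The relevance tests in this section can be sketched as a single check on a link. This Python example is a simplification under stated assumptions: “same site” is taken to mean the same host name (ignoring a leading “www.”), only the rule-based term list is checked (postal address matching is left out), and the function names are hypothetical.

```python
import re
from urllib.parse import urlsplit

# Terms taken from the list above; the filing's list ends with "etc."
RELEVANT_TERMS = {
    "locations", "directions", "find", "finder", "locate", "locater",
    "store", "stores", "branch", "branches", "about", "company",
    "contact", "information",
}


def _words(text: str) -> set:
    """Lowercase alphabetic words found in a string."""
    return set(re.findall(r"[a-z]+", text.lower()))


def same_site(url_a: str, url_b: str) -> bool:
    # Assumption: same host name, ignoring a leading "www."
    def host(url):
        name = urlsplit(url).netloc.lower()
        return name[4:] if name.startswith("www.") else name
    return host(url_a) == host(url_b)


def is_geo_relevant_link(source_url, target_url, anchor_text="", target_title=""):
    """Decide whether a link suggests the target is geographically relevant."""
    if not same_site(source_url, target_url):
        return False
    # 1. Anchor text contains one of the rule-based terms.
    if _words(anchor_text) & RELEVANT_TERMS:
        return True
    # 2. Non-text anchors (images, video): look at the URL in the link.
    if not anchor_text.strip() and _words(urlsplit(target_url).path) & RELEVANT_TERMS:
        return True
    # 3. Fall back to the <title> of the linked-to page.
    return bool(_words(target_title) & RELEVANT_TERMS)
```

For example, a text link reading “Store Locations” between two pages of the same site would pass the first test, while an image link pointing at a URL such as /directions.html would pass the second.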

Geographically Important Click Distance

Once a candidate geographically relevant page is identified, pages a certain number of links away from the page with the location upon it are looked at. The patent suggests that this click distance might be a range of 2 – 5 links. If the distance is greater, that page might not be considered geographically relevant.

In addition to looking at pages that are linked from the page with a location on it, this process will also look at pages that link to that page. So, if a page on the site that doesn’t have a location on it links to a page that does, with a geo relevant term like “directions” as anchor text, then the linking page (without a location) can be geographically assigned the location found on the linked to page.

It’s possible that there might be an assessment of geographical relevance that differs from one page to another based upon looking at all of the pages together. Relevant links and link distances may be calculated for pages which don’t contain the geographical location information. Each of those pages may collect a measure of relevance based upon those distances, and that measure can be added together for all of the neighboring pages that may contain geographical information. So, if a page is linked from or to by a number of pages that use relevant anchor text or URLs, it may be determined to be more relevant for that geographical information than other pages.
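
To make the click distance idea concrete, here is a minimal Python sketch that walks outward from each page containing a location and adds up a score for every nearby page. Several things here are assumptions rather than details from the filing: the links passed in are taken to be already filtered down to geographically relevant ones, the cutoff of 2 clicks is just the low end of the 2 – 5 range, and the 1/distance weighting is a hypothetical way of letting closer pages count for more.

```python
from collections import defaultdict, deque

MAX_CLICKS = 2  # Assumption: low end of the 2-5 link range in the filing.


def propagate_locations(links, page_locations, max_clicks=MAX_CLICKS):
    """Return {page: {location: score}} for pages near pages with locations.

    links: iterable of (source, target) pairs, assumed to be already
           filtered down to geographically relevant links.
    page_locations: {page: location} for pages that contain a location.
    """
    # Follow links in both directions, since the process looks at pages
    # linked from, and pages linking to, the page with the location.
    neighbors = defaultdict(set)
    for source, target in links:
        neighbors[source].add(target)
        neighbors[target].add(source)

    scores = defaultdict(lambda: defaultdict(float))
    for start, location in page_locations.items():
        # Breadth-first search, stopping at the click distance cutoff.
        distance = {start: 0}
        queue = deque([start])
        while queue:
            page = queue.popleft()
            if distance[page] == max_clicks:
                continue
            for nxt in neighbors[page]:
                if nxt not in distance:
                    distance[nxt] = distance[page] + 1
                    queue.append(nxt)

        for page, clicks in distance.items():
            if clicks > 0:
                # Hypothetical weighting: closer pages earn a larger share.
                scores[page][location] += 1.0 / clicks

    return {page: dict(locs) for page, locs in scores.items()}


links = [("home", "contact"), ("home", "about"),
         ("about", "team"), ("locations", "home")]
page_locations = {"contact": "Anytown, CA", "locations": "Anytown, CA"}
result = propagate_locations(links, page_locations)
```

In this toy site, the home page sits one click from two pages that carry the location, so it collects the highest score, while the “team” page is three clicks from either one and receives nothing.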

To compound things, more than one location might be associated with a single page.

Takeaways

Even though it was filed back in 2004, this patent application is pending at the USPTO, and it looks like it might not be granted.

But many of the concepts contained in the patent filing are reflected in another (now granted) patent that I wrote about when it was still pending, called Propagating useful information among related web pages, such as web pages of a website. While that patent covers this concept of assigning geographic relevancy, it also broadens things to assign relevancy for other terms and concepts from one page to another. I wrote about the patent in the post, Google Determining Search Authority Pages and Propagating Authority to Related Pages.

A snippet from that post shows how the “relevancy” of one page might be assigned to another outside the context of geographic locations:

Location information may not be the only information on this site that may be propagated from one page to another. Imagine that a page of the site includes menu items for the restaurant, including “pho,” which is a beef noodle soup.

Assume that “pho” is considered to be a highly descriptive term because it isn’t frequently used in a wide collection of Web pages.

This term “pho” may be identified and propagated up to the home page of the site as well, and be treated as if the word appears on the home page, even though it actually doesn’t.

Now assume that someone from or near Anytown, CA searches for “pho restaurants.” The home page may show up as a relevant match, even though neither the term “pho” nor the location actually appears on the home page of the site.

How much might the content of one page influence the rankings of others?

In some cases, it could hold a great amount of influence.

Note: where location of your business is very important because you have a shop or office that you might want people to visit in person, or because you provide services to people in a particular region, I would recommend including your address on every page of your site that you want indexed by the search engines. While Google might be assigning geographic relevancy in a manner like this, some smart self help can make it much easier for search engines to get things right.

All parts of the 10 Most Important SEO Patents series:

Part 1 – The Original PageRank Patent Application
Part 2 – The Original Historical Data Patent Filing and its Children
Part 3 – Classifying Web Blocks with Linguistic Features
Part 4 – PageRank Meets the Reasonable Surfer
Part 5 – Phrase Based Indexing
Part 6 – Named Entity Detection in Queries
Part 7 – Sets, Semantic Closeness, Segmentation, and Webtables
Part 8 – Assigning Geographic Relevance to Web Pages
Part 9 – From Ten Blue Links to Blended and Universal Search
Part 10 – Just the Beginning

29 thoughts on “10 Most Important SEO Patents: Part 8 – Assigning Geographic Relevance to Web Pages”

  1. Thanks Bill! Your articles are a fountain of obvious and – maybe even more important – not so obvious wisdom for us that take our daily bread from working with search engines to pull rankings.

  2. the example of the restaurant website and the spread of the terms in reversed direction, from deep pages to the home page, is a concept that I had not actually considered and can be applied to many other things.

  3. Hi Bill,

    It truly is a pleasure to read an SEO blog that’s beyond the usual standard. You were recommended to me by a Danish SEO expert.

    How do you see the problems that Google Places currently has with geo-localization? It’s getting better, though many results are still not responsive to my geo-location.

  4. As you’ve noted in your caveat about “smart self help,” this sort of propagation of relevance from one page to another seems to be mostly to make up for shortcomings on those pages. If you really want your home page returned for a search on [pho] then it may help to have the word on a menu page on the same site, but you’re probably better off getting a paragraph onto the home page that discusses and lists the restaurant’s Vietnamese specialties.

  5. I can agree with that, geographic relevance is getting more and more important for my SEO work and for my clients. Another great post!

  6. I have experienced the damage of losing my ranking on Google for searches specific to the US, and just adding my business to Google Places with the correct address solved the problem and got the ranking back. Nowadays Google is extremely focused on displaying results by geography. The last change I observed is that it is showing its official blogs, like Webmaster and Analytics, on country-specific domain names rather than the top-level one. It might be measuring the impact and planning more changes to identify and display results specific to location.

  7. Hi Bill,

    Would you happen to stumble upon a patent/patents that would explain how Google “decides” whether to display “pure” maps in the SERPS or blended maps/website box?

  8. I believe we’re kindred spirits in our frustrations with the agency model of SEO. I found my experience extremely prohibitive of optimal performance & do far better working as a hired gun… Considering a website simply a collection of (for example) 300 indexed pages is problematic in the manner you mention – every page is a part of a whole & a cog in the larger wheel – sitewide footer links for example should be utilized as a barometer of contextual relevance. I feel as the algorithm becomes more sophisticated so “direct hits” on keyword strings become less important than over-riding editorial & “semantic sense” if you will…

  9. Hi Rosentand,

    Thank you. I love when patent filings like this give you some possible insights into things the search engines are doing that might not be so obvious. That’s one of the values in spending time with them.

  10. Hi Nicolas,

    I’ve had people complain to me that their home page is showing up well for some queries that they wanted a deeper page to show up for, and they didn’t want that happening. If Google is indeed assigning relevance to their home page based upon what is contained in a deeper page, that isn’t necessarily a bad thing.

    Sometimes people want that deeper page to appear, and not the home page at all. The ideal solution is to try to make that deeper page more relevant than the home page, so that both pages might show up in results rather than trying to somehow make the home page less relevant. Sometimes that point isn’t easy to get across though.

  11. Hi Jimi,

    Thank you.

    Not quite sure that I understand the question you have with Google Local and geo-localization. The patent filings I wrote about in this post focus upon Web search results, and how they might be determined to be relevant for a specific location, though.

    The people who worked on these patent filings also have been involved in Google Maps/Local, and it’s worth digging into some of the things that they have written. It’s hard to provide you with a detailed answer since I don’t necessarily have a sense of what specific kinds of problems you’re experiencing.

  12. Hi Bob,

    Making some intelligent choices about what you place on which pages is an ideal approach, and if you think that listing the specialities you offer on the home page of your site makes sense, then it’s definitely something that you should do. But if you’re optimizing the home page for a specific term or two, and you want to focus upon that term, you don’t necessarily want to try to include everything that’s on other pages of the site on your home page.

    If Google uses an approach like this, then it could potentially impute relevance for terms and phrases on a number of other pages of a site to a home page. For instance, if a site has 20 other pages that contain meaningful content that it thinks it should make a home page relevant for based upon the content of those other pages and how they are linked to the home page, then that’s not necessarily a bad thing. You wouldn’t want to try to include the content of those 20 other pages on the home page.

  13. Hi Lukas,

    Thank you. For some sites, geographic relevance is definitely important, especially if the businesses behind those sites have a location that people can visit in person, or an area that they provide services within. For others, it might not be so important. But if it is, it’s definitely worth paying attention to.

  14. Hi Alex,

    Not sure if you’re referring to Web search results, or local search results when they might be displayed within the web search results as Place pages. I’m not sure if there is a connection between verifying your business in Google Maps, and how well your page or pages might be returned just as an organic Web result. But I do think that verification can help with local results, and local results that are blended into web results.

  15. Hi Matt,

    There are definitely challenges and benefits in working within an agency as opposed to working on your own. In the particular instance I described, I was able to get the people I was working with to see the benefits of looking past individual pages to global issues as well. It took a little work, but the value of doing so definitely paid off, and they were able to see that. On the plus side of working within an agency, you can almost always find others to talk with about the issues that you come across and to collaborate with you on solutions.

  16. We’re currently trying to target European countries and originally created a URL structure where we would put each language version on a subdomain e.g. en.example.com. After noticing that our overall rankings had dropped across almost all keywords we decided to move everything back to the root domain and have each language site within a subfolder e.g. example.com/en/index/html – Within 2 weeks we had nailed most of our target markets after doing nothing else but restructuring the URL. Thanks Bill, great article.

  17. Bill: When you wrote about this patent in summer of 2005 after it was published, I thought it was magnificent and explained an incredibly mysterious change that occurred in late Feb 2005 as a part of one of the infamous google dances.

    Following the google dance…in an incredible change websites representing local businesses could be discovered for logical search phrases:

    For instance if you were using google search to look for a:

    Miami plumber
    Denver florist
    Electrician in Rochester, NY

    or any of the probable 15-20 % of total searches that probably reflect local businesses…Eureka A website for a local business was turning up at the top of the rankings.

    That had not been the case prior to that change. Typically a search for any of those terms or a million other variations for a local/business/service/organization was not resulting in high rankings for a local entity.

    Instead virtually all of those searches gave results with either Ebay, Amazon, a mega site with thousands of pages or mega directories. The actual web sites representing local businesses would trail these sites. Spock from Star Trek would have despised it. Google search was not logical when it came to fulfilling local searches.

    I was befuddled by the change in early 2005 when a new better logic for Google search arose. Nobody who wrote on SEO described the change. I had been working in local seo for several years. I was fully aware of the horrible logic.

    I actually had local sites that ranked above the mega sites and mega directories for appropriate search phrases. Oh my oh my. I can’t tell you how many links I had to those sites..and with anchor text ;) Ha ha.

    The article you wrote described the changes. To me the critical background change was how google emphasized location/place/directions etc (via the list you pulled from the patent)…and more critically established a parameter with regard to the # of pages from the “location oriented page on the website”. That is the part that references distances in documents. It further suggests that if there are 2-5 links between a page on a site and the page that emphasizes location…well then it seems to eliminate the locational nature of the website.

    It was a tremendous article and a tremendous find.

    Hardly anyone was paying attention to local seo at the time. I repeat nobody wrote about the change.

    You caught it.

    It opened the door for the ability of small businesses and services to build websites that would show to local customers.

    It was a huge advancement in google search logic. It showed how google could build filters within their algo’s to minimize or adjust the impact of the PageRank algo with its emphasis on total “link juice”

    It really was a great article about a great topic that most people missed.

    I happened to catch it at the time and became a devoted “fanboy” to your writings and the importance of patents as they might affect google’s algos. :D

    Thanks for including this in your list of top ten!!!! Keep up the great work!!!!

  18. Hi Derek,

    Thanks for sharing your experience, and your success in overcoming that subdomain/subdirectory problem by changing your URLs.

    I suspect that part of the problem might also have been Google treating the subdomains as if they were possibly separate websites, and not sharing PageRank between those pages and the other pages on the whole domain.

  19. Hi David,

    Thank you very much for your kind words. This was one of the first Google patents that I decided to analyze and write about in detail, and the results and ideas that sprang out of it inspired me to go through a lot of other patents since.

    This change did seem to make an incredible amount of difference in how sites ended up being ranked for geographically related queries in Web search. There really wasn’t anyone writing about local search back then. I’m really happy to see a lot more people paying attention to it now. :)

  20. Might be slightly off topic or covered somewhere else on your site, as I’ve only just found this site, but does Google not penalise you for duplicate content if your site is available as both http://www.example.com and example.com? I have read conflicting info about this and am still not sure of the answer.

  21. Definitely very forward thinking to look into a patent filing to gain SEO insight. This, as well as your other posts show that it’s becoming increasingly more important to use all of your resources and do your due diligence to stay ahead of the curve. With the internet age really in its infancy, it will likely only get more and more difficult to make it onto page one of Google so you’ve got to use every possible advantage to stay ahead of the competition. Looking forward to reading some more of your writing.

  22. Hi Steve,

    When you provide content that is available at both a www and a non-www version of the pages of your site, you stand a good chance of Google and other search engines not recognizing that and splitting the link equity, or PageRank of your pages. That’s not a penalty, in that Google isn’t intentionally penalizing you, but it can cause your pages not to rank as highly as they could.

  23. Hi Samantha,

    Thank you. There’s a lot of great information on the Web, and there’s a lot of misinformation on the Web, and that’s true of SEO as well as many other things. I like patents from the search engines because they are from the search engines. Google or Yahoo or Microsoft might file a patent that they may end up not using, or they might end up following a different approach, but there’s still some value to be gained from going through the patents that they do file, even if it’s just getting their perspective on a particular aspect of how they view the Web, or searchers, or search.

  24. Bill, I stumbled across your blog yesterday when doing some research into local SEO. Brilliant stuff. The level of detail you are going into, and researching straight from the source is awesome. Very different to the conventional blogging in this area. Thanks!

  25. Hi jlawrence,

    Thank you. Part of the reason why I blog is to research things I’m interested in, so that I learn from what I blog about. It makes the time I spend researching and blogging worth the time for me.
