Personalizing Search Results at Google

One thing most SEOs are aware of is that search results at Google are sometimes personalized for searchers, but I haven’t seen much written about it. So when I came across a patent about personalizing search results, I wanted to dig in and see if it could give us more insights.

The patent was an updated continuation patent, and I love to look at those, because it is possible to compare changes to claims from an older version, to see if they can provide some details of how processes described in those patents have changed. Sometimes changes are spelled out in great detail, and sometimes they focus upon different concepts that might be in the original version of the patent, but weren’t necessarily focused upon so much.

One of the last continuation patents I looked at was one from Navneet Panda, in the post Click a Panda: High Quality Search Results based on Repeat Clicks and Visit Duration. In that one, we saw a shift in focus to involve more user behavior data, such as repeat clicks by the same user on a site and the duration of a visit to a site.

Personalizing search results
Inventors: Paul Tucker
Assignee: GOOGLE INC.
US Patent: 9,734,211
Granted: August 15, 2017
Filed: February 27, 2015

Abstract

A system receives a search query from a user and performs a search of a corpus of documents, based on the search query, to form a ranked set of search results. The system re-ranks the set of search results based on preferences of the user, or a group of users, and provides the re-ranked search results to the user.

The older version of the patent is Personalizing search results, which was filed on September 16, 2013, and was granted on March 10, 2015.

A continuation patent has rewritten claims that reflect how a patented process might have changed, while keeping the filing date of the original version of the patent.

I like comparing the claims, since that is what usually changes in continuation patents. I noticed some significant changes from the older version to this newer version.

There is a lot more emphasis on “high quality” sites and “distrusted sites” in the new version of the patent, which can be seen in the first claim of the patent. It’s worth putting the old and the new first claim one after the other, and comparing the two.

The Old First Claim

1. A method comprising:

identifying, by at least one of one or more server devices, a first set of documents associated with a user, documents, in the first set of documents, being assigned weights that reflect a relative quantification of an interest of the user in the documents in the first set of documents;

receiving, by at least one of the one or more server devices, a search query from a client device associated with the user;

identifying, by at least one of the one or more server devices and based on the search query, a second set of documents, each document from the second set of documents having a respective score;

determining, by at least one of the one or more server devices, that a particular document, from the second set of documents, matches or links to one of the documents in the first set of documents;

adjusting, by at least one of the one or more server devices, the respective score of the particular document, to form an adjusted score, based on the weight assigned to the one of the documents in the first set of documents;

forming, by at least one of the one or more server devices, a list of documents in which documents from the second set of documents are ranked based on the respective scores, the particular document being ranked in the list based on the adjusted score; and

providing, by at least one of the one or more server devices, the list of documents to the client device.
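
To make the steps of the old claim easier to follow, here is a minimal Python sketch of one way to read it. The function name `rerank_old_claim`, the sample documents, the use of a negative weight for a disliked document, and the simple additive adjustment are all my assumptions; the claim only says a score is adjusted "based on the weight," and does not give a formula.

```python
def rerank_old_claim(interest_weights, results, links):
    """Re-rank query results following one reading of the old first claim.

    interest_weights: {doc: weight} for the user's first set of documents.
    results: {doc: score} for the second set returned for the query.
    links: {doc: set of docs it links to}.
    The additive adjustment is an assumption, not something the claim states.
    """
    adjusted = {}
    for doc, score in results.items():
        # A result qualifies if it matches, or links to, a first-set document.
        related = {doc} | links.get(doc, set())
        boost = sum(w for d, w in interest_weights.items() if d in related)
        adjusted[doc] = score + boost
    # Highest adjusted score first.
    return sorted(adjusted, key=adjusted.get, reverse=True)


# Hypothetical data: the user likes a.example and dislikes b.example.
interest = {"a.example": 0.5, "b.example": -0.4}
results = {"p.example": 1.0, "q.example": 0.9, "b.example": 1.1}
links = {"q.example": {"a.example"}}

ranking = rerank_old_claim(interest, results, links)
print(ranking)  # ['q.example', 'p.example', 'b.example']
```

In this made-up example, q.example moves to the top because it links to a document the user likes, and b.example drops because it matches a document the user dislikes.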

The New First Claim

This is newly granted this week:

1. A method, comprising:

determining, by at least one of one or more server devices, preferences of a user or a group of users, wherein the preferences indicate a document bias set and weights assigned to the documents, wherein the weights include distrusted document weights;

determining, by the at least one of the one or more server devices, a high quality document set obtained from a document ranking algorithm;

creating, by at least one of the one or more server devices, an intersection set of documents which includes documents in both the document bias set and the high quality document set;

receiving, by at least one of the one or more server devices, a search query from the user;

performing, by at least one of the one or more server devices, a search of a corpus of documents, based on the search query, to form a ranked set of search result documents;

determining, by at least one of the one or more server devices, at least one link from the intersection set of documents to at least one document in the ranked set of search result documents, the at least one document not in the intersection set of documents;

re-ranking, by at least one of the one or more server devices, the set of search result documents based on the preferences of the user or the group of users, wherein re-ranking the set of search results comprises: identifying a link of the set of links from the intersection set of documents to the document of the set of search result documents, and based on identifying the link, adjusting a rank of the search result document based on the weight assigned to the document in the document bias set from where the identified link originated from; and

providing, by at least one of the one or more server devices, the re-ranked search results to the user.
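
The new claim has more moving parts, so a rough Python sketch may help. Everything named here (`rerank_new_claim`, the sample documents, negative numbers standing in for distrusted document weights, and adding the weight to a score) is an assumption for illustration; the claim says only that a rank is adjusted "based on the weight" of the linking document.

```python
def rerank_new_claim(bias_weights, high_quality, ranked_results, links):
    """Re-rank results following one reading of the new first claim.

    bias_weights: {doc: weight}; a negative weight marks a distrusted
        document (an assumption about how such weights are encoded).
    high_quality: set of docs from some document ranking algorithm.
    ranked_results: list of (doc, score) pairs for the query.
    links: {source_doc: set of docs it links to}.
    """
    # Documents that are in both the bias set and the high quality set.
    intersection = set(bias_weights) & set(high_quality)
    adjusted = []
    for doc, score in ranked_results:
        # The claim covers result documents that are not in the intersection.
        if doc not in intersection:
            for source in sorted(intersection):
                if doc in links.get(source, set()):
                    # Adjust using the weight of the document the link came from.
                    score += bias_weights[source]
        adjusted.append((doc, score))
    adjusted.sort(key=lambda pair: pair[1], reverse=True)
    return [doc for doc, _ in adjusted]


# Hypothetical data: y.example is distrusted, a.example is not high quality.
bias = {"x.example": 0.6, "y.example": -0.5, "a.example": 0.9}
quality = {"x.example", "y.example", "m.example"}
results = [("r1.example", 1.0), ("r2.example", 0.9), ("r3.example", 0.8)]
links = {
    "x.example": {"r3.example"},
    "y.example": {"r1.example"},
    "a.example": {"r2.example"},
}

reranked = rerank_new_claim(bias, quality, results, links)
print(reranked)  # ['r3.example', 'r2.example', 'r1.example']
```

Note that the link from a.example has no effect in this sketch, even though it has the highest weight, because a.example is not in the high quality set and so is not part of the intersection.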

The changes I am seeing between these two first claims involve what are being called “distrusted document weights” from a “document bias set,” and the use of “a high quality document set.” The newer claim makes it clearer that personalization draws on these two different sets of documents. It’s possible that it doesn’t change how personalization actually works, but the increased clarity is good to see.

The Purpose of these Personalizing Search Results Patents

We are told that some sites are favored more than others, and some are disliked more than others, and that those preferences may be collected from a query or browser history to generate a document bias set:

FIG. 1 illustrates an overview of the re-ranking of search results based on a user’s or group’s document or site preferences. In accordance with this aspect of the invention, a document bias set F 105 may be generated that indicates the user’s or group’s preferred and/or disfavored documents. Bias set F 105 may be automatically collected from a query or browser history of a user. Bias set F 105 may also be generated by human compilation, or editing of an automatically generated set. Bias set F 105 may include a set of documents shared, or developed, by a group that may further include a community of users of common interest. Document bias set F 105 may include one or more designated documents (e.g., documents a, b, x, y and z) with associated weights (e.g. $w^{a}_{F}$, $w^{b}_{F}$, $w^{x}_{F}$, $w^{y}_{F}$ and $w^{z}_{F}$). The weights may be assigned to each document (e.g., documents a, b, x, y and z) based on a user’s, or group’s, relative preferences among documents of bias set F 105. For example, bias set F 105 may include a user’s personal most-respected, or most-distrusted, document list, with the weights being assigned to each document in bias set F 105 based on a relative quantification of the user’s preference among each of the documents of the set.
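
The patent doesn’t say how those weights are calculated, so the following Python sketch is only a guess at what "automatically collected from a browser history" could look like. Using each document’s share of visits as its weight, the fixed weight of -1.0 for distrusted documents, and the name `build_bias_set` are all my assumptions.

```python
from collections import Counter


def build_bias_set(visited_docs, distrusted=()):
    """Build a {doc: weight} bias set from a browsing history.

    Assumption: a document's weight is its share of all visits, and
    documents the user marks as distrusted get a fixed weight of -1.0.
    The patent does not specify either choice.
    """
    counts = Counter(visited_docs)
    total = sum(counts.values())
    weights = {doc: n / total for doc, n in counts.items()}
    for doc in distrusted:
        weights[doc] = -1.0
    return weights


# Hypothetical history: a.example was visited twice, the others once.
history = ["a.example", "b.example", "a.example", "x.example"]
bias_set = build_bias_set(history, distrusted=["z.example"])
print(bias_set)
# {'a.example': 0.5, 'b.example': 0.25, 'x.example': 0.25, 'z.example': -1.0}
```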

This document bias set mention appears in both the older, and the newer version of the patent.

The patents also both refer to a high quality document set, and that is described in a way that seems to place a lot of attention on PageRank or a Hubs and Authority approach to ranking:

A high quality document set L 110 may be obtained from any existing document ranking algorithm. Such document ranking algorithms may include a link-based ranking algorithm, such as, for example, Google’s PageRank algorithm, or Kleinberg’s Hubs and Authorities ranking algorithm. The document ranking algorithm may provide a global ranking of document quality that may be used for ranking the results of searches performed by search engines. High quality document set L 110 may be derived from the highest-ranking documents in the web as ranked by an existing document ranking algorithm. In one implementation, for example, set L 110 may include the top percentage of the documents globally ranked by an existing document ranking algorithm (e.g., the highest ranked 20% of documents). In an implementation using PageRank, set L 110 may include documents having PageRank scores higher than a threshold value (e.g., documents with PageRank scores higher than 10,000,000). Set L 110 may include multiple documents (e.g., documents m, n, o, p, x, y and z) with associated weights (e.g., weights $W^{m}_{L}$, $W^{n}_{L}$, $W^{o}_{L}$, $W^{p}_{L}$, $W^{x}_{L}$, $W^{y}_{L}$ and $W^{z}_{L}$). The weights may be assigned to each document (e.g., documents m, n, o, p, x, y and z) based on a relative ranking of “quality” between the different documents of set L 110 produced by the document ranking algorithm.
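
The passage above describes two ways of picking the high quality set: a top percentage of documents, or a score threshold. Here is a small Python sketch of both. The function names, the made-up scores, and the choice to round the cutoff up are my assumptions, and the scores stand in for whatever a real ranking algorithm such as PageRank would produce.

```python
import math


def top_percentage(scores, percent=20):
    """Return the highest-ranked `percent` of documents as a set.

    scores: {doc: global quality score}. Rounding the cutoff up is an
    assumption; the patent only gives "the highest ranked 20%" as an example.
    """
    keep = math.ceil(len(scores) * percent / 100)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:keep])


def above_threshold(scores, threshold):
    """Return the documents whose score is higher than the threshold."""
    return {doc for doc, score in scores.items() if score > threshold}


# Hypothetical global scores for five documents.
scores = {"m": 90, "n": 70, "o": 50, "p": 30, "q": 10}

by_percentage = top_percentage(scores, percent=20)
by_threshold = above_threshold(scores, threshold=60)
print(by_percentage)  # {'m'}
print(sorted(by_threshold))  # ['m', 'n']
```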

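Personalized results served to a searcher are re-ranked using documents that appear in both the document bias set and the high quality document set (as the patent says, an “intersection” of the two sets). According to the new first claim, a search result that is linked to from a document in that intersection has its rank adjusted by the weight assigned to the linking document.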

If you are interested in how personalized search may work at Google, spending some time with this new patent may provide some insights. Knowing about how two different sets of documents are involved in returning results is a good starting point.

30 thoughts on “Personalizing Search Results at Google”

  1. Great article on personalizing search results. Search results is a very crucial factor in Seo. Very useful information for me.

  2. Hi Bill,
    I have been reading your articles recently and they are very informative and well written. I love your writing style. Thanks for your amazing posts.

    Regards.

  3. Hi Bill,

    Thank you for writing on this topic. To be honest i was not aware something like this exist. Now i know about Personalizing Search Results at Google to an extent.

  4. Just finished listening to your two podcast session with edge of the web my 3rd time which brought me here. Great material – thanks for sharing it Bill.

  5. Hi Thomas,

    Good hearing that you enjoyed those podcasts. I had fun working on them. Happy that you visited here after listening. 🙂

  6. Hi Bill, I enjoyed the read. You took a different angle on search results that I hadn’t seen to date. I didn’t think to look at the patent and compare it to previous versions. Thanks for that.

  7. Hi Luke,

    Many of the patents I write about are new ones, and not continuation patents, so I can’t really compare them with an older version. But when we do have a situation like this, where the claims of a patent have been rewritten, it suggests that the process described in the first version of the patent may be something that has been followed by a search engine, so comparing the claims from the two may give us an updated look at how the process written about is possibly being used. I like being able to do that a lot, and I’m thinking of a couple of other continuation patents that I didn’t write about but should. 🙂

  8. Hey, Bill.
    Excellent topic. Thanks for sharing such an informative post. I have learned something new from your article.

  9. Hi Munna,

    It was interesting seeing personalized results coming from a biased document set. I hadn’t seen it quite worded like that from Google before. It’s nice being able to see how all of the parts fit together.

  10. Wow, You’ve got some amazing knowledge regarding this topic. Hope, I am surely going to bookmark your page and please add me to your mailing list. I need more suggestions and tips on SEO. Thanks in advance.

  11. Bill, love your articles. Don’t understand the difference between “distrusted document weights” and “document bias set” but do see that Google is giving more personalized results.

  12. Hey Bill,
    I have learned new things today from your post. It’s really a complicated thing, but you have written down this topic like a master of Google High school. Heads off to you man. Keep going on and let us know more complicated topics like this one. Thanks Bill for your awesome post.

  13. Hi Bill,

    Very cool article. It’s always fascinating to see how Google personalizes search results. The Fig. 1 flow chart is especially interesting with the document sets. Always enjoy your insights!

  14. it feels like a situation where the system depicted in the essential variation of the patent may be something that may have been trailed by a web crawler, so differentiating the cases from the two may give us an invigorated look at how the technique elucidated is possibly being used. I like having the ability to do that a ton, and I’m considering a couple of other continuation licenses .

  15. Great piece of information related to personalizing search results. Obviously, In the changing landscape of marketing and promotion, SEO has become a core element of any serious business whether big, small or medium. Therefore, searching is crucial for SEO.
    Thanks a lot for sharing it.

  16. Great piece of writing. I still not reading it and till now I miss a great post on search result. If is not easy to understand me but I know this is more precious post. So read it more and more. Thanks for sharing, keep it up.
