Context Clusters in Search Query Suggestions

Movies Context Clusters


Photo by Saketh Garuda on Unsplash

Context Clusters and Query Suggestions at Google

A new patent application from Google tells us about how the search engine may use context to find query suggestions before a searcher has completed typing in a full query. Think of Google as a Decision Engine, focused upon bringing searchers more information about interests they may have. After seeing this patent, I’ve been thinking about previous patents I’ve seen from Google that have similarities.

It’s not the first time I’ve written about a Google Patent involving query suggestions. I’ve written about a couple of other patents that were very informative, in the past:

In both of those, the inclusion of entities in a query impacted the suggestions that were returned. This patent takes a slightly different approach, by also looking at context.

Context Clusters in Query Suggestions

We’ve been seeing the word Context spring up in Google patents recently. We have seen context terms from knowledge bases appearing on pages that focus on the same query term with different meanings, and we have also seen pages that are about specific people using a disambiguation approach. While these were recent, I did blog in 2007 about a paper, with an author from Yahoo, that talks about query context. The paper was Using Query Contexts in Information Retrieval. The abstract from the paper provides a good glimpse into what it covers:

User query is an element that specifies an information need, but it is not the only one. Studies in literature have found many contextual factors that strongly influence the interpretation of a query. Recent studies have tried to consider the user’s interests by creating a user profile. However, a single profile for a user may not be sufficient for a variety of queries of the user. In this study, we propose to use query-specific contexts instead of user-centric ones, including context around query and context within query. The former specifies the environment of a query such as the domain of interest, while the latter refers to context words within the query, which is particularly useful for the selection of relevant term relations. In this paper, both types of context are integrated in an IR model based on language modeling. Our experiments on several TREC collections show that each of the context factors brings significant improvements in retrieval effectiveness.

The Google patent doesn’t take a user-based approach either, but does look at some user contexts and interests. It sounds like searchers might be offered a chance to select a context cluster before being shown query suggestions:

In some implementations, a set of queries (e.g., movie times, movie trailers) related to a particular topic (e.g., movies) may be grouped into context clusters. Given a context of a user device for a user, one or more context clusters may be presented to the user when the user is initiating a search operation, but prior to the user inputting one or more characters of the search query. For example, based on a user’s context (e.g., location, date and time, indicated user preferences and interests), when a user event occurs indicating the user is initiating a process of providing a search query (e.g., opening a web page associated with a search engine), one or more context clusters (e.g., movies) may be presented to the user for selection input prior to the user entering any query input. The user may select one of the context clusters that are presented and then a list of queries grouped into the context cluster may be presented as options for a query input selection.

I often look up the inventors of patents to get a sense of what else they may have written and worked on. I looked up Jakob D. Uszkoreit on LinkedIn, and his profile doesn’t surprise me. He tells us there of his experience at Google:

Previously I started and led a research team in Google Machine Intelligence, working on large-scale deep learning for natural language understanding, with applications in the Google Assistant and other products.

This passage reminded me of the search results being shown to me by the Google Assistant, which are based upon interests that I have shared with Google over time, and that Google allows me to update from time to time. If the inventor of this patent worked on Google Assistant, that doesn’t surprise me. I haven’t been offered context clusters yet, and wouldn’t know what those might look like if Google did offer them. I suspect that if Google does start offering them, I will recognize them when they are offered to me.

Like many patents do, this one tells us what is “innovative” about it. It looks at:

…query data indicating query inputs received from user devices of a plurality of users, the query data also indicating an input context that describes, for each query input, an input context of the query input that is different from content described by the query input; grouping, by the data processing apparatus, the query inputs into context clusters based, in part, on the input context for each of the query inputs and the content described by each query input; determining, by the data processing apparatus, for each of the context clusters, a context cluster probability based on respective probabilities of entry of the query inputs that belong to the context cluster, the context cluster probability being indicative of a probability that at least one query input that belongs to the context cluster and provided for an input context of the context cluster will be selected by the user; and storing, in a data storage system accessible by the data processing apparatus, data describing the context clusters and the context cluster probabilities.

It also tells us that it will calculate probabilities that certain context clusters might be requested by a searcher. So how does Google know what to suggest as context clusters?

Each context cluster includes a group of one or more queries, the grouping being based on the input context (e.g., location, date and time, indicated user preferences and interests) for each of the query inputs, when the query input was provided, and the content described by each query input. One or more context clusters may be presented to the user for input selection based on a context cluster probability, which is based on the context of the user device and respective probabilities of entry of the query inputs that belong to the context cluster. The context cluster probability is indicative of a probability that at least one query input that belongs to the context cluster will be selected by the user. Upon selection of one of the context clusters that is presented to the user, a list of queries grouped into the context cluster may be presented as options for a query input selection. This advantageously results in individual query suggestions for query inputs that belong to the context cluster but that alone would not otherwise be provided due to their respectively low individual selection probabilities. Accordingly, users’ informational needs are more likely to be satisfied.
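One way to read the “at least one query input” wording is as a probability that combines the individual query probabilities in a cluster. The patent doesn’t give a formula, so the sketch below is only my assumption: it treats each query as being selected independently of the others. The function name and the numbers are made up for illustration.

```python
from math import prod

def cluster_probability(query_probs):
    """Probability that at least one query in the cluster is selected.

    Assumes the queries are selected independently of each other
    (an illustrative assumption, not something the patent states).
    """
    return 1.0 - prod(1.0 - p for p in query_probs)

# Hypothetical per-query selection probabilities for a "Movies" cluster.
movies = {
    "movie times": 0.05,
    "movie trailers": 0.04,
    "movie reviews": 0.03,
    "movies near me": 0.06,
}

p = cluster_probability(movies.values())
print(f"Movies cluster probability: {p:.3f}")
```

Under these made-up numbers, no single query is very likely to be chosen, but the cluster as a whole is noticeably more likely than any one of its queries. That matches the point in the passage about suggestions that would not be shown on their own because of low individual selection probabilities.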

The patent application is:

(US20190050450) Query Composition System
Publication Number: 20190050450
Publication Date: February 14, 2019
Applicants: Google LLC
Inventors: Jakob D. Uszkoreit
Abstract:

Methods, systems, and apparatus for generating data describing context clusters and context cluster probabilities, wherein each context cluster includes query inputs based on the input context for each of the query inputs and the content described by each query input, and each context cluster probability indicates a probability that at a query input that belongs to the context cluster will be selected by the user, receiving, from a user device, an indication of a user event that includes data indicating a context of the user device, selecting as a selected context cluster, based on the context cluster probabilities for each of the context clusters and the context of the user device, a context cluster for selection input by the user device, and providing, to the user device, data that causes the user device to display a context cluster selection input that indicates the selected context cluster for user selection.

What are Context Clusters as Query Suggestions?

The patent tells us that context clusters might be triggered when someone is starting a query on a web browser. I tried it out, starting a search for “movies” and got a number of suggestions that were combinations of queries, or what seem to be context clusters:

The patent says that context clusters would appear before someone began typing, based upon topics and user information such as location. So, if I were at a shopping mall that had a movie theatre, I might see Search suggestions for movies like the ones shown here:

Context Clusters

One of those clusters involved “Movies about Business”, which I selected, and it showed me a carousel, and buttons with subcategories to also choose from. This seems to be a context cluster:

Movies about Business

This seems to be a pretty new idea, and may be something that Google would announce as an available option if and when it launches, much like they did with the Google Assistant. I usually check through the news from my Google Assistant at least once a day. If it starts offering search suggestions based upon things like my location, it could potentially be very interesting.

User Query Histories

The patent tells us that context clusters selected to be shown to a searcher might be based upon previous queries from a searcher, and provides the following example:

Further, a user query history may be provided by the user device (or stored in the log data) that includes queries and contexts previously provided by the user, and this information may also factor into the probability that a user may provide a particular query or a query within a particular context cluster. For example, if the user that initiates the user event provides a query for “movie show times” many Friday afternoons between 4 PM-6 PM, then when the user initiates the user event on a Friday afternoon in the future between these times, the probability associated with the user inputting “movie show times” may be boosted for that user. Consequentially, based on this example, the corresponding context cluster probability of the context cluster to which the query belongs may likewise be boosted with respect to that user.
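The Friday afternoon example can be sketched in a few lines. The patent doesn’t say how the boost is calculated, so everything here is an assumption on my part: the function name, the one-hour window, and the size of the boost are all hypothetical.

```python
from datetime import datetime

def boosted_probability(base_prob, query, history, now,
                        boost_per_match=0.1, cap=1.0):
    """Raise base_prob for each past entry of `query` made on the same
    weekday and within one hour of the current hour.

    The matching rule and boost size are illustrative assumptions.
    """
    matches = sum(
        1
        for past_query, when in history
        if past_query == query
        and when.weekday() == now.weekday()
        and abs(when.hour - now.hour) <= 1
    )
    return min(cap, base_prob * (1.0 + boost_per_match * matches))

# Hypothetical query history: (query, time it was entered).
history = [
    ("movie show times", datetime(2019, 2, 1, 17, 0)),   # a Friday
    ("movie show times", datetime(2019, 2, 8, 16, 30)),  # a Friday
    ("weather", datetime(2019, 2, 6, 9, 0)),             # a Wednesday
]

friday = datetime(2019, 2, 15, 17, 15)   # a later Friday afternoon
tuesday = datetime(2019, 2, 12, 10, 0)   # a Tuesday morning

on_friday = boosted_probability(0.05, "movie show times", history, friday)
on_tuesday = boosted_probability(0.05, "movie show times", history, tuesday)
```

With this history, the query gets a boost on a Friday afternoon and none on a Tuesday morning. Since the cluster probability is built from the query probabilities, a boost like this would carry through to the cluster the query belongs to.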

It’s not easy to tell whether the examples I provided about movies above are related to this patent or whether they are tied more closely to the results that appear in Google Assistant. It’s worth reading through and thinking about potential experimental searches to see if they might influence the results that you may see. It is interesting that Google may attempt to anticipate what to show us as query suggestions, after showing us search results based upon what it believes are our interests based upon searches that we have performed or interests that we have identified for Google Assistant.

The context cluster may be related to the location and time that someone accesses the search engine. The patent provides an example of what might be seen by the searcher like this:

In the current example, the user may be in the location of MegaPlex, which includes a department store, restaurants, and a movie theater. Additionally, the user context may indicate that the user event was initiated on a Friday evening at 6 PM. Upon the user initiating the user event, the search system and/or context cluster system may access the content cluster data 214 to determine whether one or more context clusters is to be provided to the user device as an input selection based at least in part on the context of the user. Based on the context of the user, the context cluster system and/or search system may determine, for each query in each context cluster, a probability that the user will provide that query and aggregate the probability for the context cluster to obtain a context cluster probability.

In the current example, there may be four queries grouped into the “Movies” cluster, four queries grouped into the “Restaurants” cluster, and three queries grouped into the “Dept. Store” cluster. Based on the analysis of the content cluster data, the context cluster system may determine that the aggregate probability of the queries in each of the “Movies” cluster, “Restaurant” cluster, and “Dept. Store” cluster have a high enough likelihood (e.g., meet a threshold probability) to be input by the user, based on the user context, that the context clusters are to be presented to the user for selection input in the search engine web site.
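The MegaPlex example boils down to scoring each cluster and keeping the ones that meet a threshold. Here is a rough sketch of that step. The probabilities, the threshold, and the extra “Gardening” cluster are invented, and the way the query probabilities are aggregated (assuming independent queries) is my own guess, since the patent only says the probabilities are aggregated.

```python
from math import prod

def clusters_to_show(clusters, threshold):
    """Return the names of clusters whose aggregate probability meets
    the threshold, most likely first.

    Aggregation assumes independent queries (an illustrative choice).
    """
    scored = {
        name: 1.0 - prod(1.0 - p for p in probs)
        for name, probs in clusters.items()
    }
    passing = [name for name, score in scored.items() if score >= threshold]
    return sorted(passing, key=lambda name: scored[name], reverse=True)

# Hypothetical per-query probabilities for a user at MegaPlex
# on a Friday evening at 6 PM.
clusters = {
    "Movies": [0.10, 0.08, 0.05, 0.05],       # four queries
    "Restaurants": [0.08, 0.06, 0.05, 0.04],  # four queries
    "Dept. Store": [0.07, 0.05, 0.04],        # three queries
    "Gardening": [0.01, 0.01],                # unlikely in this context
}

shown = clusters_to_show(clusters, threshold=0.15)
print(shown)
```

With these numbers the three clusters from the patent’s example clear the threshold, and the unrelated cluster is left out.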

I could see running such a search at a shopping mall, to learn more about the location I was at, and what I could find there, from dining places to movies being shown. That sounds like it could be the start of an interesting adventure.


34 thoughts on “Context Clusters in Search Query Suggestions”

  1. Fascinating article and insights. I’ve been running some tests myself as context clusters are underused for client seo. Producing content that are supporting topics to your chosen core topic is so vital today and that is apparent with the patent you mentioned.

  2. Amazing Article, Bill! I was curious about Context Clusters, but I couldn’t find an explanation like you did anywhere else. Thank you! This will certainly help me with seo knowledge in search queries. Keep up the good work!

  3. Bill, what is your suggestion to small business owners who are trying to navigate the big world of Google? I read through your article and wanted to forward it to my web guy to ask if this is something we need to be concerned with. How does one grow their knowledge of the search engine and help position their company at the top. Thank you again for this article.

  4. Hi Barry,

    Business owners who decide to conduct business online can find it a competitive advantage to learn as much as they can about Google, Bing, and other potential sources of traffic to their sites. This means spending time on the blogs that each publishes, and the many support pages that each offers, including webmaster guidelines, and developer’s pages. Google has webmaster evangelists such as John Mueller and Gary Illyes, who both answer questions at places like Twitter, and on the Google Webmaster Help forums. Paying attention to sources such as these can bring access to the latest news, and helpful advice from Google. The Google Search Console provides many tools filled with data about your site (same with Google Analytics) and that is worth spending time learning about, and using both to learn about traffic to your sites. Bing has Bing Webmaster Tools, which also provides information about your site from Bing’s perspective.

    In addition to learning as much as possible about sources of traffic to your site from search engines, there are places like Pinterest and Twitter and Facebook that may be worth looking into as well. Social networks can be another way your customers can reach out to you, and use to learn more about what goods and/or services that you offer. Using such sources to build positive relationships with your audience can be helpful as well.

    Also, when it comes to your customers, try to find the places that they might frequent on the Web and talk about things like the goods and services that you offer. What you might learn from them in those places may change how you offer what you do. Being alert and aware can provide dividends.

    With this post, it is about how Google may be working on making how it works with knowledge panels and structured data better – Google is always working on improving how it does what it does. If you don’t have a knowledge panel for your business, haven’t been using Google My Business to verify your business, and to use Google posts in a knowledge panel about your business, those are a couple of steps along Google’s efforts along the knowledge graph. It may give your web guy some ideas about how to start preparing for those days. The Google Developer’s Pages have a lot of information about using structured data that your web guy might find a good starting point – those may be worth looking into, because Google is going through changes, and is finding ways to evolve what they offer. Knowing where they might go, and knowing as much about your audience as possible are both things that can be very helpful to small business owners.

  5. We’ve been doing a lot of work around clustering for content based on topics and intent, so it was interesting to read about how Google might cluster search query suggestions based on context. We’ve never dived too deep into the technical details of how Google and its algorithm links words, topics and searches together, just understood that they do and followed a common sense approach to how these relate to better SEO. What I found most interesting from your article though was the idea that context clusters would appear before someone started typing, based on things like the user history and location – is that right? Could Google really build up enough individual user and location-specific data that it would effectively become a personalised search engine for each user that can pre-empt what they need/are going to ask for? Not sure whether to be fascinated or slightly unnerved.

  6. Hi Matt,

    The inventor of this patent was working on the Google Assistant, which provides a news feed that focuses upon your stated interests, and your search history, so the idea that they might provide context clusters before you start typing shouldn’t be surprising. I go through the Google Now news feed at least a couple of times a day, and it often provides opportunities to update my interests. Usually, they do a pretty good job of providing me with information that I am interested in. I haven’t had to set up alerts, and they are showing me things I am interested in seeing. I think that is kind of nice.

  7. Hey Bill,
    Thanks for wonderful informative content,
    for me, it really adds huge value in terms of my knowledge towards google search query.
    looking forward to expanding my knowledge with your content.

    Thanks a lot,

  8. Bill, this is super interesting stuff. Trying to understand the searcher’s intent behind a keyword is a massive difficulty when you try to make keyword’s research at scale. This hints that the Google Auto-Complete really is closer and closer to expose the searcher’s intent. However, the autocomplete API is long gone 🙁

  9. Hi Bill,

    I am new here and recently read your articles. You always try to share the informative article, anyway I didn’t know about context clusters after reading this I got few ideas.

  10. Always waited for this moment to arrive. So this blog was quite helpful and informative thanks to the author keep posting such content.

    These posts are very useful to go through and I am eager to read some more content like this. Images are very eye catchy and love the display of content being done by the author would love to get some more of like this.

  11. Wow, a topic I personally find to be supporting the very heart of Google’s search results. My opinion anyway. INTENT is a consideration that the SEO community must expertly study and understand on behalf of our trusted clients. Again, my opinion. I want to thank you Bill Slawski, for making that a bit easier with the content you share here… Actually, including consideration of “Query Suggestions ” in our SEO began from information shared by you. Thanks, DUDE!

  12. This article is very good for me. Thanks for writting a such good blog

  13. The structural data does not appear or appear according to the keyword. When you observe it for a long time, it can be felt as a complex algorithm.

  14. Amazed!
    The quality of information you have delivered in this blog is remarkable. I enjoyed reading it and will definitely suggest my friends to go through this blog for once.
    Thanks for sharing.

  15. I love the way you get that little extra bit of juice from an article by doing some additional digging into the social profiles of the creators of patents.

    I think you can learn a lot about a wide variety of things by identifying the progenitor of something and then picking their brain by inspecting trails on the internet. You make SEO an interesting read.

    Thanks for the hard work!

  16. Great article about Google algorithm and cluster search, I’m always interested in reading this type of materials and trying to understand how to improve my ranking. I’m also wondering if you can write an article about Voice Search and other ranking factors used by Google.

  17. Bill, I strongly agree with you when you say that Google is doing everything they can to be innovative leaders. I think it’s pretty amazing how far-stretched their innovation really is. Even though they weren’t the first to come up with the new technology such as augmented reality glasses, but I think that’s a good example of their innovation: basically do as much cool stuff as possible. Oh, and not be evil while doing it.

  18. I am reading a blog on this website for the first time and I would like to tell you that the quality of the article is up to the mark it is very well written. Thank you so much for writing this article and I will surely read all the blogs from now on. Thank you so much for caring about your content and your readers.

  19. Bill Slawski, I must confess to you, your methodology is highly interesting. The information provided in this write up are really useful and helpful to me

  20. Hi Bill

    Awesome blog for Context Clusters in Search Query Suggestions! its amazing articles though. Thanks for sharing informative article!

    Keep it up!

  21. This Blog should be read by every Digital Marketing Enthusiast.
    Such a great content about the core patent behind search suggestion.
    Extremely Great niche Website for Digital Marketers. Loved it!

  22. I seen so many blogs daily and frequently saying.. I like your blogs the most.. because we can see the research and work done on it.. Really liked it
