In general, the subject matter of this specification relates to identifying or generating augmentation queries, storing the augmentation queries, and identifying stored augmentation queries for use in augmenting user searches. An augmentation query can be a query that performs well in locating desirable documents identified in the search results. The performance of an augmentation query can be determined by user interactions. For example, if many users that enter the same query often select one or more of the search results relevant to the query, that query may be designated an augmentation query.
In addition to actual queries submitted by users, augmentation queries can also include synthetic queries that are machine generated. For example, an augmentation query can be identified by mining a corpus of documents and identifying search terms for which popular documents are relevant. These popular documents can, for example, include documents that are often selected when presented as search results. Yet another way of identifying an augmentation query is mining structured data, e.g., business telephone listings, and identifying queries that include terms of the structured data, e.g., business names.
These augmentation queries can be stored in an augmentation query data store. When a user submits a search query to a search engine, the terms of the submitted query can be evaluated and matched to terms of the stored augmentation queries to select one or more similar augmentation queries. The selected augmentation queries, in turn, can be used by the search engine to augment the search operation, thereby obtaining better search results. For example, search results obtained by a similar augmentation query can be presented to the user along with the search results obtained by the user query.
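The matching step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the patent excerpt does not say how similarity is measured, so the term-overlap (Jaccard) score, the `click_rate` field, the thresholds, and the sample queries are all hypothetical stand-ins.

```python
from dataclasses import dataclass


@dataclass
class AugmentationQuery:
    text: str
    click_rate: float  # hypothetical performance signal from user interactions


def term_overlap(a: str, b: str) -> float:
    """Jaccard similarity between the term sets of two queries (an assumed measure)."""
    terms_a, terms_b = set(a.lower().split()), set(b.lower().split())
    if not terms_a or not terms_b:
        return 0.0
    return len(terms_a & terms_b) / len(terms_a | terms_b)


def select_augmentation_queries(user_query, store, min_overlap=0.3, limit=2):
    """Pick stored augmentation queries whose terms are similar to the user query."""
    scored = [(term_overlap(user_query, q.text), q) for q in store]
    candidates = [(score, q) for score, q in scored if score >= min_overlap]
    # Prefer the closest term match, then the better-performing stored query.
    candidates.sort(key=lambda pair: (pair[0], pair[1].click_rate), reverse=True)
    return [q for _, q in candidates[:limit]]


# Made-up contents of an augmentation query data store.
STORE = [
    AugmentationQuery("cheap flights to seattle", 0.62),
    AugmentationQuery("seattle hotels waterfront", 0.48),
    AugmentationQuery("pike place market hours", 0.71),
]

selected = select_augmentation_queries("flights to seattle", STORE)
print([q.text for q in selected])
```

A search engine could then run the selected queries alongside the user query and merge the two result sets, which is the augmentation the excerpt describes.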
Continue reading “Quality Scores for Queries: Structured Data, Synthetic Queries and Augmentation Queries”
My last post was Five Years of Google Ranking Signals, and I started that post by saying that some other posts about ranking signals have issues. But there are other pages that you may want to look at while you are learning to rank webpages, and I didn’t want to turn people away from one recent post that contains a lot of useful information.
Cyrus Shepard recently published a post about Google Success Factors on Zyppy.com, which I would recommend that you also check out.
Cyrus did a video with Ross Hudgens of Siege Media, called Google Ranking Factors with Cyrus Shepard, where they talked about those ranking signals. I’m keeping this post short on purpose, to make the discussion about ranking the focus of this post, and the star. There is some really good information in the video and in the post from Cyrus. Cyrus takes a different approach to writing about ranking signals from the one I took, but it’s worth the time to visit, listen, and watch.
Continue reading “Learning to Rank”
Search Using Structured Data
Structured Data is information that is set out in a way that makes it easy for a search engine to read. Some examples include XML markup in XML sitemaps and schema vocabulary found in JSON-LD scripts.
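To make that concrete, here is a small sketch of what a JSON-LD script might contain and how a program could read facts out of it. The markup uses the schema.org `Book` type; the `extract_facts` helper and the choice of fields are my own illustration, not something taken from a Google patent.

```python
import json

# Illustrative JSON-LD, as it might appear inside a
# <script type="application/ld+json"> element on a page.
JSON_LD_SCRIPT = """
{
  "@context": "https://schema.org",
  "@type": "Book",
  "name": "The Old Man and the Sea",
  "author": {"@type": "Person", "name": "Ernest Hemingway"},
  "datePublished": "1952"
}
"""


def extract_facts(script_text):
    """Parse one JSON-LD object and return a flat dictionary of facts."""
    data = json.loads(script_text)
    author = data.get("author", {})
    return {
        "type": data.get("@type"),
        "name": data.get("name"),
        # The author may be a nested object or a plain string.
        "author": author.get("name") if isinstance(author, dict) else author,
        "datePublished": data.get("datePublished"),
    }


facts = extract_facts(JSON_LD_SCRIPT)
print(facts)
```

Because the property names are fixed by the vocabulary, the program does not have to guess which words on the page are the title and which are the author.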
A search engine that answers questions based upon crawling and indexing facts found within structured data on a site works differently than a search engine that looks at the words used in a query and tries to return documents that contain those same words. The second kind of search engine is hoping that such a matching of strings might turn up an actual answer to the informational need that inspired the query in the first place. Search using Structured Data works a little differently, as seen in this flowchart from a 2017 Google patent:
In Schema, Structured Data, and Scattered Databases such as the World Wide Web, I talked about the DIPRE algorithm from a patent by Sergey Brin, which I also described in the post Google’s First Semantic Search Invention was Patented in 1999. That patent and algorithm described how the web might be crawled to collect pattern and relations information about specific facts (in that case, facts about books). In the Google patent on structured data, we see how Google might look for factual information set out in semi-structured data such as JSON-LD, to be able to answer queries about facts, such as, “What is a book, by Ernest Hemingway, published in 1948-1952?”
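A query like that one becomes a simple filter once the facts have been collected. The sketch below assumes the facts are already stored as (name, author, year) records; the `BookFact` type, the function name, and the small fact list are hypothetical, and the patent may describe the lookup quite differently.

```python
from typing import NamedTuple


class BookFact(NamedTuple):
    name: str
    author: str
    year: int


# A tiny hand-made fact store, standing in for facts gathered from JSON-LD.
FACTS = [
    BookFact("A Farewell to Arms", "Ernest Hemingway", 1929),
    BookFact("For Whom the Bell Tolls", "Ernest Hemingway", 1940),
    BookFact("Nineteen Eighty-Four", "George Orwell", 1949),
    BookFact("Across the River and Into the Trees", "Ernest Hemingway", 1950),
    BookFact("The Old Man and the Sea", "Ernest Hemingway", 1952),
]


def books_by_author_between(facts, author, first_year, last_year):
    """Return titles by the author published within the inclusive year range."""
    return [
        fact.name
        for fact in facts
        if fact.author == author and first_year <= fact.year <= last_year
    ]


answer = books_by_author_between(FACTS, "Ernest Hemingway", 1948, 1952)
print(answer)
```

The point of the example is that the answer comes from comparing attribute values, not from matching the words of the query against the words of a document.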
Continue reading “Google Patent on Structured Data Focuses upon JSON-LD”
Visiting Seattle to Speak about Structured Data
I spoke at SMX Advanced this week on Schema markup and Structured Data, as part of an introduction to its use at Google.
I had the chance to visit Seattle, and tour some of it. I took some photos, but would like to go back sometime and take a few more, and see more of the city.
One of the places that I did want to see was Pike Place Market. It was a couple of blocks away from the hotel I stayed at (the Marriott Waterfront).
It is a combination fish and produce market, and is home to one of the earliest Starbucks.
I could see living near the market and shopping there regularly. It has a comfortable feel to it.
Continue reading “Schema, Structured Data, and Scattered Databases such as the World Wide Web”
Google Introduces Combined Content Results
This new patent is about “Combined content.” What does that mean exactly? When Google patents talk about paid search, they refer to those paid results as “content” rather than as advertisements. This patent is about how Google might combine paid search results with organic results in certain instances.
The recent patent from Google (Combining Content with Search Results) tells us about how Google might identify when organic search results are about specific entities, such as brands. It may also recognize when paid results are about the same brands, or about products from those brands.
In the event that a set of search results contains high ranking organic results from a specific brand, and a paid search result from that same brand, the process described in the patent might allow for the creation of a combined content result of the organic result with the paid result.
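As a rough sketch of that process, the Python below pairs a paid result with the best-ranked organic result from the same brand. Everything specific here is an assumption made for illustration: the patent summary above does not define what counts as “high ranking,” so the `max_rank` cutoff, the data classes, and the brand names are invented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OrganicResult:
    title: str
    brand: Optional[str]  # entity the result is about, if one was recognized
    rank: int             # 1 is the top position


@dataclass
class PaidResult:
    title: str
    brand: Optional[str]


def combine_results(organic, paid, max_rank=3):
    """Return (organic title, paid title or None) pairs in ranking order."""
    paid_by_brand = {p.brand: p for p in paid if p.brand}
    used_brands = set()
    combined = []
    for result in sorted(organic, key=lambda r: r.rank):
        # Only a high-ranking organic result is eligible for combination.
        match = paid_by_brand.get(result.brand) if result.rank <= max_rank else None
        if match is not None and result.brand not in used_brands:
            combined.append((result.title, match.title))
            used_brands.add(result.brand)
        else:
            combined.append((result.title, None))
    return combined


# Hypothetical brands and results.
organic = [
    OrganicResult("Acme Official Site", "Acme", 1),
    OrganicResult("Running shoe reviews", None, 2),
    OrganicResult("Acme Outlet", "Acme", 3),
    OrganicResult("Bolt Shoes", "Bolt", 8),
]
paid = [PaidResult("Acme Spring Sale", "Acme"), PaidResult("Bolt 20% Off", "Bolt")]

page = combine_results(organic, paid)
print(page)
```

In this example only the top Acme listing is combined with the Acme paid result; the Bolt listing ranks too low under the assumed cutoff, so it stays a plain organic result.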
Continue reading “Google to Offer Combined Content (Paid and Organic) Search Results”