Databases Answering Queries
Imagine that Google might rank some sites based on how well their databases answer queries. A Google patent describes this approach as looking at database service requirements to rank large sites, such as those that cover products, jobs, travel, recipes, and movies. Such sites might include some static pages that show off what their databases can do, for example by answering queries like “Brand X Cameras for less than $300.00”.
The patent provides some examples of the types of sites it covers:
Many websites for which data available in resources store the data in large databases of structured information. For example, job search websites may have respective job databases, and respective resources (web pages) that include forms to search the databases. Likewise, recipe websites have respective databases for recipes, and movie websites have respective databases for movies. Requesting information for a certain recipe or movie causes the website to query its respective database and generate a webpage that presents the information in a structured format.
The patent tells us that most search engines do not account for the abilities of databases of such sites to respond to particular queries, which may make Google different from those search engines. The patent says:
However, many scoring algorithms do not score the search capabilities of a database when determining the relevance of a resource generated from data stored in the database. As a result, the search engine may not identify data that are particularly relevant to a query, and/or identify particular search capabilities that are available to the user that issued the query and that may help the user satisfy his or her informational need.
Imagine that Google may rank sites based upon “a service requirement score for the database,” or how well their databases answer queries.
That service requirement score would be a measure of the ability of the database to fulfill the service requirement, which is what enables it to respond to a query. I just wrote a post, Google’s Related Questions Patent or ‘People Also Ask’ Questions, that described how Google is working to build a database of questions and their answers to show as “people also ask” questions. It’s not too much of a stretch to imagine that Google is saving up questions that might be asked about jobs, travel, commerce, movies, and recipes, trying to determine which sites are best at answering them, and ranking those sites on their ability to answer questions with different parameters, such as “a recipe for X that is under 1,000 calories.”
Advantages Under this Patent
The patent describes a couple of practical, helpful advantages of using its processes:
Websites need not generate multiple “optimized webpages” that are optimized for particular instances of queries to ensure that the website is identified in a search result. Instead, the underlying capabilities of the website database and the authority of the website are used as metrics to surface websites and databases that are of high quality with respect to a particular query. This reduces the overall cost of website management, and provides users with data that are more likely to satisfy the user’s informational need than the optimized webpages.
The systems and methods can utilize the conceptual schemas of the databases to provide additional information for queries that may not otherwise be derived from the queries. For example, a user that types in the search query [Brand X cameras under 300] may be searching for Brand X cameras that cost less than $300. The user, however, may not know that the “Q” models of Brand X cameras are prosumer models that each retail in excess of $300. Thus, by use of a product database, the search engine may determine that “Q” model are each in excess of $300. Thus, the search engine may modify the query with an operator that excludes the “Q” models, e.g., [Brand X cameras under 300 OP:NOT(Q)], or, alternatively, modify the query to emphasize resources that include reference to Brand X models that are priced under $300. The search engine thus surfaces fewer resources that include extraneous information, thereby satisfying the user’s informational need more quickly than if the extraneous information were provided.
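The camera example in that passage can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `Product` record, the function names, and the toy catalog are hypothetical, and the patent does not say how Google would actually build the exclusion operator.

```python
from dataclasses import dataclass


@dataclass
class Product:
    brand: str
    model_line: str   # e.g. the "Q" models in the patent's example
    price: float


def excluded_model_lines(catalog, brand, max_price):
    """Return the model lines of `brand` in which every product costs more than max_price."""
    prices_by_line = {}
    for product in catalog:
        if product.brand == brand:
            prices_by_line.setdefault(product.model_line, []).append(product.price)
    # A line is excluded only if even its cheapest product is over the limit.
    return sorted(line for line, prices in prices_by_line.items()
                  if min(prices) > max_price)


def rewrite_query(query, catalog, brand, max_price):
    """Append an exclusion operator in the style of the patent's [.. OP:NOT(Q)] example."""
    excluded = excluded_model_lines(catalog, brand, max_price)
    if not excluded:
        return query
    return f"{query} OP:NOT({', '.join(excluded)})"


# Hypothetical product database
catalog = [
    Product("Brand X", "Q", 450.0),
    Product("Brand X", "Q", 520.0),
    Product("Brand X", "S", 199.0),
    Product("Brand Y", "Q", 100.0),
]
rewritten = rewrite_query("Brand X cameras under 300", catalog, "Brand X", 300)
print(rewritten)
```

The point of the sketch is that the exclusion comes from the product database, not from the words of the query: the searcher never mentioned the “Q” models.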
The patent is:
Resource identification from organic and structured content
Inventors: Trystan G. Upstill and Jack W. Menzel
Assignee: Google Inc.
US Patent 9,589,028
Granted: March 7, 2017
Filed: March 16, 2016
Methods, systems, and apparatus, including computer program products for structured content ranking. In an aspect, a method determines a service requirement from terms of a query, the service requirement being one of a plurality of service requirements fulfilled by databases; determines, for each of the databases, a service requirement score for the database, the service requirement score being a measure of an ability of the database to fulfill the service requirement; selects databases based on the service requirement scores; generates data responsive to the service requirement based on the terms of the query and one or more of the selected databases; and generates, from the data identifying resources that are determined to be responsive to the query and from the data responsive to the service requirement, search results that include first search results that each identify a corresponding resource that was determined to be responsive to the query.
SEO Based Upon Database Capabilities
This patent describes how sites might be ranked based upon their ability to answer questions from searchers, instead of just on how well optimized those sites are for information retrieval relevance scores and link-based importance scores. We don’t know how much weight Google might give to a database service requirement ranking, but since Google is trying to find the most helpful sites, it may well be treated as an important metric. The detailed description for the patent starts off by telling us this:
In some implementations, the search engine ranks results using a first ranking algorithm and based on non-semantic search terms, e.g., [nursing jobs]. The search system then accesses database information that describes the content and capabilities of website databases to determine which of the databases can fulfill a database service requirement. For example, if the query is [nursing jobs in Palo Alto over 100,000], the search system will identify jobs databases that have geographic and salary parameters that includes the values of “Palo Alto” and “100,000” or more. Using this information, the search engine may promote (or demote) search results referencing resources of a website that includes a database, and/or revise the query to include a constraint to filter out (or emphasize) resources that include certain terms.
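The promote (or demote) step described there might look something like the following sketch. The multiplicative boost formula, the `boost` parameter, and the example sites are my own assumptions for illustration; the patent does not give a formula, and we don’t know how much weight Google would apply.

```python
def rerank(results, db_scores, boost=0.5):
    """Re-rank first-pass results using per-site database scores.

    results:   list of (url, site, base_score) from the first ranking algorithm
    db_scores: site -> score in [0, 1] for how well that site's database
               can fulfill the service requirement (hypothetical scale)
    """
    adjusted = []
    for url, site, base_score in results:
        db_score = db_scores.get(site)
        if db_score is None:
            # No known database for this site: leave its score alone.
            final = base_score
        else:
            # Scores above 0.5 promote the result, scores below 0.5 demote it.
            final = base_score * (1 + boost * (2 * db_score - 1))
        adjusted.append((url, final))
    return sorted(adjusted, key=lambda item: item[1], reverse=True)


# Hypothetical first-pass results for [nursing jobs]
results = [
    ("a.example/jobs", "a.example", 1.0),
    ("b.example/nursing", "b.example", 0.9),
    ("c.example/blog", "c.example", 0.8),
]
# b.example's database handles the location and salary parameters; a.example's does not.
db_scores = {"a.example": 0.0, "b.example": 1.0}
ranked_urls = [url for url, _ in rerank(results, db_scores)]
print(ranked_urls)
```

In this toy run, the site whose database can handle “Palo Alto” and “100,000” moves to the top even though it started in second place.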
The patent provides a definition of a service requirement:
a service that is requested, either implicitly or explicitly, by a query. For example, for the query [nursing jobs in Palo Alto over 100,000], the service requirement is a job search. Likewise, for the query [LAX to SFO] (or [Flights LAX to SFO]), the service requirement is a flight search.
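To make that definition concrete, here is a toy sketch of mapping a query to a service requirement. The trigger patterns are hypothetical and deliberately simple; the patent does not spell out how a service requirement is detected, and a real system would presumably use something much richer than keyword matching.

```python
import re

# Hypothetical trigger patterns, one per service requirement.
SERVICE_PATTERNS = {
    "job search": re.compile(r"\bjobs?\b", re.IGNORECASE),
    # Either the word "flight(s)" in any case, or two airport-style codes such as "LAX to SFO".
    "flight search": re.compile(r"(?i:\bflights?\b)|\b[A-Z]{3} to [A-Z]{3}\b"),
}


def service_requirement(query):
    """Return the first service requirement whose pattern matches, or None."""
    for name, pattern in SERVICE_PATTERNS.items():
        if pattern.search(query):
            return name
    return None


print(service_requirement("nursing jobs in Palo Alto over 100,000"))
print(service_requirement("LAX to SFO"))
```

Note that [LAX to SFO] never says “flight”, which is why the patent describes a service as being requested “either implicitly or explicitly.”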
Search Results Using Website’s Databases
The website databases may have different parameter types depending upon what kind of content is contained in the site. For instance:
…a flight search database may be configured to receive parameter values for the following parameter types: origin location, destination location, times and dates. Likewise, a job search database may be configured to receive parameter values for the following parameter types: location, job category, and salary.
The values a database accepts for these parameters may differ from site to site, so
a particular job database may be tailored to only nursing jobs in New York and thus, the parameter value from the parameter type “Nursing Category” may be limited to specific nursing categories, e.g., Cardiology, Cardiothoracic, Hemodialysis, etc.
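One way to picture a service requirement score built from those parameter types and values is the sketch below. The `DatabaseInfo` record, the parameter names, and the “fraction of requested parameters satisfied” scoring rule are all my assumptions; the patent only says the score measures a database’s ability to fulfill the service requirement.

```python
from dataclasses import dataclass, field


@dataclass
class DatabaseInfo:
    site: str
    service: str
    # parameter type -> set of accepted values, or None if any value is accepted
    parameters: dict = field(default_factory=dict)


def service_requirement_score(db, service, requested):
    """Fraction of the requested parameter values that this database can handle."""
    if db.service != service:
        return 0.0
    if not requested:
        return 1.0
    satisfied = 0
    for param_type, value in requested.items():
        if param_type not in db.parameters:
            continue  # the database does not support this parameter type at all
        accepted = db.parameters[param_type]
        if accepted is None or value in accepted:
            satisfied += 1
    return satisfied / len(requested)


# Hypothetical databases: one tailored to nursing jobs in New York, one general.
ny_nursing = DatabaseInfo(
    "ny-nursing.example", "job search",
    {"location": {"New York"},
     "job category": {"Cardiology", "Cardiothoracic", "Hemodialysis"}},
)
general_jobs = DatabaseInfo(
    "jobs.example", "job search",
    {"location": None, "job category": None, "salary": None},
)
requested = {"location": "Palo Alto", "job category": "Nursing", "salary": 100000}
ny_score = service_requirement_score(ny_nursing, "job search", requested)
general_score = service_requirement_score(general_jobs, "job search", requested)
print(ny_score, general_score)
```

Under these assumptions, the New York nursing database scores poorly for a Palo Alto query, while the general jobs database that accepts location, category, and salary values scores well.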
If Google knows which parameters and values a site’s database accepts, it can show results from that database right in search results, as shown in this screenshot from the patent:
I haven’t seen any search results quite like this yet, but it seems to be something to keep an eye open for.
Google may exclude some results from a query if they don’t fit what a person might have searched for, as shown in this screenshot from the patent:
This makes a search engine more like something that searches a web-sized database. This may be part of the future of SEO.