Google Image Classification and Landmarks

Image Classification in the Past

Back in 2008, I wrote about how a search engine might learn from photo databases like Flickr, and from how people label images there, in a post called Community Tagging and Ranking in Images of Landmarks.

In another post that covers the Flickr image classification Landmark work, Faces and Landmarks: Two Steps Towards Smarter Image Searches, I mentioned part of what the Yahoo study uncovered:

Using automatically generated location data, and software that can cluster together similar images to learn about images again goes beyond just looking at the words associated with pictures to learn what they are about.

That approach uses metadata from images in an image collection, which is very different from what Google describes in the post How Google May Interpret Queries Based on Locations and Entities (Tested), where it might identify landmarks based upon a knowledge of their actual location.

Continue reading “Google Image Classification and Landmarks”

Context Clusters in Search Query Suggestions

Photo by Saketh Garuda on Unsplash

Context Clusters and Query Suggestions at Google

A new patent application from Google tells us about how the search engine may use context to find query suggestions before a searcher has completed typing in a full query. Think of Google as a Decision Engine, focused upon bringing searchers more information about interests they may have. After seeing this patent, I’ve been thinking about previous patents I’ve seen from Google that have similarities.

Continue reading “Context Clusters in Search Query Suggestions”

Universal Search Updated at Google

Photo by Tristan Colangelo on Unsplash

Sura gave up on her debugging for the moment. ‘The word for all this is “mature programming environment.” Basically, when hardware performance has been pushed to its final limit, and programmers have had several centuries to code, you reach a point where there is far more significant code than can be rationalized. The best you can do is understand the overall layering, and know how to search for the oddball tool that may come in handy. Take the situation I have here.’ She waved at the dependency chart she had been working on. ‘We are low on working fluid for the coffins. Like a million other things, there was none for sale on dear old Canberra. Well, the obvious thing is to move the coffins near the aft hull, and cool by direct radiation. We don’t have the proper equipment to support this, so lately I’ve been doing my share of archeology. It seems that five hundred years ago, a similar thing happened after an in-system war at Torma. They hacked together a temperature maintenance package that is precisely what we need.’

‘Almost precisely.’

Continue reading “Universal Search Updated at Google”

How Google’s Knowledge Graph Updates Itself by Answering Questions

Photo by Elijah Hail on Unsplash

Those of us who are used to doing Search Engine Optimization (SEO) have been looking at URLs filled with content and at the links between that content. We have watched how algorithms such as PageRank (based upon links pointed between pages) and information retrieval scores (based upon the relevance of that content) determine how well pages rank in search results in response to queries entered into search boxes by searchers. Web pages have been seen as nodes of information connected by links. This was the first generation of SEO.

Chances are good that many of the methods that we have been using to do SEO will remain the same as new features appear in a Knowledge Graph-based search, such as knowledge panels, rich results, featured snippets, structured snippets, search by photography, and expanded schema covering many more industries and features than it does at present.

Continue reading “How Google’s Knowledge Graph Updates Itself by Answering Questions”

How Google Identifies Primary Versions of Duplicate Pages

Identifying Primary Versions of Duplicate Pages

We know that Google doesn’t penalize duplicate pages on the Web, but it may try to identify which version it prefers to other versions of the same page.

I came across this statement from Dejan SEO on the Web about duplicate pages earlier this week, and wondered about it, and decided to investigate more:

If there are multiple instances of the same document on the web, the highest authority URL becomes the canonical version. The rest are considered duplicates.

The above quote is from the post Link inversion, the least known major ranking factor. (It is not something I am saying with my post.) I wanted to see if there might be something similar in a patent. I found something close, but it doesn’t say the same thing that Dejan predicts.
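To make the Dejan statement quoted above concrete, here is a minimal sketch of the idea as stated: group identical documents, then treat the URL with the highest authority as the canonical version. This illustrates the claim in the quote, not anything taken from a Google patent. The function name `pick_canonicals`, the numeric `authority` score, and the use of a content hash to detect duplicates are all assumptions made for the example.

```python
import hashlib
from collections import defaultdict


def pick_canonicals(pages):
    """Group exact-duplicate pages and pick one canonical URL per group.

    `pages` is an iterable of (url, content, authority) tuples. The
    authority score is a hypothetical number; how a search engine
    would compute it is not covered here.
    Returns a dict mapping each canonical URL to its duplicate URLs.
    """
    groups = defaultdict(list)
    for url, content, authority in pages:
        # Assumption: duplicates are byte-identical, so a hash of the
        # content is enough to group them.
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        groups[digest].append((authority, url))

    canonicals = {}
    for members in groups.values():
        # Highest authority first; ties are broken by URL for stable output.
        members.sort(key=lambda member: (-member[0], member[1]))
        canonical_url = members[0][1]
        canonicals[canonical_url] = [url for _, url in members[1:]]
    return canonicals


pages = [
    ("https://a.example/doc", "same text", 0.4),
    ("https://b.example/doc", "same text", 0.9),
    ("https://c.example/other", "different text", 0.2),
]
print(pick_canonicals(pages))
```

In this toy data, the two copies of "same text" form one group and the higher-scoring URL is kept as canonical. Real duplicate detection would likely need to handle near-duplicates as well, which this sketch does not attempt.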

Photo of a man in a cave by Luke Leung on Unsplash

I read that article from Dejan SEO about duplicate pages and thought it was worth exploring more. While looking through Google patents that include the word “Authority,” I found one that doesn’t quite say the same thing that Dejan does. It is interesting, though, because it describes ways to distinguish between duplicate pages on different domains based upon priority rules, which could help determine which duplicate page might be the highest authority URL for a document.

Continue reading “How Google Identifies Primary Versions of Duplicate Pages”

Quality Scores for Queries: Structured Data, Synthetic Queries and Augmentation Queries

Quality Scores and Augmentation Queries

In general, the subject matter of this specification relates to identifying or generating augmentation queries, storing the augmentation queries, and identifying stored augmentation queries for use in augmenting user searches. An augmentation query can be a query that performs well in locating desirable documents identified in the search results. The performance of an augmentation query can be determined by user interactions. For example, if many users that enter the same query often select one or more of the search results relevant to the query, that query may be designated an augmentation query.

In addition to actual queries submitted by users, augmentation queries can also include synthetic queries that are machine generated. For example, an augmentation query can be identified by mining a corpus of documents and identifying search terms for which popular documents are relevant. These popular documents can, for example, include documents that are often selected when presented as search results. Yet another way of identifying an augmentation query is mining structured data, e.g., business telephone listings, and identifying queries that include terms of the structured data, e.g., business names.

These augmentation queries can be stored in an augmentation query data store. When a user submits a search query to a search engine, the terms of the submitted query can be evaluated and matched to terms of the stored augmentation queries to select one or more similar augmentation queries. The selected augmentation queries, in turn, can be used by the search engine to augment the search operation, thereby obtaining better search results. For example, search results obtained by a similar augmentation query can be presented to the user along with the search results obtained by the user query.
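The flow described in the passages above can be sketched in a few lines: queries that perform well with users are stored, and a new query is matched against the store by its terms. This is only an illustration under stated assumptions. The patent text quoted here does not give thresholds or a similarity measure, so the click-rate cutoff, the Jaccard term overlap, and every name in the code (`QueryStats`, `build_store`, `select_augmentation_queries`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class QueryStats:
    """Hypothetical log summary for one query."""
    query: str
    impressions: int    # times the query was issued
    result_clicks: int  # times a searcher selected one of its results


def build_store(stats_list, min_impressions=100, min_click_rate=0.6):
    """Keep queries whose results are selected often (assumed thresholds)."""
    store = []
    for stats in stats_list:
        if stats.impressions < min_impressions:
            continue  # too little data to judge performance
        if stats.result_clicks / stats.impressions >= min_click_rate:
            store.append(stats.query)
    return store


def terms(query):
    """Lowercased set of whitespace-separated terms."""
    return set(query.lower().split())


def select_augmentation_queries(user_query, store, min_overlap=0.5, limit=3):
    """Return stored queries whose terms overlap the user's query.

    Similarity here is Jaccard overlap of term sets, chosen only
    because it is simple; it is not named in the source.
    """
    user_terms = terms(user_query)
    scored = []
    for candidate in store:
        candidate_terms = terms(candidate)
        union = user_terms | candidate_terms
        if not union or candidate_terms == user_terms:
            continue  # skip empty queries and the same set of terms
        score = len(user_terms & candidate_terms) / len(union)
        if score >= min_overlap:
            scored.append((score, candidate))
    # Best overlap first; ties are broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [candidate for _, candidate in scored[:limit]]


logs = [
    QueryStats("pizza delivery seattle", 500, 400),
    QueryStats("seattle pizza restaurants", 300, 240),
    QueryStats("pizza", 1000, 100),
    QueryStats("car repair portland", 50, 45),
]
store = build_store(logs)
print(select_augmentation_queries("pizza seattle", store))
```

A search engine following the description would then run the selected augmentation queries alongside the user's query and blend the results. That step is left out because the excerpt does not say how the blending is done.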

Continue reading “Quality Scores for Queries: Structured Data, Synthetic Queries and Augmentation Queries”