My last post, Not All Anchor Text is Equal and other Co-Citation Observations, was a response to a White Board Friday video posted a couple of weeks ago at the SEOmoz Blog, Prediction: Anchor Text is Dying…And Will Be Replaced by Co-citation. I didn’t expect my next post (this one) to revisit that post and its observation that the way certain words co-occur on pages might be a ranking signal that Google is using.
Rand noted that first page rankings for three different pages, which didn’t seem particularly well optimized for the queries they were returned for, might be based upon a ranking signal that looks at how words tend to co-occur on pages related to those queries. My post in response explored some reranking approaches by Google that also might account for those rankings, including Phrase-Based Indexing, Google’s Reasonable Surfer Model, Named Entity Associations, category associations involving categories assigned to queries and categories assigned to webpages, and Google’s use of synonyms in place of terms within queries.
Google’s Phrase-Based Indexing approach pays a lot of attention to words (phrases, actually) that appear together, or co-occur, in the top (10/100/1,000) search results for a query, and may boost pages in rankings based upon that co-occurrence. That seemed like a possible reason why those pages might be appearing on the first page of results. The other reranking approaches that I included also seemed like they might be responsible, in part or in full, for the rankings as well. Then I found a patent granted to Google this week that seems like an even better fit.
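To make the co-occurrence idea a little more concrete, here is a minimal Python sketch. It is not the process described in Google’s Phrase-Based Indexing patents. The function names `related_phrases` and `cooccurrence_boost`, the `min_share` cutoff, and the simple substring matching are all my own assumptions for illustration.

```python
from collections import Counter


def related_phrases(top_results, candidate_phrases, min_share=0.5):
    """Return the candidate phrases found in at least min_share of the top results.

    top_results: list of page texts returned for a query (hypothetical input).
    candidate_phrases: lowercase phrases to look for (hypothetical input).
    """
    counts = Counter()
    for text in top_results:
        lowered = text.lower()
        for phrase in candidate_phrases:
            if phrase in lowered:
                counts[phrase] += 1
    threshold = min_share * len(top_results)
    return {phrase for phrase, n in counts.items() if n >= threshold}


def cooccurrence_boost(page_text, related):
    """Fraction of the related phrases that also appear on the page (0.0 to 1.0)."""
    if not related:
        return 0.0
    lowered = page_text.lower()
    hits = sum(1 for phrase in related if phrase in lowered)
    return hits / len(related)


top_results = [
    "cell phone ratings and reviews of smart phones",
    "Compare cell phone ratings, plans and reviews",
    "history of the telephone",
]
related = related_phrases(top_results, ["ratings", "reviews", "plans"])
boost = cooccurrence_boost("Consumer reviews of handsets", related)
```

In this toy example, "ratings" and "reviews" each show up in two of the three top results, so they count as related phrases. A page that mentions only one of them gets half of the possible boost. A real system would presumably work with far more results and a much more careful notion of what counts as a phrase.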
Last Friday, in a well-received and thoughtful White Board Friday at SEOmoz originally titled Prediction: Anchor Text is Dying…And Will Be Replaced by Co-citation (since retitled at SEOmoz as Prediction: Anchor Text is Weakening…And May Be Replaced by Co-Occurrence), Rand Fishkin described how some unusual search results caused him to question how Google was ranking some results.
I’m a big fan of looking at and trying to analyze and understand search results for specific queries, especially when they include results that appear somewhat puzzling, and I think those provide some great fodder for discussions about how Google might be ranking some search results. Thanks, Rand.
If I were to tell you that the major search engines have a bigger and richer database full of information than their index of the World Wide Web, would you believe me? Chances are that you’re one of the people who helped build it. The information that Google and Bing and Yahoo collect about the searches and query sessions and clicks that searchers perform on the Web covers an incredible number of searches a day. When Google introduced their Knowledge Graph this past May, they gave us a hint of the scope and usage of this database:
For example, the information we show for Tom Cruise answers 37 percent of next queries that people ask about him. In fact, some of the most serendipitous discoveries I’ve made using the Knowledge Graph are through the magical “People also search for” feature.
When someone performs a search for a query that doesn’t produce many results at Google or Bing, the search engines might remove some of the query terms to provide more results, or they might look for synonyms that might help fill the same or a similar informational need. But chances are that such approaches still might not produce the kinds of results that searchers want to see.
Can the quality of the resources that your pages or videos or other documents link to influence the ranking of your pages, based upon a reachability score? A newly granted patent from Google describes how the search engine might look at linked documents and other resources reachable from a page or video or image to determine such a reachability score.
Pages might be promoted (boosted) or demoted in search results for a query based upon that reachability score, which is calculated from a number of different factors.
Someone clicks on a search result, and while there they find links to other resources that they might click upon. Different user behaviors recorded by a search engine might be monitored to determine how people interact with the first, or primary resource visited, and similar user behavior signals may also be looked at for pages or videos or other resources linked to from that resource. Reachability scores might also be calculated for those secondary resources linked to from the first resource, looking at the third or tertiary pages and other resources linked to from the secondary resources.
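As a rough illustration of how scores for secondary and tertiary resources might roll up into a score for the primary resource, here is a small Python sketch. It is not the formula from the patent. The `engagement` values (standing in for whatever user behavior signals a search engine records), the `decay` weight, and the simple averaging are all assumptions I have made up for the example.

```python
def reachability(resource, links, engagement, depth=2, decay=0.5):
    """Hypothetical reachability score for a resource.

    links: dict mapping a resource to the resources it links to.
    engagement: dict mapping a resource to an assumed user-behavior score.
    depth: how many link hops to follow (2 = secondary and tertiary resources).
    decay: assumed weight applied to scores from one hop further away.
    """
    if depth == 0:
        return 0.0
    children = links.get(resource, [])
    if not children:
        return 0.0
    total = 0.0
    for child in children:
        # A linked resource contributes its own engagement plus a
        # discounted share of the reachability of what it links to.
        deeper = reachability(child, links, engagement, depth - 1, decay)
        total += engagement.get(child, 0.0) + decay * deeper
    return total / len(children)


links = {"primary": ["video", "article"], "video": ["related-video"]}
engagement = {"video": 0.8, "article": 0.4, "related-video": 1.0}
score = reachability("primary", links, engagement)
```

The depth limit keeps the recursion from running forever when pages link to each other in a loop. Here the primary page ends up with a score of 0.85, helped by the fact that the video it links to leads on to something people engage with strongly.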
Calculating reachability scores may follow a process like the following:
Imagine that a search engine might insert place markers into a web page, perhaps with the use of something like the new Google Tag Manager. These markers could enable a search engine to calculate how long it might take someone to read that page. A newly granted patent from Google describes why they might insert such markers (without really telling us how it might insert them), to determine the reading speed of a page.
The process described by the patent might try to understand how different features associated with a page might cause it to take less time or more time for a visitor to read a page. It would then use that understanding to predict how such features might influence the reading of other pages that don’t have markers inserted into them. These types of features could include language, layout, topic, and the length of text of those documents. These are all things that could affect traffic across the web or at specific websites.
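A very simplified way to picture learning from pages with markers and then predicting for pages without them is sketched below in Python. The patent doesn’t spell out a model like this. The grouping by language and layout, the field names such as `seconds_observed`, and the `default_rate` fallback are assumptions of mine.

```python
from collections import defaultdict


def learn_reading_rates(marked_pages):
    """Average words-per-second for each (language, layout) group.

    marked_pages: list of dicts for pages with markers, each holding
    hypothetical fields: language, layout, word_count, seconds_observed.
    """
    words = defaultdict(int)
    seconds = defaultdict(float)
    for page in marked_pages:
        key = (page["language"], page["layout"])
        words[key] += page["word_count"]
        seconds[key] += page["seconds_observed"]
    return {key: words[key] / seconds[key] for key in words if seconds[key] > 0}


def predict_reading_seconds(page, rates, default_rate=4.0):
    """Predict reading time for a page without markers.

    default_rate is an arbitrary assumed words-per-second value used
    when no marked page shares the page's language and layout.
    """
    rate = rates.get((page["language"], page["layout"]), default_rate)
    return page["word_count"] / rate


marked = [
    {"language": "en", "layout": "single-column", "word_count": 600, "seconds_observed": 150.0},
    {"language": "en", "layout": "single-column", "word_count": 400, "seconds_observed": 100.0},
]
rates = learn_reading_rates(marked)
unmarked = {"language": "en", "layout": "single-column", "word_count": 800}
predicted = predict_reading_seconds(unmarked, rates)
```

With those two marked pages, readers average 4 words per second on single-column English pages, so an 800 word page of the same kind would be predicted to take about 200 seconds. Topic and other features could be added to the grouping key in the same way.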
I’m on the second day of a trip to New York City, giving presentations at SMX East on both the potential impact of mobile devices on the future of search, and on how reputation and authority signals might impact the rankings and visibility of authors and publishers and commenters on the Web.
My first presentation was in the “local and mobile” track of the conference as part of a session titled “Meet Siri: Apple’s Google Killer?” where I joined Bryson Meunier, Will Scott, Andrew Shotland, and moderator Greg Sterling in discussing the potential impact of Apple’s Siri and voice search on SEO and search.
When I read the title for this proposed session a couple of months back, I couldn’t help but start to draft a pitch to join in on the conversation. I’ve been carefully watching patents and papers from Google and Apple and others about inventions and interfaces that might transform the way we search in the future, and the way that people might share information and market businesses online.
Google is experimenting with including emails in your search results. Of course, the emails you see will be personal to you, and won’t be shared with others. The emails will only be the ones that you received via Gmail, and the service is opt-in only. The announcement was made on August 8th, in the Official Google Blog post, Building the search engine of the future, one baby step at a time.
Chances are that the rankings used to decide which emails to show, and the order of those emails, are very similar to the importance rankings used to display different colored markers on your emails in Gmail. One of the good things about those importance ranking markers is that you can search and filter your Gmail emails by them if you want, as well as using other advanced search filters. But we don’t know whether a search within Gmail provides the same kind of ranking and results as the search results you might see when Gmail messages are integrated into Google Web search.
For many search queries, very recent search results (such as from the last 6-12 hours) are preferred over older and more stale results that might rank well based upon popularity signals, including significant past user traffic that might cause them to have been assigned a high ranking. Ranking on popularity may work fine if you think of search engines as a repository of pages that might be relevant as references, like a library.
But with the Web becoming a place where people frequently tweet social networking updates, with news sources striving to be the first to publish about breaking topics, bloggers publishing on new topics, merchants offering new products and discounting old ones, and other content online appearing with an emphasis on freshness, search engines are increasingly becoming a near real-time monitor of the world around us.