Last October, I made a list of 20 Ways Search Engines May Rerank Search Results, which was well received (thank you!), and it was suggested recently that I come up with an updated list.
When someone searches at a search engine, the conventional approach a search engine might take is to try to find pages that contain the keywords searched for, and rank and serve them in an order which combines a relevancy score for each of the pages with some kind of importance metric, such as a PageRank Score.
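As a rough illustration of that conventional baseline, here is a minimal sketch that blends a relevancy score with an importance score and sorts on the result. The names `Page`, `conventional_rank`, and `importance_weight`, the example URLs, and the simple linear blend of scores scaled to the range 0 to 1 are all my own assumptions for illustration, not how any particular search engine actually combines them.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    relevance: float   # query-dependent score, assumed to lie in [0, 1]
    importance: float  # query-independent score such as PageRank, assumed in [0, 1]


def conventional_rank(pages, importance_weight=0.3):
    """Order pages by a weighted blend of relevance and importance."""
    def score(page):
        return ((1 - importance_weight) * page.relevance
                + importance_weight * page.importance)
    return sorted(pages, key=score, reverse=True)


pages = [
    Page("a.example", relevance=0.9, importance=0.1),
    Page("b.example", relevance=0.7, importance=0.9),
    Page("c.example", relevance=0.2, importance=0.5),
]
ranked = [page.url for page in conventional_rank(pages)]
print(ranked)
```

The reranking methods in the list that follows can be thought of as adjustments layered on top of an ordering like this one.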
This list contains links to a number of patent applications and a few papers involving ways to rerank search results. Most of these were published after the creation of my previous reranking list.
The approaches in some may overlap a little with some on the previous list in terms of topics covered, but these are new documents discussing how search results might be reranked.
Some of the methods described here may not presently be in use, but it might not hurt to think about them.
1. Desktop Search Influenced by the Contents of an Active Window
Granted on Tuesday, this Google patent describes how a search of the Web may be altered based upon an active document, such as a text document or email or IM message, in an open window on a person’s computer at the time that they are searching while using a desktop search application.
2. Expanded and Adjacent Queries from User Logs
Another recent patent filing, from Microsoft, looks at log files of user query sessions. It takes the terms used in a query, and finds aggregated queries from other searchers who used the same search terms along with other words, to create expanded queries.
It also looks at related queries during sessions (adjacent queries) from other searchers who have searched for the same query. Results from these expanded queries and adjacent queries may be used to rerank the results that show up in response to the original query.
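The log mining step described above might be sketched roughly as follows. This is only an illustration under my own assumptions: sessions are plain lists of query strings, an "expanded" query is any logged query containing all of the original terms plus at least one more, and an "adjacent" query is one issued immediately before or after the original within a session. The function names and the sample log are hypothetical, and the patent filing may define these things differently.

```python
from collections import Counter


def expanded_queries(query, sessions):
    """Count logged queries that contain every term of `query` plus extra terms."""
    terms = set(query.split())
    counts = Counter()
    for session in sessions:
        for logged in session:
            if terms < set(logged.split()):  # strict subset: same terms and more
                counts[logged] += 1
    return counts


def adjacent_queries(query, sessions):
    """Count queries issued immediately before or after `query` in a session."""
    counts = Counter()
    for session in sessions:
        for i, logged in enumerate(session):
            if logged != query:
                continue
            for j in (i - 1, i + 1):
                if 0 <= j < len(session) and session[j] != query:
                    counts[session[j]] += 1
    return counts


# Hypothetical session log: each inner list is one searcher's session.
sessions = [
    ["jaguar", "jaguar car price", "used jaguar car"],
    ["big cats", "jaguar", "jaguar habitat"],
    ["jaguar", "jaguar car price"],
]
print(expanded_queries("jaguar", sessions).most_common())
print(adjacent_queries("jaguar", sessions).most_common())
```

Results retrieved for the most frequent expanded and adjacent queries could then be used as evidence when reordering the results for the original query.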
3. Social Network Endorsements
If you are a member of a social network, and the network allows you to rate and endorse web pages, and to let your friends do the same, the endorsements made by you and by your friends may cause the results that you see from a search to be reranked, according to a Google patent application.
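A minimal sketch of that idea, assuming each result already has a base score and that endorsements are stored as a mapping from URL to the set of users who endorsed it. The function name, the additive `boost` per endorser, and the sample data are hypothetical choices of mine, not details taken from the patent application.

```python
def rerank_by_endorsements(results, endorsements, me, friends, boost=1.0):
    """Rerank (url, base_score) pairs, adding `boost` per endorser in my circle."""
    circle = set(friends) | {me}

    def score(item):
        url, base = item
        endorsers = endorsements.get(url, set()) & circle
        return base + boost * len(endorsers)

    return [url for url, _ in sorted(results, key=score, reverse=True)]


results = [("x.example", 3.0), ("y.example", 2.5), ("z.example", 1.5)]
endorsements = {
    "z.example": {"ann", "bob", "zed"},  # ann and bob are friends, zed is not
    "y.example": {"stranger"},           # endorsed only by someone outside the circle
}
order = rerank_by_endorsements(results, endorsements, me="me", friends={"ann", "bob"})
print(order)
```

Only endorsements from the searcher and their friends count here, so the endorsement from a stranger leaves that page's position unchanged.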
4. Personalized Anchor Text Relevance
Links from pages that contain some anchor text that seems related to information found in an explicit or implicit profile of your interests may weigh more heavily in the rankings of pages under the following recently granted Google patent.
Explicit information is information about interests that you have expressly stated. Implicit information is information about your interests that may be inferred from such things as pages that you have bookmarked, or visited in the past.
5. Recognizing Semantically Meaningful Compounds
A search for more than one term may result in a search engine searching for sets of pages that use all of those terms. Treating some of the words within a query as semantically meaningful compounds, and reranking based upon pages which contain such a compound may mean that more relevant documents are returned to a searcher.
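Here is a small sketch of what reranking on a compound might look like, assuming the compound has already been identified (how that is done is the hard part, and is not shown). The functions, the flat `boost`, and the sample pages are hypothetical, and the whitespace tokenization ignores punctuation.

```python
def contains_compound(text, compound):
    """True if the words of `compound` appear adjacent and in order in `text`."""
    words = text.lower().split()
    target = compound.lower().split()
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))


def rerank_for_compound(pages, compound, boost=1.0):
    """Rerank (url, text, base_score) triples, boosting pages with the compound."""
    def score(page):
        _, text, base = page
        return base + (boost if contains_compound(text, compound) else 0.0)

    return [url for url, _, _ in sorted(pages, key=score, reverse=True)]


pages = [
    ("a.example", "the new museum in york opened", 2.0),  # both terms, not adjacent
    ("b.example", "hotels in new york city", 1.5),        # contains the compound
]
order = rerank_for_compound(pages, "new york")
print(order)
```

Both pages contain both query terms, but only the second uses them together as a compound, so it moves ahead.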
6. Use of Trends and Bursty Topics
Fresh, highly topical, and popular content related to a query may make its way into search results, and push down other relevant results, according to an Ask.com patent filing.
Correlated top gainer events can be used to improve the ranking of search engines and predicting search trends. This is used for adding freshness to the Web index. Those Web pages that contain fresh topics–identified over the stream of news–are boosted in ranking for the period of observation. After a certain amount of time (e.g., a week, a month, etc.), if the topic is no longer fresh the boosting effect is subject to a decay rule.
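The passage above doesn’t say what the decay rule is. As one possible reading, here is a sketch that holds the boost steady during the observation period and then halves it at a fixed interval. The exponential form and the parameters `window_days`, `max_boost`, and `half_life_days` are my own assumptions, not something stated in the filing.

```python
def freshness_boost(days_since_burst, window_days=7.0, max_boost=1.0,
                    half_life_days=3.0):
    """Full boost inside the observation window, exponential decay afterwards."""
    if days_since_burst <= window_days:
        return max_boost
    overdue = days_since_burst - window_days
    return max_boost * 0.5 ** (overdue / half_life_days)


def boosted_score(base_score, days_since_burst):
    """Scale a page's base score up by the current freshness boost."""
    return base_score * (1.0 + freshness_boost(days_since_burst))


for days in (0, 7, 10, 13, 30):
    print(days, round(freshness_boost(days), 4))
```

With these defaults, a page on a bursty topic keeps its full boost for a week, has half of it three days later, and has almost none left after a month.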
7. User Distributed Search Results
A Google patent application describes a method of letting people insert search results into their blogs, emails, and instant messages. Reputation scores may be created for people who do this, and the higher the reputation score, the higher that result might rank for a relevant query searched for by someone else.
8. Advanced Search Users
A Microsoft paper looks at how advanced users of search engines search and browse results, to get a sense of how results might be improved for all searchers.
9. Dual Trustrank
Using community endorsements and ratings of endorsers, along with link-based trustrank, in a dual trustrank process from Yahoo, to rerank pages. The idea is that there are members of your social network whom you trust, and if they endorse a page, then it is likely to be more trustworthy.
Couple that with a link analysis approach to finding webspam, and this “dual” method of identifying trust can be used to show a searcher more trustworthy pages.
10. Web Traffic
By looking at real-time, or near-real-time, web traffic and activity, including search result selections at other search engines, results can be reranked under the methods described in this Ask.com patent application.
11. Different Queries, Similar Results and Selections
A Yahoo patent application looks at query histories for different queries that provide similar results and similar selections amongst the searchers who enter those queries. This may allow a search engine to broaden result sets to include results from those different queries.
- Using matrix representations of search engine operations to make inferences about documents in a search engine corpus
12. Understanding Timely Topics through Alerts
The frequency and timeliness of alert sign ups for different topics could affect rankings under this Google patent application.
13, 14, 15. Similar Users with Similar Interests and Their Selections
Three methods of clustering users with similar interests, to rerank results based upon what those other users have selected. There’s some similarity among these approaches, and some significant differences. But it seemed reasonable at the time I wrote this to cluster them together.
- Scalable user clustering based on set similarity (Google)
- Augmenting user, query, and document triplets using singular value decomposition (Microsoft)
- Methods and systems for providing a response to a query (IAC Search & Media, Inc.)
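To give a feel for the general idea behind all three, here is a deliberately simple sketch that compares users by the overlap of the pages they have selected, then moves results chosen by similar users ahead of the rest. It uses plain Jaccard similarity over click sets, which is my own stand-in and should not be read as the method of any of the three filings. All of the names and data are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similar_users(user, clicks, threshold=0.5):
    """Users whose click sets overlap `user`'s with similarity >= threshold."""
    mine = clicks[user]
    return [other for other, theirs in clicks.items()
            if other != user and jaccard(mine, theirs) >= threshold]


def rerank_by_peers(results, user, clicks, threshold=0.5):
    """Move results selected by similar users ahead of the others."""
    peers = similar_users(user, clicks, threshold)

    def votes(url):
        return sum(url in clicks[peer] for peer in peers)

    # sorted() is stable, so results with equal votes keep their original order.
    return sorted(results, key=votes, reverse=True)


clicks = {
    "u1": {"a", "b", "c"},
    "u2": {"a", "b", "c", "d"},  # similarity with u1 is 3/4
    "u3": {"x", "y"},            # nothing in common with u1
}
order = rerank_by_peers(["x", "d", "a"], "u1", clicks)
print(order)
```

Comparing every pair of users like this would not scale to a real search engine’s user base, which is presumably why the filings focus on scalable clustering and matrix methods.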
16. Paid and Organic Results on the Same Page
The appearance of results in both paid search and organic might cause the organic results to be removed, as described in this Microsoft patent application. Not a major “reranking,” but an interesting one.
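In its simplest form, that could be a filter like the sketch below, which assumes both result sets are lists of already normalized URLs. The function name and data are hypothetical, and the patent application may describe additional conditions for when removal happens.

```python
def drop_organic_duplicates(organic, paid):
    """Remove organic results whose URL also appears among the paid results."""
    paid_urls = set(paid)
    return [url for url in organic if url not in paid_urls]


organic = ["shop.example", "reviews.example", "wiki.example"]
paid = ["shop.example", "deals.example"]
remaining = drop_organic_duplicates(organic, paid)
print(remaining)
```

Everything below a removed result moves up one position, which is the only sense in which this counts as reranking.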
17. High Confidence Spelling Corrections
Spelling corrections where the search engine believes with a high degree of confidence that the query included a misspelling may result in pages being included in results which use the correct spelling. The results for what the search engine believes is the misspelling may then be pushed back, under this Google patent application.
18. Language Match Between Query and Pages Returned
If the language used in the query doesn’t match the language used on the page being returned (except for English language pages), the page may be moved down in search results.
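A rough sketch of that demotion, assuming the language of the query and of each page has already been detected elsewhere and is available as a language code. The multiplicative `penalty`, the function name, and the sample results are my own hypothetical choices.

```python
def rerank_by_language(results, query_lang, penalty=0.5):
    """Rerank (url, lang, base_score) triples, demoting language mismatches.

    English pages are exempt from the penalty, as in the description above.
    """
    def score(item):
        _, lang, base = item
        mismatch = lang != query_lang and lang != "en"
        return base * penalty if mismatch else base

    return [url for url, _, _ in sorted(results, key=score, reverse=True)]


results = [
    ("fr-page.example", "fr", 1.0),
    ("de-page.example", "de", 0.8),
    ("en-page.example", "en", 0.7),
]
order = rerank_by_language(results, query_lang="de")
print(order)
```

For a German query, the French page drops from first to last, while the English page keeps its score even though its language doesn’t match the query.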
19. Labels of Custom Search Results
Reranking based upon the creation and use of custom search engines on different topics, with labels relevant to the queries used. Rather than pointing at the patent filing for this one, this paper on indexing data structures was pretty interesting:
In the case of Google Co-op, customized search engines can specify query patterns that trigger specific facets as well as provide hints for re-ranking search results. The annotations that any customized search engine specifies are visible only within the context of that search engine. However, as we start seeing more custom search engines, it would be desirable to point users to different engines that might be relevant.
- Structured Data Meets the Web: A Few Observations (no longer available)
20. Agent Rank
Rankings based upon the reputation of an author, under a system that ranks different parts of pages based upon verifiable authorship of those sections.
I wrote about this one some more at Search Engine Land – Google’s Agent Rank Patent Application
Other Factors, like Universal Search
There are probably a number of other factors that may influence and cause search results to be reranked. I didn’t even include Universal and Blended search results in this list, though I probably could and should have.
Again, it’s possible that some of these reranking methods are presently being used, some may be used in the future, and some may not be used at all. It’s even more likely that as we move forward, two people performing the same search in different locations, or at different times, or both, will see different results from the search engines in response to the same queries.