Ordering Images (and other Multimedia) in Search Results By Predicting Clickthroughs

Search for [ships] in Yahoo Image Search, and you’ll see a good number of images of ships. How does the search engine decide the order in which they appear?

[Image: Yahoo Image Search results showing images of ships.]

A search engine tends to rely upon text associated with images to rank them in image search. That could be alt text associated with the image, a caption, or other text that shows up on a page near the picture. Other information may also be used to rank those images, such as how relevant the page an image appears upon is for the query term searched for, and the quantity and quality of links pointing to that page.

Another signal that a search engine might consider is the number of times searchers select an image when they see it in search results. A potential problem with using selections or clickthroughs as a signal is position bias: many searchers expect that images (or web pages or news results or videos) near the top of search results are the most relevant, and so may be more likely to click on whatever appears at the top of the listings that a search engine shows them.

Imagine a search engine coming up with a prediction model for different queries, where the search engine might predict how often searchers might click on a result at different positions in search results. For example, the top result might be clicked upon 12 percent of the time, and the second result 9 percent of the time, and the third result 6 percent of the time, and so on.

The search engine could then track the actual percentage of clicks each result received, to see if some results received more clicks than predicted (therefore overperforming) and other results received fewer clicks than predicted (therefore underperforming). The search engine could also move some results around to see if certain images tended to overperform while others underperformed, based upon the positions they were placed at in the search results.

Images that tended to overperform could be moved up in search results, and images that underperformed could be moved down.
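To make the overperform/underperform idea concrete, here is a minimal sketch in Python. It is not taken from the patent filing. The expected rates reuse the 12, 9, and 6 percent figures from the example above, while the function names and the 10 percent tolerance are assumptions made up for illustration.

```python
# Expected clickthrough rate by position (position 1 is the top result).
# The 12%, 9%, and 6% figures are the illustrative numbers from the post,
# not measured values.
EXPECTED_CTR = {1: 0.12, 2: 0.09, 3: 0.06}


def performance_ratio(position, impressions, clicks):
    """Return the observed CTR divided by the expected CTR for a position."""
    observed = clicks / impressions
    return observed / EXPECTED_CTR[position]


def classify(position, impressions, clicks, tolerance=0.10):
    """Label a result relative to what its position predicts.

    The tolerance band is a hypothetical choice so that small
    fluctuations are not treated as over- or underperformance.
    """
    ratio = performance_ratio(position, impressions, clicks)
    if ratio > 1 + tolerance:
        return "overperforming"
    if ratio < 1 - tolerance:
        return "underperforming"
    return "as expected"


# An image at position 3 that draws 90 clicks from 1,000 impressions is
# doing better than the 6% predicted there, so it is a candidate to move up.
print(classify(3, 1000, 90))
# A top result with 80 clicks from 1,000 impressions falls short of 12%.
print(classify(1, 1000, 80))
```

Comparing each result against the rate predicted for its own position, rather than against raw click counts, is what keeps a result from being rewarded just for sitting at the top.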

A recently published Yahoo patent application explores this process of predicting clickthroughs on search results to reorder images, videos, and other multimedia results based upon how well or poorly they perform at different locations in those search results.

The predictions could be made in part by looking at historical query log files for queries involving multimedia results.
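The patent filing doesn't spell out a log format here, so the following Python sketch assumes a very simple one: each row records which object was shown, at what position, and whether it was clicked. The row layout and the function name are hypothetical, and a real predictor would likely need smoothing for pairs with few impressions.

```python
from collections import defaultdict


def estimate_ctr(log_rows):
    """Estimate a clickthrough rate for each (object, position) pair.

    log_rows is an iterable of (object_id, position, was_clicked) tuples,
    an assumed log format used only for this illustration.
    """
    shown = defaultdict(int)
    clicked = defaultdict(int)
    for object_id, position, was_clicked in log_rows:
        key = (object_id, position)
        shown[key] += 1
        if was_clicked:
            clicked[key] += 1
    # Only pairs that were actually shown appear in the result.
    return {key: clicked[key] / shown[key] for key in shown}


rows = [
    ("ship_a", 1, True),
    ("ship_a", 1, False),
    ("ship_b", 2, False),
    ("ship_a", 1, True),
    ("ship_a", 1, False),
]
print(estimate_ctr(rows))
```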

A couple of different approaches could be followed in using this kind of prediction.

One would be for the search engine to look for the ordering of the top results that is expected to earn the most clickthroughs across all of those results combined.

In one embodiment, ranking is performed based on permutations representing orderings of objects. A permutation is an ordered list. For a set of objects, there exists a finite number of ways to order the objects. For each permutation, a determination is made of the number of selections that are expected to result if a results list is provided in response to queries for objects that match a search query.

For example, if objects are listed in order (1,2,3,4,5), then it might be determined, based on the predictor results for each object at each position, that 10 selections are expected to occur over a period of time. If a different ordering, (2,3,1,5,4) for example, is expected to bring a larger number of selections, then the second ordering is more desirable. In this embodiment, all permutations are compared against one another to determine the permutation that is expected to receive the largest number of selections.
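A brute-force version of that comparison might look like the Python sketch below. The predicted numbers, the object names, and the function name are invented for illustration. The only part drawn from the filing is the idea of scoring every permutation by its expected selections and keeping the best one.

```python
from itertools import permutations


def best_permutation(predicted):
    """Return the ordering with the largest expected number of selections.

    predicted[obj][i] is the number of selections obj is expected to
    receive at position index i (0 is the top of the list).
    """
    objects = list(predicted)
    best_order, best_total = None, float("-inf")
    for order in permutations(objects):
        total = sum(predicted[obj][i] for i, obj in enumerate(order))
        if total > best_total:
            best_order, best_total = order, total
    return best_order, best_total


# Hypothetical predictor output for three objects at three positions.
predicted = {
    "A": [5, 3, 1],
    "B": [6, 2, 1],
    "C": [4, 4, 3],
}
print(best_permutation(predicted))
```

Checking every ordering gets expensive quickly, since five objects already have 120 permutations and ten have more than 3.6 million, so a sketch like this is only practical for short lists.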

Another approach would be to focus upon individual results to see which position would earn the most clickthroughs for each individual image, or object, shown in search results.

In one embodiment, ranking is performed based on a contention game. A predictor determines a number of selections that each object is expected to receive at each position on a results list. The object expected to receive the maximum number of selections at a particular position is placed at the particular position. This determination may be made for each position, with each object competing for each position on the list. This ensures that each position contains the object expected to earn the maximum number of selections for that position.
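Here is one way the contention game could be sketched in Python. The filing's description leaves some details open, so this version assumes that positions are filled from the top down and that an object which wins a position drops out of the contest for later ones. The numbers and names are made up for illustration.

```python
def contention_ranking(predicted):
    """Fill each position with the remaining object predicted to do best there.

    predicted[obj][i] is the number of selections obj is expected to
    receive at position index i (0 is the top of the list).
    Assumption: each object is placed exactly once, top position first.
    """
    remaining = list(predicted)
    order = []
    for i in range(len(predicted)):
        winner = max(remaining, key=lambda obj: predicted[obj][i])
        order.append(winner)
        remaining.remove(winner)
    return order


# Hypothetical predictor output for three objects at three positions.
predicted = {
    "A": [5, 3, 1],
    "B": [6, 2, 1],
    "C": [4, 4, 3],
}
print(contention_ranking(predicted))
```

Under these assumptions each position is decided on its own, which is much cheaper than scoring whole orderings, but it is not guaranteed to produce the ordering with the largest total number of expected selections.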

While this patent filing uses images as an example of this prediction approach, it also tells us that it could be used with videos, word processing documents, computer code, and plain text.

The patent application is:

Predicting and ranking search query results
Invented by Bipin Suresh and Nikhil Garg
US Patent Application 20090171942
Assigned to Yahoo
Published July 2, 2009
Filed December 31, 2007

Abstract

Techniques are described herein for providing search results that are ranked based on a predictor that predicts, for each of a number of objects, likelihoods that each particular object will be selected at different positions on a results list.


16 thoughts on “Ordering Images (and other Multimedia) in Search Results By Predicting Clickthroughs”

  1. This is a good idea. I wonder if something similar will be implemented into web search results in the future.

    I have used something similar for an eCommerce site I worked on. At present it is only used on category pages for arranging the sub-categories. Basically, every time someone clicks on a product or category, a view value is incremented in the database. The aim is obviously to see what is most popular in general and provide that first. However, there is the issue of deciding how to change the value of the votes for products that are further down the pile.

  2. Interesting read. I wonder how much “prediction” is given weight. If a website has given a highly clicked on image and then adds a new image….. link juice for images?

  3. Hi David,

    I wonder if we would even notice if a search engine started doing something like this (if they haven’t already). It is an interesting idea – optimizing search results based upon predicted clickthroughs. The idea of predicting based upon historical data, and resorting based upon whether an item underperforms or overperforms seems to make sense as well.

    Deciding how to sort results in ecommerce pages based upon different criteria can be difficult. In addition to sorting items by how many times certain ones may have been clicked upon or purchased, there may be a benefit in sorting and listing by price, by quality, by brand, by ratings, and in other ways. There’s potentially a great deal of value in testing and trying out different approaches.

  4. Hi Andy,

    Thanks. There does seem to be some value in using images on your pages that people find engaging enough to click through when they are presented out of context in image search results. The predictions themselves appear to be limited to a determination of which order to present images (or videos or other search results) by the search engine on its own pages. But when an image starts ranking higher because it overperforms, does that have any effect on the page that it comes from, other than possibly indirect traffic through image search?

  5. Seems to me like this has some similarities with paid search – trying to return results most likely to get clicked on – which makes the engines more money, or in this case would theoretically provide the most likely clicked on content.

    Interesting concept, for certain.

  6. Certainly makes sense for the search engines to implement something like this. After all it helps them improve their results automatically, shifting the balance from constant algo tweaking to user determined rankings.

  7. Hi Ismael,

    I’ll have to confess that I’m a little surprised to have a comment left on my site by the Director of the Michigan Department of Human Services, and that you sound somewhat knowledgeable about paid search.

    The idea of paying attention to what visitors are doing on your web site, and making changes to make your site more effective makes sense. Seeing how a search engine might do that in the context of which images or videos or other media that visitors might choose from search results makes a lot of sense as well.

  8. Hi JR,

    Good points. Associating an image with a specific query based upon the text that accompanies that image, or the popularity of the page that it originally appears upon may mean images (or videos or other results) may not be all that relevant for those queries. Paying attention to what searchers select in an intelligent fashion does sound like an approach worth exploring.

    An interesting paper from one of the inventors listed on this patent filing from Stanford focuses upon some other ways to determine how relevant pages might be that show up in search results:

    Ascertaining the Relevance Model of a Web search engine

  9. Good Morning All – I have not yet read the patent and will do so soon. At first glance, I am not overly enthused by a prediction method of relevance that associates images that the spider cannot see with text that it can index. Paid search makes these comparisons, between the search term, the ad text and the landing page text, to determine a quality score that drives placement. Customers can override the search engine’s decision by paying more for the lack of machine determined relevance to the searcher’s query. Click-through has its own influence after a time. Here, the search engine is relying solely on the image annotation for relevance determination and the variable factor of prediction. Hmmmm….

  10. Hi Marianne,

    Search engines rank images based upon the meta data associated with those images (alt text, some amount of text surrounding those images, and other text that appears upon the page that the images appear upon), possibly along with the rankings of the pages that those images appear upon, all without seeing the images that they are associating with that text and those rankings. Is considering the selection of images from clickthroughs, along with a prediction model aimed at eliminating bias based upon position, something that helps the rankings of images, or hurts them?

  11. Good points, in this case would theoretically provide the most likely clicked on content.

  12. Hi shadu,

    There has been some development in facial recognition and object recognition software that the search engines are using, and that may play a role in the way that images are ranked. What I liked about this patent filing was that it provided a glimpse at one method that might be used in combination with others.

  13. This is going to make an interesting change in thinking about the way they are ranking as described in the patent. If they truly move the overperforming item in the rankings, that sounds somewhat counterproductive and is definitely going against the norm. If the model works, though, they must be expecting it to prove itself and equalize. This will prove to be an interesting thing to both follow and test, to find out how it will deliver on its strategy as well as its effectiveness.

  14. Hi James,

    This is one of those patent filings for which I’d love to see an accompanying whitepaper that describes an experiment on the processes in the patent, along with some results. I went looking, but couldn’t find one. I did find an interesting paper from one of the inventors, written for a computer science class that he took, which looked at ways of measuring the relevance of search results. It was interesting enough to link to in one of my comments above (Ascertaining the Relevance Model of a Web search engine). Such testing is something I’m hoping to see more of.
