Microsoft Playing with Blocks to Understand How Images Might be Related

What is the most important part of a page? If a page has images on it, what images are the most important ones?

If a search engine were to try to understand whether or not any images on the pages of a site were related to each other, how would it go about figuring that out?

The first two questions are easy to answer – the most important part of a page is the part that visitors focus upon when they look at it. The most important images are the ones that people look at and pay attention to when they are on that page.

A newly granted patent from Microsoft tries to answer all three questions in an automated manner. It breaks a page down into blocks and decides on a level of importance for each block relative to the others, based on the probability that a user will focus upon that block (or upon images within it) when looking at the page.

It might consider the importance of one block to another on the same page and on other pages within the same site by looking at links between the blocks on those pages. It might view whether images are within the same blocks or related blocks, and also look for links to images from different blocks to see if and how images might be related.

So, imagine a news site with a front page that is broken into excerpts for articles, with links to fuller stories inside. A number of them have one or two images. Each excerpt might be considered a block, and the images within that block might be considered related because they are shared within a block. If there are additional images in the full story linked to by that block, the images within that story might also be considered to be related.

The patent is Method and system for identifying image relatedness using link and page layout analysis

From the patent on blocks:

A block of a web page represents an area of the web page that appears to relate to a similar topic. For example, a news article relating to an international political event may represent one block, and a news article relating to a national sporting event may represent another block.

So, why break a page down into blocks, instead of just trying to index everything on a page? Probably because many web pages cover more than one topic, as in the example above, where a single page has multiple sections on different topics.

Why decide within those blocks which ones are the most important? Again, on that news page, the articles or excerpts are probably a lot more important than the copyright notice, or a set of navigational links, or some boilerplate that appears on every page of the site.

Plus, figuring out the importance of blocks can help with understanding the relatedness of blocks within the pages of a site – understanding the layout of a page and how parts of pages might link to each other can help with indexing:

After calculating a numeric indicator of these importances for pairs of pages and blocks and pairs of images and blocks, the link analysis system generates an indicator of the relatedness of each image to each other image by combining the calculated importance of a block to a page, the calculated importance of a page to a block, and the calculated importance of an image to a block.

Because the relatedness of an image to another image is based on block-level importance rather than on page-level importance, this relatedness is a more accurate representation of relatedness than conventional link-based search techniques.
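To make that combination a little more concrete, here is a rough sketch in Python. The patent only says the three importances are combined; the choice of matrix multiplication here is my assumption, as are all of the function names, variable names, and example numbers. The example models the news site above: a front page with an excerpt block and a footer block, and a story page with a single block that the excerpt links to.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(m):
    """Swap the rows and columns of a matrix."""
    return [list(col) for col in zip(*m)]


def image_relatedness(block_to_page, page_to_block, block_to_image):
    """Combine the three importance tables into an image-by-image table.

    block_to_page[b][p]  : importance of page p to block b (b links to p)
    page_to_block[p][b]  : importance of block b within page p
    block_to_image[b][i] : importance of image i within block b
    Combining them by matrix products is an assumption for illustration.
    """
    # Block b reaches block c if b links to a page in which c is important.
    block_graph = matmul(block_to_page, page_to_block)
    # An image inherits the links of the blocks that contain it.
    return matmul(matmul(transpose(block_to_image), block_graph), block_to_image)


# Hypothetical site: page 0 is the front page, page 1 is a full story.
# Block 0 = excerpt, block 1 = footer (both on page 0), block 2 = story body.
page_to_block = [
    [0.8, 0.2, 0.0],  # the excerpt matters far more than the footer
    [0.0, 0.0, 1.0],  # the story page is a single block
]
block_to_page = [
    [0.0, 1.0],  # the excerpt links to the story page
    [0.0, 0.0],  # the footer links nowhere in this toy example
    [0.0, 0.0],
]
# Image 0 = excerpt thumbnail, image 1 = footer logo, images 2 and 3 = story photos.
block_to_image = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.7, 0.3],
]

related = image_relatedness(block_to_page, page_to_block, block_to_image)
for row in related:
    print(row)
```

In this toy setup the thumbnail in the excerpt ends up related to the two photos in the full story, weighted by how prominent each photo is in its block, while the footer logo is related to nothing. A page-level approach would have lumped the logo in with everything else on the front page.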

We’re also told that rankings of images may benefit from this method of looking at blocks upon pages:

The link analysis system may also use the relatedness of images to generate a ranking of the images. The ranking may be based on a probability that a user who starts viewing an arbitrary image will transition to another image after an arbitrarily large number of transitions between images.

The link analysis system may also generate a vector representation of the images based on their relatedness and apply a clustering algorithm to the vector representations to identify clusters of related images.
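The ranking described in the first of those two paragraphs sounds a lot like a random walk over images, in the spirit of PageRank. Here is a minimal sketch of that idea in Python, assuming a table of image-to-image relatedness scores is already available. The damping factor, the iteration count, the function name, and the example numbers are all my own assumptions and do not come from the patent.

```python
def rank_images(relatedness, damping=0.85, iterations=100):
    """Estimate how likely a viewer is to end up on each image.

    relatedness[i][j] is the strength of the transition from image i to
    image j. The damping value models a viewer who sometimes jumps to an
    arbitrary image; it is an assumed parameter, not one from the patent.
    """
    n = len(relatedness)

    # Turn each row into transition probabilities. An image with no
    # outgoing relatedness sends the viewer to a uniformly random image.
    transition = []
    for row in relatedness:
        total = sum(row)
        if total > 0:
            transition.append([value / total for value in row])
        else:
            transition.append([1.0 / n] * n)

    # Start from an arbitrary image, then repeat the transition step.
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(scores[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
    return scores


# Hypothetical relatedness table for three images.
example = [
    [0.0, 2.0, 1.0],
    [1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
]
scores = rank_images(example)
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(ranking, scores)
```

The scores sum to one, so each can be read as the long-run share of a viewer's attention that an image receives. In the example, image 0 comes out on top because both of the other images lead back to it.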

The patent was originally filed in 2004, and this idea of looking at blocks isn’t surprising or new. For a good overview of this blocking process, see the Microsoft paper Block-level Link Analysis. For newer work on a very similar process, see: Search Objective Gets a Refined Approach


6 thoughts on “Microsoft Playing with Blocks to Understand How Images Might be Related”

  1. Hey Bill,

    Very interesting read. Thanks. I went to take a peek at the Block Level link Analysis MS Research…

    We are told constantly to optimize all aspects and elements of a web page for the user – for obvious reasons. This, in the majority, appears to support that guideline.

    One thing I would query is that the VIPS algo and the block level web graph sections do appear to complement each other, but they do not necessarily complement traditional eye-tracking and F-shaped heatmap studies. Nor do they necessarily speak to actual user behavior, especially regarding certain types of searchers/users.

    e.g. >

    However, further reading indicates that they are attempting to discount navigational areas and other non-relevant onpage blocks, and that the pre-learning data can attempt to address these issues. Colour schemes are also being taken into consideration. To what extent I am uncertain, and how that is actually interpreted I am unsure of.

    I think that any kind of block level analysis would benefit enormously from actual human user studies per industry and per market across a range of site layouts. If the eye-tracking data and heatmap zones from such a study were to be effectively interpreted and applied to a vertical VIPS algorithm grouping it would offset the nearly pure mathematical nature of such an algo with real human usage behavioral data.

    I realize that would possibly never happen due to numerous feasibility considerations, but in an ideal world, that would be fantastic… and of course I would want to see all that lovely data for myself. :)

  2. rats – the e.g. didn’t show up..

    this is the e.g.

    ‘Intuitively, some blocks with big size and centered position are probably more important than those with smaller block size and margin position.’

  3. Hi F-lops-y

    Those are all good points, and it’s hard not to think of eye tracking studies when you see someone claim that they are trying to figure out what parts of pages people fixate upon.

    The patent does seem to imply that they may be weighing different parts of a page that may be fairly valuable content, and comparing those against each other. To some degree it is, and that might be a problem. I think the approach has evolved a little.

    Block level analysis attempts to break pages into different parts that may not be all that related. But the ultimate goal isn’t so much finding out that people might consider some segments more important than others when viewing the pages. As you note, it is discounting parts of the pages that may be boilerplate or navigation or footers or sidebars, and distinguishing between segments that focus upon different topics.

    The object rank approach still attempts to break down pages into parts, but it also seems to incorporate more information retrieval elements into what it does.

    There’s more on it in a link from my last link in my post at:

    Object-Level Ranking: Bringing Order to Web Objects

    ‘Colour schemes are also being taken into consideration. To what extent I am uncertain, and how that is actually interpreted I am unsure of.’

    In some ways, it could be pretty simple. For example, envision a news page where stories alternately have slightly different background colors. Those different colors would be one of possibly a few good indications that each segment is a different story. Some white space (or some other colored space) between each could help. A headline above each might also help. A horizontal rule (hr) between each could be a nice demarcation point showing that each segment is not related.

    I’m not completely sold on eyetracking on its own as a measure of the importance of different parts of pages, but you might find these eyetracking studies interesting:

    We don’t see that “F” shape everywhere, according to those studies.

    It would be interesting to see how user data could affect that, but I would guess that the way we read (top to bottom and left to right for a good percentage of people who read certain languages) and decoration and coding issues (good and bad) could end up playing more of a role than they should if the idea is to relate information on blocks from one page to information from blocks on other pages.

  4. Pingback: Bill Slawski answers: Does Google really use VIPS to sort the signal from the noise?
