Image Search Without the Text

Search engines index images using very little information taken from the images themselves; instead, they rely largely on the text that appears on the page around those images.

It appears that Google is getting smarter with image search: a new, undocumented feature lets you restrict searches to faces or to news images.

A paper from Yushi Jing of the Georgia Institute of Technology, and Shumeet Baluja and Henry Rowley of Google, looks at an approach to indexing images that relies much more heavily on characteristics of the images themselves:

Canonical Image Selection from the Web

From the paper's abstract:

This paper explores the use of local features in the concrete task of finding the single canonical images for a collection of commonly searched-for products. Through large-scale user testing, the canonical images found by using only local image features significantly outperformed the top results from Yahoo, Microsoft and Google, highlighting the importance of having these image features as an integral part of future image search engines.

Here’s the Local Coherence-based Image Selection Algorithm described in the paper:

1. Given a text query, retrieve the top 1000 images from Google image search and generate SIFT features for these images.
2. Identify matching features with the Spill Tree algorithm.
3. Identify common regions shared between images by clustering the matched feature points.
4. Construct a similarity graph. If there is more than one cluster, select the best cluster based on its size and average similarity among the images.
5. From the chosen cluster, select the image with the most edges and the strongest connections to the other images.
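
The paper's own implementation isn't public, but here is a minimal Python sketch of the general idea. It assumes OpenCV's SIFT implementation, uses a brute-force matcher with Lowe's ratio test as a stand-in for the Spill Tree, and replaces the clustering and cluster-selection steps with a simple weighted-degree rule over the similarity graph; the function names and file names are hypothetical.

```python
# Rough sketch of a canonical-image selection pipeline -- not the authors' code.
import itertools
import cv2


def sift_descriptors(path, max_size=640):
    """Load an image in grayscale, downscale large images, and compute SIFT descriptors."""
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    if img is None:
        return None
    scale = max_size / max(img.shape)
    if scale < 1.0:
        img = cv2.resize(img, None, fx=scale, fy=scale)
    sift = cv2.SIFT_create()
    _, desc = sift.detectAndCompute(img, None)
    return desc


def match_score(desc_a, desc_b, ratio=0.8):
    """Count feature matches that pass Lowe's ratio test.

    A brute-force matcher stands in here for the paper's Spill Tree,
    which is an approximate nearest-neighbor structure."""
    if desc_a is None or desc_b is None:
        return 0
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    matches = matcher.knnMatch(desc_a, desc_b, k=2)
    return sum(1 for pair in matches
               if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance)


def select_canonical(image_paths, min_matches=10):
    """Build a similarity graph over the candidate images and return the best-connected one."""
    descriptors = {p: sift_descriptors(p) for p in image_paths}
    weighted_degree = {p: 0 for p in image_paths}
    for a, b in itertools.combinations(image_paths, 2):
        score = match_score(descriptors[a], descriptors[b])
        if score >= min_matches:          # keep only reasonably coherent pairs
            weighted_degree[a] += score   # edge weight = number of shared local features
            weighted_degree[b] += score
    # The image sharing the most local features with the rest of the set plays
    # the role of the most connected node in the paper's similarity graph.
    return max(weighted_degree, key=weighted_degree.get)


if __name__ == "__main__":
    candidates = ["img_001.jpg", "img_002.jpg", "img_003.jpg"]  # hypothetical files
    print("Canonical image:", select_canonical(candidates))
```

The actual algorithm first clusters the matched feature points, picks the best cluster by size and average similarity, and only then chooses the most connected image within that cluster; the sketch collapses those steps into a single degree computation for brevity.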