How Google Rejects Annoying Advertisements and Pages

How might a search engine approve or reject ads automatically, without human review, on the basis that the ads are annoying or displeasing in some way?

If you overlook the very large volume of ads submitted to Google every day, you might assume that the company manually reviews every ad that advertisers submit for publication, which would take a lot of people. While ads should attract some attention, they shouldn’t be annoying or offensive. Google sets a number of standards for image ads, video ads, and text ads.

A patent application from Google goes into a good amount of depth on how it might take a programmatic approach to identifying ads and Web pages that are “annoying.”

The patent filing describes some of the methods used when reviewing images, text, and audio, with tools like Optical Character Recognition and pattern matching against large databases of images and sounds. It also details how Flash and animated images might be reviewed, but it is silent on what it is measuring when it refers to things like a “Trust Score.”

Continue reading “How Google Rejects Annoying Advertisements and Pages”

Google’s User Distributed Search Results in Emails, IMs, Blogger

Imagine writing an email, blog post, forum post, or instant message, and having a button you can press that runs queries based upon what you’ve written and finds relevant lists of search results for you to include alongside it.

Those can be maps and local search result lists for addresses or businesses that you’ve included, or search results for a name or product in your message or post, or other information. A couple of illustrations below show how these might look in email, or might be offered to someone writing a post in Blogger.

Local search results inserted into an email:

Google User Distributed Search - Local

Continue reading “Google’s User Distributed Search Results in Emails, IMs, Blogger”

New Google Ranking Patents

These two patents, granted to Google today, appear to describe some interesting approaches to ranking pages in search results using large amounts of data about pages, queries, and user interactions.

Ranking documents based on large data sets
Invented by Jeremy Bem, Georges R. Harik, Joshua L. Levenberg, Noam Shazeer, and Simon Tong
Assigned to Google
US Patent 7,231,399
Granted June 12, 2007
Filed: November 14, 2003


A system ranks documents based, at least in part, on a ranking model. The ranking model may be generated to predict the likelihood that a document will be selected. The system may receive a search query and identify documents relating to the search query. The system may then rank the documents based, at least in part, on the ranking model and form search results for the search query from the ranked documents.
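To make the abstract a little more concrete, here is a minimal sketch of ranking by predicted selection likelihood. It is not the method claimed in the patent: the logistic form, the feature names (`query_term_in_title`, `past_click_rate`), the weights, and the document IDs are all hypothetical, chosen only to show how a model that predicts "will this document be selected?" can drive the ordering of results.

```python
import math


def predict_selection_probability(features, weights, bias=0.0):
    """Estimate the probability that a document is selected (logistic model).

    Assumption: a logistic model is used here only for illustration;
    the patent abstract does not specify the model form.
    """
    score = bias + sum(
        weights.get(name, 0.0) * value for name, value in features.items()
    )
    return 1.0 / (1.0 + math.exp(-score))


def rank_documents(candidates, weights, bias=0.0):
    """Return document IDs sorted by predicted selection probability, highest first."""
    scored = [
        (predict_selection_probability(features, weights, bias), doc_id)
        for doc_id, features in candidates.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored]


# Hypothetical weights, as if learned from logged queries and clicks.
WEIGHTS = {"query_term_in_title": 1.2, "past_click_rate": 2.5}

# Hypothetical candidate documents already identified for one query.
CANDIDATES = {
    "doc_a": {"query_term_in_title": 1.0, "past_click_rate": 0.10},
    "doc_b": {"query_term_in_title": 0.0, "past_click_rate": 0.60},
    "doc_c": {"query_term_in_title": 1.0, "past_click_rate": 0.40},
}

ranking = rank_documents(CANDIDATES, WEIGHTS)
print(ranking)
```

In a real system the weights would be fit to very large logs of user behavior; the sketch only shows the final scoring and sorting step.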

Continue reading “New Google Ranking Patents”

Context Sensitive Stemming for Web Search

It is unclear how much most commercial search engines rely on stemming when returning search results, because stemming can reduce the relevance of those results and because it can be computationally expensive.

Researchers at Yahoo! take a second look at stemming, and how it can be adapted to Web search, in Context Sensitive Stemming for Web Search.

The paper explores using statistical language modeling to perform a context sensitive analysis and predict which variants of the words in a query will be useful when expanding a search for that query. This can result in far fewer bad expansions, which means less computation and better precision in the results.

Context sensitive document matching is also performed for the expanded variants.
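The variant-selection idea described above can be illustrated with a very small sketch. This is a simplification, not the paper's model: it assumes a plain bigram language model built from a toy corpus, and the corpus, the `min_prob` threshold, and the function names are all hypothetical. A variant is kept only if it is likely enough next to the words that surround it in the query.

```python
from collections import Counter

# Toy corpus standing in for a large query log (illustrative only).
CORPUS = [
    "hotel price comparison",
    "hotel price comparison",
    "hotel price comparison",
    "hotel prices",
    "book prices",
    "book prices",
    "book pricing guide",
]


def build_counts(sentences):
    """Count unigrams and adjacent-word bigrams in the corpus."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = sentence.split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def select_variants(query, position, variants, unigrams, bigrams, min_prob=0.2):
    """Keep variants of the word at `position` that fit the query context.

    The fit is the product of bigram probabilities with the left and
    right neighbors, when those neighbors exist.
    """
    tokens = query.split()
    kept = []
    for variant in variants:
        prob = 1.0
        if position > 0:
            left = tokens[position - 1]
            prob *= bigrams[(left, variant)] / unigrams[left] if unigrams[left] else 0.0
        if position < len(tokens) - 1:
            right = tokens[position + 1]
            prob *= bigrams[(variant, right)] / unigrams[variant] if unigrams[variant] else 0.0
        if prob >= min_prob:
            kept.append(variant)
    return kept


unigrams, bigrams = build_counts(CORPUS)
hotel_variants = select_variants("hotel price", 1, ["prices", "pricing"], unigrams, bigrams)
book_variants = select_variants("book price", 1, ["prices", "pricing"], unigrams, bigrams)
print(hotel_variants, book_variants)
```

With this toy data, the same word gets different expansions depending on its neighbor, which is the point of making stemming context sensitive: variants that rarely occur in that context are never sent to the index.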

Continue reading “Context Sensitive Stemming for Web Search”

Search Engines Learning from Advanced Searchers?

There’s been a fair amount of research into how data about user interactions with commercial search engines might help those search engines rank and recommend pages.

But some searchers may be better at finding relevant results than others. Understanding the differences between searchers and their searching strategies may be helpful to those search engines.

A study from Microsoft Research, Investigating the Querying and Browsing Behavior of Advanced Search Engine Users (pdf), attempts to better understand how people with greater experience in conducting searches interact with a search engine to see if that knowledge can help others with less experience.

Their research involved looking at the interaction logs of advanced search engine users, and the logs of those with less experience. They tell us that there are

Continue reading “Search Engines Learning from Advanced Searchers?”