Google’s Paid Search Human Evaluators

A newly granted patent from Google provides details on how advertising from Google may be evaluated by human evaluators…

Last September, Scott Huffman, leader of Google’s Search Evaluation Team, told us about some of the efforts behind the scenes to measure and improve the quality of Google’s search results in a post at the Official Google Blog titled Search evaluation at Google. As one part of the review process that they perform, the search engine may use human reviewers:

Human evaluators. Google makes use of evaluators in many countries and languages. These evaluators are carefully trained and are asked to evaluate the quality of search results in several different ways. We sometimes show evaluators whole result sets by themselves or “side by side” with alternatives; in other cases, we show evaluators a single result at a time for a query and ask them to rate its quality along various dimensions.

Google also uses human evaluators to look at the quality of paid advertising shown through Google’s advertising programs. Here’s a snippet from a classified ad that Google is presently running for a temporary Ads Quality Rater:

As an Ads Quality Rater, you will be responsible for reporting and tracking the visual quality and content accuracy of Google advertisements. Ads Quality Raters use an online tool to examine advertising-related data of different kinds and provide feedback and analysis to Google. Projects worked on may involve examining and analyzing text, web pages, images, and other kinds of information.

You will need an in depth and up-to-date familiarity with English-speaking web culture and media. Additionally, you will apply this knowledge to a broad range of interests and topics. Ads Quality Raters possess excellent written communication skills and web analytic capabilities. You will be required to work 10-20 hours a week on a self-directed schedule.

Google was granted a patent today that provides details on the kinds of things that the search engine may have been looking at in its evaluation of sponsored search results. The patent was originally filed in 2004, and it’s possible that it only provides a summary of an actual process that was taking place back then. It’s also quite likely that Google’s evaluation process has evolved significantly since this patent was filed. It is interesting historically, for the types of questions that it asks and the concerns that it addresses in advertisements. Here’s a screenshot of a partially filled out evaluation form from the patent:

a partially filled out evaluation questionnaire for Google Advertisements.

The patent is:

System and method for rating electronic documents
Invented by Sumit Agarwal, Gokul Rajaram, and Leora Ruth Wiseman
Assigned to Google
United States Patent 7,533,090
Granted May 12, 2009
Filed March 30, 2004

Abstract

A system and method for rating an electronic document such as an advertisement. Rating information is received from one or more evaluators. A signal relevant to a criteria is received and a determination is made whether to deliver the document in response to the signal based on the criteria and the rating information from the one or more evaluators.

The patent goes into a great deal of detail on how advertisements might be reviewed by a large number of evaluators before being considered for use. Those evaluators create a trust score for ads as well as ranking and classifying the ads on a number of criteria. Classification may take place by looking at information such as:

  • Subject matter,
  • Content rating,
  • Aggregate content rating,
  • Sensitivity score,
  • Content type,
  • Language,
  • Geographic origin (e.g., country or city of origin),
  • Geographic area of target audience,
  • Document source,
  • Owner of content,
  • Creator of content,
  • Target demographic, or
  • Other criteria.

Ranking aspects of ads to determine whether they should be run at all, or to match them to content for appropriate audiences, may also look at such things as:

  • Offensiveness content,
  • Pornographic or other prurient content,
  • Adult content,
  • Violence content,
  • Children’s content,
  • Target age,
  • Gender,
  • Race,
  • National origin,
  • Religion, or
  • Other factors.
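To make the flow in the abstract a little more concrete, here is a rough Python sketch of how ratings from several evaluators might be combined and then checked against the limits that come with a request for an ad. This is only an illustration and not the method claimed in the patent: the names `Rating`, `aggregate`, and `should_deliver`, the 0.0 to 1.0 scale, the simple averaging, and the example limits are all assumptions of mine.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Rating:
    evaluator: str
    criterion: str  # e.g. "violence" or "adult" (hypothetical labels)
    score: float    # assumed scale: 0.0 (none) to 1.0 (extreme)


def aggregate(ratings):
    """Average the scores that evaluators gave for each criterion."""
    by_criterion = {}
    for r in ratings:
        by_criterion.setdefault(r.criterion, []).append(r.score)
    return {c: mean(scores) for c, scores in by_criterion.items()}


def should_deliver(ratings, limits):
    """Return True if every aggregate score is within the given limits.

    `limits` stands in for the "signal relevant to a criteria": the highest
    score a publisher or audience will accept for each criterion.
    Assumption: a criterion nobody rated is treated as a score of 0.0.
    """
    scores = aggregate(ratings)
    return all(scores.get(c, 0.0) <= limit for c, limit in limits.items())


ratings = [
    Rating("evaluator_a", "violence", 0.25),
    Rating("evaluator_b", "violence", 0.75),
    Rating("evaluator_a", "adult", 0.0),
    Rating("evaluator_b", "adult", 0.0),
]

childrens_site = {"violence": 0.25, "adult": 0.0}
general_site = {"violence": 0.5, "adult": 0.5}

print(should_deliver(ratings, childrens_site))  # False
print(should_deliver(ratings, general_site))    # True
```

In this toy example the two evaluators average out to 0.5 for violence, so the ad would be held back from the stricter children’s site but allowed on the general one. A real system would likely weigh evaluators differently and use many more criteria.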

The patent provides more details on how this evaluation process may work, as well as offering a number of examples of the processes involved. If you use paid search, you may find some value in spending time reading through the patent more deeply.

38 thoughts on “Google’s Paid Search Human Evaluators”

  1. The questions in the list look like some sort of parental control guidelines. It’s funny how relevance of the ad to website content does not seem to be included in the excerpt shown.

  2. I am not sure which would be better but whatever they do will probably be pretty good and logically applied. I think humans have to be part of the process somehow.

  3. It seems a wise decision to obtain an understanding of cultural bias, but I wonder how subjective it is since the evaluators are carefully trained. Also, I would be curious if this technique would be used to screen out poor results from a SERP. For example, Merchant Circle does a great job at targeting keywords to rank higher in a page. They offer no content; they do sell advertising space. I see more firms adopting this method to increase their revenues. Will these middle men be cut out when a human evaluator enters the picture?

  4. It’s good that Google has humans evaluate its own algorithm, at least for the search results, but why for ads? Do they really care what an ad looks like?

  5. Hi Niche,

    Some of the questions did remind me of parental control guidelines, too. There’s some mention in the patent filing about the possibility of only showing some ads during certain hours. I’m guessing that the thinking might imitate television programming, where certain shows are only broadcast during hours when children are more likely sleeping and not accessing TV (or the Web). Is that a good model for web advertising to follow?

    I believe that relevance to website content is a separate concern. This is an evaluation of the content of an ad to determine if it should be shown at all, or shown only in certain contexts.

  6. Hi zzllrr,

    This kind of evaluation is a review of sponsored listings. I’m not sure that it will have an impact upon what shows up in the organic search results, but the purpose seems to be to provide higher quality ads to show with those results. Interesting to see how much human review is relied upon in that area, but I’m not sure that it’s surprising to see.

  7. Hi People Finder,

    It’s not surprising to see that Google manually reviews ads, but seeing some of the process behind that review was interesting, and it’s good to get a sense of how they do perform that review. :)

  8. Hi Mo,

    We do know that Google does use some human reviewers to look at the quality of their search results, too. It makes sense for them to do so.

  9. Hi Frank,

    Google also offers the opportunity for people to offer their thoughts on translations, and if something in those translations might be culturally insensitive. It makes a lot of sense to be sensitive to potential problems.

    I like it when a patent filing tries to provide some background information about the problems and issues that they are intended to address. Here’s how this patent filing begins:

    With the advent of the Internet, a seemingly limitless variety of content may be provided to people of varying ages, preferences, and sensibilities. Some content may be more appropriate for some individuals and groups than for others. For instance, violent or pornographic content is typically deemed inappropriate for children. Providing audience-appropriate content is desirable because it generally maximizes benefits and minimizes burdens for both the content provider and the audience. For instance, an audience-appropriate and relevant ad is more likely to generate a sale than an offensive and irrelevant one.

    Accordingly, ads and other content are often targeted to specific audiences that may have an interest in the content. For instance, ads directed to males may be displayed during nationally televised football events because they draw large numbers of male viewers. Similarly, an ad for an airline may be displayed at an Internet search engine site when a user submits a query including the word “plane.”

    However, providing ads and other documents based on user-related content does not ensure the propriety of that content for a particular audience. For instance, a beer advertisement may not be appropriate on a website for recovering alcoholics, even though the ad and the content of the website are related by subject matter.

    I’ve seen some really poorly matched ads alongside search results over the past decade. Hopefully human evaluators help make a difference.

  10. Hi Paulus,

    I do think that it is very important for Google to evaluate the quality and content of ads that they display. Considering that the advertisements shown with search results are one of Google’s primary sources of revenue, Google has a very strong interest in presenting ads that are appropriate, relevant, and high quality. If the ads shown with search results are offensive and annoying, that may negatively impact the numbers of searchers who use Google.

  11. It’s funny how you can have a temp job with Google but I think that they’re secretly having them click sponsored ads in the search engine while performing the “full analysis”.

  12. Hi Scotty,

    It would be a very questionable practice for Google to have their quality reviewers click on paid ads while analyzing pages. It would make sense for Google to have controls set into place to monitor for that kind of practice, and I would guess that they probably do. There isn’t much value in Google having their analyzers click through ads without any intent to buy, especially since doing so would mean more clicks that need to be paid for with fewer actual conversions. The impact of increasing clicks without conversions might earn a few pennies while simultaneously possibly losing advertising accounts – a penny-wise but pound-foolish approach. If the number of clicks goes up, but the ratio of conversions to clicks goes down, Google may lose advertisers. That’s likely a result that they don’t want.

  13. Hi Paulus,

    Yes. :) If Google were to start showing ads that didn’t inspire much trust in consumers, that could harm their revenue considerably. Quality management is important, and the reputation of Google and their advertising program could suffer without it.

  14. Hi Big Picture,

    I remember a presentation by one of the User Interface designers at Google describing some simple tests that they did with Google Maps. One test involved changing the font size shown to visitors, and the other involved showing the “map” on the right side of the page instead of the left side.

    They made those changes and went live with them to collect data. It didn’t take them too long to collect data from a few hundred thousand visitors to Google Maps.

    I wonder how many people apply for evaluator positions like the one I linked to above.

  15. This reminded me of something I read weeks ago:

    http://econsultancy.com/us/blog/3569-google-does-no-evil-except-when-there-s-lots-of-money-involved

    As much as the technical stuff is involved…don’t forget Google is a business too. In this article I read about some ads that Google ran that were very unethical, but Google still runs them even today…the reason for this is obviously money.

    Citing:
    “As Wall points out, Google is probably making over $10,000/day on these ads on Google alone and over $10,000/day on these ads through AdSense content network.”

    Would love to hear your opinion on this William.

  16. I’m personally quite surprised Google would admit to something like this. I don’t think that there is anything wrong with it, and it’s clearly the most effective way of policing their system, but it does take away from the “mechanical” feel Google has. Admitting that people play a part within their results in any capacity seems to take away from the “feel” of their powerful algorithm.

  17. Hi Jimmy,

    I know what you mean. When everything is said and done, humans often don’t do a bad job at being able to look at a web page and decide whether or not it’s relevant for a specific search query. Many of the mechanical algorithms that “learn” to make judgments about the relevance of web pages (and how well or poorly the algorithms are doing) do so based upon a sample, or seed set, of pages that have been evaluated by human beings. And those algorithms are based upon human judgments and assumptions of what might return relevant pages as well. No matter how mechanical and objective we might like search results to be, there’s just no escaping some human judgment being involved in what they do.

  18. Bill Slawski,

    For almost a year now I have noticed visitors from Google in Mountain View, California on my web site. I suspected that human reviewers from Google look at my web pages and study whether or not they’re relevant for a specific search keyword phrase.

    Google first diminished the importance of meta tags other than the title tag, and later gave importance to backlinks from web directory submissions, and after that to blog commenting, forum posting, social bookmarking, etc…

    Obviously Google is diminishing the importance of its own search engine algorithm and trying to involve human evaluation. It’s interesting that with less competitive keyword phrases a web page can easily reach #1 in Google’s SERPs with only backlinks pointing at it and appropriate anchor text within the links, even though the theme or topic of that web page can be totally different and the page doesn’t have to contain the link’s keyword phrase at all.

    So IMHO Google tries to involve human evaluators to assess quality and relevancy of web pages.

  19. Hi Web Dizajn,

    Thanks for sharing your experiences. I believe that Google human reviewers often telecommute to Google from locations around the globe, so it’s possible that reviewers may come from places other than Google’s Mountain View Network too.

    I don’t believe that Google has ever used meta keywords and meta descriptions for ranking purposes, though they will use meta descriptions as snippets for pages in search results. The idea of using anchor text and calculating PageRank for pages is described in some of the earliest papers about the search engine, such as The Anatomy of a Large-Scale Hypertextual Web Search Engine:

    The text of links is treated in a special way in our search engine. Most search engines associate the text of a link with the page that the link is on. In addition, we associate it with the page the link points to. This has several advantages. First, anchors often provide more accurate descriptions of web pages than the pages themselves. Second, anchors may exist for documents which cannot be indexed by a text-based search engine, such as images, programs, and databases. This makes it possible to return web pages which have not actually been crawled. Note that pages that have not been crawled can cause problems, since they are never checked for validity before being returned to the user. In this case, the search engine can even return a page that never actually existed, but had hyperlinks pointing to it. However, it is possible to sort the results, so that this particular problem rarely happens.

    This idea of propagating anchor text to the page it refers to was implemented in the World Wide Web Worm [McBryan 94] especially because it helps search non-text information, and expands the search coverage with fewer downloaded documents. We use anchor propagation mostly because anchor text can help provide better quality results. Using anchor text efficiently is technically difficult because of the large amounts of data which must be processed. In our current crawl of 24 million pages, we had over 259 million anchors which we indexed.

    I don’t believe that human evaluation is intended to diminish the importance of Google’s algorithms, but rather to allow the search engine to evaluate what they are doing, and improve that algorithm. As you say, to “assess quality and relevancy of web pages.”

  20. Hi,

    I think Google needs human evaluation for algorithm building and as data input for some algorithms to work. As far as I understood the papers on combating link spam, algorithms can determine pretty well if a site on the web is spammy or not after a subset of the web has been marked manually as spam/not-spam. So I imagine the data could be used for statistical purposes and in order to calibrate their evaluation mechanisms.
    It would make sense to take a similar approach on ads, which again have to be validated before being published. In the case of text ads it is an automated process; for images it is still done manually. But again, an automated system has to be calibrated somehow.

  21. Hi Stef,

    Good points. Many of the processes that the search engines describe as involving “machine learning” often start with a set of human judgments regarding a seed or test set of links or sites or images, and it makes sense that they be manually reviewed by humans from time to time.

  22. Human evaluation is a must in all areas involving search engines. I believe it is Google’s human evaluators (or the extent to which they are used) that set them apart from other search engines. I am involved in this process myself, but due to the confidentiality agreement I am limited in regards to what I can disclose. Anyway, evaluators are used to check and fix the results of automated processes and to help improve future automated processes.

  23. Hi Alan,

    Thanks for sharing what you can. Determining the actual relevance of search results is a hard problem. I think it’s essential for human evaluators to check on how well or poorly automated systems are working as well.

  24. The Google human evaluators sound like employment with Leapforce, Lionbridge and Butler Hill. Wanted to know if anyone has information on how to apply directly to Google for the search evaluation position? I am experienced with this kind of work and have already applied for the Ads Quality Rater position, but for English (England). Seems they have not hired for the Ads Quality Rater in the US for a while. Any advice would be appreciated.
    Thanks

  25. Hi Goeb,

    I believe I’ve seen ads from Leapforce and Lionbridge that were for Search Engine evaluators, most likely for Google. Can’t remember exactly when the last one I saw was (on Craigslist), but it was only sometime within the past few months.

  26. Hi Scott,

    I believe that Google still does use some human evaluators to provide manual feedback on the relevance of both paid and organic search results. Google has come up with a way of looking at the quality of advertisements and landing pages in an automated manner, but chances are that they still see some value in getting some feedback from real people.
