The Most Relevant Reviews or the Highest Quality Reviews?
If you want to write a review that more people might see, and that a search engine might use as a “representative review” to display for a business, product, or service, there are some things you might want to keep in mind while writing, at least according to a patent filing from a couple of Google employees. It isn’t officially assigned to Google at this point, but it lists a Google patent application I wrote about last November on Reputations for Reviewers and Raters as a related filing.
The patent describes “quality” signals in the ideal review, and has some good advice about what Google might look at as the most relevant reviews.
Besides looking for the most relevant reviews, Google seems to like well-formatted reviews.
Run your review through a spell checker and grammar checker – chances are that Google will.
A few short sentences aren’t enough, and a few long paragraphs are too much.
Avoid sentences that are either too long or too short and make sure that those sentences have beginnings, middles, and ends – sentence fragments aren’t favored at all.
ALL CAPs are considered RUDE by Google (and by lots of other people online).
It’s also wise to avoid profanity and sexually explicit content in most reviews – we’re told that type of language “often contribute[s] little or nothing to an understanding of the subject and can make the user who is reading the reviews uncomfortable.”
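To make those formatting signals a little more concrete, here is a rough sketch of how they might be measured. The function name, the returned fields, and the `blocked_words` list are all my own assumptions for illustration. The patent filing, as far as I've described it here, doesn't give thresholds, so this only reports raw measurements and leaves the judging to whoever uses them.

```python
import re


def formatting_signals(text, blocked_words=frozenset()):
    """Measure a few surface signals of a review (hypothetical helper).

    blocked_words is an assumed, caller-supplied set of lowercase words
    (for example, profanity) that should be flagged.
    """
    # Naive sentence split on terminal punctuation; good enough for a sketch.
    sentences = [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    lengths = [len(s.split()) for s in sentences]
    # Count words written entirely in capitals, ignoring one-letter words like "I".
    caps = [w for w in words if len(w) > 1 and w.isupper()]
    return {
        "sentence_count": len(sentences),
        "avg_sentence_words": sum(lengths) / len(lengths) if lengths else 0.0,
        "all_caps_ratio": len(caps) / len(words) if words else 0.0,
        "has_blocked_word": any(w.lower() in blocked_words for w in words),
    }
```

A review that is too short, shouts in capitals, or trips the blocked-word check would stand out in those numbers, though where to draw each line is a guess on my part.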
In addition to the most relevant reviews, Google seems to like reviews with appropriate words.
A review should contain “high value” words rather than being made up mostly of “low value” words. A high-value word might be one that is appropriate for a particular kind of review, identified from a dictionary of words that might be associated with reviews. For example, if the review is about a digital camera, it might ideally contain words like aperture, image stabilization, DSLR, sensor type, lens system, still image formats.
Or, the search engine might look at the frequencies of words that appear in the review and see if there is a high frequency of less common words included. Here’s how the patent filing describes that approach:
With regard to values associated with words in the review, reviews with high-value words are favored over reviews with low-value words.
In some embodiments, the word values are based on the inverse document frequency (IDF) values associated with the words. Words with high IDF values are generally considered to be more “valuable.” The IDF of a word is based on the number of texts in a set of texts, divided by the number of texts in the set that includes at least one occurrence of the word. The reviews engine may determine the IDF values across reviews in the reviews repository and store the values in one or more tables. In some embodiments, tables of IDF values are generated for reviews of each type.
For example, a table of IDF values is generated for all product reviews; a table is generated for all product provider reviews, and so forth. That is, the set of texts used for determining the table of IDF values for product reviews are all product reviews in the reviews repository; the set of texts used for determining the table of IDF values for product provider reviews are all product provider reviews in the reviews repository, and so forth.
Each subject type has its own IDF values table because words that are valuable in reviews for one subject type may not be as valuable in reviews for another subject type.
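Here is a minimal sketch of what per-type IDF tables like the ones described above could look like. The function names and the simple tokenizer are my own assumptions, not anything from the filing. The patent only says the IDF is "based on" the ratio of total texts to texts containing the word, so taking the logarithm of that ratio (a common convention) is also my assumption, as is giving a value of zero to words the table has never seen.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Assumed tokenizer: lowercase runs of letters, digits, and apostrophes.
    return re.findall(r"[a-z0-9']+", text.lower())


def build_idf_tables(reviews):
    """Build one IDF table per subject type.

    reviews is an iterable of (subject_type, review_text) pairs.
    """
    review_counts = Counter()          # number of reviews per subject type
    doc_freq = defaultdict(Counter)    # reviews containing each word, per type
    for subject_type, text in reviews:
        review_counts[subject_type] += 1
        for word in set(tokenize(text)):
            doc_freq[subject_type][word] += 1

    tables = {}
    for subject_type, freqs in doc_freq.items():
        total = review_counts[subject_type]
        # Assumption: log of (total reviews / reviews containing the word).
        tables[subject_type] = {w: math.log(total / df) for w, df in freqs.items()}
    return tables


def average_word_value(text, idf_table):
    """Average IDF of the words in a review; unseen words count as 0.0 (assumption)."""
    words = tokenize(text)
    if not words:
        return 0.0
    return sum(idf_table.get(w, 0.0) for w in words) / len(words)
```

With tables built this way, a word like "great" that shows up in every product review ends up with a value of zero, while a rarer word like "aperture" scores higher, which matches the idea that less common words are the more "valuable" ones.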
Another factor that Google might consider in deciding which reviews to show for products, or for merchants on pages like the Google Place page for a merchant, is whether or not the review is representative of other reviews.
To determine how “representative” a review might be, the search engine might cluster different reviews together based upon shared characteristics of those reviews. Those shared characteristics might cover a variety of aspects involving the reviews. For instance, reviews of books ordered from an online book store might focus upon the storyline in the book, or how quickly the book was shipped, or upon the author, or similar books.
If the reviews include ratings, those might be used to cluster the reviews. So, something reviewed might have 8 ratings between 3.6 and 5 stars, indicating positive reviews. It might also have 3 ratings between 1 and 2.3 stars, indicating negative reviews, and 5 ratings between 2.4 and 3.5 stars, indicating neutral reviews. Since there are more reviews in the positive range, a review or two might be selected from among those positive reviews to display, using the kinds of “quality” criteria above.
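The rating example above could be sketched along these lines. The bucket boundaries come straight from the star ranges in my example, but the function names, the dictionary shape of a review, and the `quality` scoring function passed in are assumptions on my part rather than anything the filing spells out.

```python
from collections import defaultdict


def rating_bucket(stars):
    # Boundaries taken from the example ranges: above 3.5 is positive,
    # below 2.4 is negative, and everything in between is neutral.
    if stars > 3.5:
        return "positive"
    if stars < 2.4:
        return "negative"
    return "neutral"


def pick_representative(reviews, quality, count=2):
    """Pick the best reviews from the largest rating cluster.

    reviews is a list of dicts with a "stars" key (assumed shape), and
    quality is a caller-supplied function that scores a single review.
    """
    clusters = defaultdict(list)
    for review in reviews:
        clusters[rating_bucket(review["stars"])].append(review)
    if not clusters:
        return []
    # The largest cluster is treated as the most representative one.
    largest = max(clusters.values(), key=len)
    return sorted(largest, key=quality, reverse=True)[:count]
```

With 8 positive, 3 negative, and 5 neutral ratings, the positive cluster is the largest, so the one or two highest-scoring positive reviews would be the ones returned.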
The patent filing is:
Selecting High Quality Reviews for Display
Invented by Kushal B. Dave and Jeremy A. Hylton
US Patent Application 20110125736
Published May 26, 2011
Filed: January 26, 2011
A method and system of selecting reviews for display are described. Reviews for a subject are identified. A subset of the identified reviews is selected based on predefined quality criteria. The selection may also be based on zero or more other predefined criteria. A response that includes content from the selected reviews is generated. The content may include the full content or snippets of at least some of the selected reviews.
Earlier this week I wrote about a Google study that explored how reviews from different sources might be aggregated together, and how they might be compared to one another.
The study and Google’s display of starred ratings and listings of numbers of reviews in both organic search results and in Google Place page listings give us a sense of the importance Google places on reviews. Google isn’t just looking for the most relevant reviews; it is also looking for different signals of quality so that it can show quality reviews.
This patent filing gives us a sense of how Google might choose reviews to display, based upon the quality of the reviews and how those might be clustered together so that representative reviews might be displayed to people who want to be able to access reviews quickly.
It’s also interesting to see how Google might define reviews in terms of “quality” considering how much that term seems to have filled the search landscape lately.
Google’s quality scores for AdWords advertisements and landing pages have been used to help set prices for sponsored ads within Google’s search results for a while now. I recently wrote about a Google patent filing covering web publishers’ pages, which would help determine how much those site owners might earn for displaying Google AdSense ads based upon the quality of their pages. Google’s Panda updates for search rankings, and the blog posts that Google has published about them, have provided us with guidelines for building quality sites.
Quality can seem like an abstraction – an aspect of something that’s both hard to define and to measure, and that might be subjective enough to mean quite different things to different people. But we’ve been seeing over and over in different contexts that Google is more than willing to define “quality” features or attributes of advertisements, landing pages, advertising publishers’ pages, web pages, and now reviews.