How credible is your website? How likely are people to believe what they find on your pages, or contact you to learn more about what you offer, or conduct a transaction on your site? Would you consider your site to contain high-quality content? How do you measure the quality of the content on your pages?
Search engines seem to be placing more emphasis on web site quality, such as with the recent Panda updates at Google, as described in a couple of blog posts on the Official Google Blog:
- Finding more high-quality sites in search
- High-quality sites algorithm goes global, incorporates user feedback
If Google is now looking at the quality of content on pages as part of what they consider when showing pages in search results, just how do they calculate the quality of pages?
Measuring Quality a Work in Progress
From 2005 through 2009, academic and industry researchers held a yearly workshop focusing upon webspam or Adversarial Information Retrieval on the Web. The workshop series was known as AIRWeb.
A workshop wasn’t held in 2010, but the series reemerged in 2011 at the 20th International World Wide Web Conference in Hyderabad, India, as a joint workshop with WICOW (the Workshop on Information Credibility on the Web).
Held on March 28, 2011, the Joint WICOW/AIRWeb Workshop on Web Quality (WebQuality 2011) covered a wide range of topics, which are listed on the workshop’s website. The main themes and topics from this web site quality conference included:
Assessing the credibility of content and people on the web and social media
- Measuring quality of web content
- Uncovering distorted and biased content
- Modeling author identity, trust, and reputation
- Role of groups and communities
- Multimedia content credibility
Fighting spam, abuse, and plagiarism on the Web and social media
- Reducing web spam
- Reducing abuses of electronic messaging systems
- Detecting abuses in internet advertising
- Uncovering plagiarism and multiple-identity issues
- Promoting cooperative behavior in social networks
- Security issues with online communication
There are many papers listed and linked to on the workshop webpage, but they don’t cover the whole range of topics listed on the agenda for the workshop. The topics and subtopics listed are ones worth considering and raising questions about if you own a website and participate in social networks and other places on the Web.
Google’s Web Site Quality Guidelines
One of the resources that I like to point people towards when talking about the credibility of Websites is the Stanford Persuasive Technologies Lab, which published the Stanford Credibility Guidelines in 2002. The guidelines are based upon a joint study conducted with Consumer Reports Webwatch, titled How Do People Evaluate a Web Site’s Credibility?
While I think there’s considerable value to both those guidelines and the study behind them, they are almost a decade old, and they focus primarily upon credibility rather than quality. Credibility is an important aspect of the quality of a web site, but there’s more to quality than how credible people find a set of web pages.
Google has provided information on its pages about what they look at when they consider the quality of what they see online. That includes what they look for in advertisements and landing pages, and their Landing page and site quality guidelines are worth spending some time on, even if you don’t advertise with Google. There are three main aspects to those guidelines:
- Relevant and Original Content
Google’s Webmaster Guidelines also provide a set of things to consider when putting together a site, and tell us that, “Following these guidelines will help Google find, index, and rank your site.” Many of the problems that I see on websites can be resolved by paying attention to the guidelines that Google lists upon this page.
A few years back, I published a post titled How Google Rejects Annoying Advertisements and Pages, which described a Google patent that provided a way for the search engine to programmatically assess the quality of advertisements and landing pages. In that post, about halfway down, I bolded one thought I had after spending some time with the patent:
This system could be used to assess other pages on the Web other than just advertisements, such as web page content.
With the Panda update, it appears that Google has come up with a way to automate the assessment of web page quality, as they did for advertisements.
I wrote some more thoughts about the Panda update in Searching Google for Big Panda and Finding Decision Trees. One of the questions that I raised in the conclusion to that post was “How does Google define ‘quality?'”
We get some substantial hints from the Webmaster and landing page guidelines from Google. We could also look at sites online that we consider to be quality sites, and see what they do to create that impression.
Other Resources on Credibility and Quality
But, I think it’s worth going beyond those guidelines, and beyond emulating other sites that might be seen as being quality sites. So, I’ve also been looking for some other resources and information on the Web about credibility and quality, and thought that the following were interesting:
- Augmenting Web Pages and Search Results to Help People Find Trustworthy Information Online (pdf)
- Crowdsourcing credibility: The impact of audience feedback on Web page credibility (pdf)
- Customer loyalty in e-commerce: an exploration of its antecedents and consequences (pdf)
- Questioners’ credibility judgments of answers in a social question and answer site
- TIME: A Method of Detecting the Dynamic Variances of Trust (pdf)
- Explaining and Predicting the Impact of Branding Alliances and Web Site Quality on Initial Consumer Trust of E-Commerce Web Sites (pdf)
- Trust Online: Young Adults’ Evaluation of Web Content (pdf)
Web Site Quality Conclusion
Webspam isn’t a solved problem, and it likely won’t be in the foreseeable future, but the search engines (especially Google) have been receiving a considerable amount of criticism lately for the quality of content that appears in their top results for many queries.
While combatting spam still seems like an important aspect of what they do, Google seems to have broadened how they rank pages to include consideration of quality signals.
Much of what they consider may help answer the question that I raised at the start of this post, “How likely are people to believe what they find on your pages, or contact you to learn more about what you offer, or conduct a transaction on your site?”