On Relevance and Search Engines

Relevance matters to each of us on a daily basis. It enables us to focus upon the things that are important in our lives. It’s something that each of us learns about every day, and has been learning since around the time that we first learned to crawl, though not necessarily consciously.

Relevance and Evidence

I first began purposefully studying relevance a number of years ago, but not to help websites show up in search engines. My introduction to relevance as something I needed to learn, and needed to learn well, came in law school, in classes like Evidence and Criminal and Civil Procedure. In Evidence, we spent the class learning the rules of evidence. Under the Federal Rules of Evidence, evidence is relevant if:

(a) it has any tendency to make a fact more or less probable than it would be without the evidence; and

(b) the fact is of consequence in determining the action.

There are actually a number of rules involving whether or not evidence can be admitted in a courtroom in a criminal or civil proceeding.

One of those, for instance, is the hearsay rule. While you want proof of something to be admitted, you want it to be reliable evidence. Because of that, there’s a strong preference towards witnesses sharing their actual experiences when they testify to the truth of something. If I were a witness in a case, I would testify about my own experiences to prove something. I wouldn’t testify to someone else’s experience. I couldn’t take the witness stand, be sworn in, and then tell the judge and jury the words of someone else as proof of what they saw. If I were to testify that Joe said that Sam stabbed Edward, the Court would want to know why Joe wasn’t on the stand instead of me. The opposing lawyer should object to my testimony under the hearsay rule.

There are a large number of exceptions to the rule against hearsay as evidence. For example, one of those covers a dying declaration. Someone who has passed away isn’t available to testify in person, and someone who knew they were dying is assumed to have understood the seriousness of their statements.

We had a special guest speaker in my evidence class who was an advocate in a number of high profile legal battles, and had also worked as a prosecutor in a good number of cases. Before the class began, he passed out some copies of his law school grades. To say that he wasn’t a very good student would be an understatement. But he told us why he was extremely effective as a lawyer. He knew the rules of evidence inside out and backwards. He not only carried a copy of the Rules of Evidence around with him all the time, but he also kept extra copies of it in his office, in his car, in his kitchen, and even in his bathroom. He practiced his closing statements, and his objections, in his underwear in front of a mirror at night before he went to sleep.

Search engines are one of the primary tools that most of us use to learn about the world around us. When we search for something, we expect the results that we see to be both relevant and important for the terms that we entered into a search box. The relevance of the answers to our queries is as important as the relevance of evidence in a legal case, in that those answers can shape what we think and influence our actions.

Relevance and Information Retrieval

Just like with Evidence, there are a number of rules that search engines follow when it comes to determining whether or not something is relevant.

When someone who does SEO for a living is asked about how search engines rank pages in search results, their first answer might be that a search engine will return pages that are relevant to a query, and will rank those pages based upon how relevant and how important they might be. That determination of relevance by a search engine follows some practices identified in information retrieval, and a relevance score is often referred to in patents and papers from the search engines as an information retrieval score.

Relevance has long been studied in information retrieval. One of the people who have been specifically studying and writing about relevance is Dr. Tefko Saracevic, a professor at the School of Communication, Information and Library Science at Rutgers University. If you do SEO, or if you’re very interested in how relevance might be defined, it’s highly recommended that you read at least one of his papers on the topic, possibly starting with Relevance: A Review of the Literature and a Framework for Thinking on the Notion in Information Science. Part II (pdf). Note that Part I was written around 30 years ago. Dr. Saracevic has been studying relevance for a long time.

The following three quotes are used to start the paper:

“Relevant: having significant and demonstrable bearing on the matter at hand.”

“Relevance: the ability (as of an information retrieval system) to retrieve material that satisfies the needs of the user.”

Merriam Webster (2005)

“All is flux.”

Plato on Knowledge in the Theaetetus (about 369 BC)

Even better, if you get the chance, I would recommend watching a presentation that Dr. Saracevic gave on Relevance at the University of Tennessee in 2007, at:

Relevance in Information Science. (The video isn’t coming up through the link, but here’s a link to the Abstract. Maybe it will start working again.) It is a little over an hour long, but your understanding of how relevance is used in information science and how it has evolved over time will be greatly expanded.

Among some of the things Dr. Saracevic points out is that relevance is dynamic. What’s relevant to us can change based upon how much we know about a topic. It can alter depending upon whether we are first exploring a topic, comparing different pages on that subject, or even trying to buy something. As our informational and situational needs change, so does what might be relevant for us.

Relevance and Search Engines

Search Google for [Baltimore ravens] and your intent may have been to find the latest score in a football game, the origin of the team name, or a roster of the players on the team. You could be searching for tickets, or the location of their stadium. If it’s the morning of a home game, and you’re searching from the area around Baltimore, Google may focus primarily upon where you could get tickets. If you search when the game is late in the fourth quarter, Google might focus upon the score. If you search in the middle of the summer, before the season starts, Google might offer Ravens news focusing upon signing free agents or extending player contracts.

Search engines have been evolving and becoming more sophisticated in how they treat “relevance” in determining which search results to show. Early search engines worked towards finding web pages that contained the keywords you used in your query. This type of relevance was a stand-in for the concept of recall: returning all of the documents that included those words. Google started out by returning those keyword matches, but also attempted to rank those pages by how important they might be, judged by whether those pages were linked to by other pages, with pages linked to by more important pages ranked higher.
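To make those two steps concrete, here is a minimal sketch in Python. It is not how any real search engine is implemented. The tiny corpus, the link graph, and the function names (`keyword_matches`, `link_scores`, `search`) are all made up for illustration, and the scoring is a simplified PageRank-style calculation using a commonly cited damping value of 0.85.

```python
# Hypothetical toy corpus: page name -> text, and page name -> outgoing links.
PAGES = {
    "a": "baltimore ravens roster",
    "b": "baltimore ravens tickets",
    "c": "baltimore orioles schedule",
}
LINKS = {
    "a": ["b"],
    "b": ["a"],
    "c": ["a"],
}


def keyword_matches(query, pages):
    """Return every page that contains all of the query terms (the recall step)."""
    terms = query.lower().split()
    return [name for name, text in pages.items()
            if all(term in text.lower().split() for term in terms)]


def link_scores(links, damping=0.85, iterations=50):
    """Simplified PageRank-style scoring: a link from a high-scoring page
    passes along more score than a link from a low-scoring page."""
    names = list(links)
    n = len(names)
    scores = {name: 1.0 / n for name in names}
    for _ in range(iterations):
        new = {name: (1.0 - damping) / n for name in names}
        for source, targets in links.items():
            if targets:
                share = damping * scores[source] / len(targets)
                for target in targets:
                    new[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                for name in names:
                    new[name] += damping * scores[source] / n
        scores = new
    return scores


def search(query, pages, links):
    """Keyword matches, ordered by link-based importance (highest first)."""
    scores = link_scores(links)
    return sorted(keyword_matches(query, pages),
                  key=lambda name: scores[name], reverse=True)


print(search("baltimore ravens", PAGES, LINKS))
```

In this toy graph, pages "a" and "b" both match the query, but "a" is linked to by two pages and "b" by only one, so "a" is listed first. The matching step decides what is eligible, and the link step decides the order.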

Google used a different definition of relevance when they started showing advertisements on third-party web pages in their AdSense program, which involved placing pages in different classifications based upon the topics or categories of those pages. I wrote about it in the post, Google’s Second Most Important Algorithm? Before Google’s Panda, there was Phil.

That category approach required that the search engine look at the words and phrases that appear upon pages, and find documents where the same words and phrases tended to co-occur. Documents with co-occurring words would be clustered together, and serve as “categories” for those pages.
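The idea of grouping documents by shared vocabulary can be sketched in a few lines of Python. This is only an illustration of the general idea and not the method Google used. The sample documents, the 0.3 threshold, and the function names (`jaccard`, `cluster_by_shared_words`) are assumptions, and word overlap between whole documents stands in here for a real co-occurrence analysis.

```python
def jaccard(a, b):
    """Share of distinct words that two documents have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_by_shared_words(docs, threshold=0.3):
    """Greedy single-pass clustering: a document joins the first cluster
    whose seed document shares enough words with it, else starts a new one."""
    clusters = []  # list of (seed_word_set, [document names])
    for name, text in docs.items():
        words = set(text.lower().split())
        for seed, members in clusters:
            if jaccard(seed, words) >= threshold:
                members.append(name)
                break
        else:
            clusters.append((words, [name]))
    return [members for _, members in clusters]


# Hypothetical sample documents.
DOCS = {
    "pizza-1": "pizza crust cheese oven",
    "pizza-2": "pizza cheese oven delivery",
    "nfl-1": "ravens football stadium tickets",
    "nfl-2": "ravens football roster tickets",
}

print(cluster_by_shared_words(DOCS))
```

The two pizza pages end up in one group and the two football pages in another, without anyone having labeled a "food" or "sports" category in advance.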

We’ve seen Google determining that some queries evidence a desire to see maps and businesses for a certain location, even when the location itself isn’t included within the query. This type of situational need presents “relevant” results such as map results for nearby pizza parlors when we type the word [pizza] into a Google search box.

Google’s new Knowledge base search results show us information about entities that appear in our queries, and some additional information as well. When you’re signed into Google, a search for [espn radio] shows three listings in the knowledge base results for “People related to espn radio.” It’s highly likely that Google examined its query log files to see what other things people search for when they search for [espn radio], to try to anticipate our next query.

In that case, Google is trying to return relevant results by learning from previous searchers and how it may have helped meet their situational and informational needs.

Search engines do try to return relevant results in response to a query, but that definition of relevance is a dynamic and shifting one that doesn’t always depend upon whether or not the keywords from a query match keywords used on a page in the page title, heading, content, and in anchor text pointed to that page.

23 thoughts on “On Relevance and Search Engines”

  1. Pingback: On Relevance and Search Engines - Inbound.org
  2. tonight: a thousand SEOs, standing in front of the mirror in their underwear, chanting tomorrow’s most relevant searches…
    I guess if you’re working on becoming an authority on a topic, you should already be on your way to having seasonal, situational and informational needs covered…
    Thanks for the reminder on query flux and the reinforcement of the need to constantly revise what the relevant targets are..

  3. So it sounds like part of the Google “relevance” algorithm is its attempt to guide us down the same path that others have walked before us…with respect to searching.

    …and then split testing our reactions to what we see so as to determine what are ultimately the most relevant results based around what is more or less a social media structure.

    At least that is what I am getting from this.

    It makes sense though…tracking people’s reactions to various results by dropping unpopular results in the rankings and boosting popular ones.

    Makes sense…

    Mark

  4. Hi Mark,

    You’re welcome. I’m guilty myself of undervaluing relevance, and how it might be an ever shifting target that a search engine might choose to respond to in ways that we might not always anticipate.

    Sometimes it’s surprising (in a good way) when Google decides that it should insert localized results into a set of search results, and your client might show up fairly highly for a very competitive and somewhat generic term in a localized region where their clients actually live.

    Having the possibility of ranking highly via an image or video or maps result when your page might not have ranked organically otherwise is another opportunity that can be a blessing for some sites, if Google thinks that one of those non web pages is relevant for a query.

    It does show how important it is not only to assess how competitive a set of search results might be, but also understand what types of relevance Google might be tuning that set of results for, and to check on those on a regular basis.

  5. Hi Mark,

    That kind of split testing is definitely a possibility. It might not be something that happens in a manual manner, but rather instead is guided by algorithms that analyze query and click log files and other data to decide whether a query is served well by things like OneBox results, Maps, images and videos, pages that are commercial and transactional in nature or informational, and so on.

    Emphasizing shopping sites in relation to Christmas related queries makes sense in early December, and showing more informational type sites for the same queries also makes sense in February. Rankings might drop for some sites and rise for others not based upon whether they’ve done something wrong or right, but rather based upon what a search engine might believe searchers want to see.

  6. I hope the primary focus of Google stands with Relevancy over measuring the readability or quality of a resource derived from context. Even so, I feel like PPC results with richer features are quietly becoming the desired listings on the SERPs.

    Bill, do you think Google would prefer to judge content on relevancy with more weight over the individual URL’s content quality? I would reckon getting away from overall domain authority would finally allow small publishers and businesses to compete with bigger brands.

    Relevancy and Google.. to me still has a long way to go. Pounding the proverbial “pavement” on the web, I have seen rankings and “SEO quality” articles ranking all over because of keyword saturation. It’s frustrating coupled with exact match top-level domains, still a surefire way to rank for at least one keyword.

  7. I anticipate relevance gaining substantially more weight in the future. For those who are/have been striving to be relevant to searchers, those efforts should be increasingly rewarded, moving forward.

    I’d certainly LOVE to see more relevance given to commercial/shopping searches; far too often Google returns the major eCommerce players (esp. Amazon), even when their offerings and user experience can’t stack up against smaller, more focused niche stores.

  8. The link to the presentation that Dr. Saracevic gave on Relevance at the University of Tennessee in 2007 is not working, not sure if it is on your end or theirs. That being said, thank you for this great article.

  9. Great write up. I think after Penguin a lot of SEOs have been scrambling to understand the major changes that happened, at least from a link building perspective. This post was a refreshing reminder that at the end of the day search results are all about relevance. I see a lot of people going on about crazy new link building tactics that may or may not work, but at the end of the day if you always remember to ask yourself “does this make sense?” and “is what I’m doing relevant to my website?” you’ll be on the right path.

  10. Can relevancy be determined by Google algorithms or irrationality of human choices. Can you base search results on either being relevant? Tracking the trail of a searcher until they reach their destination may highlight relevance for that searcher but not for another. Search A.I. may not be able to ever produce the perfect search. It will come somewhat close enough for the user. Even if the search engine could read minds, search results would be skewed because the human is still irrational.

  11. I love the law analogy. My father was a lawyer for a very long time specializing in the 4th Amendment. I remember being in the 5th or the 6th grade and being made to argue the legality of a hypothetical police search at the dinner table. Not to mention having to read stuff from Plato as required summer reading. Anyways, I will definitely check out Dr. Saracevic’s Relevance In Information Science. Good stuff!

    Could’ve done without the Ravens reference though…

  12. It’s really crazy how far Google has come in the past years. Doing SEO for a living, it’s beyond frustrating when Google changes its algorithm. It’s really a tough feat when you think about relevancy when it comes to local businesses and services. The results that come up are going to be sites that have paid a pretty penny to be in the first few results. In a perfect world it would be ranked by the “best” company or service but there is no way for Google to know this. For most search queries (Baltimore ravens) I think Google does a great job in terms of relevancy.

  13. A great read Bill. Relevance has become incredibly important to SEO. I like your take on explaining it.

    Thanks

  14. Bill,

    Thanks for taking the time to share great valuable information about SEO on your site that does not focus on keywords, link building, etc. Especially this post because as Google evolves so should we as marketers and since google now returns results based on relevancy and their split testing, we should adapt to them and create sites and pages that can take advantage of these new possibilities to rank higher.

  15. Thanks for the article on relevance. I’ve never heard of Dr. Saracevic but I’m going to carve out an hour, find that video, and lap it up.

    Cheers

  16. I love this. I’ve been contemplating a post that explores similarities between The Federal Rules of Evidence and the ways that search engines use certain signals as evidence of relevance and popularity. Your post couldn’t have come at a better time for me, so thanks.

  17. Hi Bill

    Thanks. I guess it goes back to the push to build relevant web pages focussed primarily around user experience, relevance and quality of content instead of “over optimizing irrelevant low quality content”

  18. Wow, that’s a great way to explain the importance of relevance, it makes so much sense now. I’ve got to admit that when I started working on SEO I used to take this for granted, but Google’s constant updates force us to pay more attention to those things that we believe don’t matter, such as relevance. Great article!

  19. We have done testing on your theory of relevance in terms of inlinks and it works within certain boundaries. For example, link spammers that create 1,000’s of inlinks every week do not need to worry about relevance because the sheer number of inlinks they are creating each week is enough to keep them ranking highly on a SERP. As opposed to us white hat SEO’ers who create very few but high quality inlinks (and then yes, it is crucial that the link is from a relevant source).

  20. Hi Paul,

    Relevance is relevance, and there are many different definitions of relevance. Not sure what you thought you were testing when you “tested” my “theory” of relevance. A definition is a definition, and not a theory. Who says that a search engine isn’t ignoring a lot of the links that spammers make all the time? Why does the link have to be from a relevant source either? :)

  21. Pingback: How to Slowly Lose Trust, Influence, and Ultimately Your Audience