I originally wrote the following article a couple of years ago for publication at Website Magazine. It presents one way of thinking about the evolution of search and search engines, and I thought it might be a good idea to share it here as well. I’ve added a few very minor updates to the article.
Search engines have come a long way since their modest beginnings — although you may not have noticed. The major engines such as Google, Yahoo, and Bing guard their search secrets closely, so one can never be certain how they are operating. But they are evolving, and personalization seems to be the wave of the future.
Search engines have already developed through two major stages and now may be on the verge of a third generation. The first stage was based simply on matching keywords in documents — the same results were shown to all searchers, regardless of who they were or their original search intentions. The second stage, where we may be now, examines how searchers interact with the search engine to predict their intent. Finally, the third stage will attempt to consider the actual interests of searchers and then recommend pages accordingly.
Stage One – Keyword Matching
Before the Web and today’s search engines, searching through a database filled with textual documents meant matching the terms in your query to the exact appearance of the terms in those documents. Sorting documents by relevance or importance would have been a monumental task, if possible at all. Some database searches let you locate only documents where certain words appeared within a defined distance of other specified words in the same document. For example, a search for “California beaches” would be effective only if both terms appeared within one word of each other in a document.
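As a rough illustration (a toy sketch, not any real engine’s code), that kind of proximity constraint can be expressed as a simple position check:

```python
def within_distance(text, term_a, term_b, max_gap=1):
    """Return True if term_a and term_b appear within max_gap words
    of each other in text. A toy proximity check for illustration only."""
    words = text.lower().split()
    positions_a = [i for i, w in enumerate(words) if w == term_a]
    positions_b = [i for i, w in enumerate(words) if w == term_b]
    return any(abs(a - b) <= max_gap for a in positions_a for b in positions_b)

print(within_distance("the california beaches are sunny", "california", "beaches"))  # True
print(within_distance("california has no beaches today", "california", "beaches"))   # False
```

Note that this matches only exact adjacency within the gap; there is no notion of relevance ranking, which is exactly the limitation those early systems had.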
Then the Web introduced us to an interconnected network of pages that could link to each other using hyperlinks. Search engines evolved to understand differences in the importance of words when they were located in different parts of a page. For example, if you searched for a certain phrase, pages containing those words in their titles and headlines might be considered more relevant than other pages where those words also appeared, but not in those “important” parts of pages.
Relevance was also found by indexing words that link to other pages. If a link pointing to a page used the phrase “deep sea fishing” as anchor text, the page being pointed to would be considered relevant to deep sea fishing. The existence of links to pages also has been used to help define the perceived importance of a page. Information about the quality and quantity of links to a page can be used by search engines to get a sense of implied importance of the page being linked.
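The two link signals described above — anchor text as a relevance hint, and link counts as an importance hint — can be sketched in a few lines. The domain names and link graph here are entirely hypothetical:

```python
from collections import defaultdict

# Hypothetical link graph: (source page, target page, anchor text)
links = [
    ("blog.example.com", "fishing.example.com", "deep sea fishing"),
    ("news.example.com", "fishing.example.com", "deep sea fishing charters"),
    ("misc.example.com", "other.example.com", "cooking tips"),
]

anchor_index = defaultdict(list)  # target page -> anchor phrases pointing at it
inlink_count = defaultdict(int)   # target page -> number of inbound links

for source, target, anchor in links:
    anchor_index[target].append(anchor)
    inlink_count[target] += 1

def anchor_relevant(page, phrase):
    """Treat a page as relevant to a phrase if any inbound anchor contains it."""
    return any(phrase in anchor for anchor in anchor_index[page])

print(anchor_relevant("fishing.example.com", "deep sea fishing"))  # True
print(inlink_count["fishing.example.com"])  # 2
```

A real engine would weight link quality as well as quantity; this sketch only counts raw inlinks to show the idea.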
However, there is a limit to the effectiveness of this type of keyword matching. When two people perform a search at one of the major search engines, there’s a chance that even if they use the same search terms, they might be looking for something completely different. For example, as someone who enjoys a cup of coffee, when I search for java I might be looking for something completely different than a programming friend looking for some technical information on the popular programming language. The term java can mean coffee, a programming language, an island of Indonesia, or even something else.
Stage Two – Learning Search Behavior
As search engines progressed and users were given more options (Web pages) to find information, the engines needed to respond with a refined approach to search. The second development of search engines started asking the question: How do we go about learning intent when someone types a certain phrase into a search box?
Several options surfaced. You could create some type of profile for a searcher to collect information about their interests – by having them complete a form, recording their activities and search history, or looking through the contents of their desktops and emails – drawing on both their explicit and implied interests. Unfortunately, people often hesitate to share detailed personal information about their interests with search engines. Plus, an individual’s past search history may not help predict their future intentions.
You could aggregate information collected from a large number of interactions between users and search engines. What pages do people click when faced with a list of search results? If the vast majority of those searching for java choose pages about programming, it would make sense to show more programming pages in search results and fewer pages about coffee.
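To make the aggregation idea concrete, here is a toy sketch (with made-up numbers) of estimating the share of each intent behind an ambiguous query from a click log:

```python
from collections import Counter

# Hypothetical click log: (query, category of the page clicked)
clicks = [("java", "programming")] * 7 + [("java", "coffee")] * 2 + [("java", "island")]

java_clicks = Counter(category for query, category in clicks if query == "java")
total = sum(java_clicks.values())

# Estimated share of intent for each sense of the ambiguous query
intent_share = {category: count / total for category, count in java_clicks.items()}
print(intent_share)  # {'programming': 0.7, 'coffee': 0.2, 'island': 0.1}
```

An engine using something like this would then bias the result mix toward the dominant sense while still leaving room for the minority ones.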
You could study user actions – how they move their mouse pointer across a search results page, how long they stay at a selected page, how far down that page they scroll, and many other possibilities.
Studying a series of searches from the same user may offer a glimpse into modified search behavior. How does an individual change their queries after receiving unsatisfactory results? Are search terms shortened, lengthened, or combined with new terms? Comparing selected results (Web pages) of one user to that of another using the same query could be very telling. Although the search engines will not share their strategies, it is clear that this type of analysis is being used elsewhere on the Web. Consider the item-to-item recommendations that Amazon.com offers when people perform searches at that store (people who purchased this book were also interested in). Now, imagine a search engine recommending pages selected by other users who searched using the same terms.
Add to that some other information that a search engine might collect about a user when a search is performed – location, language preferences indicated in their browser or the type of device they are using (mobile phone, handheld or desktop).
Search engines could learn a lot about Web searchers by examining the services that we select. Some of the information engines may be gathering from users:
- Search results clicked upon.
- Choices of interest in email alerts.
- Personalized search histories.
- Ads clicked upon.
- Bookmarked pages (Delicious, Yahoo, Myweb 2.0).
- Picture tags (Flickr).
- Annotations (Google Sidewiki, Twitter, Friendfeed, etc.).
- Web pages chosen for customized search engines (Google custom search).
- Queries used and pages selected in vertical searches (Google Maps, Yahoo local search, Google Product Search, etc.).
- Personal profiles (Orkut, MySpace, etc.).
- Query revisions and many others.
Stage Three – Learning From the People
At some point, the search engines may go beyond personalization based on interactions with search and other services, to analyzing footprints people leave on the Web itself. User profiles in places like MySpace or Facebook, “digging” at Digg.com, claimed blogs at Technorati, and other emerging spots on the Web have given users the ability to put their stamp on countless pages, and endless opportunities to leave their tracks all over the Web — and for someone or something else to study those tracks. Ask yourself: what does this imply when it comes to privacy?
Digital signatures associated with identities from initiatives like OpenID or Typepad authentication may provide even more insights about a person and their interests.
Personalization and SEO
Personalization should, and likely will, have a big impact on the way people search, on what site owners learn about their intended audiences, and on measuring the effectiveness of SEO campaigns – especially for SEO firms using ranking reports as one way of measuring the efficacy of their efforts.
Can we learn from the evolving stages of search engines? In attempting to provide personalized search results, the focus of search engines’ efforts has shifted from matching keywords to knowing more about the true interests of searchers. Keyword matching still plays a role in what search engines do when returning results, but information gathered from those searchers is playing an increasing role in the results they see.
I was on the phone with a colleague a few months ago when he identified his highest-ranking competitor for his choice of keywords. I searched using his same terms and could not find the site that he claimed was at the top of the rankings. I asked him to scroll to the top of the Google search page he was viewing, and whether he saw a link labeled sign out at the far right — he did, meaning that he was signed in to Google and his query was being treated as a personalized search using his past search history. In a non-personalized search at Google, he was outranking his competitor’s site, yet it appears that, while signed in to Google, he was visiting his competitors’ pages so often that they were ranking higher for those keywords than his site. Personalization presents us all with some new challenges.
While we are left to speculate about search engine behavior and observe the changing landscape, there are some steps that an SEO professional or any website owner can take while anticipating the effects of personalization:
- Learn about Social Networking Theory and Online Social Networks.
- Recognize and share with clients the diminishing value of ranking reports.
- Aim towards measuring results and conversions in a meaningful manner from log file analysis and Web analytics tools.
- Find ways to learn more about your intended audiences and existing customers.
Nice, I don’t know how you keep the motivation to churn out so many posts though, I can’t even get close to 1 a week!
Different data centres also affect the search results greatly. I have had someone sitting opposite me at work do a search and I have seen different SERP results, because they go out through a proxy and I access directly.
While I can see that ranking reports are not the golden talisman that they once were in terms of measuring a successful search campaign, I think that they are still a critical metric. I have read several other suggestions for how an SEO campaign should be measured and given it some thought. I think that the best option would be to analyse the total revenue from natural search over x period of time, excluding brand terms (possibly), but I would be interested to hear other people’s thoughts here. I think that a lot depends on the objectives of the campaign, but to undermine the credibility of what has become the standard method of measuring a campaign without really having a “replacement” in existence is a little damaging and leaves people (me!) without much guidance.
This is something that people who are not familiar with the industry will never ever get.
Hi Jimmy,
Good point on the impact of different data centers. There are a lot of filters and reranking methods that can impact search results, so that what you see may be different from what I see. I remember working about a 30-minute drive away from where I lived, and seeing different rankings based upon where I was browsing from.
There are a number of different ways that you can define metrics for the success of a web site, and revenue is certainly one of them. There are others that might not be as easy to measure.
For instance, when you provide support information on your web site, you may cut down on the number of help calls to your offices, and increase the level of customer satisfaction with the goods or services that you provide. Making that support easy to find can be one of your objectives in optimizing a site.
Another objective behind your site might be to educate or inform people and share information. Some sites rely upon advertising, so increasing pageviews and advertising revenue might be another metric to chart success.
There are other intangibles that might be a little harder to measure, such as reputation building or brand awareness, or generating business for offline locations. There are sites that let you find the nearest location of their store, and even whether or not they may have certain products in inventory at those stores, but it can be difficult to correlate the online search with the offline purchase.
A successful SEO campaign can also focus upon increasing traffic for long-tail terms as well as main or head terms, and focusing only upon the main terms may not be as good an indicator of the success of that campaign as looking at overall traffic increases.
Learning to use analytics programs in a way that can not only help you understand how successful your site is, but also in a way that can suggest positive changes can be a lot more powerful than ranking reports. I don’t think it hurts to look at rankings, but they are only a small part of the picture, and with personalization influencing where different people might see pages in search results, they may not give a very clear view.
Hi Bill, hope you had a wonderful holiday season. I have visited this post a few times and I think it’s because it’s lacking mention of location classifiers. Have you, or do you intend on having more discussion about
Classification of ambiguous geographic references
I think it needs further evaluation and discussion and you’re the man to get the ball rolling. 🙂
Hi Charles,
Thank you – I hope that your holidays were wonderful as well.
The idea of location classifiers for web pages is a pretty good topic, as it relates to personalization. My post does discuss how a search engine might create “profiles” about searchers, but what I didn’t get into in this post were topics such as how a search engine might also create separate profiles based around User Groups and web sites.
I did write a post about the patent application you link to, Classification of ambiguous geographic references, a few weeks after it was published – Which Newark is the Dominant Newark? Classification of Ambiguous Geographical References in Local Search
In one of my examples from that post, I mention a web page that tells us about a business on “Castro Street” in the “Bay area”, but doesn’t provide an actual address. The search engine knows that “Bay area” is a phrase that people often use to refer to the San Francisco Bay, but it can also refer to places such as Green Bay and Tampa Bay. It also notes that there are a good number of “Castro Streets.” But the combination of “Bay Area” and “Castro Street” together makes it likely that the business being described is in Mountain View, California.
That web page may then be given a location classification based upon those ambiguous geographic references – which could become part of a profile about the page.
Keep in mind that it’s also likely that search engines will try to understand other geographical information about web pages and web sites.
For instance, there’s “provider” location of that site – its business address, where it is hosted, the location of any physical storefronts.
There’s a “content” location, which is the geographic area that the content of the site is about.
There’s also a geographical serving area based upon the audience for the site, which could be local, regional, national, or even global. One way that a search engine could glean this information is by looking at traffic to the site.
There are also likely profiles associated with queries as well.
For instance, the word [java] could mean a cup of coffee, a programming language, an island of the same name, or something else completely. A search engine could note that 70% of the top 100 (or thousand, or some other number) of searches for [java] in the United States tend to be programming related, 20% involve a search for coffee, and the remaining 10% are for the island. The search engine might try to provide a diverse set of results amongst the top 10, showing 7 results for programming, 2 for coffee, and 1 for the island.
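To make that slot allocation concrete, here is a toy sketch (with the hypothetical percentages above, not any engine’s actual method) of dividing ten result slots proportionally among the senses of a query:

```python
# Hypothetical intent distribution for the ambiguous query [java]
intent_share = {"programming": 0.70, "coffee": 0.20, "island": 0.10}

def allocate_slots(shares, slots=10):
    """Allocate result slots proportionally, using the largest-remainder
    method so the allocation always sums to the total number of slots."""
    raw = {sense: share * slots for sense, share in shares.items()}
    alloc = {sense: int(value) for sense, value in raw.items()}
    leftover = slots - sum(alloc.values())
    # Hand any remaining slots to the senses with the largest fractional parts
    for sense in sorted(raw, key=lambda s: raw[s] - int(raw[s]), reverse=True)[:leftover]:
        alloc[sense] += 1
    return alloc

print(allocate_slots(intent_share))  # {'programming': 7, 'coffee': 2, 'island': 1}
```

The largest-remainder step only matters when the shares don’t divide evenly into the slot count; with 70/20/10 it falls straight out of the proportions.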
Or, the search engine might classify a query as navigational, transactional, or informational. For a navigational search such as ESPN, the search engine might try to show the ESPN home page as the first result. For an informational search, such as [how to make ice cream], the search engine might return more informational type sites. For a transactional query, such as [hotels in New Orleans], the search engine might return more commercial pages.
Profile information about searchers (individual and/or as part of a group), about queries, and about web sites may all play a role together in the search results that someone ends up seeing. In some ways, these types of results could be seen as much as recommendations as search results, along the lines of Amazon telling you that “people (with browsing and purchasing histories like yours) who looked at these books went on to look at/buy the following books.”
Mr. Bill, thanks for taking the time to post a response, and on a Sunday no less; seems like I can always count on you for some closure. Now that you pointed out the Newark post, I do remember reading it, and I read it again. What’s interesting about that post is that at the time no one felt that Local was much of a player, but look how that landscape has evolved since then.
With all the data available to search engines, it’s safe to say that page and user profiling, as well as personalization, are most likely here to stay. And as a B&M and Web store owner, I actually welcome it. Though in my opinion they will never truly be able to “predict” what someone’s true thoughts and intentions are with very much accuracy, because the human mind is just too complex (as well as scattered). And then we have the search engines’ inability to identify that a searcher may have completed their online “detective work” and that the personalization of that subject matter is no longer welcome. More so on a shared machine or with ad personalization.
Your trade will certainly have its hands full with personalization, though I think that Google’s location customization will prove to be a bigger nut to crack, as it’s way too restricting. Myself, I will most likely remain focused on ensuring that my customers’ needs are met, if not exceeded.
Gosh, I am looking where and what I am posting about and thinking I am just a simple apparel manufacturer. Yet I strongly feel these are some of the things a mom and pop store needs to be aware of, especially if they want to be competitive and experience growth online and offline.
Mahalo
Hi Bill,
Once again, you zoomed in on the key points business owners and marketers should be looking into, rather than just running ranking reports, which don’t correlate to achieving the overall business goals, which encompass:
1. Revenue (targeting and measuring the right segment)
2. Cost reduction (increasing targeted business conversions)
3. Business goals conversion (matching the business goals with the customers’ intent and motivation – a targeted approach), i.e. site engagement, increasing qualified business leads, etc.
And to quote your right-on-target earlier reply, “the focus is upon ensuring that your customers’ needs are addressed in a way that continues to have them return” – to any marketers / business owners who see this.
Thanks for sharing as always Bill 🙂
Cheers,
Deric
I would agree that there are lots of metrics that you can use to measure the effectiveness of a site outside of search rankings and they are specific to different business areas like, as you say, cutting back on calls or building the brand.
Increasing the number of long-tail searches coming to the site and increasing revenue that comes in through natural search might be interesting and meaningful metrics for evaluating a site’s SEO campaign, but they don’t have the finality of the old “ranking reports”, which are still really popular when you get to a board level within a company (and probably anywhere else other than among online marketing specialists). An increase in hits/revenue through natural search year on year could easily be attributed to the general increase in numbers going online and the increased spend online globally.
Another metric that might be worth a mention is the percentage of orders placed online in relation to offline; this is something that you should really expect any good SEO campaign to drive. However, the above issue could also be applied here.
The problem that I still see, even when combining all of the above, is that it’s so difficult to accurately measure the effect of the money you are putting into SEO, especially if you are paying an agency.
Originally, the main benefit of DM campaigns was that you would be able to track everything that happened and the response rates. The web took this a step further, allowing even more detailed tracking, but we are coming to a stage where ROI is increasingly hard to measure, and while we can get a general feel and see if something looks to be effective, quantifying it is not easy to do.
I think I might be fishing for a magic bullet here, but I think that there should be a definitive final metric that can be used to measure the success of every effort put into your SEO. This metric might actually exist, and it could be that we are discussing it at such a general level that it’s not possible to nail it down.
Don’t get me started on investing in social media!
Hi Charles,
You’re welcome. I agree that it’s important for business owners to try and keep an eye upon and ask questions about things that could substantially impact their businesses, like personalization in search and the growing impact of local search.
There’s a potential problem with trying to use past searching and browsing behavior to understand the present intentions behind a search – often people search based upon a need for information in a particular situation that may have nothing at all to do with what they might have searched for yesterday or the day before, or what they might have indicated as interests on their MySpace or StumbleUpon page. What we probably will end up seeing in the future is a “limited” amount of personalization, where searchers can find answers to their questions sometimes in spite of customizations based upon needs that a search engine assumes those searchers intend. Sometimes Java programmers do want to find information about the island of Java, and all the programming search results they are shown get in the way.
It’s probably in searchers’ and the search engines’ best interests for the search engines to focus as many resources as they can on improving local search, and making local search business listings and maps as accurate as possible. It’s an area that has evolved, as you note, and will continue to become more and more important. I think it’s more important for them to improve local search at this point than it is for them to focus upon personalization of search results.
Ultimately, the changes that we see at search engines are like the changes that you make to your own site – the focus is upon ensuring that your customers’ needs are addressed in a way that continues to have them return. 🙂
Hi Jimmy,
Unfortunately, a lot of corporate management practices involving the effectiveness of web sites come from the use of analytics to create status reports that end up sitting on executives’ desks after a quick look through, rather than being used as working tools to find areas to improve and innovate. Ranking reports are simple, easy to understand, and not always very helpful when it comes to generating ideas for future actions.
I think you’re right that it isn’t easy to accurately measure the effect of the money that you are putting into SEO. I think that’s why it can be helpful to look at a wide range of metrics to get a sense of the value of SEO efforts, starting with simpler things such as increases in pageviews, traffic, and conversions, and then expanding to include increases in newsletter subscriptions, downloads of whitepapers, increases in phone calls and the generation of leads, and defining other metrics that might be relevant to the objectives behind the site.
I don’t know if there is a definitive final metric that can be used to gauge the success of an SEO campaign. Even an increase in traffic can be misleading if that traffic isn’t bringing visitors who might be interested in what your site offers.
Hi Deric,
Thank you. Personalization is having an impact in making us think more carefully about what we measure, and what those metrics can tell us. I think rising to the challenge is actually good for most business owners, because it forces them to think more about what their objectives are, and how they can meet them, and improve upon them.
I think there’s more value in a more detailed analysis of the performance of a web site that not only shows some measure of improvement, but that also suggests additional actions that could be taken to meet the goals of a site even better.
Hey Bill,
I’ve been a lurker on your blog, and I have to agree with Jimmy in acknowledging your motivation to continuously post about quality subjects.
In regard to this article, I agree that many companies approach website analytics as a whole pie rather than slicing it up and looking at what’s underneath. In my opinion, a good salesperson will try to read the individual customer in front of them and change their approach based on their conclusions. Well, website visitors are providing you this information within your analytics. However, many companies neglect their online customers because of the lack of face-to-face interactions. They don’t treat them as individuals that need “personalized” attention (or SEO). Instead, they look at it as a whole – assuming that they even look at it in the first place. Watching the behavior of your visitors is the equivalent of paying attention to customers in front of you, and companies need to pay attention. Once you understand the visiting segments, you will increase business, and more than likely bring visitors back to your site.
Great article! Now, we will need to dominate social media (besides SEO). Social Media is becoming a HUGE thing even for SEO.
Hi Morrigan,
It’s good to meet you. Thanks for commenting, and for your kind words.
I agree with you completely. Analytics can be helpful and filled with insights, or they can be used to generate reports that a few people might glance at quickly. It can be possible to learn a lot about the different visitors to your site based upon how they get to your pages, what words they might use in search engines to find you, what pages they view, where they go on your site, how long they stay on your pages, and what other actions they take once they get there.
Different visitors have different informational and transactional needs, and you can learn from their actions on your pages how well you might be meeting those needs. You can also learn if you have set up some roadblocks on your pages, such as forms that cause people to leave rather than move forward in an ordering process, or pages that don’t provide information that people are looking for. As you wrote, “companies need to pay attention.”
Analytics can also help you understand the impact that changes on your site might have.
Hi Victor,
Thanks. Chances are that search engines will be considering social media more in the future. We’ve seen that with Google trying to include tweets in search results, and trying to bring us a “social search.” I’m not sure that I like the word “dominate” though. Participate in a useful and intelligent fashion instead? If social media gives us the chance to interact in a positive and meaningful way with people who might be interested in what we have to offer on our websites, that’s a good thing. If we try to use it instead as a broadcast medium, where we don’t engage and interact with others socially and positively, then we may be missing out.
Yep. At first glance, search personalization is great for searchers, but the surfer should keep in mind that their every step is recorded. So is it a privacy violation already, or not?
Hi Denis,
Personalization is potentially useful in providing search results to searchers, but you’re right about the possibility of privacy violations being a concern.
A recent study conducted by Yahoo on search queries, looking at the search engine’s query logs, made a point of telling us that personally identifiable data within those logs was removed before the log file information was analyzed. We’ll likely see more disclaimers like that in future research. They also made a point of telling us that the 8 days of toolbar data involved were used with permission from the participants.
Excellent post again, Bill. I have to say it is amazing to read a blog post from a year ago and see just how social media is now taking the lead with SEO.