Google Praises SEO, Condemns Web Spam, and Rolls Out an Algorithm Change

Yesterday, Google’s Distinguished Engineer Matt Cutts published a post on the Google Webmaster Central Blog titled Another step to reward high-quality sites, which started out by praising SEOs who help improve the quality of web sites they work upon. The post also noted:

In the next few days, we’re launching an important algorithm change targeted at web spam. The change will decrease rankings for sites that we believe are violating Google’s existing quality guidelines.

We’ve always targeted web spam in our rankings, and this algorithm represents another improvement in our efforts to reduce web spam and promote high-quality content.

This isn’t something new, but it sounds like Google is turning up the heat on violations of its guidelines involving web spam, and we’ve seen patents and papers in the past that describe some of the approaches it might take to accomplish this change.

A good starting point is the Google patent Methods and systems for identifying manipulated articles.

There are a couple of different elements to this patent.

One is that a search engine might identify a cluster of pages that are related to each other in some way, such as being on the same host, or interlinked as doorway pages and the articles targeted by those pages.

Once such a cluster is identified, documents within it might be examined for individual web spam signals, such as whether the text within them appears to have been generated by a computer, whether meta tags are stuffed with repeated keywords, whether there is hidden text on pages, or whether those pages contain a lot of unrelated links.
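
To make that second step more concrete, here’s a minimal Python sketch of how a few of those per-document checks might be combined once a cluster has been identified. The thresholds, field names, and signal list are all invented for illustration; the patent doesn’t publish real values.

```python
import re
from collections import Counter

# Illustrative thresholds only -- the patent doesn't disclose real values.
KEYWORD_STUFFING_RATIO = 0.30  # one term dominating the meta keywords
UNRELATED_LINK_RATIO = 0.50    # links whose anchors share no terms with the page
SPAM_SIGNAL_THRESHOLD = 2      # how many signals before a document is flagged

def keyword_stuffed(meta_keywords):
    """Flag meta keywords where a single repeated term dominates the list."""
    terms = [t.strip().lower() for t in meta_keywords.split(",") if t.strip()]
    if not terms:
        return False
    top_count = Counter(terms).most_common(1)[0][1]
    return top_count / len(terms) > KEYWORD_STUFFING_RATIO

def has_hidden_text(html):
    """Crude check for text styled to be invisible to visitors."""
    return bool(re.search(r"display\s*:\s*none|visibility\s*:\s*hidden", html, re.I))

def unrelated_link_share(page_terms, anchor_texts):
    """Share of links whose anchor text has no word overlap with the page."""
    if not anchor_texts:
        return 0.0
    unrelated = sum(1 for a in anchor_texts if not set(a.lower().split()) & page_terms)
    return unrelated / len(anchor_texts)

def looks_manipulated(doc):
    """Count individual spam signals on one document within a cluster."""
    page_terms = set(doc.get("text", "").lower().split())
    signals = [
        keyword_stuffed(doc.get("meta_keywords", "")),
        has_hidden_text(doc.get("html", "")),
        unrelated_link_share(page_terms, doc.get("anchor_texts", [])) > UNRELATED_LINK_RATIO,
    ]
    return sum(signals) >= SPAM_SIGNAL_THRESHOLD
```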

I wrote about this patent in more detail when it was granted back in 2007, in Google Patent on Web Spam, Doorway Pages, and Manipulative Articles.

Google’s computing and indexing capacity has grown by leaps and bounds since the 2005 paper, Spam: It’s not Just for Inboxes Anymore (pdf), which describes many of the types of web spam mentioned in the Google Blog post, and approaches that a search engine might take to address them.

Infrastructure updates like Google’s Big Daddy and Caffeine, and Google’s very recent move to Software Defined Networks, provide the search engine with the capacity to handle more complex tasks than it could in the past.

Google has also been developing more sophisticated approaches, such as statistical language models to identify “unnatural word distribution” on web pages. Google has been working with n-grams for a few years, building upon the statistical language models it uses for a number of purposes, from speech recognition, to machine translation, to even identifying synonyms within contexts.
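
As a rough illustration of that idea, here’s a toy bigram language model in Python that scores how natural a page’s word distribution looks compared to trusted training text. A real system would train on an enormous corpus and use far better smoothing; the training sample, the spammy example, and the perplexity threshold below are all toy-scale placeholders.

```python
import math
from collections import Counter

class BigramScorer:
    """Tiny add-one-smoothed bigram model; unusually high perplexity on
    a page's text is one crude sign of an unnatural word distribution."""

    def __init__(self, training_text):
        tokens = training_text.lower().split()
        self.unigrams = Counter(tokens)
        self.bigrams = Counter(zip(tokens, tokens[1:]))
        self.vocab_size = len(self.unigrams)

    def bigram_logprob(self, w1, w2):
        # Laplace (add-one) smoothing so unseen pairs don't zero out.
        count = self.bigrams[(w1, w2)] + 1
        total = self.unigrams[w1] + self.vocab_size
        return math.log(count / total)

    def perplexity(self, text):
        tokens = text.lower().split()
        if len(tokens) < 2:
            return float("inf")
        logprob = sum(self.bigram_logprob(a, b) for a, b in zip(tokens, tokens[1:]))
        return math.exp(-logprob / (len(tokens) - 1))

# Toy usage: natural-looking text scores low, keyword-stuffed text scores high.
scorer = BigramScorer("the quick brown fox jumps over the lazy dog " * 50)
page_text = "cheap cheap pills pills buy buy now now online online"
if scorer.perplexity(page_text) > 5.0:  # illustrative threshold for this toy model
    print("word distribution looks unnatural")
```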

Google also has access to a much greater amount of data than before. In the 2009 paper, The Unreasonable Effectiveness of Data (pdf), Alon Halevy, Peter Norvig, and Fernando Pereira of Google describe how having very large amounts of data can make even relatively unsophisticated algorithms work well.

See my post and the video within it at Big Data at Google for a deeper look at how having access to a great amount of data can make simple algorithms more effective.

Google also has more potential approaches at hand to identify web spam than it did in the past. For example, in my recent series on the 10 most important SEO patents, one family of patents I wrote about was Phrase-Based Indexing, which provides ways to identify scraped content aggregated on pages, anchor text within links that might be considered unusual, and possibly even the ability to combat Google Bombing.

Two more of the Phrase-Based Indexing patents were granted yesterday, Phrase extraction using subphrase scoring and Query phrasification.
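
The spam-detection side of phrase-based indexing rests on a simple statistical observation: an honest document tends to contain a modest number of the related phrases for its topic, while a page stitched together from scraped content lights up implausibly many of them. Here’s a minimal sketch of that test; the phrase list handling and the expected range are invented for illustration.

```python
# Illustrative only: the patents describe deriving an expected statistical
# range for related-phrase counts; this constant stands in for that range.
EXPECTED_MAX_RELATED_PHRASES = 20

def count_related_phrases(document_text, related_phrases):
    """Count how many known related phrases for a topic appear in the text."""
    text = document_text.lower()
    return sum(1 for phrase in related_phrases if phrase.lower() in text)

def phrase_spam_score(document_text, related_phrases):
    """Score how far a document exceeds the expected related-phrase count;
    pages far beyond the range look like aggregated or scraped content."""
    hits = count_related_phrases(document_text, related_phrases)
    excess = hits - EXPECTED_MAX_RELATED_PHRASES
    return max(0.0, excess / EXPECTED_MAX_RELATED_PHRASES)
```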

I’ve written about a couple of other Google patents, granted since the identifying manipulated articles patent came out, that describe some other ways Google might identify web spam.

One of them looks at how prevalent redirects of different types are on a site when determining whether the search engine should apply a duplicate content filter to a page on that site. Google would prefer to filter out the page that it considers more likely to be web spam. The post I wrote about that patent is How Google Might Filter Out Duplicate Pages from Bounce Pad Sites.
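
A hedged sketch of that filtering decision might look something like the following, where `redirect_rate` stands in for a hypothetical measure of how much of a host’s crawled content redirects visitors elsewhere (via meta refresh, JavaScript, and so on). The page on the redirect-heavy host is the one filtered out of the results.

```python
def choose_page_to_keep(duplicate_pages, redirect_rate):
    """duplicate_pages: list of (url, host) tuples for near-duplicate pages.
    redirect_rate: function mapping a host to the share of its crawled URLs
    that redirect. Keep the page whose host looks least like a bounce pad."""
    return min(duplicate_pages, key=lambda page: redirect_rate(page[1]))

def filter_duplicates(duplicate_pages, redirect_rate):
    """Return the near-duplicates to drop from results, preferring to drop
    pages on redirect-heavy (likely web spam) hosts."""
    keep = choose_page_to_keep(duplicate_pages, redirect_rate)
    return [page for page in duplicate_pages if page != keep]
```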

The other patent takes a different approach, using n-grams to come up with a classification for a site, with the idea that a very high percentage of sites about topics such as computer games, movies, and music tend to be spam, and then looking at how often pages within those spammier categories are clicked upon when shown in search results. My post on that patent is How Google Might Fight Web Spam Based upon Classifications and Click Data.
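
Here’s one plausible way those two signals might be combined, sketched in Python. The category priors and the way click-through rate is folded in are my own guesses; the patent doesn’t disclose real numbers, and treating low click-through as a bad sign is just one reasonable reading of how the click data might be used.

```python
# Invented priors: the patent notes that entertainment topics such as
# computer games, movies, and music tend to attract the most spam pages.
CATEGORY_SPAM_PRIOR = {
    "computer games": 0.90,
    "movies": 0.85,
    "music": 0.90,
    "gardening": 0.10,
}

def spam_likelihood(category, impressions, clicks):
    """Combine a category spam prior with click data from search results."""
    prior = CATEGORY_SPAM_PRIOR.get(category, 0.30)
    if impressions == 0:
        return prior
    click_through = clicks / impressions
    # Pages in spam-heavy categories that searchers rarely click look worse;
    # the 10x scaling is purely illustrative.
    return prior * (1.0 - min(click_through * 10, 1.0))
```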

While Google has provided us with a number of examples within its patents of how it might identify spam pages, chances are that Google has also come up with other methods and approaches that it considers trade secrets, so that people creating web spam have less chance of learning about the methods the search engine might use.

Interestingly, though, researchers from the major search engines and academia have been sharing notes since 2005 in a series of workshops referred to as AIRWeb, or Adversarial Information Retrieval on the Web. Even before the AIRWeb workshops, whitepapers like Microsoft’s Spam, Damn Spam, and Statistics (pdf) were shared with the search community to help combat web spam.

Matt Cutts’s announcement was only yesterday, so we don’t yet know exactly how much of an impact this new update will have. The post tells us that it might impact around 3% of all queries in languages like English, German, Chinese, and Arabic, and that in languages with more web spam, such as Polish, it could affect 5% of all queries.

99 thoughts on “Google Praises SEO, Condemns Web Spam, and Rolls Out an Algorithm Change”

  1. I’ve seen some very big changes (actually very positive for us) the last few days. Did you get an idea if they have already started with this or if it’s about to come?

    It seems a bit weird to me that I’ve seen such big changes for a number of days; I got the impression from the Google post that it would start today (maybe yesterday), and we started seeing things 3-4 days ago.

  2. Pingback: Google Praises SEO, Condemns Webspam, & Rolls Out an Algorithm Change | Inbound.org
  3. I think this is what they meant with targeting blog networks: looking into irrelevant links within automated blog posting throughout artificial blogging networks. Some webmasters will be really hurt if they check their rankings in the upcoming days.

  4. Yeah, we took a hit, but the rankings keep jumping around from the first page to page 3 and then back to page 1. A lot of the search results are pure garbage and do not make sense at all, and this update doesn’t look to be a good one, so who knows how it will work out in the next week.

  5. I have seen and heard of deindexing of blog networks designed for backlink purposes. I just hope that the next algorithm change won’t have casualties like false positives. Across the web, there is a growing buzz of people turning to and looking for alternative traffic sources.
    Thanks for this article.

  6. Magnus – It’s certainly likely the change started a few days ago and was announced later. The internet is a big place, and it’s virtually impossible for the entire index to change overnight.

    Keep up the good work Bill

  7. Thanks again Bill for the great insights. We’re seeing quite a few changes right now. Some are positive and some absolutely not. I think I see a pattern in the changes: sites with thin and questionable link profiles are targeted. Makes sense, I guess.

  8. @Ted Ives There are an infinite number of possible different queries that can be typed into a search engine, so when the search engines say 3% of all queries, they mean that 3% of the time a user types something into a search engine, the search results will be affected by this update. In other words, it’s weighted by impressions.

    That’s why the “freshness” update, published last fall, wasn’t that big of a deal. They claimed it would affect a huge percentage of queries, but it was mostly high-volume and trending queries.

  9. Great article, and thanks for the information. I’ve been curious to see if Google will use social media and sharing to help qualify data (more shares across reputable streams and websites meaning more relevant and not spam), but there appears to be little mention of that here. Thanks again for the read, and I’ll pass on the good info!

  10. Matt Cutts backed off his “over optimization” comment. (Pretty sure that was on Search Engine Land yesterday.) This is pretty much all about inbound links from blog networks and other spammy inbound links. Inbound links appear to be carrying negative juice, which opens the doorway to competitors using negative SEO on sites. Things are going to get worse.

    And agreed. Results are awful in the SERPs. Some very good sites got shoved down, and some complete cr*p has floated to the top.

  11. Hi Bill.
    I have been around your blog lately, but not as much as I would like. So glad this one popped up in my reader – not just because of the valuable information that you share, but because I hate spammers with a vengeance. I took great pleasure in sharing this on G+ because I like to think that it left a few spammers quaking in their boots.
    There’s no need for it, and it degrades the web user’s experience. When I shared on G+ I couldn’t resist putting in a big HA HA for any black hat SEOs who stumbled on it.
    Great post, great news, and thanks for sharing.

  12. I wonder if 3% means “3% of all kinds of queries” or “3% of queries, impression-weighted”.

    For instance, lots of people search on [britney spears]; if that query is only counted once in the 3% number, then it’s effectively undercounting the actual effect. 80/20-wise I would bet 3% translates to somewhere between 10-20% of all actual queries, from an impression standpoint.

    Great stuff Bill, lots of references I was unaware of – quite a homework assignment for all of us!

  13. Greetings Bill,
    Thanks for the rundown and all the legwork to simplify things.

    Related to whether Google’s tools and tricks are working to reduce spam: in my space there are players that walk the line on all areas that Google would consider infractions of their guidelines. While I can easily spot the on-page and off-page tactics they use, I am sure that in many other online markets such methods still go unnoticed. Until I see blatant tactics reduced, I am of the opinion that this policing is going to have little benefit. The big offenders continue to play just shy of getting called for a foul, and do what they do within the lines that Google sets.

    Thanks+

  14. See this. Here is the post that Matt Cutts has pointed out:
    profitmonarchs.com/get-fit-using-these-simple-and-easy-methods/
    Open this post and click on the first anchor text “pay day loan”… You will be redirected to checkintocash.com, which is still ranking high on Google for the “pay day loan” keyword.

    They must be laughing hard at Google.

  15. This is the beginning of the end for Google. Too much tweaking of the algo has caused complete crud to appear on page 1 for very competitive keywords. Like “new shoes”, which has “http://www.interpretive.com/rd5/index.php?pg=ns” on page one – a marketing company with nothing to do with shoes. Or “bicycle wheels”, which has “www.aerospoke.com/”, a site from the late nineties, for heaven’s sake!! I could go on.

    It seems Google is throwing the baby out with the bathwater in hunting websites using spam links and anchor text, as a new industry in negative SEO will now be born to knock every genuine company off page 1 with just a few thousand spammy links thrown at competitor sites.

    Google… you truly messed up big time with this algo change, and your days as the number 1 search engine are numbered, methinks.

  16. Great content. I love to read about the patents that go into Google. Thanks for being so descriptive.

  17. If this will help anybody out, I’d like to talk about my experience with the recent update – along with the specific things I’ve noticed.

    1) Prior to the most recent update, I saw a lot of activity in my backlink profile. (3 days prior)
    By activity, I mean a lot of “new” backlinks created, immediately followed by an unusual increase in sudden links “lost”.

    **Noteworthy: I don’t know if it is directly related to the update or not, but I also noticed a shift in the rate at which Google crawled my website 6 weeks prior to the announcement of this update. It almost tripled the average crawl rate.

    2) 2 days prior to them officially rolling out the update, I noticed my website drop in rankings to the second page. Not a significant loss, but one that definitely grabbed my attention, and alerted me to monitor it closely from then on. This type of behavior hasn’t ever happened to my website.
    Would my rankings flux here and there 1-3 spots for a day or two once every 3 months? Yeah, sure. I’m in an extremely competitive industry, so I expect that.

    3) The day OF the actual update (Tuesday night around 10:30pm EST) I checked my rankings and my site had dropped to the FIFTH PAGE for its main search term. I felt like I had gotten sucker punched.

    4) Long story short – and you guys (and girls) are going to love this by the way:

    It turns out that the company I work for, prior to them hiring me, had employed someone who decided it was a great idea to buy a bunch of spammy links in 2009. I’m talking around 12% of our total backlink profile was garbage — all acquired in 2009. After pulling up our backlink profile and analyzing EVERY single backlink, to my horror, I learned that I was just f&%*ed by the previous employee. All of my hard work and “white hat” techniques down the drain.

    No matter how perfectly the site was optimized, no matter how many EXCELLENT blogs we published on a regular basis, no matter that we never bought links or engaged in any article spinning, no matter how well we maintained our social media presence and authority in our industry, no matter how many quality backlinks we achieved since that time……we got sucker punched over night from a poor decision someone made over 3 years ago.

    And if you’re wondering specifically what kind of drop I’m talking about? #6 to #55.

    The End 🙂

  18. Pingback: How To Make Money Blogging
  19. Hi Jan-Willem,

    There does seem to be some kind of relationship between the recent devaluations of a number of blog networks and this update. I do think a lot of pages are going to start disappearing from the front pages of search results, and I know a good number already have.

  20. I think Google takes into account a more detailed distribution among the links to the home page and those to deep pages. I feel that the filter put in place yesterday is very localized, and it affects the pages whose linking was optimized.

  21. Hi Bill,

    While I commend Google for what they are aiming for, sadly they appear to have dropped the ball on this one. Going forward, and looking at the content of this post, it fits in line with my thinking that Google will now go after platforms and hosting publishers, rather than the individuals who use those hosting publishing platforms. Maybe, possibly 🙂

    As we know, there are platforms out there that are targeted by the heavy-duty content sploggers, like SENuke and the like. I have no doubt in my mind that Google will start completely blocking any form of link juice from flowing from these platforms, or maybe even blocking them altogether. Will they ever block Blogger, I wonder? (A joke… well, almost a joke.)

    A lot more babies are going to go out with the bathwater before we get acceptable clarity in the SERPs.

  23. It’s good to maintain quality.

    In my opinion, Google has failed a lot, though. Try to google the word “lyrics” for a little-known song. Let’s say a song with lyrics in Japanese, French, etc… The first results not only don’t have the lyrics you wanted, but can be completely irrelevant. I don’t know how they got their names onto the first page; I just know that they are not worth it.

    And don’t even think of searching for a translation of your loved, yet not popular, song. If it exists, it is buried somewhere on the Internet and you won’t be able to reach it…

  24. Great post. In my experience, there have always been fluctuations within the SERPs, but in the last several months they have definitely increased significantly. As SEOs do, I like to check in on these fluctuating sites to see what they’re doing about it, and I have found a pattern.

    Websites that may have previously ranked well are frantically changing content and setup throughout their websites. In my opinion it shows that Google really has cracked down on spammy content – more specifically keyword stuffing, from what I can see.

    I know that there are some pretty annoyed webmasters out there who are not at all happy with some of Google’s algorithm changes, but I think it has had exactly the effect they wanted, which was to open up the playing field to better-populated sites.

  25. Hi Ryan

    I’m seeing a lot of fluctuations for some queries as well, and movements in rankings, and a number of the results seem to be there because they are “fresh” results for those queries, showing one day when they are “news” and disappearing the next day to be replaced by other new results. That might be a little more related to the announcement on the Inside Search blog last month (50 changes) and the one from the month before (40 changes) that involved more fresh results appearing.

    But yes, there’s a lot of movement, and some of the results appearing highly don’t look good.

  26. Hi cmsbuffet,

    The second screenshot in the Google Webmaster Central blog post might be a spun article, but what’s being focused upon there are the “unusual links” that don’t appear to be very relevant to the content on the page. I do suspect that Google will try to identify spun articles using methods like statistical language models, as a possible symptom of web spam.

  27. Hi Jeff,

    Unfortunately, I think there always exists the possibility of false positives and collateral damage when Google makes an algorithm change that affects rankings of sites. Unintended consequences happen. I haven’t spent much time at the Google Webmaster Central help forums yet, but I expect we’ll see some surface there.

  28. Hi Thomas,

    I’d definitely agree that sites that have thin link profiles are going to be among the most impacted by a change that in part affects the link graph in any way.

  29. Hi Mahesh,

    From what Danny Sullivan wrote at Search Engine Land, this is what Matt Cutts was referring to at the SXSW conference, but he mentioned that the name “over optimization penalty” was a poor choice.

  30. Hi Mark,

    Yes, this algorithm change does seem to be independent of things Google is doing with social media at this point. I haven’t seen any mention of it, or associations with the changes we’re seeing.

  31. OK then – awesome update. All I need to do now if I am on page one is fire up XRumer and blast some crappy profile links directly at the competition. Sweet!! Welcome to the era of negative SEO!

    Are you guys serious?

  32. Hi Steve,

    Really good to see you. It can be really frustrating to see pages ranking above yours and take a closer look and wonder what Google is doing ranking those pages where they have.

    I’ve had the feeling that Google hasn’t been focusing on enforcing their guidelines in the past couple of years as much as on working on other initiatives. It looks like they may be making those guidelines more of a priority now.

  33. I haven’t noticed anything yet, except that I’m not seeing any PR on the toolbar for a lot of pages. Time will tell…

    Really good post, I liked how in-depth it was!

  34. Hi Ted,

    The Google blog post tells us that “this algorithm affects about 3.1% of queries in English to a degree that a regular user might notice,” but it’s possible that the impact involves a greater percentage of queries than that. I’m guessing that your 10-20% estimate is much closer to what is going on.

    I tried to find some good resources, including a few that I haven’t seen mentioned a lot, like the “Spam: It’s not Just for Inboxes Anymore” paper, which is a pretty readable resource.

  35. Hi Scott,

    Thanks. I think Google’s been working to make it more difficult for people to see the benefits of web spam, by doing things like limiting the benefits or effectiveness of those actions rather than applying some kind of penalty. It seems like they are taking a harder stand now, and the messages that they sent out to so many webmasters might provide them with a lot more information than they had in the past. Because of that, we may see Google taking much more action in the somewhat near future than they have in the past few years.

  36. Hi Nathan,

    I’m not sure if we’re going to get too much clarification from Google about what they actually mean, but from our perspective it’s going to be hard to tell what was impacted by this change, and by others like Panda refreshes or other changes Google might make. 🙂

  37. Hi Jawad,

    I think in some ways, we are still in the early days of this update to Google. As I mentioned a couple of comments above, the reconsideration requests that are likely being sent into Google may provide fuel for future changes as well, whether manual or algorithmic.

  38. Hi Paul,

    I remember back when results on AltaVista started appearing less and less relevant. Maybe that was in comparison to what we were seeing at the early Google, and maybe it wasn’t. But Google’s health does depend upon how searchers perceive its effectiveness. It’s possible that some of the irrelevant results you’re seeing might not last too long, as rankings fluctuate with this change. I’m probably going to be checking back on them in a few days.

    Did the aerospoke site change since you last looked? It seems like a modern site, with a 2012 in the copyright notice, a Twitter widget, a link to a “new 2011” catalog, and some blog posts from 2009. It could probably use some TLC, but I don’t think I’d call it a throwback to the 90s.

  39. Hi Kyle,

    Thanks. I do like to try to see a little bit behind the curtains with patents and papers. We’re likely never going to know exactly what’s going on unless maybe we go to work for Google (and maybe not even then), but looking at the patents can help provide some potential clues.

  40. Hi Shony,

    We’ll see what happens. I’m not sure that Google is showing any special preference for bloggers, but maybe a little harsher enforcement of their guidelines might change the landscape some.

  41. Hi JamesO,

    Thank you very much for sharing your observations and experiences. Sorry to hear about the significant drop that you experienced.

    The increased crawling might have been a sign that Google is spending more time focusing upon identifying clusters of links and how pages within them might be related.

    The paper, “Trawling the Web for emerging cyber-communities” describes how Google might potentially be looking for web pages that are related to each other based primarily upon how they are linked together (do a search for the term, and download it from CiteSeer, where it’s the top result). While the paper is from IBM, one of the authors, Andrew Tomkins, became the Yahoo Chief Scientist for a few years until he left for Google, where he is now.

    The sudden increase in back links to your pages is interesting, too.

    Given the circumstances and your activities since the 2009 link building exercise, it sounds like a reconsideration request, along with some more positive link building in the future, might be a good next step.

    I hope it works out well for you.

  42. Hi Nicolas,

    That focus upon specific pages or articles that were the most heavily linked to seems like it would fit what is described in the first patent in my post. Here’s one of the claims from that patent:

    A computer-implemented method comprising: forming a cluster of documents from a plurality of network-accessible documents by identifying a dense bipartite subgraph from the plurality of network-accessible documents, the dense bipartite subgraph comprising a first set of doorway documents and a second set of target documents, wherein doorway documents in the first set have links to target documents in the second set; analyzing a plurality of documents in the cluster of documents to determine an overall value for the cluster; and when the overall value is greater than a threshold value, marking at least one of the documents in the cluster as a manipulated article.*

    * My Emphasis
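
    To give a rough sense of that first “forming a cluster” step, here’s a brute-force toy version in Python of finding dense bipartite cores in a link graph. The (i, j) core sizes are arbitrary, and a production system would need a scalable approach like the trawling algorithm mentioned in my reply above, rather than enumerating combinations:

    ```python
    from itertools import combinations

    def dense_bipartite_cores(link_graph, i=3, j=3):
        """link_graph maps a source URL to the set of URLs it links to.
        Yields (doorway_set, target_set) pairs where each of i sources
        links to all of j (or more) shared targets."""
        for doorways in combinations(link_graph, i):
            shared = set.intersection(*(link_graph[d] for d in doorways))
            if len(shared) >= j:
                yield set(doorways), shared

    # Toy usage: three doorway pages all pointing at the same three targets.
    links = {
        "doorA": {"t1", "t2", "t3"},
        "doorB": {"t1", "t2", "t3", "x"},
        "doorC": {"t1", "t2", "t3"},
        "normal": {"y"},
    }
    for doorways, targets in dense_bipartite_cores(links):
        print(sorted(doorways), "all link to", sorted(targets))
    ```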

  43. Hi Old Welsh Guy,

    Good to see you.

    If Google is looking among densely connected parts of link graphs (what are known as “dense bipartite subgraphs”) as a first step in identifying “manipulated articles,” it’s possible that people using the platforms you mention are being targeted for examination, to see if individual documents show signs of the issues mentioned in the Google blog post. Is that targeting the platforms, or their fingerprints?

  44. Hi sofia,

    It’s possible that we aren’t seeing the final results of the changes that Google is making, and may not for a while.

    Interesting that you chose an example of a music-related site. One of the posts I linked to above, How Google Might Fight Web Spam Based upon Classifications and Click Data, describes music-related sites as among those that tend to have the most web spam. Here’s a snippet from the patent:

    A large fraction of all searches are related to only a few commonly-occurring topics, in particular, entertainment-related topics, such as computer games, movies and music. Unfortunately, “spam pages” are a significant problem for searches related to these commonly occurring topics. A large percentage (sometimes 90%) of web pages returned by search engines for these commonly occurring topics are “spam pages,” which exist only to misdirect traffic from search engines. These spam pages are purposely designed to mislead search engines by achieving high rankings during searches related to common topics. However, these spam pages are typically unrelated to topics of interest, and they try to get the user to purchase various items, such as pornography, software, or financial services.

    Looks like things might not have changed much since 2006, when that patent was filed.

  45. Hi mkonday,

    And there’s some ambiguity in some of Google’s guidelines that can have sites taking steps that might be seen as a tightrope walk over those guidelines. Just what is a “bad neighborhood”? From the guidelines:

    In particular, avoid links to web spammers or “bad neighborhoods” on the web, as your own ranking may be affected adversely by those links.

  46. Hi Warren,

    Thank you. I suspect that you might be right about the increase in ranking fluctuations, and in attributing them to people making more frequent changes to their pages. I suspect the only ones who can tell whether that is one of the major causes of shifts in rankings are the search engines. It’s been a while since I’ve seen a study on the rate of changes to pages on the Web.

    I have seen a lot of forum threads and posts about drops in rankings, and I understand the frustration. Hopefully those site owners can find ways to regain some of the rankings and traffic that they lost.

  47. Hi Evolve,

    My post is an attempt to look at what Google might be doing to effect the changes that they made with this algorithm change, by looking at some of the patents and papers that they’ve published in the past, and some of the changes to Google itself. Whether or not you’re going to find any value in the post is going to be up to you.

    It’s possible that Google is doing something completely different, but I don’t think that it hurts to start looking at what they’ve published in the past.

  48. Hi Jason,

    Thanks.

    Regarding the Google Toolbar, it’s possible that Google is working on updating it, which might be why you’re not seeing it for many pages. That might be a complete coincidence, or it might have been planned to coincide with the algorithm change. My suspicion is that it’s probably a coincidence. Updating the PageRank in the toolbar when so many rankings are in the midst of changing may not be a good idea.

  49. Pingback: How Much Of Google’s Webspam Efforts Come From These Patents? | WebProNews
  50. OMG! What a bunch of crap! Google just fucked my world up! Thanks a lot, assholes! Years of work down the drain. This is nothing less than Google pandering to big business and screwing over the little guy! FUCK YOU GOOGLE!

  51. Pingback: New Google Algorithm Update ! | cleangreenseo.com
  52. I’m just getting started with a blog and learning SEO and all that. It’s hard to determine the best way to move forward. I’m hoping blog comments for back links, article marketing, and social media will be safe, so long as my content remains original and frequently updated. Seems to take a long time, though.

  53. I have noticed my sites shifting to various pages over the last week. Not quite sure how long things will take to settle down. Let’s hope Google knows what they’re up to this time.

  54. We have been jumping around like crazy. Also from page 1 to 3 and back again. Curious indeed. Anxious to see how all of this shakes loose. Great info on this post, enjoyed reading the older PDFs.

  55. I’ve definitely seen these updates have negative effects on some of my sites, but this is good. The sites where I’ve seen negative effects are experimental sites I keep around to tinker with less-than-white-hat techniques. I bought some links from a blog network a few weeks ago and shot up to page #1 from nowhere in about seven days on this site’s three primary keywords. THEN blog networks started getting zapped, and sure enough, my little experimental site completely disappeared from the SERPs. My site didn’t get de-indexed or receive any warnings itself; just the blog network was nixed. This is good. It means that Google got rid of the spam, but left my site intact so that I might now be able to pursue SEO that is within guidelines. If this were not just a throwaway site for me, I’d be relieved by that.

  56. Classic Google. They’ve “always” been doing it, yet there needs to be this massive update to take care of all the “over-optimized” web spam. It’s like they wait all this time for what they deem to be spam to build up, then roll out an update and make a big deal about it. Spam comes back; repeat cycle. And many good websites get hurt in the process! Thankfully, no problems with my sites or those of my clients.

  57. Pingback: Google Penguin is About Inbound Links Now STFU!
  58. Excellent post Bill, I particularly love the insight into the patents. The Penguin answers are all there if people bothered to look.
    Most interesting is the bipartite subgraph link analysis. It demonstrates how easy it is to identify spun content, unnatural linking, etc., and to use those links against the link spammers! Ha Ha!

  59. Hi Pat,

    I’m sorry to hear that Google’s updates had such a drastic impact upon your business. I hope you’re able to put the pieces back together and recover.

  60. Hi Jon,

    I’d definitely recommend doing a lot of research into SEO and spending a fair amount of time with Google’s guidelines, help pages, videos, and blog posts.

    If you’re planning on relying upon blog comments and article marketing for getting backlinks, you’re missing a lot of opportunities, and you’re not necessarily focusing upon the things that might benefit you most in terms of links to your pages.

    Original and frequently updated content is fine, but focus upon making it content that people find helpful and useful, interesting and engaging. Create content that people will want to link to, will refer people to, will share socially and tell others about.

  61. Hi Gary,

    Changes in search rankings can take place for a lot of different reasons. These can include changes you make to your own pages, changes that competitors might make to theirs, changes in the way that people search (and sometimes responses by the search engines to those changes), and changes and updates from the search engines.

    It’s hard to tell how much of an impact Google’s recent updates might have made, and there is the possibility sometimes that an algorithm change might impact sites that it shouldn’t, as well. These “false positives” can happen.

    Hoping that your pages weren’t impacted negatively.

  62. Hi Charles,

    Thanks. I really liked those older PDFs too, and thought they did a pretty good job of explaining some of the things search engines might be doing. Most of the changes I’ve seen to rankings for pages I’ve been working on are in areas where I somewhat expected them – seasonal changes for some sites, for instance. But I have seen a lot of fluctuations for some query terms.

    I hope those rankings settle down positively for you.

  63. Hi Kirk,

    Thanks for sharing your experiences with the sites you’ve been experimenting with.

    It does look like some sites that were using blog networks lost the value of those, while others were penalized. It looks to a degree like there were two stages – the identification and devaluation of link networks, and in some cases penalization of sites being linked to. Hopefully others will follow in your footsteps and find a way to help their sites recover.

  64. Hi Mike,

    Good to hear that you and your clients weren’t harmed by the recent updates.

    Google has been warning about certain practices for years, and taking some actions regarding them, but it does seem like they’ve stepped their efforts up recently. That may be due to a change in focus, or more technical capability to take some of the actions they have, or a combination of both. I’m guessing both.

  65. Hi Paul,

    Thanks. The patents and papers from the search engines do provide a lot of clues about how search engines may be working. The bipartite subgraph link analysis does describe how unusual link patterns can be used effectively to make it easier to find content to be analyzed.

  66. Penguin hit about 5 websites of mine. It is a complete failure, and I have a perfect example of it. Two websites optimized for the same keywords had similar rankings on the 2nd page of the SERPs; one of them is quality and has perfect content on it, the other one is almost empty, with little content, and just serves as a support for the quality site. The thing is, the quality website lost its rankings, while the low-quality one even made it to the 1st page, although no optimization had been done for more than 6 months.

  67. Pingback: Google Penguin Update: Google Granted Another Possibly Related Patent | WebProNews
  68. Great article Bill. Interesting to see some actual insight into the patents.

    My site has also been jumping around recently – I wonder how long this will take to settle down into a permanent position.

    So will sites that are found to be using black hat techniques get de-indexed, or will they just see their positions drop and their dodgy links discounted?!

  69. “Two more of the Phrase-Based Indexing patents were granted yesterday, Phrase extraction using subphrase scoring and Query phrasification.”

    Released on the same day? Either they are trying to fool us, or they had to do it (for legal reasons). I think the answer is in there. Bill, it would be great if you could make a post or two about these new patents with your thoughts (or whatever is new about them).

  70. Nice reading, and very interesting with the “praising SEOs who help improve the quality of web sites they work upon”. In Denmark there is unfortunately not much of a penalty for websites where SEO is done with doorway pages.
    We can hope that, with this, it changes over time. 🙂

  71. Hi Mathias,

    I wish it were easy to explain why one site that you consider to be high quality might have suffered, and the other that you consider to be much lower quality might have thrived under the Penguin update. My question would be why you optimized two different websites for the same keywords instead of using them to focus upon different keywords that might be related in some manner. But regardless of that, I can’t tell you why one site wasn’t impacted while the other was without having some idea of the on-page and off-page efforts surrounding either of those sites.

  72. Hi JJ,

    Thanks. It’s really not all that predictable how Google might treat one site compared to another. More than a couple of years ago, I saw some sites get penalized for what some people might consider fairly insignificant actions, while seeing many other sites ranking well in search results despite what I would consider fairly more serious actions.

  73. Hi Per,

    The date when patents are granted isn’t always a good indicator of when Google might have performed some specific upgrade or action related to them. It’s possible that those patents were being prosecuted by the same patent examiner, which is why they were granted on the same day. The phrase-based indexing patents do cover a lot of different aspects of how a search engine functions, including spam and duplicate content detection. These particular patents focus more upon how phrase-based indexing might be implemented on a large search engine than specifically upon spam detection, which I’ve written about in the past.

  74. Hi Anders,

    It’s possible that Google may base whether or not they implement certain aspects of their updates regionally. If the impact of a certain update might leave a large amount of queries with very sparse amounts of relevant search results, Google might not make a change in a specific area, or might delay such a change until they are able to find a better approach.

  75. Hi Bill,
    Yeah that’s true, that could be the reason why.
    As I wrote, we can only hope that it changes, along with a bigger amount of quality results in the SERPs.

    Thanks for the response, and for a great blog with many interesting articles. 🙂

  76. Hi Anders,

    You’re welcome.

    I mentioned that possibility after reading a patent filing from Google that suggests that they might treat advertisements a little differently based upon region. For instance, the quality score requirements for advertisements might not be as strict in some regions.

    Thanks for your kind words.

  77. It really is interesting to see all of these changes being implemented that affect page rankings. SEO has come a long way in just a few short years; I believe a lot of this is due to the rise in online web searches. Google has always set out to be the leader when it comes to “weeding” out websites based on content, to improve content relevancy in its search engine.
    I feel this may affect a lot of websites for a short period of time, but if a site maintains good, up-to-date, and relevant content, I think most sites will see their pages go back to the way they were, or may even experience an increase in ranking in the search engine. I choose to remain optimistic, but only time will tell.
    What do you think, Bill? Do you think this will have any effect on YouTube videos, and how they work in the search engine?

  78. I think a lot of innocent, quality sites have been caught up in what is an attack on paid links, blog networks, and SEO companies. For some search terms, Penguin has brought aged content back from 2007/2008 to rank on the first page. Ironically, some of these sites have not been updated for years.

  79. Penguin seems, more than anything else, to have targeted the backlink profiles of sites. I would like to see an article on SEO by the Sea about how blackhatters can use the negative effect caused by poor backlinks to their advantage by targeting their competitors. Google has opened up a whole new black hat industry with this latest update… The question is, why does Google not just discount bad links rather than actually counting them negatively against a site?

  80. Hi Mo,

    I think Google has been working on improving the search results they display from the time they initially launched, though the efforts they’ve gone through have changed and evolved over time.

    If a site is engaging in practices that are against Google’s guidelines, it might have a hard time recovering in some instances, though I have seen a few sites make changes and file reconsideration requests and return to previous rankings.

    Of course, changes in rankings may also be a result of competitors improving the quality and relevance of their sites as well, changes to ranking signals used by the search engines, and changes in searchers’ behaviors.

    It’s hard to say what the future of rankings are on YouTube and videos at this point. I’ve seen a number of patents that suggest that the search engines may be looking beyond things like the text associated with those to pay more attention to the content that they actually include, but don’t know how far down the line that might be.

  81. Hi George,

    It’s a little hard to generalize in the absence of some specific examples. I’ve seen many sites that have had SEO issues that they thought might have been related to one update or another that were actually harmed by something else entirely. But a site that does rely to a degree upon paid links or blog networks, either directly or indirectly through another site that uses those and links to them, might be prone to being harmed by updates that may devalue those.

    With Google trying out and applying as many updates and changes as they do, it’s hard to state specifically that some older pages that might have ranked well in the past that have returned to rank well are necessarily doing so because of Penguin. There might be other reasons for them doing so as well.

  82. Hi Nate,

    I don’t think that I would write a blog post about how black hatters can attempt to use negative SEO to negatively impact the rankings of others, but I have written a few posts, including this one, about how Google might attempt to identify dense subgraphs of links to try to find linking patterns that are attempting to primarily influence search rankings.

    I don’t think that Google has created a new industry with the introduction of Penguin. It seems that there have been people trying to harm their competitors with things like Google Bowling for years.

    I do think there are many instances where Google discounts bad links instead of “counting them negatively towards a site.” I’d definitely recommend that anyone who believes that they might have been negatively impacted by such links attempt to do a very detailed analysis of their sites first to attempt to see if there are other things on their own pages that may be negatively impacting them, and work to attract or acquire links that Google wouldn’t consider to be links aimed primarily at manipulating search results.

  83. This is just one problem of many with the Google algorithm, but there are many examples of instances where websites are hammered because of their backlinks, and as long as this is the case blackhatters can and will use this to target their competitors. And of course Google should do everything in their power that makes sense to combat any techniques blackhatters can use.

    The only reasonable way for Google to handle bad links is just to count them as NULL. They should not ever be counted negatively against a site, because there is no way to track where the links originally came from, and there is also no way (most of the time) for sites to do anything about links coming in from penalized/spam sites. Your only possible recourse is to try to contact the site owner directly, and if that doesn’t work, send them a cease and desist. And both of these options will most likely not even reach the actual site owner, or will simply be ignored.

  84. Hi Nate,

    I’m not sure that I agree. It’s true that people don’t have much control over who links to them, and someone might attempt to abuse others by pointing links at them that they didn’t want or ask for. This isn’t something new, and the actual term “Google Bowling” has been around since 2005 to describe the activity.

    It’s likely that Google isn’t counting most bad links that they find. And, in the instances where there’s been a request or two to the owner of a site that controls those links, and there’s no response, that is something that can be included in a request to Google.

  85. Even though it’s been a while since April, they seem to be continuing to roll out new updates that are really shaking the search results up. It’s interesting how some businesses who thought they were immune are now being affected, and I can see this trend continuing. I recently found a site ranking number one for a very competitive term that has been using hidden anchor text links. It’s only a matter of time before these sites that think they are immune are hit.

  86. During Penguin we found that blogrolls were hit very hard, and many large SEO companies in Sweden got big problems. Maybe this is one of the reasons that WordPress is now taking away the blogroll in WP 3.5, since many use it for spam links and EKL?

  87. Hi Bill
    I was looking for this article and dropped you a mail requesting you to write one on this topic. I have a doubt. Google has already rolled out most of the updates and de-ranking algorithms to improve the result sets for specific keywords in Google search. If someone launches a website and keeps working on SEO techniques, and the site gets a high PageRank in a couple of months, is there a possibility that the team which is looking after the SEO is unknowingly using something wrong?
    Is it possible to get a PR of 3 or 4 in 2 months?
    Please revert back.
    Thanks
    Maria
