Post your URL in an SEO forum, and get labeled as a Web Spammer? Maybe.
There are some site owners and internet marketers who attempt to increase how well their web sites rank in search engines by buying links to their sites, or exchanging links with others. Those kinds of activities are frowned upon by the major search engines because that kind of manipulation can impact which pages show up in search results. As Google notes on one of their help pages on link schemes:
Your site’s ranking in Google search results is partly based on analysis of those sites that link to you. The quantity, quality, and relevance of links count towards your rating. The sites that link to you can provide context about the subject matter of your site, and can indicate its quality and popularity. However, some webmasters engage in link exchange schemes and build partner pages exclusively for the sake of cross-linking, disregarding the quality of the links, the sources, and the long-term impact it will have on their sites. This is in violation of Google’s webmaster guidelines and can negatively impact your site’s ranking in search results.
Likewise, there are forums where people publicly discuss the exchange of links to manipulate search results.
Microsoft has published a patent application that describes how they might target (and possibly hand pick) Search Engine Optimization (SEO) related forums where they believe such activity may take place, and crawl those to see if they can identify requests for link exchanges.
Forum Mining for Suspicious Link Spam Sites Detection
Invented by Bin Gao, Tie-Yan Liu, Hang Li, and Congkai Sun
Assigned to Microsoft
US Patent Application 20090198673
Published August 6, 2009
Filed: February 6, 2008
An anti-spam technique for protecting search engine ranking is based on mining search engine optimization (SEO) forums. The anti-spam technique collects webpages such as SEO forum posts from a list of suspect spam websites, and extracts suspicious link exchange URLs and corresponding link formation from the collected webpages.
A search engine ranking penalty is then applied to the suspicious link exchange URLs. The penalty is at least partially determined by the link information associated with the respective suspicious link exchange URL.
To detect more suspicious link exchange URLs, the technique may propagate one or more levels from a seed set of suspicious link exchange URLs generated by mining SEO forums.
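The abstract doesn't spell out how that propagation works, so the following is only a rough sketch of one way it could be done. It assumes a hypothetical link graph (`outlinks`), a suspicion score of 1.0 for URLs found in forum posts, and a `decay` factor that weakens the score at each level away from the seed set. None of those names or numbers come from the patent filing.

```python
from collections import deque


def propagate_suspicion(seed_urls, outlinks, max_levels=1, decay=0.5):
    """Spread a suspicion score outward from a seed set of URLs.

    seed_urls  -- URLs mined from forum posts (assumed score: 1.0)
    outlinks   -- dict mapping a URL to the URLs it links to (hypothetical)
    max_levels -- how many link hops to follow from the seed set
    decay      -- assumed factor applied to the score at each hop
    """
    scores = {url: 1.0 for url in seed_urls}
    queue = deque((url, 0) for url in seed_urls)

    while queue:
        url, level = queue.popleft()
        if level >= max_levels:
            continue  # do not follow links beyond the allowed depth
        for target in outlinks.get(url, ()):
            if target not in scores:  # keep the first (strongest) score
                scores[target] = scores[url] * decay
                queue.append((target, level + 1))
    return scores


graph = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["d.example"],
}
one_level = propagate_suspicion(["a.example"], graph, max_levels=1)
two_levels = propagate_suspicion(["a.example"], graph, max_levels=2)
```

With one level of propagation, only the pages that the seed URL links to directly pick up a score. With two levels, pages another hop away are included with a smaller score. A ranking penalty could then be scaled by these scores, though how Microsoft would weight them isn't something this sketch claims to know.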
There’s a nice discussion in the background section of the description in the patent filing about some of the methods that search engines have developed to try to identify web spam, including a few paragraphs on the evolution of web spamming approaches:
Web spamming techniques have also evolved in time. The first generation spam involved keyword stuffing when ranking was dependent on document similarity. The second generation spam involved link farms when ranking was largely dependent on site popularity. The third generation spam uses mutual link exchange through “mutual admiration societies” when ranking is largely dependent on page reputation. In general, the third-generation Web spamming is harder to detect than the previous generations.
Link spamming techniques, which include buying/selling links, exchanging links, and constructing link farms, are a major category of the commonly used spam techniques. Link spamming refers to the cases where spammers set up structures of interconnected pages in order to boost their rankings in link structure-based ranking system such as PageRank. Since link analysis is a crucial factor for commercial search engines, link spam is among the most popular and harmful techniques for search engines nowadays.
The patent application also defines and discusses anti-link spam approaches such as TrustRank, BadRank, and SpamRank, and how they attempt to automatically detect link spam and web spam. We’re told that those methods aren’t effective in certain situations, and that the “link spam problem has yet to be solved.”
One attempt at a solution is to pay more attention to places where people may be openly discussing the exchange of links on the web, and to use the URLs identified in those discussions as a “seed set” from which to crawl and identify the other pages they link to. The patent filing refers to these places as “search engine optimization (SEO) forums,” which may be manually selected.
Search engine ranking penalties may be applied to URLs that have been identified through the methods described in the patent filing, which rely upon finding URLs mentioned in discussions of link exchanges without actually visiting the sites themselves or analyzing their content. We’re told there that:
To conveniently and efficiently exchange link trade information, spammers usually log onto SEO forums to communicate with each other for trading links, including link exchange, link sale, and recommendation link exchange.
These forums are increasingly more popular. Spammers post requests for “link exchange”, “buy & sell link”, and “recommendation exchange” in these forums, along with the URLs of their websites, and other interested spammers may reply the requests and provide the URLs of their websites.
In recognition of these activities, instead of searching and analyzing these spamming websites themselves, the technique described herein identifies the URLs of them by analyzing the context in the posts by spammers on the SEO forums.
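To make that idea a little more concrete, here is a minimal sketch of pulling URLs out of forum posts that mention link trading. The phrase list is taken from the quoted passage above, but the function name, the simple URL pattern, and the substring matching are my own assumptions for illustration. The patent filing describes analyzing the context of posts in more detail than this does.

```python
import re

# Phrases quoted in the patent filing as typical link-trade requests.
TRADE_PHRASES = ("link exchange", "buy & sell link", "recommendation exchange")

# Deliberately simple URL pattern (an assumption, not from the patent).
URL_PATTERN = re.compile(r"https?://[^\s<>()]+")


def extract_suspicious_urls(posts):
    """Map each URL found in a link-trade post to the phrases seen with it."""
    suspicious = {}
    for post in posts:
        text = post.lower()
        matched = [phrase for phrase in TRADE_PHRASES if phrase in text]
        if not matched:
            continue  # post doesn't mention link trading; ignore its URLs
        for url in URL_PATTERN.findall(post):
            url = url.rstrip(".,;")  # drop trailing sentence punctuation
            suspicious.setdefault(url, set()).update(matched)
    return suspicious


posts = [
    "Looking for a link exchange with http://site-a.example, reply here.",
    "Great tips on page titles at http://blog.example/post",
]
found = extract_suspicious_urls(posts)
```

Only the first post mentions a link trade, so only its URL ends up in the result. The second post shows the obvious weakness of an approach like this: a URL is judged by the words around it rather than by anything on the site itself, so the choice of forums and phrases matters a great deal.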
There are many forums where search engine optimization is discussed that provide helpful and useful information to people who participate in those forums.
They may offer a chance for people to discuss best practices, exchange ideas on how to create better experiences for their visitors, and offer constructive criticism on design and other aspects of a site. Many forums operate as a Community of Practice or an online Third Place as envisioned by Ray Oldenburg.
But there are also forums where discussions about “links for sale” or “exchanging links” or “reciprocal links” may take place. I’m not sure why the researchers at Microsoft felt that they needed to file a patent to protect the idea of finding such sites, and using them to attempt to identify potential web spam.
The patent application does go into much more detail on some of the processes that Microsoft (and possibly other search engines) might use, and is recommended reading if you participate in a forum that discusses SEO.