As long as there have been search engines, there have been people trying to game them to get pages to rank higher. It’s not unusual to see within many SEO site audits a section on negative practices that a search engine might frown upon, and Google lists a number of those practices in their Webmaster Guidelines. Linked from the Guidelines is a Google page on Hidden Text and Links, where Google tells us to be wary of doing things such as:
- Using white text on a white background
- Locating text behind an image
- Using CSS to position text off-screen
- Setting the font size to 0
- Hiding a link by only linking one small character—for example, a hyphen in the middle of a paragraph
Continue reading Google Granted Patent on Invisible Text and Hidden Links
Google collects information about where you compute from, and provides location-based services based upon where you travel. Google might follow certain processes to protect and analyze that information, and to use it to protect people from spam and scrapers.
Post a review from Germany about a restaurant, and then 15 minutes later from Hawaii about another restaurant, it’s spam. Drive down a highway where the cell towers collecting information about your journey are located in the middle of Lake Michigan, it’s likely spam. If GPS says you’re in NYC, and you then connect via Wifi in Wisconsin a few minutes later, spam. This information may not even come from you, but rather from others that might impersonate you.
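The Germany-then-Hawaii example can be read as an "impossible travel" check: if two reports from the same account imply a speed no traveler could reach, at least one of them is suspect. The sketch below is only an illustration of that idea, not the method from the patent. The `Fix` type, the `is_implausible` function, and the 1,000 km/h threshold are all assumptions made for this example.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0
# Assumption for this sketch: anything faster than a commercial jet is implausible.
MAX_PLAUSIBLE_KMH = 1000.0


@dataclass(frozen=True)
class Fix:
    """One reported location (hypothetical type for this example)."""
    lat: float          # degrees
    lon: float          # degrees
    timestamp_s: float  # seconds since some common epoch


def haversine_km(a: Fix, b: Fix) -> float:
    """Great-circle distance between two fixes, in kilometers."""
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dphi = phi2 - phi1
    dlmb = math.radians(b.lon - a.lon)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def is_implausible(a: Fix, b: Fix, max_kmh: float = MAX_PLAUSIBLE_KMH) -> bool:
    """True if moving from a to b would require exceeding max_kmh."""
    hours = abs(b.timestamp_s - a.timestamp_s) / 3600.0
    distance = haversine_km(a, b)
    if hours == 0:
        # Two different places at the same instant cannot both be right.
        return distance > 0
    return distance / hours > max_kmh


berlin = Fix(52.52, 13.405, 0)
honolulu = Fix(21.31, -157.86, 15 * 60)   # 15 minutes later
potsdam = Fix(52.39, 13.06, 15 * 60)      # 15 minutes later, nearby

print(is_implausible(berlin, honolulu))
print(is_implausible(berlin, potsdam))
```

A real system would presumably combine many such signals (cell towers, GPS, Wi-Fi) rather than rely on a single speed threshold.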
Google was granted a patent last week which explores how they could use location-based data to identify spammers and scrapers. It would also put user location information in a quarantine, and possibly hide starting and/or ending points for journeys from mobile devices, both to protect privacy for users and to explore whether or not the information is spam. The search engine could use that detailed location information to keep some of it from being used in location-based services, or in other services that Google might offer.
Continue reading Google Patents Identifying User Location Spam
Imagine the Earth broken down into a series of cells, and each of those cells broken down into a series of even smaller cells, and then into smaller cells again, and so on, in a spatial index. Each level covers increasingly narrow and increasingly precise areas, or zoom levels, of the surface of the Earth.
As these cells decrease in size, they increase in number, which has the impact of increasing the zoom level and the accuracy of the areas represented in such an index. This might work well in a place like China, where latitude and longitude are banned for export as munitions. Such a set of cells might be part of a geospatial analysis module that links specific businesses and points of interest (parks, public regions, landmarks, etc.) to specific places on this model or index of the Earth. There might be one index for the businesses and one index for the points of interest, or a combined database that includes both.
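One common way to build cells within cells like this is a quadtree, where every cell splits into four children at the next level. The sketch below is an assumption about how such an index could look, not a description of the index in the patent; the `cell_id` function and its digit scheme are made up for this example.

```python
def cell_id(lat: float, lon: float, level: int) -> str:
    """Return a quadtree path for a point: one digit (0-3) per level.

    Hypothetical scheme for illustration: at each level the current
    bounding box is split into four quadrants, and the digit records
    which quadrant contains the point.
    """
    south, north = -90.0, 90.0
    west, east = -180.0, 180.0
    digits = []
    for _ in range(level):
        mid_lat = (south + north) / 2
        mid_lon = (west + east) / 2
        quadrant = 0
        if lat >= mid_lat:      # northern half
            quadrant |= 2
            south = mid_lat
        else:                   # southern half
            north = mid_lat
        if lon >= mid_lon:      # eastern half
            quadrant |= 1
            west = mid_lon
        else:                   # western half
            east = mid_lon
        digits.append(str(quadrant))
    return "".join(digits)


# Roughly the Empire State Building.
coarse = cell_id(40.7484, -73.9857, 3)
fine = cell_id(40.7484, -73.9857, 12)
print(coarse, fine)
print(fine.startswith(coarse))  # a finer cell always sits inside its coarser parent
print(4 ** 12)                  # number of cells at level 12
```

Because each level multiplies the cell count by four, a longer ID means a smaller, more precise cell, and two places share a cell at a given level exactly when their IDs share that prefix.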
Sometimes that index might include a business and a landmark within the same cell. While that could be correct in some instances, such as a shop appearing within the Empire State Building, often it’s an error, and sometimes even an intentional error. People will sometimes enter incorrect information into a geographic system like this to try to gain some kind of advantage.
If people search for something like a motel “near” a particular park, for instance, the motel that appears to be next to, or even within the boundaries of, that park might seem to have something of an advantage in terms of distance from that park when it comes to ranking the motel. And sometimes Google doesn’t seem to do the best job in the world at putting businesses in the right locations at Google Maps.
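As a rough illustration of how a business and a landmark sharing a cell could be surfaced for review, the sketch below groups entries by cell and reports cells that hold both kinds. The record layout, the `shared_cells` function, and the sample cell IDs are all hypothetical; the patent may work quite differently.

```python
from collections import defaultdict


def shared_cells(entities):
    """Group (name, kind, cell) records and keep cells holding both kinds.

    `kind` must be "business" or "landmark" in this sketch; `cell` is any
    hashable cell identifier from a spatial index.
    """
    by_cell = defaultdict(lambda: {"business": [], "landmark": []})
    for name, kind, cell in entities:
        by_cell[cell][kind].append(name)
    return {
        cell: groups
        for cell, groups in by_cell.items()
        if groups["business"] and groups["landmark"]
    }


# Made-up sample data with made-up cell IDs.
sample = [
    ("Empire State Building", "landmark", "212031"),
    ("Lobby Gift Shop", "business", "212031"),
    ("Riverside Park", "landmark", "212200"),
    ("Corner Motel", "business", "212113"),
]

for cell, groups in shared_cells(sample).items():
    print(cell, groups["business"], groups["landmark"])
```

A shared cell is only a prompt for a closer look: the gift shop inside the landmark is legitimate, while a motel that places itself inside a park probably is not.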
Continue reading Patent on (Intentional) Errors in Google Maps?
When Google ranks businesses at locations in Google Maps, they turn to a number of sources to find mentions of the name of the business coupled with some location data. They can look at the information that a site owner might have provided when verifying their business with Google and Bing and Yahoo. They may look at sources that include business location information, such as telecom directories like superpages.com or yellowpages.com, or business location databases such as Localeze. They likely also look at the website for the business itself, as well as other websites that might include the name of the business and some location data for the business, too.
What happens when the information from those sources doesn’t match? Even worse, what happens when one of these sources includes information that might be on the spammy side? A patent granted to Google this week describes a way that Google might police such listings. The patent warns against titles for business entities that include terms such as “cheap hotels,” “discounts,” and “Dr. ABC–555 777 8888.” It also might identify spam in categories for businesses that might include things such as “City X,” “sale,” “City A B C D,” “Hotel X in City Y,” and “Luxury Hotel in City Y.”
In the context of a business entity, information that skews the identity of or does not accurately represent the business entity or both is considered spam.
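To make the title examples concrete, here is a minimal sketch of a check that flags a business title containing one of the quoted terms or something that looks like a phone number. The term list comes from the examples quoted above; the `title_spam_signals` function, the signal labels, and the phone pattern are assumptions for illustration and are not taken from the patent.

```python
import re

# Terms taken from the title examples quoted from the patent.
SPAM_TERMS = ("cheap hotels", "discounts")

# Assumed pattern: a US-style 10-digit number with optional separators.
PHONE_PATTERN = re.compile(r"\b\d{3}[\s.-]?\d{3}[\s.-]?\d{4}\b")


def title_spam_signals(title: str) -> list[str]:
    """Return a list of reasons a business title looks spammy (empty if none)."""
    lowered = title.lower()
    signals = [f"term:{term}" for term in SPAM_TERMS if term in lowered]
    if PHONE_PATTERN.search(title):
        signals.append("phone-number")
    return signals


for title in ("Cheap Hotels Downtown", "Dr. ABC-555 777 8888", "Joe's Diner"):
    print(title, title_spam_signals(title))
```

Simple substring matching like this would produce false positives on its own (a store legitimately named "Discounts R Us", for example), so it is better thought of as one signal among several than as a verdict.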
Continue reading Google Tackles Geographic (Map) Spam for Businesses
Google’s Webmaster Guidelines highlight a number of practices that the search engine warns against, practices that someone might engage in to try to boost their rankings in ways intended to mislead the search engine. The guidelines start with the following warning:
Even if you choose not to implement any of these suggestions, we strongly encourage you to pay very close attention to the “Quality Guidelines,” which outline some of the illicit practices that may lead to a site being removed entirely from the Google index or otherwise impacted by an algorithmic or manual spam action. If a site has been affected by a spam action, it may no longer show up in results on Google.com or on any of Google’s partner sites.
A Google patent granted this week describes a few ways in which the search engine might respond when it believes there’s a possibility that such practices might be taking place on a page, where they might lead to the rankings of pages being improved in those search results. The following image from the patent shows how search results might be reordered based upon such rank modifying spam:
Continue reading The Google Rank-Modifying Spammers Patent
Manipulative repetitive anchor text, blog comments filled with spam, Google bombs, and obscene content could be the targets of a system described in a patent granted to Google today that provides arbiters (human and possibly automated) with ways to disassociate some content found on the Web, such as web pages, from other content, such as links to that content.
In an Official Google Blog post, Another step to reward high-quality sites, Google’s Head of Webspam Matt Cutts wrote about an update to Google’s search results targeted at webspam that they’ve now started calling the Penguin update. The day after, I wrote about some patents and papers that describe the kinds of efforts Google has made in the past to try to curtail web spam in my post Google Praises SEO, Condemns Webspam, and Rolls Out an Algorithm Change.
The patent doesn’t describe in detail an algorithmic approach to identifying practices that might have been used to manipulate the rankings of pages in search results. Instead, it tells us about a content management system that people engaged in identifying content impacted by such practices might use to disassociate certain content from webpages and other types of online content.
Continue reading How Google Might Disassociate Webspam from Content
Yesterday, Google’s Distinguished Engineer Matt Cutts published a post on the Google Webmaster Central Blog titled Another step to reward high-quality sites that started out by praising SEOs who help improve the quality of the web sites they work on. The post also noted:
In the next few days, we’re launching an important algorithm change targeted at webspam. The change will decrease rankings for sites that we believe are violating Google’s existing quality guidelines.
We’ve always targeted webspam in our rankings, and this algorithm represents another improvement in our efforts to reduce webspam and promote high quality content.
This isn’t something new, but it sounds like Google is turning up the heat some on violations of their guidelines, and we’ve seen patents and papers in the past that describe some of the approaches they might take to accomplish this change.
Continue reading Google Praises SEO, Condemns Webspam, and Rolls Out an Algorithm Change
How much does feedback from searchers impact the search results that we see at Bing or Google? How do those search engines process and respond to that feedback?
The links that Google and Bing present for searchers to provide feedback on search results are listed at the bottoms of the search results pages for each. If there were instead a link after each search result where someone could provide feedback, how much of an impact would that change have, and would the search engines be able to handle the feedback that they receive?
A patent granted to Microsoft this week describes how the search engine may automate processes for “dissatisfaction reports” that are manually submitted by searchers, and how the search engine may file its own dissatisfaction reports in some instances. While some of the feedback that search engines receive may include web spam reports, they may also receive feedback that something is “broken” with the search engines, or that a URL that should be showing for a specific query isn’t, or that the results just weren’t helpful.
Providing Feedback at Bing and Google
Continue reading How a Search Engine May Automate Web Spam Reports and Search Feedback