Does Google use whois information?


Some recently published patent applications from Go Daddy explore whether additional whois information might help reduce spam and phishing, and improve search engine results. Google noted in a patent application last year that they might be looking at whois information while presenting and ranking pages.

I don’t know how easy it would be to set up the processes Go Daddy describes, to verify the reputation information involved, or to maintain the records the system would depend upon.

The purpose of whois information

But it might be a moot point even to wonder. A recent decision by the folks at ICANN to limit the use of whois information makes it seem unlikely that the scenarios envisioned by these documents will happen. ICANN’s Generic Names Supporting Organization held a vote in which they decided upon the sole purpose of whois information:

“The purpose of the gTLD Whois service is to provide information sufficient to contact a responsible party for a particular gTLD domain name who can resolve, or reliably pass on data to a party who can resolve, issues related to the configuration of the records associated with the domain name within a DNS nameserver.”

It may still be worth discussing, because that vote also seems to put a damper on Google’s use of whois information in the manner described in their patent application on historical data (more on that below).

Go Daddy’s reputation information for whois

Imagine registrars having the ability to add reputation information to the whois information collected on domain names.

This information would be contained in a database that could be accessed by people and even by search engines. It could contain material on a site (or even on URLs) from the registrar and organizations like TRUSTe, Verisign, SenderBase.org, Spamcop, and others. For example, it might have data on the amount and frequency of spam originating from a domain and complaints about spam, phishing, and a wide range of website content.

The patent applications list the following types of content as examples of what might be included in this reputation database:

illegal drugs, alcohol, tobacco, sex, pornography, nudity, or any other form of adult content, profanity, violence, intolerance, hate, racism, militant groups, extremists, Satanism, witchcraft, gambling, casino, spam, MLM, pyramid schemes, fraud, or any other illegal activity, etc.

Search engines would have the ability to increase or decrease rankings of sites based upon scores from the reputation information.

The patent applications

On October 29, 2004, Go Daddy filed the following patent applications with the US Patent and Trademark Office. They were all published on May 4, 2006, and assigned to Go Daddy. The inventors listed on all three are Warren Adelman and Michael Chadwick. The three documents differ in their abstract and claims sections but contain the same description sections.

Publishing domain name related reputation in whois records (US Patent Application 20060095459)

Abstract

The invention describes a method for publishing domain name-related reputation data in WHOIS records.

Reputation data may be published in the WHOIS records of the domain name. Reputation data may include values, ratings, scores, and links or references to the locations where such values, ratings, or scores may be found (e.g., URL link). The reputation data may be tracked on the domain name itself, URLs, domain name purchaser or registrant, or email addresses associated with the domain name.

The reputation data may include various categories, such as email practices, website content, privacy policies and practices, fraudulent activities, domain name-related complaints, overall reputation, etc. The requester may decide whether to allow email messages or visit URLs based on the domain name-related reputation.
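To make that first abstract a little more concrete, here is a minimal Python sketch of how a requester might read reputation data out of a whois record and decide whether to accept email from a domain. The filings don't define a record format, so the "Reputation-*" field names, the 0 to 100 scale, and the threshold are all my own assumptions, not anything Go Daddy has specified.

```python
# Hypothetical WHOIS-style record. The "Reputation-*" field names and the
# 0-100 scale are invented for illustration; they are not defined by the
# patent applications or by ICANN.
SAMPLE_RECORD = """\
Domain Name: example.com
Registrar: Example Registrar, Inc.
Reputation-Email-Practices: 82
Reputation-Website-Content: 40
Reputation-Overall: 61
Reputation-Details: https://reputation.example/report/example.com
"""


def parse_reputation(record_text):
    """Collect numeric Reputation-* fields into a dict keyed by category."""
    scores = {}
    for line in record_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep or not key.startswith("Reputation-"):
            continue
        value = value.strip()
        # Skip non-numeric fields, such as a link to a fuller report.
        if value.isdigit():
            scores[key[len("Reputation-"):].lower()] = int(value)
    return scores


def accept_email(scores, threshold=50):
    """Requester-side policy: accept mail only when the email-practices
    score meets the (assumed) threshold. Missing data counts as zero."""
    return scores.get("email-practices", 0) >= threshold
```

The point of the sketch is that the decision sits with the requester: the record only publishes scores, and each mail server or visitor picks its own cutoff.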

Tracking domain name related reputation (US Patent Application 20060095586)

Abstract

Systems and methods of the present invention allow for tracking domain name-related reputation by a domain name Registering Entity (e.g., Registry, Registrar, etc.).

The Registering Entity maintains a database with reputation data that the requesters can access in a preferred embodiment. The Registering Entity may update reputation data based on a variety of events related to the domain name. For example, the reputation data may be tracked on the domain name itself, URLs, domain name purchaser or registrant, or email addresses associated with the domain name.

The reputation data may include various categories, such as email practices, website content, privacy policies and practices, fraudulent activities, domain name-related complaints, overall reputation, etc.

The registrant may opt for a reputation service while registering a domain name. The requester may decide whether to allow email messages or to visit URLs based on the domain name-related reputation.

Presenting search engine results based on domain name related reputation (US Patent Application 20060095404)

Abstract

The invention describes a method for presenting search engine results based on domain name-related reputation data. The search engine may sort or order search engine results based on domain name-related reputation data.

In some cases, links connected to low-reputation domain names may be excluded from search engine results. Alternatively, the search engine may show reputation ratings next to the links in the search engine results, thus allowing the Internet user to determine whether to visit the link or not.

The reputation data may be tracked on the domain name itself, URLs, domain name purchaser or registrant, or email addresses associated with the domain name. The reputation data may include various categories, such as email practices, website content, privacy policies and practices, fraudulent activities, domain name related complaints, overall reputation, etc.
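As a rough illustration of the three behaviors in that abstract (reordering results, excluding low-reputation domains, and showing a rating beside each link), here is a small Python sketch. The Result fields, the 0 to 1 scales, the exclusion cutoff, and the blending weight are hypothetical; the applications don't say how a search engine would combine reputation with its usual relevance scores.

```python
from dataclasses import dataclass


@dataclass
class Result:
    url: str
    relevance: float   # the engine's ordinary ranking score, assumed 0..1
    reputation: float  # hypothetical domain reputation score, assumed 0..1


def rerank(results, min_reputation=0.2, weight=0.3):
    """Drop results below a reputation cutoff, then sort the rest by a
    weighted blend of relevance and reputation (both values are assumptions)."""
    kept = [r for r in results if r.reputation >= min_reputation]
    return sorted(
        kept,
        key=lambda r: (1 - weight) * r.relevance + weight * r.reputation,
        reverse=True,
    )


def render(results):
    """Show the reputation rating next to each link."""
    return [f"{r.url} (reputation: {r.reputation:.0%})" for r in results]
```

A linear blend is only one way to do this. A search engine could just as easily treat reputation as a filter alone, or as a tie-breaker between results with similar relevance.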

Reading through these applications, some areas made me wonder if this system could be abused. But it’s an interesting approach. The Go Daddy method would increase the amount of information shared with people. A recent CircleID article noted that there’s a strong movement towards being able to access less whois information.

This, of course, raises the question of whether whois information should be used in this manner. There are a large number of other articles on the subject in CircleID’s Privacy Matters section.

Google’s use of whois information?

Last year’s patent application from Google, Information retrieval based on historical data, also described the potential use of whois information to aid in the rankings of web pages, looking at information like the length of the registration of a web site, or other aspects of the registration, such as:

  • Whether physically correct address information exists over a period of time,
  • Whether contact information for the domain changes relatively often,
  • Whether there is a relatively high number of changes between different name servers and hosting companies,
  • Whether there is known-bad contact information, name servers, and/or IP addresses associated with a domain.
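Here is a minimal Python sketch of how signals like these might be computed, assuming someone had collected dated snapshots of a domain's whois record. The WhoisSnapshot fields and the blocklist are hypothetical, and nothing in the patent application says Google stores or scores whois data this way.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class WhoisSnapshot:
    seen: date             # when this copy of the record was collected
    registrant_email: str
    name_server: str
    expires: date          # registration expiry date listed in the record


# Hypothetical blocklist of contact details already tied to spam.
KNOWN_BAD_EMAILS = {"bulk@spam.example"}


def history_signals(snapshots):
    """Summarize changes across a domain's whois history.
    Assumes at least one snapshot is supplied."""
    snaps = sorted(snapshots, key=lambda s: s.seen)
    pairs = list(zip(snaps, snaps[1:]))  # consecutive snapshots
    latest = snaps[-1]
    return {
        "contact_changes": sum(
            a.registrant_email != b.registrant_email for a, b in pairs
        ),
        "name_server_changes": sum(
            a.name_server != b.name_server for a, b in pairs
        ),
        "known_bad_contact": any(
            s.registrant_email in KNOWN_BAD_EMAILS for s in snaps
        ),
        "days_until_expiry": (latest.expires - latest.seen).days,
    }
```

Whether any of these counts should raise or lower a ranking is a separate question. The sketch only shows that the signals themselves are cheap to derive once the history exists.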

Information about name servers is also cited as a way to determine if a domain is “legitimate,” such as how long a domain has been on a name server, or:

  • Whether the name server hosts a mix of different domains from different registrars and has a history of hosting those domains,
  • Whether the name server hosts mainly pornography or doorway domains or domains with commercial words,
  • Whether it contains primarily bulk domains from a single registrar,
  • Whether the name server is brand new.
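The name server checks could be sketched the same way. This Python example assumes we already have a list of (domain, registrar) pairs hosted on a name server and the date it was first seen; those inputs, and the idea of measuring a "top registrar share," are my own illustration rather than anything taken from the patent application.

```python
from collections import Counter
from datetime import date


def name_server_profile(domains, first_seen, today):
    """Profile a name server from the domains it hosts.

    domains: list of (domain, registrar) pairs (hypothetical input format).
    first_seen: date the name server was first observed.
    """
    registrars = Counter(registrar for _, registrar in domains)
    # Share of hosted domains that come from the single most common registrar.
    top_share = max(registrars.values()) / len(domains) if domains else 0.0
    return {
        "distinct_registrars": len(registrars),
        "top_registrar_share": top_share,
        "age_days": (today - first_seen).days,
    }
```

A high top-registrar share on a very young name server would match the "bulk domains from a single registrar" and "brand new" patterns in the list above, though plenty of legitimate hosts would look the same.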

Conclusion

The use of this information in this manner doesn’t seem to mesh well with the defined purpose of whois information above or the findings of the task force that explored the purpose of whois information. Monitoring this information for ranking pages on a commercial search engine would seem to be against the spirit of whois as defined by the task force.

Then again, we don’t know if Google was using this information in the first place. Does Google use Whois information? We don’t know. But it seems from this vote by the Generic Names Supporting Organization that they shouldn’t be.


12 thoughts on “Does Google use whois information?”

  1. Hi Karl,

    I’m not sure that they will be using Alexa information in the future either. 🙂

    When Google became a domain name registrar after this patent application was published, one thought I saw some people write about was that Google was trying to use this information in a manner like what is described in the patent application.

    If they are, it doesn’t sound like it is a use in accord with this recent decision from ICANN.

  2. I think Google is using the whois information because it would help them determine related sites via owner. Because they are trying to provide the most relevant results they would use anything that might help them. If I were Google, I would want to know if a spammer took up the top 10 search results for a popular search term.

  3. I really can’t see Google needing to use their information when they can most likely get it themselves. They did use Alexa for some information but, lol, will not be any more.

  4. I agree with you, Neil, that it would be helpful to know if sites are related by owner.

    But, I would suspect that the names entered in the Whois information could be different than the names of the people who paid for the registration, and may not be helpful in that manner.

  5. Hi Caroline,

    That’s a pretty good succinct overview of the patent application. I’ll have to look over some of your other articles. 🙂

    One thing that we can have problems with on some of these filings with the USPTO is whether or not the processes described in them have been implemented, or are a direction that Google or another company might want to follow but hasn’t yet.

    There’s a lot of great stuff in that Historical data patent application, but it’s difficult to tell if they have taken everything from it, and put it into place, or even come up with something new and different since it was first filed and published.

    We do know that in early April, the ICANN body overseeing whois data, the Generic Names Supporting Organization, asked for papers on a number of subjects, including the use of whois information by registrars. If Google is using whois information to help decide upon scores for pages, will changes in how they can use that information force them to stop? Maybe.

  6. I’ve heard a lot of stories about domain name importance. However, check out the figures below – here are the top five most expensive domain sales in the last few years:
    Business.com – $7.5 million
    AsSeenOnTv.com – $5.1 million
    Altavista.com – $3.3 million
    Wine.com – $2.9 million
    Autos.com – $2.2 million

    I am not sure how such an investment – paying millions of USD for a simple name – can be recovered and used to produce profit. There was a story about Coca-Cola, which said that if the company lost all its infrastructure, but kept the name and brand characteristics, it could bounce back in about 4 years. However, if the company managers were left with all the infrastructure, but lost the Coca-Cola name, the company would be very likely to go bankrupt. Do you think this is also true for domain names?

    Regards,

    Michael Rad
    Web2earn.com – learn how to make money online

  7. Hi Michael,

    We’re a little off the topic of the use of whois information, but it’s an interesting digression.

    A domain name isn’t always a brand name, and there’s some value in some domain names themselves because people will type in the name in a search box to see what’s there.

    I think that’s true with a domain name like “business.com,” which has been active for a few years, yet hasn’t taken many steps to build a strong brand around the name. There is a lot of value to type-in-traffic. If they added a core group of business writers to the site creating blog posts and articles, and developed business tools for business people, and become a place where people could learn about how to start or run a business, find the latest business news, track financial information, and so on, they would be on their way to building a brand. The name itself does help to add an element of credibility to the site, but keeping the site a simple niche directory really doesn’t do much for them.

    On the other hand, AltaVista built a brand, and a reputation, by creating a service that was, in its time, better than most of the rest of the search engines out there. There wasn’t the type-in traffic that business.com might get, based solely upon the name without any awareness of the brand.

    I do think that it is faster and easier to build a strong brand on the web than it might be in the bricks and mortar world alone.

    The value of a craigslist isn’t in the domain name, but rather the services that it offers, and the spirit in which they are offered. Change the name to ed’s list, offer exactly the same type of services and the same relationship with its audience, and within a few months, most people pointing links at the old site, will likely change their links to the new domain name.

  8. There is no doubt in my mind that Google is using this data to some extent. Whether they should or not is a moot point.

    I agree strongly that it would facilitate them determining ownership of sites and (it only follows) devaluing the links between those sites…

    I have had several competitors slip in the rankings over the past few months and now I am openly wondering if this could have some causative effect.

  9. Hi Eric,

    There are a lot of potential reasons why a site might fall in the rankings. The line from Google was often that the search engine looked at 100 or so factors in determining the rankings of pages.

    The patent application which describes the use of whois information probably adds at least 100 more, based upon a wide range of changes to a site over time, and some other considerations.

    My guess is that even 200 factors might be an underestimate.

    Using whois information could provide them with an insight regarding ownership of sites that they might not otherwise learn, though.

  10. Satanism and witchcraft? How are they illegal? Who are they harming?

    Somebody is clearly having too much fun flexing their muscles.

  11. Hi James,

    I think the danger of basing decisions on discussions with “like minded folks” is that there are so many other variables that can play a role in how web sites rank. I’d like to see more than just anecdotal information, even if it’s shared by a number of people.

    It’s possible that length of registration may be a signal that search engines view, but if the assumption behind that is that a site that registers for a single year at a time is more likely to be spam than one that registers for a longer period, it’s a poor ranking signal. There are so many people who register for only a year who have legitimate and relevant sites, that basing rankings on such a signal alone may not be a good idea.
