Yuri Filimonov, of Improve the Web, and I discussed how Google handles misspelled queries, how corrections are triggered, and how those corrections might influence search results.
One of Google’s patent applications describes a process that may answer some of those questions, and Yuri and I thought it might be good to make the discussion public. I believe that Yuri may be describing some of the business issues around misspelled queries later this week.
I wrote about a Google patent application that may be involved in an earlier post, Google’s Query Rank, and Query Revisions on Search Result Pages.
My earlier post looked at the bigger picture of how the search engine might consider offering query refinements – usually based upon looking at users’ search sessions and watching the different queries that those searchers use when looking for specific information. For example, people may remove some of the words in a query when they only get a few results. Conversely, they may add words if the results seem too broad. That kind of user behavior is recorded in Google’s log files.
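To make the idea of watching query changes within a session a little more concrete, here is a minimal sketch in Python. The function name, the labels, and the sample session are my own assumptions for illustration; nothing here comes from the patent application or from Google's actual log processing.

```python
def classify_revision(previous: str, revised: str) -> str:
    """Label how a searcher changed a query between two consecutive searches."""
    before = set(previous.lower().split())
    after = set(revised.lower().split())
    if before == after:
        return "unchanged"
    if after < before:
        return "broadened"  # words removed, often after too few results
    if after > before:
        return "narrowed"   # words added, often after results that were too broad
    return "reworded"       # words swapped, for example a spelling change


# A made-up search session, in the order the queries were typed.
session = [
    "cheap red runing shoes",
    "red runing shoes",
    "red running shoes",
    "red running shoes reviews",
]
labels = [classify_revision(a, b) for a, b in zip(session, session[1:])]
print(labels)
```

A real system would need much more than word sets (word order, stemming, time gaps between searches), but the sketch shows the kind of signal that a log of consecutive queries can provide.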
Google may also be looking at whether people click on any of the links returned in the sets of results from those searches and how long people might stay on those pages.
One frequently recorded change is a spelling correction: a searcher sees results that don’t match what they were looking for, realizes that they misspelled a term, and retypes the query.
The patent application tells us that the display of a query refinement may be triggered by the absence of relevant search page results.
Here are the steps that it suggests in returning spelling corrections as query refinements for misspelled queries:
1. A search resulting in no or just a few relevant results may trigger a look at possible query refinements
2. One kind of potential query refinement offers a spell correction based upon a score for a query (the misspelled query in this case) and its relationship to a highly-ranked query (the correctly spelled query, possibly).
3. If the search engine thinks that the spelling correction offered is something many people choose, it may start showing some results for the correct spelling at the top of the results page and offering a link to a search for the refined query.
That last step is interesting. In addition to returning results for the misspelled query and a query refinement suggestion, Google will sometimes also display some highly placed results for what it believes are the correct spellings of a word. Keep in mind that it doesn’t do this for most misspellings, but it does for some.
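The three steps above can be sketched as a small decision function. This is only an illustration under stated assumptions: the thresholds `FEW_RESULTS` and `MERGE_SHARE`, the `Candidate` fields, and the function name are all hypothetical, since the patent application does not publish concrete numbers or code.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical thresholds, chosen only for this example.
FEW_RESULTS = 10    # at or below this, the results count as "just a few"
MERGE_SHARE = 0.5   # share of searchers who must accept the correction


@dataclass
class Candidate:
    query: str               # a possible refinement, e.g. the correct spelling
    score: float             # assumed score of that candidate query
    acceptance_share: float  # assumed share of searchers who choose it


def plan_results_page(relevant_results: int, candidates: List[Candidate]) -> Dict:
    plan = {"refinement": None, "merge_results": False}
    # Step 1: only look at refinements when there are few relevant results.
    if relevant_results > FEW_RESULTS or not candidates:
        return plan
    # Step 2: offer the highest-scoring candidate as the refinement.
    best = max(candidates, key=lambda c: c.score)
    plan["refinement"] = best.query
    # Step 3: if many people choose that correction, also merge its results.
    plan["merge_results"] = best.acceptance_share >= MERGE_SHARE
    return plan


candidates = [Candidate("independent", 0.9, 0.8), Candidate("independence", 0.4, 0.1)]
print(plan_results_page(3, candidates))
```

Keeping the suggestion link and the merged results as separate outputs matches the observation that Google shows the link for many misspellings but mixes in results for only some of them.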
Testing when Google Might Merge Results from Correct and Misspelled Queries
To see that in action, I took some words from the 100 most commonly misspelled words list at yourdictionary.com.
Correct: accidentally
Incorrect: accidently
In a search for the incorrect spelling, the top ten results in Google for me showed three results that only had the correct spelling and three more that had both spellings.
Correct: independent
Incorrect: independant
None of the top ten search results for the incorrect spelling returned included the misspelling on their pages.
Correct: lightning
Incorrect: lightening
The first four results for the lightening search only had the “lightning” spelling. The rest of the misspelled query results included two pages with both spellings, two more with only the incorrect spelling, and two others on the topic of “skin lightening.”
Conclusion
I tried searches on 20 of the misspelled queries from that list, and only these three returned results in which pages for the correct spelling seemed to be mixed in with pages for the incorrect spelling. But it is interesting to see how Google might change results like this when it has a lot of confidence that a query contains a misspelling.
It’s not unusual to create a new word when deciding upon a name for a business. However, a search engine might see the new word as a misspelling of a word that it thinks it recognizes. What kinds of implications might that have for a business with a name like that?
Hi Yuri,
That would be an intuitive approach, and seems to make sense. In part, what you are observing is true – the most relevant pages would be returned in response to the misspelling, if there were pages that Google deemed relevant (looking at query-dependent factors such as use in titles, in content, in hypertext, etc., and query-independent factors such as pagerank). If there was a decision to present results from incorrect spellings or correct spellings or both, then the most relevant of those would be shown.
But the decision of whether or not to show only results with the misspelling, or those misspelling results with a query refinement link, or a query refinement link and a mix of results from both is determined, in this patent application, based upon a mix of the query score and the query’s relationship to a known highly ranked query.
There are a couple of different ways to calculate a query score, which are defined in the patent application. Here’s one of them:
In easier terms – the less a query is revised by a searcher, the higher the score for the query.
I’m not going to go into other variations, because the way this is set up, it’s easy for them to change some of the features that they might use to rank different queries. That’s one method above, but there are other ways, too.
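As a rough illustration of the "less revised means higher score" idea, here is a minimal Python sketch. The formula is my own assumption, picked only because it has that property; it is not the formula from the patent application, and the counts in the example are made up.

```python
def query_score(times_issued: int, times_revised: int) -> float:
    """Assumed score: the share of times a query was NOT revised by the searcher."""
    if times_issued <= 0:
        return 0.0
    return 1.0 - times_revised / times_issued


# Made-up counts: a misspelling that searchers usually fix, and the
# correct spelling that searchers rarely change.
misspelled = query_score(times_issued=1000, times_revised=900)
correct = query_score(times_issued=1000, times_revised=50)
print(correct > misspelled)
```

Under this assumption, a query that searchers rarely change ends up as a "highly ranked" query, and a query that is usually rewritten into it looks like a candidate for a spelling refinement.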
Agustin, that’s a great brand name.
It’s short, funny, memorable, and when you see it, you get the idea immediately that the company behind it sells funny t-shirts.
I’m not seeing any results in the top ten from Google for tshirts or t-shirts, when I search for it, so that’s a good thing. As you grow your brand, as you get more mentions of it online, and as more people search for it, and choose it from results, you’ll likely see the spelling suggestion go away.
Bill, come to think of it, I have always thought that whether G returns pages with misspellings in the first 5-10 results depends on how many links the page has. If it has links with typos in the link text and typos on the page, but no typos in the title and meta description, you may not see any typos, and yet that may not be a result of G’s smart algorithm.
It seems like the third point of the algorithm you described (showing correctly spelled pages, based on how often they are chosen) can also match the quality of the pages with lots of links.
It’s interesting that you wrote about this subject because it applies to me so perfectly. My company name is tchirts, which is a funny way that some of us hispanics pronounce the word t-shirts. I used this company name for a humorous t-shirt website called tchirts.com.
Every time anything containing the word tchirt is queried, Google suggests the word tshirt as a query instead.
Nice post, Yuri.
I’ve seen a lot of people struggle with making a decision upon what to name their business. Your advice is really helpful.
Yes, that’s some nice domain name. I guess it can spread well, if you give it a good boost.
There ya go. A post about being a business with a typo through the [trackback] link above.
Just found this post under the misspell seo search. Great posts! Recently I’ve been looking into how to use misspelled words in my seo practice. If you have any related information to share, that would be nice.
This is a great discussion. I am running into the same issue as Agustin. My website is called Revoluminary. The name is a mix between “Revolutionary” and “Luminary”. Despite the clever name and some (growing) presence on the internet, Google still suggests “Did you mean Revolutionary?” when you search for the company name. Is there a way to suggest a correct spelling to Google similar to the way one can submit URLs to search engines?
When a customer googles ISOpay our website used to appear right on top of google but recently google is seeing ISOpay as incorrect spelling and suggesting ISORAY instead and pushing ISOpay down to third spot.
Is there any way to inform Google that ISOpay is spelt correctly so that they can add to their dictionary?
Hi Richard,
You could possibly ask at the Google Webmaster Help group, but I’m not sure how much help they might be.
One of the reasons that the patent application cites for Google to offer spelling refinements like it does for your term is that there may just not be very many results that appear for the term. Increasing the number of pages on the Web that ISOpay appears upon may be helpful, though I couldn’t tell you how many would be needed to stop that kind of query refinement from happening.
We are about to launch our site reelives.com, and someone raised the issue of it being a misspelling of reel lives (note there are already reellives.org, reellives.com.au, and reel-lives.com sites) and that the correct spelling would always rank higher. My understanding is that if the brand has good awareness and the user types in reelives, then they will get us. If we have good seo and ppc etc., then even if they type in something similar like reel lives, we would still potentially have a high page rank. And bearing in mind what you have said above, if people then select our site more often because it was actually the one they were looking for, then the ranking would improve over time based on relevance. Therefore we shouldn’t be looking to find a different name to use – is this a reasonable deduction?
Hi Stewart,
I think that is a reasonable deduction. An example I can think of off the top of my head is Flickr, which is a pretty strong brand these days, and definitely a misspelling of the word flicker. It sounds like your point in choosing the name was to find a unique word that people will remember, and not to try to gain traffic through people’s misspellings of other sites, which is sometimes referred to as typosquatting.
When I type the word “reelives” into a Google search box, I’m not seeing a spell correction suggestion, which indicates to me that the search engine isn’t seeing a relationship between reelives and reel lives. With many terms, Google will ask “Did you mean XXXX” after receiving a request for a search term, and may actually show results for the spelling they thought you meant. Right now, that doesn’t look like an issue that you will face.
Good luck on the launch of your new site.
Bill – I don’t know if you’d spotted the debate in the UK about google changing searches for search engine optimisation (UK spelling) to be searches for optimization (US spelling). There was no “did you mean” or anything – it just assumed you really wanted the z spelling. It’s changed it back now after an outcry from UK-based SEOs.
Anyway, I found some other words that it seemed to be doing this for – for instance, a search on stationary returns results for stationery (to the extent of showing a map with local stationers). When I wrote the post yesterday, I’d assumed that this was all part of the same recent change.
You can read the examples I’ve used here: http://www.malcolmcoles.co.uk/blog/googles-spelling-problems-are-worse-than-we-thought/
One difference between your examples, above, and mine is that your ones are incorrect spellings (there is no word “accidentaly”).
Mine are all correct, alternative spellings (Dear vs deer, whether vs weather etc). So do you think it is the case that it’s corrected plain wrong spellings for some time – but recently it’s begun to correct for wrong version of spellings? So using user behaviour to return results for the word you probably meant, even though what you’ve typed in is a real word…
Hi Malcolm,
I’ve read some of the posts and articles about spelling correction suggestions and results changes, and I’ve been thinking about some of the possible reasons.
Your post points out the problem well. I’ve just started digging around a little, and noticed some similar problems in the US. For example, if I search for “colorado capital” and “colorado capitol,” I’m given a Q&A answer for both stating that the Capital/Capitol of Colorado is Denver. A capital is a geographic location such as a city, while a capitol is a building or set of buildings that might be located in a capital.
When I search for [whether] here, my third result is a page for “National and Local Weather Forecast, Hurricane, Radar and Report,” so this is an issue on both sides of the Atlantic.
It’s worth exploring more, and I will be. Google’s spell correction may be looking at more now than just how people correct spellings in query sessions, and which search results they click upon and spend time with.
Regarding the terms search engine optimisation/search engine optimization, it’s also possible that when people are searching for “search engine optimisation” in the UK, they may also perform a significant number of searches for “search engine optimization” within the same query session (and in the other order). Many may be looking for informational resources rather than for service providers, and performing both searches may help them find a wider range of information.
It looks like there are all sorts of spelling-related changes going on – I wonder if this is all bound up in the change to how Google handles synonyms.
I looked into some UK search volumes for American spellings and it’s interesting that UK searchers appear to be increasingly adopting US spellings for some words (doughnut, yoghurt) etc. What we thought of as Google’s US-spelling imperialism may turn out to be just bad spellers doing searches in the UK.
Fortunately for the purists, although 7 of 10 Google results for a UK search on colour return pages with it spelt color, UK searchers can spell that word correctly most of the time! (more on all these things on my blog if you’re interested).
Hi Malcolm,
I suspect there is some kind of relationship involving the changes we’ve been seeing involving synonyms and these spelling corrections/changes.
Both seem to involve Google looking at user-based data from their query logs, and possibly the statistical language models that the search engine has been building, and both offer the possibility of either showing query suggestions (the “did you mean xxxx” kind) or merging the results for an expanded query term or phrase into the search results.
It’s interesting how language evolves and changes over time. I was reading some pages about linguistics that described the roots of English and the adoption of words from other languages into mainstream acceptance and usage. It’s fascinating how language itself changes over time.
Nice idea to use Google Insights for search to compare the spellings in your post:
http://www.malcolmcoles.co.uk/blog/uk-us-spelling/
Up until very recently our client’s site, http://www.uniblokcanada.com/ used to be the first result when a search was made for “Uniblok”
Now however, Google provides this in its result page:
“Showing results for uniblock. Search instead for uniblok.”
Particularly frustrating about the results is that the first result that comes up in this search has nothing to do with the replaced “Uniblock” at all.
As it stands now there are over 15,000 pages which get returned for Uniblok. They have unique and highly recognized products in their market and they are a very established company. They have had this site for years.
This is an unusual SEO problem in that there seems little clear guidance in strategical approaches towards correcting it beyond recommendations to “add pages linking back.” I have read nothing else concrete beyond that except to adopt a ‘keep your fingers crossed and wait and see’ attitude.
Our client would like their site to be number one again for searches involving their name. Is there any helpful or constructive advice anyone can offer to help amend this as soon as possible?
Thank you.
BTW: Strategies attempted so far: organic SEO makeover of client’s site and writing of offsite articles on reputable news sites with appropriate deep level link backs. The campaign has seen dramatic increases in all targeted keyphrases except that most important one: Uniblok.
http://www.uniblokcanada.com is that site again.
Hi cdfisherman,
Google has been doing something like this for years, but it’s possible that they might have gotten a little more aggressive with it in the last year or so.
There are a few different ways that Google might handle what it thinks is a misspelling, based upon a confidence level that they may have that something is misspelled.
One is the result that you face, where they state at the top of the search results something like “Showing results for Uniblock. Search instead for Uniblok”
Another would be to show the top two results for the word that they thought a searcher might have meant, and then show the results for the word actually used in the query.
A third would be to show results for the word used in the query, but adding a message (in red lettering) at the top of the result with a link asking if they had meant another word, such as “Did you mean: Uniblock”
I don’t believe that adding more links to the uniblokcanada site would make much of a difference.
There are a few things Google is likely looking at when making the decision to “correct” a query that they might believe is misspelled.
One of them is how frequently the query term appears on the Web, especially when compared to the word that it might think is the correct spelling. Google does show an estimate of how many times each term might be included within its index:
Uniblok – 15,100
Uniblock – 110,000
Other things that it might look at can include how often people click on a link in search results after performing a query.
For example, if people search for uniblok, and click on a uniblock result, it can reinforce Google’s assumption that uniblok is a misspelling.
If people searching for uniblock enter “uniblok” instead as a query, and then revise their query to “uniblock” instead, that can also reinforce Google’s assumption that “uniblok” is a misspelling.
My personal experience with this problem was when I purposefully chose a misspelling of a particular word as a username for a forum I was actively participating in, so that I could easily track mentions of the name through search engines. In the early days, I would trigger spelling corrections at Google when looking up the name. At some point, there were enough mentions of the name on the Web that Google no longer saw the word as a misspelling. That’s one possible approach that you could take – increasing the number of mentions of “uniblok” on the Web so that Google recognizes it as a unique entity, rather than a possible misspelling.
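Here is a small Python sketch of how the signals described in this reply might add up, and why more mentions of a name could weaken the case that it is a misspelling. The simple averaging, the function name, and the two behavior shares are my own assumptions for illustration; only the two result estimates (15,100 and 110,000) come from the discussion above, and Google's real weighting is not public.

```python
def misspelling_evidence(query_mentions: int,
                         candidate_mentions: int,
                         clicked_candidate_share: float,
                         revised_to_candidate_share: float) -> float:
    """Average three signals, each between 0 and 1, into one evidence value."""
    total = query_mentions + candidate_mentions
    # How much more common the suspected "correct" spelling is on the Web.
    frequency_signal = candidate_mentions / total if total else 0.0
    return (frequency_signal
            + clicked_candidate_share
            + revised_to_candidate_share) / 3


# Result estimates quoted above; the two behavior shares are invented.
today = misspelling_evidence(15_100, 110_000, 0.30, 0.20)
# Same invented behavior, but with ten times as many mentions of the name.
later = misspelling_evidence(151_000, 110_000, 0.30, 0.20)
print(later < today)
```

In this toy model, the evidence value drops as mentions of the name grow, which is consistent with the forum username experience described above, though it says nothing about how many mentions would actually be enough.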