Does Google Use Latent Semantic Indexing (LSI)?


Railroad Turntable Sign
Technology evolves and changes over time.

There was a park in the town in Virginia where I used to live. It was a railroad track that had become a walking path. At one place on that path was a historic turntable, where cargo trains might get unloaded and where cars could join later trains or trains headed in the opposite direction. This is a technology that is no longer used, but it is an example of how technology changes and evolves.

Latent Semantic Indexing is Old Technology

Some people claim that Google uses Latent Semantic Indexing. By saying that, they believe they are saying that Google uses synonyms and semantically related words. They are not correct. LSI is just one type of language model based on semantics. It even has the word “semantic” in it, but that does not mean that LSI stands for all semantics. LSI is a particular semantic technology that Bell Labs patented. The patent follows below.

Google is likely looking for synonyms and semantically related words on pages. That doesn’t mean that using a toolmaker’s tools that have the initialism LSI in their name will help pages rank higher in search results. Latent Semantic Indexing is an old patented technology, and nothing suggests that Google handles synonyms and semantically related words the way that LSI does. Google does like synonyms and semantics, but they don’t call it Latent Semantic Indexing. For an SEO to use those terms can be misleading and confusing to clients who look up Latent Semantic Indexing and see something very different. There is no Wikipedia information on LSI Keywords. There is no information about how LSI Keywords might use LSI. There are no patents that explain how LSI Keywords work because they have never been patented.

I thought it might be helpful to explore Latent Semantic Indexing and its sources in more detail. It is a technology invented before the Web was around. It works to index the contents of document collections that don’t change much. Latent Semantic Indexing (LSI) might be like the railroad turntables that were once used on railroad lines.

There Is a Website for LSI Keywords, but No Patent for LSI Keywords

A website offers “LSI keywords” to site owners but doesn’t provide any information about how it generates those keywords or whether it uses Latent Semantic Indexing (LSI) technology at all. It also provides no proof that the keywords make a difference in how a search engine such as Google might index content that contains them. I came across a page from the makers of LSI Keywords that sounds more like it uses the technology behind Phrase Based Indexing instead of Latent Semantic Indexing. It links to the Wikipedia article on LSI, but there are no “LSI Keywords” on the Wikipedia page.

Where does Latent Semantic Indexing (LSI) come from?

One of Microsoft’s researchers and search engineers, Susan Dumais, was an inventor behind a technology referred to as Latent Semantic Indexing. She worked on developing LSI at Bell Labs. There are links on her home page that provide access to many of the technologies that she worked on while performing research at Microsoft. Her papers are very informative and provide many insights into how search engines perform different tasks. Spending time with them is highly recommended.

Before joining Microsoft, she performed earlier research at Bell Labs. This includes writing about Indexing by Latent Semantic Analysis. She was also granted a patent as a co-inventor on the Latent Semantic Indexing process. Note that this patent was filed in September of 1988 and granted in June of 1989. The World Wide Web didn’t go live until August 1991. The Latent Semantic Indexing (LSI) patent is:

Computer information retrieval using latent semantic structure
Inventors: Scott C. Deerwester, Susan T. Dumais, George W. Furnas, Richard A. Harshman, Thomas K. Landauer, Karen E. Lochbaum, and Lynn A. Streeter
Assigned to: Bell Communications Research, Inc.
US Patent: 4,839,853
Granted: June 13, 1989
Filed: September 15, 1988

Abstract

A methodology for retrieving textual data objects is disclosed. The information is treated in the statistical domain by presuming an underlying, latent semantic structure in the usage of words in the data objects. Estimates of this latent structure are utilized to represent and retrieve objects. A user query is recouched in the new statistical domain and then processed in the computer system to extract the underlying meaning to respond to the query.

Problems that Latent Semantic Indexing (LSI) was to solve

Because human word use includes extensive synonymy and polysemy, straightforward term-matching schemes have serious shortcomings–relevant material gets missed because different people describe the same topic using different words and, because the same word can have different meanings, the irrelevant material will get retrieved. The basic problem may be stated that people want to access information based on meaning, but the words they select do not adequately express the intended meaning. Previous attempts to improve standard word searching and overcome the diversity in human word usage have involved: restricting the allowable vocabulary and training intermediaries to generate indexing and search keys; hand-crafting thesauri to provide synonyms, or constructing explicit models of the relevant domain knowledge. Not only are these methods expert-labor intensive, but they are often not very successful.

The summary section of the patent tells us that there is a potential solution to this problem. Keep in mind that Latent Semantic Indexing was developed before the world wide web grew to become the huge source of information that it is today:

These shortcomings, as well as other deficiencies and limitations of information retrieval, are obviated, following the present invention, by automatically constructing a semantic space for retrieval. This treats the unreliability of observed word-to-text object association data as a statistical problem. The basic postulate is that there is an underlying latent semantic structure in word usage data that is partially hidden or obscured by the variability of word choice. A statistical approach estimates this latent structure and uncovers the latent meaning. Words, text objects, and, later, user queries extract this underlying meaning, and the new, latent semantic structure domain is then used to represent and retrieve information.

How Latent Semantic Indexing (LSI) Works

To illustrate how Latent Semantic Indexing (LSI) works, the patent provides a simple example, using a set of 9 documents (much smaller than the web as it exists today). The example includes documents that are about human/computer interaction topics. It doesn’t discuss how a process such as this could handle something the size of the Web because nothing that size existed yet. The Web contains a lot of information and frequently changes, so an approach created to index a known document collection might not be ideal. The patent tells us that an analysis of terms needs to occur “each time there is a significant update in the storage files.”
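
The general idea can be sketched in a few lines of Python. This is only a minimal illustration, not the patent's own example and not anything Google is known to run. The five short documents, the function names (build_term_doc_matrix, lsi_fit, fold_in_query), the use of raw term counts, and the choice of k=2 dimensions are all assumptions made for the sketch. It builds a term-by-document matrix, reduces it with a truncated singular value decomposition (SVD), and maps a query into the same reduced space.

```python
import numpy as np

# Hypothetical toy collection (NOT the nine documents from the patent).
docs = [
    "human computer interface",
    "user interface system",
    "computer system response time",
    "graph of trees",
    "graph minors trees",
]

def build_term_doc_matrix(docs):
    """Count how often each term appears in each document (rows = terms)."""
    vocab = sorted({w for d in docs for w in d.split()})
    index = {w: i for i, w in enumerate(vocab)}
    X = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.split():
            X[index[w], j] += 1.0
    return vocab, index, X

def lsi_fit(X, k):
    """Truncated SVD: keep only the k largest singular values."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Row j of the third return value is document j in the latent space.
    return U[:, :k], s[:k], Vt[:k, :].T

def fold_in_query(query, index, U_k, s_k):
    """Project a query into the latent space the same way documents are."""
    q = np.zeros(U_k.shape[0])
    for w in query.split():
        if w in index:  # words outside the indexed vocabulary are ignored
            q[index[w]] += 1.0
    return (q @ U_k) / s_k

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

vocab, index, X = build_term_doc_matrix(docs)
U_k, s_k, D_k = lsi_fit(X, k=2)  # k=2 is an illustrative assumption
q_hat = fold_in_query("human user", index, U_k, s_k)
scores = [cosine(q_hat, D_k[j]) for j in range(len(docs))]
ranking = sorted(range(len(docs)), key=lambda j: scores[j], reverse=True)

for j in ranking:
    print(f"{scores[j]:+.2f}  {docs[j]}")
```

In this toy collection, the third document shares no words with the query “human user,” yet it should still score close to the first two because it shares vocabulary with them. That is the “latent” part. Note also that lsi_fit has to be rerun over the whole matrix whenever the collection changes significantly, which is the limitation the patent itself points out.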

Google is Using More Modern Language Models

There has been a lot of research and technology development aimed at working with a set of documents the size of the Web. We learned from Google that they are using a Word Vector approach developed by the Google Brain team, described in a patent granted in 2017. I wrote about that patent and linked to resources used in the post: Citations behind the Google Brain Word Vector Approach. If you want to get a sense of the technologies that Google may use to index content and understand words in that content, they have advanced a lot since the days just before the Web started. There are links to papers cited by the inventors of that patent within it. Some of those relate in some ways to Latent Semantic Indexing, since it could be called their ancestor. The LSI technology from 1988 contains some interesting approaches. If you want to learn a lot more about it, this paper is insightful: A Solution to Plato’s Problem: The Latent Semantic Analysis Theory of Acquisition, Induction, and Representation of Knowledge. There are mentions of Latent Semantic Indexing in patents from Google, where it is used as an example indexing method:

Text classification techniques can be used to classify text into one or more subject matter categories. Text classification/categorization is a research area in information science concerned with assigning text to one or more categories based on its contents. Typical text classification techniques are based on naive Bayes classifiers, tf-idf, latent semantic indexing, support vector machines, and artificial neural networks.

~ Classifying text into hierarchical categories

I was inspired to blog on a similar topic in the post: What are LSI Keywords and What I Use Instead of Them?


48 thoughts on “Does Google Use Latent Semantic Indexing (LSI)?”

  1. So, does this mean LSI can finally be put to rest?

    I agree totally with the “LSI sounds good, but no proof” argument, however it seems to me (as a sometime writer) that using a range of synonyms for keywords would improve the semantic core of a page (I think that is indisputable from a human perspective as opposed to algorithmic).

    Using a word vector approach may be the way Google is approaching word meaning in practice, but just as there is no proof for LSI, is there any proof of this Bill?

  2. Hello Bill,

    I think the LSI is what Google still counts to identify the relevancy.

    If you search Google with “Institutions”; it will show you results of Schools, Colleges and Universities as well. And this might be a live proof that Google follows LSI. 🙂

    BTW, if you can please make a final note at the end of your article about what you think after the research.

    Thanks,

  3. There is a video from Bloomberg about Google’s use of Word vectors in applying rankbrain, and there is a patent which I link to above from Google that tells us that Google is likely using Word Vectors, plus many statements from Google’s Jeff Dean about Word Vectors.

    There are ways to use Semantics on your pages, including Schema markup and Structured Data and Context Terms and Context Vectors (Both context terms and context vectors are described in patents from Google.) Google has a number of patents that describe how they use synonym substitution and synonyms for query terms that don’t use LSI technology.

    Google has written whitepapers about semantic topic modeling which don’t mention LSI, such as:

    Improving semantic topic clustering for search queries with word
    co-occurrence and bigraph co-clustering

    There are so many possibilities worth investigating that we have some idea that Google is actually paying attention to.

  4. I don’t know how someone would use LSI, so I can’t provide any examples. The people at the site that offers LSI keywords don’t explain how they generate those keywords. Google does like some semantic technologies, such as Schema, where they provide examples of how Schema markup is used on schema.org, including json-ld. The Google developers pages also show off how to mark up pages for events and for rich results in snippets. So, if you want to use semantic technologies on your pages that Google does like and appreciate seeing, you should look at those.

  5. Hi Shamin,

    Google does attempt to index synonyms and other meanings for words. But it isn’t using LSI technology to do that. Calling it LSI is misleading people. Google has been offering synonym substitutions and query refinements based upon synonyms since at least 2003, but that doesn’t mean that they are using LSI. It would be like saying that you are using a smart telegraph device to connect to the mobile web. A telegraph was used to send long distance messages, but it isn’t a phone. Technologies change and evolve, and Google has developed their own semantic technology that is not LSI even though both are based upon Semantics. To claim that Google is using the relevancy of LSI is to ignore that “LSI” means (semantically) a specific type of technology. It is not the same.

  6. I could use some advice. So when I train folks in my company, I’ve done my best to explain “semantic search and “LSI.” Should I now stop discussing LSI?
    Thanks – John

  7. Hi John,

    I think you could start teaching them things other than LSI when teaching about Semantic Search, such as structured data and rich results, schema markup, how Google might rewrite queries to include synonym substitutions, and how to use Context terms from knowledge bases in your content to show off the context of the meaning of your query term. Google definitely uses an understanding of Semantics, and has developed ways of understanding those.

  8. Bill, I think it would be better to describe the evolution of LSI and talk about its downsides — it simplifies content and words by using SVD, creating concepts to document matrices that it works upon. Google uses the much more advanced technique — word2vec, which is about working with words, not simplified concepts.

  9. I agree with Shamim, I think the LSI is what Google still counts to identify the relevancy.

    Using different expressions with identical intent in title tags, h1 tags, h2 tags, and content can only help Google better understand the purpose of that page.

  10. Bill Slawski I would like to say thanks for sharing this crucial info. According to my knowledge, Google uses latent semantic indexing.

  11. It seems like an exercise in futility to try to unravel how Google “understands” the relationship between terms, phrases, and concepts in an effort to “tune” writing methods to get better rankings. Each time content creators adjust their methods to “accommodate” Google, algorithmic changes make those practices unfruitful. It seems like the sensible thing to do is follow the advice of Google quoted in the post:

    “Focus on creating useful, information-rich content that uses keywords appropriately and in context.”

    It seems like that approach would be more effective. Yet, there appears to be no shortage of SEO “experts” writing about how to manipulate search results.

    … It makes me dizzy.

  12. I think we’re part way there. Google’s always said they want to be able to answer your question before you ask it.

    Will we get to the point where we’ll be able to ask: “Cheapest meal” And Google return “pizza X, $14 or burger and fries $12” knowing that this is exactly what I meant? I don’t know. But with a lot of AI coming to the front (especially from Google) would it not be possible for the search to learn what we mean.

    But yeah, I’m going more with simple (er) use of word vectors at this time. This is easily seen with some words being, almost, perfectly interchangeable.

  13. Hi Up SEO.

    Including Context terms from knowledge bases, as Google has shown in a couple of patents on Context-based indexes this year achieves that and does it in a way that Google says they used. For example, if you read through this Google Patent:

    User-context-based search engine
    https://patentscope.wipo.int/search/en/detail.jsf?docId=US177618724

    They describe how using vocabulary terms related to the context of the word being optimized for, or answering specific questions about that meaning within your page can make it clear what the meaning of that page is about. It’s not LSI, but it is Google saying specifically how they would look at context within content to understand the purpose of a page.

  14. Hi Boris,

    I provided information about the roots of LSI, where it was invented, and by whom. I am not going to write about how it uses SVD because I don’t expect many in my audience to be capable of using SVD to create document matrices. SEOs aren’t search engineers, and they aren’t building search engines. Having an idea of what Google might be using and may be looking for is likely appropriate. I thank you for pointing those issues out. I did link to the patent from Google that does describe processes such as word2vec to show that Google has developed a different approach that can scale better than LSI. The audience for this post is SEOs and not computer scientists.

  15. Hi Small Business Web Tips.

    Google has provided us with information about how they are using Semantic Approaches to understanding content in papers and patents about those approaches. There are a number of people doing SEO who insist that Google is using LSI, without providing any proof whatsoever, except to explain that LSI is about synonyms, and Google does understand synonyms. That is a really simplistic approach and explanation. LSI technology wasn’t created for anything the size of the Web, or anything that changes as quickly as the Web. Google has developed a word vector approach (used for Rankbrain) which is much more modern, scales much better, and works on the Web. Using LSI when you have Word2vec available would be like racing a Ferrari with a go-cart.

  16. Hi Robert,

    With Google using Neural Networks and AI, they will be able to answer a lot of questions, and the technology is growing every day. We see new things from the Google Brain team and from Deep Mind. LSI was patented in the 1980s, making it around 30 years old. I think we have come a long way since then.

  17. Hi Newssplususa,

    Thank you. There is so much going on in SEO that is new, it’s a little challenging to keep up with it all. But that is part of what makes it fun. 🙂

  18. Hi Amic,

    I didn’t write this post to teach people how to use LSI, but to show that it is likely that Google doesn’t use LSI and that it does use other approaches. I linked to the patent that came out in the late 1980s, and a couple of papers about LSI. I am suggesting that you don’t try to use LSI for SEO because it is possible that it isn’t going to help you much. There are other ways to use semantic approaches to improve your pages, and the content on them. For instance, I gave a presentation at Pubcon a couple of months ago, where I talked about some of those. The presentation can be found here:

    Semantic Keyword Research and Topic Models
    https://www.seobythesea.com/2017/11/semantic-keyword-research-topic-models/

  19. I have to say this is a pretty hilarious comment thread. Bill you have the patience of something that has a lot of patience 🙂

    I believe that the people in here stating Google does use it are simply trying to stack the comments with counterpoints, in order to further support their snake oil. How many people do you know with the last name “Johnsons?” LinkedIn reports zero Karen Johnsons and Google won’t even let you re-search with the s.

    My 2 cents. Nice job asking for proof Bill but I think you have answered a very kind amount of these “but I think they do’s” for now. 🙂

  20. Thanks, Chris,

    There are so many things that Google is actually doing with Semantics and Synonyms that people should be concerned about, and they are using LSI tools that possibly don’t even use LSI – just someone taking advantage of the name because it has the word “Semantic” in it. I would like to see people using Schema and structured data and context vocabulary on their pages because it makes sense to do.

  21. Hi Melissa,

    I think a lot of people understand that Google is trying to become more Semantic, and understand the meanings of words better, but it has developed ways of doing that other than through the Latent Semantic Indexing process. People who are pushing the idea of using LSI often misrepresent it as Google just using synonyms, which really doesn’t describe what LSI is. People who are interested in using Semantics in web pages they create should ideally learn about Schema and Structured Data, and Context terms from Knowledge bases that cover the meaning of words they are interested in. It is possible to include semantic meaning on your pages without using LSI at all.

  22. You are such a genius that my eyes began to bleed reading this.
    I take it that this article is a discussion of semantics, so to speak. I get that LSI is a specific term coined b4 the web. As such it is NOT what we (SEO’s) mean when we refer to it.
    I think that (SEO) people are using LSI to mean the “synonym substitutions” and the “Context terms” you mentioned in one of your replies.
    Now if we can have a better term than LSI, let’s have it.
    I appreciate your clarity. Keeping our feet to the fire.

  23. Hi Jennifer,

    There is something very ironic about using a highly technical term that stands for a highly mathematical process such as LSI to act as a substitute for synonyms. It’s like people are working to make themselves sound smarter than they really are.

    Looking at a knowledge base about the meaning of a keyword that you are trying to optimize a page for, and grabbing context vocabulary terms from that page isn’t a difficult thing to do, and it really could help Google index your page under that meaning better. Which makes it worth doing. So, if you are writing about the Jacksonville Jaguars, and you look them up in Wikipedia, and see that they play their home games at EverBank Field, and mention that field on your page about them. Google knows immediately that you are writing about the football team, and not the cat nor the car. That is Semantic Search, where identifying an attribute that may be contextually related to what you are writing about makes it more likely that Google indexes that your page is about the Jaguars NFL football team. It’s not a synonym, but it is a word that indicates context. It adds a preciseness to your page that improves the quality of your content.

    LSI means a technical and mathematical heavy way of indexing content using SVD technology. I would rather tell people to use context terms, or to use Schema markup, or Structured Data on their pages, because those are Semantic approaches that Google really does use, and we know that because they have patented those approaches and written about them in whitepapers and blog posts and Google help/support pages.

  24. Insightful article. I was using LSIGraph for finding the Long Tail Keywords.
    After reading your thought-provoking post, I think I have to change my attitude towards LSI keywords. Google is very smart, and it’s very difficult to rank higher in a saturated niche.
    Many bloggers were dependent on LSI study and I was one of them but after reading your whole article, I must say I was missing something. Thanks for sharing this unique piece.

  25. Hi Bill

    Thanx for a great post discussing issues that need to be addressed within the understanding of LATENT SEMANTIC INDEXING ( LSI ) and the new expanded search criteria that have evolved from the Latent Semantic Search

    In my mind, the issue is that these new search techniques are an expanded and elevated version of semantic indexing which is active ( not latent ) and inclusive, whereas the old latent semantic indexing was basically for static info, and not the dynamic info of today’s web pages.

    Changes over time have been connected to better costs and improved computing speeds. Also the physical amount of data available these days allows for many new techniques to be used within the online search environment.

    Just saying that new techniques are built around old ideas.
    #FRANKIE2SOCKS

  26. Hi Bill,

    You are right! Actually, Google never said that it is using LSI technology or something like that. But it has mentioned that use of related words makes it easier to understand content more. Plus, after the introduction of Rank Brain, this LSI thing will get more relevance. It will be really misleading to say that Google is using any LSI technology. On the other hand, I believe if one writes for his/her audience, the content itself will contain all the important elements needed to call it optimized content. Truth is that I have seen more than 60% of blogs writing just to rank and not to inform or share the knowledge.

  27. Hi Joyita,

    Google never said in any way that they were using LSI technology. They have admitted the use of related words in Phrase-Based indexing and in Rankbrain; but neither of those use LSI – they use more modern technology. I have seen people who sell SEO training suggest that people should use LSI keywords to help with Rankbrain, but I really question how helpful that would be. Google has pointed out other approaches involving context terms and related words and schemas and structured data that are better approaches. I would question anyone suggesting the use of LSI, a 30 year old technology, to handle something like Rankbrain. Have them explain how that works.

  28. Thank you, Frank.

    Search has been evolving and developing in the 30 years since LSI was invented. The link to the home page of one of the inventors above, Susan Dumais, is filled with some great papers about User design and search and personalization. It’s really great stuff. Google does like when people have related content on their pages, and write using themes on their pages about topics, so that a page that might be returned for a query term doesn’t just include that word or those words on the page, but also contains a lot of information about that term or those terms. This can be done by including context terms on your page that are related to your query term, and by choosing appropriate schema and structured data for your page. LSI did help set the way; but new approaches have come along. It’s worth learning about them. 🙂

  29. Informative blog, Bill. I have been preached semantic translation for my different websites by SEO gurus without being able to put a finger on it. At least I know what Susan was thinking even if that’s not what big G is doing. Also, about that post from Warrenton, I need to check out the trail as it’s not far from where I live.

  30. Really interesting article Bill.

    I actually hear it a lot from some “SEO experts” mentioning LSI being used by Google but as you mentioned they can never provide proof to back this claim up and they never really are able to provide a clear definition of what LSI does.

    Having read this article I can honestly say I’m half way there to having a better understanding of LSI, and I wasn’t aware of its origins – really good History lesson!

    Just a quick question for you though Bill, how do you think Search will evolve in the foreseeable future (or do you have an article about this already?). It would be really interesting to know your thoughts. Thanks

  31. Hi Sid,

    There are ways to add Semantically relevant terms to pages that don’t involve using an LSI Tool that doesn’t explain how it returns the results it returns. It is possible to add terms to a page that are taken from a knowledge base page that help define the context of your page, and the terms you are optimizing it for better. It is also possible to look at other pages that rank highly for the same query terms you are targeting, and look for meaningful phrases on those pages that frequently co-occur on the highest ranking pages for the same meaning of that term that you are targeting. Those methods aren’t LSI, and they don’t have to be. They are semantic approaches that Google has stated in patents and papers that they are paying attention to.

    I really liked that trail in Warrenton. I liked walking around town a lot, too. Many of the homes in the area were older and historic, and were different things in the past. The area where Warrenton is at was a crossroads, which had an inn, and a general store (which is now a house on Main Street in Warrenton), and a blacksmith (I believe it was over by the Courthouse.)

    There are some documents online that say more about the historic district of Warrenton, such as this one:

    https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=3&ved=0ahUKEwjJo-ecoIjZAhUJx2MKHY_ZDdYQFghcMAI&url=https%3A%2F%2Fwww.dhr.virginia.gov%2Fregisters%2FCounties%2FFauquier%2F156-0019_Warrenton_Historic_District_1983_Final_Nomination.pdf&usg=AOvVaw1JqifSTwg8fC7vpNAsX3qI

  32. I have a question in mind
    How much time google takes to index full site .
    And what if I update my blog daily??

  33. Hi Sanjay,

    Your question has nothing to do with the topic of this post. Google does not use Latent Semantic Indexing to index Web sites. Google does use their Caffeine indexer to index sites. There are 3 related patents that describe how Caffeine works to incrementally capture new content (such as a new blog post every day) and capture that new information and add it to an index where new content can be queried. One of those patents is this one:

    Document treadmilling system and method for updating documents in a document repository and recovering storage space from invalidated documents
    http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PALL&p=1&u=%2Fnetahtml%2FPTO%2Fsrchnum.htm&r=1&f=G&l=50&s1=7,617,226.PN.&OS=PN/7,617,226&RS=PN/7,617,226

    The amount of time it takes Google to index different sites may vary based upon how large they might be, and how much they might change on a regular basis, and whether Google is following a politeness protocol when it crawls them, to not use too much bandwidth, so that other people can access the site at the same time.

  34. Hi Boss Digital,

    Reading through the LSI patent was interesting for me. I mentioned that I had read a number of articles from one of the inventors, Susan Dumais, and the work that she did on Microsoft’s search engine show that she knows a lot about search. The LSI patent does use an example of a number of books, and having seen that did put it into perspective for me that it just wasn’t built for something the size of the Web that changes as quickly as the Web does (with user generated content, and new pages and sites springing up everyday.)

    I’ve started outlining a post about all of the Semantic-related changes and updates that seem to be coming to Google, and hopefully will be coming out with that one soon, unless something else shows up in the newly granted patents from Google (There are sometimes some big surprises among those.)

  35. Hi, Mr.Bill
    As an SEO fresher, I got to know a lot of things about Google’s semantic indexing. Thanks for sharing your knowledge. Please keep posting.

  36. I joined the Seo company a few months ago. And I did not know of Latent Semantic Indexing. But after reading your post I find it very simple. Thanks for sharing this with us.

  37. Hi Bill,

    I am super excited that at least someone put up forward the actual resource and with proper pieces of evidence that LSI is a traditional term which has now evolved into just “Semantic Search.”

    It is even far more than just a thesaurus which presents some synonyms as a part of their search keywords.

    It is a combined form of word2vectors, NLP and a far better version of that anecdotal research paper published by Microsoft. I think Semantic Search is not just “Keyword matched search Results” or “Just a SERP with synonyms and matching data.”
    For e.g. https://academic.microsoft.com/#/faq

    Marketing people are smart at preaching smart, and thus they have turned an age-old term into their part of marketing. People are also blindly following them in the hope of getting ranked automagically. At least, they should cite some official research papers while terming and promoting a mechanical phenomenon. I believe so.

  38. LSI is a technology created by Bell Labs, and marketing by purposefully mislabeling something is really misleading people. I don’t like the practice of doing that. It is dishonest to say LSI is semantic Search, because there is semantic search that has nothing to do with LSI.

  39. Hello Bill,

    After reading through everything, I think what people mean when they say LSI for SEO is just the semantic approaches to keywords used by Google, without really drawing the reference to the actual technology that LSI is. So the bottom line here is that using synonyms for onsite optimization is what people mean by LSI keywords. It is like how Russian people call duct tape Scotch, because it was the only brand they ever had in the 90s and they thought that this was the name for duct tape. Most don’t even know that it is just a brand name of duct tape, not the actual duct tape. However, it does not imply that what they are referring to when they say ‘scotch’ is wrong; the brand name was just sort of ‘re-adopted’ to mean ‘duct tape’ in the Russian language. So are we really right to say Google does not use LSI keywords, when what people are referring to when they say ‘LSI keywords’ is just the semantic search? We can of course criticize the fact that they are kind of ‘mislabeling’ the LSI technology, however the meaning the SEO guys put into it remains intact for SEO.

  40. Hi Jane,

    If people suggested that people use synonyms and semantically related words when they write something, I would have no problem with that. But, there are people who are charging over $3,000 for SEO lessons, some of whom claim to have Ph.D.s. When people are highly educated and brag about that education, and charge such high rates to people who are interested in learning about SEO, and those people tell people to use LSI keywords, I believe that they know they are purposefully using slang that misrepresents what they are actually recommending. I think it’s important to point out that such misleading is taking place, and that LSI for SEO is not an actual technical process, but a guess on the part of the people making that recommendation that the search engines might like semantically related content. Google does not use LSI Keywords, and while they do use Semantic Search, there are actual names for Semantic Search and techniques that people could use that might help their SEO; but randomly choosing synonyms and words they think might be semantically related is not necessarily helpful. The meaning they put into it is limited and questionable advice at best. And it is sometimes dished out by some highly educated people who know that they are misleading people. Another word for that is fraud.

    There is Schema Vocabulary and Structured Data, neither of which have anything to do with LSI keywords. If anyone is going to recommend semantic approaches to SEO, they should be talking about those concepts.

    There is the use of co-occurring terms used in phrase-based indexing, and which Google has written up in at least one white paper last year. I’ve written some posts about that, and how it is a semantic topic modeling approach. It doesn’t use randomly selected synonyms or related semantic phrases.

    For professionals who sometimes charge thousands of dollars to teach others about SEO, and who in some cases have post-doctorate degrees, using slang is a questionable approach. I have no problems telling people that Google does not use LSI keywords.

Comments are closed.