Does Google Use Latent Semantic Indexing?

Railroad Turntable Sign
Technology evolves and changes over time.

The town in Virginia where I used to live had a park built along a railroad track that had been turned into a walking path. Near that track was a historic turntable, where cargo trains might be unloaded so that their cargo could be added to later trains or to trains headed in the opposite direction. That technology is no longer used, but it is an example of how technology changes and evolves over time.

Latent Semantic Indexing is Old Technology

Some people doing SEO claim that Google is using Latent Semantic Indexing, because they believe that saying so means that Google is using synonyms and semantically related words. But Latent Semantic Indexing is an old, patented technology, and the term doesn’t simply mean using synonyms and semantically related words. Google does like synonyms and semantics, but it doesn’t call that Latent Semantic Indexing. For an SEO to use the term can be misleading, and confusing to clients who look up Latent Semantic Indexing and see something very different.


Google Targeted Advertising, Part 1

Google targeted Ads

One of the inventors of the newly granted patent I am writing about, Ross Koningstein, was behind one of the most visited Google patents I’ve written about, which I posted about under the title The Google Rank-Modifying Spammers Patent. It described a social engineering approach to stop site owners from using spammy tactics to raise the ranking of pages.

This new patent is about targeted advertising at Google in paid search, which I haven’t written too much about here. I did write one post about paid search, which I called Google’s Second Most Important Algorithm? Before Google’s Panda, there was Phil. I started that post with a quote from Steven Levy, the author of the book In the Plex, which goes like this:

They named the project Phil because it sounded friendly. (For those who required an acronym, they had one handy: Probabilistic Hierarchical Inferential Learner.) That was bad news for a Google Engineer named Phil who kept getting emails about the system. He begged Harik to change the name, but Phil it was.


Google May Diminish Reviews of Places You Stop Visiting

Google Timeline Reviews

How Google May Diminish Reviews Based on Location History

I don’t consider myself paranoid, but after reading a lot of Google patents, I’ve been thinking of my phone as my Android tracking device. It looks like Google thinks of phones similarly, paying a lot of attention to things such as a person’s location history. After reading a recent patent, I’m fine with Google continuing to look at my location history, and at reviews that I might write, even though there may not be any financial benefit to me. When I write a review of a business at Google, it’s normally because I’ve either really liked that place or disliked it, and wanted to share my thoughts about it with others.

A Google patent application, filed and published by the search engine but not yet granted, is about reviews of businesses.

It tells us how Google might diminish reviews of businesses because of my location history.


Semantic Keyword Research and Topic Models

Seeing Meaning

I went to the Pubcon 2017 Conference this week in Las Vegas, Nevada, and gave a presentation about Semantic Search topics based upon white papers and patents from Google. My focus was on things such as Context Vectors and Phrase-Based Indexing.

I promised in social media that I would post the presentation on my blog so that I could answer questions if anyone had any.

I’ve been doing Semantic keyword research like this for years: I look at other pages that rank well for keyword terms that I want to use, identify phrases and terms that tend to appear on those pages, and include them on pages that I am trying to optimize. It made a lot of sense to start doing that after reading about phrase-based indexing in 2005 and later.
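As a rough illustration of that process, here is a minimal Python sketch. It assumes you have already collected the text of a few pages that rank well for a term. The function names (`phrases`, `shared_phrases`, `missing_phrases`), the choice of two-word phrases, and the "appears on at least two pages" threshold are all hypothetical choices made for this example. This is not how Google's phrase-based indexing works; it only mirrors the manual research described above.

```python
import re
from collections import Counter


def phrases(text, n=2):
    """Return the set of n-word phrases found in a piece of text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_phrases(pages, n=2, min_pages=2):
    """Phrases that appear on at least `min_pages` of the given pages."""
    counts = Counter()
    for page in pages:
        # phrases() returns a set, so each page counts a phrase only once.
        counts.update(phrases(page, n))
    common = [p for p, c in counts.items() if c >= min_pages]
    # Most widely shared phrases first, ties broken alphabetically.
    return sorted(common, key=lambda p: (-counts[p], p))


def missing_phrases(my_page, pages, n=2, min_pages=2):
    """Shared phrases from well-ranking pages that my page does not use."""
    mine = phrases(my_page, n)
    return [p for p in shared_phrases(pages, n, min_pages) if p not in mine]


# Made-up example text standing in for pages that rank well.
top_pages = [
    "Fresh roasted coffee beans shipped daily",
    "Buy roasted coffee beans online",
    "Our coffee beans are roasted weekly",
]
print(shared_phrases(top_pages))
print(missing_phrases("We sell coffee beans", top_pages))
```

The output is only a list of candidates. Deciding which of those phrases actually belong on a page is still a judgment call.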

Some of the terms I see when I search for Semantic Keyword Research include such things as “improve your rankings,” “conducting keyword research,” and “smarter content.” I’m also seeing phrases that I’m not a fan of, such as “LSI Keywords,” which has as much scientific credibility as Keyword Density, which is next to none. Researchers from Bell Labs wrote a white paper about Latent Semantic Indexing in 1990. It was something used with small (less than 10,000 documents) and static collections of documents. The web is constantly changing and hasn’t been that small for a long time.


Topical Search Results at Google?

The Oldest Pepper Tree in California

At one point in time, search engines such as Google learned about topics on the Web from sources such as Yahoo! and the Open Directory Project, which provided categories of sites within directories that people could skim through to find something that they might be interested in.

Those listings of categories included hierarchical topics and subtopics, but they were managed by human beings, and both directories have since closed down.

In addition to learning about categories and topics from such places, search engines used to use such sources to do focused crawls of the web, to make sure that they were indexing as wide a range of topics as they could.


Using Ngram Phrase Models to Generate Site Quality Scores

Scrabble-phrases
Source: https://commons.wikimedia.org/wiki/File:Scrabble_game_in_progress.jpg
Photographer: McGeddon
Creative Commons License: Attribution 2.0 Generic

Navneet Panda, whom the Google Panda update is named after, is a co-inventor on a new patent that focuses on site quality scores. It’s worth studying to understand how it determines the quality of sites.

Back in 2013, I wrote the post Google Scoring Gibberish Content to Demote Pages in Rankings, about Google using ngrams from sites and building language models from them to determine if those sites were filled with gibberish, or spammy content. I was reminded of that post when I read this patent.
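To make the general idea of an ngram language model concrete, here is a minimal Python sketch. It is a generic textbook technique (a word-bigram model with add-one smoothing), not the method from either patent. The class name `BigramModel`, the tiny training corpus, and the use of average log probability as a "gibberish" signal are assumptions made for this illustration.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercase words, wrapped in start and end markers."""
    return ["<s>"] + re.findall(r"[a-z']+", text.lower()) + ["</s>"]


class BigramModel:
    """Word-bigram language model with add-one (Laplace) smoothing."""

    def __init__(self, corpus):
        self.unigrams = Counter()
        self.bigrams = Counter()
        for sentence in corpus:
            toks = tokens(sentence)
            self.unigrams.update(toks)
            self.bigrams.update(zip(toks, toks[1:]))
        # One extra slot so unseen words still get a small probability.
        self.vocab_size = len(self.unigrams) + 1

    def avg_log_prob(self, text):
        """Average log probability per bigram; lower means less natural."""
        toks = tokens(text)
        pairs = list(zip(toks, toks[1:]))
        total = 0.0
        for prev, word in pairs:
            numerator = self.bigrams[(prev, word)] + 1
            denominator = self.unigrams[prev] + self.vocab_size
            total += math.log(numerator / denominator)
        return total / len(pairs)


# Tiny made-up corpus; a real model would be built from far more text.
model = BigramModel([
    "the cat sat on the mat",
    "the dog sat on the rug",
    "a cat chased the dog",
])
natural = model.avg_log_prob("the dog sat on the mat")
scrambled = model.avg_log_prob("mat the on dog sat the")
print(natural > scrambled)
```

Text whose word sequences resemble the training text scores higher than the same words in scrambled order. Where to draw the line between ordinary and gibberish content would depend on the data and is not something this sketch settles.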
