A newly granted Google patent on phrase-based indexing calls for a new look at that approach to indexing phrases on the Web, including a process referred to as phrasification.
Say you want to find out who the chief of police is in New York City. You might type the following words into a search box at Google:
When Google attempts to find an answer for you, it may break your query into individual words to find all of the documents that might be a best match for your search:
- New AND York AND police AND chief
Google may then take all the documents that are returned, and see which ones contain all of the terms you used, and then rank those based upon some of the ranking algorithms the search engine uses to try to show you the best matches for your query.
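A minimal sketch of that kind of AND matching is below, assuming a tiny in-memory inverted index. The function names (`build_index`, `and_query`) and the sample documents are made up for illustration, and nothing here is drawn from Google's actual implementation or from the patent.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercased word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def and_query(index, query):
    """Return the ids of documents that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    # One posting set per query word; a word that is not indexed yields an empty set.
    postings = [index.get(word, set()) for word in words]
    return set.intersection(*postings)


# Hypothetical documents, for illustration only.
docs = {
    1: "New York police chief announces new policy",
    2: "York police respond to call",
    3: "The chief of police in New York City",
}
index = build_index(docs)
matches = and_query(index, "New York police chief")
```

Here `matches` holds only the documents that contain all four words. Ranking those matches would be a separate step, and treating "New York" as a single phrase rather than two unrelated words is the sort of problem phrase-based indexing is meant to address.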
There are a number of ways a search engine may decide how important a web page might be. That measure of importance might be used by search engines, along with a determination of relevance, as one of the ranking signals used to decide which pages to show first in lists of results shown to searchers. That importance might also be used to decide which pages a search engine crawling program should crawl and index, and revisit to see if the content on those pages has changed.
A search engine might view the links between web pages, and decide that pages linked to frequently are more important than pages that aren’t. It might also determine that web pages that are linked to by important pages are more important than pages linked to by less important pages. Google’s PageRank is one approach for determining how important pages might be based upon looking at links between pages.
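The idea that links from important pages count for more can be sketched with a simplified, textbook-style PageRank iteration. This is an illustration under stated assumptions, not Google's production algorithm: the damping value of 0.85, the fixed iteration count, the function name `pagerank`, and the toy link graph are all choices made for this example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of scores.

    `links` maps each page to the list of pages it links to.
    Assumes every linked-to page also appears as a key in `links`.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                # A page splits its current score evenly among the pages it links to.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical four-page site, for illustration only.
links = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home", "about"],
    "orphan": ["home"],
}
ranks = pagerank(links)
```

In this toy graph, "home" ends up with the highest score because every other page links to it, while "orphan" keeps only the baseline score because nothing links to it.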
There are other ways a search engine might decide how important a web page is, including attempting to measure how many people actually use that page.
A recently granted Google patent from the founders of Applied Semantics discusses a search interface that could help searchers find web pages based upon the meanings of their queries rather than just pages that include those keywords.
In the late 90s, Adam Weissman and Gilad Elbaz decided to start a search engine that would search on meanings or concepts instead of keywords. Along with a few friends and family, they formed a company named Oingo, and along the way filed for a patent on a search based upon meanings rather than keywords.
The technology they developed could be used in a number of ways in addition to search, and provided an interesting alternative to keyword-based search that would lead to some significant developments in the world of search engines.
Oingo Changes Directions
There are a lot of pages on the Web that conventional search engines can’t find, crawl, index, and show to searchers. The University of California (UC), funded partially by the US Government, has been working to change that.
When you search the Web at Google or Yahoo or Bing, you really aren’t searching the Web, but rather the indices that those search engines have created of the Web. To some degree, it’s like searching on a map of a place instead of the place itself. The map is only as good as the people mapping it.
Map makers have consistently worked to develop new ways to get more information about the areas that they survey. For example, a New Deal program in the 1930s under the Agricultural Adjustment Administration led to the creation (pdf) of a $3,000,000 map.
Paul Boag wrote a post at his site Boagworld asking a number of questions about SEO. I started writing a comment at his blog, but it quickly grew to become longer than his post and the questions and comments that he had about SEO, so I decided to post my response here.
In Paul’s post, Why I don’t get SEO, he came up with five reasons why he had doubts about SEO. My response doesn’t address his concerns in the order that he asked them, and it touches upon some of the comments written by others as well. If you have questions or concerns about SEO that aren’t addressed in this response, please feel free to ask them in the comments below.
What is Good SEO?
Good SEO is not “cheating the system,” or “manipulating search results.” Good SEO is part of a marketing plan that makes it more likely that the good content you create will be found by people who might be interested in what your web site has to offer.
When someone types “George Washington” into a search box, they are probably more interested in the Revolutionary War general and President than some random George in Washington. A search for “Washington Hotels” is more likely looking for lodging in Washington than hotels named Washington. Searches for places with signs that say “Washington Slept Here” are probably not about hotels (and those searchers probably have too much time on their hands).
When words used in search queries can have more than one meaning, a search engine may provide better search results to searchers if the search engines can calculate a probability of the most likely meaning of that word. That’s the focus of a patent granted to Yahoo this past week:
Three patents granted today to Google, Microsoft, and Yahoo all describe how each of the search engines might take a close look at page addresses, or URLs on dynamic web sites.
I wrote about the patent from Microsoft back when it had just been published as a pending patent application, in Microsoft Creating Rules for Canonical URLs. It appears that the patent examiner who reviewed the patent saw my blog post, because it is referred to in the patent within the “other references” section (Slawski, “Microsoft Creating Rules for Canonical URLs,” Sep. 29th, 2006, pp. 1-5. cited by examiner.). I don’t know if it is the first blog post to be cited as a reference in a granted patent (probably not), but it’s the first of my posts to be listed in one.
All three patents take a close look at the structures of URLs on dynamic web pages, which can often include large amounts of information within those URLs. For example, here’s a link to a page about a pair of jeans:
A misconception about web pages that lingered for a long time on the Web was that most people who visit your web site will enter it through your home page. Another is that the meta description you choose for a page will usually be what a search engine shows as the snippet, or summary, for your page in search results.
Search engines have made it a lot easier for a visitor to enter your site at pages other than your home page. And the summary, or description snippet, that those search engines provide about pages listed in search results is more likely to be taken from text on your page that matches the query terms used to find your page, especially if your meta description doesn’t include that text.
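One way to picture that behavior is the rough sketch below. It assumes a simple rule: use the meta description when it contains every query term, and otherwise pick the window of page text that contains the most query terms. The rule, the function names (`tokens`, `choose_snippet`), and the `width` parameter are assumptions made for illustration; real search engines do not publish their snippet logic in this form.

```python
import string


def tokens(text):
    """Lowercase the words in a text and strip surrounding punctuation."""
    return [word.strip(string.punctuation).lower() for word in text.split()]


def choose_snippet(query, meta_description, page_text, width=12):
    """Pick a snippet for a query (hypothetical rule, for illustration).

    Prefer the meta description if it contains every query term;
    otherwise return the first `width`-word window of page text
    that contains the largest number of distinct query terms.
    """
    terms = set(tokens(query))
    if terms and terms <= set(tokens(meta_description)):
        return meta_description

    words = page_text.split()
    best_start, best_hits = 0, -1
    for start in range(max(1, len(words) - width + 1)):
        window = " ".join(words[start:start + width])
        hits = len(terms & set(tokens(window)))
        if hits > best_hits:
            best_start, best_hits = start, hits
    return " ".join(words[best_start:best_start + width])


# Hypothetical page, for illustration only.
meta = "Handmade leather boots shipped worldwide."
body = (
    "Welcome to our shop. We repair vintage cowboy boots and "
    "resole them in two days. Visit us downtown."
)
from_meta = choose_snippet("leather boots", meta, body, width=6)
from_body = choose_snippet("repair cowboy boots", meta, body, width=6)
```

With these inputs, the first query is answered from the meta description, while the second falls back to a passage of body text because the meta description never mentions repairs.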
You’ve created a web page and carefully chosen a title that accurately describes its contents and uses a keyword phrase that you hope your audience will use to try to find the page. You created a meta description for the page that is persuasive, engaging, and (you hope) likely to convince visitors to click on the link to your page when they see it in search results.
How likely is it that a search engine will show your page title and your meta description when your page does show up in search results?