We can make your web site easier to find, and easier to use.
By Bill Slawski, on December 29, 2008

Search for the word “automobile” at Google, and the search engine might expand your search to include results for the word “car” as well, since it is a synonym of the word automobile. Accidentally misspell the word as “automoble” and Google might automatically correct your spelling error and search for “automobile.” Follow that up with a search for the word “driving” and Google could expand your query by using a process called stemming, which looks at the root of the word (driv-) and adds common endings to it to come up with, and include in the search, such words as “driving” and “driver.”

This kind of query expansion is aimed at providing searchers with better search results. This method of expanding queries might not happen yet (though it sometimes appears to for spelling corrections at least), and it might not happen in all searches. Typical approaches to query expansion include:

Continue reading How a Search Engine Might Find Synonyms to Use to Expand Search Queries

By Bill Slawski, on December 10, 2008

In 2002, Jon Kleinberg wrote a paper about looking at how frequently terms and phrases might appear in the emails he received or the news articles he read, and how some terms would suddenly become popular over hours or days, and then lose that popularity. For example, as a professor, he would receive a lot more emails that contained the word “prelim” in the few days before midterm exams.

The paper is Bursty and Hierarchical Structure in Streams, and paying attention to bursts of activity related to certain terms, like those described in the paper, might tell us something interesting about the times that certain buzzwords became more popular.

Imagine taking this idea of the burstiness of phrases appearing in emails or news articles, and instead looking for burstiness of phrases appearing in search queries at a search engine.
Would pages that include phrases that have suddenly become more popular in searches, over a short period of time, be pages that searchers might be more interested in seeing?

When people search for different query terms, a search engine can keep track of how frequently those terms are searched for in search logs, which record the number of searches at the search engine for different terms. If there has been an increase or decrease in the frequency of searches for that query term, that increase or decrease in popularity could be noted.

Continue reading How Burstiness of Search Queries Could Increase Page Rankings

By Bill Slawski, on December 2, 2008

Search engines have transformed the way that we locate information and learn about the world around us. When we type a term into a search box, we are presented with pages of search results that bring a wealth of information to our fingertips.

The results that we see often include more than just a list of web pages. A search for [baseball] at Google provides links to web pages, videos, news articles, book results, and related search queries. The top result I received was a link to the Major League Baseball (MLB) site, with a list of sitelinks to eight additional pages related to that domain. Interestingly, four of those sitelinks are to different subdomains on the MLB site, to team pages for the Boston Red Sox, the New York Yankees, the Los Angeles Dodgers, and the Baltimore Orioles.

There may be many pages that show up in search results relevant to a query that we perform. In my search for [baseball], I was shown “Results 1 – 10 of about 197,000,000 for baseball.” I’m not going to look at all 197 million pages, and chances are that I might not make it past the first page of the search results.
Continue reading Domain Collapsing, Indented Pages, and Search Results

By Bill Slawski, on November 21, 2008

I’ve written in the past about many of the reasons why you might find the same content at different pages on the Web, and some of the problems that duplicate content might present to search engines.

When someone performs a search on the Web, a search engine doesn’t want to show more than one page that contains the same or very similar content to that searcher. A search engine also doesn’t want to spend time and effort in crawling and indexing the same content on different sites.

One of the challenges that a search engine faces when it sees duplicate content is deciding which page (or image or video or audio content) to show to a searcher in search results. If a search engine provided a way for creators of content to find unauthorized uses of their content on the Web, it might take some of that burden off the search engine.

A newly published patent application from Google describes a process that could be provided for people to search for duplicate copies of their content on the Web, even if their content isn’t readily available online.

Continue reading Google to Help Content Creators Find Unauthorized Duplicated Text, Images, Audio, and Video?

By Bill Slawski, on November 11, 2008

Search pogosticking is when a searcher bounces back and forth between a search results page at a search engine for a particular query and the pages listed in those search results. A search engine could keep track of that kind of pogosticking activity in the data it collects in its log files or through a search toolbar, and use it to rerank the pages that show up in a search for that query.
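To make the idea of tracking pogosticking and reranking a little more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not the method from any patent filing: the log format, the `QUICK_RETURN_SECONDS` threshold, the `penalty` weight, and the function names `pogostick_rates` and `rerank` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical threshold: a return to the results page within this many
# seconds counts as a "pogostick" (an assumption made for illustration).
QUICK_RETURN_SECONDS = 10.0


def pogostick_rates(click_log):
    """Return {(query, url): fraction of clicks followed by a quick return}.

    click_log is an iterable of (query, url, seconds_before_return) tuples;
    seconds_before_return is None when the searcher never came back.
    """
    clicks = defaultdict(int)
    quick_returns = defaultdict(int)
    for query, url, seconds in click_log:
        key = (query, url)
        clicks[key] += 1
        if seconds is not None and seconds < QUICK_RETURN_SECONDS:
            quick_returns[key] += 1
    return {key: quick_returns[key] / clicks[key] for key in clicks}


def rerank(query, scored_results, rates, penalty=0.5):
    """Reorder (url, score) pairs, discounting pages with high pogostick rates."""
    def adjusted(item):
        url, score = item
        # Pages with no logged clicks for this query are left unpenalized.
        return score * (1.0 - penalty * rates.get((query, url), 0.0))
    return [url for url, _ in sorted(scored_results, key=adjusted, reverse=True)]


# Made-up log entries: searchers bounce back quickly from a.example,
# but stay on (or never return from) b.example.
log = [
    ("baseball", "a.example", 3.0),
    ("baseball", "a.example", 4.0),
    ("baseball", "b.example", None),
    ("baseball", "b.example", 120.0),
]
rates = pogostick_rates(log)
ranking = rerank("baseball", [("a.example", 1.0), ("b.example", 0.9)], rates)
```

In this toy log, every click on a.example is followed by a quick return, so its score is discounted and b.example moves ahead of it. A real system would likely weigh many other signals alongside this one.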
A recent patent application from Yahoo describes information that a search engine may collect when searchers click on search results, and suggests that pogosticking information could be used with a ranking system like the one Yahoo described in a patent filing on User Sensitive PageRank, which I wrote about in Yahoo Replaces PageRank Assumptions with User Data.

The Yahoo patent filing on pogosticking is:

Search Pogosticking Benchmarks
Invented by Thomas A. Kehl and Jyri M. W. Kidwell
Assigned to Yahoo
US Patent Application 20080275882
Published November 6, 2008
Filed: May 2, 2007

Continue reading Search Pogosticking and Search Previews

By Bill Slawski, on November 4, 2008

Google was granted a patent today from the USPTO on Universal Search, which provides searchers with a mix of search results from different categories, such as news, images, advertisements, web pages, and other kinds of results when they type in a search query.

The original patent application was filed on December 31, 2003, and Google announced the introduction of Universal Search in May of 2007.

The patent describes some different kinds of document categories that may be shown in search results, such as:
- Sponsored links,
- News documents,
- Product documents,
- Documents summarizing discussion groups,
- Images,
- General web documents, and
- Other document classifications
The Official Google Blog described a few more categories that could be shown to searchers in its announcement, Universal search: The best answer is still the best answer, including Maps, Books, Video, as well as additional contextual links to other categories of documents such as “blogs,” “books,” “groups,” and “code.”

Continue reading Google Universal Search Patent Granted

By Bill Slawski, on October 2, 2008

If you look at a typical page that shows up after you perform a search at one of the major commercial search engines, you’ll see that those search result pages don’t differ too much from each other. Some sets of search results do include news, images, maps, and other results that go beyond just a list of web pages that may contain the keywords used in a search.

But, how interested would you be in entering the address of a web page and seeing related search queries for that page, or related people or places or other pages?

Inversion Searches Showing Related Queries

This kind of search, referred to as an “inversion search” by some Microsoft inventors, is the topic of a new patent application from the Washington-based search provider.

Continue reading How a Search Engine Might Provide Searchers with Related Search Queries For Web Pages

By Bill Slawski, on September 25, 2008

Web pages can be messy; they can have more than one topic on a page, and use templates that surround those topics adding little meaning to the meat of the content, filled with links and labels, advertising and boilerplate, copyright and other notices. With a diversity of topics, those pages may not be easily crawled and recorded and indexed and found by search engines and searchers.
When we think of search engines and how they work, we often break what they do down into three main parts: discovering new pages and new content on old pages, indexing content on those pages following rules that show a preference for important pages and unique content, and presenting relevant and meaningful information to searchers and their intents (or at least matching their keywords) in response to queries that they enter into a search box.

We usually don’t think of search engines as indexing parts of pages, chunks of information that might exist side-by-side with very different topics, and yet many pages are messy like that. But we’ve had signs from the white papers and patent filings that we see from search engineers that they might try to segment and capture information about different topics found on the same page.

Continue reading Microsoft Granted Patent on Vision-Based Document Segmentation (VIPS)