When you search for some terms at Google, you might get a mix of results from different types of searches, including web pages, news stories, images, videos, book listings, and others.
While we’ve been seeing results like this for over a year, we really haven’t heard much from Google on how they go about deciding what to show us where within search results.
We now have some ideas on how those results are blended together, straight from Google, through a patent application published this week at the US Patent and Trademark Office.
David Bailey, one of the inventors listed on the patent, gave us a look “Behind the scenes with universal search” at the Official Google Blog last year, where he told us of one of the challenges behind Universal Search:
Continue reading How Google Universal Search and Blended Results May Work
Web pages can contain a lot of information about various types of objects such as products, people, papers, organizations, and so on. Information about those objects may be spread out on different pages, at different sites.
For example, a page may host a product review of a particular model of camera, and another page may present an ad offering to sell that model of camera at a certain price.
One page might display a journal article, and another page could be the homepage for the author of that article.
Someone searching for information about the camera or about the author may need information contained in both pages. They may have to use a search engine to locate multiple pages to find the information that they need.
If there were a way for a search engine to automatically identify when information on different web pages relates to the same object, that might be helpful to searchers in a number of ways.
Continue reading How Search Engines Can Index Pages in Parts
Many web pages contain more than one topical section, or block, which may make it difficult for a search engine to tell what a page is about when it is trying to index that page.
These blocks may include such things as a main content area, navigation bars, headings, footers, advertisements, and other content that may refer to other pages on a site, or on other sites.
The Value of Knowing the Most Important Block
Being able to identify a block within a web page that represents the primary topic of that page may help a search engine decide which words are the most important ones on the page when it tries to associate the page with keywords that someone might search with to find that page.
Identifying that content might also help the search engine decide what topic is most relevant to any ads that it might show on the page, if it is an advertising partner with the publisher of the page.
Continue reading Search Engines, Web Page Segmentation, and the Most Important Block
One of the technical issues that can cause problems when a search engine crawls a site to index its pages is content that appears more than once on the site, at different URLs (Uniform Resource Locators, or web page addresses).
Unfortunately, this problem happens more frequently than it should.
A new patent application from Yahoo explores how they might handle dynamic URLs to avoid this problem. What is nice about the patent application is that it identifies a number of the problems that might arise because of duplicate content at different web addresses on the same site, and some approaches that they might use to solve the problem.
While search engines like Yahoo can resolve some of the issues around duplicate content, it’s often in the best interest of site owners not to rely upon search engines to fix this problem on their own.
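To make the problem concrete, here is a minimal sketch of how different URLs for the same page might be collapsed into one form. This is not the method described in Yahoo's patent application. The `canonicalize` and `group_duplicates` functions, and the list of ignored parameter names, are assumptions made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical parameters that are assumed not to change what a page shows.
IGNORED_PARAMS = {"sessionid", "sid", "utm_source", "utm_medium", "utm_campaign"}


def canonicalize(url, ignored=IGNORED_PARAMS):
    """Return a normalized form of a URL so that trivial variants compare equal."""
    parts = urlsplit(url)
    # Drop parameters assumed to be irrelevant, then sort the rest so that
    # parameter order does not produce a "different" address.
    query = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in ignored
    )
    path = parts.path or "/"
    # Lowercase the scheme and host, and discard the fragment.
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, urlencode(query), "")
    )


def group_duplicates(urls):
    """Group URLs that share the same normalized form."""
    groups = {}
    for url in urls:
        groups.setdefault(canonicalize(url), []).append(url)
    return groups
```

With this sketch, `http://example.com/item?id=7&sessionid=abc` and `http://example.com/item?sessionid=xyz&id=7#reviews` land in the same group. A site owner who avoids creating such variants in the first place doesn't have to hope that a search engine guesses correctly about which parameters matter.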
Avoiding the Crawling of Duplicate Pages
Continue reading Same-Site Duplicate Pages at Different URLs
The order in which a page appears in the results of a search at a search engine may be influenced by the number of pages that link to it, and by the rankings of the pages that link to it.
When a site is linked to by a popular and trusted domain, that link might provide more value (and a higher ranking) than a link from a site that is less popular and trusted.
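As a rough illustration of link-based ranking, here is a simplified sketch in the spirit of PageRank-style calculations, where a page's score depends on how many pages link to it and on the scores of those linking pages. The `pagerank` function name, the damping value, and the iteration count are assumptions chosen for the example, and no search engine's actual formula is implied.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Compute simplified link-based scores.

    `links` maps each page to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its score evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank
```

For a tiny graph such as `{"a": ["c"], "b": ["c"], "c": ["a"]}`, page `c` ends up with the highest score because two pages link to it, and page `a` scores higher than `b` because its single link comes from the well-ranked `c`.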
Ages of Linking Domains
A new patent application from Microsoft adds another twist, by also ranking a domain based upon the ages of the domains that link to it.
Continue reading Do Domain Ages Affect Search Rankings?
You go to a search engine, and type some query terms in the search box. A list of results is returned by the search engine, and you visit a link to one of the results that appears.
Looking through the page, you may not see your query terms on the page itself. Why would the search engine return that result to you?
Determining Relevance from Anchor Text
One reason might be that the search engine is looking at the anchor text in links pointing to the page to determine that the page is relevant for your query terms.
This can be very helpful when a page has little or no text on it, such as a page that presents a video or an audio file.
A patent application from Microsoft explores the use of anchor text to define the context of a page and terms that it might rank for that don’t appear upon that page.
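Here is a minimal sketch of the general idea, not of the approach in Microsoft's patent application. It assumes that links arrive as simple `(target_url, anchor_text)` pairs, and the `build_anchor_index` and `matches` functions are hypothetical names used only for this example.

```python
from collections import defaultdict


def build_anchor_index(links):
    """Map each target URL to the set of words used in links pointing at it."""
    index = defaultdict(set)
    for target_url, anchor_text in links:
        index[target_url].update(anchor_text.lower().split())
    return index


def matches(query, page_url, page_text, anchor_index):
    """Return True if every query term appears on the page or in its anchor text."""
    terms = set(query.lower().split())
    on_page = set(page_text.lower().split())
    from_anchors = anchor_index.get(page_url, set())
    return terms <= (on_page | from_anchors)
```

A video file at a hypothetical address like `http://example.com/clip.mp4` has no text of its own, but if other pages link to it with the words "funny cat video", a query for "cat video" could still match it under this sketch.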
Continue reading Using Anchor Text to Determine the Relevance of a Page
If a search engine could understand the layout of a web page and identify the most important part of a web page, it could pay more attention to that section of the page when indexing content from the page.
It could give links found within that section of the page more weight than links found in other sections of the page, and it could give information within that area more weight when determining what the page is about.
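One way to picture that kind of weighting is the small sketch below. It assumes a page has already been split into labeled blocks, which is the hard part and isn't shown here. The block labels, the weight values, and the `weighted_term_scores` function are all assumptions invented for illustration.

```python
from collections import Counter

# Hypothetical weights: words in the main content count more than words
# in navigation or footers.
BLOCK_WEIGHTS = {"main": 3.0, "heading": 2.0, "navigation": 0.5, "footer": 0.25}


def weighted_term_scores(blocks, weights=BLOCK_WEIGHTS, default_weight=1.0):
    """Score each word by how often it appears, scaled by its block's weight.

    `blocks` is a list of (block_type, text) pairs.
    """
    scores = Counter()
    for block_type, text in blocks:
        weight = weights.get(block_type, default_weight)
        for term in text.lower().split():
            scores[term] += weight
    return scores
```

Under these made-up weights, a word like "camera" that appears twice in the main content and once in a navigation bar would far outscore a word like "home" that appears only in the navigation and footer.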
We’ve seen the idea of breaking pages up into parts from a couple of the major commercial search engines:
Continue reading The Importance of Page Layout in SEO
On one level, a search engine indexes a web site by crawling that site one URL at a time, collecting information about what it finds at that address, and indexing the information found so that it can be served to visitors later.
But the process can be more complicated than that.
For instance, a search engine may try to understand more about specific sites by collecting information on a site-wide basis.
Site-Wide Information about Web Sites
Site-wide information that a search engine might look at for a web site could include:
Continue reading Yahoo on Segmenting Web Sites into Topical Hierarchies