Improving Text Segmentation for Displaying Advertisements and Filtering Search Results

When you type a domain name into your browser address bar and the domain isn’t found, sometimes you’ll be served a search results page that has advertisements and links related to a “subject” for that domain name.

For example, you might type “usedrugs.com” into the address bar, and there may not be a website at the domain name “usedrugs.com”. You may be redirected to a third-party website, with advertisements and/or links relevant to that domain name. Ads might be shown for the phrase “used rugs” on that web page, if it is determined to be the most likely segmented version of the string of text from the domain name.
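The ambiguity in a string like “usedrugs” can be made concrete with a small sketch. The code below is not the method from the patent filing. It is a minimal dictionary-based segmenter, and the word list, the counts, and the function names are all made up for illustration. It lists every way to split the string into known words and ranks the candidates by a simple unigram score.

```python
import math

# Hypothetical word counts; a real system would draw these from a large corpus
# and would likely use more signals than word frequency alone.
WORD_COUNTS = {
    "used": 500, "rugs": 120, "use": 900, "drugs": 300,
    "us": 400, "ed": 20, "rug": 80, "s": 1,
}
TOTAL = sum(WORD_COUNTS.values())


def segmentations(text):
    """Yield every way to split text into words found in WORD_COUNTS."""
    if not text:
        yield []
        return
    for end in range(1, len(text) + 1):
        head = text[:end]
        if head in WORD_COUNTS:
            for rest in segmentations(text[end:]):
                yield [head] + rest


def score(words):
    """Sum of log unigram probabilities; higher means more likely."""
    return sum(math.log(WORD_COUNTS[w] / TOTAL) for w in words)


def ranked_segmentations(text):
    """Return all candidate segmentations, most likely first."""
    return sorted(segmentations(text), key=score, reverse=True)


if __name__ == "__main__":
    for candidate in ranked_segmentations("usedrugs"):
        print(" ".join(candidate))
```

Both “used rugs” and “use drugs” come out as valid candidates. Which one ranks first depends entirely on the invented counts, which is why a real system needs better evidence before choosing ads for the page.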

Some sites might be filtered out of search results because their domain names appear to indicate adult-related material.

For example, a domain name, such as “mikesexpress.com”, could be filtered out of search results by an adult filter, because the word “sex” appears in the string of characters.
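A short sketch shows why segmentation matters for this kind of filter. The blocklist and both function names are hypothetical, and the segmented form of the domain is supplied by hand here rather than computed. The first check looks for a blocked term anywhere in the raw string, while the second only matches whole words.

```python
# Hypothetical blocklist, for illustration only.
BLOCKED_TERMS = {"sex"}


def naive_filter(domain_label):
    """Flag the label if any blocked term appears anywhere in the string."""
    return any(term in domain_label for term in BLOCKED_TERMS)


def segmented_filter(words):
    """Flag the label only if a whole segmented word is a blocked term."""
    return any(word in BLOCKED_TERMS for word in words)


if __name__ == "__main__":
    print(naive_filter("mikesexpress"))
    print(segmented_filter(["mikes", "express"]))
```

The substring check flags “mikesexpress”, while the word-level check on “mikes express” does not.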

Continue reading

Google’s Green Border Technologies Patent Filings

On May 11th, Google purchased security company Green Border Technologies, Inc.

Green Border has a handful of patent applications pending, and a granted patent for their security software. One of the names that appears on most of the documents is Ulfar Erlingsson, who left the company in 2003 to join Microsoft Research.

The software from the Mountain View, California-based company works to isolate internet sessions from the rest of a user’s PC. I’ve seen speculation that this software might be offered to Google users as part of the free Google Pack software download. The search engine might also use the software to crawl sites with client software to locate malware.


Continue reading

Big Maps, Big Data: Google’s Keyhole Flatfile Patent

Google was granted a patent today on the way that they store, retrieve, and draw geospatially organized data in systems like Google Earth.

Server for geospatially organized flat file data
Invented by Chikai J. Ohazama, Phillip C. Keslin, and Mark A. Aubin
Assigned to Google
US Patent 7,225,207
Granted May 29, 2007
Filed October 10, 2002

Abstract

A flat file data organization technique is used for storing and retrieving geospatially organized data. The invention reduces transfer time by transferring a few large files in lieu of a large number of small files. It also moves the process of locating a given data file away from the file system to a proprietary code base. Additionally, the invention simplifies database management by having quadtree packets generated on demand.
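The two ideas in the abstract, quadtree addressing and a few large files in place of many small ones, can be sketched roughly as follows. This is not the format described in the patent. The quadrant numbering, the plain longitude/latitude grid, and the function names `quadtree_path`, `pack_tiles`, and `read_tile` are all assumptions made for illustration.

```python
def quadtree_path(lon, lat, depth):
    """Return a string of quadrant digits (0-3) locating a point.

    Assumed numbering: +1 for the eastern half, +2 for the northern half.
    """
    west, east = -180.0, 180.0
    south, north = -90.0, 90.0
    digits = []
    for _ in range(depth):
        mid_lon = (west + east) / 2
        mid_lat = (south + north) / 2
        quadrant = 0
        if lon >= mid_lon:
            quadrant += 1
            west = mid_lon
        else:
            east = mid_lon
        if lat >= mid_lat:
            quadrant += 2
            south = mid_lat
        else:
            north = mid_lat
        digits.append(str(quadrant))
    return "".join(digits)


def pack_tiles(tiles):
    """Concatenate tile payloads into one blob plus an offset index."""
    index = {}
    blob = bytearray()
    for path, data in sorted(tiles.items()):
        index[path] = (len(blob), len(data))  # (offset, length)
        blob += data
    return index, bytes(blob)


def read_tile(index, blob, path):
    """Look up a tile in the index instead of asking the file system."""
    offset, length = index[path]
    return blob[offset:offset + length]


if __name__ == "__main__":
    path = quadtree_path(-122.08, 37.39, 8)
    index, blob = pack_tiles({path: b"imagery bytes", "0": b"other tile"})
    print(path, read_tile(index, blob, path))
```

A deeper path always starts with the shallower path for the same point, so nearby tiles share a prefix and can sit next to each other in the same large file.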

Continue reading

User Intent and Characteristics of Search Queries

One of the short posters at the recent WWW 2007 Conference in Banff, Alberta, Canada, provides an in-depth look at classifications of search queries, based on a sample of more than 5 million queries taken from the transaction logs of three different search engines.

The authors used that data to develop a classification algorithm, which they then ran on a “separate Web search engine transaction log of over a million queries submitted by several hundred thousand users.” The results are interesting.

The article is Determining the User Intent of Web Search Engine Queries, from Bernard J. Jansen and Danielle L. Booth of Pennsylvania State University, and Amanda Spink of the Queensland University of Technology.

Their findings indicated that approximately 80 percent of the queries classified were informational in nature, with the remaining queries being split almost equally between navigational and transactional queries.

Continue reading

Catching Up With Memes, Part 1 – How Nerdy Am I?

I seem to have been tagged with a lot of memes recently, and haven’t posted any responses to them. The three-day weekend seemed like a good time to try to do that, but I seem to be gathering tags faster than I can reply to them.

My friend Sophie tagged me with one that asks the question, “How nerdy are you?” I took this test a few days ago without being tagged. It seems my nerd quotient has slipped a little since the first time I took it.


I am nerdier than 65% of all people. Are you a nerd? Click here to find out!

The second part about a meme is that you tag others after taking it. I’m tagging the following folks:

Continue reading

Google Plus Box Patent Application

When you perform searches in Google, sometimes you will see within the listed search results one which has a plus sign next to it. If you click upon the plus sign, you are shown some more information. A new patent application published at the USPTO website provides some information about how expanded and collapsed data in search results might work.

It also raises some questions about how much information search engine results should actually show.

These types of results may have first appeared as expandable maps and local business information for specific businesses. Google referred to this feature in their Webmaster Help pages as a plus box (though the link is no longer live):

The address link shown below some sites in our search results (in an expandable area called a Plus Box) is meant to help searchers locate businesses and compare search results. We show the address link for results that are local in nature and for which we have an associated address. If we don’t have an address for your business, or we don’t think that an address is relevant to your site we won’t show it.

Continue reading

Mindmapping Audiences and Tasks for Category and Keyword Development

It’s been a while since my last Back to Basics post here, so I’m going to provide an example of one SEO task that can be a lot of fun if done right.

It’s an exercise in Mind Mapping, and is the kind of thing that can be done in a group. It involves getting something to write upon (ideally posterboard paper and a mix of different colored magic markers), and thinking non-linearly, while filling that paper up with ideas.

The ideas don’t necessarily have to be completely on topic, and sometimes writing down an idea that is only tangentially related to the topic may lead to the exploration of ideas and keyword development that are more relevant.

One of the points of performing search engine optimization on a web site is to help the site get found in search engines by the audience that is searching for the information it offers.

Continue reading

Refining Queries Using Category Synonyms for Local and Other Searches

Is that local hotspot down the street that you want to look up in Google Maps considered a bar, or is it a tavern?

When you look in Google Maps for a restaurant specializing in steaks, what kinds of calculations might a search engine make regarding categories to help you find a good steakhouse when your query is [steak : city : state]?

The way that a search engine classifies a business into a category may affect how it is listed in a local search, or in Google’s universal search interface, so a question like this is important. A cafe that isn’t listed among coffee houses may not be shown to searchers looking for coffee houses, even though it might be exactly what the searcher wanted to find.

A new patent application from Google describes a method of recognizing, from user log files and user data, that the categories a business could be listed under may be synonyms, so that inclusion in one should mean inclusion in the other.

This query refinement approach may improve search results by understanding that someone searching for a bar is also likely to be interested in a tavern. It may also have broader implications than just for Google Maps.
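One way to picture finding category synonyms in log data is sketched below. This is not the method claimed in the patent application. It assumes a hypothetical click log of (category term, business) pairs and treats two terms as likely synonyms when searchers using them tend to click on the same businesses. The log format, the Jaccard overlap measure, and the threshold are all illustrative choices.

```python
from collections import defaultdict
from itertools import combinations


def synonym_pairs(click_log, threshold=0.5):
    """Return (term_a, term_b, overlap) for terms with similar click sets."""
    clicked = defaultdict(set)
    for term, business in click_log:
        clicked[term].add(business)

    pairs = []
    for a, b in combinations(sorted(clicked), 2):
        shared = clicked[a] & clicked[b]
        combined = clicked[a] | clicked[b]
        overlap = len(shared) / len(combined)  # Jaccard similarity
        if overlap >= threshold:
            pairs.append((a, b, overlap))
    return pairs


if __name__ == "__main__":
    # Made-up log entries, for illustration only.
    log = [
        ("bar", "Joe's"), ("bar", "Molly's"),
        ("tavern", "Joe's"), ("tavern", "Molly's"), ("tavern", "Old Inn"),
        ("steakhouse", "Grill 9"),
    ]
    for a, b, overlap in synonym_pairs(log):
        print(a, b, round(overlap, 2))
```

With this sample log, “bar” and “tavern” share most of their clicked businesses and are paired, while “steakhouse” is not paired with either.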

Continue reading