Google on Aggregating Ad Data, Yahoo Messes with Map Reduce, Microsoft Explores Hierarchical Tagging

I don’t always have time to dig as deeply as I like into some of the patent filings that I uncover each week, and end up not writing about some interesting things that the search engines come out with.

Instead of not mentioning those at all, I think I might try a weekly post pointing out some of the most interesting ones, like the following. If you find something interesting about any of them that you would like to share in the comments, please be my guest…

Google

Google explores providing better data for advertisers, especially those with multiple web sites focusing upon broad geographical areas, in the following patent application:

Continue reading

How Search Engines Can Learn From Looking at Sequences of Search Queries

Whenever someone searches at a search engine, they not only get information in response to their search, but they also provide information to the search engine about the things they are searching for – information which the search engine might find useful in helping other searchers.

If that searcher performs another search related to their first search, then the search engine might create an association between the two search phrases that the searcher used, if the two phrases appear to be related. If they perform a series, or sequence, of searches on a concept, then the search engine might take advantage of that information.

If a lot of people perform that first search and then the same second search, or perform both searches within the same search session, then the search engine might decide that the phrases are semantically related to each other. Knowing that a relationship exists between search queries might help the search engine help people find things on the web, and it might help the search engine provide better advertisements.

A patent application from Yahoo explores how the search engine might find semantically related terms by looking at queries searched for by people in search sessions, and describes some of the processes behind how the search engine might determine that phrases may be related to each other. It also describes how a search engine might identify whether a query comes from a person, or from a program.
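To make the idea of learning from search sessions a little more concrete, here is a minimal sketch in Python. It is not the method described in the Yahoo filing; it only counts how many sessions contain the same two queries and keeps the pairs that show up often enough. The function name `related_query_pairs`, the `min_sessions` threshold, and the sample sessions are all made up for illustration.

```python
from collections import Counter
from itertools import combinations


def related_query_pairs(sessions, min_sessions=2):
    """Return query pairs that co-occur in at least `min_sessions` sessions.

    `sessions` is a list of sessions, each a list of query strings.
    This is an illustrative assumption, not a documented algorithm.
    """
    pair_counts = Counter()
    for session in sessions:
        # Normalize and de-duplicate so each session counts once per pair.
        unique_queries = sorted({q.strip().lower() for q in session if q.strip()})
        for pair in combinations(unique_queries, 2):
            pair_counts[pair] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_sessions}


# Hypothetical search sessions, invented for this example.
sessions = [
    ["cheap laptops", "refurbished laptops"],
    ["refurbished laptops", "cheap laptops", "laptop bags"],
    ["pickup truck repair", "tow truck near me"],
]

related = related_query_pairs(sessions)
print(related)
```

With this sample data, only "cheap laptops" and "refurbished laptops" appear together in two sessions, so that is the only pair kept. A real system would need far more data, and some way of telling people apart from programs, before trusting a pairing like that.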

Continue reading

Phone Keyboards and Searchers Using Predictive Query Suggestions

A few years back, finding myself stranded on the side of the road with a broken down pickup truck and being over an hour’s drive from home, I convinced myself to finally get a mobile phone.

I didn’t necessarily want to have a phone hanging at my side all of the time, and I didn’t need it for work at that time. But it would have been useful in that emergency, and I wanted to start seeing what web sites looked like on a phone.

Search engines are also paying more attention to the smaller screens, and the more limited keyboards available to people who access the Web by phone. What influence do these constraints have upon the future of mobile search?

Studying Query Suggestions on Phones

Continue reading

How Search Engines May Substitute Other Search Terms for Yours

When you search for something at a search engine, the search engine might not just try to find pages on the web which match the keywords that you searched with, but may first try to expand upon those keywords by finding similar or related terms.

A Yahoo search box with a search for refurbished laptops and search query suggestions shown above the search results

This kind of expansion of search terms can be most visible when one of the query terms that you use is a misspelling, and a search engine might display results with the correctly spelled words if it is pretty confident that one of the terms is misspelled.

How does a search engine know that a term is misspelled, or that there might be related phrases that might provide better and more helpful results to a searcher?

Continue reading

Community Tagging and Ranking in Images of Landmarks

In addition to collecting a lot of information about the Web by using crawling programs to index content across the internet, search engines can learn a lot about pages and images and videos and other objects on the web by watching what we choose when we search, by seeing how we browse web pages through their toolbars, and by noting what words we might choose when we annotate and tag images and pages.

As publishers of text and links and pictures, as users of web pages, and as interactive participants on pages when leaving comments and tags and annotations, we provide search engines with information about our interests and what we might be interested in seeing on the Web.

As those search engines learn about us and our interests from the pages that we like to visit and the images and text that we might publish, they can compare what we see and what we do online with other travelers and publishers on the Web, and they might view us as communities who may share some common interests, and whom they can learn from.

Continue reading

A Personalized Search Using Advanced Search Operators

Search engines often provide an “advanced search” page, where a searcher can refine the search results they receive in many ways, beyond the simple keyword search box found on the front page of those search engines.

For example, Yahoo’s advanced web page search lets searchers select a combination of different search limitations, such as:

  1. Different relationships between keywords in a search (e.g., “all of these words”, “the exact phrase”, “any of these words”, and “none of these words”),
  2. A time limitation on when the web page was last updated,
  3. A limit on what top-level domain names to search,
  4. A limit based on available legal rights,
  5. A file format limitation,
  6. Country and language limitations.

If someone uses advanced search, they can significantly narrow the number of search results they receive, perhaps making it easier to find what they are looking for. But most people don’t use the advanced search interface and its many ways of limiting search results.
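As a rough illustration of how limitations like these narrow a result set, here is a small Python sketch that applies a few of them to a handful of pages. It is not Yahoo's implementation; the `Page` and `AdvancedQuery` classes, their field names, and the sample pages are assumptions made up for this example, and it leaves out the legal rights and country limits.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Page:
    # Hypothetical record for an indexed page.
    url: str
    text: str
    last_updated: date
    tld: str
    file_format: str
    language: str


@dataclass
class AdvancedQuery:
    # Hypothetical container for advanced search limitations.
    all_words: List[str] = field(default_factory=list)
    exact_phrase: str = ""
    any_words: List[str] = field(default_factory=list)
    none_words: List[str] = field(default_factory=list)
    updated_since: Optional[date] = None
    tlds: List[str] = field(default_factory=list)
    file_formats: List[str] = field(default_factory=list)
    languages: List[str] = field(default_factory=list)


def matches(page: Page, q: AdvancedQuery) -> bool:
    """Return True if the page satisfies every limitation that is set."""
    text = page.text.lower()
    words = set(text.split())
    if any(w.lower() not in words for w in q.all_words):
        return False
    if q.exact_phrase and q.exact_phrase.lower() not in text:
        return False
    if q.any_words and not any(w.lower() in words for w in q.any_words):
        return False
    if any(w.lower() in words for w in q.none_words):
        return False
    if q.updated_since and page.last_updated < q.updated_since:
        return False
    if q.tlds and page.tld not in q.tlds:
        return False
    if q.file_formats and page.file_format not in q.file_formats:
        return False
    if q.languages and page.language not in q.languages:
        return False
    return True


# Invented sample pages and query.
pages = [
    Page("https://example.com/a", "Refurbished laptops on sale", date(2009, 5, 1), "com", "html", "en"),
    Page("https://example.org/b", "New laptops and tablets", date(2009, 6, 1), "org", "pdf", "en"),
    Page("https://example.com/c", "Refurbished laptops broken screens", date(2008, 1, 1), "com", "html", "en"),
]
query = AdvancedQuery(
    all_words=["laptops"],
    exact_phrase="refurbished laptops",
    none_words=["broken"],
    updated_since=date(2009, 1, 1),
    tlds=["com"],
)
results = [p.url for p in pages if matches(p, query)]
print(results)
```

Each limitation that is left empty is simply skipped, which mirrors how an advanced search form treats blank fields.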

A personalized search method described in a Yahoo patent application published last week collects information about a searcher’s interests from their search history, their browsing history, and their interests listed in profiles from places like MySpace and other social networks.

Continue reading

Search Engines, Web Page Segmentation, and the Most Important Block

Many web pages contain more than one topical section, or block, which may make it difficult for a search engine to tell what a page is about when it is trying to index that page.

These blocks may include such things as a main content area, navigation bars, headings, footers, advertisements, and other content that may refer to other pages on a site, or on other sites.

The Value of Knowing the Most Important Block

Being able to identify a block within a web page that represents the primary topic of that page may help a search engine decide which words are the most important ones on the page when it tries to associate the page with keywords that someone might search with to find that page.

Identifying that content might also help the search engine decide what topic is most relevant to any ads that it might show on the page, if it is an advertising partner of the page’s publisher.

Continue reading

Microsoft on Organizing Information in Storylines

What will the search interfaces of tomorrow look like? How might we be presented with information that we are interested in differently than we are today, and how might that information be delivered to us in manners that we find helpful?

On Google’s corporate Quick Profile page, they tell us that their mission is

…organizing the world’s information and making it universally accessible and useful

That seems to be a pretty tall order, and it causes me to think about the many different ways that information might be organized in accessible and useful ways.

A newly published patent application from Microsoft takes an interesting spin on presenting information, pulling together news from a mix of sources to present topics in storylines, and providing ways to have that information delivered to us over computers, smart phones, watch interfaces, and in other ways.

Continue reading