How much does feedback from searchers impact the search results that we see at Bing or Google? How do those search engines process and respond to that feedback?
The links that Google and Bing provide for searchers to give feedback on search results appear at the bottom of each engine's search results pages. If there were instead a link after each individual search result where someone could provide feedback, how much of an impact would that change have, and would the search engines be able to handle the feedback they receive?
A patent granted to Microsoft this week describes how the search engine may automate the handling of “dissatisfaction reports” that are manually submitted by searchers, and how the search engine may file its own dissatisfaction reports in some instances. While some of the feedback that search engines receive may include web spam reports, they may also receive feedback that something is “broken” with the search engines, that a URL that should be showing for a specific query isn’t, or that the results just weren’t helpful.
Providing Feedback at Bing and Google
Continue reading How a Search Engine May Automate Web Spam Reports and Search Feedback
I hadn’t heard the term “Bounce Pad” applied to websites before, but it’s useful to know the language of search engines, and the things they might look for when crawling and indexing webpages, and serving results to searchers. Determining whether a site is a bounce pad involves an analysis of the redirects appearing on the site, like in the image below from a Google patent granted this week:
One of the mysteries associated with Google’s search results is how it determines which pages to show when there are duplicate or substantially duplicated documents within its index. A search engine doesn’t want to show searchers a list of search results that contains substantially the same pages, so when it finds pages that are pretty close to being the same, it will create a “cluster” of those pages and choose a representative page to display.
That kind of duplication can happen for a number of reasons: someone copying content from another page (with or without permission or a license to do so); the majority of the content on a page being a manufacturer’s or publisher’s description; a content management system set up so that the same page gets published more than once at different URLs; content being republished on a mirror site or sites, set up so that if there’s too much traffic to one of the sites the others can handle the overflow; and more.
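As an illustration only (the patent doesn't describe Google's actual algorithm), here is a minimal sketch of the clustering idea: break each page into word shingles, group pages whose shingle sets overlap heavily, and keep one representative URL per cluster. The threshold and shingle size are assumed values chosen for the example.

```python
# Illustrative sketch, not Google's actual method: cluster near-duplicate
# documents by shingle overlap and pick one representative per cluster.

def shingles(text, k=4):
    """Return the set of k-word shingles for a document."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity between two shingle sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def cluster_duplicates(docs, threshold=0.8):
    """Greedily group pages whose shingle sets overlap heavily.

    docs: dict mapping URL -> page text.
    Returns a list of clusters (lists of URLs); the first URL in each
    cluster serves as the representative page to display.
    """
    clusters = []  # each entry: (representative shingle set, [urls])
    for url, text in docs.items():
        s = shingles(text)
        for rep_shingles, urls in clusters:
            if jaccard(s, rep_shingles) >= threshold:
                urls.append(url)
                break
        else:
            clusters.append((s, [url]))
    return [urls for _, urls in clusters]
```

A real system would use something more scalable, such as minhash or simhash fingerprints, rather than pairwise shingle comparisons, but the clustering-and-representative structure is the same.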
Continue reading How Google Might Filter Out Duplicate Pages from Bounce Pad Sites
Earlier this year, Google acquired the patents of a real-time search engine started in 2009, Wowd (a play on the word “crowd”). Wowd had no web crawlers, but rather relied upon users downloading a browser application, so that every page they visited was nominated for inclusion in search results. A press release from February 2010 tells us about the search engine:
Wowd is a real-time search engine for discovering what’s popular on the Web right now. Unlike other engines in the space, Wowd focuses on discovery and exploration of the entire Web, i.e. surfacing trends, breaking news, social media topics, and popular pages. Wowd then taps into the “attention frontier” of its user community to build real-time search results. Wowd makes it easy to discover the latest trends, topics, and hottest Web pages.
In August of last year, Wowd released a search tool for Facebook, adding a number of features to the Facebook experience, including custom feeds, game spam blocking, and social search. A look at the Wowd website, however, tells us that “the team has decided to pursue new opportunities,” with some members of the engineering team joining Facebook. There’s no date on the message.
Continue reading Wow! Google Acquires Wowd Search Patents
A new patent filing from Yahoo raises the question, “How much has social media influenced the expectations of searchers, and forced search engines to change?”
Before I can begin to even think about that, I have to ask whether looking at Yahoo patents is even a good idea after their 2009 deal with Microsoft to have Bing power their search results.
The Yahoo patent application was filed after the agreement between Yahoo and Microsoft, and was published last week. Are Yahoo patents still worth spending time with? After reading through the Yahoo patent application about how the search engine might use information from social media platforms to discover recently hot topics and webpages that are relevant to those topics, I would say that they are. The terms of the agreement between Yahoo and Bing include a 10-year exclusive right for Microsoft to use search technologies developed by Yahoo, but don’t stop Yahoo from applying those technologies itself.
The patent filing explores “recency-sensitive” queries, where searchers are looking for resources that are both topically relevant and fresh, such as novel information about an earthquake. If you’ve been watching Twitter streams, Facebook updates, and other social media, you’ve seen that sometimes these sources are the best and fastest places on the Web to find that kind of information.
It’s possible that a search engine that ignores sources like those isn’t going to be able to return any relevant results for those types of queries – what the patent’s inventors call a “zero recall” problem.
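To make the “zero recall” idea concrete, here is a hypothetical sketch (not taken from the patent): if the ordinary index comes up empty for a query while social streams show a burst of recent posts matching its terms, the query can be flagged as recency-sensitive so fresh social results can be surfaced instead. The window and burst threshold are assumed values.

```python
# Hypothetical sketch of a "zero recall" fallback: an empty index result
# plus a burst of matching recent social posts flags the query as
# recency-sensitive.

import time

def is_recency_sensitive(query, index_hits, social_posts,
                         now=None, window_secs=3600, burst_min=5):
    """index_hits: results from the ordinary index (may be empty).
    social_posts: list of (timestamp, text) tuples from social streams.
    Returns True when the index comes up empty but social chatter is spiking.
    """
    now = time.time() if now is None else now
    terms = set(query.lower().split())
    recent_matches = sum(
        1 for ts, text in social_posts
        if now - ts <= window_secs and terms & set(text.lower().split())
    )
    return len(index_hits) == 0 and recent_matches >= burst_min
```

A production system would of course score freshness and relevance jointly rather than using a binary switch, but this captures the triggering condition the inventors describe.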
Continue reading Do Search Engines Use Social Media to Discover New Topics?
I love local search. It follows many practices similar to Web search, though it often differs in ways that reflect an attempt to map the real world. Google’s Street View cars are a little like Google’s web crawler, Googlebot. Instead of collecting URLs for websites, Google Maps collects addresses to associate with businesses, nonprofits, government offices, parks, landmarks, and many other destinations. It has its own challenges as well, such as the Street View car being turned away at sentry guard booths for military bases, or not being able to drive down “private” roads. Google Maps also can’t use latitude and longitude coordinates in places like China, since their export and use is classified by that country as if they were munitions.
I’m also often frustrated by local search. Driving directions from Google often begin by telling you to go “east” or “west” on your first turn. I’m not Mason or Dixon, Lewis or Clark, and I don’t carry an in-car compass with me when I drive. Other than that, I often have no problems with the directions for the first 99% of the trip, only to run into trouble in the last few hundred feet.
Continue reading GPS to Correct Google Maps and Driving Directions as a Local Search Ranking Factor?
Might Google start providing more link options in Google Instant Previews as a result of this acquisition?
A company that filed a patent infringement lawsuit against Google in 2007, on the day that their last patent was granted, has now assigned all of their patents to Google. The flowchart below is from one of their patents and shows multiple link options available when someone hovers over a link.
The company, iLOR, LLC, applied for a preliminary injunction against Google’s Notebook application; Google successfully filed a motion for summary judgment to terminate the claims against it, and was awarded around $660,000 in attorney’s fees. When that fee award was reversed on appeal, the case set a new standard (pdf) for when attorney’s fees should be awarded in patent infringement cases.
Continue reading Google Acquires iLOR Patent Used to Sue Google
Google seems to be making a regular habit of acquiring patents from IBM, with a new acquisition of 39 granted patents and two pending patent applications on September 30th, recorded at the USPTO today. Like the earlier transactions this year of 1,030 patents transferred in May, and 1,023 patents assigned in August, there’s a wide range of technology included in the transaction between Google and IBM.
The list of patents includes one filed in 1996 involving the use of an API and a Java applet, which sounded pretty interesting (I listed it first), especially considering that the ongoing Oracle-Google litigation involves Java and APIs. Some of the other patents included are listed in that patent as being related to it. Other inventions cover such things as file archiving approaches, distributed database information systems, encryption, user authentication, and managing configurations of computer systems.
Google and Oracle are set to go to trial on October 31st on claims that Google infringed Java-related patents held by Oracle, in which Oracle is claiming more than $1 billion in damages.
Continue reading Google Acquires More IBM Patents In September
One of the things that’s clear about how search engines work is that when they find a link pointing to a page using certain anchor text, that page might be seen to be a little more relevant for the text found in that link. Google pointed that out in one of the earliest white papers about how the search engine works:
This idea of propagating anchor text to the page it refers to was implemented in the World Wide Web Worm [McBryan 94] especially because it helps search non-text information, and expands the search coverage with fewer downloaded documents. We use anchor propagation mostly because anchor text can help provide better quality results. Using anchor text efficiently is technically difficult because of the large amounts of data which must be processed. In our current crawl of 24 million pages, we had over 259 million anchors which we indexed.
- The Anatomy of a Large-Scale Hypertextual Web Search Engine
But one of the assumptions that many make is that each link, with its anchor text, is as important as any other link, and that if a page has lots of links pointing to it containing certain anchor text, it will rank more highly for the terms found in that text than it otherwise would in the absence of those links.
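A minimal sketch can show what it means to weigh anchors differently rather than counting every link equally. This is an assumed, simplified model (not the formula from any patent or from Google): each anchor's contribution to the target page's relevance for its terms is scaled by some authority score of the linking page.

```python
# Simplified, assumed model: weight each anchor's terms by the linking
# page's authority score instead of counting every link the same.

from collections import defaultdict

def anchor_relevance(links, authority):
    """links: list of (source_url, anchor_text, target_url) tuples.
    authority: dict mapping source_url -> a score such as PageRank.
    Returns {target_url: {term: weighted relevance score}}.
    """
    scores = defaultdict(lambda: defaultdict(float))
    for source, anchor, target in links:
        weight = authority.get(source, 0.1)  # small default for unknown pages
        for term in anchor.lower().split():
            scores[target][term] += weight
    return scores
```

Under a model like this, a single anchor from a highly authoritative page can outweigh many identical anchors from low-authority pages, which is the kind of differential weighting the post goes on to discuss.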
Continue reading How a Search Engine might Weigh the Relevance of Anchor Text Differently