Advertising through a search engine is a little more complicated than just “highest bidder wins.” A recent Google patent application, a video from one of its inventors, and a couple of older patent applications that he co-authored show some of the complexities that an advertiser may face when wanting to advertise through a system like Adwords.
Back in October, 2005, Dr. Hal Varian gave a presentation on the advertising model at Google to a class at UC Berkeley. At the time, he had been working with Google’s Adstats team for approximately 3 1/2 years, as a consultant. It’s a nice introduction to contextual based ads at Google.
Dr. Varian is also listed as one of the inventors on a number of patent applications from Google describing some of the decision-making processes that may be involved in determining which types and configurations of ads show up on content pages, and the cost of showing ads on content pages based upon ad and document scores. I’ve included links to those, and short introductions to them below.
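To make the "more than highest bidder wins" point concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that each ad carries a bid and a quality score between 0 and 1, and that ads are ordered by the product of the two. The field names and the scoring rule are my assumptions, not something taken from the patent applications.

```python
def rank_ads(ads):
    """Order ads by bid multiplied by a quality score.

    Both the "bid" and "quality" fields and the product rule are
    illustrative assumptions, not a description of Google's system.
    """
    return sorted(ads, key=lambda ad: ad["bid"] * ad["quality"], reverse=True)


ads = [
    {"name": "high bid, low quality", "bid": 2.00, "quality": 0.2},
    {"name": "lower bid, high quality", "bid": 1.00, "quality": 0.9},
]

for ad in rank_ads(ads):
    print(ad["name"], round(ad["bid"] * ad["quality"], 2))
```

Under these assumptions the lower bidder is listed first, because its quality score more than makes up for the smaller bid.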
Continue reading “Google Ad Configurations, Quality Scores, and Ad Types”
Knowing something about the language used in a query might help a search engine decide which pages to show a searcher. A search engine wants to lead its users to pages they can read. A recent Microsoft patent application explores how language types can be used in ranking pages in search results.
Language types can be seen as a measure of relevance because they can help find pages relevant for a search. They are considered a “query-dependent” measure of relevance, because while the language type for a page can be identified before anyone performs a search that might include the page, the language used in the query influences which results are shown.
Query-independent measures, or attributes, are different. I wrote previously about a couple of other Microsoft patent applications which this one notes are related, in a post titled Ranking Search Results by File Type and by Click Distance.
Those two measures are considered “query independent,” because the words used in a query that might return those pages are irrelevant to the ranking method.
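The difference between the two kinds of measures can be sketched in a few lines of Python. Everything here is hypothetical: the `Page` fields, the file type weights, the mismatch penalty, and the way the two scores are multiplied together are assumptions made for illustration, not details from the Microsoft patent applications.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    language: str        # known before any query is run
    click_distance: int  # clicks from a trusted starting page
    file_type: str


# Hypothetical weights; a real system would tune or learn these.
FILE_TYPE_WEIGHT = {"html": 1.0, "pdf": 0.8}


def query_independent_score(page):
    """Depends only on the page, so it can be computed at index time."""
    weight = FILE_TYPE_WEIGHT.get(page.file_type, 0.5)
    return weight / (1 + page.click_distance)


def query_dependent_score(page, query_language):
    """Depends on the query: pages in another language are penalized."""
    return 1.0 if page.language == query_language else 0.25


def rank(pages, query_language):
    """Sort pages by the product of the two scores, best first."""
    return sorted(
        pages,
        key=lambda p: query_independent_score(p)
        * query_dependent_score(p, query_language),
        reverse=True,
    )


pages = [
    Page("example.com/en", "en", 1, "html"),
    Page("example.com/fr", "fr", 0, "html"),
]
print([p.url for p in rank(pages, "en")])
print([p.url for p in rank(pages, "fr")])
```

The same two pages come back in a different order depending on the language of the query, even though their query-independent scores never change.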
Continue reading “Penalizing Pages in Search Results Based upon Language (Except English)”
I’m only doing this because:
+ It’s fun.
+ The way Andy titled his (“New Year’s Resolution – Be a Super Hero”)
+ Even David Dalka is doing it.
+ Never linked to Jason Calacanis before.
+ Michael Arrington notes that the site has had more than 12 million hits since launch. (A little more than a post about a Google patent).
+ I kinda like Spiderman’s attitude…
You are Spider-Man
I had heard about new layers being added to Google Earth a few weeks back, but really hadn’t explored them. I’m glad I spent some time looking more deeply at Google Earth tonight. There are a lot of potential things that web designers and developers can do with Google Earth that look like they could be fun, and could attract some visitors to a site.
Checking through new videos in the Google Tech Talk series, I noticed one on Google Earth I hadn’t seen before from November 21, 2006. There’s a lot of stuff going on with Google Earth that I wasn’t aware of, and the layers are definitely worth exploring. The current events section of the Google Earth Community is an awesome feature, and the way that some environmental groups are using Google Earth is worth looking at.
Jessica Pfund has been exploring and working on Google Earth for almost a year, and she introduces us to some of the things we can do with Google Earth (after we find our homes and maybe where we work), in the Google Video, Google Earth: Beyond Your Backyard.
Continue reading “Google Earth Video, Layers, Environmental Activism, and Links”
How Phrase Based Indexing Works
Imagine an information retrieval system that uses phrases to index, search, rank, and describe documents on the web. This system would look at the way those phrases were used across the web to decide if they were “valid” or “good” phrases. In addition to considering whether those phrases were used frequently enough across webpages to be statistically significant, it would also look at how those phrases might be related to each other: certain phrases tend to be mentioned in the same documents as other phrases.
For example, a document that talks about the “President of the United States” may also be likely to include the phrase “white house.” So the appearance of some phrases can be used to predict the appearance of other phrases. And a spam document might contain an excessive number of related phrases.
Some “spam” pages have little meaningful content, but may instead be made up of large collections of popular words and phrases. These are sometimes referred to as “keyword stuffed pages.” Similar pages containing specific words and phrases that advertisers might be interested in are often called “honeypots,” and are created for search engines to display along with paid advertisements. To searchers looking for meaningful content, those pages can be a waste of time, and a cause of frustration.
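Here is a rough sketch of how phrases that predict each other, and pages with too many of them, might be found. It assumes a tiny in-memory collection of documents, plain substring matching, and a simple "lift" ratio (how often two phrases appear together compared with what chance would predict). The function names, the thresholds, and the lift measure are illustrative assumptions and not the method described in the patent application.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(docs, phrases):
    """Count documents containing each phrase and each pair of phrases."""
    single = Counter()
    pair = Counter()
    for doc in docs:
        text = doc.lower()
        # Naive substring matching; a real system would tokenize.
        present = sorted(p for p in phrases if p in text)
        single.update(present)
        pair.update(combinations(present, 2))
    return single, pair


def related_phrases(docs, phrases, min_lift=1.5):
    """Map each phrase to the phrases it co-occurs with more than chance."""
    single, pair = cooccurrence_counts(docs, phrases)
    total = len(docs)
    related = {p: set() for p in phrases}
    for (a, b), together in pair.items():
        expected = single[a] * single[b] / total
        if together / expected >= min_lift:
            related[a].add(b)
            related[b].add(a)
    return related


def looks_stuffed(text, related, max_related=2):
    """Flag text where one phrase appears with too many related phrases."""
    lowered = text.lower()
    for phrase, others in related.items():
        if phrase in lowered:
            hits = sum(1 for other in others if other in lowered)
            if hits > max_related:
                return True
    return False


docs = [
    "The President of the United States spoke at the White House.",
    "The president of the united states returned to the white house.",
    "A recipe for apple pie.",
    "Apple pie with ice cream.",
]
phrases = ["president of the united states", "white house",
           "apple pie", "ice cream"]
related = related_phrases(docs, phrases)
print(related["president of the united states"])
```

With a realistic number of related phrases per topic, `max_related` would be set well above what an ordinary page contains, so that only pages packed with related phrases get flagged.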
Continue reading “Phrase Based Indexing and Spam Detection”
PageRank is a measure of the popularity of a page, yet it has a flaw according to some researchers. The problem is that newer pages haven’t had the chance to be viewed, and linked to, as often as older pages that have already collected more links.
How would a problem like this be overcome? One way might be to determine how likely it would be that someone who viewed the newer page would link to it, and use a “future” measure of PageRank to return results to searchers.
In a paper by Junghoo Cho, Sourashis Roy, and Robert E. Adams, Page Quality: In Search of an Unbiased Web Ranking (pdf), the researchers try to address this problem. Here’s how they describe it:
In a number of recent studies researchers have found that because search engines repeatedly return currently popular pages at the top of search results, popular pages tend to get even more popular, while unpopular pages get ignored by an average user. This “rich-get-richer” phenomenon is particularly problematic for new and high-quality pages because they may never get a chance to get users’ attention, decreasing the overall quality of search results in the long run.
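The idea of a "future" measure of PageRank can be sketched as a simple calculation: take a page's current score and add a bonus for how quickly that score has been growing. The formula, the `growth_weight` constant, and the sample numbers below are assumptions made for illustration; they are not copied from the paper.

```python
def estimated_quality(pr_then, pr_now, growth_weight=0.1):
    """Current PageRank plus a bonus for its relative growth.

    pr_then and pr_now are PageRank values from two snapshots.
    growth_weight is a hypothetical tuning constant.
    """
    if pr_then <= 0:
        raise ValueError("pr_then must be positive")
    relative_growth = (pr_now - pr_then) / pr_then
    return pr_now + growth_weight * relative_growth


# An established page whose score has stopped changing.
old_page = estimated_quality(0.5, 0.5)

# A new page with a small score that is growing quickly.
new_page = estimated_quality(0.01, 0.05, growth_weight=0.2)

print(old_page, new_page)
```

In this made-up example the new page ends up with the higher estimate, even though its current PageRank is a tenth of the older page's, because its score is rising so fast.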
Continue reading “Using Page Quality to Overcome Bias in Ranking Newer Sites”