Google Notebook Released

I downloaded the Google Notebook browser extension about twenty minutes ago, and have been trying it out.

In case you didn’t hear about Google Notebook yet, it’s a new tool from Google announced last week during the Google Press Day, but not planned to be released until this week.

The idea behind it is that you can use it to take notes about web pages, copy snippets from those pages, and keep them in notebooks, which you can keep private or make accessible to the public. A link to the page where you found the material makes it easy to return to the source of the information.

Notebooks can be organized into sections, and can contain images as well as text. The program can be accessed from more than one computer, which means that the information contained within it is stored by Google rather than on your own computer.

I really like the way that the mini notebook and the full-page notebook work together. As a tool for tracking information on the web, it’s pretty useful. I could see some value in using it as a work tool when looking at a site and considering rewriting content on the pages of that site, or in writing notes for a blog post, article, or paper.



Trust and the Internet: Search Engine Bias

Trust is essential in our reliance on search engines. But we should understand some of the risks in placing too much trust in search results.

There’s the possibility of bias in what search engines show people, based upon the engines’ business practices and operating policies, limitations in indexing and ranking algorithms, and political and cultural pressures placed upon them.

When I think of conferences like the one to be held next week in Edinburgh, Scotland, during the 15th Annual World Wide Web Conference, I don’t expect to see presentations that are critical of search engines. But, during a workshop on Models of Trust for the Web, there’s a paper being presented that takes a close look at search engine bias, from a couple of researchers at Yuan Ze University in Taiwan.

Position Paper: A Study of Web Search Engine Bias and its Assessment (pdf) by Ing-Xiang Chen and Cheng-Zen Yang

The authors of this paper describe in more detail the three different sources of bias that I mentioned above. How could business practices shape the bias of search engines?


Testing Google and Yahoo Alerts

I’ve been using Google Alerts for the past year or so to stay on top of a handful of topics, and I decided this weekend that it might be worth expanding their use a little more.

So, I added about ten terms that I’m interested in tracking to my alerts list for Google.

And then, I decided that it might be fun to try out Yahoo Alerts also, and compare what the two services provide.

My experience with Google Alerts has been interesting so far. With some news articles, the alerts I’m sent have been fairly timely. But every so often, I see an alert pointing to a page that’s more than a year old. When I see that, I wonder if Google has just discovered the page, and noticed in some vast database that they hadn’t sent me a copy of it yet.

I haven’t searched to see if someone has tried this already, but it might be fun to keep track of what links I’m provided with, and compare the two alert systems over a period of a few weeks or months. How old are the pages that I receive an alert for? How many links am I provided per term over the length of time, and how many do I receive each day from both search engines?
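If I were to keep that tally in a script rather than on paper, a minimal sketch might look something like the one below. The Alert record, its field names, and the summarize function are my own assumptions for illustration, not anything either service provides; I'm assuming each entry would be logged by hand from the alert emails, with the page's publication date filled in as best as it can be determined.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from statistics import median


@dataclass(frozen=True)
class Alert:
    """One link received in an alert email (hypothetical record layout)."""
    service: str     # e.g. "google" or "yahoo"
    term: str        # the tracked search term
    url: str
    received: date   # day the alert arrived
    published: date  # best guess at when the page was published


def summarize(alerts):
    """Group alerts by (service, term) and report link counts and page ages."""
    groups = defaultdict(list)
    for alert in alerts:
        groups[(alert.service, alert.term)].append(alert)

    summary = {}
    for key, items in groups.items():
        # Age of each page, in days, on the day its alert arrived.
        ages = [(a.received - a.published).days for a in items]
        # Number of distinct days on which at least one alert arrived.
        days_with_alerts = len({a.received for a in items})
        summary[key] = {
            "links": len(items),
            "links_per_alert_day": len(items) / days_with_alerts,
            "median_age_days": median(ages),
            "oldest_age_days": max(ages),
        }
    return summary
```

Calling summarize on the logged list returns one entry per service and term, so the Google and Yahoo numbers for the same term can be read side by side. Note that the per-day figure only counts days on which an alert actually arrived, which seemed the simplest choice for a first pass.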



Trust and the Internet: Web Search Spam

Trust is a topic that has a profound effect upon the way search engines work on the web.

How easy or difficult is it to come up with methods that don’t rely (much) on human judgment to identify spam-free pages that can be trusted, and to locate pages that are intended solely to rank well in search engines without providing any value at all for visitors, except possibly ads that are on the topic of their search?

In a week, there will be a gathering in Edinburgh, Scotland, during the 15th Annual World Wide Web Conference, on the subject of Models of Trust for the Web. While I won’t be attending, it sounds like an interesting presentation, and I wanted to take a look at some of the papers written by presenters at the conference. In this post, I’ll be looking at one of the papers to be presented, and listing some of the other work by its authors.

Problems with Yahoo’s Trustrank Assumptions



On a Hypertext Roadtrip

Came across a lot of interesting stopping points on my travels around the web over the last few days, some fun stories, and some thoughtful musings…

Favorite title, and analogy, Please Stop With Your Chinese Math, reminded me of all the meetings I’ve been in where I’ve inadvertently rolled my eyes at some statistics, and hoped that no one noticed.

Book on the Science of Google Rankings – Probably has too much math for my tastes, but I’m going to have to get a copy after reading their Deeper Inside Pagerank to see where they pick up the storyline. I hope they don’t kill off any of the main characters.

LEGO’s Incredible Marketing Strategy (yes, legos and marketing are a great match)



Does Google use whois information?

Some recently published patent applications from Go Daddy explore whether additional whois information might help reduce spam and phishing, and improve search engine results. Google noted in a patent application last year that they might be looking at whois information while presenting and ranking pages.

I don’t know how easy it would be to set up the processes described by Go Daddy, or verify the reputation information that they describe, and maintain the records the system would depend upon.

The purpose of whois information

But it might be a moot point to even wonder. A recent decision by the folks at ICANN to limit the use of whois information makes it seem unlikely that the scenarios envisioned by these documents will happen. ICANN’s Generic Names Supporting Organization held a vote in which they decided upon the sole purpose of whois information:



Google on Improving Adsense/Adwords

What are the best ways to pay someone for displaying ads on their sites? What are the easiest ways for people to find sites that they may want to have their ads placed upon? What information should be shared with advertisers about the sites that they might want to advertise upon, or have chosen to place ads on?

Adsense and Adwords are two sides of a content-based advertising system used by Google, and are amongst the methods that the search engine relies upon to make money. One of the main issues that faces Google is finding ways to make it easy to match up Adwords advertisers with the sites of people who display Adsense ads.

A new patent application from the Mountain View-based search engine describes a method to help people looking to place ads with sites that are rich in content, have a lot of traffic, and are good prospective advertising hosts.

The patent filing is Determining prospective advertising hosts using data such as crawled documents and document access statistics (US Patent Application 20060095322), and lists Timothy Matthew Dierks as its inventor. It was originally filed on November 3, 2004, and published on May 4, 2006, and appears in the USPTO assignment database as being assigned to Google.



Microsoft Patents Dynamic Ranking Changes


I spent too much time this past weekend paying attention to the NFL draft. Television coverage of the two day event really isn’t “must see TV,” but there were some surprises. One of them involved the fourth pick of the draft.

According to the New York Daily News, the Jets view left tackle D’Brickashaw Ferguson as the infrastructure for their offense, which Matt Leinart was supposed to be a part of. The Jets were working the phones trying to move back into the top 10 to get the USC quarterback after selecting Ferguson.

The Jets got their lineman, but missed out on the marquee name quarterback. It wasn’t an exciting choice, but probably a good move. We’ve been hearing for months about changes to the infrastructure of Google, which is almost equally exciting. You know the lineman is going to help the team a lot, but you really wish they had picked that flashy quarterback or speedy running back.

There’s nothing quite like a good infrastructure on a search engine. It isn’t quite the same as an update, but it opens up a lot of possibilities.



Getting Information about Search, SEO, and the Semantic Web Directly from the Search Engines