A new patent application from Microsoft considers ways to present search results to searchers in clusters, with meaningful names.
Published on February 2, 2006, it was originally filed on July 13, 2004, and is assigned to Microsoft Corporation.
Query-based snippet clustering for search result grouping
Inventors: Hua-Jun Zeng, Qicai He, Guimei Liu, Zheng Chen, Benyu Zhang, and Wei-Ying Ma
US Patent Application 20060026152
A clustering architecture that dynamically groups the search result documents into clusters labeled by phrases extracted from the search result snippets. Documents related to the same topic usually share a common vocabulary. The words are first clustered based on their co-occurrences and each cluster forms a potentially interesting topic. Keywords are chosen and then clustered by counting co-occurrences of pairs of keywords. Documents are assigned to relevant topics based on the feature vectors of the clusters.
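To make the abstract a little more concrete, here is a rough sketch of the general idea as I read it: pick keywords from the snippets, count how often pairs of keywords appear together, group keywords that co-occur often into topics, and then assign each snippet to the topic it overlaps most. This is my own simplified illustration, not the method claimed in the patent application. The function names, the stopword list, the thresholds, the decision to ignore the query words, and the plain overlap count (standing in for the feature vectors the abstract mentions) are all assumptions.

```python
from collections import Counter
from itertools import combinations

# Tiny illustrative stopword list (an assumption, not from the patent filing).
STOPWORDS = {"a", "an", "and", "in", "is", "of", "the", "to"}


def tokenize(text):
    """Lowercase, strip surrounding punctuation, drop stopwords; return a set."""
    words = (w.strip(".,;:!?\"'()").lower() for w in text.split())
    return {w for w in words if w and w not in STOPWORDS}


def cluster_snippets(query, snippets, min_docs=2, min_cooccur=2):
    """Group snippets under topics built from co-occurring keywords.

    Hypothetical helper: returns {topic keywords (tuple): [snippet indexes]}.
    """
    # Ignore the query words, since they appear in nearly every snippet
    # and would otherwise link all keywords into one topic.
    query_words = tokenize(query)
    docs = [tokenize(s) - query_words for s in snippets]

    # Keywords: words that appear in at least `min_docs` snippets.
    doc_freq = Counter(w for d in docs for w in d)
    keywords = {w for w, n in doc_freq.items() if n >= min_docs}

    # Count how many snippets contain each pair of keywords.
    pair_counts = Counter()
    for d in docs:
        pair_counts.update(combinations(sorted(d & keywords), 2))

    # Merge keywords that co-occur often enough (simple union-find).
    parent = {w: w for w in keywords}

    def find(w):
        while parent[w] != w:
            parent[w] = parent[parent[w]]
            w = parent[w]
        return w

    for (a, b), n in pair_counts.items():
        if n >= min_cooccur:
            parent[find(a)] = find(b)

    groups = {}
    for w in keywords:
        groups.setdefault(find(w), set()).add(w)
    topics = sorted(tuple(sorted(g)) for g in groups.values())

    # Assign each snippet to the topic sharing the most keywords with it.
    clusters = {t: [] for t in topics}
    for i, d in enumerate(docs):
        best = max(topics, key=lambda t: len(d & set(t)), default=None)
        if best is not None and d & set(best):
            clusters[best].append(i)
    return clusters


snippets = [
    "Jaguar car dealer prices",
    "New jaguar car models and dealer listings",
    "Jaguar cat habitat in the rainforest",
    "The jaguar is a big cat of the rainforest",
]
print(cluster_snippets("jaguar", snippets))
```

With these four made-up snippets for the query "jaguar", the first two land under a "car, dealer" topic and the last two under "cat, rainforest". A real system would need better phrase extraction and label selection than this toy version offers.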
Some recent research I’ve been doing had me looking at the Infoseek search engine, and its part in the history of search engines. I remembered an old book I have on search engines which has a couple of chapters on Infoseek, and started to reread it.
The book is the Web Developer.com Guide to Search Engines, from February of 1998. It’s been a while since I’ve picked up a book about search engines which hasn’t mentioned Google. This one focuses upon the search engines on the web at that time, and on adding a search feature to your site.
I didn’t get much past the first section of the first chapter of the book, titled Bow Down and Give Thanks to Archie, before I hopped on the web and started looking at Archie’s role on the net. As it notes there:
The grandfather of all search engines was Archie, created in 1990 by Alan Emtage, a student at McGill University in Montreal.
I don’t just love Eric Weaver’s post “Direct Marketing: A Science of Stupidities” (no longer available) because he starts off his own list of ten steps to successful marketing with “become search friendly.”
I love it because he offers ten more suggestions that are spot on. And because he provides some great and snarky opinions on some other “best practices” of intrusive marketing.
Thanks to Anthony Garcia at Future Now for pointing it out.
Over at the Yahoo! Search Blog, there’s a nice opportunity to submit some questions about search technology to one of the giants in the field, Dr. Andrei Broder, who recently joined Yahoo!
Dr. Broder is the co-inventor of many interesting patents on search. The latest include one with Google’s Krishna Bharat on how to estimate the coverage of web search engines, and a somewhat different approach for ranking web page search results.
It was tempting to ask Dr. Broder which search engine he would estimate covers more of the web than others, but I instead asked about the Yahoo! patent I mentioned a couple of days ago, and where he might see the future of search headed.
I just got back from a business meeting, and I really have to say how much I like lunch or dinner meetings.
Meetings can be both informal and informative. The one I had today is a weekly event where a client and I get together, and talk about marketing strategy, blog posts, what’s happening on the web, business issues related to running an office, and so on. It works out well because we don’t pull any punches, and we hold serious discussions which lead to actions that can be easily taken.
These meetings usually require big tables, where we can spread out to jot down notes on a notepad, and write up a recap of some of the ideas and action items that come out during the meeting. They also give us an opportunity to scout out some of the many restaurants in the area.
They are quite a change from some of the meetings I’ve been in, in the past, with formal minutes of the meetings approved or amended, committees formed to determine who has the power and authority to undertake certain actions, so many people participating that people need name placeholders, and so many stakeholders involved that not everyone who attends is empowered to take actions.
Maybe there’s a little irony to the date that United States Patent 6,990,628, Method and apparatus for measuring similarity among electronic documents, was granted – January 24, 2006.
After all, that’s the day when many were saying that Yahoo! was giving up on matching or beating Google in the field of search, based upon some comments from the company’s Chief Financial Officer. Does this new patent assigned to Yahoo! hold hope for them to keep up, and maybe even surpass the present king of the search mountain, Google?
We may only find that out in the future, but it is an interesting document, and it covers a lot of ground. It’s worth poking through, and getting a sense of what it covers. A little more about the patent itself, first. The named inventors are Michael E. Palmer, Gordon Sun, and Hongyuan Zha. While it was granted on January 24, 2006, it was originally filed on June 14, 1999.
That file date may be a little misleading. From the patents and other documents referenced in the patent application (which I’ve provided links to at the end of this post), it appears that the document evolved over time from when it was originally filed until when it was granted this week.
You probably won’t catch me on American Idol anytime soon. I know the limitations of my voice, and I could envision the scowl on Simon Cowell’s face if I tried. No need to go there.
I pretty much love most types of music, with the possible exception of Gregorian chants. There are a handful of songs that I hold special though. These are the ones that even sound good to me, within the confines of my shower. In the Midnight Hour, as performed by Mr. Wilson Pickett, is one of those.
And it’s not just the singing of the song that gets to me, but the way the whole thing comes together, the hook that instantly appeals, and yet the improvisation that makes it unique. As Wilson Pickett would have it, “You harmonize; then you customize.”
Love this post in Design Observer – Wilson Pickett, Design Theorist, 1942 – 2006. Great points, and a sweet tribute.
South Korea is one of the most internet-advanced and connected countries in the world, yet Google has only 1.6 percent of the search traffic there. Why the lack of success in an endeavor where they’ve seen so much acceptance in many other places in the world?
Do a search for “Rain” in Google, and chances are good that you will get weather information. That’s true in the United States, and it’s also true in Korea. That’s part of the problem. There’s a famous singer in Korea whose name translates into “Rain.” Google’s results fail to turn up any information about this celebrity. Yet the information isn’t difficult to find on Korean sites.
The Korea Times uses that example in “Why Is Google Struggling in Korea?” (no longer available).
Google has been offering search in South Korea since 2001. But they haven’t been incorporating User Created Content (UCC) the way that local Korean search engines have.