I’ve wondered why an occasional post from here sometimes showed up in the news sources that appear in Google Finance. I now have a little clearer understanding of how they perform their news gathering.
If you use Google Finance, and want to know a little more about how it works, or are just interested in how Google might tackle providing information in a narrow field in a meaningful manner, you may want to check out two newly published patent applications from Google on their finance offering: Computing a group of related companies for financial information systems, and Interactive financial charting and related news correlation.
Both of them take deep looks at how to present financial information in ways that might make it easier for people to use, and to understand how news may impact stock prices. The two documents overlap a great deal, and share a detailed description and abstract. The abstract tells us:
Techniques are disclosed by which users looking for financial information about publicly traded or private companies may richly and interactively navigate both pricing and material news information about those companies. The techniques facilitate and encourage the user’s use and understanding of financial information presented. Related company information can also be provided to the user, where related companies are organized by hierarchal categories for a meaningful display.
There’s a body of what could be described as folklore surrounding how search engines work. These tales, or sometimes superstitions, may have a grounding in a comment made by a presenter from a search engine during a conference, or a statement made upon a search engine blog, or just an assumption that a search engine has to work a certain way in order to do some of the things that it does.
One of these that many have taken for granted is that a search engine could notice large shifts or changes on the Web, such as a site suddenly gaining lots and lots of pages, or outgoing links, or incoming links, which might increase its rankings in the search engines. I recall a Google representative at a conference I attended answering a question about how a search engine could notice such things, where he said that they could because they have “lots and lots of computers.”
A Yahoo patent application from last week, Using exceptional changes in webgraph snapshots over time for internet entity marking (US Patent Application 20070198603), provides some insight into how such changes could be flagged automatically, and also could “identify exceptional entities that exhibit abnormal attributes or characteristics due solely to their excellence and high quality.”
The abstract from the patent filing tells us:
Last week, I made a post introducing a newly granted patent from Google, Personalizing anchor text scores in a search engine (US Patent 7,260,573) which was filed in May of 2004.
In the midst of the Search Engine Strategies Conference, I didn’t have a chance to delve too deeply into the patent. I am returning to it, and to the context in which it was filed and granted. The Mad Hat has a nice overview of the processes involved in Personalized Anchor Text Score.
Let’s look at a little of the history, and some of the papers and ideas around at the time that it was filed.
The Role of Kaltix in Personalizing PageRank and Page Rankings
A little glimpse into my journey to the San Jose Search Engine Strategies Conference this past week.
I headed out to BWI airport outside of Baltimore, usually about an hour drive, and wondered if I would make my flight after an accident delayed traffic along the way. I hope the people involved in the collision are ok. I managed to get through security, and make it to the plane on time for my journey to Salt Lake City.
After my layover in Utah (wish I had a window seat, so that I could actually have seen the lake – did see the Bonneville Salt Flats from my aisle seat), I get on a plane headed for San Jose. I end up sitting next to a rocket scientist (the NASA documents and his killing time solving maths problems were a giveaway).
The plane arrives without any delays or problems, and I take a cab to the Airport Best Western (closer to the center of Santa Clara than San Jose), where I was planning on spending very little time over the weekend, before moving to the Fairmont in downtown San Jose.
When we talk about how a search engine like Google crawls and indexes information from websites, it’s often in the context of the Web results that the search engine shows to searchers.
Facts in Web Results
But, with Universal Search and blended search results showing information from local search, question answering, definitions, and others, it may make sense to start paying more attention to how the search engine is extracting facts from pages, creating “objects” from those facts, and ranking those objects.
In a post from last September, I went into a lot of detail on how a Google patent application focusing upon data practices with Local Search, titled Generating Structured Information, discussed how facts and information were taken from the Web and included in a local search repository.
Explosion of Patent Filings
If Google were to reward you for sending specific ads with your emails or blog posts or instant messages or forum posts to your friends and acquaintances, would you? Consider that the reward could be money, it could be a credit of some type, or it could be “an increased reputation ranking.”
If you are an advertiser, would you want your ads to be distributed in a manner like this?
Might metrics gathered from the effectiveness of User Distributed Ads (UDA) be used for “later ad serving arbitration?”
A series of patent applications from Google describe ways that people might send advertisements and search results to people they know via mail and messaging and blogging and forum posting, receiving rewards in return for doing so.
I wrote about User Distributed Search from Google in June, in a somewhat lengthy post titled Google’s User Distributed Search Results in Emails, IMs, Blogger, which breaks down how that works. The following patent applications expand a little upon how these concepts would work together:
Wish I had more time to break this newly granted Google patent down, but I have to run to catch the keynote address this morning at the SES conference. One of the authors of this granted patent was on a conference panel on personalization yesterday.
Personalizing anchor text scores in a search engine
Invented by Glen Jeh, Taher H. Haveliwala, and Sepandar D. Kamvar
Assigned to Google
United States Patent 7,260,573
Granted August 21, 2007
Filed: May 17, 2004
I’m presently in San Jose, getting ready for the Search Engine Strategies Conference coming up.
I was fortunate enough to attend BarCampBlock yesterday, which was a lot of fun. I met some interesting folks, and watched a thoughtful presentation on privacy on the Web, and was involved in a couple of discussions on “problems with search,” and on “corporations and their fear of blogging.”
One of the issues that came up in the search discussion, and one of the challenges that face search engines, is trying to understand what someone is searching for based upon their entry into a search box of just a word or two or three. It reminded me of a post that I’ve had in my queue, waiting to write about for a little while.
A paper presented at SIGIR 07, this last July, looks at using some additional information to enhance those words in a search of a collection of documents. I came across a video presenting the paper in a webcast at a venue in the UK a few days after the SIGIR conference, which adds some additional information about the paper. Both are worth a look to see how search terms might be expanded before a search is performed.