If you’ve spent any time on the Web, chances are that you’ve run into an entry on Wikipedia for a specific person or place.
A recent paper from a Microsoft researcher explores the idea of using pages like those, from the “free encyclopedia that anyone can edit,” to try to resolve any confusion about who is being referred to when someone performs a search for a person or place in a search engine.
The paper, Large-Scale Named Entity Disambiguation Based on Wikipedia Data (pdf), from Silviu Cucerzan, describes how agreement between documents and the context from Wikipedia articles involving people and places might help a search engine understand which person or place is being referred to in those documents. Wikipedia’s category tags may also help.
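To make the idea of "agreement" between a document and Wikipedia context a little more concrete, here is a minimal sketch in Python. It is not the method from the paper: the candidate entities, their context words, and their category tags are all made up for illustration, and the scoring is simple word overlap.

```python
import re

# Hypothetical candidates: the context words and category tags below are
# invented stand-ins for what might be extracted from Wikipedia articles.
CANDIDATES = {
    "Columbia (Space Shuttle)": {
        "context": {"shuttle", "orbiter", "mission", "astronauts", "launch"},
        "categories": {"spacecraft", "nasa"},
    },
    "Columbia University": {
        "context": {"university", "students", "campus", "professor", "degree"},
        "categories": {"universities", "education"},
    },
}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def disambiguate(document, candidates=CANDIDATES):
    """Score each candidate by how many of its context words and
    category tags also appear in the document; return the best one."""
    words = tokenize(document)
    scores = {}
    for name, info in candidates.items():
        context_overlap = len(words & info["context"])
        category_overlap = len(words & info["categories"])
        scores[name] = context_overlap + category_overlap
    best = max(scores, key=scores.get)
    return best, scores


best, scores = disambiguate(
    "NASA said the shuttle Columbia carried seven astronauts on its mission."
)
print(best, scores)
```

A real system would need far richer context than a handful of words per entity, but the sketch shows why a page about astronauts and missions would point toward one "Columbia" rather than another.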
Some great examples in the document, such as this one:
Continue reading Can Web Search Use Wikipedia to Understand References to Names?
The Web isn’t a static place, where pages remain the same, as search engines try to index and lead searchers to information.
A new patent application from Ask.com explores this stream of data, and trends within it, and how those can be used to improve search rankings and advertisements, as well as to supply searchers with relevant and up-to-date content.
I would suspect that similar inquiries into trends and burstiness of information happen at other search engines, too.
System and method for monitoring evolution over time of temporal content
Invented by Antonino Gulli, Filippo Tanganelli, and Antonio Savona
Assigned to Ask Jeeves, Inc.
US Patent Application 20070143300
Published June 21, 2007
Filed: December 20, 2005
Continue reading Ask.com on Trends, Freshness, Personalization, and Better Search Results
Imagine surfing the Web, and being able to look at what other Web sites or other visitors wrote about the site you’re visiting.
For example, someone might be viewing a manufacturer’s web page relating to a product they are interested in purchasing.
A past effort at Web annotation was the Third Voice browser plug-in, which let people post public notes about a site that could be seen by other Third Voice users. Many of those notes ended up being spammy and/or inappropriate. Click on the image below for a larger version of Google’s potential approach.
This invention would let people receive summaries of blog posts linking to the Web site being visited. Those people could also perform a Web search or blog search through a search engine requesting documents relevant to the site.
Continue reading Blog-Based Annotations for Web Sites
A thoughtful and intelligent article from Shannon Watters at Digital Web this week, How to Choose an eCommerce Package, offers some great suggestions on what to look for when choosing software for an online shop offering goods or services or both.
Shannon writes about what she calls the “top eleven things to consider when choosing an eCommerce package,” and it’s difficult to argue with her selections, but I was hoping for an even dozen things to consider – with an addition of how the ecommerce software might interact with search engines.
Of course, an ecommerce system should be easy to use for both shopper and site owner. Updating the software, and adding functionality from third-party toolmakers, should be a breeze. The software should be able to scale with growth, and it should be easy to use with an analytics package, so that you can measure your traffic and see how visitors use the site.
Shannon’s suggestions regarding promotions and discounts and the ability to offer customer service are spot on. Security is essential, and an intuitive checkout process will be a major determinant of whether visitors become customers. Her opinions on open source options, and on the community and company behind a software package, are filled with thoughtful suggestions.
Continue reading When Choosing an eCommerce System, Remember the Search Engines
When you type a domain name into your browser address bar and the domain isn’t found, sometimes you’ll be served a search results page that has advertisements and links related to a “subject” for that domain name.
For example, you might type “usedrugs.com” into the address bar, and there may not be a website at the domain name “usedrugs.com”. You may be redirected to a third-party website, with advertisements and/or links relevant to that domain name. Ads might be shown for the phrase “used rugs” on that web page, if it is determined to be the most likely segmented version of the string of text from the domain name.
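Here is a rough sketch of how a "most likely" segmentation might be picked. It is an illustration only, not the approach in the patent filing: the word frequencies are invented, and the result depends entirely on them. The code assumes a bare domain such as "usedrugs.com" with no "www." prefix.

```python
import math

# Hypothetical word counts; a real system would draw these from a large
# corpus or query log. The numbers here are invented for the example.
WORD_FREQ = {"used": 50, "rugs": 20, "use": 80, "drugs": 5,
             "us": 60, "ed": 1, "rug": 25}
TOTAL = sum(WORD_FREQ.values())


def segment(text):
    """Return the most probable split of text into known words,
    or None if no split into known words exists."""
    n = len(text)
    # best[i] holds (log-probability, words) for the best split of text[:i]
    best = [None] * (n + 1)
    best[0] = (0.0, [])
    for end in range(1, n + 1):
        for start in range(end):
            word = text[start:end]
            if best[start] is None or word not in WORD_FREQ:
                continue
            score = best[start][0] + math.log(WORD_FREQ[word] / TOTAL)
            if best[end] is None or score > best[end][0]:
                best[end] = (score, best[start][1] + [word])
    return best[n][1] if best[n] else None


def domain_to_phrase(domain):
    """Turn a bare domain name into a phrase that ads could be matched to."""
    label = domain.lower().split(".")[0]
    words = segment(label)
    return " ".join(words) if words else label


print(domain_to_phrase("usedrugs.com"))
```

With these made-up counts, "used rugs" scores higher than "use drugs" or "us ed rugs". Different frequency data could easily tip the result the other way, which is why the quality of the underlying statistics matters so much.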
Some sites might be filtered from search results because their domain names seem to indicate adult-related material.
For example, a domain name, such as “mikesexpress.com”, could be filtered out of search results by an adult filter, because the word “sex” appears in the string of characters.
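The difference between a naive substring check and a filter that segments the domain name first can be sketched as follows. The vocabulary, the blocked-term list, and the "fewest words wins" rule are all assumptions made for this example; they are a crude stand-in for a real likelihood model, not a description of any search engine's filter.

```python
# Hypothetical blocked terms and vocabulary, invented for illustration.
BLOCKED = {"sex"}
VOCAB = {"mike", "mikes", "express", "sex", "press", "ex"}


def naive_filter(domain):
    """Flag the domain if any blocked term appears anywhere in its name."""
    label = domain.lower().split(".")[0]
    return any(term in label for term in BLOCKED)


def segmentations(text, vocab=VOCAB):
    """Yield every way to split text into vocabulary words."""
    if not text:
        yield []
        return
    for i in range(1, len(text) + 1):
        head = text[:i]
        if head in vocab:
            for rest in segmentations(text[i:], vocab):
                yield [head] + rest


def segmentation_aware_filter(domain):
    """Flag the domain only if its preferred segmentation contains a
    blocked word. Preferring the split with the fewest words is an
    assumption standing in for a proper probability estimate."""
    label = domain.lower().split(".")[0]
    splits = list(segmentations(label))
    if not splits:
        # No known segmentation: fall back to the substring check.
        return naive_filter(domain)
    best = min(splits, key=len)
    return any(word in BLOCKED for word in best)


print(naive_filter("mikesexpress.com"))
print(segmentation_aware_filter("mikesexpress.com"))
```

The substring check flags "mikesexpress.com", while the segmenting version prefers "mikes express" over "mike sex press" and lets it through.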
Continue reading Improving Text Segmentation for Displaying Advertisements and Filtering Search Results
When you perform searches in Google, sometimes you will see within the listed search results one which has a plus sign next to it. If you click upon the plus sign, you are shown some more information. A new patent application published at the USPTO website provides some information about how expanded and collapsed data in search results might work.
It also raises some questions about how much information search engine results should actually show.
These types of results may have first appeared as a way to display maps and local business information for specific businesses. Google referred to this feature in their Webmaster Help pages as a Plus Box (though the link is no longer live):
The address link shown below some sites in our search results (in an expandable area called a Plus Box) is meant to help searchers locate businesses and compare search results. We show the address link for results that are local in nature and for which we have an associated address. If we don’t have an address for your business, or we don’t think that an address is relevant to your site we won’t show it.
Continue reading Google Plus Box Patent Application
One of the most talked about patent applications from Google over the past couple of years was one which looked at how time might be incorporated into a system of ranking documents, and how time might help the search engine recognize when people might be attempting to manipulate (spam) search results.
The patent application was published in March of 2005 – Information retrieval based on historical data
Over the past couple of months, Google has had some new patent applications published which share a good amount of the description of that original patent filing, but contain new and modified claims.
Can Web traffic information help to improve the relevancy of search results?
Should a search engine learn about how to rank a page by watching searchers use other search engines?
Can information gathered from Internet Service Providers (ISPs) and Web proxies be used to construct a near real-time map of the Web that folds information about that traffic into a ranking system for those pages?
A new patent application from the Pisa-based research team at Ask.com explores these topics and a few more, and suggests ways to improve the freshness, coverage, ranking and clustering of search results through looking at user Web traffic data.
Why Look at Web Traffic Information?
There are three basic tasks that a search engine will normally perform. It will:
Continue reading Calculating Search Rankings with User Web Traffic Data