Using Anchor Text to Find Documents in Other Languages

Google was granted a new patent this morning. It looks like an attempt to make it easier to find relevant documents in other languages for a query, and it relies considerably upon anchor text. It was originally filed in 2001, so it really isn’t that new.

It’s worth a look if you are concerned about how pages in one language might show up as results for a query in another language (or are just interested in how search engines might approach something like this). Of course, it’s just a patent, so there’s no guarantee that it was ever implemented.

Systems and methods for using anchor text as parallel corpora for cross-language information retrieval
Invented by Luis Gravano and Monika H. Henzinger
Assigned to Google
US Patent 7,146,358
Granted December 5, 2006
Filed August 28, 2001

Abstract

A system performs cross-language query translations. The system receives a search query that includes terms in a first language and determines possible translations of the terms of the search query into a second language. The system also locates documents for use as parallel corpora to aid in the translation by:

(1) locating documents in the first language that contain references that match the terms of the search query and identify documents in the second language;

(2) locating documents in the first language that contain references that match the terms of the query and refer to other documents in the first language and identify documents in the second language that contain references to the other documents; or

(3) locating documents in the first language that match the terms of the query and identify documents in the second language that contain references to the documents in the first language.

The system may use the second language documents as parallel corpora to disambiguate among the possible translations of the terms of the search query and identify one of the possible translations as a likely translation of the search query into the second language.
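
To make the abstract a little more concrete, here is a minimal Python sketch of the first approach it lists, followed by the disambiguation step. Everything in it is an assumption made for illustration: the toy LINKS, DOCS, and CANDIDATES data, the function names, and the simple word-count scoring are hypothetical, and nothing here is taken from the patent's claims or from anything Google is known to run.

```python
from collections import Counter

# Hypothetical toy data: (language of the linking page, anchor text, target doc id).
LINKS = [
    ("en", "bank of the river", "es1"),
    ("en", "river bank walks", "es2"),
    ("en", "bank accounts", "es3"),
]

# Hypothetical linked-to documents: doc id -> (language, text).
DOCS = {
    "es1": ("es", "la orilla del río es tranquila"),
    "es2": ("es", "paseos por la orilla del río"),
    "es3": ("es", "abrir cuentas en el banco"),
}

# Hypothetical dictionary of possible translations for each query term.
CANDIDATES = {
    "bank": ["banco", "orilla"],
    "river": ["río"],
    "accounts": ["cuentas"],
}


def parallel_docs(query_terms, src_lang, dst_lang):
    """Approach (1): anchors in the first language that contain every query
    term and point at documents written in the second language."""
    found = []
    for lang, anchor, target in LINKS:
        if lang != src_lang:
            continue
        anchor_words = set(anchor.lower().split())
        target_lang, target_text = DOCS[target]
        if target_lang == dst_lang and all(t in anchor_words for t in query_terms):
            found.append(target_text)
    return found


def likely_translation(query, src_lang="en", dst_lang="es"):
    """Pick, for each term, the candidate translation that appears most
    often in the second-language documents found through anchor text."""
    terms = query.lower().split()
    counts = Counter()
    for text in parallel_docs(terms, src_lang, dst_lang):
        counts.update(text.lower().split())
    # Terms with no dictionary entry are passed through untranslated.
    return [max(CANDIDATES.get(t, [t]), key=lambda c: counts[c]) for t in terms]


print(likely_translation("river bank"))
print(likely_translation("bank accounts"))
```

In this toy data, the word "bank" has two possible Spanish translations, and the pages that English anchors point to are what tip the choice one way or the other. A real system would presumably need far more careful matching and scoring than a raw word count.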


6 thoughts on “Using Anchor Text to Find Documents in Other Languages”

  1. Can you tell me where you found this release about the Google patent? My patent news reader doesn’t show this…

  2. Hi Lukas,

    I don’t use a release service, and most organizations that have patents granted don’t issue press releases when those are granted.

    I found this granted patent by doing a search at the patent office database.

    http://patft.uspto.gov/netahtml/PTO/search-adv.htm

    I usually spend a few hours each week searching through the patents and also the patent applications.

    (The patent application database is a different one, and the above link only leads to a search for granted patents.)
