Imagine that you want to cook a Chinese meal from an authentic Chinese recipe. You search for the recipe in Google by the English name of the dish, and the search engine translates your query into Standard Mandarin, performs the search and finds some recipes in that language, and returns those pages to you with an option to translate them into English.
Or, you have an assignment for your philosophy class, and you want to write a paper on Benedetto Croce and his Philosophy of Spirit. You’ve read some papers on him published in English, but would like to find some in Italian.
You go to Google, and choose “Italian” as the language that you would like to see source pages from as you enter his name into the search box. The search engine looks through Italian language results that contain the name Benedetto Croce, finds a number of results, and provides the page titles and snippets that it finds. Clicking on a “translate” link next to the title of one of those pages will bring you the page translated into English.
Perhaps you’ve decided that you want to start biking to work instead of commuting by car, and you want to learn more about biking and bicycles. You enter a search into Google for the word [biking], and the search engine looks at statistical associations that it has created between keywords and Web content, and not only returns results to you for [biking] but also for the word [cycling].
A Google patent application published this week describes an expanded language search engine, which might enable you to search for pages in other languages using translations of the queries you search with. It might also enable you to receive additional results in your own language for words that are statistically related to your query terms in a meaningful way, like “biking” and “cycling.”
I wrote a post a few weeks back about another Google patent filing which described how Google might find synonyms for search queries using a statistical machine translation approach.
The expanded language approach to searching is described in:
Automatic Expanded Language Search
Invented by Johnny Chen
Assigned to Google
US Patent Application 20090024595
Published January 22, 2009
Filed July 20, 2007
A computer-implemented method can include translating a search query from a first language to a second language, comparing the translated query with content in the second language, and identifying content in the second language relevant to the translated query based on the comparing.
Also, a computer-implemented method can include translating content in a second language at one or more network locations into a first language, comparing the translated content with a search query written in the first language, and identifying, from the translated content, content relevant to the query based on the comparing.
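The first method in that abstract (translate the query, compare it with content in the second language, identify what matches) can be sketched in a few lines of Python. This is only a toy illustration and not the patent's implementation: the word-for-word dictionary, the document list, and the function names are all made up for the example, and a real system would rely on machine translation and a full index.

```python
# Toy data, invented for this sketch.
TOY_DICTIONARY = {"bicycle": "bicicletta", "philosophy": "filosofia"}

DOCUMENTS = [
    {"url": "example.it/a", "lang": "it", "text": "la filosofia dello spirito"},
    {"url": "example.it/b", "lang": "it", "text": "riparare una bicicletta"},
    {"url": "example.com/c", "lang": "en", "text": "bicycle repair basics"},
]


def translate_query(query, dictionary):
    """Translate word by word; words missing from the dictionary pass through."""
    return " ".join(dictionary.get(word, word) for word in query.lower().split())


def search(query, lang, documents):
    """Return URLs of documents in `lang` that contain every query term."""
    terms = query.lower().split()
    return [
        doc["url"]
        for doc in documents
        if doc["lang"] == lang and all(t in doc["text"].split() for t in terms)
    ]


def cross_language_search(query, target_lang, dictionary, documents):
    """Translate the query, then compare it with content in the target language."""
    translated = translate_query(query, dictionary)
    return translated, search(translated, target_lang, documents)


translated, hits = cross_language_search("bicycle", "it", TOY_DICTIONARY, DOCUMENTS)
print(translated, hits)
```

The second method in the abstract runs the other way around: the content is translated into the searcher's language, and the original query is compared against that translated content.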
The patent application tells us that the search engine might look for results in other languages if those are specified by a searcher, or it might return pages in other languages based on some other factors, such as:
- The popularity of the language,
- Attributes of the user,
- Search history,
- Relevance to the subject matter of the query,
- The character type entered by the searcher (e.g., Chinese characters, Greek characters, etc.),
- The browser’s language settings,
- The domain of the search engine (e.g., google.cn for Chinese, google.tw for Taiwan-Chinese, etc.),
- User input, and
- Other factors.
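To make a few of those signals concrete, here is a rough Python sketch that collects candidate languages from the characters typed, the browser's language settings, and the search engine's domain. The patent filing doesn't spell out how the factors would be combined, so the mappings, function names, and the order of the output are assumptions for illustration only. Note too that CJK characters are shared by several languages, so treating them as Chinese is a simplification.

```python
import unicodedata

# Hypothetical mappings, invented for this sketch.
SCRIPT_TO_LANG = {"CJK": "zh", "GREEK": "el"}
DOMAIN_TO_LANG = {"google.cn": "zh", "google.tw": "zh-TW"}


def languages_from_characters(query):
    """Guess languages from the Unicode names of the characters in the query."""
    langs = []
    for ch in query:
        name = unicodedata.name(ch, "")  # "" for characters without a name
        for prefix, lang in SCRIPT_TO_LANG.items():
            if name.startswith(prefix) and lang not in langs:
                langs.append(lang)
    return langs


def candidate_languages(query, browser_langs=(), domain=None):
    """Merge the signals into one de-duplicated list (order is not a ranking)."""
    signals = languages_from_characters(query) + list(browser_langs)
    if domain in DOMAIN_TO_LANG:
        signals.append(DOMAIN_TO_LANG[domain])
    candidates = []
    for lang in signals:
        if lang not in candidates:
            candidates.append(lang)
    return candidates


print(candidate_languages("麻婆豆腐", browser_langs=["en"], domain="google.cn"))
```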
A searcher might be able to choose to see search results from pages in other languages. If they don’t, and their query returns few results, possibly because of the language of the query (such as the name of a Chinese dish that isn’t widely known in English), the search engine might be set up to expand the languages it uses in choosing results.
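That fallback behavior might look something like the following Python sketch. The result threshold, the stub index, and the stub translation table are all hypothetical; the patent application doesn't give a specific cutoff, so treat this as one possible reading and not as how Google does it.

```python
MIN_RESULTS = 3  # hypothetical cutoff for "not many results"

# Stub index and translations, invented for this sketch.
TOY_INDEX = {
    ("en", "mapo tofu"): ["en-1"],
    ("zh", "麻婆豆腐"): ["zh-1", "zh-2", "zh-3"],
    ("en", "pizza"): ["en-2", "en-3", "en-4"],
}
TOY_TRANSLATIONS = {("mapo tofu", "zh"): "麻婆豆腐"}


def toy_search(query, lang):
    return TOY_INDEX.get((lang, query), [])


def toy_translate(query, source_lang, target_lang):
    # Untranslatable queries pass through unchanged.
    return TOY_TRANSLATIONS.get((query, target_lang), query)


def expanded_search(query, user_lang, other_langs, min_results=MIN_RESULTS):
    """Search in the user's language first; expand only if results are scarce."""
    results = list(toy_search(query, user_lang))
    if len(results) >= min_results:
        return results
    for lang in other_langs:
        if lang == user_lang:
            continue
        translated = toy_translate(query, user_lang, lang)
        results.extend(toy_search(translated, lang))
        if len(results) >= min_results:
            break
    return results


print(expanded_search("mapo tofu", "en", ["zh"]))  # expands into Chinese results
print(expanded_search("pizza", "en", ["zh"]))      # enough English results already
```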