How Google Might Enable You to Translate Your Webpages Through a Proxy Server

In July, Google launched a beta version of their Page Speed Service, which collects content from your pages on the fly, and republishes it on a proxy, rewritten in a manner that should provide faster pages. Search Engine Watch wrote about this proxy service on July 29, 2011, and Adam Hopkinson in the comments points to details about the configuration page one sets up to control this service. The service appears to be one that Google will charge for once the beta is over.

What if Google also offered the ability to do other things through that proxy service such as offer localization of those pages, with the ability to set up translations of text on the page in different languages, or to change logos and other images for viewers from certain locations?

Imagine that rather than using machine translation, you could edit the proxy versions of your page through a browser, like the service from Israel that NETMASK Internet Technologies provides with their Netmask.IT! tool. That tool can work on webpages as well as on software products. Past customers of Netmask Internet Technologies, at least for the software localization that they offer, have included Siemens, Compaq, IBM, Data General, Sun, Oracle, Motorola, HP, and SGI.

Google was assigned the patents behind Netmask’s technology this past month, according to the USPTO assignment database, which doesn’t provide details of the transaction other than a September 1st assignment recorded on September 20th.

We don’t know if Google will expand the Page Speed service to include the kind of localization described in the patents, or the security measures described in one of them, which include a digital rights management approach that can make it difficult for automated scrapers to copy the content of a page.

Here are the 4 patents that were assigned to Google in the transaction:

Dynamic content conversion
Invented by Eliyahu Marmor
Assigned to Netmask (El-Mar) Internet Technologies Ltd.
United States Patent 7,987,293
Granted July 26, 2011
Filed: April 10, 2006

Abstract

A method of display modification in a client server web system, comprising, intercepting, by a web intermediary, a response to a client request, sent by a server in response to the request, the response including client side active content adapted to execute at a browsing software on a client computer; replacing at least one display-related code section in said response by a wrapper section that includes code for modification of at least one display element and code for executing the original display-related code section; and executing said wrapper section as client side active content at said client to generate a display, modified from a display that would have been generated by executing the response.
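To make the abstract above a little more concrete, here is a minimal Python sketch of what an intermediary might do to a response. It is only an illustration under my own assumptions, not the patented implementation: the names wrap_display_code and __wrapWrite, the translation table, and the choice of document.write as the "display-related code section" are all hypothetical, and a regular expression stands in for the real parsing an intermediary would need.

```python
import json
import re

# Assumption: the only display-related code we look for is a simple
# document.write("...") call with a plain double-quoted string argument.
DOC_WRITE = re.compile(r'document\.write\((?P<arg>"[^"\\]*")\)')


def wrap_display_code(html, translations):
    """Replace each document.write("...") call with a call to a wrapper.

    The wrapper (hypothetical name: __wrapWrite) first modifies the text
    using the translation table, then runs the original display call.
    """
    table = json.dumps(translations, sort_keys=True)
    helper = (
        "<script>"
        "var __tr = " + table + ";"
        "function __wrapWrite(s){"
        "document.write(Object.prototype.hasOwnProperty.call(__tr, s) ? __tr[s] : s);"
        "}"
        "</script>"
    )
    # Rewrite the page first, then prepend the helper script.
    rewritten = DOC_WRITE.sub(
        lambda m: "__wrapWrite(" + m.group("arg") + ")", html
    )
    return helper + rewritten
```

Calling wrap_display_code('<script>document.write("Welcome")</script>', {"Welcome": "Willkommen"}) returns a page in which the browser runs __wrapWrite("Welcome") instead of the original call, so the modification happens on the client as active content rather than on the origin server.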

Configuration setting
Invented by Eliyahu Marmor
US Patent Application 20070055739
Published March 8, 2007
Filed: April 3, 2006

Abstract

A method of defining customization for electronic content retrieved over an electronic connection (100, FIG. 1), which retrieves electronic content from a remote server (110) to a local client (102); edits the content at the local client by a user using a WYSIWYG editor, wherein said editor is a standard software used for displaying of content and wherein said editing does not require installation of software requiring user authorization; and automatically generates at least one customization definition based on said editing, said customization definition suitable for automatic applying to said content.
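The idea of turning an edit into a reusable "customization definition" can be sketched roughly as follows. This is a simplified illustration in Python under my own assumptions, not the method from the patent application: the function names and the find/replace rule format are hypothetical, and the sketch only handles edits that change text while leaving the page structure alone.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects the non-empty text nodes of a page, in document order."""

    def __init__(self):
        super().__init__()
        self.texts = []

    def handle_data(self, data):
        if data.strip():
            self.texts.append(data.strip())


def text_nodes(html):
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    return parser.texts


def derive_rules(original_html, edited_html):
    """Compare the page before and after editing and record what changed."""
    before = text_nodes(original_html)
    after = text_nodes(edited_html)
    if len(before) != len(after):
        raise ValueError("this sketch only handles edits that keep the page structure")
    return [{"find": b, "replace": a} for b, a in zip(before, after) if b != a]


def apply_rules(html, rules):
    """Apply previously derived rules to freshly retrieved content."""
    # Limitation: plain string replacement also touches matching text
    # inside tags or attributes; a real tool would need to be more careful.
    for rule in rules:
        html = html.replace(rule["find"], rule["replace"])
    return html
```

For example, if a translator changes "Hello" to "Hola" in a page's heading, derive_rules would produce a single find/replace rule, and apply_rules could then apply that rule automatically each time the original page is retrieved.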

Non-intrusive digital rights enforcement
Invented by Eliyahu Marmor
Assigned to Netmask (El-Mar) Internet Technologies Ltd.
United States Patent 7,167,925
Granted January 23, 2007
Filed: August 28, 2001

Abstract

A method for transferring information between a server (24) and a client (28), through a converter (22), comprising: analyzing at least a portion of the information by the converter (22), to determine a standard used by the server (24) to encode the information in the portion; and replacing at least a portion of the analyzed information with other information, which other information uses a second standard, wherein, analyzing comprises parsing the information on a syntactic level and wherein the information comprises at least one Internet hypertext document.

Automatic conversion system
Invented by Eliyahu Marmor
Assigned to Netmask (El-Mar) Internet Technologies Ltd.
US Patent 6,601,108
Granted July 29, 2003
Filed: April 12, 1999

Abstract

A method for transferring information between a server (24) and a client (28), through a converter (22), comprising: analyzing at least a portion of the information by the converter (22), to determine a standard used by the server (24) to encode the information in the portion; and replacing at least a portion of the analyzed information with other information, which other information uses a second standard, analyzing comprises parsing the information on a syntactic level and wherein the information comprises at least one Internet hypertext document.
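As a rough illustration of a converter that determines one standard and replaces it with a second, here is a small Python sketch. I am assuming that the "standard" is a character encoding declared in a meta tag, which is my own choice of example and not something the abstract specifies. The function name convert_document is hypothetical, and the regular expression is a crude stand-in for the syntactic parsing the patent describes.

```python
import re

# Assumption: the source encoding is declared as <meta charset="...">.
META_CHARSET = re.compile(
    rb'<meta\s+charset=["\']?([A-Za-z0-9_\-]+)["\']?', re.IGNORECASE
)


def convert_document(raw, target="utf-8"):
    """Re-encode a hypertext document and update its charset declaration.

    If no declaration is found, the converter cannot determine the source
    standard, so the document is passed through unchanged.
    """
    match = META_CHARSET.search(raw)
    if match is None:
        return raw
    source = match.group(1).decode("ascii")
    text = raw.decode(source)
    old_declaration = match.group(0).decode("ascii")
    new_declaration = '<meta charset="' + target + '"'
    text = text.replace(old_declaration, new_declaration, 1)
    return text.encode(target)
```

A page served as ISO-8859-8 would come out of convert_document as UTF-8 bytes with the declaration rewritten to match, while the server itself keeps sending its original encoding.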

Conclusion

When I read through these patents, I instantly thought of the Page Speed Proxy service that Google has set up, and how Google could potentially offer the services described in these patents as well.

It’s interesting that Google has acquired these patents at a time when Netmask seems to have just started really pushing the Web version of this service in the past few months. I’m not sure if Netmask is still in business, or if this transaction means that Google has effectively taken over the technology involved completely.

Fog Creek Software very recently experimented with Google’s Page Speed service, and provides a very detailed look at it in their post, Experiments with Google Page Speed Service. While they encountered some issues with the Page Speed service, the people at Fog Creek seemed pretty upbeat about the service, which, remember, is still in beta.

Using the translation tools described in the patents could potentially make it much easier, faster, and more affordable to provide different language versions of sites, and the digital rights management aspect of the service could possibly help protect sites against some scrapers.

18 thoughts on “How Google Might Enable You to Translate Your Webpages Through a Proxy Server”

  1. Hi Bill,

    These patents are certainly some of the trickiest ones to digest.

    The language conversion side would be very useful – currently I see a fair proportion of traffic coming from translate.google.hk and to be able to offer a simple translation for other languages would be incredibly useful considering our proximity to our European neighbours.

    Profiting from other people’s content is an awful tactic and it seems that this is quite explicitly covered in the Non-Intrusive DRM patent:

    Typically, consumers must either pay to view copyrighted information or they are forced to view advertisement information along with the copyrighted information. However, once data is available in a computer readable format, there is a danger that an infringer will copy the data, remove any advertisements and redistribute the data himself, for his own enjoyment and/or profit.

    Which could be a great thing if Google is proactively acknowledging there’s an issue!

  2. I hate bubblefish, 90% of the time I can’t understand any word if a machine translates “English => German”

    It is a nightmare to read “machine translation”, it gives me the feeling the site owner does not have enough money to pay for a professional translation, or that I’m not worth it.

    If I’m searching for a solution, I accept that I can’t read japanese, ;)

    But the last weeks I got a headache, I found a site where I can’t read it in english, because they give me bubblefish (my browser is sending that I’m from Austria German language)

    You can’t read code-explanation if there is bubblefish all the way ;) a “box” is a box and not a “Schachtel” ;)

    variable=true and not variable=Wahrheit

    we have 20 different words for “variable” in German
    http://dict.leo.org/ende?lp=ende&lang=de&searchLoc=0&cmpType=relaxed&sectHdr=on&spellToler=&search=variable

    and bubblefish translate this to 90% false

    maybe Google enables this, but please don’t do this

  3. Thanks Bill for providing this information, I too agree with you that with this new tool Google will give us the translation in many languages and also with this it will gain a more global approach.

  4. I actually think Google’s translation service is incredibly handy. Yes, it is no substitute for a professional translation, but sometimes that’s not possible if you’re a small firm dealing with “an order or two” from various countries that are not your primary focus (the company I work for runs into this often), and something like the story describes would be better than nothing.

  5. I do remain sceptical about using automatic translation to provide content in different languages on a website, unless there is careful editing. The geo-localisation possibilities and scraper protection do seem more promising at this stage.

  6. People browsing the web can have the Google Translate toolbar installed, and have the pages translated automatically. Very nice, but the current machine translation capability is lousy, and the text often does not make sense, except for very simple text. It might give you an idea of what a page is about, but don’t expect a well-translated text.

    Now, is current machine translation interesting for website owners? Not for SEO, as the pages will be generated on the fly, as current Google Translate pages are, and will never be indexed. It therefore does not contribute to their SEO juice. From the marketing point of view it’s not very good either – the only thing that scares a visitor away as much as a page that he does not understand is a badly-written page, and, as stated before, the quality of machine translation – including Google’s – is today just so-so.

    However, if there IS a way to edit such pages by human translators AND those pages could be properly indexed (i.e., the edited pages would be accessible for the search engines) this could be a new ball game – today it is quite complex to manage multilingual websites.

  7. These are definitely not the easiest ones to understand. I had to read through a few times to get a grasp.
    My question mark revolves around the ability to translate with a minimum of success.

  8. Yes, that is a great idea that Google could do. Right now the way everything is set up with translators I don’t really like as much and that is why I don’t use it. If Google implemented something like what you were suggesting I think it would be very good and I would definitely look into using it.

  9. Hi Tom,

    That these patents were developed in a completely different atmosphere by another company, and not Google, adds some complexity to them. That they were originally developed to provide a localization interface for software programs rather than web pages adds an additional layer as well, but the inventors seem to have had in mind for a number of years that they could be applied to Web pages as well.

    Being able to specifically target potential customers in their language, and provide human translations that not only may be much stronger than automated ones, but could also take into account cultural differences could be a real boon to site owners looking to expand what they offer.

    The digital rights management patent is interesting as well, if it could truly help to cut down on people scraping content. One place that I’d love to see that applied is to Google’s cached copies of pages, so that people don’t scrape those cached copies. As you stated though, Google would have to acknowledge that their ability to distinguish between original content and someone copying it has limitations.

  10. Hi Monika,

    I’m more likely to use Google’s translation tool than Babelfish these days, but there are still problems with automated translations.

    The translations described in this patent aren’t automated ones, but rather human translations added through a browser-based editing tool that you can use. The invention is being able to easily change what is shown to visitors from different locations, on a proxy server, so that you don’t have to spend the time and money developing your pages in different languages.

    The proxy server set up described appears to be very similar to Google’s Page Speed Service, which is why I mention it in the post.

  11. Hi Neeraj,

    It would add the potential for people to provide their pages in more languages to a wider audience. Let’s see if Google decides to use the technology that they acquired with this acquisition.

  12. Hi Marcus,

    I think Google’s translation service is useful.

    What is described in the patent could potentially make the cost of coming up with pages that use human translations much more affordable, and make those pages much quicker to develop and implement, which could potentially make many more smaller businesses capable of using them.

    For example, there are a lot of businesses in the US that could benefit from having versions of their pages written in Spanish. If they could get a translator to come in and provide Spanish translations for their pages that could be set up in a few hours or a few days, that might definitely be worth the amount spent on the translation work.

  13. Hi Eleonor,

    There’s a page on one of the Netmask.it pages (Human Translation vs. Machine Translation — or — What’s Wrong with Machine Translation) that tells us something about how they feel about automated translations, and I liked this paragraph a lot:

    There is a joke that illustrates this difficulty: When the CIA inaugurated its first English-Russian-English translation computer, one of the participants asked how one could check and see that the translation was the way it should be. The developers of the translation engine answered: “Type in an expression in English and see the translation”. And so the asker typed in the expression “Out of sight, out of mind”, and he received an answer in Russian. Someone else asked: “How do we know that this is the correct translation?”. He was answered: “Type the result back into the machine”. The translation was not long in coming: “Invisible idiot”. There is a similar joke with the translation of “the spirit is willing, but the flesh is weak” (Matthew 26) into “the vodka is good but the meat is rotten”.

    The approach described in the patents doesn’t involve automated translations, but rather a way to add human translations much faster and easier at a much lower cost than developing those pages independently.

  14. Hi Website Translator,

    I think you’ve described one of the most important issues that Google might face with implementing this kind of proxy provided translations, which would be to make sure that the human translated pages offered by the proxy service can be indexed by search engines. I could see how Google could easily capture that information since they are the ones providing the proxy service, but I’m not sure how readily that might be available to other search engines.

  15. Hi Caleb,

    The Page Speed Service that Google offers presently could potentially help a lot of sites that offer a lot of pages, but might find it both difficult and expensive to speed those pages up.

    If they were to offer a similar proxy service that could provide localization in an indexable manner, I think it would definitely be worth pursuing to site owners who offer something that appeals to audiences who speak different languages.

  16. Hi Laurent,

    Sorry for any difficulties you might have had with this post or the patents that I’ve written about. I had to go through the patents and the netmask.it pages a few times myself, and it wasn’t until I spent some time on the Google Page Speed Service help pages that I got a better idea of how this system might work.

    It seems like Netmask worked with some very high profile clients implementing software (and possibly web pages) translations in different languages, so there definitely appears to be some value to their approach.

    Since the translations you would use with a system like this would be human translations, how effective they might be may depend upon both the quality of those translations, and the ability to understand how different cultural references and content approaches might impact your audience.

    There was an interesting case study online, which I can’t locate right now, that involved Tide Detergent. The US version of the website showed a typical American house, with a kitchen, bathrooms, bedrooms, and recreation rooms, and showcased stains that often were associated with those different rooms. For example, food stains are often associated with a kitchen. A translated version of this website, with the house metaphor, was presented on a site in India, and confused a lot of visitors to the site who are used to homes that have more multi-purpose rooms rather than rooms dedicated to one specific use. Regardless of how good the translation might have been, culturally the ad made little sense.

    The proxy service described in the patent not only allows you to add translations, but also to add or remove certain images and possibly make other changes as well, so that you wouldn’t just be limited to pure translations.

  17. @ Laurent B,

    Just adding my 2 cents worth re. auto translation work.
    There are a couple of software packages that do a better job of auto translation than Google Translate, Babelfish and the rest.

    Those generally belong to the realm of machine learning software that becomes better over time with input from a human translator.
    The machine has a built-in confidence level for its work.
    You can set up a level of translation quality based on the importance of a specific page and have a translator review the machine’s work.

    Main example is Tripadvisor
    http://www.tripadvisor.fr/Hotel_Review-g35805-d1516481-Reviews-Elysian_Hotel_Chicago-Chicago_Illinois.html
    using
    Language Weaver
    but those software packages come with a hefty price tag + maintenance fees. For now, cases like @Marcus S.’s would definitely benefit from Big G’s improvement on their translation work

  18. Hi Freddy,

    Funny thing about the patents Google acquired is that the focus behind them is to make it much easier for someone to present human translations to audiences rather than machine translations, where not only can the content be changed to include appropriate cultural references, but also appropriate images can be swapped in and out as well.

    I expect that we will see automated translation continue to improve in the future, but one of the benefits of using a human translator familiar with the culture that content is being created for isn’t just the accuracy of the translation, but also the cultural impact of products and services and the ways that they are being presented.
