Google Acquires Neven Vision: Adding Object and Facial Recognition Mobile Technology

Google has acquired a mobile technology company that appears to have a rich technical background in face and object recognition.

A hat tip to Ionut Alex. Chitu, who writes about Google’s acquisition of Neven Vision and makes some very good points on why Neven Vision was a great choice for bringing mobile technology to Google.

Neven Vision appears to be the trade name of Nevengineering, Inc., which has been assigned a number of patents by the United States Patent and Trademark Office, and was the successor to another recognition software company, Eyematic Interfaces, Inc.

According to the Neven Vision web site, they have a number of offerings based upon the use of mobile technology:

  • Image-driven mobile marketing services
  • Visual mobile search
  • Comparison shopping and m-commerce
  • Enhanced photo messaging
  • Secure data access
  • Field identity verification

The Google blog tells us that the technology from Neven Vision will be useful in helping the Picasa team in “extracting information from a photo,” which could make it easier to organize and search through photographs. I expect that we may see even more from this new acquisition.

Following is some information about the patents and patent applications that have been assigned to Nevengineering. The granted patents are listed first, with patent applications following, each group in chronological order by filing date. There’s a large gap between the last-filed patent and the oldest-filed patent application, and it’s possible that there are additional patent filings from the company that are presently unpublished.

Note that two of the patent applications at the bottom of the list focus on mobile phones with cameras and image search.

Patents

1996

Video superposition system and method
Invented by Jaron Lanier
Originally assigned to Eyematic Interfaces, Inc.
US Patent 6,400,374
Granted June 4, 2002
Filed September 18, 1996

Abstract

A graphic image system comprising a video camera producing a first video signal defining a first image including a foreground object and a background, the foreground object preferably including an image of a human subject having a head with a face; an image position estimating system for identifying a position with respect to said foreground object, e.g., the head, the foreground object having features in constant physical relation to the position; and a computer, responsive to the position estimating system, for defining a mask region separating the foreground object from said background. The computer generates a second video signal including a portion corresponding to the mask region, responsive to said position estimating system, which preferably includes a character having a mask outline. In one embodiment, the mask region of the second video signal is keyed so that the foreground object of the first video signal shows through, with the second video signal having portions which interact with the foreground object. Another embodiment includes means, responsive to the position estimating system, for dynamically defining an estimated boundary of the face and for merging the face, as limited by the estimated boundary, within the mask outline of the character. Video and still imaging devices may be flexibly placed in uncontrolled environments, such as in a kiosk in a retail store, with an actual facial image within the uncontrolled environment placed within a computer generated virtual world replacing the existing background and any non-participants.

1997

Labeled bunch graphs for image analysis
Invented by Laurenz Wiskott, and Christoph von der Malsburg
Originally assigned to Eyematic Interfaces, Inc.
US Patent 6,222,939
Granted April 24, 2001
Filed June 25, 1997

Abstract

A process for image analysis which includes selecting a number M of images, forming a model graph from each of the number of images, such that each model has a number N of nodes, assembling the model graphs into a gallery, and mapping the gallery of model graphs into an associated bunch graph by using average distance vectors Δ_ij for the model graphs as edge vectors in the associated bunch graph. A number M of jets is associated with each node of the associated bunch graph, and at least one jet is labeled with an attribute characteristic of one of the number of images. An elastic graph matching procedure is performed wherein the graph similarity function is replaced by a bunch-similarity function.
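
The bunch-graph idea can be sketched in a few lines: each node of the graph carries a stack (a “bunch”) of jets taken from the M model images, and when a probe image is matched, every node picks whichever jet in its bunch fits best. The sketch below illustrates only the bunch-similarity function; it omits the elastic geometry term and the edge vectors Δ_ij, and all function names are mine, not the patent’s:

```python
import math

def jet_similarity(j1, j2):
    """Normalized dot product (cosine similarity) between two jets."""
    dot = sum(a * b for a, b in zip(j1, j2))
    n1 = math.sqrt(sum(a * a for a in j1))
    n2 = math.sqrt(sum(b * b for b in j2))
    return dot / (n1 * n2)

def bunch_similarity(image_jets, bunch_graph):
    """Graph similarity where each node picks its best-matching jet in the bunch.

    image_jets: N jets extracted from the probe image, one per node.
    bunch_graph: N lists, each holding the M model jets stacked at that node.
    """
    best = [
        max(jet_similarity(probe, model) for model in node_bunch)
        for probe, node_bunch in zip(image_jets, bunch_graph)
    ]
    return sum(best) / len(best)
```

Because each node chooses independently, a single bunch graph can match a face whose eyes resemble one model image and whose mouth resembles another.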

1998

Wavelet-based facial motion capture for avatar animation
Invented by Thomas Maurer, Egor Valerievich Elagin, Luciano Pasquale Agostino Nocera, Johannes Bernhard Steffens, and Hartmut Neven
Originally assigned to Eyematic Interfaces, Inc.
US Patent 6,272,231
Granted August 7, 2001
Filed November 6, 1998

Abstract

The present invention is embodied in an apparatus, and related method, for sensing a person’s facial movements, features and characteristics and the like to generate and animate an avatar image based on facial sensing. The avatar apparatus uses an image processing technique based on model graphs and bunch graphs that efficiently represent image features as jets. The jets are composed of wavelet transforms processed at node or landmark locations on an image corresponding to readily identifiable features. The nodes are acquired and tracked to animate an avatar image in accordance with the person’s facial movements. Also, the facial sensing may use jet similarity to determine the person’s facial features and characteristics, thus allowing tracking of a person’s natural characteristics without any unnatural elements that may interfere with or inhibit the person’s natural characteristics.

Face recognition from video images
Invented by Johannes Bernhard Steffens, Egor Valerievich Elagin, Luciano Pasquale Agostino Nocera, Thomas Maurer, and Hartmut Neven
Originally assigned to Eyematic Interfaces, Inc.
US Patent 6,301,370
Granted October 9, 2001
Filed December 4, 1998

Abstract

The present invention is embodied in an apparatus, and related method, for detecting and recognizing an object in an image frame. The object may be, for example, a head having particular facial characteristics. The object detection process uses robust and computationally efficient techniques. The object identification and recognition process uses an image processing technique based on model graphs and bunch graphs that efficiently represent image features as jets. The jets are composed of wavelet transforms and are processed at nodes or landmark locations on an image corresponding to readily identifiable features. The system of the invention is particularly advantageous for recognizing a person over a wide variety of pose angles.

1999

Procedure for automatic analysis of images and image sequences based on two-dimensional shape primitives
Invented by Michael Potzsch, Norbert Kruger, and Christoph von der Malsburg
Originally assigned to Eyematic Interfaces, Inc.
US Patent 6,466,695
Granted October 15, 2002
Filed August 4, 1999

Abstract

The invention provides an apparatus, and related method, for providing a procedure to analyze images based on two-dimensional shape primitives. In the procedure, an object representation is created automatically from an image and then this representation is applied to another image for the purpose of object recognition. The features used for the representation are two types of two-dimensional shape primitives: local line segments and vertices. Furthermore, the creation of object representations is extended to sequences of images, which is especially needed for complex scenes in which, for example, the object is presented in front of a structured background.

2001

Labeled bunch graphs for image analysis
Invented by Laurenz Wiskott, and Christoph von der Malsburg
Originally assigned to Eyematic Interfaces, Inc.
US Patent 6,356,659
Granted March 12, 2002
Filed January 16, 2001

Abstract

A process for image analysis which includes selecting a number M of images, forming a model graph from each of the number of images, such that each model has a number N of nodes, assembling the model graphs into a gallery, and mapping the gallery of model graphs into an associated bunch graph by using average distance vectors Δ_ij for the model graphs as edge vectors in the associated bunch graph. A number M of jets is associated with each node of the associated bunch graph, and at least one jet is labeled with an attribute characteristic of one of the number of images. An elastic graph matching procedure is performed wherein the graph similarity function is replaced by a bunch-similarity function.

Method and apparatus for image analysis of a gabor-wavelet transformed image using a neural network
Invented by Johannes B. Steffens, Hartwig Adam, and Hartmut Neven
Assigned to Nevengineering, Inc.
US Patent 6,917,703
Granted July 12, 2005
Filed February 28, 2001

Abstract

The present invention may be embodied in a method, and in a related apparatus, for classifying a feature in an image frame. In the method, an original image frame having an array of pixels is transformed using Gabor-wavelet transformations to generate a transformed image frame. Each pixel of the transformed image is associated with a respective pixel of the original image frame and is represented by a predetermined number of wavelet component values. A pixel of the transformed image frame associated with the feature is selected for analysis. A neural network is provided that has an output and a predetermined number of inputs. Each input of the neural network is associated with a respective wavelet component value of the selected pixel. The neural network classifies the local feature based on the wavelet component values, and indicates a class of the feature at an output of the neural network.
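
A toy sketch of the pipeline this abstract describes: a small bank of Gabor kernels produces one wavelet component per kernel at the pixel of interest, and those components feed a classifier. Here a single sigmoid unit stands in for the patent’s neural network, and all names and parameter values are illustrative assumptions, not taken from the patent:

```python
import math

def gabor_kernel(ksize, wavelength, theta, sigma):
    """2-D Gabor kernel (real part) centred on the pixel of interest."""
    half = ksize // 2
    kern = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates by the kernel orientation theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            gauss = math.exp(-(xr * xr + yr * yr) / (2 * sigma * sigma))
            row.append(gauss * math.cos(2 * math.pi * xr / wavelength))
        kern.append(row)
    return kern

def wavelet_components(patch, kernels):
    """One response value per kernel -- the 'jet' for the patch's centre pixel."""
    return [
        sum(k * p for krow, prow in zip(kern, patch) for k, p in zip(krow, prow))
        for kern in kernels
    ]

def classify(components, weights, bias):
    """A single sigmoid unit standing in for the patent's neural network."""
    z = bias + sum(w * c for w, c in zip(weights, components))
    return 1.0 / (1.0 + math.exp(-z))
```

A real system would use a filter bank spanning several orientations and scales (the patent speaks of a predetermined number of wavelet component values per pixel), but the data flow is the same: pixel neighborhood in, component values out, class score from the network.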

Wavelet-based facial motion capture for avatar animation
Invented by Thomas Maurer, Egor Valerievich Elagin, Luciano Pasquale Agostino Nocera, Johannes Bernhard Steffens, and Hartmut Neven
Originally assigned to Eyematic Interfaces, Inc.
US Patent 6,580,811
Granted June 17, 2003
Filed May 31, 2001

Abstract

The present invention is embodied in an apparatus, and related method, for sensing a person’s facial movements, features and characteristics and the like to generate and animate an avatar image based on facial sensing. The avatar apparatus uses an image processing technique based on model graphs and bunch graphs that efficiently represent image features as jets. The jets are composed of wavelet transforms processed at node or landmark locations on an image corresponding to readily identifiable features. The nodes are acquired and tracked to animate an avatar image in accordance with the person’s facial movements. Also, the facial sensing may use jet similarity to determine the person’s facial features and characteristics, thus allowing tracking of a person’s natural characteristics without any unnatural elements that may interfere with or inhibit the person’s natural characteristics.

Method and system for customizing facial feature tracking using precise landmark finding on a neutral face image
Invented by Ulrich F. Buddenmeier and Hartmut Neven
Assigned to Nevengineering, Inc.
US Patent 6,714,661
Granted March 30, 2004
Filed July 24, 2001

Abstract

The present invention is embodied in a method and system for customizing a visual sensor for facial feature tracking using a neutral face image of an actor. The method may include generating a corrector graph to improve the sensor’s performance in tracking an actor’s facial features.

System and method for feature location and tracking in multiple dimensions including depth
Invented by Orang Dialameh and Hartmut Neven
Assigned to Nevengineering, Inc.
US Patent 7,050,624
Granted May 23, 2006
Filed July 24, 2001

Abstract

The present invention is directed to a method and related system for determining a feature location in multiple dimensions including depth. The method includes providing left and right camera images of the feature and locating the feature in the left camera image and in the right camera image using bunch graph matching. The feature location is determined in multiple dimensions including depth based on the feature locations in the left camera image and the right camera image.
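
Once the same landmark has been located in both camera images (by bunch graph matching, per the abstract), its depth follows from ordinary stereo triangulation. A minimal sketch assuming rectified cameras; the function and parameter names are my own:

```python
def depth_from_disparity(x_left, x_right, focal_px, baseline_m):
    """Depth (metres) of a feature matched in rectified left/right images.

    x_left, x_right: the feature's horizontal pixel coordinate in each image.
    focal_px: focal length in pixels; baseline_m: camera separation in metres.
    """
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("expected positive disparity for a visible feature")
    # Similar triangles: depth shrinks as disparity grows.
    return focal_px * baseline_m / disparity
```

For example, a 40-pixel disparity with an 800-pixel focal length and a 10 cm baseline places the feature 2 m from the cameras.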

Method for optimizing off-line facial feature tracking
Invented by Thomas Maurer, Hartmut Neven, and Bjoern Poehlker
Assigned to Nevengineering, Inc.
US Patent 6,834,115
Granted December 21, 2004
Filed August 13, 2001

Abstract

The present invention relates to a technique for optimizing off-line facial feature tracking. Facial features in a sequence of image frames are automatically tracked while a visual indication is presented of the plurality of tracking node locations on the respective image frames. The sequence of image frames may be manually paused at a particular image frame in the sequence of image frames if the visual indication of the tracking node locations indicates that at least one location of a tracking node for a respective facial feature is not adequately tracking the respective facial feature. The location of the tracking node may be reinitialized by manually placing the tracking node location at a position on the particular image frame in the monitor window that corresponds to the respective facial feature. Automatic tracking of the facial feature may be continued based on the reinitialized tracking node location.

Method and system for generating facial animation values based on a combination of visual and audio information
Invented by Frank Paetzold, Ulrich F. Buddemeier, Yevgeniy V. Dzhurinskiy, Karin M. Derlich, and Hartmut Neven
Assigned to Nevengineering, Inc.
US Patent 6,940,454
Granted September 6, 2005
Filed August 13, 2001

Abstract

Facial animation values are generated using a sequence of facial image frames and synchronously captured audio data of a speaking actor. In the technique, a plurality of visual-facial-animation values are provided based on tracking of facial features in the sequence of facial image frames of the speaking actor, and a plurality of audio-facial-animation values are provided based on visemes detected using the synchronously captured audio voice data of the speaking actor. The plurality of visual facial animation values and the plurality of audio facial animation values are combined to generate output facial animation values for use in facial animation.
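
One simple way to combine the two streams the abstract mentions is a per-channel weighted blend of the tracker-derived and viseme-derived values. The patent does not spell out its combination rule, so this fixed-weight mix is purely an assumption of mine:

```python
def combine_animation_values(visual, audio, audio_weight=0.5):
    """Per-channel blend of visual (tracked) and audio (viseme) animation values."""
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    return [
        (1.0 - audio_weight) * v + audio_weight * a
        for v, a in zip(visual, audio)
    ]
```

In practice one might weight the audio channel more heavily around the mouth, where visemes are informative, and lean on the visual tracker elsewhere.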

Method for generating an animated three-dimensional video head
Invented by Randall Ho, David Westwood, James Stewartson, Luciano Pasquale Agostino Nocera, Ulrich F. Buddemeier, Gregory Patrick Lane Lutter, and Hartmut Neven
Assigned to Nevengineering, Inc.
US Patent 7,050,655
Granted May 23, 2006
Filed August 13, 2001

Abstract

The invention relates to a technique for generating an animated three-dimensional video head based on sensed locations of facial features and texture mapping of corresponding two dimensional video image frames onto a shaped head mesh generated using the sensed locations.

Labeled bunch graphs for image analysis
Invented by Laurenz Wiskott and Christoph von der Malsburg
Originally assigned to Eyematic Interfaces, Inc.
US Patent 6,563,950
Granted May 13, 2003
Filed December 21, 2001

Abstract

A process for image analysis which includes selecting a number M of images, forming a model graph from each of the number of images, such that each model has a number N of nodes, assembling the model graphs into a gallery, and mapping the gallery of model graphs into an associated bunch graph by using average distance vectors Δ_ij for the model graphs as edge vectors in the associated bunch graph. A number M of jets is associated with each node of the associated bunch graph, and at least one jet is labeled with an attribute characteristic of one of the number of images. An elastic graph matching procedure is performed wherein the graph similarity function is replaced by a bunch-similarity function.

Patent Applications

2004

Image base inquiry system for search engines for mobile telephones with integrated camera
Invented by Hartmut Neven, Sr.
US Patent Application 20050185060
Published August 25, 2005
Filed February 20, 2004

Abstract

An increasing number of mobile telephones and computers are being equipped with a camera. Thus, instead of simple text strings, it is also possible to send images as queries to search engines or databases. Moreover, advances in image recognition allow a greater degree of automated recognition of objects, strings of letters, or symbols in digital images. This makes it possible to convert the graphical information into a symbolic format, for example, plain text, in order to then access information about the object shown.

2005

Image-based search engine for mobile phones with camera
Invented by Hartmut Neven, Sr. and Hartmut Neven
US Patent Application 20060012677
Published January 19, 2006
Filed May 13, 2005

Abstract

An image-based information retrieval system is disclosed that includes a mobile telephone and a remote server. The mobile telephone has a built-in camera and a communication link for transmitting an image from the built-in camera to the remote server. The remote server has an optical character recognition engine for generating a first confidence value based on an image from the mobile telephone, an object recognition engine for generating a second confidence value based on an image from the mobile telephone, a face recognition engine for generating a third confidence value based on an image from the mobile telephone, and an integrator module for receiving the first, second, and third confidence values and generating a recognition output.
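
The integrator module could be as simple as a winner-take-all vote over the three engines’ confidence values. The application does not describe the actual fusion rule, so this is only a guess at one plausible behaviour, with hypothetical names throughout:

```python
def integrate(engine_outputs, threshold=0.5):
    """Pick the most confident engine's answer, or None when all are weak.

    engine_outputs: dict mapping an engine name (e.g. 'ocr', 'object',
    'face') to a (label, confidence) pair produced by that engine.
    """
    best_engine, (best_label, best_conf) = max(
        engine_outputs.items(), key=lambda item: item[1][1]
    )
    if best_conf < threshold:
        return None  # no engine is confident enough to answer
    return best_engine, best_label
```

So a phone photo of a product label might win on the OCR engine, while a snapshot of a friend would win on the face engine, with the server returning whichever recognition clears the confidence bar.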

Single image based multi-biometric system and method
Invented by Hartwig Adam, Hartmut Neven, and Johannes B. Steffens
US Patent Application 20060050933
Published March 9, 2006
Filed June 21, 2005

Abstract

This disclosure describes methods to integrate face, skin and iris recognition to provide a biometric system with an unprecedented level of accuracy for identifying individuals. A compelling feature of this approach is that it only requires a single digital image depicting a human face as source data.

26 thoughts on “Google Acquires Neven Vision: Adding Object and Facial Recognition Mobile Technology”

  1. Pingback: Matjaž Jogan » Blog Archive » Mobile vision
  2. It would be interesting to see if you could combine Streetview with this recognition technology. I don’t know what the applications would be but reading signage and billboards might be useful.

  3. Pingback: Google Patents
  4. Some of the technology from that acquisition has been used in Google’s picture sharing software, Picasa.

  5. Hi icedpf,

    It is pretty clear at this point that technology from Neven Vision has been used in Picasa. It’s less clear how much of the technology has been used in other ways at Google, though a number of patent filings from Google hint that it could be used in Google Maps, in their street views program, and to read text within images on web sites. That makes things pretty interesting…

  6. I think this technology is definitely being used by Google in other areas of their business. I wonder where they will use this digital imaging facial recognition technology next?

  7. Hi George,

    It is interesting to speculate where different technologies like those from Neven Vision might be used. In the Street Views part of Google Maps, it looks like Google blurs out the faces of people who might be seen in the images of neighborhoods. I suspect the Neven facial recognition technology is used to help find faces – doing so manually would be an incredible amount of work.

  8. Pretty interesting with regard to image recognition. For a minute I thought that Google had found technology that could recognize faces from a computer monitor. Could you imagine that? Talk about big brother… BTW I also heard that they just acquired reCAPTCHA for something to do with their project to digitize all books… Did you hear anything about that?

  9. Hi Neil,

    Google did recently purchase Recaptcha. Here’s the news from Carnegie Mellon University’s CyLab, where the technology originated:

    http://www.cylab.cmu.edu/news_events/cylab_news/recaptcha.html

    Don’t know if Google will develop technology that can recognize people browsing the web from viewing their faces, but I’ve seen a few patents from the search engines that hint at recognizing who is using a computer from the way that they type on their keyboards. Seriously. :)

  10. Will it influence SEO techniques, since Google may recognize images just as it recognizes text?

    ..few patents from the search engines that hint at recognizing who is using a computer from the way that they type on their keyboards. Seriously.

    If it’s true, it’s really scary…

  11. I’ll have to agree with dekguss99 on this one. Recognizing a user based on how they type? I’m normally a fan of Google’s (and the companies they acquire) innovation, but I think that is going just a bit too far. In my opinion, they should be focusing much more on voice recognition than something like this. All but perfect voice recognition is something everyone with a phone or computer could use on a daily basis, but image recognition and typing habits? I suppose on the business side of things it would help them to collect data, but there’s too much personal data floating around already.

  12. Hi John,

    I’m not sure that I see much difference between Google being able to recognize your voice, compared to something like recognizing faces or how someone types.

  13. Bill, I just had the same thought about whether Google will recognize users by webcam in the future. Of course, webcams would have laser technology that could scan your iris and stuff :) But no need for passwords anymore… and creating multiple users will get tougher :D

  14. Hi Mika,

    That’s definitely a possibility. I know that thumb print scanners have been around for a few years to provide security for laptops, so that people can’t turn the things on unless they scan their thumbprint first. Having a webcam do a retina scan sounds like it would be a similar approach. I don’t know if Google would go that far, though.

  15. I have read somewhere that Neven Vision has been detecting whether or not a photo contains a person, and will someday recognize people, places, and objects. Doesn’t Facebook already have that feature, where it recognizes the people in a picture?

  16. Hi Angela,

    Neven Vision has a wide range of patents that cover different types of object recognition from images, and Google has been using at least some of this technology in a few ways, such as automating the indexing of images in Picasa where people appear. I don’t know that Facebook uses people-recognition technology for photos – people who post photos can tag images at Facebook with people’s names, and that can help index those images.

  17. Mika’s reply reminded me of the movie Blade Runner, based on Philip K. Dick’s “Do Androids Dream of Electric Sheep?”. Eye pupil scanning, etc., by the almighty Google :) But seriously, it will take a long time until such technologies really get put into practice. Even face recognition is still developing…

Comments are closed.