US-9305240

Motion aligned distance calculations for image comparisons

Published: April 5, 2016

Assignee: not available in the USPTO data we have

Inventors: not available in the USPTO data we have

Technical Abstract

Image comparison techniques provide a quick method of recognizing and identifying faces or other objects appearing in images. A series of quick distance calculations can be performed between an unknown input image and a reference image. These calculations may include facial detection, normalization, discrete cosine transform calculations, and threshold comparisons to determine whether an image is recognized. In the case of identification uncertainty, slower but more precise motion aligned distance calculations are initiated. Motion aligned distance calculations involve generating a set of downscaled images, determining motion fields and motion field-based distances between an unknown input image and reference images, selecting the best scale factors for aligning the unknown input image with the reference images, and calculating affine transformation matrices to modify and align the unknown input image with the reference images.
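The two-stage flow described in the abstract can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the patent: the functions `dct2`, `quick_distance`, and `recognize`, the `keep` parameter, and all threshold values. The sketch assumes square grayscale images of equal size stored as nested lists, uses a naive 2-D DCT-II for the quick distance, and models "identification uncertainty" as a quick distance that falls between an accept threshold and a reject threshold. The slower motion aligned distance is passed in as a callable rather than implemented.

```python
import math


def dct2(block):
    """Naive 2-D DCT-II (orthonormal) of a square grayscale block given as a list of rows."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out


def quick_distance(img_a, img_b, keep=4):
    """Euclidean distance over the keep x keep lowest-frequency DCT coefficients."""
    ca, cb = dct2(img_a), dct2(img_b)
    return math.sqrt(sum((ca[u][v] - cb[u][v]) ** 2
                         for u in range(keep) for v in range(keep)))


def recognize(input_img, reference_img, slow_distance,
              accept=5.0, reject=50.0, slow_accept=5.0):
    """Run the quick test first; call slow_distance only in the uncertain band.

    The thresholds are illustrative assumptions, not values from the patent.
    """
    d = quick_distance(input_img, reference_img)
    if d <= accept:
        return "recognized"
    if d >= reject:
        return "not recognized"
    # Uncertain: fall back to the slower, more precise comparison.
    refined = slow_distance(input_img, reference_img)
    return "recognized" if refined <= slow_accept else "not recognized"
```

In this arrangement the expensive comparison runs only for borderline inputs, which is the trade-off the abstract describes; clear matches and clear mismatches are settled by the quick distance alone.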

Patent Claims

25 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A computer-implemented method comprising: detecting a facial image in an input image, the input image received by a detection module; calculating a distance between the input image and a reference image, the reference image retrieved from a reference image database and the reference image containing a known facial image for aiding recognizing detected facial images; executing, in response to determining the distance calculated is within a predetermined threshold, a motion aligned distance calculation comprising: identifying a center point for each of the input image and the reference image; splitting the input image and the reference image into blocks based on the center point of the input image and the center point of the reference image, respectively; determining a motion field for aligning the input image with the reference image, the motion field comprising, for a pair of corresponding blocks of the input image and the reference image, at least one vector corresponding to a modification of a portion of the facial image within a first block in the pair of corresponding blocks to shift the portion of the facial image within the first block to align with a corresponding portion of the facial image within a second block in the pair of corresponding blocks; and aligning the input image with the reference image based on the motion field; calculating a motion aligned distance between the input image and the reference image based on the motion field; and providing, in response to the motion aligned distance calculated, a recognition result for the input image.

2. The method of claim 1, wherein the detecting the facial image is based in part on a detection of facial features comprising of one or more of the following: eyes, nose, mouth, ears, or facial outline.

3. The method of claim 1, wherein the facial image is normalized, downscaled, rescaled, or a combination thereof.

4. The method of claim 3, wherein normalizing comprises normalizing size scale, orientation, brightness, contrast, or a combination thereof.

5. The method of claim 3, wherein downscaling comprises a reduction of image size, quality, or a combination thereof.

6. The method of claim 1, wherein calculating distances uses a discrete cosine transform.

7. The method of claim 1, wherein the center point comprises: a physical center of an image, a common facial feature, or a point determined by a normalization process.

8. The method of claim 1, wherein the splitting the input image and the reference image comprises dividing each image into specifically sized sections.

9. The method of claim 1, wherein the determining the motion field comprises: matching the corresponding blocks between the input image and the reference image based on common features; calculating coefficients for each coordinate on the input image based on differences between the corresponding blocks; and comparing the coefficients with a second predetermined threshold value and if greater than the second predetermined threshold value: calculating motion field vector parameters; and calculating a motion field distance based on an average sum of motion field vectors.

10. The method of claim 1, wherein the calculating the motion aligned distance comprises: determining a scale factor; generating affine transformation matrices; dividing the reference image into parts; calculating distances between regions in the input image to the parts of the reference image; calculating normalizing coefficients based on each region; and calculating a precise distance between the input image and the reference image based on the normalizing coefficients and the distances between the regions.

11. The method of claim 10, wherein the determining the scale factor comprises: obtaining scale parameters; generating a set of scaled images; calculating motion field distances between a scaled input image and reference images; determining a best distance from a set of calculated motion field distances; and selecting the scale factor that produced the best distance.

12. The method of claim 10, wherein the determining affine transformation matrices comprises: obtaining a downscaled input image and reference images; generating a first affine transformation matrix based on a best scale factor; resizing the input image based on the scale factor; calculating a first motion field for the resized input image and the reference images; generating a second affine transformation matrix based on the calculated first motion field; calculating a third affine transformation matrix based on a convolution of the first affine transformation matrix and the second affine transformation matrix; applying the third affine transformation matrix to the downscaled input image to produce a transformed input image; calculating a second motion field for the transformed input image and the reference images; generating a fourth affine transformation matrix based on the calculated second motion field; and calculating a fifth affine transformation matrix based on a convolution of the third affine transformation matrix and the fourth affine transformation matrix.

13. The method of claim 10, wherein the parts comprise three regions: eyes, nose, and mouth.

14. An image processor system, embodied in a mobile computing device, for identifying a facial image, the system comprising: a detection module configured to detect the facial image in an input image; a reference image database configured to store a reference image; a distance calculator module configured to calculate a distance between the input image and the reference image, the reference image containing a known facial image for aiding recognizing detected facial images; and a motion field module configured to calculate a plurality of vectors in a motion field to align the input image with the reference image, the calculation through the motion field module further configured to: identify a center point for each of the input image and the reference image; split the input image and the reference image into blocks based on the center point of the input image and the center point of the reference image, respectively; determine, for each pair of corresponding blocks of the input image and the reference image, at least one vector in the motion field corresponding to a modification of a portion of the facial image within a first block in a pair of corresponding blocks to shift the portion of the facial image within the first block to align with a corresponding portion of the facial image within a second block in the pair of corresponding blocks; and align the input image with the reference image based on the motion field.

15. The system of claim 14, wherein a second distance calculated by the distance calculator module based on the aligned input image is a motion aligned distance.

16. The system of claim 14, further comprising: a normalization module configured to normalize facial images, based in part on at least one of the following: orientation, scale, brightness, or contrast; and a downscale module configured to modify an image by reducing image size, reducing image quality, or a combination thereof.

17. The system of claim 14, wherein the motion field module is further configured to: match the corresponding blocks between the input image and the reference image based on common features; calculate coefficients for each coordinate on the input image based on differences between the corresponding blocks; compare the coefficients with a second predetermined threshold value and if greater than the second predetermined threshold value; calculate motion field vector parameters; and calculate a motion field distance based on an average sum of motion field vectors.

18. The system of claim 14, wherein the image processor system further comprises a scale module configured to determine a scale factor, a determination through the scale module configured to: obtain scale parameters; generate a set of scaled images; calculate motion field distances between a scaled input image and reference images; determine a best distance from a set of motion field distances; and select the scale factor that produced the best distance.

19. The system of claim 14, wherein the image processor system further comprises an affine transformation module configured to calculate an affine transformation matrix, a calculation through the affine transformation module configured to: obtain a downscaled input image and reference images; generate a first affine transformation matrix based on a best scale factor; resize the input image based on the scale factor; calculate a first motion field for the resized input image and the reference images; generate a second affine transformation matrix based on the first motion field; calculate a third affine transformation matrix based on a convolution of the first affine transformation matrix and the second affine transformation matrix; apply the third affine transformation matrix to the downscaled input image to produce a transformed input image; calculate a second motion field for the transformed input image and the reference images; generate a fourth affine transformation matrix based on the second motion field; and calculate a fifth affine transformation matrix based on a convolution of the third affine transformation matrix and the fourth affine transformation matrix.

20. The system of claim 14, wherein the distance calculator module is configured to: divide the reference image into parts; calculate distances between regions in the input image to the parts of the reference image; calculate normalizing coefficients based on each region; and calculate a precise distance between the input image and the reference image based on the normalizing coefficients and the distances between the regions.

21. A computer-implemented method comprising: detecting a type of object in an input image, the input image received by a detection module; calculating a distance between the input image and a reference image, the reference image retrieved from a reference image database and the reference image containing a known type of object for aiding recognizing detected objects; executing, in response to determining the distance calculated is within a predetermined threshold, a motion aligned distance calculation comprising: identifying a center point for each of the input image and the reference image; splitting each of the input image and the reference image into blocks based on the center point of the input image and the center point of the reference image, respectively; determining a motion field for aligning the input image with the reference image, the motion field comprising, for a pair of corresponding blocks of the input image and the reference image, at least one vector corresponding to a modification of a portion of the object within a first block in the pair of corresponding blocks to shift the portion of the object within the first block to align with a corresponding portion of the object within a second block in the pair of corresponding blocks; aligning the input image with the reference image based on the motion field; calculating a motion aligned distance between the input image and the reference image based on the motion field; and providing, in response to the motion aligned distance calculated, a recognition result for the input image.

22. The method of claim 21, wherein the determining the motion field comprises: matching the corresponding blocks between the input image and the reference image based on common features; calculating coefficients for each coordinate on the input image based on differences between the corresponding blocks; and comparing the coefficients with a second predetermined threshold value and if greater than the second predetermined threshold value: calculating motion field vector parameters; and calculating a motion field distance based on an average sum of motion field vectors.

23. The method of claim 21, wherein the calculating the motion aligned distance comprises: determining a scale factor; generating affine transformation matrices; dividing the reference image into parts; calculating distances between regions in the input image to the parts of the reference image; calculating normalizing coefficients based on each region; and calculating a precise distance between the input image and the reference image based on the normalizing coefficients and the distances between the regions.

24. The method of claim 23, wherein the determining the scale factor comprises: obtaining scale parameters; generating a set of scaled images; calculating motion field distances between a scaled input image and reference images; determining a best distance from a set of calculated motion field distances; and selecting the scale factor that produced the best distance.

25. The method of claim 23, wherein the determining affine transformation matrices comprises: obtaining a downscaled input image and reference images; generating a first affine transformation matrix based on a best scale factor; resizing the input image based on the scale factor; calculating a first motion field for the resized input image and the reference images; generating a second affine transformation matrix based on the calculated first motion field; calculating a third affine transformation matrix based on a convolution of the first affine transformation matrix and the second affine transformation matrix; applying the third affine transformation matrix to the downscaled input image to produce a transformed input image; calculating a second motion field for the transformed input image and the reference images; generating a fourth affine transformation matrix based on the calculated second motion field; and calculating a fifth affine transformation matrix based on a convolution of the third affine transformation matrix and the fourth affine transformation matrix.
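
The block-based motion field recited in the independent claims, and the motion field distance of claims 9, 17, and 22, can be illustrated with a minimal Python sketch built on exhaustive block matching. The claims do not state a matching criterion, block size, or search range, so the sum of absolute differences, the `size` and `radius` parameters, and the function names `sad`, `motion_field`, and `motion_field_distance` are all assumptions made for illustration. For brevity the block grid is anchored at the top-left corner rather than at a detected center point, and "an average sum of motion field vectors" is read here as the mean vector length, which is only one possible interpretation. Images are equal-size grayscale nested lists.

```python
import math


def sad(inp, ref, top, left, dy, dx, size):
    """Sum of absolute differences between an input block and a shifted reference block."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(inp[top + y][left + x]
                         - ref[top + dy + y][left + dx + x])
    return total


def motion_field(inp, ref, size=4, radius=2):
    """Map each input block's (top, left) corner to the (dy, dx) shift that best matches ref."""
    height, width = len(inp), len(inp[0])
    field = {}
    for top in range(0, height - size + 1, size):
        for left in range(0, width - size + 1, size):
            # Start from the zero shift so ties favor "no motion".
            best = (0, 0)
            best_cost = sad(inp, ref, top, left, 0, 0, size)
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    # Skip shifts that would read outside the reference image.
                    if not (0 <= top + dy <= height - size
                            and 0 <= left + dx <= width - size):
                        continue
                    cost = sad(inp, ref, top, left, dy, dx, size)
                    if cost < best_cost:
                        best, best_cost = (dy, dx), cost
            field[(top, left)] = best
    return field


def motion_field_distance(field):
    """Mean Euclidean length of the motion vectors (assumed reading of the claim)."""
    return sum(math.hypot(dy, dx) for dy, dx in field.values()) / len(field)
```

Under this reading, two well-aligned images yield vectors near zero and therefore a small motion field distance, while a consistent shift across blocks suggests a translation that a later alignment step could remove. Blocks on the image border can only search shifts that stay inside the reference image, so their vectors are less reliable than interior ones.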

Classification Codes (CPC)

Cooperative Patent Classification codes for this invention.

G06V

Patent Metadata

Filing Date

December 6, 2012

Publication Date

April 5, 2016
