Image Similarity from Disparate Sources

Published: July 5, 2016

Assignee: not available in USPTO data

Inventors: Malcolm Slaney, Kilian Quirin Weinberger, Kaushal Kurapati, Sriram J. Sathish, Polly Ng

Patent Claims

12 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A computer-implemented method comprising steps of: determining, for each image in a plurality of images, a set of metadata for the image, said metadata including, for each concept of a plurality of concepts, values that indicate the probability that said each image pertains to said concept; generating, for each particular image in the plurality of images, a data structure that contains information regarding the set of metadata for the particular image; in response to a particular user's request to find other images that are similar to a selected image, comparing values in the data structure that was generated for the selected image to values in the data structure that was generated for a candidate search result image in the plurality of images; and in response to determining that a result of the comparing exceeds a specified threshold, presenting at least the candidate search result image to the user as an image that is similar to the selected image; wherein determining the set of metadata for the image comprises determining: (a) a set of attributes that reflect visual characteristics that are visible in the image, (b) a set of tags that have been associated with the image by one or more users in a community of users, (c) a set of concepts to which tags in the set of tags are related, and (d) for each concept in the set of concepts, a probability that the set of tags reflects the concept, thereby producing a set of concept probabilities for the image; wherein generating the data structure that contains information regarding the set of metadata for the particular image comprises generating a particular data structure that contains information regarding the (a) the set of attributes determined for the particular image, and (b) the set of concept probabilities determined for the particular image; for each particular tag in the set of tags determined for the particular image, (a) determining a quantity of different images, in the plurality of images, with which the 
particular tag is associated, and (b) weighting a value for a particular concept probability, in the data structure that was generated for the particular image, based at least in part on said quantity; wherein the steps are performed by one or more computing devices.
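The flow recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `ImageRecord`, `tag_image_counts`, `concept_probabilities`, `similarity` and `find_similar` are hypothetical, the IDF-style weight `log(1 + N / count)` is only one assumed way to make the weighting depend on how many images carry a tag, and cosine similarity is an assumed stand-in for the unspecified comparison.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class ImageRecord:
    """Per-image data structure: visual attributes plus concept probabilities."""
    visual: List[float]         # hypothetical visual attributes (e.g. a colour histogram)
    concept_probs: List[float]  # one weighted probability per concept


def tag_image_counts(tags_by_image: Dict[str, Set[str]]) -> Counter:
    """Count how many different images each tag is associated with."""
    counts = Counter()
    for tags in tags_by_image.values():
        counts.update(tags)
    return counts


def concept_probabilities(tags, tag_to_concept, concepts, counts, n_images):
    """Estimate the probability that a tag set reflects each concept.

    Each tag contributes log(1 + N / count) to its concept. This weight is an
    assumption; the claim only requires that it depend on the tag's image count.
    """
    mass = dict.fromkeys(concepts, 0.0)
    for tag in tags:
        concept = tag_to_concept.get(tag)
        if concept is not None:
            mass[concept] += math.log(1.0 + n_images / max(counts[tag], 1))
    total = sum(mass.values())
    return [mass[c] / total if total else 0.0 for c in concepts]


def similarity(a: ImageRecord, b: ImageRecord) -> float:
    """Cosine similarity over the concatenated vectors (an assumed comparison)."""
    va, vb = a.visual + a.concept_probs, b.visual + b.concept_probs
    dot = sum(x * y for x, y in zip(va, vb))
    norm = math.sqrt(sum(x * x for x in va)) * math.sqrt(sum(y * y for y in vb))
    return dot / norm if norm else 0.0


def find_similar(selected: str, records: Dict[str, ImageRecord], threshold: float):
    """Return ids of candidates whose similarity to `selected` exceeds the threshold."""
    return [
        image_id
        for image_id, record in records.items()
        if image_id != selected and similarity(records[selected], record) > threshold
    ]
```

In this sketch a candidate is presented only when its score exceeds the threshold, mirroring the "exceeds a specified threshold" step of the claim.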

2. A computer-implemented method comprising steps of: determining, for each image in a plurality of images, a set of metadata for the image; generating, for each particular image in the plurality of images, a data structure that contains information regarding the set of metadata for the particular image; in response to a particular user's request to find other images that are similar to a selected image, comparing values in the data structure that was generated for the selected image to values in the data structure that was generated for a candidate search result image in the plurality of images; and in response to determining that a result of the comparing exceeds a specified threshold, presenting at least the candidate search result image to the user as an image that is similar to the selected image; wherein determining the set of metadata for the image comprises determining: (a) a set of attributes that reflect visual characteristics that are visible in the image, (b) a set of tags that have been associated with the image by one or more users in a community of users, (c) a set of concepts to which tags in the set of tags are related, and (d) for each concept in the set of concepts, a probability that the set of tags reflects the concept, thereby producing a set of concept probabilities for the image; wherein generating the data structure that contains information regarding the set of metadata for the particular image comprises generating a particular data structure that contains information regarding the (a) the set of attributes determined for the particular image, and (b) the set of concept probabilities determined for the particular image; for each particular tag in the set of tags determined for the particular image, (a) determining a quantity of different images, in the plurality of images, with which the particular tag is associated, and (b) weighting a value for a particular concept probability, in the data structure that was generated for the particular 
image, based at least in part on said quantity; wherein weighting the value for the particular concept probability comprises determining a location, within a distribution of different tags' quantities, at which the particular tag's quantity lies, and weighting the value for the particular concept probability based at least in part on a distance of the particular tag's location from a median of the distribution; wherein the steps are performed by one or more computing devices.
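One possible reading of the location-based weighting in claim 2 is sketched below in Python. The helper names `percentile_location`, `location_weight` and `weight_concept_probability` are hypothetical, and both the mid-rank definition of "location" and the linear fall-off from the median are assumptions; the claim does not fix either choice.

```python
from bisect import bisect_left, bisect_right


def percentile_location(quantity, all_quantities):
    """Location of `quantity` within the distribution, as a mid-rank in [0, 1]."""
    ordered = sorted(all_quantities)
    lo = bisect_left(ordered, quantity)
    hi = bisect_right(ordered, quantity)
    return (lo + hi) / (2 * len(ordered))


def location_weight(quantity, all_quantities):
    """Weight 1.0 at the median location (0.5), falling linearly towards the extremes."""
    return 1.0 - 2.0 * abs(percentile_location(quantity, all_quantities) - 0.5)


def weight_concept_probability(probability, quantity, all_quantities):
    """Scale a concept probability by the weight of the tag that produced it."""
    return probability * location_weight(quantity, all_quantities)
```

Under these assumptions, tags that are either very rare or very common in the corpus contribute less to the stored concept probability than tags of typical frequency.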

3. A computer-implemented method comprising steps of: determining, for each image in a plurality of images, a set of metadata for the image; generating, for each particular image in the plurality of images, a data structure that contains information regarding the set of metadata for the particular image; in response to a particular user's request to find other images that are similar to a selected image, comparing values in the data structure that was generated for the selected image to values in the data structure that was generated for a candidate search result image in the plurality of images; and in response to determining that a result of the comparing exceeds a specified threshold, presenting at least the candidate search result image to the user as an image that is similar to the selected image; wherein determining the set of metadata for the image comprises determining: (a) a set of attributes that reflect visual characteristics that are visible in the image, (b) a set of tags that have been associated with the image by one or more users in a community of users, (c) a set of concepts to which tags in the set of tags are related, and (d) for each concept in the set of concepts, a probability that the set of tags reflects the concept, thereby producing a set of concept probabilities for the image; wherein generating the data structure that contains information regarding the set of metadata for the particular image comprises generating a particular data structure that contains information regarding the (a) the set of attributes determined for the particular image, and (b) the set of concept probabilities determined for the particular image; for each particular tag in the set of tags determined for the particular image, (a) determining a quantity of different images, in the plurality of images, with which the particular tag is associated, and (b) weighting a value for a particular concept probability, in the data structure that was generated for the particular 
image, based at least in part on said quantity; wherein weighting the value for the particular concept probability comprises decreasing an extent of influence that the value for the particular concept probability will have on said comparing as a distance of the particular tag's quantity becomes farther away from a median of a distribution of different tags' quantities; wherein the steps are performed by one or more computing devices.
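The effect described in claim 3, where a value influences the comparison less as its tag's quantity moves away from the median, might look like the following Python sketch. The exponential decay on a log scale, the `scale` parameter, and the function names `influence` and `weighted_sq_distance` are illustrative assumptions rather than details taken from the patent.

```python
import math
from statistics import median


def influence(quantity, all_quantities, scale=1.0):
    """Influence in (0, 1]: equals 1.0 at the median and decays with log-distance from it.

    Assumes every quantity is a positive count.
    """
    distance = abs(math.log(quantity) - math.log(median(all_quantities)))
    return math.exp(-distance / scale)


def weighted_sq_distance(probs_a, probs_b, weights):
    """Squared difference between two concept-probability lists, scaled per concept."""
    return sum(w * (a - b) ** 2 for a, b, w in zip(probs_a, probs_b, weights))
```

A concept derived from a tag far from the median count receives a small weight, so a disagreement on that concept moves the overall comparison less than a disagreement on a concept backed by a typical tag.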

4. A computer-implemented method comprising steps of: receiving a user's request to view other images that are deemed to be similar to a first image; determining a first group with which the first image is associated; determining a first set of concepts based at least in part on textual data that are associated with multiple images in the first group; determining a first set of concept probabilities for concepts in the first set of concepts; populating at least a subsection of a first feature vector with concepts in the first set of concept probabilities; determining a second group with which a second image is associated; determining a second set of concepts based at least in part on textual data that are associated with multiple images in the second group; determining a second set of concept probabilities for concepts in the second set of concepts; populating at least a subsection of a second feature vector with concepts in the second set of concept probabilities; determining a Euclidean distance based at least in part on the first feature vector and the second feature vector; and based at least in part on the Euclidean distance, selecting the second image from an image corpus for inclusion within a set of other images that are deemed to be similar to the first image; presenting the set of other images to a user from which the request was received; and weighting one or more concept probabilities in the first feature vector based at least in part on how far from a median of a concept probability distribution the one or more concept probabilities in the first feature vector occur in the concept probability distribution; wherein the steps are performed by one or more computing devices.
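The group-based steps of claim 4 can be sketched in Python as follows. This is a simplified illustration: `group_concept_probs`, `feature_vector` and `most_similar` are hypothetical names, the word-to-concept mapping is assumed to be given, concept probabilities are approximated as simple shares of concept-bearing words, and the median-based weighting step of the claim is left out for brevity.

```python
import math
from collections import Counter


def group_concept_probs(group_texts, word_to_concept, concepts):
    """Share of concept-bearing words, pooled over the text of every image in a group."""
    hits = Counter(
        word_to_concept[word]
        for text in group_texts
        for word in text.lower().split()
        if word in word_to_concept
    )
    total = sum(hits.values())
    return [hits[c] / total if total else 0.0 for c in concepts]


def feature_vector(visual, concept_probs):
    """Visual attributes followed by the concept-probability subsection."""
    return list(visual) + list(concept_probs)


def most_similar(query_vec, corpus_vecs, k=1):
    """Ids of the k corpus vectors nearest to the query by Euclidean distance."""
    ranked = sorted(corpus_vecs, key=lambda i: math.dist(query_vec, corpus_vecs[i]))
    return ranked[:k]
```

Here a smaller Euclidean distance means greater similarity, so the nearest vectors are the ones selected from the corpus for presentation.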

5. A non-transitory computer-readable storage medium storing one or more instructions which, when executed by one or more processors, cause the one or more processors to perform steps comprising: determining, for each image in a plurality of images, a set of metadata for the image, said metadata including, for each concept of a plurality of concepts, values that indicate the probability that said each image pertains to said concept; generating, for each particular image in the plurality of images, a data structure that contains information regarding the set of metadata for the particular image; in response to a particular user's request to find other images that are similar to a selected image, comparing values in the data structure that was generated for the selected image to values in the data structure that was generated for a candidate search result image in the plurality of images; and in response to determining that a result of the comparing exceeds a specified threshold, presenting at least the candidate search result image to the user as an image that is similar to the selected image; wherein determining the set of metadata for the image comprises determining: (a) a set of attributes that reflect visual characteristics that are visible in the image, (b) a set of tags that have been associated with the image by one or more users in a community of users, (c) a set of concepts to which tags in the set of tags are related, and (d) for each concept in the set of concepts, a probability that the set of tags reflects the concept, thereby producing a set of concept probabilities for the image; wherein generating the data structure that contains information regarding the set of metadata for the particular image comprises generating a particular data structure that contains information regarding the (a) the set of attributes determined for the particular image, and (b) the set of concept probabilities determined for the particular image; for each particular tag in 
the set of tags determined for the particular image, (a) determining a quantity of different images, in the plurality of images, with which the particular tag is associated, and (b) weighting a value for a particular concept probability, in the data structure that was generated for the particular image, based at least in part on said quantity; wherein the steps are performed by one or more computing devices.

6. A non-transitory computer-readable storage medium storing one or more instructions which, when executed by one or more processors, cause the one or more processors to perform steps comprising: determining, for each image in a plurality of images, a set of metadata for the image; generating, for each particular image in the plurality of images, a data structure that contains information regarding the set of metadata for the particular image; in response to a particular user's request to find other images that are similar to a selected image, comparing values in the data structure that was generated for the selected image to values in the data structure that was generated for a candidate search result image in the plurality of images; and in response to determining that a result of the comparing exceeds a specified threshold, presenting at least the candidate search result image to the user as an image that is similar to the selected image; wherein determining the set of metadata for the image comprises determining: (a) a set of attributes that reflect visual characteristics that are visible in the image, (b) a set of tags that have been associated with the image by one or more users in a community of users, (c) a set of concepts to which tags in the set of tags are related, and (d) for each concept in the set of concepts, a probability that the set of tags reflects the concept, thereby producing a set of concept probabilities for the image; wherein generating the data structure that contains information regarding the set of metadata for the particular image comprises generating a particular data structure that contains information regarding the (a) the set of attributes determined for the particular image, and (b) the set of concept probabilities determined for the particular image; for each particular tag in the set of tags determined for the particular image, (a) determining a quantity of different images, in the plurality of images, with which the particular 
tag is associated, and (b) weighting a value for a particular concept probability, in the data structure that was generated for the particular image, based at least in part on said quantity; wherein weighting the value for the particular concept probability comprises determining a location, within a distribution of different tags' quantities, at which the particular tag's quantity lies, and weighting the value for the particular concept probability based at least in part on a distance of the particular tag's location from a median of the distribution; wherein the steps are performed by one or more computing devices.

7. A non-transitory computer-readable storage medium storing one or more instructions which, when executed by one or more processors, cause the one or more processors to perform steps comprising: determining, for each image in a plurality of images, a set of metadata for the image; generating, for each particular image in the plurality of images, a data structure that contains information regarding the set of metadata for the particular image; in response to a particular user's request to find other images that are similar to a selected image, comparing values in the data structure that was generated for the selected image to values in the data structure that was generated for a candidate search result image in the plurality of images; and in response to determining that a result of the comparing exceeds a specified threshold, presenting at least the candidate search result image to the user as an image that is similar to the selected image; wherein determining the set of metadata for the image comprises determining: (a) a set of attributes that reflect visual characteristics that are visible in the image, (b) a set of tags that have been associated with the image by one or more users in a community of users, (c) a set of concepts to which tags in the set of tags are related, and (d) for each concept in the set of concepts, a probability that the set of tags reflects the concept, thereby producing a set of concept probabilities for the image; wherein generating the data structure that contains information regarding the set of metadata for the particular image comprises generating a particular data structure that contains information regarding the (a) the set of attributes determined for the particular image, and (b) the set of concept probabilities determined for the particular image; for each particular tag in the set of tags determined for the particular image, (a) determining a quantity of different images, in the plurality of images, with which the particular 
tag is associated, and (b) weighting a value for a particular concept probability, in the data structure that was generated for the particular image, based at least in part on said quantity; wherein weighting the value for the particular concept probability comprises decreasing an extent of influence that the value for the particular concept probability will have on said comparing as a distance of the particular tag's quantity becomes farther away from a median of a distribution of different tags' quantities; wherein the steps are performed by one or more computing devices.

8. A non-transitory computer-readable storage medium storing one or more instructions which, when executed by one or more processors, cause the one or more processors to perform steps comprising: receiving a user's request to view other images that are deemed to be similar to a first image; determining a first group with which the first image is associated; determining a first set of concepts based at least in part on textual data that are associated with multiple images in the first group; determining a first set of concept probabilities for concepts in the first set of concepts; populating at least a subsection of a first feature vector with concepts in the first set of concept probabilities; determining a second group with which a second image is associated; determining a second set of concepts based at least in part on textual data that are associated with multiple images in the second group; determining a second set of concept probabilities for concepts in the second set of concepts; populating at least a subsection of a second feature vector with concepts in the second set of concept probabilities; determining a Euclidean distance based at least in part on the first feature vector and the second feature vector; and based at least in part on the Euclidean distance, selecting the second image from an image corpus for inclusion within a set of other images that are deemed to be similar to the first image; presenting the set of other images to a user from which the request was received; and weighting one or more concept probabilities in the first feature vector based at least in part on how far from a median of a concept probability distribution the one or more concept probabilities in the first feature vector occur in the concept probability distribution; wherein the steps are performed by one or more computing devices.

9. An apparatus comprising: one or more storage devices storing a plurality of images; one or more metadata logic components configured to determine, for each image in the plurality of images stored on the one or more storage devices, a set of metadata for the image, the metadata including, for each concept of a plurality of concepts, values that indicate the probability that each image pertains to the concept, wherein determining the set of metadata for the image comprises determining: (a) a set of attributes that reflect visual characteristics that are visible in the image, (b) a set of tags that have been associated with the image by one or more users in a community of users, (c) a set of concepts to which tags in the set of tags are related, and (d) for each concept in the set of concepts, a probability that the set of tags reflects the concept, thereby producing a set of concept probabilities for the image, determine, for each particular tag in the set of tags determined for the particular image, (a) a quantity of different images, in the plurality of images, with which the particular tag is associated, and (b) weighting a value for a particular concept probability, in the data structure that was generated for the particular image, based at least in part on the quantity; and generate, for each particular image in the plurality of images, a data structure that contains information regarding the set of metadata for the particular image, wherein generating the data structure that contains information regarding the set of metadata for the particular image comprises generating a particular data structure that contains information regarding the (a) the set of attributes determined for the particular image, and (b) the set of concept probabilities determined for the particular image; and a search engine configured to receive a particular user's request to find other images that are similar to a selected image, respond to the request by comparing values in the data structure that was generated for the selected image to values in the data structure that was generated for a candidate search result image in the plurality of images, and respond to determining that a result of the comparing exceeds a specified threshold by presenting at least the candidate search result image to the user as an image that is similar to the selected image.
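The division of labour between the metadata logic components and the search engine in claim 9 might be organized along the lines of the Python sketch below. The class names `MetadataLogic` and `SearchEngine`, the caller-supplied `extract` and `score` callables, and the use of a flat vector as the per-image data structure are all assumptions made for illustration; the claim does not prescribe this layout.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Vector = List[float]


@dataclass
class MetadataLogic:
    """Builds the per-image data structure for every stored image."""
    extract: Callable[[object], Vector]  # hypothetical extractor supplied by the caller

    def build_index(self, storage: Dict[str, object]) -> Dict[str, Vector]:
        return {image_id: self.extract(image) for image_id, image in storage.items()}


@dataclass
class SearchEngine:
    """Answers 'find similar' requests against a prebuilt index."""
    index: Dict[str, Vector]
    score: Callable[[Vector, Vector], float]  # assumed similarity score, higher is closer
    threshold: float

    def similar_to(self, selected_id: str) -> List[str]:
        selected = self.index[selected_id]
        return [
            image_id
            for image_id, vector in self.index.items()
            if image_id != selected_id and self.score(selected, vector) > self.threshold
        ]
```

Keeping index construction separate from querying reflects the claim's split between components that generate the data structures and a search engine that only compares them at request time.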

10. An apparatus comprising: one or more storage devices storing a plurality of images; one or more metadata logic components configured to determine, for each image in the plurality of images stored on the one or more storage devices, a set of metadata for the image, wherein determining the set of metadata for the image comprises determining: (a) a set of attributes that reflect visual characteristics that are visible in the image, (b) a set of tags that have been associated with the image by one or more users in a community of users, (c) a set of concepts to which tags in the set of tags are related, and (d) for each concept in the set of concepts, a probability that the set of tags reflects the concept, thereby producing a set of concept probabilities for the image, determine, for each particular tag in the set of tags determined for the particular image, (a) a quantity of different images, in the plurality of images, with which the particular tag is associated, and (b) weighting a value for a particular concept probability, in the data structure that was generated for the particular image, based at least in part on the quantity; and generate, for each particular image in the plurality of images, a data structure that contains information regarding the set of metadata for the particular image, wherein generating the data structure that contains information regarding the set of metadata for the particular image comprises generating a particular data structure that contains information regarding the (a) the set of attributes determined for the particular image, and (b) the set of concept probabilities determined for the particular image; wherein weighting the value for the particular concept probability comprises determining a location, within a distribution of different tags' quantities, at which the particular tag's quantity lies, and weighting the value for the particular concept probability based at least in part on a distance of the particular tag's location from a median of the distribution; and a search engine configured to receive a particular user's request to find other images that are similar to a selected image, respond to the request by comparing values in the data structure that was generated for the selected image to values in the data structure that was generated for a candidate search result image in the plurality of images, and respond to determining that a result of the comparing exceeds a specified threshold by presenting at least the candidate search result image to the user as an image that is similar to the selected image.

11. An apparatus comprising: one or more storage devices storing a plurality of images; one or more metadata logic components configured to determine, for each image in the plurality of images stored on the one or more storage devices, a set of metadata for the image, wherein determining the set of metadata for the image comprises determining: (a) a set of attributes that reflect visual characteristics that are visible in the image, (b) a set of tags that have been associated with the image by one or more users in a community of users, (c) a set of concepts to which tags in the set of tags are related, and (d) for each concept in the set of concepts, a probability that the set of tags reflects the concept, thereby producing a set of concept probabilities for the image, determine, for each particular tag in the set of tags determined for the particular image, (a) a quantity of different images, in the plurality of images, with which the particular tag is associated, and (b) weighting a value for a particular concept probability, in the data structure that was generated for the particular image, based at least in part on the quantity; and generate, for each particular image in the plurality of images, a data structure that contains information regarding the set of metadata for the particular image, wherein generating the data structure that contains information regarding the set of metadata for the particular image comprises generating a particular data structure that contains information regarding the (a) the set of attributes determined for the particular image, and (b) the set of concept probabilities determined for the particular image; wherein weighting the value for the particular concept probability comprises decreasing an extent of influence that the value for the particular concept probability will have on said comparing as a distance of the particular tag's quantity becomes farther away from a median of a distribution of different tags' quantities; and a search engine configured to receive a particular user's request to find other images that are similar to a selected image, respond to the request by comparing values in the data structure that was generated for the selected image to values in the data structure that was generated for a candidate search result image in the plurality of images, and respond to determining that a result of the comparing exceeds a specified threshold by presenting at least the candidate search result image to the user as an image that is similar to the selected image.

12. An apparatus comprising: one or more processors; and one or more memories storing instructions which, when processed by the one or more processors, cause: receiving a user's request to view other images that are deemed to be similar to a first image; determining a first group with which the first image is associated; determining a first set of concepts based at least in part on textual data that are associated with multiple images in the first group; determining a first set of concept probabilities for concepts in the first set of concepts; populating at least a subsection of a first feature vector with concepts in the first set of concept probabilities; determining a second group with which a second image is associated; determining a second set of concepts based at least in part on textual data that are associated with multiple images in the second group; determining a second set of concept probabilities for concepts in the second set of concepts; populating at least a subsection of a second feature vector with concepts in the second set of concept probabilities; determining a Euclidean distance based at least in part on the first feature vector and the second feature vector; and based at least in part on the Euclidean distance, selecting the second image from an image corpus for inclusion within a set of other images that are deemed to be similar to the first image; presenting the set of other images to a user from which the request was received; and weighting one or more concept probabilities in the first feature vector based at least in part on how far from a median of a concept probability distribution the one or more concept probabilities in the first feature vector occur in the concept probability distribution.

Patent Metadata

Filing Date: Unknown

Publication Date: July 5, 2016

Inventors: Malcolm Slaney, Kilian Quirin Weinberger, Kaushal Kurapati, Sriram J. Sathish, Polly Ng
