Learning Image Categorization Using Related Attributes

Published: April 24, 2018

Assignee: not available in the USPTO data

Inventors: ZHE LIN, HAILIN JIN, JIANCHAO YANG

Technical Abstract

Patent Claims

20 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A non-transitory computer storage medium comprising computer-useable instructions that, when used by one or more computing devices, cause the one or more computing devices to perform operations comprising: implementing a regularized double-column convolutional neural network (RDCNN) to classify image features for a set of images, the implementing comprising: receiving an image from the set of images; training a first feature column in a first neural network of the RDCNN using a first set of image representations as inputs to the first feature column, the trained first feature column having fixed parameters; generating a first set of image attributes by the trained first feature column using the fixed parameters; and training a second feature multi-column in a second neural network of the RDCNN using the generated first set of image attributes, wherein the parameters of the trained first feature column remain fixed, wherein the second feature multi-column comprises at least two columns that are independent in each convolutional layer of the second feature multi-column, and wherein each of the at least two columns receives a different input; and identifying a class associated with the second feature multi-column for the image using the implemented RDCNN.

2. The non-transitory computer storage medium of claim 1, further comprising extracting a global image representation of the image as one or more global inputs to the second feature multi-column of the RDCNN.

3. The non-transitory computer storage medium of claim 1, further comprising extracting a local image representation of the image as one or more fine-grained inputs to the second feature multi-column of the RDCNN.

4. The non-transitory computer storage medium of claim 3, comprising extracting a local image representation of the image as an input to the first feature column of the RDCNN.

5. The non-transitory computer storage medium of claim 2, wherein the global image representation is extracted by resizing the image.

6. The non-transitory computer storage medium of claim 2, wherein the global image representation is extracted by warping the image into a normalized input with a fixed size.

7. The non-transitory computer storage medium of claim 2, wherein the global image representation is extracted by normalizing a shorter side of the image to a normalized input with a fixed length S and center-cropping the normalized input to generate a s×s×3 input.

8. The non-transitory computer storage medium of claim 2, wherein the global image representation is extracted by normalizing a longer side of the image to a fixed length S and generating a normalized input of a fixed size s×s×3 by padding border pixels with zero.

9. The non-transitory computer storage medium of claim 4, further comprising randomly cropping the image into a normalized input for local image representation for the first feature column and second feature multi-column, the normalized input having a fixed size and preserving details of the image in original high-resolution format.

10. The non-transitory computer storage medium of claim 1, wherein the first feature column is a style column.

11. The non-transitory computer storage medium of claim 10, wherein styles associated with the style column include rule-of-thirds, high dynamic range, black and white, long exposure, complementary colors, vanishing point, and soft focus.

12. The non-transitory computer storage medium of claim 10, wherein the second feature multi-column is an aesthetics column.

13. The non-transitory computer storage medium of claim 1, wherein an architecture associated with each column in the RDCNN comprises at least four convolutional layers and at least two fully-connected layers.

14. The non-transitory computer storage medium of claim 1, further comprising replacing a last layer of the RDCNN with a regression.

15. A computer-implemented method comprising: implementing a regularized double-column convolutional neural network (RDCNN) to classify image features for a set of images, the implementing comprising: receiving an image from the set of images; extracting a local image representation of the image; training a first feature column in a first neural network of the RDCNN utilizing the extracted local image representation as an input to the first feature column of the RDCNN, the first feature column associated with style and having fixed parameters; extracting a global image representation of the image; and training a second feature multi-column in a second neural network of the RDCNN utilizing the fixed parameters of the trained first feature column and utilizing the extracted global image representation as one or more global inputs to the second feature multi-column of the RDCNN, and utilizing the local image representation of the image as one or more fine-grained inputs to the second feature multi-column of the RDCNN, the second feature multi-column associated with aesthetics, wherein the second feature multi-column comprises at least two columns that are independent in each convolutional layer of the second feature multi-column, and each of the at least two columns receives a different input; and identifying a class associated with the second feature multi-column for the image utilizing the implemented RDCNN.

16. The method of claim 15, further comprising resizing an image to extract the global image representation.

17. The method of claim 15, further comprising warping the image into a normalized input with a fixed size to extract the global image representation.

18. The method of claim 15, further comprising randomly cropping the image into a normalized input for the local image representation for the first feature column and second feature multi-column, the normalized input having a fixed size and preserving details of the image in original high-resolution format.

19. The method of claim 15, wherein styles associated with the style column include rule-of-thirds, high dynamic range, black and white, long exposure, complementary colors, vanishing point, and soft focus.

20. A computerized system comprising: one or more processors; and one or more computer storage media storing computer-useable instructions that, when used by the one or more processors, cause the one or more processors to: implement a regularized double-column convolutional neural network (RDCNN) to classify image features for a set of images, the implementing to: receive an image from the set of images; train a first feature column in a first neural network of the RDCNN using a first set of image representations as inputs to the first feature column, the trained first feature column having fixed parameters; and train a second feature multi-column in a second neural network of the RDCNN using the fixed parameters of the trained first feature column, and using a second set of image representations as inputs to the second feature multi-column, wherein the second feature multi-column comprises at least two columns that are independent in each convolutional layer of the second feature multi-column, and each of the at least two columns receives a different input; and identify a class associated with the second feature multi-column for the image using the implemented RDCNN.
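Claims 6 through 9 (and 17 and 18) describe four ways of turning an arbitrarily sized image into a fixed-size network input: warping, center-cropping after normalizing the shorter side, zero-padding after normalizing the longer side, and random cropping at the original resolution. The sketch below illustrates those four operations only; it is not the patented implementation. The function names, the list-of-rows image layout with (r, g, b) tuples, the nearest-neighbour resampling, and the centered placement of the padded image are all assumptions made for illustration, since the claims specify none of them.

```python
import random

def resize_nn(img, out_h, out_w):
    """Nearest-neighbour resize (assumed; the claims name no interpolation)."""
    h, w = len(img), len(img[0])
    return [[img[r * h // out_h][c * w // out_w] for c in range(out_w)]
            for r in range(out_h)]

def warp(img, s):
    """Claim 6: warp the whole image to s x s, ignoring aspect ratio."""
    return resize_nn(img, s, s)

def center_crop(img, s):
    """Claim 7: scale the shorter side to s, then take the central s x s."""
    h, w = len(img), len(img[0])
    if h <= w:
        new_h, new_w = s, max(s, round(w * s / h))
    else:
        new_h, new_w = max(s, round(h * s / w)), s
    scaled = resize_nn(img, new_h, new_w)
    top, left = (new_h - s) // 2, (new_w - s) // 2
    return [row[left:left + s] for row in scaled[top:top + s]]

def pad(img, s):
    """Claim 8: scale the longer side to s, zero-pad the borders to s x s."""
    h, w = len(img), len(img[0])
    if h >= w:
        new_h, new_w = s, max(1, round(w * s / h))
    else:
        new_h, new_w = max(1, round(h * s / w)), s
    scaled = resize_nn(img, new_h, new_w)
    # Centering the scaled image is an assumption; the claim only says "border".
    top, left = (s - new_h) // 2, (s - new_w) // 2
    out = [[(0, 0, 0)] * s for _ in range(s)]
    for r in range(new_h):
        for c in range(new_w):
            out[top + r][left + c] = scaled[r][c]
    return out

def random_crop(img, s, rng=random):
    """Claim 9: cut an s x s patch from the image without rescaling it."""
    h, w = len(img), len(img[0])
    if h < s or w < s:
        raise ValueError("image is smaller than the requested crop")
    top = rng.randrange(h - s + 1)
    left = rng.randrange(w - s + 1)
    return [row[left:left + s] for row in img[top:top + s]]

# Toy 4-row by 6-column image whose pixels encode their own coordinates.
demo = [[(r, c, 0) for c in range(6)] for r in range(4)]
global_inputs = [warp(demo, 2), center_crop(demo, 2), pad(demo, 6)]
local_input = random_crop(demo, 3, random.Random(0))
```

In this reading, the first three functions produce candidate global inputs, while `random_crop` keeps the original pixel detail and so corresponds to the local, fine-grained input fed to both the first feature column and the second feature multi-column.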

Patent Metadata

Filing Date: Unknown

Publication Date: April 24, 2018

Inventors: ZHE LIN, HAILIN JIN, JIANCHAO YANG
