Semantically-Consistent Image Style Transfer

Published: July 5, 2022

Assignee: not available in the USPTO data we have

Inventors: Stephan Gouws, Frederick Bertsch, Konstantinos Bousmalis, Amelie Royer, Kevin Patrick Murphy

Patent Claims

18 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A method comprising: receiving an input source domain image from a source domain; processing the source domain image using one or more source domain low-level encoder neural network layers that are specific to images from the source domain to generate a low-level representation of the input source domain image; processing the low-level representation using one more high-level encoder neural network layers that are shared between images from the source and target domains to generate an embedding of the input source domain image; processing the embedding of the input source domain image using one or more high-level decoder neural network layers that are shared between images from the source and target domains to generate a high-level feature representation of features of the input source domain image; and processing the high-level feature representation of the features of the input source domain image using one or more target domain low-level decoder neural network layers that are specific to generating images from the target domain to generate an output target domain image that is from the target domain but that has similar semantics to the input source domain image.
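
The data flow recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the single dense layers, the tanh non-linearity, the flattened 64-value "image", and all names and sizes below are assumptions chosen only to show which layers are domain-specific and which are shared.

```python
import numpy as np

rng = np.random.default_rng(0)
PIXELS, LOW, EMB = 64, 32, 16  # hypothetical sizes, not taken from the patent


def init_dense(in_dim, out_dim):
    """Return randomly initialised (weights, bias) for one dense layer."""
    return rng.normal(scale=0.1, size=(in_dim, out_dim)), np.zeros(out_dim)


def forward(layer, x):
    """Apply one dense layer followed by a tanh non-linearity."""
    weights, bias = layer
    return np.tanh(x @ weights + bias)


# Domain-specific low-level layers.
source_low_encoder = init_dense(PIXELS, LOW)
target_low_decoder = init_dense(LOW, PIXELS)
# High-level layers shared between the source and target domains.
shared_high_encoder = init_dense(LOW, EMB)
shared_high_decoder = init_dense(EMB, LOW)


def source_to_target(source_image):
    """Follow the four processing steps of claim 1 in order."""
    low_level = forward(source_low_encoder, source_image)
    embedding = forward(shared_high_encoder, low_level)
    high_level = forward(shared_high_decoder, embedding)
    return forward(target_low_decoder, high_level)


source_image = rng.uniform(-1, 1, size=PIXELS)  # stand-in for a flattened image
target_image = source_to_target(source_image)
```

With untrained random weights the output is meaningless as an image; the sketch only shows the order in which the layer groups are applied.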

2. The method of claim 1, further comprising: receiving an input target domain image from a target domain; processing the input target domain image using one or more target domain low-level encoder neural network layers that are specific to images from the target domain to generate a low-level representation of the input target domain image; processing the low-level representation using the one or more high-level encoder neural network layers that are shared between images from the source and target domains to generate an embedding of the input target domain image; processing the embedding of the input target domain image using the one or more high-level decoder neural network layers that are shared between images from the source and target domains to generate a high-level feature representation of features of the input target domain image; and processing the high-level feature representation of the features of the target source domain image using one or more source domain low-level decoder neural network layers that are specific to generating images from the source domain to generate an output source domain image that is from the source domain but that has similar semantics to the input target domain image.

3. The method of claim 1, wherein the source domain low-level encoder neural network layers, the high-level encoder neural network layers, the high-level decoder neural network layers, and the target domain low-level decoder neural network layers have been trained jointly with one or more target domain low-level encoder neural network layers and one or more source domain low-level decoder neural network layers.

4. The method of claim 2, wherein the source domain low-level encoder neural network layers have a same architecture as the target domain low-level encoder neural network layers but different parameter values.

5. The method of claim 2, wherein the source domain low-level decoder neural network layers have a same architecture as the target domain low-level decoder neural network layers but different parameter values.
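
Claims 2, 4 and 5 together describe two translation directions whose domain-specific layers share an architecture but not parameter values. The sketch below is one hypothetical way to organise that, assuming single dense layers and made-up sizes; the dictionary layout and the `translate` helper are illustrative names, not part of the patent.

```python
import numpy as np

rng = np.random.default_rng(0)
PIXELS, LOW, EMB = 64, 32, 16  # hypothetical sizes


def init_dense(in_dim, out_dim):
    """Create one dense layer with its own randomly initialised parameters."""
    return {"w": rng.normal(scale=0.1, size=(in_dim, out_dim)),
            "b": np.zeros(out_dim)}


def forward(layer, x):
    return np.tanh(x @ layer["w"] + layer["b"])


# Same architecture (shapes) per domain, but separately initialised parameters.
low_encoders = {d: init_dense(PIXELS, LOW) for d in ("source", "target")}
low_decoders = {d: init_dense(LOW, PIXELS) for d in ("source", "target")}
# One set of high-level layers shared by both domains.
high_encoder = init_dense(LOW, EMB)
high_decoder = init_dense(EMB, LOW)


def translate(image, from_domain, to_domain):
    """Encode with the input domain's layers, decode with the output domain's."""
    low_level = forward(low_encoders[from_domain], image)
    embedding = forward(high_encoder, low_level)
    high_level = forward(high_decoder, embedding)
    return forward(low_decoders[to_domain], high_level)


image = rng.uniform(-1, 1, size=PIXELS)
as_target = translate(image, "source", "target")  # direction of claim 1
as_source = translate(image, "target", "source")  # direction of claim 2
```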

6. A method of training source domain low-level encoder neural network layers, high-level encoder neural network layers, high-level decoder neural network layers, target domain low-level decoder neural network layers, target domain low-level encoder neural network layers and source domain low-level encoder neural network layers, the method comprising: receiving a training input source domain image from a source domain; processing the training input source domain image using the source domain low-level encoder neural network layers to generate a low-level representation of the training input source domain image; processing the low-level representation using the high-level encoder neural network layers to generate an embedding of the training input source domain image; processing the embedding of the training input source domain image using the one or more high-level decoder neural network layers to generate a high-level feature representation of features of the training input source domain image; and processing the high-level feature representation of the features of the training input source domain image using the one or more target domain low-level decoder neural network layers to generate a training output target domain image that is from a target domain; processing the training output target domain image using the one or more target domain low-level encoder neural network layers to generate a low-level representation of the training output target domain image; processing the low-level representation using the one more high-level encoder neural network layers to generate an embedding of the training output target domain image; determining a first gradient of a semantic consistency loss function that reduces a distance measure between the embedding of the training output target domain image and the embedding of the training input source domain image; and updating, using the first gradient, current values of the parameters of the source domain low-level encoder neural network layers, the high-level encoder neural network layers, the high-level decoder neural network layers, the target domain low-level decoder neural network layers, and the target domain low-level encoder neural network layers.
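
The semantic consistency loss of claim 6 compares the embedding of a source image with the embedding of its own translation. A minimal forward-only sketch follows. The claim does not name the distance measure, so the squared Euclidean distance used here is an assumption, as are the dense layers and sizes; in practice the gradient would come from an automatic differentiation framework.

```python
import numpy as np

rng = np.random.default_rng(0)
PIXELS, LOW, EMB = 64, 32, 16  # hypothetical sizes


def init_dense(in_dim, out_dim):
    return rng.normal(scale=0.1, size=(in_dim, out_dim)), np.zeros(out_dim)


def forward(layer, x):
    weights, bias = layer
    return np.tanh(x @ weights + bias)


source_low_encoder = init_dense(PIXELS, LOW)
target_low_encoder = init_dense(PIXELS, LOW)
target_low_decoder = init_dense(LOW, PIXELS)
high_encoder = init_dense(LOW, EMB)  # shared
high_decoder = init_dense(EMB, LOW)  # shared


def semantic_consistency_loss(source_image):
    # Embed the source image.
    source_embedding = forward(high_encoder, forward(source_low_encoder, source_image))
    # Translate it into the target domain.
    translated = forward(target_low_decoder, forward(high_decoder, source_embedding))
    # Re-embed the translated image with the target-domain encoder.
    translated_embedding = forward(high_encoder, forward(target_low_encoder, translated))
    # Assumed distance measure: squared Euclidean distance between embeddings.
    return float(np.sum((translated_embedding - source_embedding) ** 2))


loss = semantic_consistency_loss(rng.uniform(-1, 1, size=PIXELS))
```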

7. The method of claim 6, further comprising: receiving a training input target domain image from the target domain; processing the training input target domain image using the target domain low-level encoder neural network layers to generate a low-level representation of the training input target domain image; processing the low-level representation using the high-level encoder neural network layers to generate an embedding of the training input target domain image; processing the embedding of the training input target domain image using the one or more high-level decoder neural network layers to generate a high-level feature representation of features of the training input target domain image; and processing the high-level feature representation of the features of the training input target domain image using the one or more source domain low-level decoder neural network layers to generate a training output source domain image that is from the source domain; processing the training output source domain image using the one or more source domain low-level encoder neural network layers to generate a low-level representation of the training output target domain image; processing the low-level representation using the one more high-level encoder neural network layers to generate an embedding of the training output source domain image; determining a second gradient of a semantic consistency loss function that reduces a distance measure between the embedding of the training output source domain image and the embedding of the training input target domain image; and updating, using the second gradient, current values of the parameters of the target domain low-level encoder neural network layers, source domain low-level encoder neural network layers, the high-level encoder neural network layers, the high-level decoder neural network layers, and the source domain low-level decoder neural network layers.

8. The method of claim 6, further comprising: processing the high-level feature representation of the features of the training input source domain image using the one or more source domain low-level decoder neural network layers to generate a training output source domain image; determining a gradient of a reconstruction loss function that reduces a distance measure between the embedding of the training output target domain image and the embedding of the training input source domain image; and updating, using the gradient, current values of the parameters of the source domain low-level encoder neural network layers, the high-level encoder neural network layers, the high-level decoder neural network layers, and the source domain low-level decoder neural network layers.
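
Claim 8 decodes a source image back into the source domain. Reconstruction losses are commonly computed in pixel space, between the decoded image and the original input, and the sketch below uses that reading. This is an assumption that differs from the embedding-based wording of the claim, and the dense layers, sizes and mean squared error are likewise illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
PIXELS, LOW, EMB = 64, 32, 16  # hypothetical sizes


def init_dense(in_dim, out_dim):
    return rng.normal(scale=0.1, size=(in_dim, out_dim)), np.zeros(out_dim)


def forward(layer, x):
    weights, bias = layer
    return np.tanh(x @ weights + bias)


source_low_encoder = init_dense(PIXELS, LOW)
source_low_decoder = init_dense(LOW, PIXELS)
high_encoder = init_dense(LOW, EMB)  # shared
high_decoder = init_dense(EMB, LOW)  # shared


def reconstruction_loss(source_image):
    embedding = forward(high_encoder, forward(source_low_encoder, source_image))
    high_level = forward(high_decoder, embedding)
    # Decode back into the SOURCE domain rather than the target domain.
    reconstructed = forward(source_low_decoder, high_level)
    # Assumed pixel-space distance: mean squared error against the input.
    return float(np.mean((reconstructed - source_image) ** 2))


loss = reconstruction_loss(rng.uniform(-1, 1, size=PIXELS))
```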

9. The method of claim 6, wherein the training is performed jointly with the training of a classifier that is configured to receive an embedding of an input image and to process the embedding to classify the input image as either being a target domain image or an image that was adapted from the source domain, and wherein the method further comprises: processing the embedding of the training input source domain image using the classifier to generate a classification of the training input source domain image; determining a gradient of a classification loss function that decreases an accuracy of the classification generated by the classifier; and updating, using the gradient, current values of the parameters of the source domain low-level encoder neural network layers and the high-level encoder neural network layers.

10. The method of claim 9, further comprising: determining a gradient of a classification loss function that increases the accuracy of the classification generated by the classifier; and updating, using the gradient, current values of the parameters of the classifier.
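
Claims 9 and 10 pull in opposite directions on the same classifier output: the classifier is updated to be more accurate, the encoder layers to make it less accurate. One common way to express this, assumed here and not stated in the claims, is binary cross-entropy with the true label for the classifier and the flipped label for the encoder. The linear classifier and the random stand-in embedding are also hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def binary_cross_entropy(prob_target, label):
    """label: 1.0 = target-domain image, 0.0 = image adapted from the source domain."""
    eps = 1e-12  # avoids log(0)
    return float(-(label * np.log(prob_target + eps)
                   + (1.0 - label) * np.log(1.0 - prob_target + eps)))


rng = np.random.default_rng(0)
EMB = 16  # hypothetical embedding size
classifier_weights = rng.normal(scale=0.1, size=EMB)  # hypothetical linear classifier
source_embedding = rng.normal(size=EMB)  # stand-in for the embedding of a source image

prob_target = sigmoid(source_embedding @ classifier_weights)

# Claim 10: the classifier minimises the loss against the true label (0.0).
classifier_loss = binary_cross_entropy(prob_target, 0.0)
# Claim 9: the encoder minimises the loss against the flipped label (1.0),
# which lowers the classifier's accuracy on this embedding.
encoder_loss = binary_cross_entropy(prob_target, 1.0)
```

Minimising `encoder_loss` pushes `prob_target` towards 1 while minimising `classifier_loss` pushes it towards 0, which is the adversarial tension the two claims describe.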

11. The method of claim 6, further comprising: wherein the training is performed jointly with the training of a discriminator that is configured to receive an input image and to process the input image and to classify the input image as either being from the source domain or the target domain, and wherein the method further comprises: processing the training output target domain image using the discriminator to generate a classification of the training output target domain image; determining a gradient of a discriminator loss function that decreases an accuracy of the classification generated by the discriminator; and updating, using the gradient, current values of the parameters of the source domain low-level encoder neural network layers, the high-level encoder neural network layers, the high-level decoder neural network layers, and the target domain low-level decoder neural network layers.

12. The method of claim 11, further comprising: determining a gradient of a discriminator loss function that increases an accuracy of the classification generated by the discriminator; and updating, using the gradient, current values of the parameters of the discriminator.
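
The discriminator update of claim 12 can be sketched as one gradient descent step on a logistic discriminator. Everything concrete here is an assumption: the linear model, the label convention (1.0 for a real target-domain image, 0.0 for a translated one), the learning rate, and the random vectors standing in for images.

```python
import numpy as np

rng = np.random.default_rng(0)
PIXELS = 64  # hypothetical flattened image size


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def discriminator_loss(weights, bias, images, labels):
    """Mean binary cross-entropy of a linear (logistic) discriminator."""
    probs = sigmoid(images @ weights + bias)
    eps = 1e-12  # avoids log(0)
    return float(-np.mean(labels * np.log(probs + eps)
                          + (1.0 - labels) * np.log(1.0 - probs + eps)))


def discriminator_step(weights, bias, images, labels, lr=0.05):
    """One gradient descent step; the logistic-loss gradient is (p - y) * x."""
    error = sigmoid(images @ weights + bias) - labels
    grad_weights = images.T @ error / len(labels)
    grad_bias = float(np.mean(error))
    return weights - lr * grad_weights, bias - lr * grad_bias


real_target = rng.uniform(-1, 1, size=PIXELS)  # stand-in for a real target image
translated = rng.uniform(-1, 1, size=PIXELS)   # stand-in for a translated image
images = np.stack([real_target, translated])
labels = np.array([1.0, 0.0])                  # assumed label convention

weights, bias = np.zeros(PIXELS), 0.0
loss_before = discriminator_loss(weights, bias, images, labels)
weights, bias = discriminator_step(weights, bias, images, labels)
loss_after = discriminator_loss(weights, bias, images, labels)
```

The complementary update of claim 11 would adjust the translation layers with the opposite objective, so that the translated image is classified less accurately.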

13. The method of claim 6, further comprising: processing the training input source domain image using a pre-trained teacher network to generate a teacher embedding of the input source domain image; and determining a gradient of a teacher loss function that decreases a distance measure between the embedding of the input source domain image and the teacher embedding of the input source domain image; and updating, using the gradient, current values of the parameters of the source domain low-level encoder neural network layers and the high-level encoder neural network layers.
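
The teacher loss of claim 13 pulls the encoder's embedding towards that of a frozen, pre-trained teacher. In the sketch below a single linear map stands in for the whole encoder stack and a fixed random projection stands in for the teacher. Both are simplifying assumptions, as are the squared Euclidean distance and the learning rate.

```python
import numpy as np

rng = np.random.default_rng(0)
PIXELS, EMB = 64, 16  # hypothetical sizes

# Frozen stand-in for the pre-trained teacher network (never updated).
teacher_weights = rng.normal(scale=0.1, size=(PIXELS, EMB))
# Trainable stand-in for the source low-level and high-level encoder layers.
encoder_weights = rng.normal(scale=0.1, size=(PIXELS, EMB))


def teacher_loss(encoder_weights, image):
    """Squared Euclidean distance between encoder and teacher embeddings."""
    residual = image @ encoder_weights - image @ teacher_weights
    return float(np.sum(residual ** 2))


def encoder_step(encoder_weights, image, lr=0.005):
    """One gradient descent step on the encoder only."""
    residual = image @ encoder_weights - image @ teacher_weights
    gradient = 2.0 * np.outer(image, residual)  # d(loss) / d(encoder_weights)
    return encoder_weights - lr * gradient


image = rng.uniform(-1, 1, size=PIXELS)  # stand-in for a flattened source image
loss_before = teacher_loss(encoder_weights, image)
encoder_weights = encoder_step(encoder_weights, image)
loss_after = teacher_loss(encoder_weights, image)
```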

14. A system comprising one or more computers and one or more storage devices storing instructions that when implemented by the one or more computers cause the one or more computers to perform operations comprising: receiving an input source domain image from a source domain; processing the source domain image using one or more source domain low-level encoder neural network layers that are specific to images from the source domain to generate a low-level representation of the input source domain image; processing the low-level representation using one more high-level encoder neural network layers that are shared between images from the source and target domains to generate an embedding of the input source domain image; processing the embedding of the input source domain image using one or more high-level decoder neural network layers that are shared between images from the source and target domains to generate a high-level feature representation of features of the input source domain image; and processing the high-level feature representation of the features of the input source domain image using one or more target domain low-level decoder neural network layers that are specific to generating images from the target domain to generate an output target domain image that is from the target domain but that has similar semantics to the input source domain image.

15. The system of claim 14, the operations further comprising: receiving an input target domain image from a target domain; processing the input target domain image using one or more target domain low-level encoder neural network layers that are specific to images from the target domain to generate a low-level representation of the input target domain image; processing the low-level representation using the one or more high-level encoder neural network layers that are shared between images from the source and target domains to generate an embedding of the input target domain image; processing the embedding of the input target domain image using the one or more high-level decoder neural network layers that are shared between images from the source and target domains to generate a high-level feature representation of features of the input target domain image; and processing the high-level feature representation of the features of the target source domain image using one or more source domain low-level decoder neural network layers that are specific to generating images from the source domain to generate an output source domain image that is from the source domain but that has similar semantics to the input target domain image.

16. The system of claim 14, wherein the source domain low-level encoder neural network layers, the high-level encoder neural network layers, the high-level decoder neural network layers, and the target domain low-level decoder neural network layers have been trained jointly with one or more target domain low-level encoder neural network layers and one or more source domain low-level decoder neural network layers.

17. The system of claim 15, wherein the source domain low-level encoder neural network layers have a same architecture as the target domain low-level encoder neural network layers but different parameter values.

18. The system of claim 15, wherein the source domain low-level decoder neural network layers have a same architecture as the target domain low-level decoder neural network layers but different parameter values.

Patent Metadata

Filing Date

Unknown

Publication Date

July 5, 2022

Inventors

Stephan Gouws

Frederick Bertsch

Konstantinos Bousmalis

Amelie Royer

Kevin Patrick Murphy
