US-10915817

Method of training a neural network

Published: February 9, 2021

Assignee: not available in USPTO data we have

Inventors: not available in USPTO data we have

Technical Abstract

Training a target neural network comprises providing a first batch of samples of a given class to respective instances of a generative neural network, each instance providing a variant of the sample in accordance with the parameters of the generative network. Each variant produced by the generative network is compared with another sample of the class to provide a first loss function for the generative network. A second batch of samples is provided to the target neural network, at least some of the samples comprising variants produced by the generative network. A second loss function is determined for the target neural network by comparing outputs of instances of the target neural network to one or more targets for the neural network. The parameters for the target neural network are updated using the second loss function and the parameters for the generative network are updated using the first and second loss functions.
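The update scheme in the abstract can be illustrated with a deliberately tiny sketch. Everything concrete here is an assumption made for illustration, not the patent's implementation: the "generative network" is a single scalar weight `w` that scales a sample, the "target network" is a single scalar weight `t`, both losses are mean squared errors, the gradients are derived by hand, and the weighted sum of the two losses (defaults `alpha=0.3`, `beta=0.7`) is one possible way to combine them. The second batch here consists entirely of variants.

```python
def training_step(w, t, batch, others, target, alpha=0.3, beta=0.7, lr=0.1):
    """One joint update of a toy generator (w) and toy target model (t).

    batch   -- samples of one class fed to the generator
    others  -- other samples of the same class, compared with each variant
    target  -- desired output of the target model for this class
    All models and losses are hypothetical stand-ins (scalars, squared error).
    """
    n = len(batch)

    # Generator: each instance produces a variant of its input sample.
    variants = [w * x for x in batch]

    # First loss: variant vs. another sample of the class.
    loss_a = sum((v - o) ** 2 for v, o in zip(variants, others)) / n

    # Second loss: target-model output on the variants vs. the target value.
    loss_b = sum((t * v - target) ** 2 for v in variants) / n

    # Hand-derived gradients of the two mean-squared-error losses.
    grad_t_b = sum(2 * (t * v - target) * v for v in variants) / n
    grad_w_a = sum(2 * (w * x - o) * x for x, o in zip(batch, others)) / n
    grad_w_b = sum(2 * (t * w * x - target) * t * x for x in batch) / n

    # Target model is updated with the second loss only.
    new_t = t - lr * grad_t_b
    # Generator is updated with both losses (weighted sum assumed here).
    new_w = w - lr * (alpha * grad_w_a + beta * grad_w_b)
    return new_w, new_t, loss_a, loss_b


new_w, new_t, loss_a, loss_b = training_step(
    w=1.0, t=0.5, batch=[1.0, 2.0], others=[1.5, 2.5], target=1.0
)
print(new_w, new_t, loss_a, loss_b)
```

The point of the sketch is the asymmetry of the updates: the target model only ever sees the second loss, while the generator is pulled both toward realistic variants (first loss) and toward variants that are useful to the target model (second loss).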

Patent Claims

18 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A method of training a target neural network comprising:
a) providing a first batch of samples of a class to respective instances of a generative neural network, each instance of said generative neural network providing a variant of said sample in accordance with parameters of said generative neural network;
b) comparing each variant provided by said generative neural network with another sample of said class to provide a first loss function for said generative neural network;
c) determining to include one or more variants provided by said generative neural network or one or more samples of said class in a second batch of samples based at least in part on a probability;
d) including a proportion of the one or more variants provided by said generative neural network and the one or more samples of said class in the second batch of samples based at least in part on the determining, wherein said proportion varies from the second batch of samples to a third batch of samples;
e) providing the second batch of samples to said target neural network, at least one sample of said second batch of samples comprising the one or more variants produced by said generative neural network;
f) determining a second loss function for said target neural network by comparing outputs of instances of said target neural network to one or more targets for said target neural network;
g) updating the parameters for said target neural network using said second loss function; and
h) updating the parameters for said generative neural network using said first loss function for said generative neural network and said second loss function for said target neural network.

2. A method according to claim 1 wherein said proportion increases or decreases with a successive repetition of steps a) to h).

3. A method according to claim 1 wherein said second batch of samples comprises a proportion of variants less than all variants provided by said generative neural network, the proportion based at least in part on the first loss function for said generative neural network.

4. A method according to claim 1 wherein said target neural network is a multi-class classifier, said generative neural network is a first generative neural network, and said method further comprises: providing a third batch of samples of a second class to respective instances of a second generative neural network, each instance of said second generative neural network providing a variant of said sample in accordance with the parameters of said second generative neural network; comparing each variant produced by said second generative neural network with another sample of said second class to provide a loss function for said second generative neural network; wherein said second batch of samples provided to said target neural network further comprises said variants produced by said second generative neural network; and updating the parameters for said second generative neural network using said loss function for said second generative neural network and said second loss function for said target neural network.

5. A method according to claim 1 wherein a combined loss function for said generative neural network is αL_A + βL_B where L_A is said first loss function and L_B is said second loss function and said combined loss function is used to update the parameters of said generative neural network.

6. A method according to claim 5, wherein α is less than or equal to 0.3 and β is greater than or equal to 0.7.

7. A method according to claim 5 where α and β change with a successive repetition of steps a) to h).

8. A method according to claim 1 wherein said target neural network comprises a fully-connected layer providing said outputs.

9. A method according to claim 1 wherein each sample comprises an image comprising one or more channels.

10. A method according to claim 9 wherein said one or more channels comprise one or more of image planes or processed versions of image planes.

11. A method according to claim 9 wherein said target neural network comprises a gender classifier for indicating a gender of a subject of an image.

12. A method according to claim 1 comprising providing pairs of samples from said first batch of samples of a given class to respective instances of a generative neural network, each instance of said generative neural network combining said pairs of samples and providing a variant of said samples in accordance with the parameters of said generative neural network.
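Steps c) and d) of claim 1, together with claim 2, describe a probabilistic choice between variants and original samples whose proportion changes from batch to batch. A minimal sketch of one way this could be arranged follows. The linear schedule, its `start` and `end` values, and the function names `variant_probability` and `build_second_batch` are all hypothetical; the claims do not specify how the probability is set or how it evolves.

```python
import random


def variant_probability(step, total_steps, start=0.1, end=0.5):
    """Hypothetical linear schedule for the chance of picking a variant."""
    frac = step / max(total_steps - 1, 1)
    frac = min(max(frac, 0.0), 1.0)  # clamp to [0, 1]
    return start + frac * (end - start)


def build_second_batch(samples, variants, p_variant, rng):
    """For each position, include the variant with probability p_variant,
    otherwise include the original sample."""
    batch = []
    for sample, variant in zip(samples, variants):
        use_variant = rng.random() < p_variant
        batch.append(variant if use_variant else sample)
    return batch


rng = random.Random(0)
samples = ["s0", "s1", "s2", "s3"]
variants = ["v0", "v1", "v2", "v3"]
for step in range(3):
    p = variant_probability(step, total_steps=3)
    print(step, round(p, 2), build_second_batch(samples, variants, p, rng))
```

Because the decision is made per sample, the realized proportion of variants in any one batch fluctuates around the scheduled probability, which is consistent with a proportion that varies between successive batches.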

13. A non-transitory computer program product comprising a computer readable medium storing computer executable instructions that, when executed, configure a computing device to perform operations comprising: inputting a first batch of samples to a generative neural network; outputting, by the generative neural network, one or more variants of at least one sample in the first batch of samples based at least in part on a parameter of the generative neural network; determining a first loss function for the generative neural network based at least in part on comparing the one or more variants produced by the generative neural network to another sample; selecting a proportion of the one or more variants and one or more samples in the first batch of samples for inclusion in a second batch of samples based at least in part on the first loss function for the generative neural network; inputting the second batch of samples to a target neural network, the proportion varying from the second batch of samples to a third batch of samples; determining a second loss function for the target neural network by comparing an output of the target neural network to a target value for the target neural network; updating a parameter for the target neural network based at least in part on the second loss function; and updating the parameter for the generative neural network based at least in part on the first loss function and the second loss function.

14. A non-transitory computer program product of claim 13, wherein the first batch of samples comprises a label identifying a class for at least one sample in the first batch of samples.

15. A non-transitory computer program product of claim 13, the operations further comprising: determining a probability for inputting the one or more variants determined by the generative neural network in the second batch of samples; and inputting the one or more variants in the second batch of samples based at least in part on the probability.

16. A computer device comprising: hardware programmed to: input a first batch of samples to a generative neural network; output, by the generative neural network, one or more variants of one or more samples in the first batch of samples based at least in part on parameters of the generative neural network; determine a first loss function for the generative neural network based at least in part on comparing the one or more variants produced by the generative neural network to at least one of the one or more samples in the first batch of samples; determine to include the one or more variants or the one or more samples in a second batch of samples based at least in part on a probability; input a proportion of the one or more variants and the one or more samples in the second batch of samples to a target neural network, the proportion varying from the second batch of samples to a third batch of samples; determine a second loss function for the target neural network by comparing outputs of instances of the target neural network to one or more targets for the target neural network; update parameters for the target neural network based at least in part on the second loss function; and update the parameters for the generative neural network based at least in part on the first loss function and the second loss function.

17. A computer device of claim 16, further comprising a selector component located between the generative neural network and the target neural network to select the one or more variants for inclusion in the second batch of samples based at least in part on the first loss function for the generative neural network.

18. A computer device of claim 17, wherein the selector component to select the one or more variants for inclusion in the second batch of samples is further based at least in part on the probability.
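The selector component of claims 17 and 18 sits between the generative network and the target network and chooses variants using the first loss and a probability. The claims do not say how the loss is used, so the sketch below assumes one plausible rule: a variant is eligible only if its own first-loss value is at or below a threshold, and each eligible variant is then included with a fixed probability. The function name `select_variants` and the parameters `max_loss` and `p_include` are hypothetical.

```python
import random


def select_variants(variants, losses, max_loss, p_include, rng):
    """Hypothetical selector: keep a variant if its first-loss value is
    small enough AND a random draw falls below p_include."""
    selected = []
    for variant, loss in zip(variants, losses):
        if loss <= max_loss and rng.random() < p_include:
            selected.append(variant)
    return selected


rng = random.Random(0)
variants = ["v0", "v1", "v2"]
first_losses = [0.1, 0.9, 0.2]  # per-variant first-loss values (made up)
print(select_variants(variants, first_losses, max_loss=0.5, p_include=1.0, rng=rng))
```

With `p_include=1.0` the loss threshold alone decides, so only the first and third variants pass; lowering `p_include` thins the eligible variants further at random.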

Classification Codes (CPC)

Cooperative Patent Classification codes for this invention.

G06N

Patent Metadata

Filing Date

January 23, 2017

Publication Date

February 9, 2021
