US-11494614

Subsampling training data during artificial neural network training

Published: November 8, 2022

Assignee: not available in USPTO data we have

Inventors: not available in USPTO data we have

Technical Abstract

Perplexity scores are computed for training data samples during ANN training. Perplexity scores can be computed as a divergence between data defining a class associated with a current training data sample and a probability vector generated by the ANN model. Perplexity scores can alternately be computed by learning a probability density function (“PDF”) fitting activation maps generated by an ANN model during training. A perplexity score can then be computed for a current training data sample by computing a probability for the current training data sample based on the PDF. If the perplexity score for a training data sample is lower than a threshold, the training data sample is removed from the training data set so that it will not be utilized for training during subsequent epochs. Training of the ANN model continues following the removal of training data samples from the training data set.
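The two scoring approaches described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation. The function names (`softmax`, `kl_perplexity`, `gmm_log_density`, `pdf_perplexity`) are hypothetical. The sketch assumes a Kullback-Leibler divergence against a one-hot label for the first approach, and a diagonal-covariance Gaussian mixture with already-fitted parameters for the second. The abstract does not say how a probability under the PDF is turned into a score, so the negative log density used here is an assumption.

```python
import numpy as np

def softmax(logits):
    """Convert a vector of raw model outputs into a probability vector."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

def kl_perplexity(one_hot, probs, eps=1e-12):
    """KL(one_hot || probs): divergence between the class label and the model output."""
    probs = np.clip(probs, eps, 1.0)   # avoid log(0)
    mask = one_hot > 0                 # entries with zero label mass contribute nothing
    return float(np.sum(one_hot[mask] * np.log(one_hot[mask] / probs[mask])))

def gmm_log_density(x, weights, means, variances):
    """Log density of vector x under a diagonal-covariance Gaussian mixture.

    weights: shape (K,), means and variances: shape (K, D), x: shape (D,).
    The parameters are assumed to have been fitted elsewhere (for example by EM).
    """
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)
    log_exp = -0.5 * np.sum((x - means) ** 2 / variances, axis=1)
    log_terms = np.log(weights) + log_norm + log_exp
    top = np.max(log_terms)            # log-sum-exp over the K components
    return float(top + np.log(np.sum(np.exp(log_terms - top))))

def pdf_perplexity(x, weights, means, variances):
    """Assumed mapping: a low probability under the PDF gives a high score."""
    return -gmm_log_density(x, weights, means, variances)

# Usage: a confidently correct prediction scores lower than an uncertain one.
label = np.array([0.0, 1.0, 0.0])
confident = softmax(np.array([0.1, 6.0, 0.2]))
unsure = softmax(np.array([1.0, 1.1, 0.9]))
print(kl_perplexity(label, confident) < kl_perplexity(label, unsure))  # True
```

With a one-hot label, the divergence reduces to the negative log of the probability the model assigns to the true class, so samples the model already classifies confidently receive low scores and become candidates for removal.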

Patent Claims

10 claims

Legal claims defining the scope of protection, as filed with the USPTO.

3. The computer-implemented method of claim 1, further comprising prior to a start of an epoch for training the ANN model, adding training data samples previously removed from the training data set back to the training data set.

4. The computer-implemented method of claim 1, wherein the divergence comprises a Kullback-Leibler divergence.

5. The computer-implemented method of claim 1, wherein a SoftMax layer of the ANN generates the probability vector.

6. The computer-implemented method of claim 1, wherein the data defining the class associated with the current training data sample comprises a one-hot vector.

9. The computer-implemented method of claim 7, further comprising prior to a start of an epoch for training the ANN, adding training data samples previously removed from the training data set back to the training data set.

10. The computer-implemented method of claim 7, wherein the PDF comprises a Gaussian Mixture Model PDF.

14. The computing device of claim 12, wherein the at least one computer storage medium has further computer-executable instructions stored thereupon to add training data samples previously removed from the training data set back to the training data set prior to a start of an epoch for training the ANN model.

15. The computing device of claim 12, wherein the divergence comprises a Kullback-Leibler divergence.

16. The computing device of claim 12, wherein a SoftMax layer of the ANN generates the probability vector.

17. The computing device of claim 12, wherein the data defining the class associated with the current training data sample comprises a one-hot vector.
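The removal step from the abstract and the restoration step recited in claims 3, 9, and 14 could be tracked with bookkeeping along the following lines. This is only a sketch under stated assumptions: the class name `SubsampledPool` and its methods are hypothetical, samples are identified by integer index, and all removed samples are restored at once. The claims do not specify which epochs trigger restoration, so that schedule is left to the caller.

```python
import numpy as np

class SubsampledPool:
    """Tracks which training sample indices are active for the next epoch."""

    def __init__(self, num_samples):
        self.active = np.arange(num_samples)
        self.removed = np.array([], dtype=int)

    def drop_below(self, scores, threshold):
        """Remove active samples whose perplexity score is lower than threshold.

        scores[i] is the score of the sample whose index is self.active[i].
        """
        scores = np.asarray(scores)
        low = scores < threshold
        self.removed = np.concatenate([self.removed, self.active[low]])
        self.active = self.active[~low]

    def restore_all(self):
        """Add previously removed samples back, to be called before an epoch starts."""
        self.active = np.sort(np.concatenate([self.active, self.removed]))
        self.removed = np.array([], dtype=int)

# Usage: drop low-scoring samples after one epoch, restore them before a later one.
pool = SubsampledPool(5)
pool.drop_below([0.02, 1.3, 0.01, 0.9, 0.04], threshold=0.05)
print(pool.active)   # [1 3]
pool.restore_all()
print(pool.active)   # [0 1 2 3 4]
```

Keeping the removed indices rather than discarding them is what makes the restoration step possible; training between the two calls would iterate over `pool.active` only.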

Classification Codes (CPC)

Cooperative Patent Classification codes for this invention.

G06N G06F G06V

Patent Metadata

Filing Date

March 20, 2019

Publication Date

November 8, 2022
