Training Diverse and Robust Ensembles of Artificial Intelligence Computer Models

Published: October 10, 2023

Assignee: not available in the USPTO data we have

Inventors: Ian Michael Molloy, Taesung Lee, Benjamin James Edwards

Patent Claims

15 claims

Legal claims defining the scope of protection, as filed with the USPTO.

2. The method of claim 1, wherein the similarity measure between the gradients of the loss surfaces is determined based on at least one of a cosine similarity or a Lp norm similarity.

3. The method of claim 1, wherein modifying the loss surface of one of the first AI model or the second AI model comprises adding a first regularizer term, having a first regularizer strength value, to a loss function of one of the first AI model or the second AI model, or increasing a second regularizer strength value of a second regularizer term in the loss function of one of the first AI model or the second AI model, and thereby control the similarity of the first gradient and second gradient of the first AI model and the second AI model.

4. The method of claim 3, wherein the first regularizer strength value or second regularizer strength value is set to a value that maximizes a distance between the loss surfaces of the first AI model and second AI model while minimizing accuracy loss of the at least two AI models in outputs generated by the at least two AI models.

5. The method of claim 1, wherein the at least two AI models comprises more than two AI models, and wherein the modifying comprises performing multiple pairwise comparisons of pairs of AI models in the at least two AI models and modifying loss surfaces of one or more of the AI models in each pair based on results of the comparisons, wherein the first AI model and the second AI model is one of the pairs of AI models.

6. The method of claim 5, wherein the pairwise comparisons of pairs of AI models are performed based on at least one of a clique architecture, a star architecture, or a ring architecture.

8. The method of claim 7, wherein combining the outputs of the AI models in the modified ensemble of AI models to generate a single output result for the modified ensemble comprises at least one of averaging the outputs of the AI models in the modified ensemble of AI models or performing a majority vote operation on the outputs of the AI models in the modified ensemble of AI models.

9. The method of claim 1, wherein the co-training and modifying operations are performed for each mini-batch of training data used to train the at least two AI models.
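As an illustration of the method claims above, the following Python sketch shows one way the similarity measures named in claim 2 and the pairwise comparison arrangements named in claim 6 might be computed. It is not taken from the patent. The function names (cosine_similarity, lp_distance, model_pairs), the choice of model 0 as the hub of the star, and the reading of "Lp norm similarity" as the Lp norm of the difference between two gradients (smaller meaning more similar) are all assumptions made for this example.

```python
import math
from itertools import combinations


def cosine_similarity(g1, g2):
    """Cosine of the angle between two gradient vectors (1.0 = same direction)."""
    dot = sum(a * b for a, b in zip(g1, g2))
    n1 = math.sqrt(sum(a * a for a in g1))
    n2 = math.sqrt(sum(b * b for b in g2))
    if n1 == 0.0 or n2 == 0.0:
        return 0.0  # assumption: a zero gradient shares no direction with anything
    return dot / (n1 * n2)


def lp_distance(g1, g2, p=2):
    """Lp norm of the difference between two gradient vectors (0.0 = identical)."""
    return sum(abs(a - b) ** p for a, b in zip(g1, g2)) ** (1.0 / p)


def model_pairs(n_models, topology):
    """Index pairs of models to compare under a clique, star, or ring arrangement."""
    if n_models < 2:
        raise ValueError("need at least two models to form a pair")
    if topology == "clique":  # every model is compared with every other model
        return list(combinations(range(n_models), 2))
    if topology == "star":  # model 0 (hypothetical hub) is compared with each other model
        return [(0, i) for i in range(1, n_models)]
    if topology == "ring":  # each model is compared with its next neighbour, wrapping around
        if n_models == 2:
            return [(0, 1)]  # avoid listing the same pair twice
        return [(i, (i + 1) % n_models) for i in range(n_models)]
    raise ValueError(f"unknown topology: {topology!r}")


# Example: four models compared under each arrangement.
for name in ("clique", "star", "ring"):
    print(name, model_pairs(4, name))
```

With n models, the clique arrangement yields n(n-1)/2 comparisons, while the star yields n-1 and the ring at most n, so the choice trades coverage of model pairs against the number of comparisons per training step.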

11. The computer program product of claim 10, wherein the similarity measure between the gradients of the loss surfaces is determined based on at least one of a cosine similarity or a Lp norm similarity.

12. The computer program product of claim 10, wherein the computer readable program further causes the hardened ensemble AI model generator to modify the loss surface of one of the first AI model or the second AI model at least by adding a first regularizer term, having a first regularizer strength value, to a loss function of one of the first AI model or the second AI model, or increasing a second regularizer strength value of a second regularizer term in the loss function of one of the first AI model or the second AI model, and thereby control the similarity of the first gradient and second gradient of the first AI model and the second AI model.

13. The computer program product of claim 12, wherein the first regularizer strength value or second regularizer strength value is set to a value that maximizes a distance between the loss surfaces of the first AI model and second AI model while minimizing accuracy loss of the at least two AI models in outputs generated by the at least two AI models.

14. The computer program product of claim 10, wherein the at least two AI models comprises more than two AI models, and wherein the modifying comprises performing multiple pairwise comparisons of pairs of AI models in the at least two AI models and modifying loss surfaces of one or more of the AI models in each pair based on results of the comparisons, wherein the first AI model and the second AI model is one of the pairs of AI models.

15. The computer program product of claim 14, wherein the pairwise comparisons of pairs of AI models are performed based on at least one of a clique architecture, a star architecture, or a ring architecture.

17. The computer program product of claim 16, wherein combining the outputs of the AI models in the modified ensemble of AI models to generate a single output result for the modified ensemble comprises at least one of averaging the outputs of the AI models in the modified ensemble of AI models or performing a majority vote operation on the outputs of the AI models in the modified ensemble of AI models.
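Claims 8 and 17 name two ways of combining the outputs of the ensemble members into a single result: averaging and majority voting. The short Python sketch below illustrates both under simple assumptions: each model's output is either a list of per-class scores (for averaging) or a single predicted label (for voting). The function names average_outputs and majority_vote, and the tie-breaking rule, are hypothetical choices for this example rather than anything stated in the claims.

```python
from collections import Counter


def average_outputs(outputs):
    """Element-wise mean of per-model score vectors (e.g. class probabilities)."""
    n = len(outputs)
    return [sum(column) / n for column in zip(*outputs)]


def majority_vote(labels):
    """Most common predicted label; on a tie, the label encountered first wins."""
    return Counter(labels).most_common(1)[0][0]


# Three hypothetical models scoring the same input over two classes.
scores = [[0.7, 0.3], [0.4, 0.6], [0.6, 0.4]]
mean_scores = average_outputs(scores)
predicted_by_average = mean_scores.index(max(mean_scores))

# The same three models reduced to their individual label predictions.
labels = [row.index(max(row)) for row in scores]
predicted_by_vote = majority_vote(labels)

print(predicted_by_average, predicted_by_vote)
```

Averaging keeps each model's confidence in play, whereas voting only counts each model's top choice, so the two can disagree when one model is very confident and the others are narrowly split.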

19. The method of claim 3, wherein the first regularizer term or second regularizer term comprises λ cos(∇L(x,y; M_1), ∇L(x,y; M_2)) where λ is a regularization parameter having a value corresponding to the first regularizer strength value or the second regularizer strength value, ∇L(x,y; M_1) is the first gradient of the first AI model M_1, ∇L(x,y; M_2) is the second gradient of the second AI model M_2, and cos(∇L(x,y; M_1), ∇L(x,y; M_2)) is a cosine distance between the first gradient and the second gradient.

20. The computer program product of claim 12, wherein the first regularizer term or second regularizer term comprises λ cos(∇L(x,y; M_1), ∇L(x,y; M_2)) where λ is a regularization parameter having a value corresponding to the first regularizer strength value or the second regularizer strength value, ∇L(x,y; M_1) is the first gradient of the first AI model M_1, ∇L(x,y; M_2) is the second gradient of the second AI model M_2, and cos(∇L(x,y; M_1), ∇L(x,y; M_2)) is a cosine distance between the first gradient and the second gradient.
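To make the regularizer term in claims 19 and 20 concrete, the Python sketch below evaluates λ cos(∇L(x,y; M_1), ∇L(x,y; M_2)) for two toy models and adds it to their task losses, averaged over one mini-batch as in claim 9. Everything beyond the formula is an assumption for illustration: the models are hypothetical logistic-regression classifiers given as (weights, bias), the loss is binary cross-entropy, the gradient is taken with respect to the input x, and cos is computed as the ordinary cosine of the angle between the two gradients. The sketch only evaluates the objective; training against it would also require differentiating the penalty itself, which is not shown.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(model, x):
    """Probability of class 1 for a toy linear model given as (weights, bias)."""
    w, b = model
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def loss(model, x, y):
    """Binary cross-entropy L(x, y; M) for a single example."""
    p = min(max(predict(model, x), 1e-12), 1.0 - 1e-12)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def input_gradient(model, x, y):
    """Gradient of the loss with respect to the input x, which is (p - y) * w here."""
    w, _ = model
    p = predict(model, x)
    return [(p - y) * wi for wi in w]


def cosine(g1, g2):
    """Cosine of the angle between two gradients; 0.0 if either gradient is zero."""
    dot = sum(a * b for a, b in zip(g1, g2))
    norm = math.sqrt(sum(a * a for a in g1)) * math.sqrt(sum(b * b for b in g2))
    return dot / norm if norm > 0.0 else 0.0


def regularized_loss(m1, m2, batch, lam):
    """Mini-batch mean of L(M_1) + L(M_2) + lam * cos(grad_1, grad_2)."""
    total = 0.0
    for x, y in batch:
        task = loss(m1, x, y) + loss(m2, x, y)
        penalty = lam * cosine(input_gradient(m1, x, y), input_gradient(m2, x, y))
        total += task + penalty
    return total / len(batch)


# Hypothetical mini-batch of (input, label) pairs and three toy models.
batch = [([0.5, -1.0], 1), ([1.5, 0.2], 0)]
model_a = ([1.0, 0.0], 0.0)
model_b = ([1.0, 0.0], 0.0)  # same weights as model_a: gradients fully aligned
model_c = ([0.0, 1.0], 0.0)  # orthogonal weights: gradients share no direction

print(regularized_loss(model_a, model_b, batch, lam=0.5))
print(regularized_loss(model_a, model_c, batch, lam=0.5))
```

In this sketch a pair of models with aligned gradients pays the full penalty λ, while a pair with orthogonal gradients pays none, so minimizing the combined objective pushes the two models' gradients apart; a larger λ pushes harder at the possible expense of task accuracy.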

Patent Metadata

Filing Date: Unknown

Publication Date: October 10, 2023

Inventors: Ian Michael Molloy, Taesung Lee, Benjamin James Edwards
