Legal claims defining the scope of protection, as filed with the USPTO.
1. A method for generating simulated training data for a physical process, the method comprising: receiving, as input to at least one machine learning model, a first simulated image of a first object, wherein the at least one machine learning model includes mappings between simulated images generated from models of physical objects and real-world images of the physical objects; performing, by the at least one machine learning model, one or more operations on the first simulated image to generate a first augmented image of the first object; and transmitting the first augmented image as training data to a training pipeline that trains an additional machine learning model to control a behavior of the physical process.
2. The method of claim 1, wherein receiving the first simulated image of the first object comprises generating the first simulated image from a computer aided design (CAD) model of the first object.
3. The method of claim 1, further comprising: generating simulated training data that comprises the simulated images and real-world training data that comprises the real-world images; and inputting the simulated training data and the real-world training data as unpaired training data for training the at least one machine learning model.
4. The method of claim 1, further comprising: generating labels associated with the first simulated image; and transmitting the labels and the first augmented image as training data to the training pipeline.
5. The method of claim 4, wherein the labels comprise a type of the first object, a graspable point on the first object, a position of the first object in the first augmented image, and an orientation of the first object in the first augmented image.
6. The method of claim 1, wherein the additional machine learning model comprises an artificial neural network.
7. The method of claim 1, wherein the at least one machine learning model comprises a generator neural network that produces augmented images from simulated images.
8. The method of claim 7, wherein the at least one machine learning model further comprises a discriminator neural network that categorizes augmented images produced by the generator network as simulated or real.
9. The method of claim 1, wherein the one or more operations performed by the at least one machine learning model comprise at least one of: performing one or more shading operations on the first simulated image; performing one or more lighting operations on the first simulated image; and performing one or more operations that add noise to the first simulated image.
10. The method of claim 1, wherein the physical process comprises a robot performing a grasping task.
11. One or more non-transitory computer-readable media storing instructions that, when executed by one or more processors, cause the one or more processors to perform the steps of: receiving, as input to at least one machine learning model, a first simulated image of a first object, wherein the at least one machine learning model includes mappings between simulated images generated from models of physical objects and real-world images of the physical objects; performing, by the at least one machine learning model, one or more operations on the first simulated image to generate a first augmented image of the first object; and transmitting the first augmented image as training data to a training pipeline that trains an additional machine learning model to control a behavior of the physical process.
12. The one or more non-transitory computer-readable media of claim 11, wherein the method further comprises: generating simulated training data that comprises the simulated images and real-world training data that comprises the real-world images; and inputting the simulated training data and the real-world training data as unpaired training data for training the at least one machine learning model.
13. The one or more non-transitory computer-readable media of claim 11, wherein the method further comprises: generating labels associated with the first simulated image; and transmitting the labels and the first augmented image as training data to the training pipeline.
14. The one or more non-transitory computer-readable media of claim 11, wherein the first simulated image and the first augmented image comprise at least one of: a two-dimensional (2D) representation of the first object; and one or more three-dimensional (3D) locations associated with the first object.
15. The one or more non-transitory computer-readable media of claim 11, wherein the method further comprises: performing, by the at least one machine learning model, the one or more operations on a second simulated image of a second object to generate a second augmented image of the second object; and transmitting the second augmented image to the training pipeline.
16. The one or more non-transitory computer-readable media of claim 11, wherein the at least one machine learning model comprises: a generator neural network that produces augmented images from simulated images; and a discriminator neural network that categorizes augmented images produced by the generator network as simulated or real.
17. The one or more non-transitory computer-readable media of claim 11, wherein the additional machine learning model comprises an artificial neural network.
18. The one or more non-transitory computer-readable media of claim 11, wherein the one or more operations performed by the at least one machine learning model comprise at least one of: performing one or more shading operations on the first simulated image; performing one or more lighting operations on the first simulated image; and performing one or more operations that add noise to the first simulated image.
19. The one or more non-transitory computer-readable media of claim 11, wherein the physical process comprises a robot performing a grasping task.
20. A system, comprising: a memory that stores instructions, and a processor that is coupled to the memory and, when executing the instructions, is configured to: receive, as input to at least one machine learning model, a first simulated image of a first object, wherein the at least one machine learning model includes mappings between simulated images generated from models of physical objects and real-world images of the physical objects; perform, by the at least one machine learning model, one or more operations on the first simulated image to generate a first augmented image of the first object; and transmit the first augmented image as training data to a training pipeline that trains an additional machine learning model to control a behavior of the physical process.
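For illustration only, the data flow recited in claims 1, 4, 5 and 9 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the claims recite a machine learning model that performs the operations, whereas the sketch substitutes fixed, hand-written shading, lighting and noise functions for that model. The image format (nested lists of grayscale values), the Labels fields' types, the parameter defaults, and the names shade, light, add_noise, augment and transmit are all hypothetical, and the training pipeline is represented by a plain list.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical image format: rows of grayscale pixel values in [0, 1].
Image = List[List[float]]


@dataclass
class Labels:
    # Fields follow claim 5; the concrete types are assumptions.
    object_type: str
    graspable_point: Tuple[int, int]
    position: Tuple[int, int]
    orientation_deg: float


def clamp(value: float) -> float:
    # Keep a pixel value inside the valid [0, 1] range.
    return max(0.0, min(1.0, value))


def shade(img: Image, strength: float = 0.3) -> Image:
    # Crude shading operation: darken pixels progressively from left to right.
    width = len(img[0])
    span = max(width - 1, 1)
    return [[clamp(p * (1.0 - strength * x / span)) for x, p in enumerate(row)]
            for row in img]


def light(img: Image, gain: float = 1.1, bias: float = 0.05) -> Image:
    # Crude lighting operation: global gain and offset.
    return [[clamp(p * gain + bias) for p in row] for row in img]


def add_noise(img: Image, rng: random.Random, sigma: float = 0.02) -> Image:
    # Noise operation: add zero-mean Gaussian noise to every pixel.
    return [[clamp(p + rng.gauss(0.0, sigma)) for p in row] for row in img]


def augment(simulated: Image, rng: random.Random) -> Image:
    # Stand-in for the learned model of claim 1: simulated image in,
    # augmented image out.
    return add_noise(light(shade(simulated)), rng)


def transmit(pipeline: list, augmented: Image, labels: Labels) -> None:
    # Claim 4: send the augmented image together with its labels.
    pipeline.append((augmented, labels))


rng = random.Random(0)                      # seeded for repeatability
simulated = [[0.5] * 4 for _ in range(3)]   # placeholder for a rendered CAD view
labels = Labels("bracket", (1, 2), (0, 0), 90.0)
training_pipeline: list = []
transmit(training_pipeline, augment(simulated, rng), labels)
```

The labels are generated for the simulated image and passed through unchanged, which mirrors claim 4: because the augmentation alters appearance rather than geometry in this sketch, the position and orientation recorded for the simulated image are reused for the augmented one.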
March 15, 2022