Legal claims defining the scope of protection, as filed with the USPTO.
1. A method for predicting the pose of an articulated object, comprising: receiving spatial information for n joints of the articulated object; passing the spatial information for the n joints to a machine learning model previously trained to receive spatial information for n+m joints as input, wherein m>=1, and wherein previously training the machine learning model includes providing training input data to the machine learning model for a quantity of joints that is greater than n, and over a series of training iterations, progressively reducing the quantity of joints; and receiving as output from the machine learning model a pose prediction for the articulated object based at least on the spatial information for the n joints, and without spatial information for the m joints.
2. The method of claim 1, wherein the articulated object is a human body.
3. The method of claim 2, wherein the n joints include a head joint of the human body, and the spatial information for the head joint details parameters for a head of the human body.
4. The method of claim 3, wherein the n joints include one or more wrist joints of the human body, and the spatial information for the one or more wrist joints details parameters for one or more corresponding hands of the human body.
5. The method of claim 2, wherein the spatial information for the n joints of the articulated object is derived from positioning data output by one or more sensors.
6. The method of claim 5, wherein the one or more sensors include one or both of a camera configured to image the one or more body parts of the human body, and a position sensor configured to be held by or worn by at least one body part of the human body.
7. The method of claim 1, wherein the articulated object is a human hand.
8. The method of claim 7, wherein the n joints include one or more finger joints of the human hand, and the spatial information for the one or more finger joints details parameters for one or more fingers or finger segments of the human hand.
9. The method of claim 1, wherein the machine learning model is previously trained with training input data having ground truth labels for the articulated object.
10. The method of claim 1, wherein previously training the machine learning model includes, for a first training iteration, providing training input data for all n+m joints to the machine learning model, and over a series of subsequent training iterations, progressively masking the training input data for one or more of the m joints.
11. The method of claim 10, wherein progressively masking the training input data for each of the m joints includes, on each training iteration, masking one or more next joints of the m joints along a kinematic tree of the articulated object toward a root of the kinematic tree.
12. The method of claim 10, wherein progressively masking the training input data for each of the m joints includes, on each training iteration, randomly selecting one or more of the m joints for masking.
13. The method of claim 10, wherein the training input data includes spatial information corresponding to a plurality of different poses of the articulated object, and wherein progressively masking the training input data for the one or more m joints includes, on each training iteration, masking a same joint of the m joints for each of the plurality of different poses.
14. The method of claim 1, wherein the machine learning model includes a normalizing flow that applies a plurality of invertible transforms to the spatial information for the n joints to output the pose prediction.
15. The method of claim 14, wherein previously training the machine learning model includes applying intermediate supervision by supplying a ground-truth pose of the articulated object to one or more intermediate invertible transforms of the plurality of invertible transforms.
16. The method of claim 1, wherein the pose prediction includes predicted spatial information for all n+m joints of the articulated object.
17. A computing system, comprising: a logic subsystem; and a storage subsystem holding instructions executable by the logic subsystem to: receive spatial information for n joints of the articulated object; pass the spatial information for the n joints to a machine learning model previously trained to receive spatial information for n+m joints as input, wherein m>=1, and wherein previously training the machine learning model includes providing training input data to the machine learning model for a quantity of joints that is greater than n, and over a series of training iterations, progressively reducing the quantity of joints; and receive as output from the machine learning model a pose prediction for the articulated object based at least on the spatial information for the n joints, and without spatial information for the m joints.
18. The computing system of claim 17, wherein the articulated object is a human body, and wherein the n joints include a head joint of the human body and one or more wrist joints of the human body.
19. The computing system of claim 17, wherein previously training the machine learning model includes, for a first training iteration, providing training input data for all n+m joints to the machine learning model, and over a series of subsequent training iterations, progressively masking the training input data for each of the m joints.
20. A method for predicting the pose of a human body, comprising: receiving spatial information for n joints of the human body, the n joints including a head joint of the human body and one or more wrist joints of the human body; passing the spatial information for the n joints to a machine learning model previously trained to receive spatial information for n+m joints as input, wherein m>=1, the machine learning model previously trained by, for a first training iteration, providing training input data for all n+m joints to the machine learning model, and over a series of subsequent training iterations, progressively masking the training input data for each of the m joints; and receiving as output from the machine learning model a pose prediction for the human body based at least on the spatial information for the n joints, and without spatial information for the m joints.
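Illustrative sketch (not part of the claims as filed): the progressive masking recited in claims 10 through 13 can be pictured as a schedule that starts with all n+m joints supplied and hides more of the m joints on each later training iteration. The Python below is a minimal sketch under stated assumptions, not the patentee's implementation. The joint names, the PARENT table, the choice of None to mark a masked joint, and all function names are hypothetical and do not come from the claims.

```python
import random

# Hypothetical kinematic tree (child -> parent); "pelvis" is the assumed root.
PARENT = {
    "pelvis": None,
    "spine": "pelvis",
    "head": "spine",
    "l_shoulder": "spine", "l_elbow": "l_shoulder", "l_wrist": "l_elbow",
    "r_shoulder": "spine", "r_elbow": "r_shoulder", "r_wrist": "r_elbow",
}
# Assumed n joints that stay available at inference time (compare claim 18).
OBSERVED = {"head", "l_wrist", "r_wrist"}


def depth(parent, joint):
    """Number of edges between a joint and the root of the tree."""
    d = 0
    while parent[joint] is not None:
        joint = parent[joint]
        d += 1
    return d


def tree_order_schedule(parent, observed):
    """Order the m maskable joints from deepest toward the root (claim 11 style)."""
    maskable = [j for j in parent if j not in observed]
    return sorted(maskable, key=lambda j: (-depth(parent, j), j))


def random_schedule(parent, observed, seed=0):
    """Order the m maskable joints randomly (claim 12 style)."""
    maskable = sorted(j for j in parent if j not in observed)
    random.Random(seed).shuffle(maskable)
    return maskable


def masked_sets(schedule, per_iteration=1):
    """Yield the cumulative set of masked joints for each training iteration.

    The first yielded set is empty: all n+m joints are supplied (claim 10).
    """
    masked = set()
    yield frozenset(masked)
    for i in range(0, len(schedule), per_iteration):
        masked.update(schedule[i:i + per_iteration])
        yield frozenset(masked)


def apply_mask(batch, masked):
    """Hide the same joints in every pose of the batch (claim 13 style).

    Masked joints are replaced by None; this representation is an assumption.
    """
    return [
        {j: (None if j in masked else xyz) for j, xyz in pose.items()}
        for pose in batch
    ]


if __name__ == "__main__":
    for step, masked in enumerate(masked_sets(tree_order_schedule(PARENT, OBSERVED))):
        print(step, sorted(masked))
```

In this sketch the final iteration leaves only the n observed joints as input, which matches the inference-time condition of claim 1, where the pose is predicted without spatial information for the m joints.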
June 24, 2025