Legal claims defining the scope of protection, as filed with the USPTO.
1. An activity classifier system, for classifying human activities using two-dimensional (2D) skeleton data comprising joint positions, the system comprising: a skeleton preprocessor that transforms the 2D skeleton data into transformed skeleton data, the transformed skeleton data comprising scaled, relative joint positions and joint velocities; a gesture classifier comprising a first recurrent neural network that receives the transformed skeleton data, and is trained to identify a most probable gesture of a plurality of gestures; and an action classifier comprising a second recurrent neural network that receives information from the first recurrent neural network and is trained to identify a most probable action of a plurality of actions, wherein the first recurrent neural network is trained on data comprising 2D skeleton sequences with associated gesture labels and the second recurrent neural network is trained with a pre-trained first recurrent neural network, and 2D skeleton sequences with associated action labels, and wherein the skeleton preprocessor temporally smooths the joint positions, transforms the joint positions to be relative to one of the joint positions, scales the joint positions to a height of a feature of the 2D skeleton, and computes the velocity of each joint position.
2. The activity classifier system of claim 1, wherein the action classifier further receives contextual object information comprising an object identifier and a joint identifier for any contextual objects associated with a joint, and the second recurrent neural network is further trained with contextual object and joint information.
3. The activity classifier system of claim 1, wherein the plurality of gestures comprise a set of gesture classes.
4. The activity classifier system of claim 1, wherein the plurality of actions comprise a set of action classes.
5. The activity classifier system of claim 1, wherein the first recurrent neural network comprises at least one inner product layer, at least one rectified linear unit layer, and at least one long-short term memory layer.
6. The activity classifier system of claim 5, wherein the first recurrent neural network comprises one or more pairs of inner product and rectified linear unit layers, followed by at least one long-short term memory layer, followed by zero or more pairs of inner product and rectified linear unit layers, followed by an inner product layer.
7. The activity classifier system of claim 6, wherein the information received by the second recurrent neural network from the first recurrent neural network is from a layer prior to the final inner product layer.
8. The activity classifier system of claim 1, wherein the second recurrent neural network comprises one or more pairs of inner product and rectified linear unit layers, followed by at least one long-short term memory layer, followed by zero or more pairs of inner product and rectified linear unit layers, followed by an inner product layer.
9. A method of classifying human activities using two-dimensional (2D) skeleton data comprising joint positions, the method comprising: pre-processing the joint position data by transforming the 2D skeleton data into transformed skeleton data, the transformed skeleton data comprising scaled, relative joint positions and joint velocities; classifying gestures using a first recurrent neural network that receives the transformed skeleton data, and is trained to identify a most probable gesture of a plurality of gestures; and classifying actions using a second recurrent neural network that receives information from the first recurrent neural network and is trained to identify a most probable action of a plurality of actions, wherein the first recurrent neural network is trained on data comprising 2D skeleton sequences with associated gesture labels and the second recurrent neural network is trained with a pre-trained first recurrent neural network, and 2D skeleton sequences with associated action labels, and wherein the pre-processing comprises temporally smoothing the joint positions, transforming the joint positions to be relative to one of the joint positions, scaling the joint positions to a height of a feature of the 2D skeleton, and computing the velocity of each joint position.
10. The method of claim 9, wherein classifying actions further comprises receiving contextual object information comprising an object identifier and a joint identifier for any contextual objects associated with a joint, and the second recurrent neural network is further trained with contextual object and joint information.
11. The method of claim 9, wherein the plurality of gestures comprise a set of gesture classes.
12. The method of claim 9, wherein the plurality of actions comprise a set of action classes.
13. The method of claim 9, wherein the first recurrent neural network comprises at least one inner product layer, at least one rectified linear unit layer, and at least one long-short term memory layer.
14. The method of claim 13, wherein the first recurrent neural network comprises one or more pairs of inner product and rectified linear unit layers, followed by at least one long-short term memory layer, followed by zero or more pairs of inner product and rectified linear unit layers, followed by an inner product layer.
15. The method of claim 14, wherein the information received by the second recurrent neural network from the first recurrent neural network is from a layer prior to the final inner product layer.
16. The action classifier method of claim 9, wherein the second recurrent neural network comprises one or more pairs of inner product and rectified linear unit layers, followed by at least one long-short term memory layer, followed by zero or more pairs of inner product and rectified linear unit layers, followed by an inner product layer.
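The pre-processing recited in claims 1 and 9 (temporal smoothing, re-expressing joints relative to one joint, scaling by the height of a skeleton feature, and computing joint velocities) could be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: the function name `preprocess_skeleton`, the moving-average smoother, the choice of root and height joints, and the decision to take velocities from the scaled positions are all assumptions that the claims do not specify.

```python
import numpy as np


def preprocess_skeleton(joints, root=0, top=1, bottom=2, window=3, dt=1.0):
    """Turn a (T, J, 2) sequence of 2D joints into per-frame feature vectors.

    All parameters are hypothetical: `root` is the joint the others are made
    relative to, `top`/`bottom` define the feature whose height is used for
    scaling, `window` is an odd moving-average length, `dt` the frame interval.
    """
    joints = np.asarray(joints, dtype=float)
    n_frames = joints.shape[0]

    # 1. Temporal smoothing (assumed: centered moving average, edge-padded).
    pad = window // 2
    padded = np.pad(joints, ((pad, pad), (0, 0), (0, 0)), mode="edge")
    smoothed = np.stack(
        [padded[t:t + window].mean(axis=0) for t in range(n_frames)]
    )

    # 2. Make every joint position relative to the chosen root joint.
    relative = smoothed - smoothed[:, root:root + 1, :]

    # 3. Scale by the height of a skeleton feature, measured per frame as the
    #    distance between two joints (assumed definition of "height").
    height = np.linalg.norm(smoothed[:, top] - smoothed[:, bottom], axis=-1)
    height = np.maximum(height, 1e-8)  # guard against division by zero
    scaled = relative / height[:, None, None]

    # 4. Joint velocities as finite differences of the scaled positions
    #    (assumed; the first frame gets zero velocity).
    velocity = np.zeros_like(scaled)
    velocity[1:] = (scaled[1:] - scaled[:-1]) / dt

    # One row per frame: [x, y, vx, vy] for each joint, flattened.
    return np.concatenate([scaled, velocity], axis=-1).reshape(n_frames, -1)
```

Each returned row would then be one time step of the "transformed skeleton data" fed to the first recurrent neural network. Because positions are taken relative to a root joint and divided by a skeleton height, a person translating across the frame or standing at a different distance from the camera should produce similar features.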
August 19, 2025