Legal claims defining the scope of protection, as filed with the USPTO.
1. A driver monitoring system to determine activity in a sequence of successively acquired images of a scene within a vehicle, comprising: memory; and one or more processors configured to perform operations comprising: acquiring the sequence of images; forming, for each image in the sequence of images, a feature block of features extracted from the image; determining, for each image in the sequence of images, image specific information, wherein the image specific information includes a weighting indicating image importance for the image and one or more likelihoods of one or more activities, wherein the weighting is determined based on retrieving a previously determined stored feature block and image specific information of the previously determined stored feature block; storing the formed feature blocks and the determined image specific information; passing a plurality of weighted feature blocks through a predictive model to determine an activity in the sequence of images; comparing the determined activities with a most likely image activity in the image specific information; validating the determined activity based on the comparison; and controlling the vehicle according to the validated activity.
2. The system of claim 1, wherein determining image specific information comprises determining respective likelihoods of a plurality of predetermined activities occurring in the image, and the weighting for the image relates to the highest of the determined likelihoods of the plurality of predetermined activities.
3. The system of claim 2, wherein the operations further comprise: comparing the determination of the activity in the sequence of images against the most likely image activity in the image specific information of at least one image in the sequence of images; and validating the determination of the activity if no difference is found in the comparison.
4. The system of claim 3, wherein the at least one image in the sequence of images is all images in the sequence of images.
5. The system of claim 3, wherein operations further comprise triggering a further action if the comparison reveals at least one difference.
6. The system of claim 5, wherein triggering a further action comprises at least one of issuing a warning; and adjusting the determination of the activity in the sequence of images to a warning or a default value.
7. The system of claim 5, wherein triggering a further action comprises: counting how many compared images reveal a difference in the comparison to find a number of disagreeing images; and responsive to the number of disagreeing images being greater than half a number of images in the sequence, outputting a new determination of the activity in the sequence of images.
8. The system of claim 7, wherein outputting a new determination of the activity in the sequence of images comprises: when the disagreeing images all have the same most likely image activity in the image specific information, outputting the most likely image activity of the disagreeing images as the activity in the sequence of images.
9. The system of claim 2, wherein the image specific information is determined for at least one frame.
10. The system of claim 1, wherein forming a feature block of features extracted from the image and determining image specific information including a weighting for the image comprises: passing each image through a feature encoding convolutional neural network to form a feature block; and passing each feature block through an image-based module comprising at least one fully connected layer.
11. The system of claim 1, wherein passing the plurality of weighted feature blocks through a predictive model to determine an activity in a sequence of images comprises: passing a concatenation of the weighted feature blocks through a time-based model comprising a convolutional neural network.
12. The system of claim 1, wherein the operations further comprise normalizing the determined weightings to form a normalized weighting for each image in the sequence of images and passing the determined weightings through a SoftMax module.
13. A method for determining an activity in a sequence of successively acquired images of a scene within a vehicle, comprising: acquiring the sequence of images; forming, for each image in the sequence of images, a feature block of features extracted from the image; determining, for each image in the sequence of images, image specific information, wherein the image specific information includes a weighting indicating image importance for the image and one or more likelihoods of one or more activities, wherein the weighting is determined based on retrieving a previously determined stored feature block and image specific information of the previously determined stored feature block; storing the formed feature blocks and the determined image specific information; passing a plurality of weighted feature blocks through a predictive model to determine an activity in the sequence of images; comparing the determined activities with a most likely image activity in the image specific information; validating the determined activity based on the comparison; and controlling the vehicle according to the validated activity.
14. The method of claim 13, wherein determining image specific information comprises determining respective likelihoods of a plurality of predetermined activities occurring in the image, and the weighting for the image relates to the highest of the determined likelihoods of the plurality of predetermined activities.
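Illustrative sketch (not part of the claims as filed): the weighting normalization of claim 12 and the validation logic of claims 3 and 5 to 8 can be read roughly as follows. All function names, the status strings, and the "unknown" default are hypothetical and chosen only for illustration. The claims do not say what happens when a majority of images disagree with the sequence-level result but do not agree with each other, so the fallback to a default value in that case is an assumption borrowed from claim 6. The same applies to keeping the sequence-level result with a warning when only a minority of images disagree.

```python
import math


def softmax(weights):
    """Normalize raw per-image weightings so they sum to 1 (cf. claim 12)."""
    peak = max(weights)  # subtract the maximum for numerical stability
    exps = [math.exp(w - peak) for w in weights]
    total = sum(exps)
    return [e / total for e in exps]


def most_likely(likelihoods):
    """Return the activity with the highest likelihood for one image."""
    return max(likelihoods, key=likelihoods.get)


def validate_activity(sequence_activity, per_image_likelihoods, default="unknown"):
    """Compare the sequence-level activity against each image's most likely activity.

    Returns a pair (activity, status), where status is one of the
    hypothetical labels "validated", "overridden", or "warning".
    """
    image_activities = [most_likely(l) for l in per_image_likelihoods]
    disagreeing = [a for a in image_activities if a != sequence_activity]

    if not disagreeing:
        # No difference found: the determination is validated (cf. claim 3).
        return sequence_activity, "validated"

    if len(disagreeing) > len(image_activities) / 2:
        # More than half of the images disagree (cf. claim 7).
        if len(set(disagreeing)) == 1:
            # All disagreeing images share one activity: output it (cf. claim 8).
            return disagreeing[0], "overridden"
        # Assumption: the claims leave this case open; fall back to a default.
        return default, "warning"

    # Assumption: a minority disagrees; keep the result but warn (cf. claim 6).
    return sequence_activity, "warning"
```

For example, with three images whose most likely activity is "phone" in each case, a sequence-level result of "phone" is validated, while a sequence-level result of "eat" is replaced by "phone" because every image disagrees in the same way.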
March 11, 2025