US-11375204

Feature-domain residual for video coding for machines

Published: June 28, 2022

Assignee: not available in the USPTO data

Inventors: not available in the USPTO data

Technical Abstract

An apparatus includes at least one processor; and at least one non-transitory memory including computer program code; wherein the at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus at least to: decode encoded data to generate decoded data, the encoded data having a bitrate lower than that of original data, and extract features from the decoded data; decode encoded residual features to generate decoded residual features; and generate enhanced decoded features as a result of combining the decoded residual features with the features extracted from the decoded data.
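
The decoder-side flow in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `extract_features` is a hypothetical stand-in for the feature extractor (it simply takes adjacent-sample differences), the data is a short list of numbers rather than video, and the combination is an element-wise sum, which is one possible way of combining and is not stated in the abstract itself.

```python
from typing import List

Vector = List[float]


def extract_features(decoded: Vector) -> Vector:
    """Hypothetical feature extractor: differences between adjacent samples.

    A real system would more likely use a neural network; this is a stand-in.
    """
    return [b - a for a, b in zip(decoded, decoded[1:])]


def combine(features: Vector, residual: Vector) -> Vector:
    """Combine extracted features with decoded residual features.

    Element-wise summation is assumed here as the combining operation.
    """
    if len(features) != len(residual):
        raise ValueError("feature and residual shapes must match")
    return [f + r for f, r in zip(features, residual)]


# Output of the (low-bitrate) data decoder, assumed already decoded.
decoded_data = [10.0, 12.0, 15.0, 15.0]
# Output of the residual-feature decoder, assumed already decoded.
decoded_residual = [0.5, -0.25, 0.0]

features = extract_features(decoded_data)
enhanced_features = combine(features, decoded_residual)
print(enhanced_features)  # [2.5, 2.75, 0.0]
```

The enhanced features, rather than the features of the decoded data alone, are what a downstream machine task would consume in this sketch.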

Patent Claims

20 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. An apparatus comprising: at least one processor; and at least one non-transitory memory including computer program code; wherein the at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus at least to: downsample original data to a lower resolution prior to encoding the original data; encode the downsampled original data with a first codec to generate encoded data with a bitrate lower than that of the original data, and decode the encoded data to generate decoded data; encode the original data with at least one second learned codec to generate encoded residual features and decoded residual features; generate enhanced decoded features as a result of combining the decoded residual features with features extracted from the decoded data generated with the first codec; wherein the decoded data and the enhanced decoded features are configured to be processed or analyzed with at least one machine; generate enhanced decoded video resulting from combining the decoded data with the enhanced decoded features; wherein the at least one machine processes or analyzes the decoded data using the enhanced decoded video.

2. The apparatus of claim 1, wherein the at least one machine comprises at least one task neural network.

3. The apparatus of claim 1, wherein the enhanced decoded video is generated using a neural network.

4. The apparatus of claim 1, wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to: generate the enhanced decoded video resulting from combining the decoded data with the decoded residual features.

5. The apparatus of claim 4, wherein the enhanced decoded video is generated using a neural network.

6. The apparatus of claim 1, wherein the residual features are encoded using at least one neural network of the at least one second learned codec, and the encoded residual features are decoded using at least one neural network of the at least one second learned codec.

7. The apparatus of claim 1, wherein the features extracted from the decoded data generated with the first codec are extracted using a neural network.

8. The apparatus of claim 1, wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to: extract features from the original data; extract features from the decoded data; and generate the residual features, prior to being encoded, as a result of computing a difference between the features extracted from the decoded data and the features extracted from the original data.

9. An apparatus comprising: at least one processor; and at least one non-transitory memory including computer program code; wherein the at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus at least to: decode encoded data to generate decoded data, the encoded data having a bitrate lower than that of original data, and extract features from the decoded data; wherein the encoded data is generated from downsampling the original data to a lower resolution prior to encoding the original data, and encoding the downsampled original data with a codec; decode encoded residual features to generate decoded residual features; generate enhanced decoded features as a result of combining the decoded residual features with the features extracted from the decoded data; wherein the enhanced decoded features are configured to be processed or analyzed with at least one machine; generate enhanced decoded video as a result of combining the decoded data with the enhanced decoded features; and process or analyze the enhanced decoded video using at least one machine task.

10. The apparatus of claim 9, wherein the at least one machine comprises at least one task neural network.

11. The apparatus of claim 9, wherein the combining of the decoded data with the enhanced decoded features to generate the enhanced decoded video is performed using a neural network; and wherein the at least one machine task used to process or analyze the enhanced decoded video comprises at least one task neural network.

12. The apparatus of claim 9, wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to: generate the enhanced decoded video as a result of combining the decoded data with the decoded residual features; wherein the combining of the decoded data with the decoded residual features to generate the enhanced decoded video is performed using a neural network; and wherein the at least one machine task used to process or analyze the enhanced decoded video comprises at least one task neural network.

13. The apparatus of claim 9, wherein the features are extracted from the decoded data using a neural network; and wherein the encoded residual features are decoded using a neural network of a learned codec.

14. The apparatus of claim 9, wherein the combining of the decoded residual features with the features extracted from the decoded data to generate the enhanced decoded features is a summation of the decoded residual features and the features extracted from the decoded data.

15. The apparatus of claim 9, wherein the encoded residual features are a difference between features extracted from the original data, and features extracted from preliminary decoded data or the features extracted from the decoded data.

16. The apparatus of claim 9, wherein the decoded residual features are decoded using entropy decoding and dequantization.

17. The apparatus of claim 9, wherein the decoded residual features are decoded using an image of a video decoder, the decoding of the residual features comprising converting decoded feature map images to the decoded residual features.

18. The apparatus of claim 9, wherein the original data is video data.

19. A method comprising: decoding encoded data to generate decoded data, the encoded data having a bitrate lower than that of original data, and extracting features from the decoded data; wherein the encoded data is generated from downsampling the original data to a lower resolution prior to encoding the original data, and encoding the downsampled original data with a codec; decoding encoded residual features to generate decoded residual features; generating enhanced decoded features as a result of combining the decoded residual features with the features extracted from the decoded data; wherein the enhanced decoded features are configured to be processed or analyzed with at least one machine; generate enhanced decoded video as a result of combining the decoded data with the enhanced decoded features; and process or analyze the enhanced decoded video using at least one machine task.

20. The method of claim 19, wherein the at least one machine comprises at least one task neural network.
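
Several of the claims above describe a feature-domain round trip: a residual computed as a difference of features (claims 8 and 15), dequantization on the decoder side (claim 16), and a summation to form the enhanced features (claim 14). The sketch below strings those steps together on toy numbers. It is an illustration under stated assumptions, not the patented implementation: the residual is taken as original minus decoded so that summation restores the original features, the uniform quantizer and the `STEP` value are hypothetical, and entropy coding is omitted entirely.

```python
from typing import List

STEP = 0.25  # hypothetical quantization step; the claims do not specify one


def feature_residual(original: List[float], decoded: List[float]) -> List[float]:
    """Residual features, assumed here to be original minus decoded."""
    return [o - d for o, d in zip(original, decoded)]


def quantize(residual: List[float], step: float = STEP) -> List[int]:
    """Uniform quantization to integer symbols (entropy coding not shown)."""
    return [round(r / step) for r in residual]


def dequantize(symbols: List[int], step: float = STEP) -> List[float]:
    """Map integer symbols back to approximate residual values."""
    return [s * step for s in symbols]


def enhance(decoded: List[float], decoded_residual: List[float]) -> List[float]:
    """Summation of decoded-data features and decoded residual features."""
    return [f + r for f, r in zip(decoded, decoded_residual)]


# Toy feature vectors standing in for extractor outputs.
original_features = [1.5, 2.0, -0.5]
decoded_features = [1.0, 2.3, 0.0]

# Encoder side: compute and quantize the residual.
symbols = quantize(feature_residual(original_features, decoded_features))
print(symbols)  # [2, -1, -2]

# Decoder side: dequantize and sum with the decoded-data features.
enhanced_features = enhance(decoded_features, dequantize(symbols))

# Largest absolute error against the original features, before and after.
error_before = max(abs(o - d) for o, d in zip(original_features, decoded_features))
error_after = max(abs(o - e) for o, e in zip(original_features, enhanced_features))
```

With a uniform quantizer of this kind, each enhanced feature differs from the original by at most half a step, so the step size trades residual bitrate against feature fidelity.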

Classification Codes (CPC)

Cooperative Patent Classification codes for this invention.

H04N G06T

Patent Metadata

Filing Date

March 31, 2021

Publication Date

June 28, 2022
