US-11276140

Method and device for digital image, audio or video data processing

Published: March 15, 2022

Assignee: not available in the USPTO data

Inventors: not available in the USPTO data

Technical Abstract

Computer implemented method for digital image data, digital video data or digital audio data enhancement, and a computer implemented method for encoding or decoding this data in particular for transmission or storage, wherein an element representing a part of said digital data comprises an indication of a position of the element in an ordered input data of a plurality of data elements, wherein a plurality of elements is transformed to a representation depending on an invertible linear mapping, wherein the invertible linear mapping maps the input of the plurality of elements to the representation, wherein the invertible linear mapping comprises at least one autoregressive convolution.

Patent Claims

20 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A computer implemented method for digital image enhancement, in which each element of a plurality of elements representing a pixel of a digital image includes an indication of a spatial dimension, the spatial dimension indicating a position of the pixel in the digital image, and the element includes an indication of a channel dimension, the channel dimension indicating a channel of the pixel in the digital image, the method comprising the following steps: transforming the plurality of elements representing pixels of the digital image to a representation depending on an invertible linear mapping, the invertible linear mapping mapping an input of the plurality of elements to the representation; modifying the representation to determine a modified representation depending on the representation; determining a plurality of elements representing pixels of an enhanced digital image depending on the modified representation; and transforming the modified representation depending on an inversion of the invertible linear mapping, wherein the invertible linear mapping includes at least one autoregressive convolution.
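The steps of claim 1 (transform, modify, invert) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: a single-channel image stored as nested lists, raster-scan ordering, a 3x3 kernel whose "future" taps are masked out, a non-zero center tap that weights the current pixel, and a placeholder `modify` step that simply adds a constant. The function names are hypothetical.

```python
# Hypothetical sketch, not the patented implementation.
# KERNEL[dr + 1][dc + 1] weights the pixel at offset (dr, dc).
# Taps that lie later in raster-scan order are zero and never read.
KERNEL = [
    [0.5, -0.25, 0.125],  # row above: all three taps precede the pixel
    [0.75, 2.0, 0.0],     # same row: left tap, center tap, masked right tap
    [0.0, 0.0, 0.0],      # row below: masked
]


def past_sum(pixels, kernel, r, c):
    """Weighted sum over pixels that come strictly before (r, c) in raster order."""
    h, w = len(pixels), len(pixels[0])
    total = 0.0
    for dr, dc in ((-1, -1), (-1, 0), (-1, 1), (0, -1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < h and 0 <= cc < w:
            total += kernel[dr + 1][dc + 1] * pixels[rr][cc]
    return total


def autoregressive_conv2d(image, kernel):
    """Forward mapping: center tap times the pixel plus the contribution of earlier pixels."""
    return [[kernel[1][1] * image[r][c] + past_sum(image, kernel, r, c)
             for c in range(len(image[0]))]
            for r in range(len(image))]


def invert_autoregressive_conv2d(rep, kernel):
    """Inverse mapping: solve pixel by pixel in raster order (forward substitution)."""
    h, w = len(rep), len(rep[0])
    image = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            # Every pixel that past_sum reads has already been solved.
            image[r][c] = (rep[r][c] - past_sum(image, kernel, r, c)) / kernel[1][1]
    return image


def enhance(image, kernel, modify=lambda v: v + 1.0):
    """Transform, modify the representation (placeholder), then invert."""
    rep = autoregressive_conv2d(image, kernel)
    modified = [[modify(v) for v in row] for row in rep]
    return invert_autoregressive_conv2d(modified, kernel)
```

With the `modify` step replaced by the identity, `invert_autoregressive_conv2d(autoregressive_conv2d(image, KERNEL), KERNEL)` returns the original image up to floating-point error, which is the invertibility property the claim relies on.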

2. The computer implemented method as recited in claim 1, wherein a plurality of digital images of a digital video are processed according to the method.

3. The computer implemented method according to claim 1, wherein a convolutional neural network for the invertible linear mapping determines the representation from the input.

4. The computer implemented method according to claim 1, wherein the representation is determined depending on a first autoregressive convolution of the input and a first convolution filter, and depending on a consecutive second autoregressive convolution of the first autoregressive convolution and a second convolution filter.
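As a rough illustration of claim 4, two consecutive autoregressive convolutions can be chained and then undone in reverse order. The sketch below assumes a one-dimensional input and causal filters whose first tap (weighting the current element) is non-zero; the function names and filter values are hypothetical.

```python
def causal_conv1d(x, w):
    """y[t] = sum_k w[k] * x[t - k]; w[0] weights the current element."""
    return [sum(w[k] * x[t - k] for k in range(len(w)) if t - k >= 0)
            for t in range(len(x))]


def invert_causal_conv1d(y, w):
    """Recover x element by element; requires w[0] != 0."""
    x = []
    for t in range(len(y)):
        # Only already-recovered elements x[0..t-1] are read here.
        past = sum(w[k] * x[t - k] for k in range(1, len(w)) if t - k >= 0)
        x.append((y[t] - past) / w[0])
    return x


first_filter = [2.0, 0.5]          # illustrative values
second_filter = [4.0, -1.0, 0.25]  # illustrative values

signal = [1.0, -2.0, 3.0, 0.5]
first = causal_conv1d(signal, first_filter)
representation = causal_conv1d(first, second_filter)

# Invert the second convolution first, then the first one.
recovered = invert_causal_conv1d(
    invert_causal_conv1d(representation, second_filter), first_filter)
```

Here `recovered` matches `signal` up to floating-point error, because each stage is individually invertible.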

5. The computer implemented method according to claim 1, wherein the autoregressive convolution imposes an order on the input such that values of the representation for a specific element depend only on elements of the input representing input that is in the imposed order before the specific element in the order.
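One way to see the effect of the imposed order is to write a causal convolution as a matrix. This is a standard linear-algebra observation rather than something stated in the claim: if each output depends only on the current and earlier inputs, the matrix is lower triangular, and its determinant is the product of the diagonal entries. The sketch assumes a one-dimensional input and a filter whose first tap weights the current element; the helper names are hypothetical.

```python
def conv_matrix(w, n):
    """Matrix M with (M x)[t] = sum_k w[k] * x[t - k] for an input of length n."""
    return [[w[t - s] if 0 <= t - s < len(w) else 0.0 for s in range(n)]
            for t in range(n)]


def is_lower_triangular(m):
    """True if no output depends on an element later in the order."""
    n = len(m)
    return all(m[t][s] == 0.0 for t in range(n) for s in range(t + 1, n))


def triangular_determinant(m):
    """Determinant of a triangular matrix: product of its diagonal."""
    det = 1.0
    for i in range(len(m)):
        det *= m[i][i]
    return det


matrix = conv_matrix([2.0, 0.5, -1.0], 4)
assert is_lower_triangular(matrix)
assert triangular_determinant(matrix) == 16.0  # 2.0 ** 4, non-zero, so invertible
```

Under these assumptions the mapping is invertible whenever the tap on the current element is non-zero.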

6. The computer implemented method according to claim 1, wherein an input of an input dimension is mapped to the representation by a plurality of consecutive autoregressive convolutions, wherein a dimension of the consecutive convolutions is equal or less than the input dimension.

7. The computer implemented method according to claim 1, further comprising the following step: determining a N-dimensional kernel for the mapping depending on concatenating a plurality of (N−1)-dimensional kernels with identical size one after another along the dimension N.

8. The computer implemented method according to claim 7, wherein determining the N-dimensional kernel includes associating the (N−1)-dimensional kernel to the N-dimensional kernel as a last dimension entry, wherein a size of the last dimension of the N-dimensional kernel defines a center value, wherein for any entries of the N-dimensional kernel in a last dimension of the N-dimensional kernel having an index smaller than the center value, arbitrary values are assigned, wherein for any entries in the last dimension having an index larger than the center value, zeros are assigned.
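The kernel construction of claims 7 and 8 might be sketched as follows. This is one possible reading, with several assumptions that are not in the claim text: the center index is taken as `size // 2`, the (N−1)-dimensional kernel is placed at that center index, kernels are nested Python lists, and the new dimension N is represented by the outermost list index. The function names are hypothetical, and the sketch does not check that all slices have identical size.

```python
def zeros_like(kernel):
    """Return a nested list of zeros with the same shape as kernel."""
    if isinstance(kernel, list):
        return [zeros_like(entry) for entry in kernel]
    return 0.0


def build_autoregressive_kernel(lower_kernel, earlier_slices, size):
    """Stack (N-1)-dimensional slices along a new dimension of length `size`.

    index <  center: caller-supplied slices (arbitrary values)
    index == center: the (N-1)-dimensional kernel itself
    index >  center: zeros
    """
    center = size // 2  # assumption: the center value is derived this way
    if len(earlier_slices) != center:
        raise ValueError("need exactly one slice per index below the center")
    slices = []
    for index in range(size):
        if index < center:
            slices.append(earlier_slices[index])
        elif index == center:
            slices.append(lower_kernel)
        else:
            slices.append(zeros_like(lower_kernel))
    return slices


# A 1-D kernel that is itself autoregressive (zero after its own center),
# extended to a 2-D kernel with three slices.
kernel_1d = [0.5, 2.0, 0.0]
kernel_2d = build_autoregressive_kernel(kernel_1d, [[0.1, 0.2, 0.3]], size=3)
# kernel_2d == [[0.1, 0.2, 0.3], [0.5, 2.0, 0.0], [0.0, 0.0, 0.0]]
```

Applying the same function again to `kernel_2d` would produce a 3-D kernel under the same assumptions, which is how the construction extends dimension by dimension.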

9. The computer implemented method according to claim 1, wherein the representation is modified for image transformation, and/or for image recognition, and/or for anomaly detection and/or for image validation.

10. The computer implemented method according to claim 1, wherein an at least partial autonomous vehicle or robot is controlled depending on the representation.

11. A computer implemented method for digital video enhancement, in which each element of a plurality of elements representing a pixel of a digital image of a digital video includes an indication of a spatial dimension, the spatial dimension indicating a position of the pixel in the digital image, and the element includes an indication of a channel dimension, the channel dimension indicating a channel of the pixel in the digital image and an indication of a time dimension, the time dimension indicating a position of the digital image in a video timeline of the digital video, the method comprising the following steps: transforming the plurality of elements representing pixels of the digital image to a representation depending on an invertible linear mapping, wherein the invertible linear mapping maps an input of the plurality of elements to the representation; modifying the representation to determine a modified representation depending on the representation; determining a plurality of elements representing pixels of an enhanced digital video depending on the modified representation; and transforming the modified representation depending on an inversion of the invertible linear mapping; wherein the invertible linear mapping includes at least one autoregressive convolution.

12. A computer implemented method for digital audio enhancement, in which each element of a plurality of elements representing a part of a digital audio sample includes an indication of a spatial dimension, the indication of the spatial dimension is a constant value, and the element includes an indication of a time dimension, the time dimension indicating a position in an audio timeline of the audio sample, the method comprising the following steps: transforming the plurality of elements representing parts of the audio sample to a representation depending on an invertible linear mapping, wherein the invertible linear mapping maps an input of the plurality of elements to the representation; modifying the representation to determine a modified representation depending on the representation; determining a plurality of elements representing parts of an enhanced digital audio sample depending on the modified representation; and transforming the modified representation depending on an inversion of the invertible linear mapping; wherein the invertible linear mapping includes at least one autoregressive convolution.

13. The computer implemented method according to claim 12, wherein the constant value is one.

14. The computer implemented method according to claim 12, wherein the digital audio sample includes audio channels, wherein each element of the plurality of elements includes an indication of a channel dimension, the channel dimension indicating an audio channel in the audio sample, and the plurality of elements including the indication of the channel dimension and representing parts of the audio sample is transformed to the representation depending on the invertible linear mapping, wherein the invertible linear mapping maps an input of the plurality of elements comprising the indication of the channel dimension to the representation, wherein the representation is modified to determine the modified representation depending on the representation, and wherein the plurality of elements comprising the indication of the channel dimension and representing parts of an enhanced digital audio sample is determined depending on the modified representation, wherein the modified representation is transformed depending on the inversion of the invertible linear mapping.

15. A computer implemented method for encoding digital audio data, in which each element of a plurality of elements representing a part of a digital audio sample includes an indication of a spatial dimension, wherein a first indication and a second indication of the spatial dimension is a constant value, wherein the element includes an indication of a time dimension, the time dimension indicating a position in an audio timeline of the audio sample, the method comprising: transforming the plurality of elements representing parts of the audio sample to a representation depending on an invertible linear mapping, wherein the invertible linear mapping maps an input of the plurality of elements to the representation; and transmitting or storing the representation; wherein the invertible linear mapping includes at least one autoregressive convolution; wherein the digital audio sample includes audio channels, wherein each element of the plurality of elements includes an indication of a channel dimension, the channel dimension indicating an audio channel in the audio sample, and the plurality of elements including the indication of the channel dimension and representing parts of the audio sample is transformed to the representation depending on the invertible linear mapping, wherein the invertible linear mapping maps an input of the plurality of elements including the indication of the channel dimension to the representation, and wherein the representation is transmitted or stored.

16. The computer implemented method as recited in claim 15, wherein the constant value is 1.

17. A device, comprising: a processor; and storage comprising instructions for a convolutional neural network; wherein the processor is configured for digital image enhancement, in which each element of a plurality of elements representing a pixel of a digital image includes an indication of a spatial dimension, the spatial dimension indicating a position of the pixel in the digital image, and the element includes an indication of a channel dimension, the channel dimension indicating a channel of the pixel in the digital image, the processor configured to: transform the plurality of elements representing pixels of the digital image to a representation depending on an invertible linear mapping, the invertible linear mapping mapping an input of the plurality of elements to the representation; modify the representation to determine a modified representation depending on the representation; determine a plurality of elements representing pixels of an enhanced digital image depending on the modified representation; and transform the modified representation depending on an inversion of the invertible linear mapping, wherein the invertible linear mapping includes at least one autoregressive convolution.

18. The device according to claim 17, further comprising an output adapted to output a result of an image transformation, an image recognition, an anomaly detection and/or an image validation.

19. The device according to claim 17, further comprising an actuator adapted to control an at least partial autonomous vehicle or robot depending on the representation, and/or depending on a result of processing the representation, and/or depending on image data determined by the inversion of the invertible linear mapping.

20. A non-transitory computer-readable medium on which is stored instructions for digital image enhancement, in which each element of a plurality of elements representing a pixel of a digital image includes an indication of a spatial dimension, the spatial dimension indicating a position of the pixel in the digital image, and the element includes an indication of a channel dimension, the channel dimension indicating a channel of the pixel in the digital image, the instructions, when executed by a computer, causing the computer to perform the following steps: transforming the plurality of elements representing pixels of the digital image to a representation depending on an invertible linear mapping, the invertible linear mapping mapping an input of the plurality of elements to the representation; modifying the representation to determine a modified representation depending on the representation; determining a plurality of elements representing pixels of an enhanced digital image depending on the modified representation; and transforming the modified representation depending on an inversion of the invertible linear mapping, wherein the invertible linear mapping includes at least one autoregressive convolution.

Classification Codes (CPC)

Cooperative Patent Classification codes for this invention.

G06N G06F G06T H04N

Patent Metadata

Filing Date

December 3, 2019

Publication Date

March 15, 2022
