The present invention relates to an apparatus and method for video coding using compressive measurements. The method includes receiving video data including frames, and determining at least one temporal structure based on a series of consecutive frames in the video data. The temporal structure includes a sub-block of video data from each frame in the series. The method further includes obtaining a measurement matrix, and generating a set of measurements by applying the measurement matrix to the at least one temporal structure. The measurement matrix includes an assigned pattern of pixel values and the set of measurements is coded data representing the at least one temporal structure.
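The encoding steps summarized above can be illustrated with a minimal sketch. It is not the patented implementation. It assumes a grayscale video held as nested Python lists, a temporal structure formed from co-located sub-blocks, a frame-by-frame raster scan, and a measurement matrix built by randomly permuting the columns of a Sylvester-type Walsh-Hadamard matrix and keeping a random subset of its rows. The function names (`hadamard`, `measurement_matrix`, `temporal_structure`, `vectorize`, `encode`), the `seed` parameter, and the choice to store each measurement basis as a row are all hypothetical.

```python
import random


def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n must be a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = [[1]]
    while len(h) < n:
        # Double the size: [[H, H], [H, -H]]
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def measurement_matrix(n, m, seed=0):
    """Keep m randomly chosen rows of a column-permuted Hadamard matrix.

    Assumption: this is one plausible reading of a "randomly permutated
    Walsh-Hadamard matrix"; each returned row is one measurement basis.
    """
    rng = random.Random(seed)
    h = hadamard(n)
    col_perm = list(range(n))
    rng.shuffle(col_perm)
    row_pick = rng.sample(range(n), m)
    return [[h[r][c] for c in col_perm] for r in row_pick]


def temporal_structure(frames, top, left, size):
    """Extract the size x size sub-block at the same location in every frame."""
    return [[row[left:left + size] for row in frame[top:top + size]]
            for frame in frames]


def vectorize(cube):
    """Scan the structure frame by frame, row by row, into a 1-D list."""
    return [p for block in cube for row in block for p in row]


def encode(frames, top, left, size, m, seed=0):
    """Return m compressive measurements of one temporal structure."""
    x = vectorize(temporal_structure(frames, top, left, size))
    a = measurement_matrix(len(x), m, seed)
    # Each measurement is the inner product of one basis with the pixel vector.
    return [sum(w * p for w, p in zip(basis, x)) for basis in a]


# Four synthetic 4x4 frames; a 2x2 sub-block over 4 frames gives 16 pixels.
frames = [[[f * 16 + r * 4 + c for c in range(4)] for r in range(4)]
          for f in range(4)]
measurements = encode(frames, top=1, left=1, size=2, m=4, seed=1)
```

In this sketch the vector length equals the sub-block height times its width times the number of frames, and it must be a power of two for the Sylvester construction to apply; `encode` raises `ValueError` otherwise.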
Legal claims defining the scope of protection, as filed with the USPTO.
1. A method for encoding video data by an encoder, the method including: receiving, by the encoder, the video data including frames; determining, by the encoder, at least one temporal structure based on a series of consecutive frames in the video data, the at least one temporal structure including a sub-block of video data from each frame in the series, wherein each frame in the series is divided into a plurality of blocks, wherein the sub-block of video data from each frame in the series corresponds to one of the respective plurality of blocks; obtaining, by the encoder, a measurement matrix corresponding to spatial and temporal variations of the video data within the at least one temporal structure, the measurement matrix including a set of measurement bases having pixel values of a particular type; and generating, by the encoder, a set of measurements by applying the measurement matrix to the at least one temporal structure, the set of measurements being coded data representing the at least one temporal structure.
2. The method of claim 1, wherein the determining the at least one temporal structure includes: extracting, by the encoder, the sub-block of video data from a same location in each of the frames in the series; and forming, by the encoder, the at least one temporal structure based on the extracted sub-blocks.
3. The method of claim 1, wherein the determining the at least one temporal structure includes: extracting, by the encoder, the sub-block of video data from a different location in at least one of the frames in the series; and forming, by the encoder, the at least one temporal structure based on the extracted sub-blocks, wherein the extracted sub-blocks represent a motion trajectory of an object.
4. The method of claim 1, wherein each measurement basis in the set of measurement bases has the same temporal structure as the determined at least one temporal structure of the video data.
5. The method of claim 1, wherein the measurement matrix is a randomly permutated Walsh-Hadamard matrix.
6. The method of claim 1, wherein the generating the set of measurements includes: scanning, by the encoder, pixels of the at least one temporal structure to obtain a one-dimensional (1-D) vector, the 1-D vector including pixel values of the at least one temporal structure; and multiplying, by the encoder, the 1-D vector by each column of the measurement matrix to generate the set of measurements.
7. The method of claim 6, wherein a 1-D length of the vector is based on a number of horizontal and vertical pixels in one frame and the number of consecutive frames in the series.
8. A method for decoding at least one measurement from a set of measurements into reconstructed video data, wherein the set of measurements represents video data including a series of consecutive frames by a decoder, the method including: receiving, by the decoder, the at least one measurement from the set of measurements; obtaining, by the decoder, a measurement matrix, corresponding to spatial and temporal variations of the video data within at least one temporal structure based on the series of consecutive frames in the video data, that was applied to the video data at an encoder, the measurement matrix including a set of measurement bases having pixel values of a particular type; determining, by the decoder, candidate video data, the candidate video data being based on the measurement matrix and the received at least one measurement, determining, by the decoder, discrete cosine transform (DCT) coefficients of the candidate video data by applying a DCT transform in the temporal direction, and determining, by the decoder, the reconstructed video data by performing a total variation (TV) of the coefficients on a frame-by-frame basis.
9. The method of claim 8, wherein the TV is one of anisotropic TV and isotropic TV.
10. The method of claim 8, wherein the determining the reconstructed video data further includes: calculating, by the decoder, a one-dimensional (1-D) representation of the reconstructed video data according to a minimization of the TV of the DCT coefficients.
11. The method of claim 10, wherein the determining the reconstructed video data further includes: determining, by the decoder, the frames of the reconstructed video data based on the 1-D representation of the reconstructed video data.
12. The method of claim 10, wherein the determining the reconstructed video data further includes: forming, by the decoder, the at least one temporal structure based on the 1-D representation of the reconstructed video data, the at least one temporal structure including a sub-block of video data from each of the frames in the video data; and determining, by the decoder, the reconstructed video data based on the at least one temporal structure.
13. The method of claim 8, wherein each measurement basis of the set of measurement bases has a temporal structure with pixel values of a random pattern.
14. The method of claim 8, wherein the measurement matrix is a randomly permutated Walsh-Hadamard matrix.
15. An apparatus for encoding video data, the apparatus including: an encoder configured to receive the video data including frames, the encoder configured to determine at least one temporal structure based on a series of consecutive frames in the video data, the at least one temporal structure including a sub-block of video data from each of the frames in the series, wherein each frame in the series is divided into a plurality of blocks, wherein the sub-block of video data from each frame in the series corresponds to one of the respective plurality of blocks, the encoder configured to obtain a measurement matrix corresponding to spatial and temporal variations of the video data within the at least one temporal structure, the measurement matrix including a set of measurement bases having pixel values of a particular type, and the encoder configured to generate a set of measurements by applying the measurement matrix to the at least one temporal structure, the set of measurements being coded data representing the at least one temporal structure.
16. The apparatus of claim 15, wherein the encoder extracts the sub-block of video data from a same location in each of the frames in the series and forms the at least one temporal structure based on the extracted sub-blocks.
17. The apparatus of claim 15, wherein the encoder extracts the sub-block of video data from a different location in at least one of the frames in the series and forms the at least one temporal structure based on the extracted sub-blocks, wherein the extracted sub-blocks represent a motion trajectory of an object.
18. The apparatus of claim 15, wherein each measurement basis in the set of measurement bases has the same temporal structure as the at least one temporal structure of the video data.
19. The apparatus of claim 15, wherein the measurement matrix is a randomly permutated Walsh-Hadamard matrix.
20. The apparatus of claim 15, further comprising: the encoder configured to scan pixels of the at least one temporal structure to obtain a one-dimensional (1-D) vector, the 1-D vector including pixel values of the at least one temporal structure, and the encoder configured to multiply the 1-D vector by each column of the measurement matrix to generate the set of measurements.
21. The apparatus of claim 20, wherein a 1-D length of the vector is based on a number of horizontal and vertical pixels in one frame and the number of consecutive frames in the series.
22. An apparatus for decoding at least one measurement from a set of measurements into reconstructed video data, wherein the set of measurements represents video data including a series of consecutive frames, the apparatus including: a decoder configured to receive the at least one measurement from the set of measurements, the decoder configured to obtain a measurement matrix, corresponding to spatial and temporal variations of the video data within at least one temporal structure based on the series of consecutive frames in the video data, that was applied to the video data at an encoder, the measurement matrix including a set of measurement bases having pixel values of a particular type, the decoder configured to determine candidate video data, the candidate video data being based on the measurement matrix and the received at least one measurement, the decoder configured to determine discrete cosine transform (DCT) coefficients of the candidate video data by applying a DCT transform in the temporal direction, and the decoder configured to determine the reconstructed video data by performing a total variation (TV) of the coefficients on a frame-by-frame basis.
23. The apparatus of claim 22, wherein the TV is one of anisotropic TV and isotropic TV.
24. The apparatus of claim 22, further including: the decoder configured to calculate a one-dimensional (1-D) representation of the reconstructed video data according to a minimization of the TV of the DCT coefficients.
25. The apparatus of claim 24, further including: the decoder configured to determine the frames of the reconstructed video data based on the 1-D representation of the reconstructed video data.
26. The apparatus of claim 24, further including: the decoder configured to form the at least one temporal structure based on the 1-D representation of the reconstructed video data, the at least one temporal structure including a sub-block of video data from each of the frames in the video data, and the decoder configured to determine the reconstructed video data based on the at least one temporal structure.
27. The apparatus of claim 22, wherein each measurement basis of the set of measurement bases has a temporal structure with pixel values of a random pattern.
28. The apparatus of claim 22, wherein the measurement matrix is a randomly permutated Walsh-Hadamard matrix.
September 30, 2010
January 6, 2015