US-11240478

Efficient multi-view coding using depth-map estimate for a dependent view

Published: February 1, 2022

Assignee: not available in USPTO data we have

Inventors: not available in USPTO data we have

Technical Abstract

The usual coding order, according to which the reference view is coded prior to the dependent view and, within each view, a depth map is coded subsequent to the respective picture, may be maintained and does not have to lead to a sacrifice of efficiency in performing inter-view redundancy removal by, for example, predicting motion data of the current picture of the dependent view from motion data of the current picture of the reference view. Rather, a depth map estimate of the current picture of the dependent view is obtained by warping the depth map of the current picture of the reference view into the dependent view, thereby enabling various methods of inter-view redundancy reduction to be performed more efficiently by bridging the gap between the views. According to another aspect, the following discovery is exploited: the overhead associated with an enlarged list of motion predictor candidates for a block of a picture of a dependent view is low compared to the gain in motion vector prediction quality that results from adding a motion vector candidate determined from a block of a reference view that is co-located in a disparity-compensated sense.
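The warping step described in the abstract can be illustrated with a minimal sketch. It is not the patented method itself: it assumes rectified cameras (purely horizontal disparity), a hypothetical linear depth-to-disparity mapping with a made-up `scale`, a convention that larger depth values are closer to the camera, and a simple background-biased hole fill. The function names are illustrative only.

```python
from typing import List, Optional


def depth_to_disparity(depth_value: int, scale: float = 0.125) -> int:
    """Hypothetical linear mapping from a depth sample to a disparity in samples."""
    return int(round(scale * depth_value))


def warp_depth_row(ref_row: List[int], scale: float = 0.125) -> List[int]:
    """Warp one row of the reference-view depth map into the dependent view."""
    width = len(ref_row)
    warped: List[Optional[int]] = [None] * width

    # Forward warping: shift each reference sample by its own disparity.
    # Assumption: a reference sample at x lands at x - disparity.
    for x, depth in enumerate(ref_row):
        target = x - depth_to_disparity(depth, scale)
        if 0 <= target < width:
            # Assumption: a larger depth value means closer to the camera,
            # so the foreground sample wins when two samples collide.
            if warped[target] is None or depth > warped[target]:
                warped[target] = depth

    # Hole filling: use the more distant (smaller) of the nearest valid
    # neighbours, since disocclusions usually reveal background.
    out: List[int] = []
    for x, value in enumerate(warped):
        if value is not None:
            out.append(value)
            continue
        left = next((warped[i] for i in range(x - 1, -1, -1)
                     if warped[i] is not None), None)
        right = next((warped[i] for i in range(x + 1, width)
                      if warped[i] is not None), None)
        neighbours = [v for v in (left, right) if v is not None]
        out.append(min(neighbours) if neighbours else 0)
    return out


def warp_depth_map(ref_depth: List[List[int]], scale: float = 0.125) -> List[List[int]]:
    """Row-wise warp of a whole depth map (rows are independent here)."""
    return [warp_depth_row(row, scale) for row in ref_depth]
```

For example, with the assumed scale of 0.125, a foreground object with depth value 16 at columns 2 and 3 of a six-sample row moves two samples to the left, and the uncovered positions are filled with background depth.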

Patent Claims

18 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A decoder for decoding a multi-view signal transmitted via a data stream, comprising: a depth estimator configured for obtaining, using a processor, a depth map of a first view; and a dependent view reconstructor configured for: processing, using the processor, a flag that signals whether first motion data associated with the first view is derived using second motion data associated with a second view; responsive to the flag signaling that the first motion data is to be derived using the second motion data associated with the second view: estimating, using the processor and based on the depth map, a disparity with respect to the first view, identifying, using the processor, for a first picture coding block in the first view, a second picture coding block in the second view based on the disparity, obtaining, using the processor, the second motion data associated with the second picture coding block in the second view, predicting, using the processor, the first motion data associated with the first picture coding block in the first view based on the second motion data and the disparity, the predicting including deriving a first reference picture index associated with the first motion data by modifying a second reference picture index associated with the second motion data such that a first picture order count of a first reference picture is equal to a second picture order count of a second reference picture, adding, using the processor, the first motion data as a candidate in a set of motion data candidates for the first picture coding block in the first view, extracting, using the processor and from the data stream, index information specifying a motion data candidate of the set of motion data candidates for the first picture coding block in the first view, and reconstructing, using the processor, the first picture coding block of the first view by prediction based on the motion data candidate specified by the index information.

2. The decoder of claim 1, wherein the dependent view reconstructor is further configured for: extracting motion data residual from the data stream; generating refined motion data for the first picture coding block based on the motion data candidate and the motion data residual; and reconstructing the first picture coding block of the first view by prediction based on the refined motion data.

3. The decoder of claim 1, wherein the depth estimator is configured for obtaining the depth map of the first view by warping another depth map associated with the second view into the depth map of the first view.

4. The decoder of claim 3, wherein the depth estimator is configured for warping by: obtaining a second disparity associated with a second picture of the second view; and applying the second disparity to a reference depth map of the second view to derive the depth map of the first view.

5. The decoder of claim 1, wherein the dependent view reconstructor is further configured for: identifying additional second picture coding blocks in the second view; obtaining additional second motion data associated with the additional second picture coding blocks; and estimating the first motion data for the first picture coding block in the first view based on both the second motion data and the additional second motion data.

6. The decoder of claim 1, wherein the dependent view reconstructor configured for extracting from the data stream, using the processor, a sub-block syntax element representing a sub-block flag that indicates whether the picture coding block in the first view is to be decoded in units of sub-blocks of the picture coding block, wherein the index information specifies the motion data candidate of the set of motion data candidates for a one of the sub-blocks of the picture coding block in the first view, and the one of the sub-blocks of the picture coding block is reconstructed by prediction based on the motion data candidate specified by the index information for the one of the sub-blocks.

7. A method for decoding a multi-view signal transmitted via a data stream, comprising: obtaining a depth map of a first view; processing a flag that signals whether first motion data associated with the first view is derived using second motion data associated with a second view; responsive to the flag signaling that the first motion data is to be derived using the second motion data associated with the second view: estimating, based on the depth map, a disparity with respect to the first view, identifying, for a first picture coding block in the first view, a second picture coding block in the second view based on the disparity, obtaining the second motion data associated with the second picture coding block in the second view, predicting the first motion data associated with the first picture coding block in the first view based on the second motion data and the disparity, the predicting including deriving a first reference picture index associated with the first motion data by modifying a second reference picture index associated with the second motion data such that a first picture order count of a first reference picture is equal to a second picture order count of a second reference picture, adding the first motion data as a candidate in a set of motion data candidates for the first picture coding block in the first view, extracting, from the data stream, index information specifying a motion data candidate of the set of motion data candidates for the first picture coding block in the first view, and reconstructing the first picture coding block of the first view by prediction based on the motion data candidate specified by the index information.

8. The method of claim 7, further comprising: extracting motion data residual from the data stream; generating refined motion data for the first picture coding block based on the motion data candidate and the motion data residual; and reconstructing the first picture coding block of the first view by prediction based on the refined motion data.

9. The method of claim 7, wherein the step of obtaining the depth map of the first view comprises warping another depth map associated with the second view into the depth map of the first view.

10. The method of claim 9, wherein the step of warping comprises: obtaining a second disparity associated with a second picture of the second view; and applying the second disparity to a reference depth map of the second view to derive the depth map of the first view.

11. The method of claim 7, wherein the step of predicting the dependent motion data comprises: identifying additional second picture coding blocks in the second view; obtaining additional second motion data associated with the additional second picture coding blocks; and estimating the first motion data for the first picture coding block in the first view based on both the second motion data and the additional second motion data.

12. The method of claim 7, further comprising extracting from the data stream, using the processor, a sub-block syntax element representing a sub-block flag that indicates whether the picture coding block in the first view is to be decoded in units of sub-blocks of the picture coding block, wherein the index information specifies the motion data candidate of the set of motion data candidates for a one of the sub-blocks of the picture coding block in the first view, and the one of the sub-blocks of the picture coding block is reconstructed by prediction based on the motion data candidate specified by the index information for the one of the sub-blocks.

13. An encoder for encoding a multi-view signal into a data stream, comprising: a depth estimator configured for obtaining, using a processor, a depth map of a first view; and a dependent view encoder configured for responsive to a flag signaling that first motion data is to be derived using second motion data associated with a second view: estimating, using the processor and based on the depth map, a disparity with respect to the first view, identifying, using the processor, for a first picture coding block in the first view, a second picture coding block in the second view based on the disparity, obtaining, using the processor, the second motion data associated with the second picture coding block in the second view, predicting, using the processor, the first motion data associated with the first picture coding block in the first view based on the second motion data and the disparity, the predicting including deriving a first reference picture index associated with the first motion data by modifying a second reference picture index associated with the second motion data such that a first picture order count of a first reference picture is equal to a second picture order count of a second reference picture, adding, using the processor, the first motion data as a candidate in a set of motion data candidates for the first picture coding block in the first view, and inserting, using the processor into the data stream, the flag and index information specifying a motion data candidate of the set of motion data candidates for the first picture coding block in the first view, wherein the first picture coding block of the first view is reconstructed using prediction based on the motion data candidate specified by the index information.

14. The encoder of claim 13, wherein the dependent view encoder is further configured for: determining motion data residual based on a difference between the motion data candidate and the first motion data associated with the picture coding block in the first view; and inserting the motion data residual, without the motion data candidate, into the data stream.

15. The encoder of claim 13, wherein the depth estimator is configured for obtaining the depth map of the first view by warping another depth map associated with the second view into the depth map of the first view.

16. The encoder of claim 15, wherein the depth estimator is configured for warping by: obtaining a second disparity associated with a second picture of the second view; and applying the second disparity to a reference depth map of the second view to derive the depth map of the first view.

17. The encoder of claim 13, wherein the step of predicting the first motion data includes: identifying additional second picture coding blocks in the second view; obtaining additional second motion data associated with the additional second picture coding blocks; and estimating the first motion data for the first picture coding block in the first view based on both the second motion data and the additional second motion data.

18. The encoder of claim 13, wherein the dependent view encoder is configured for inserting into the data stream, using the processor, a sub-block syntax element representing a sub-block flag that indicates whether the picture coding block in the first view is to be coded in units of sub-blocks of the picture coding block, wherein the index information specifies the motion data candidate of the set of motion data candidates for a one of the sub-blocks of the picture coding block in the first view, and the one of the sub-blocks of the picture coding block is reconstructed by prediction based on the motion data candidate specified by the index information for the one of the sub-blocks.
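The decoder-side flow recited in claims 1 and 2 (depth to disparity, locating the block in the second view, remapping the reference picture index so that picture order counts match, and refining the selected candidate with a residual) might be sketched as follows. This is an illustrative reading and not the claimed implementation: the `MotionData` type, the fixed block grid, the use of the maximum depth in the block as its representative value, the sign of the disparity shift, and the linear `scale` are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class MotionData:
    mv: Tuple[int, int]  # motion vector (x, y) in samples
    ref_idx: int         # index into the reference picture list of its own view


def depth_to_disparity(depth_value: int, scale: float = 0.125) -> int:
    """Hypothetical linear mapping from a depth sample to a disparity in samples."""
    return int(round(scale * depth_value))


def map_ref_idx_by_poc(ref_idx: int, second_ref_pocs: List[int],
                       first_ref_pocs: List[int]) -> Optional[int]:
    """Find the first-view reference index whose picture order count (POC)
    equals the POC referenced by the second view's motion data."""
    poc = second_ref_pocs[ref_idx]
    return first_ref_pocs.index(poc) if poc in first_ref_pocs else None


def inter_view_candidate(x: int, y: int, size: int,
                         depth_map: List[List[int]],
                         second_motion: Dict[Tuple[int, int], MotionData],
                         second_ref_pocs: List[int],
                         first_ref_pocs: List[int]) -> Optional[MotionData]:
    """Derive a motion candidate for the first-view block at (x, y)."""
    # Representative depth for the block: the maximum sample (assumption).
    depths = [depth_map[j][i]
              for j in range(y, y + size) for i in range(x, x + size)]
    disparity = depth_to_disparity(max(depths))

    # Shift the block centre by the disparity and snap to the block grid
    # to find the corresponding block in the second view.
    cx, cy = x + size // 2 + disparity, y + size // 2
    key = (cx // size * size, cy // size * size)
    second = second_motion.get(key)
    if second is None:  # no motion data there (e.g. outside the picture)
        return None

    ref_idx = map_ref_idx_by_poc(second.ref_idx, second_ref_pocs, first_ref_pocs)
    if ref_idx is None:
        return None
    return MotionData(second.mv, ref_idx)


def select_motion(candidates: List[MotionData], index: int,
                  residual: Tuple[int, int] = (0, 0)) -> MotionData:
    """Pick the signalled candidate and add an optional motion vector residual."""
    chosen = candidates[index]
    return MotionData((chosen.mv[0] + residual[0], chosen.mv[1] + residual[1]),
                      chosen.ref_idx)
```

A small usage example under the same assumptions: a constant depth of 32 gives a disparity of 4, so the 4x4 block at (0, 0) maps to the second-view block at (4, 0), and reference index 1 (POC 4) in the second view becomes index 0 in the first view.

```python
depth = [[32] * 8 for _ in range(8)]
second_motion = {(4, 0): MotionData((3, -1), 1)}
cand = inter_view_candidate(0, 0, 4, depth, second_motion, [8, 4], [4, 8])
motion = select_motion([cand, MotionData((0, 0), 0)], index=0, residual=(1, 1))
```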

Classification Codes (CPC)

Cooperative Patent Classification codes for this invention.

H04N

Patent Metadata

Filing Date: May 11, 2020

Publication Date: February 1, 2022
