US-10332242

Method and system for reconstructing 360-degree video

Published: June 25, 2019

Assignee: not available in USPTO data we have

Inventors: not available in USPTO data we have

Technical Abstract

A method and system for reconstructing 360-degree video are disclosed. A video sequence V1 including a plurality of frames associated with spherical content at a first frame rate and a video sequence V2 including a plurality of frames associated with a predefined viewport at a second frame rate are received by a processor. The first frame rate is lower than the second frame rate. An interpolated video sequence V1′ of the video sequence V1 is generated by creating a plurality of intermediate frames between a set of consecutive frames of the plurality of frames of the sequence V1, corresponding to the second frame rate of the video sequence V2. A pixel based blending of each intermediate frame of the plurality of intermediate frames of sequence V1′ with a corresponding frame of the plurality of frames of the sequence V2 is performed to generate a fused video sequence Vm for displaying.

Patent Claims

16 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A computer-implemented method, comprising: receiving, by a processor, a video sequence V1 comprising a plurality of frames associated with spherical content at a first frame rate and a video sequence V2 comprising a plurality of frames associated with a predefined viewport at a second frame rate, wherein the first frame rate is lower than the second frame rate; generating, by the processor, an interpolated video sequence V1′ of the video sequence V1, generating the interpolated video sequence V1′ comprising creating a plurality of intermediate frames between each set of consecutive frames R1, R2 of the plurality of frames of the video sequence V1 corresponding to the second frame rate of video sequence V2; and performing, by the processor, a pixel based blending of each intermediate frame of the plurality of the intermediate frames of the interpolated video sequence V1′ with a corresponding frame of the plurality of frames of the video sequence V2 to generate a fused video sequence Vm for displaying.
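The three steps of claim 1 (receive, interpolate, blend) can be illustrated with a minimal sketch. Several things here are assumptions that the claim text does not state: frames are small grayscale grids stored as nested lists, V2 is already registered to the same pixel grid as V1′, the frame-rate ratio is an integer `factor`, the intermediate frames are produced by simple linear temporal weighting, and the blend is a per-pixel weighted sum controlled by a hypothetical `alpha` map. The function names are illustrative only.

```python
def interpolate_sequence(v1, factor):
    """Upsample v1 by an integer factor (assumed) with linear temporal weights."""
    out = []
    for r1, r2 in zip(v1, v1[1:]):
        for k in range(factor):
            t = k / factor  # t = 0 reproduces R1; t approaches 1 towards R2
            out.append([[(1 - t) * a + t * b for a, b in zip(row1, row2)]
                        for row1, row2 in zip(r1, r2)])
    out.append([row[:] for row in v1[-1]])  # keep the final source frame
    return out


def blend(frame_v1i, frame_v2, alpha):
    """Per-pixel weighted sum (assumed form): alpha = 1 keeps V2, 0 keeps V1'."""
    return [[alpha[y][x] * frame_v2[y][x] + (1 - alpha[y][x]) * frame_v1i[y][x]
             for x in range(len(frame_v2[0]))]
            for y in range(len(frame_v2))]


def fuse(v1, v2, alpha, factor):
    """Interpolate V1 up to V2's rate, then blend frame by frame into Vm."""
    v1_interp = interpolate_sequence(v1, factor)
    return [blend(a, b, alpha) for a, b in zip(v1_interp, v2)]
```

For example, two V1 frames with `factor=2` yield three V1′ frames, each of which is blended with the temporally matching V2 frame.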

2. The method as claimed in claim 1, further comprising: performing, by the processor, a sphere rotation of the video sequence V1 to achieve a default view orientation.
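Claim 2 does not say how the sphere rotation is carried out. As a hedged illustration only, assume the spherical frames are stored in an equirectangular layout and that the rotation is a pure yaw (rotation about the vertical axis). Under those assumptions the rotation reduces to a circular horizontal shift of the columns. The function name and the sign convention are hypothetical; pitch and roll would need resampling and are not covered.

```python
def rotate_yaw(frame, yaw_degrees):
    """Rotate an equirectangular frame (assumed layout) about the vertical axis.

    A positive yaw moves content to the right; this sign choice is an assumption.
    """
    width = len(frame[0])
    shift = round(yaw_degrees / 360.0 * width) % width
    if shift == 0:
        return [row[:] for row in frame]
    # Wrap the last `shift` columns around to the front of each row.
    return [row[-shift:] + row[:-shift] for row in frame]
```

Applying the same yaw to every frame of V1 would bring the sequence to a chosen default orientation before interpolation.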

5. The method as claimed in claim 4, wherein a value of alpha(x, y) is determined based on a distance of a location of the pixel (x, y) to be reconstructed from a view center (x0, y0), the view center (x0, y0) being a center of a video frame.

6. The method as claimed in claim 5, wherein a value of alpha(x, y) is set to 1.0 for a location of the pixel (x, y) lying within a predetermined distance of a pixel at a location (|x|<⅛, |y|<⅛) from the view center (x0, y0).

7. The method as claimed in claim 6, wherein a value of alpha(x, y) is set to 0.0 for a location of the pixel (x, y) lying outside a predetermined distance of a pixel at a location (|x|<⅜, |y|<⅜) from the view center (x0, y0).

8. The method as claimed in claim 6, wherein a value of alpha(x, y) is set between 1.0 to 0.0 for a location of the pixel (x, y) lying within a predetermined distance of a pixel at a location (⅜>|x|>⅛, ⅜>|y|>⅛) from the view center (x0, y0).
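The alpha rule of claims 5 to 8 can be sketched as a piecewise function of the offset from the view center. The claims fix the value at 1.0 inside the ⅛ box and 0.0 outside the ⅜ box, but do not say how the value varies in between or how the offsets are normalized. The sketch below assumes offsets are expressed as fractions of the frame width and height, combines them with a maximum, and uses a linear ramp in the transition band; all three choices, and the function names, are assumptions.

```python
INNER = 1 / 8  # alpha is 1.0 inside this normalized offset (claim 6)
OUTER = 3 / 8  # alpha is 0.0 outside this normalized offset (claim 7)


def alpha_at(x, y, x0, y0, width, height):
    """Blend weight for pixel (x, y) relative to view center (x0, y0)."""
    nx = abs(x - x0) / width    # assumed normalization by frame width
    ny = abs(y - y0) / height   # assumed normalization by frame height
    d = max(nx, ny)             # assumed way of combining the two offsets
    if d < INNER:
        return 1.0
    if d > OUTER:
        return 0.0
    # Assumed linear ramp from 1.0 at INNER down to 0.0 at OUTER (claim 8).
    return (OUTER - d) / (OUTER - INNER)


def alpha_map(width, height):
    """Alpha for every pixel, taking the frame center as the view center."""
    x0, y0 = (width - 1) / 2, (height - 1) / 2
    return [[alpha_at(x, y, x0, y0, width, height) for x in range(width)]
            for y in range(height)]
```

With this ramp the weight is continuous across both box boundaries, so the viewport content fades into the interpolated background without a visible seam.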

9. The method as claimed in claim 1, wherein the interpolated video sequence V1′ is a motion predicted video sequence, and generating an intermediate frame between two consecutive frames R1 and R2 of the video sequence V1, the intermediate frame being a temporally co-located frame of a frame P in the video sequence V2, comprises: determining at least one first motion vector M1 between the frame P and a frame P1 and at least one second motion vector M2 between the frame P and a frame P2, wherein the frame P1 is a frame in the video sequence V2 that is temporally co-located frame of the frame R1 in the video sequence V1, and wherein the frame P2 is a frame in the video sequence V2 that is temporally co-located frame of frame R2 in the video sequence V1; selecting at least one motion vector M, the at least one motion vector M being selected from the at least one first motion vector M1, or the at least one second motion vector M2, based on a cost function associated with the at least one first motion vector M1 and a cost function associated with the at least one second motion vector M2; selecting a reference frame for generating the intermediate frame, the reference frame being one of the frame R1 and the frame R2 based on the selected at least one motion vector M; and generating the intermediate frame based on the reference frame and the selected at least one motion vector M.

10. The method as claimed in claim 9, wherein each of the at least one first motion vector M1 is determined for a macroblock of a plurality of macroblocks of the frame P based on a motion estimation between the macroblock in the frame P and a corresponding macroblock in the frame P1, wherein each of the at least one second motion vector M2 is determined for the macroblock of a plurality of macroblocks of the frame P based on a motion estimation between the macroblock in the frame P and a corresponding macroblock in the frame P2.

11. The method as claimed in claim 9, further comprising, performing for each macroblock: selecting the reference frame as the frame R1 if a cost function associated with a first motion vector M1 for the macroblock is less than a cost function associated with a second motion vector M2 for the macroblock; selecting the reference frame as the frame R2 if the cost function associated with the second motion vector M2 for the macroblock is less than the cost function associated with the first motion vector M1 for the macroblock; and selecting one of the first motion vector M1 and the second motion vector M2 as a motion vector M for the macroblock, which has the smallest cost function.

12. The method as claimed in claim 11, further comprising: determining a motion predicted macroblock for the intermediate frame using the motion vector M and the frame R1, if frame R1 is the reference frame; determining a motion predicted macroblock for the intermediate frame using the motion vector M and the frame R2, if frame R2 is the reference frame; and wherein intermediate frame of the interpolated video sequence V1′ is generated comprising a plurality of motion predicted macroblocks.
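The per-macroblock selection in claims 9 to 12 can be sketched as follows. The claims do not name a cost function or a search strategy, so this sketch assumes a sum of absolute differences (SAD) cost and an exhaustive search over a small window. It also assumes that P, P1, P2, R1 and R2 are grayscale grids of the same size on a shared pixel grid, and that R1 wins ties. All function names and default parameters are hypothetical.

```python
def sad(frame_a, ay, ax, frame_b, by, bx, n):
    """Sum of absolute differences between two n x n blocks (assumed cost)."""
    return sum(abs(frame_a[ay + j][ax + i] - frame_b[by + j][bx + i])
               for j in range(n) for i in range(n))


def estimate(p, ref, y, x, n, search):
    """Full-search block matching: best (dy, dx) for the block of p at (y, x)."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ry, rx = y + dy, x + dx
            if 0 <= ry <= h - n and 0 <= rx <= w - n:  # stay inside the frame
                cost = sad(p, y, x, ref, ry, rx, n)
                if best is None or cost < best[1]:
                    best = ((dy, dx), cost)
    return best


def predict_block(p, p1, p2, r1, r2, y, x, n=2, search=1):
    """Pick M1 or M2 by lower cost, then copy the displaced block from R1 or R2."""
    m1, c1 = estimate(p, p1, y, x, n, search)  # motion of P against P1
    m2, c2 = estimate(p, p2, y, x, n, search)  # motion of P against P2
    if c1 <= c2:  # ties go to R1 (assumption; the claims leave this open)
        (dy, dx), ref, name = m1, r1, "R1"
    else:
        (dy, dx), ref, name = m2, r2, "R2"
    block = [row[x + dx:x + dx + n] for row in ref[y + dy:y + dy + n]]
    return name, (dy, dx), block
```

Running `predict_block` over every macroblock position of frame P and tiling the returned blocks would assemble one motion predicted intermediate frame of V1′.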

14. The method as claimed in claim 13, wherein a value of alpha is determined based on a distance of a location of the macroblock (bx, by) to be reconstructed from a view center (bx0, by0), the view center (bx0, by0) being a center of a video frame.

15. A system, comprising: a communication interface configured to receive a video sequence V1 comprising a plurality of frames associated with spherical content at a first frame rate and a video sequence V2 comprising a plurality of frames associated with a predefined viewport at a second frame rate, wherein the first frame rate is lower than the second frame rate; a frame interpolator configured to generate an interpolated video sequence V1′ of the video sequence V1, generating the interpolated video sequence V1′ comprising creating a plurality of intermediate frames between each set of consecutive frames R1, R2 of the plurality of frames of the video sequence V1 corresponding to the second frame rate of video sequence V2; a memory comprising executable instructions; and a processor communicably coupled to the communication interface and the frame interpolator, the processor configured to execute the instructions to cause the system to perform a pixel based blending of each intermediate frame of the plurality of the intermediate frames of the interpolated video sequence V1′ with a corresponding frame of the plurality of frames of the video sequence V2 to generate a fused video sequence Vm for displaying.

16. The system as claimed in claim 15, wherein the system is further caused to: perform a sphere rotation of the sequence V1 to achieve a default view orientation.

17. The system as claimed in claim 15, wherein the frame interpolator further comprises: a motion estimation module configured to perform a motion estimation between a set of frames in the video sequence V2, wherein the set of frames are selected based on matching temporal location from a corresponding set of consecutive frames of the video sequence V1; and a motion compensation module configured to perform a motion compensation between the set of selected frames in the video sequence V2 to generate the interpolated video sequence V1′, wherein the processor is configured to perform the pixel based blending of an intermediate frame of the interpolated video sequence V1′ with a corresponding frame of the plurality of frames of the video sequence V2 to generate the fused video sequence Vm, and wherein performing the pixel based blending comprises performing a macroblock based blending.

19. A computer-implemented method, comprising: receiving, by a processor, a video sequence V1 comprising a plurality of frames associated with spherical content at a first frame rate and a video sequence V2 comprising a plurality of frames associated with a predefined viewport at a second frame rate, wherein the first frame rate is lower than the second frame rate; performing, by the processor, a sphere rotation of the video sequence V1 to achieve a default view orientation; generating, by the processor, an interpolated video sequence V1′ of the rotated video sequence V1 by creating a plurality of intermediate frames, wherein creating the plurality of intermediate frames comprises performing one of: selecting a set of consecutive frames of the plurality of frames of the video sequence V1 corresponding to the second frame rate of video sequence V2 for performing a temporal fusion; and selecting a set of frames in the video sequence V2 based on matching temporal location from a corresponding set of consecutive frames of the video sequence V1 to perform a motion estimation and a motion compensation between the set of selected frames in the video sequence V2; and performing, by the processor, a pixel based blending of an intermediate frame of the plurality of the intermediate frames of sequence V1′ with a corresponding frame of the plurality of frames of the sequence V2 to generate a fused video sequence Vm for displaying.

20. The method as claimed in claim 19, wherein performing the pixel based blending further comprises performing, by the processor, a macroblock based blending of an intermediate frame of the interpolated video sequence V1′ with a corresponding frame of the plurality of frames of the video sequence V2 to generate the fused video sequence Vm for displaying.

Classification Codes (CPC)

Cooperative Patent Classification codes for this invention.

G06T H04N

Patent Metadata

Filing Date

January 18, 2018

Publication Date

June 25, 2019
