US-11343529

Calculation of predication refinement based on optical flow

Published: May 24, 2022

Assignee: not available in USPTO data we have

Inventors: not available in USPTO data we have

Technical Abstract

A method of video processing includes determining a first motion displacement Vx(x,y) at a position (x,y) and a second motion displacement Vy(x,y) at the position (x,y) in a video block coded using an optical flow based method, wherein x and y are fractional numbers, where Vx(x,y) and Vy(x,y) are determined based at least on the position (x,y) and a center position of a basic video block of the video block, and performing a conversion between the video block and a bitstream representation of the current video block using the first motion displacement and the second motion displacement.
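As a rough illustration of the displacement calculation summarized above, the sketch below evaluates the formulas recited in claim 1, Vx(x, y)=a×(x−xc)+b×(y−yc) and Vy(x, y)=c×(x−xc)+d×(y−yc), and then applies a first-order optical-flow update P′ = P + Gx×Vx + Gy×Vy. The update rule is an assumption: the claims only state that P(x,y) is modified with the gradients and the displacements, without fixing the exact combination. The function names, the 4×4 sub-block with center (1.5, 1.5), and all numeric values are hypothetical, and floating point stands in for the 1/32-pixel fixed-point arithmetic a codec would likely use.

```python
def motion_displacement(x, y, xc, yc, a, b, c, d):
    """Per-position displacement as written in claim 1.

    (xc, yc) is the sub-block reference point (for example its center);
    a, b, c, d are the affine parameters.
    """
    dx, dy = x - xc, y - yc
    vx = a * dx + b * dy
    vy = c * dx + d * dy
    return vx, vy


def refine_sample(p, gx, gy, vx, vy):
    """Assumed first-order optical-flow update: P' = P + Gx*Vx + Gy*Vy."""
    return p + gx * vx + gy * vy


if __name__ == "__main__":
    # Hypothetical 4x4 sub-block whose center is (1.5, 1.5).
    # Parameters satisfy c = -b and d = a (the 4-parameter case of claim 5).
    a, b, c, d = 0.0625, -0.03125, 0.03125, 0.0625
    vx, vy = motion_displacement(3, 0, 1.5, 1.5, a, b, c, d)
    # Hypothetical prediction sample and gradients at that position.
    print(vx, vy, refine_sample(100, 8, -4, vx, vy))
```

At the center position itself both displacements are zero, so the refined sample equals the unrefined prediction; the correction grows with distance from (xc, yc).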

Patent Claims

20 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A method of processing video data, comprising: determining, for an affine coded video block of a video, at least one control point motion vector; determining a motion vector for a sub-block comprising a position (x, y) of the affine coded video block based on the at least one control point motion vector; determining, based on the at least one control point motion vector, a first motion displacement Vx(x,y) in a first direction and a second motion displacement Vy(x,y) in a second direction for the position (x,y), wherein Vx(x, y)=a×(x−xc)+b×(y−yc), Vy(x, y)=c×(x−xc)+d×(y−yc), wherein (xc, yc) is based on a center position or a size of the sub-block, a, b, c and d are affine parameters; determining a first gradient component Gx(x, y) in the first direction and a second gradient component Gy(x, y) in the second direction for the position (x,y); determining a refined prediction sample P′(x,y) for the position (x,y) by modifying a prediction sample P(x,y) derived for the position (x,y) with the first gradient component Gx(x, y), the second gradient component Gy(x, y), the first motion displacement Vx(x,y) and the second motion displacement Vy(x,y), wherein the prediction sample P(x,y) is derived based on the motion vector for the sub-block; and performing a conversion between the affine coded video block and a bitstream of the video using the refined prediction sample P′(x,y); wherein a precision of the first motion displacement Vx(x,y) and the second motion displacement Vy(x,y) is different from a precision of the motion vector for the sub-block, wherein the precision of the first motion displacement Vx(x,y) and the second motion displacement Vy(x,y) is 1/32 pixel precision and the precision of the motion vector for the sub-block is 1/16 pixel precision; and wherein a decoder-side motion vector refinement method and a bi-directional optical flow method are not applied to the affine coded video block.

2. The method of claim 1, wherein a color component of the affine coded video block is a luma component.

3. The method of claim 1, wherein Vx(x,y) and Vy(x,y) are determined at least based on the position (x,y) and a center position of the sub-block.

4. The method of claim 1, wherein Vx(x,y) and Vy(x,y) are determined at least based on the position (x,y) and a size of the sub-block.

5. The method of claim 1, wherein c=−b and d=a in response to the affine coded video block being coded using a 4-parameter affine mode.

6. The method of claim 1, wherein a, b, c and d may be derived from the control point motion vector, a width (W) of the affine coded video block, and a height (H) of the affine coded video block.

7. The method of claim 6, wherein a = (mv_1^h − mv_0^h)/W, b = (mv_1^v − mv_0^v)/W, c = (mv_2^h − mv_0^h)/H and d = (mv_2^v − mv_0^v)/H, wherein mv_0, mv_1, and mv_2 are the control point motion vectors, wherein a motion vector component with superscript of h indicates a motion vector component being in a first direction, wherein another motion vector component with a superscript of v indicates the another motion vector component being in the second direction, wherein the first direction is orthogonal to the second direction, wherein W indicates the width of the video block and H indicates the height of the video block.

8. The method of claim 7, wherein a, b, c and d are shifted.

9. The method of claim 1, wherein the method is used for a luma component and is not used for a chroma component.

10. The method of claim 1, wherein the method is used for an affine mode and is not used for a non-affine mode.

11. The method of claim 1, wherein the conversion comprises encoding the affine coded video block into the bitstream.

12. The method of claim 1, wherein the conversion comprises decoding the affine coded video block from the bitstream.

13. An apparatus for processing video data comprising a processor and a non-transitory memory with instructions thereon, wherein the instructions upon execution by the processor, cause the processor to: determine, for an affine coded video block of a video, at least one control point motion vector; determine a motion vector for a sub-block comprising a position (x, y) of the affine coded video block based on the at least one control point motion vector; determine, based on the at least one control point motion vector, a first motion displacement Vx(x,y) in a first direction and a second motion displacement Vy(x,y) in a second direction for the position (x,y), wherein Vx(x, y)=a×(x−xc)+b×(y−yc), Vy(x, y)=c×(x−xc)+d×(y−yc), wherein (xc, yc) is based on a center position or a size of the sub-block, and a, b, c and d are affine parameters; determine a first gradient component Gx(x, y) in the first direction and a second gradient component Gy(x, y) in the second direction for the position (x,y); determine a refined prediction sample P′(x,y) for the position (x,y) by modifying a prediction sample P(x,y) derived for the position (x,y) with the first gradient component Gx(x, y), the second gradient component Gy(x, y), the first motion displacement Vx(x,y) and the second motion displacement Vy(x,y), wherein the prediction sample P(x,y) is derived based on the motion vector for the sub-block; and perform a conversion between the affine coded video block and a bitstream of the video using the refined prediction sample P′(x,y); wherein a precision of the first motion displacement Vx(x,y) and the second motion displacement Vy(x,y) is different from a precision of the motion vector for the sub-block, wherein the precision of the first motion displacement Vx(x,y) and the second motion displacement Vy(x,y) is 1/32 pixel precision and the precision of the motion vector for the sub-block is 1/16 pixel precision; and wherein a decoder-side motion vector refinement method and a bi-directional optical flow method are not applied to the affine coded video block.

14. A non-transitory computer-readable storage medium storing instructions that cause a processor to: determine, for an affine coded video block of a video, at least one control point motion vector; determine a motion vector for a sub-block comprising a position (x, y) of the affine coded video block based on the at least one control point motion vector; determine, based on the at least one control point motion vector, a first motion displacement Vx(x,y) in a first direction and a second motion displacement Vy(x,y) in a second direction for the position (x,y), wherein Vx(x, y)=a×(x−xc)+b×(y−yc), Vy(x, y)=c×(x−xc)+d×(y−yc), wherein (xc, yc) is based on a center position or a size of the sub-block, a, b, c and d are affine parameters; determine a first gradient component Gx(x, y) in the first direction and a second gradient component Gy(x, y) in the second direction for the position (x,y); determine a refined prediction sample P′(x,y) for the position (x,y) by modifying a prediction sample P(x,y) derived for the position (x,y) with the first gradient component Gx(x, y), the second gradient component Gy(x, y), the first motion displacement Vx(x,y) and the second motion displacement Vy(x,y), wherein the prediction sample P(x,y) is derived based on the motion vector for the sub-block; and perform a conversion between the affine coded video block and a bitstream of the video using the refined prediction sample P′(x,y); wherein a precision of the first motion displacement Vx(x,y) and the second motion displacement Vy(x,y) is different from a precision of the motion vector for the sub-block, wherein the precision of the first motion displacement Vx(x,y) and the second motion displacement Vy(x,y) is 1/32 pixel precision and the precision of the motion vector for the sub-block is 1/16 pixel precision; and wherein a decoder-side motion vector refinement method and a bi-directional optical flow method are not applied to the affine coded video block.

15. A non-transitory computer-readable recording medium storing a bitstream of a video data which is generated by a method performed by a video processing apparatus, wherein the method comprises: determining, for an affine coded video block of a video, at least one control point motion vector; determining a motion vector for a sub-block comprising a position (x, y) of the affine coded video block based on the at least one control point motion vector; determining, based on the at least one control point motion vector, a first motion displacement Vx(x,y) in a first direction and a second motion displacement Vy(x,y) in a second direction for the position (x,y), wherein Vx(x, y)=a×(x−xc)+b×(y−yc), Vy(x, y)=c×(x−xc)+d×(y−yc), wherein (xc, yc) is based on a center position or a size of the sub-block, a, b, c and d are affine parameters; determining a first gradient component Gx(x, y) in the first direction and a second gradient component Gy(x, y) in the second direction for the position (x,y); determining a refined prediction sample P′(x,y) for the position (x,y) by modifying a prediction sample P(x,y) derived for the position (x,y) with the first gradient component Gx(x, y), the second gradient component Gy(x, y), the first motion displacement Vx(x,y) and the second motion displacement Vy(x,y), wherein the prediction sample P(x,y) is derived based on the motion vector for the sub-block; and generating the bitstream using the refined prediction sample P′(x,y); wherein a precision of the first motion displacement Vx(x,y) and the second motion displacement Vy(x,y) is different from a precision of the motion vector for the sub-block, wherein the precision of the first motion displacement Vx(x,y) and the second motion displacement Vy(x,y) is 1/32 pixel precision and the precision of the motion vector for the sub-block is 1/16 pixel precision; and wherein a decoder-side motion vector refinement method and a bi-directional optical flow method are not applied to the affine coded video block.

16. The apparatus of claim 13, wherein a color component of the affine coded video block is a luma component.

17. The apparatus of claim 13, wherein a, b, c and d may be derived from the control point motion vector, a width (W) of the affine coded video block, and a height (H) of the affine coded video block.

18. The apparatus of claim 17, wherein a = (mv_1^h − mv_0^h)/W, b = (mv_1^v − mv_0^v)/W, c = (mv_2^h − mv_0^h)/H and d = (mv_2^v − mv_0^v)/H, wherein mv_0, mv_1, and mv_2 are the control point motion vectors, wherein a motion vector component with superscript of h indicates a motion vector component being in a first direction, wherein another motion vector component with a superscript of v indicates the another motion vector component being in the second direction, wherein the first direction is orthogonal to the second direction, wherein W indicates the width of the video block, H indicates the height of the video block.

19. The non-transitory computer-readable storage medium of claim 14, wherein a, b, c and d may be derived from the control point motion vector, a width (W) of the affine coded video block, and a height (H) of the affine coded video block, wherein a = (mv_1^h − mv_0^h)/W, b = (mv_1^v − mv_0^v)/W, c = (mv_2^h − mv_0^h)/H and d = (mv_2^v − mv_0^v)/H, wherein mv_0, mv_1, and mv_2 are the control point motion vectors, wherein a motion vector component with superscript of h indicates a motion vector component being in a first direction, wherein another motion vector component with a superscript of v indicates the another motion vector component being in the second direction, wherein the first direction is orthogonal to the second direction, wherein W indicates the width of the video block, H indicates the height of the video block.

20. The non-transitory computer-readable recording medium of claim 15, wherein a, b, c and d may be derived from the control point motion vector, a width (W) of the affine coded video block, and a height (H) of the affine coded video block, wherein a = (mv_1^h − mv_0^h)/W, b = (mv_1^v − mv_0^v)/W, c = (mv_2^h − mv_0^h)/H and d = (mv_2^v − mv_0^v)/H, wherein mv_0, mv_1, and mv_2 are the control point motion vectors, wherein a motion vector component with superscript of h indicates a motion vector component being in a first direction, wherein another motion vector component with a superscript of v indicates the another motion vector component being in the second direction, wherein the first direction is orthogonal to the second direction, wherein W indicates the width of the video block, H indicates the height of the video block.
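Several of the claims above describe how the affine parameters a, b, c and d may be obtained from the control point motion vectors and the block width W and height H, with c=−b and d=a in the 4-parameter case. A minimal sketch of that derivation follows, reading the claim wording literally. The function name `affine_params`, the (horizontal, vertical) tuple layout for each motion vector, and the sample values are assumptions for illustration only; the shifting of a, b, c and d mentioned in the claims is not modeled, and plain floating-point division is used instead of fixed-point arithmetic.

```python
def affine_params(mv0, mv1, width, mv2=None, height=None):
    """Derive (a, b, c, d) from control point motion vectors.

    Each mv is a (horizontal, vertical) tuple (hypothetical layout).
    With only mv0 and mv1, the 4-parameter relation c = -b, d = a is used.
    With mv2 and height as well, c and d come from mv2 and mv0.
    """
    a = (mv1[0] - mv0[0]) / width
    b = (mv1[1] - mv0[1]) / width
    if mv2 is None:
        # 4-parameter affine mode: c = -b and d = a.
        return a, b, -b, a
    if height is None:
        raise ValueError("height is required when mv2 is given")
    c = (mv2[0] - mv0[0]) / height
    d = (mv2[1] - mv0[1]) / height
    return a, b, c, d


if __name__ == "__main__":
    # Hypothetical 16x8 block; values chosen so the divisions are exact.
    print(affine_params((0, 0), (4, 2), 16))
    print(affine_params((1, 1), (5, 3), 16, mv2=(3, 9), height=8))
```

Keeping the 4-parameter and 6-parameter cases in one function makes the dependency explicit: the third control point is needed only when c and d are not tied to a and b.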

Classification Codes (CPC)

Cooperative Patent Classification codes for this invention.

G06T H04N

Patent Metadata

Filing Date

August 30, 2021

Publication Date

May 24, 2022
