Ways to mitigate loss in inter-operability scenarios for digital video are presented. For example, a bitstream modification tool (such as a bitstream rewriter running on a network node of a videoconferencing system) receives an incoming bitstream of encoded video (e.g., from an encoder that uses a first loss recovery strategy). The bitstream modification tool processes the incoming bitstream of encoded video to produce an outgoing bitstream of encoded video. In doing so, the bitstream modification tool changes at least one syntax element between the incoming bitstream and the outgoing bitstream so as to mitigate picture loss effects during decoding of the outgoing bitstream under lossy delivery conditions. The bitstream modification tool outputs the outgoing bitstream. In this way, the bitstream modification tool can help avoid blank screens, frozen screens, or other failures during decoding under lossy delivery conditions (e.g., with a decoder that uses a different loss recovery strategy).
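The claims below include a variant in which the tool switches between two modes depending on receiver feedback (claims 19 and 23 to 27). The following is a minimal Python sketch of that switching logic only. The names `Mode`, `RewriterState`, and the returned action labels are hypothetical and do not come from the patent. The sketch decides how an incoming IDR picture would be treated; it does not parse or rewrite an actual bitstream.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Mode(Enum):
    # "First mode": the receiver has not yet acknowledged any IDR picture.
    AWAITING_IDR_ACK = auto()
    # "Second mode": the receiver has acknowledged receipt of an IDR picture.
    IDR_ACKED = auto()


@dataclass
class RewriterState:
    """Hypothetical mode tracker for a bitstream modification tool."""

    mode: Mode = Mode.AWAITING_IDR_ACK

    def on_idr_ack(self) -> None:
        # Feedback message from the receiver node: switch first -> second mode.
        self.mode = Mode.IDR_ACKED

    def on_sps_change(self) -> None:
        # Changed sequence parameter set data: fall back to the first mode.
        self.mode = Mode.AWAITING_IDR_ACK

    def on_decoder_restart(self) -> None:
        # Detected decoder restart: fall back to the first mode.
        self.mode = Mode.AWAITING_IDR_ACK

    def idr_action(self) -> str:
        """Return a label for how an incoming IDR picture would be handled."""
        if self.mode is Mode.AWAITING_IDR_ACK:
            # First mode: keep the IDR picture, mark it as a long-term
            # reference, and raise the maximum reference frame count.
            return "keep_idr_mark_ltr"
        # Second mode: convert the IDR picture to an ordinary intra picture.
        return "convert_idr_to_i"


state = RewriterState()
print(state.idr_action())   # first mode
state.on_idr_ack()
print(state.idr_action())   # second mode
state.on_sps_change()
print(state.idr_action())   # back in first mode
```

Keeping the mode in a small state object separates the feedback handling from the rewriting itself, so the per-picture code only needs to ask which action applies.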
Legal claims defining the scope of protection, as filed with the USPTO.
1. One or more computer-readable media storing computer-executable instructions for causing a node of a system configured for videoconferencing or other real-time communication, when programmed thereby, to perform bitstream modification operations comprising: receiving, at the node of the system, an incoming bitstream of encoded video organized according to a given codec standard or format, the incoming bitstream having multiple temporal layers; at the node of the system, processing the incoming bitstream to produce an outgoing bitstream of encoded video organized according to the given codec standard or format, the outgoing bitstream having a single temporal layer, wherein the processing includes converting the multiple temporal layers into the single temporal layer, and wherein the converting includes changing at least one syntax element between the incoming bitstream and the outgoing bitstream so as to mitigate picture loss effects during decoding of the outgoing bitstream under lossy delivery conditions; and outputting, from the node of the system, the outgoing bitstream.
2. The one or more computer-readable media of claim 1, wherein the node of the system is a network node, wherein the incoming bitstream is received from a transmitter node of the system, and wherein the outgoing bitstream is transmitted to a receiver node of the system.
3. The one or more computer-readable media of claim 1, wherein the changing the at least one syntax element uses bitstream rewriting of the at least one syntax element between the incoming bitstream and the outgoing bitstream without modification of other syntax elements between the incoming bitstream and the outgoing bitstream.
4. The one or more computer-readable media of claim 1, wherein the changing the at least one syntax element uses transcoding between the incoming bitstream and the outgoing bitstream by at least in part decoding the incoming bitstream and at least in part re-encoding results of the decoding.
5. The one or more computer-readable media of claim 1, wherein quality of video content is unchanged between the incoming bitstream and the outgoing bitstream in terms of temporal resolution, spatial resolution, and signal-to-noise ratio resolution.
6. The one or more computer-readable media of claim 1, wherein the changing comprises, for a picture in a temporal enhancement layer of the multiple temporal layers that is not already marked as a reference picture: marking the picture as a reference picture; and adding a syntax structure that includes information about reference picture handling.
7. The one or more computer-readable media of claim 1, wherein the changing comprises, for a picture in a temporal base layer of the multiple temporal layers that has a reference picture used for motion compensation, selectively adjusting reference picture management information to account for marking of new pictures, from at least one temporal enhancement layer of the multiple temporal layers, as reference pictures.
8. The one or more computer-readable media of claim 7, wherein the selectively adjusting the reference picture management information comprises, evaluating whether the reference picture for the picture in the temporal base layer is a long-term reference picture and, if not, adjusting the reference picture management information.
9. The one or more computer-readable media of claim 7, wherein the selectively adjusting the reference picture management information comprises, in a syntax structure that includes information about reference picture list modification for the picture in the temporal base layer, setting the previous picture in the temporal base layer to be an initial reference picture.
10. The one or more computer-readable media of claim 1, wherein the changing comprises adjusting frame number for any non-instantaneous decoding reference picture in the incoming bitstream.
11. A method comprising: receiving an incoming bitstream of encoded video organized according to a given codec standard or format; processing the incoming bitstream to produce an outgoing bitstream of encoded video organized according to the given codec standard or format, including changing at least one syntax element between the incoming bitstream and the outgoing bitstream so as to mitigate picture loss effects during decoding of the outgoing bitstream under lossy delivery conditions, without changing quality of video content between the incoming bitstream and the outgoing bitstream in terms of temporal resolution, spatial resolution, and signal-to-noise ratio resolution; and outputting the outgoing bitstream.
12. A node of a system configured for videoconferencing or other real-time communication, the node comprising: a buffer configured to receive an incoming bitstream of encoded video organized according to a given codec standard or format; a bitstream modification tool configured to process the incoming bitstream to produce an outgoing bitstream of encoded video organized according to the given codec standard or format, by changing at least one syntax element between the incoming bitstream and the outgoing bitstream so as to mitigate picture loss effects during decoding of the outgoing bitstream under lossy delivery conditions, wherein the changing the at least one syntax element uses bitstream rewriting of the at least one syntax element between the incoming bitstream and the outgoing bitstream without modification of other syntax elements between the incoming bitstream and the outgoing bitstream; and a buffer configured to store the outgoing bitstream.
13. The node of claim 12, wherein, to process the incoming bitstream to produce the outgoing bitstream, the bitstream modification tool is configured to switch between multiple modes, the multiple modes including: a first mode that is used before a receiver acknowledges receipt of an instantaneous decoding refresh (“IDR”) picture in the outgoing bitstream; and a second mode that is used after the receiver has acknowledged receipt of an IDR picture in the outgoing bitstream.
14. The node of claim 12, wherein, to process the incoming bitstream to produce the outgoing bitstream, the bitstream modification tool is configured to convert multiple temporal layers of the incoming bitstream into a single temporal layer of the outgoing bitstream.
15. The method of claim 11, wherein the changing the at least one syntax element uses bitstream rewriting of the at least one syntax element between the incoming bitstream and the outgoing bitstream without modification of other syntax elements between the incoming bitstream and the outgoing bitstream.
16. The method of claim 11, wherein the changing the at least one syntax element uses transcoding between the incoming bitstream and the outgoing bitstream by at least in part decoding the incoming bitstream and at least in part re-encoding results of the decoding.
17. The method of claim 11, wherein the processing includes switching between multiple modes, the multiple modes including: a first mode that is used before a receiver node acknowledges receipt of an instantaneous decoding refresh (“IDR”) picture in the outgoing bitstream; and a second mode that is used after the receiver node has acknowledged receipt of an IDR picture in the outgoing bitstream.
18. The method of claim 11, wherein the incoming bitstream has multiple temporal layers and the outgoing bitstream has a single temporal layer, and wherein the processing comprises converting the multiple temporal layers into the single temporal layer.
19. One or more computer-readable media storing computer-executable instructions for causing a node of a system configured for videoconferencing or other real-time communication, when programmed thereby, to perform bitstream modification operations comprising: receiving, at the node of the system, an incoming bitstream of encoded video organized according to a given codec standard or format; at the node of the system, processing the incoming bitstream to produce an outgoing bitstream of encoded video organized according to the given codec standard or format, wherein the processing includes switching between multiple modes, and wherein the switching includes changing at least one syntax element between the incoming bitstream and the outgoing bitstream so as to mitigate picture loss effects during decoding of the outgoing bitstream under lossy delivery conditions, the multiple modes including: a first mode that is used before a receiver node acknowledges receipt of an instantaneous decoding refresh (“IDR”) picture in the outgoing bitstream; and a second mode that is used after the receiver node has acknowledged receipt of an IDR picture in the outgoing bitstream; and outputting, from the node of the system, the outgoing bitstream.
20. The one or more computer-readable media of claim 19, wherein the changing the at least one syntax element uses bitstream rewriting of the at least one syntax element between the incoming bitstream and the outgoing bitstream without modification of other syntax elements between the incoming bitstream and the outgoing bitstream.
21. The one or more computer-readable media of claim 19, wherein the changing the at least one syntax element uses transcoding between the incoming bitstream and the outgoing bitstream by at least in part decoding the incoming bitstream and at least in part re-encoding results of the decoding.
22. The one or more computer-readable media of claim 19, wherein quality of video content is unchanged between the incoming bitstream and the outgoing bitstream in terms of temporal resolution, spatial resolution, and signal-to-noise ratio resolution.
23. The one or more computer-readable media of claim 19, wherein, in the first mode, the changing includes, for an IDR picture in the incoming bitstream that is not already marked as a long-term reference (“LTR”) picture: marking the IDR picture as a LTR picture; and incrementing a value of a syntax element that indicates a maximum count of reference frames in a decoded picture buffer.
24. The one or more computer-readable media of claim 19, wherein, in the second mode, the changing includes, for an IDR picture in the incoming bitstream: converting the IDR picture to an intra (I) picture by changing NAL unit type; removing at least one syntax element from each slice header of the I picture; and modifying a syntax structure that includes information about reference picture handling.
25. The one or more computer-readable media of claim 19, wherein the bitstream modification operations further comprise: receiving a feedback message from the receiver node; and in response to the feedback message, as part of the processing, switching from the first mode to the second mode.
26. The one or more computer-readable media of claim 19, wherein the bitstream modification operations further comprise: detecting a change in sequence parameter set data in the incoming bitstream; and in response to the detection of the change in sequence parameter set data, as part of the processing, switching from the second mode to the first mode.
27. The one or more computer-readable media of claim 19, wherein the bitstream modification operations further comprise: detecting a decoder restart; and in response to the detection of the decoder restart, as part of the processing, switching from the second mode to the first mode.
28. A node of a system configured for videoconferencing or other real-time communication, the node comprising: a buffer configured to receive an incoming bitstream of encoded video organized according to a given codec standard or format; a bitstream modification tool configured to process the incoming bitstream to produce an outgoing bitstream of encoded video organized according to the given codec standard or format, by changing at least one syntax element between the incoming bitstream and the outgoing bitstream so as to mitigate picture loss effects during decoding of the outgoing bitstream under lossy delivery conditions, wherein the changing the at least one syntax element uses transcoding between the incoming bitstream and the outgoing bitstream by at least in part decoding the incoming bitstream and at least in part re-encoding results of the decoding; and a buffer configured to store the outgoing bitstream.
29. The node of claim 28, wherein, to process the incoming bitstream to produce the outgoing bitstream, the bitstream modification tool is configured to switch between multiple modes, the multiple modes including: a first mode that is used before a receiver acknowledges receipt of an instantaneous decoding refresh (“IDR”) picture in the outgoing bitstream; and a second mode that is used after the receiver has acknowledged receipt of an IDR picture in the outgoing bitstream.
30. The node of claim 28, wherein, to process the incoming bitstream to produce the outgoing bitstream, the bitstream modification tool is configured to convert multiple temporal layers of the incoming bitstream into a single temporal layer of the outgoing bitstream.
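The conversion of multiple temporal layers into a single temporal layer (claims 1, 6, and 10) can be illustrated with a rough Python sketch. It assumes H.264-style frame numbering, in which the frame number restarts at 0 on an IDR picture and advances by one (modulo a maximum) after each reference picture; the claims themselves do not name a specific codec. The `Picture` record, the function name, and the `max_frame_num` default are hypothetical. The sketch works on simplified per-picture metadata rather than real slice headers, and it omits the added reference picture handling syntax structure and the base-layer reference list adjustments of claims 7 to 9.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Picture:
    """Simplified, hypothetical per-picture metadata (not real slice headers)."""

    is_idr: bool
    temporal_id: int      # 0 = temporal base layer, >0 = enhancement layer
    is_reference: bool
    frame_num: int


def flatten_temporal_layers(
    pictures: List[Picture], max_frame_num: int = 16
) -> List[Picture]:
    """Return copies of the pictures as a single temporal layer.

    Every picture is marked as a reference picture, so the frame number
    must advance by one per picture; it restarts at 0 on each IDR picture.
    """
    out = []
    next_frame_num = 0
    for pic in pictures:
        if pic.is_idr:
            next_frame_num = 0
        out.append(
            replace(
                pic,
                temporal_id=0,
                is_reference=True,
                frame_num=next_frame_num,
            )
        )
        next_frame_num = (next_frame_num + 1) % max_frame_num
    return out


# Two-layer input: base-layer reference pictures alternate with
# non-reference enhancement-layer pictures.
incoming = [
    Picture(is_idr=True, temporal_id=0, is_reference=True, frame_num=0),
    Picture(is_idr=False, temporal_id=1, is_reference=False, frame_num=1),
    Picture(is_idr=False, temporal_id=0, is_reference=True, frame_num=1),
    Picture(is_idr=False, temporal_id=1, is_reference=False, frame_num=2),
    Picture(is_idr=False, temporal_id=0, is_reference=True, frame_num=2),
]
outgoing = flatten_temporal_layers(incoming)
print([p.frame_num for p in outgoing])
```

In this example only the reference marking, the layer identifier, and the frame number change; the picture count and order are untouched, which matches the claims' requirement that temporal resolution stay the same.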
April 9, 2015
January 3, 2017