In one embodiment of the present invention, an encode validator identifies and classifies errors introduced during the parallel chunk-based translation of a source to a corresponding aggregate encode. In operation, upon receiving a source for encoding, a frame difference generator creates a frame difference file for the source. A parallel encoder then distributes per-chunk encoding operations across machines and creates an aggregate encode. The encode validator decodes the aggregate encode and creates a corresponding frame difference file. Subsequently, the encode validator performs phase correlation operations between the two frame difference files to detect errors generated by encoding process faults (e.g., a dropped frame) while suppressing discrepancies inherent in encoding, such as those attributable to low bit-rate encoding. Advantageously, because the encode validator operates on frame difference files rather than on the source itself, this indirect verification technique enables efficient debugging of parallel encoding processes in which the complete source is unavailable for post-encode analysis.
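As a minimal illustration of the frame difference file described above, the following Python sketch computes one value per pair of adjacent frames. The per-frame characteristic (mean luma), the in-memory frame layout, and the names `mean_luma` and `frame_differences` are assumptions made for this example; the source only speaks of "a characteristic" of each frame.

```python
from typing import List, Sequence

# Hypothetical frame format: rows of luma samples (not specified by the source).
Frame = Sequence[Sequence[int]]


def mean_luma(frame: Frame) -> float:
    """Assumed per-frame characteristic: the average sample value."""
    total = sum(sum(row) for row in frame)
    count = sum(len(row) for row in frame)
    return total / count


def frame_differences(frames: Sequence[Frame]) -> List[float]:
    """Return the absolute difference in the characteristic between
    each frame and the frame that follows it."""
    values = [mean_luma(f) for f in frames]
    return [abs(nxt - cur) for cur, nxt in zip(values, values[1:])]


# Four tiny 2x2 frames: a jump, a static pair, then another change.
frames = [
    [[0, 0], [0, 0]],
    [[10, 10], [10, 10]],
    [[10, 10], [10, 10]],
    [[4, 4], [4, 4]],
]
diffs = frame_differences(frames)
```

Because the same function can be applied to the source frames and to the decoded frames, the two resulting sequences can be compared later without keeping the source itself.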
Legal claims defining the scope of protection, as filed with the USPTO.
1. A computer-implemented method for identifying errors introduced during encoding, the method comprising: receiving frame difference data derived from source data by determining, for each source frame included in a plurality of source frames of the source data, a difference between a characteristic of the source frame and a characteristic of an adjacent source frame that resides adjacent to the source frame in the plurality of source frames; receiving frame difference data derived from aggregate decoded data by determining, for each decoded frame included in a plurality of decoded frames of the aggregate decoded data, a difference between a characteristic of the decoded frame and a characteristic of an adjacent decoded frame that resides adjacent to the decoded frame in the plurality of decoded frames, wherein the plurality of source frames of the source data corresponds to the plurality of decoded frames of the aggregate decoded data, wherein the aggregate decoded data is generated by decoding aggregate encoded data that is derived from the source data by separately encoding a plurality of chunks of the source data to generate a plurality of encoded chunks of the source data and combining the plurality of encoded chunks of the source data, and wherein each chunk included in the plurality of chunks of the source data is encoded by a separate compute instance included in a plurality of compute instances; comparing the frame difference data derived from the source data and the frame difference data derived from the aggregate decoded data to generate a first comparison; detecting that an error condition occurred while encoding one or more encoded chunks of the source data included in the plurality of encoded chunks of the source data based on the first comparison; and debugging, based on the error condition, an encoding process performed to generate the one or more encoded chunks of source data by one or more compute instances included in the plurality of compute instances.
2. The computer-implemented method of claim 1, wherein detecting that the error condition occurred comprises: determining (1) a first total number of source frames associated with the frame difference data derived from the source data, and (2) a second total number of decoded frames associated with the frame difference data derived from the aggregate decoded data, and determining that the first total number of source frames differs from the second total number of decoded frames by more than a predetermined threshold.
3. The computer-implemented method of claim 1, wherein the comparing comprises: partitioning the frame difference data derived from the source data and the frame difference data derived from the decoded aggregate encode into a plurality of blocks, wherein each block in the plurality of blocks includes a subset of frame difference data derived from the source data and a corresponding subset of frame difference data derived from the decoded aggregate encoded data; and for each block, comparing the frame difference data derived from the source data to the frame difference data derived from the decoded aggregate encoded data to determine a phase correlation value for the block.
4. The computer-implemented method of claim 3, wherein detecting that the error condition occurred comprises: identifying a first number of blocks, wherein each block included in the first number of blocks has a phase correlation value less than a first predetermined threshold; and determining that the first number of blocks is greater than a second predetermined threshold.
5. The computer-implemented method of claim 3, wherein detecting that the error condition occurred comprises: for a first block, performing a phase shift operation on the frame difference data derived from the source data; for the first block, comparing the shifted frame difference data derived from the source data to the corresponding frame difference data derived from the decoded aggregate encoded data to determine a shifted phase correlation value; and determining that the shifted phase correlation value is greater than a phase correlation value for the first block by at least a first predetermined amount.
6. The computer-implemented method of claim 3, wherein detecting that the error condition occurred comprises: identifying a scene cut based on the phase correlation values; identifying a first number of blocks, wherein each block included in the first number of blocks does not immediately precede the scene cut and has a phase correlation value less than a first predetermined threshold; and determining that the first number of blocks is greater than a second predetermined threshold.
7. The computer-implemented method of claim 3, wherein detecting that the error condition occurred comprises: identifying a first block having a phase correlation value less than a first predetermined threshold; and determining that the encoding bit-rate of the aggregate encoded data is greater than a second predetermined threshold.
8. The computer-implemented method of claim 3, wherein detecting that the error condition occurred comprises: identifying a set of low correlation blocks, wherein each block included in the set of low correlation blocks has a phase correlation value less than a first predetermined threshold; determining a distribution based on phase correlation values for the set of low correlation blocks; computing a confidence zone based on the distribution; identifying a first number of blocks, wherein each block included in the first number of blocks is included in the set of low correlation blocks and has a phase correlation value that is outside the confidence zone; and determining that the first number of blocks is greater than a second predetermined threshold.
9. The computer-implemented method of claim 8, wherein identifying a first block that has a phase correlation value that is outside the confidence zone comprises applying the Grubbs test to the distribution.
10. One or more non-transitory computer-readable storage media including instructions that, when executed by one or more processing units, cause the one or more processing units to identify errors introduced during encoding by performing the steps of: receiving frame difference data derived from source data by determining, for each source frame included in a plurality of source frames of the source data, a difference between a characteristic of the source frame and a characteristic of an adjacent source frame that resides adjacent to the source frame in the plurality of source frames; receiving frame difference data derived from aggregate decoded data by determining, for each decoded frame included in a plurality of decoded frames of the aggregate decoded data, a difference between a characteristic of the decoded frame and a characteristic of an adjacent decoded frame that resides adjacent to the decoded frame in the plurality of decoded frames, wherein the plurality of source frames of the source data corresponds to the plurality of decoded frames of the aggregate decoded data, wherein the aggregate decoded data is generated by decoding aggregate encoded data that is derived from the source data by separately encoding a plurality of chunks of the source data to generate a plurality of encoded chunks of the source data and combining the plurality of encoded chunks of the source data, and wherein each chunk included in the plurality of chunks of the source data is encoded by a separate compute instance included in a plurality of compute instances; comparing the frame difference data derived from the source data and the frame difference data derived from the aggregate decoded data to generate a first comparison; detecting that an error condition occurred while encoding one or more encoded chunks of the source data included in the plurality of encoded chunks of the source data based on the first comparison; and debugging, based on the error condition, an encoding process performed to generate the one or more encoded chunks of source data by one or more compute instances included in the plurality of compute instances.
11. The one or more non-transitory computer-readable storage media of claim 10, wherein detecting that the error condition occurred comprises: determining (1) a first total number of source frames associated with the frame difference data derived from the source data, and (2) a second total number of decoded frames associated with the frame difference data derived from the aggregate decoded data, and determining that the first total number of source frames differs from the second total number of decoded frames by more than a predetermined threshold.
12. The one or more non-transitory computer-readable storage media of claim 10, wherein the comparing comprises: partitioning the frame difference data derived from the source data and the frame difference data derived from the decoded aggregate encode into a plurality of blocks, wherein each block in the plurality of blocks includes a subset of frame difference data derived from the source data and a corresponding subset of frame difference data derived from the decoded aggregate encoded data; and for each block, comparing the frame difference data derived from the source data to the frame difference data derived from the decoded aggregate encoded data to determine a cross-correlation value for the block.
13. The one or more non-transitory computer-readable storage media of claim 12, wherein detecting that the error condition occurred comprises: identifying a first number of sequential blocks, wherein each block included in the first number of sequential blocks has a cross-correlation value less than a first predetermined threshold; and determining that the first number of sequential blocks exceeds a second predetermined threshold.
14. The one or more non-transitory computer-readable storage media of claim 12, wherein detecting that the error condition occurred comprises: determining that the encoding bit-rate of the aggregate encoded data is greater than a second predetermined threshold; identifying a first number of blocks, wherein each block included in the first number of blocks has a cross-correlation value less than a first predetermined threshold; and determining that the first number of blocks is greater than a second predetermined threshold.
15. The one or more non-transitory computer-readable storage media of claim 12, wherein detecting that the error condition occurred comprises: identifying a set of low cross-correlation blocks, wherein each block included in the set of low cross-correlation blocks has a cross-correlation value less than a first predetermined threshold; applying the Grubbs test to the set of low cross-correlation blocks to identify a first number of blocks; and determining that the first number of blocks is greater than a second predetermined threshold.
16. The one or more non-transitory computer-readable storage media of claim 12, wherein detecting that the error condition occurred comprises: for a first block, performing a phase shift operation on the frame difference data derived from the source data; for the first block, comparing the shifted frame difference data derived from the source data to the corresponding frame difference data derived from the decoded aggregate encoded data to determine a shifted cross-correlation value; and determining that the shifted cross-correlation value is greater than a cross-correlation value for the first block by at least a first predetermined amount.
17. The one or more non-transitory computer-readable storage media of claim 12, wherein detecting that the error condition occurred comprises: identifying a scene cut based on the cross-correlation values; and identifying a first block, wherein the first block does not immediately precede the scene cut and has a cross-correlation value less than a first predetermined threshold.
18. The one or more non-transitory computer-readable storage media of claim 17, wherein identifying the scene cut comprises detecting a first source frame that has a frame difference data derived from the source data that exceeds a first threshold; and determining that a second source frame that immediately precedes the first source frame has a frame difference data derived from the source data that differs from the frame difference data derived from the source data for the first source frame by at least a second threshold.
19. A system configured to identify errors introduced during encoding, the system comprising: a source frame difference data generator configured to receive frame difference data derived from source data by determining, for each source frame included in a plurality of source frames of the source data, a difference between a characteristic of the source frame and a characteristic of an adjacent source frame that resides adjacent to the source frame in the plurality of source frames; a parallel encoding engine configured to derive aggregate encoded data from the source data by separately encoding a plurality of chunks of the source data to generate a plurality of encoded chunks of the source data and combining the plurality of encoded chunks of the source data, and wherein each chunk included in the plurality of chunks of the source data is encoded by a separate compute instance included in a plurality of compute instances; a verification generator configured to: receive frame difference data derived from the aggregate decoded data by determining, for each decoded frame included in a plurality of decoded frames of the aggregate decoded data, a difference between a characteristic of the decoded frame and a characteristic of an adjacent decoded frame that resides adjacent to the decoded frame in the plurality of decoded frames, wherein the plurality of source frames of the source data corresponds to the plurality of decoded frames of the aggregate decoded data, wherein the aggregate decoded data is generated by decoding the aggregate encoded data; compare the frame difference data derived from the source data and the frame difference data derived from the aggregate decoded data to a first comparison; detect that an error condition occurred while encoding one or more encoded chunks of the source data included in the plurality of encoded chunks of the source data based on the first comparison; and debug, based on the error condition, an encoding process performed to generate the one or more encoded chunks of source data by one or more compute instances included in the plurality of compute instances.
20. The system of claim 19, wherein detecting that the error condition occurred comprises: determining (1) a first total number of source frames associated with the frame difference data derived from the source data, and (2) a second total number of decoded frames associated with the frame difference data derived from the aggregate decoded data, and determining that the first total number of source frames differs from the second total number of decoded frames by more than a predetermined threshold.
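Several of the detection steps recited in the claims (the frame count comparison, per-block correlation, counting low-correlation blocks, and the shift test) can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: a zero-lag normalized cross-correlation stands in for the phase correlation value, a positive index offset stands in for the phase shift operation, and all function names, block sizes, and thresholds are hypothetical.

```python
import math
from typing import List, Sequence


def normalized_correlation(a: Sequence[float], b: Sequence[float]) -> float:
    """Zero-lag normalized cross-correlation (assumed stand-in for the
    correlation value in the claims). Inputs are truncated to equal length."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    a, b = list(a[:n]), list(b[:n])
    mean_a, mean_b = sum(a) / n, sum(b) / n
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                    * sum((y - mean_b) ** 2 for y in b))
    if den == 0:
        # Flat blocks carry no shape information; treat identical ones as matching.
        return 1.0 if a == b else 0.0
    return num / den


def frame_count_mismatch(src: Sequence[float], dec: Sequence[float],
                         threshold: int = 0) -> bool:
    """True when the two sequences differ in length by more than threshold."""
    return abs(len(src) - len(dec)) > threshold


def block_correlations(src: Sequence[float], dec: Sequence[float],
                       block_size: int) -> List[float]:
    """Partition both sequences into blocks and correlate each pair."""
    n = min(len(src), len(dec))
    return [normalized_correlation(src[i:i + block_size], dec[i:i + block_size])
            for i in range(0, n, block_size)]


def too_many_low_blocks(corrs: Sequence[float], corr_threshold: float,
                        count_threshold: int) -> bool:
    """True when more than count_threshold blocks fall below corr_threshold."""
    return sum(1 for c in corrs if c < corr_threshold) > count_threshold


def shift_improves(src: Sequence[float], dec: Sequence[float], start: int,
                   block_size: int, shift: int, margin: float) -> bool:
    """True when offsetting the source block by `shift` (assumed >= 0) raises
    the correlation by at least `margin`, which hints at dropped frames."""
    dec_block = dec[start:start + block_size]
    base = normalized_correlation(src[start:start + block_size], dec_block)
    shifted = normalized_correlation(
        src[start + shift:start + shift + block_size], dec_block)
    return shifted - base >= margin


# Made-up frame difference values; the decode is missing the entry at index 4.
src = [1, 5, 2, 8, 3, 9, 4, 7, 2, 6, 1, 8]
dec = src[:4] + src[5:]
corrs = block_correlations(src, dec, block_size=4)
```

In this toy input the first block matches, the later blocks decorrelate, and shifting the source by one position restores the match for the second block, which is the pattern the shift test is meant to expose.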
February 13, 2015
June 2, 2020