A method and apparatus for adaptive denoising of source video in a video conference application are provided. Captured video is analyzed on a frame-by-frame basis to determine whether a frame should be denoised before the source frame is provided to an encoder. If the frame is to be denoised, the frame is divided into a plurality of blocks and a local denoising process is performed on a block-by-block basis.
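The flow described above can be sketched in Python. This is only an illustration under stated assumptions, not the patented implementation: the names `denoise_block`, `process_frame`, and `should_denoise`, the list-of-lists frame layout, and the simple blend with the previous frame are all introduced here for the example, and frame dimensions are assumed to be multiples of the block size.

```python
from typing import Callable, List

Frame = List[List[float]]  # hypothetical layout: rows of luma samples


def denoise_block(cur: Frame, prev: Frame, alpha: float = 0.5) -> Frame:
    """Placeholder local denoiser (assumption): blend each pixel with the
    co-located pixel of the previous frame. The source does not fix a filter."""
    return [[alpha * c + (1.0 - alpha) * p for c, p in zip(cur_row, prev_row)]
            for cur_row, prev_row in zip(cur, prev)]


def process_frame(frame: Frame, prev: Frame, n: int,
                  should_denoise: Callable[[Frame], bool]) -> Frame:
    """Frame-level decision first, then block-by-block denoising."""
    if not should_denoise(frame):
        # Frame is passed to the encoder unchanged (returned as a copy).
        return [row[:] for row in frame]

    height, width = len(frame), len(frame[0])
    out = [[0.0] * width for _ in range(height)]
    for by in range(0, height, n):
        for bx in range(0, width, n):
            # Cut the co-located n x n blocks out of both frames.
            cur_blk = [row[bx:bx + n] for row in frame[by:by + n]]
            prev_blk = [row[bx:bx + n] for row in prev[by:by + n]]
            denoised = denoise_block(cur_blk, prev_blk)
            # Write the denoised block back into the output frame.
            for dy, row in enumerate(denoised):
                out[by + dy][bx:bx + n] = row
    return out
```

Passing the frame-level test as a callable keeps the sketch neutral about how that decision is made.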
Legal claims defining the scope of protection, as filed with the USPTO.
1. A method for adaptive denoising of source video in a video conferencing application, the method comprising: buffering a plurality of source frames from captured video in a source frame buffer; filtering the buffered source frames to identify source frames for further processing; for each of the filtered source frames, dividing the filtered source frame into a plurality of blocks, each block having N×N pixels, N being an integer; performing a temporal denoising process on each of the plurality of blocks; combining the plurality of denoised blocks into an output frame; scanning the denoised blocks of the output frame and, for each denoised block, determining whether to keep the denoised block or replace it with its corresponding block from the filtered source frame; and providing the scanned output frames to an encoder.
2. The method of claim 1, wherein filtering the buffered source frames includes further processing each of the buffered frames.
3. The method of claim 1, further comprising: encoding the scanned output frames into a bitstream; transmitting the bitstream to a destination device; and parsing the bitstream to extract quantization parameters and motion vectors, wherein filtering the buffered source frames includes, for each buffered source frame, determining whether the average quantization employed in the bitstream satisfies a predefined threshold; in response to the predefined threshold being satisfied, copying the buffered source frame directly to an output frame without denoising and providing the output frame to the encoder; and in response to the predefined threshold not being satisfied, outputting the filtered source frame for further processing.
4. The method of claim 3, further comprising: after dividing the filtered block into a plurality of blocks, filtering the plurality of blocks by comparing an extracted motion vector from a previous encoded frame at the same spatial position with a predefined motion threshold; in response to the predefined motion threshold being satisfied, copying the block directly into the output frame without denoising the block; and in response to the predefined motion threshold not being satisfied, performing the temporal denoising process on the block.
5. The method of claim 1, wherein scanning the denoised blocks of the output frame comprises: sequentially processing each denoised block within the output frame, and for each denoised block: determining whether the denoised block is a skin block; and determining whether the variance of the denoised block satisfies a first adaptive threshold or a second adaptive threshold based on the determination of whether the denoised block is a skin block; and in response to the first adaptive threshold being satisfied or the second adaptive threshold being satisfied, keeping the denoised block in the output frame; and in response to the first adaptive threshold not being satisfied or the second adaptive threshold not being satisfied, replacing the denoised block in the output frame with its corresponding block from the filtered source frame.
6. The method of claim 5, wherein scanning the denoised blocks of the output frame further comprises: sequentially scanning the previously scanned output frame, and for each denoised block within the previously scanned output frame, determining whether a set of connecting neighbor blocks have been denoised; and in response to a determination that the set of connecting neighbor blocks have not been denoised, replacing the denoised block in the output frame with its corresponding block from the filtered source frame.
7. The method of claim 1, wherein scanning the denoised blocks of the output frame comprises: processing the denoised blocks in the output frame using a checkerboard pattern such that every other denoised block in the output frame is sequentially processed starting with the odd blocks and then the even blocks.
8. The method of claim 7, wherein the sequential processing of the odd blocks includes, for each odd denoised block: determining whether the denoised block is a skin block; and determining whether the variance of the denoised block satisfies a first adaptive threshold or a second adaptive threshold based on the determination of whether the denoised block is a skin block; and in response to the first adaptive threshold being satisfied or the second adaptive threshold being satisfied, keeping the denoised odd block in the output frame; and in response to the first adaptive threshold not being satisfied or the second adaptive threshold not being satisfied, replacing the denoised block in the output frame with its corresponding block from the filtered source frame.
9. The method of claim 8, wherein the sequential processing of the even blocks includes, for each even denoised block, determining whether the denoised block is a skin block; and determining whether the variance of the denoised block satisfies a first adaptive threshold or a second adaptive threshold based on the determination of whether the denoised block is a skin block; and in response to the first adaptive threshold not being satisfied or the second adaptive threshold not being satisfied, replacing the denoised block in the output frame with its corresponding block from the filtered source frame; and in response to the first adaptive threshold being satisfied or the second adaptive threshold being satisfied, determining whether a set of connecting odd neighbor blocks have been denoised; and in response to a determination that the set of connecting odd neighbor blocks have not been denoised, replacing the denoised block in the output frame with its corresponding block from the filtered source frame; and in response to a determination that the set of connecting odd neighbor blocks has been denoised, keeping the denoised block in the output frame.
10. The method of claim 8, wherein for every other frame, the sequential processing of the denoised blocks in a checkerboard pattern starts with the even blocks and then the odd blocks.
11. The method of claim 10, wherein the sequential processing of the even blocks includes, for each even denoised block, determining whether the denoised block is a skin block; and determining whether the variance of the denoised block satisfies a first adaptive threshold or a second adaptive threshold based on the determination of whether the denoised block is a skin block; and in response to the first adaptive threshold or the second adaptive threshold being satisfied, keeping the denoised even block in the output frame; and in response to the first adaptive threshold or the second adaptive threshold not being satisfied, replacing the denoised block with its corresponding block from the filtered source frame.
12. The method of claim 11, wherein the sequential processing of the odd blocks includes, for each odd denoised block, determining whether the denoised block is a skin block; and determining whether the variance of the denoised block satisfies a first adaptive threshold or a second adaptive threshold based on the determination of whether the denoised block is a skin block; and in response to the first adaptive threshold not being satisfied or the second adaptive threshold not being satisfied, replacing the denoised block in the output frame with its corresponding block from the filtered source frame; and in response to the first adaptive threshold being satisfied or the second adaptive threshold being satisfied, determining whether a set of connecting even neighbor blocks have been denoised; and in response to a determination that the set of connecting even neighbor blocks have not been denoised, replacing the denoised block in the output frame with its corresponding block from the filtered source frame; and in response to a determination that the set of connecting odd neighbor blocks has been denoised, keeping the denoised block in the output frame.
13. A mobile device, comprising: a video capturing device; a video encoder; a source frame buffer that stores video frames from the video capturing device; and a processor configured to: buffer a plurality of source frames from captured video in a source frame buffer; filter the buffered source frames to identify source frames for further processing; for each of the filtered source frames, divide the filtered source frame into a plurality of blocks, each block having N×N pixels, N being an integer; perform a temporal denoising process on each of the plurality of blocks; combine the plurality of denoised blocks into an output frame; scan the denoised blocks of the output frame and, for each denoised block, determine whether to keep the denoised block or replace it with its corresponding block from the filtered source frame; and provide the scanned output frames to the encoder.
14. The device of claim 13, wherein filtering the buffered source frames includes further processing each of the buffered frames.
15. The device of claim 13, wherein the processor is further configured to: encode the scanned output frames into a bitstream; transmit the bitstream to a destination device; and parse the bitstream to extract quantization parameters and motion vectors, wherein filtering the buffered source frames includes, for each buffered source frame, determining whether the average quantization employed in the bitstream satisfies a predefined threshold; in response to the predefined threshold being satisfied, copying the buffered source frame directly to an output frame without denoising and providing the output frame to the encoder; and in response to the predefined threshold not being satisfied, outputting the filtered source frame for further processing.
16. The method of claim 15, wherein the processor is further configured to: filter, after dividing the filtered block into a plurality of blocks, the plurality of blocks by comparing an extracted motion vector from a previous encoded frame at the same spatial position with a predefined motion threshold; copy, in response to the predefined motion threshold being satisfied, the block directly into the output frame without denoising the block; and perform, in response to the predefined motion threshold not being satisfied, the temporal denoising process on the block.
17. The device of claim 13, wherein scanning the denoised blocks of the output frame comprises: sequentially processing each denoised block within the output frame, and for each denoised block: determining whether the denoised block is a skin block; and determining whether the variance of the denoised block satisfies a first adaptive threshold or a second adaptive threshold based on the determination of whether the denoised block is a skin block; in response to the first adaptive threshold being satisfied or the second adaptive threshold being satisfied, keeping the denoised block in the output frame; and in response to the first adaptive threshold not being satisfied or the second adaptive threshold not being satisfied, replacing the denoised block in the output frame with its corresponding block from the filtered source frame.
18. The method of claim 17, wherein scanning the denoised blocks of the output frame further comprises: sequentially scanning the previously scanned output frame, and for each denoised block within the previously scanned output frame, determining whether a set of connecting neighbor blocks have been denoised; and in response to a determination that the set of connecting neighbor blocks have not been denoised, replacing the denoised block in the output frame with its corresponding block from the filtered source frame.
19. The method of claim 13, wherein scanning the denoised blocks of the output frame comprises: processing the denoised blocks in the output frame using a checkerboard pattern such that every other denoised block in the output frame is sequentially processed starting with the odd blocks and then the even blocks.
20. The method of claim 19, wherein the sequential processing of the odd blocks includes, for each odd denoised block: determining whether the denoised block is a skin block; and determining whether the variance of the denoised block satisfies a first adaptive threshold or a second adaptive threshold based on the determination of whether the denoised block is a skin block; and in response to the first adaptive threshold being satisfied or the second adaptive threshold being satisfied, keeping the denoised odd block in the output frame; and in response to the first adaptive threshold not being satisfied or the second adaptive threshold not being satisfied, replacing the denoised block in the output frame with its corresponding block from the filtered source frame.
21. The method of claim 20, wherein the sequential processing of the even blocks includes, for each even denoised block, determining whether the denoised block is a skin block; and determining whether the variance of the denoised block satisfies a first adaptive threshold or a second adaptive threshold based on the determination of whether the denoised block is a skin block; and in response to the first adaptive threshold not being satisfied or the second adaptive threshold not being satisfied, replacing the denoised block in the output frame with its corresponding block from the filtered source frame; and in response to the first adaptive threshold being satisfied or the second adaptive threshold being satisfied, determining whether a set of connecting odd neighbor blocks have been denoised; and in response to a determination that the set of connecting odd neighbor blocks have not been denoised, replacing the denoised block in the output frame with its corresponding block from the filtered source frame; and in response to a determination that the set of connecting odd neighbor blocks has been denoised, keeping the denoised block in the output frame.
22. The device of claim 19, wherein for every other frame, the sequential processing of the denoised blocks in a checkerboard pattern starts with the even blocks and then the odd blocks.
23. The device of claim 22, wherein the sequential processing of the even blocks includes, for each even denoised block, determining whether the denoised block is a skin block; and determining whether the variance of the denoised block satisfies a first adaptive threshold or a second adaptive threshold based on the determination of whether the denoised block is a skin block; and in response to the first adaptive threshold or the second adaptive threshold being satisfied, keeping the denoised even block in the output frame; and in response to the first adaptive threshold or the second adaptive threshold not being satisfied, replacing the denoised block with its corresponding block from the filtered source frame.
24. The device of claim 23, wherein the sequential processing of the odd blocks includes, for each odd denoised block, determining whether the denoised block is a skin block; and determining whether the variance of the denoised block satisfies a first adaptive threshold or a second adaptive threshold based on the determination of whether the denoised block is a skin block; and in response to the first adaptive threshold not being satisfied or the second adaptive threshold not being satisfied, replacing the denoised block in the output frame with its corresponding block from the filtered source frame; and in response to the first adaptive threshold being satisfied or the second adaptive threshold being satisfied, determining whether a set of connecting even neighbor blocks have been denoised; and in response to a determination that the set of connecting even neighbor blocks have not been denoised, replacing the denoised block in the output frame with its corresponding block from the filtered source frame; and in response to a determination that the set of connecting odd neighbor blocks has been denoised, keeping the denoised block in the output frame.
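The two bypass tests recited in claims 3 and 4 (and their device counterparts, claims 15 and 16) can be illustrated with a short Python sketch. The claims do not say which direction of comparison counts as "satisfying" a threshold, so the `>=` and `>` comparisons below are assumptions, as are the function names and the use of motion-vector magnitude.

```python
import math
from typing import Dict, List, Tuple

BlockPos = Tuple[int, int]          # (block_row, block_col), hypothetical
MotionVector = Tuple[float, float]  # (mv_x, mv_y) parsed from the bitstream


def frame_bypasses_denoising(block_qps: List[float], qp_threshold: float) -> bool:
    """Frame-level test: compare the average quantization parameter of the
    encoded bitstream with a predefined threshold.
    Assumption: the threshold is 'satisfied' when the average is >= it."""
    average_qp = sum(block_qps) / len(block_qps)
    return average_qp >= qp_threshold


def blocks_to_denoise(prev_mvs: Dict[BlockPos, MotionVector],
                      motion_threshold: float) -> List[BlockPos]:
    """Block-level test: use the motion vector of the co-located block in the
    previously encoded frame. Blocks whose motion 'satisfies' the threshold
    are copied without denoising; the rest are returned for denoising.
    Assumption: 'satisfied' means vector magnitude > motion_threshold."""
    selected = []
    for pos, (mv_x, mv_y) in prev_mvs.items():
        if math.hypot(mv_x, mv_y) > motion_threshold:
            continue  # copied directly into the output frame
        selected.append(pos)
    return selected
```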
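The keep-or-replace scan of claims 5 and 6 (device claims 17 and 18) might look like the following Python sketch. Several points are assumptions made only for illustration: skin detection is supplied by the caller as a boolean, the first threshold is applied to skin blocks and the second to non-skin blocks, a threshold is "satisfied" when the block variance is below it, "connecting neighbor blocks" means the four edge-adjacent blocks inside the frame, and a block is reverted if any of them was not kept as denoised.

```python
from typing import List, Set, Tuple

Block = List[List[float]]
BlockPos = Tuple[int, int]  # (block_row, block_col)


def block_variance(block: Block) -> float:
    """Population variance of all pixel values in the block."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)


def keep_denoised(block: Block, is_skin: bool,
                  skin_threshold: float, non_skin_threshold: float) -> bool:
    """First pass: pick the threshold by skin/non-skin, then test variance.
    Returns True to keep the denoised block, False to restore the source."""
    threshold = skin_threshold if is_skin else non_skin_threshold
    return block_variance(block) < threshold


def neighbor_pass(kept: Set[BlockPos], rows: int, cols: int) -> Set[BlockPos]:
    """Second pass: keep a denoised block only if every in-frame
    4-connected neighbor was also kept in the first pass."""
    result = set()
    for (r, c) in kept:
        candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
        in_frame = [(y, x) for (y, x) in candidates
                    if 0 <= y < rows and 0 <= x < cols]
        if all(nb in kept for nb in in_frame):
            result.add((r, c))
    return result
```

The second pass reads the first-pass result as a snapshot, so the outcome does not depend on scan order; that is a design choice of this sketch, not something the claims state.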
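The checkerboard scan of claims 7 through 12 (device claims 19 through 24) can be sketched in Python as a two-pass loop. The parity convention (a block is "odd" when row plus column is odd), the choice that even-numbered frames start with odd blocks, the 4-connected neighborhood, and the `passes_threshold` input (the outcome of the skin and variance test for each block) are assumptions introduced for this example.

```python
from typing import Dict, Tuple

BlockPos = Tuple[int, int]  # (block_row, block_col)


def checkerboard_scan(passes_threshold: Dict[BlockPos, bool],
                      rows: int, cols: int,
                      frame_index: int = 0) -> Dict[BlockPos, bool]:
    """Return, for every block, True if the denoised block is kept and
    False if it is replaced by the corresponding source block."""
    # Alternate the starting colour of the checkerboard on every other frame.
    first_parity = 1 if frame_index % 2 == 0 else 0
    kept: Dict[BlockPos, bool] = {}

    # Pass 1: blocks of the starting parity use the threshold test alone.
    for r in range(rows):
        for c in range(cols):
            if (r + c) % 2 == first_parity:
                kept[(r, c)] = passes_threshold[(r, c)]

    # Pass 2: remaining blocks must pass the threshold test and also have
    # all in-frame neighbors (which belong to pass 1) kept as denoised.
    for r in range(rows):
        for c in range(cols):
            if (r + c) % 2 != first_parity:
                candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
                neighbors = [(y, x) for (y, x) in candidates
                             if 0 <= y < rows and 0 <= x < cols]
                kept[(r, c)] = (passes_threshold[(r, c)]
                                and all(kept[nb] for nb in neighbors))
    return kept
```

Because edge-adjacent blocks on a checkerboard always have opposite parity, every neighbor consulted in the second pass has already been decided in the first.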
August 9, 2016
August 14, 2018