The present invention provides a framework for processing blocks between two data frames, with particular application to motion estimation calculations. The framework allows the balance among the performance of a motion search algorithm, the size of the on-chip memory used to store the reference data, and the required data transfer bandwidth between on-chip and external memory to be optimized in a scalable manner, so that the total system cost of a hierarchical embedded memory structure can be optimized flexibly. The scope of the present invention is not limited to digital video encoding, in which the motion vector is part of the information to be encoded, but extends to any other implementation in which the difference between any two data frames is to be computed.
1. A machine-implemented method for processing a first data frame in an external memory against a second data frame in the external memory, wherein the first data frame and the second data frame are divided into a plurality of blocks, comprising: ordering the blocks of the second data frame into a pre-defined order; defining an addressable data block (ADB) in an on-chip memory over the first data frame; defining a motion search range (MSR) within the ADB; loading selected blocks of the first data frame from the external memory into the ADB in the on-chip memory; loading one or more blocks of the second data frame from the external memory in the pre-defined order; and processing, with a chip, the one or more blocks of the second data frame against the selected blocks of the first data frame in the MSR, where the size of the MSR and the size of the ADB are defined as a function of the data transfer bandwidth between the on-chip memory and the external memory.
2. The method of claim 1 further comprising the steps of: while there are unprocessed blocks of the first data frame defined in the ADB, re-defining the MSR within the ADB; and processing the one or more blocks of the second data frame against the selected blocks of the first data frame in the MSR.
3. The method of claim 2 further including the steps of: repeating the defining an ADB step, the defining a MSR step, the loading steps, and the processing step of claim 1 and the while condition of claim 2.
4. The method of claim 3 wherein in repeating the defining an ADB step, the ADB is refined over the blocks of the first data frame in the direction of the pre-defined order.
5. The method of claim 2 wherein in the re-defining step, the MSR is re-defined in the direction of the unprocessed blocks of the first data frame.
6. The method of claim 1 wherein the MSR is defined having a size of m by n and the additional numbers of blocks to load at any one time into the ADB is M−1 in the vertical direction and N−1 in the horizontal direction, the method further comprising defining the size of the ADB to (M−1+m) by (N−1+n).
7. The method of claim 1 where the size of the ADB is a function of the size of the on-chip memory.
8. The method of claim 1 where the size of the MSR is a function of coding performance.
9. The method of claim 1, wherein the MSR is a subset of the ADB.
10. The method of claim 1, wherein the pre-defined order of the blocks of the second data frame is a function of the amount of information of processed neighbouring blocks available to the processing of the current block.
11. The method of claim 1, wherein the pre-defined order of the blocks of the second data frame is a function of performance of the processing.
12. A machine-implemented method for processing a first data frame in an external memory against a second data frame in the external memory, wherein the first data frame and the second data frame are divided into a plurality of blocks, comprising: ordering the blocks of the second data frame into a pre-defined order; defining an addressable data block (ADB) in an on-chip memory over the first data frame; defining a motion search range (MSR) within the ADB; loading selected blocks of the first data frame from the external memory into the ADB in the on-chip memory; loading one or more blocks of the second data frame from the external memory in the pre-defined order; processing, with a chip, the one or more blocks of the second data frame against the selected blocks of the first data frame in the MSR; while there are unprocessed blocks of the first data frame in the ADB, re-defining the MSR within the ADB; and processing the one or more blocks of the second data frame against the selected blocks of the first data frame in the MSR; and repeating the defining an ADB step, the defining a MSR step, the loading steps, the processing step, and the while step, where the size of the MSR and the size of the ADB are defined as a function of the data transfer bandwidth between the on-chip memory and the external memory.
13. The method of claim 12 wherein in repeating the defining an ADB step, the ADB is refined over the blocks of the first data frame in the direction of the pre-defined order.
14. The method of claim 12 wherein in the re-defining step, the MSR is re-defined in the direction of the unprocessed blocks of the first data frame.
15. The method of claim 12 wherein the MSR is defined having a size of m by n and the additional numbers of blocks to load at any one time into the ADB is M−1 in the vertical direction and N−1 in the horizontal direction, the method further comprising defining the size of the ADB to (M−1+m) by (N−1+n).
16. The method of claim 12 where the size of the ADB is a function of the size of the on-chip memory.
17. The method of claim 12 where the size of the MSR is a function of coding performance.
18. A machine-implemented method for processing a first data frame in an external memory against a second data frame in the external memory, wherein the first data frame and the second data frame are divided into a plurality of blocks, comprising: ordering the blocks of the second data frame into a pre-defined order; defining an addressable data block (ADB) in an on-chip memory over the first data frame; defining a motion search range (MSR) within the ADB; loading selected blocks of the first data frame from the external memory into the ADB in the on-chip memory; loading one or more blocks of the second data frame from the external memory in the pre-defined order; processing, with a chip, the one or more blocks of the second data frame against the selected blocks of the first data frame in the MSR; while there are unprocessed blocks of the first data frame in the ADB, re-defining the MSR within the ADB; and processing the one or more blocks of the second data frame against the one or more blocks of the first data frame in the MSR; and repeating the defining an ADB step, the defining a MSR step, the loading steps, the processing step, and the while step; wherein the MSR is defined having a size of m by n and the additional numbers of blocks to load at any one time into the ADB is M−1 in the vertical direction and N−1 in the horizontal direction, and the size of the ADB is defined as (M−1+m) by (N−1+n), where the size of the MSR and the size of the ADB are defined as a function of the data transfer bandwidth between the on-chip memory and the external memory.
19. A machine for processing a first data frame in an external memory against a second data frame in the external memory, wherein the first data frame and the second data frame are divided into a plurality of blocks, the machine comprising: an on-chip memory; a chip communicatively coupled to the on-chip memory; wherein the machine is configured to perform the steps of: ordering the blocks of the second data frame into a pre-defined order; defining an addressable data block (ADB) in the on-chip memory over the first data frame; defining a motion search range (MSR) within the ADB; loading selected blocks of the first data frame from the external memory into the ADB in the on-chip memory; loading one or more blocks of the second data frame from the external memory in the pre-defined order; and processing, with a processor, the one or more blocks of the second data frame against the selected blocks of the first data frame in the MSR, where the size of the MSR and the size of the ADB are defined as a function of the data transfer bandwidth between the on-chip memory and the external memory.
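The size relationship recited in claims 6, 15, and 18 can be illustrated with a short sketch. It is an illustration under stated assumptions, not the claimed implementation: the function names `adb_size`, `msr_positions`, and `blocks_loaded_per_window` are hypothetical, the raster order in which the MSR is re-defined is chosen only for the example, and the per-window load figure assumes that each MSR position inside an ADB is used once before the ADB is re-defined.

```python
from typing import Iterator, Tuple


def adb_size(m: int, n: int, M: int, N: int) -> Tuple[int, int]:
    """Return the ADB size in blocks, (M-1+m) rows by (N-1+n) columns.

    m, n: MSR size in blocks (vertical, horizontal).
    M-1, N-1: additional blocks loaded vertically and horizontally.
    """
    if min(m, n, M, N) < 1:
        raise ValueError("m, n, M and N must all be at least 1")
    return (M - 1 + m, N - 1 + n)


def msr_positions(m: int, n: int, M: int, N: int) -> Iterator[Tuple[int, int]]:
    """Yield the top-left (row, col) block offset of every m-by-n MSR
    window that fits inside the ADB.

    Raster order is an assumption made for this example only.
    """
    rows, cols = adb_size(m, n, M, N)
    for r in range(rows - m + 1):      # M vertical positions
        for c in range(cols - n + 1):  # N horizontal positions
            yield (r, c)


def blocks_loaded_per_window(m: int, n: int, M: int, N: int) -> float:
    """Reference blocks loaded per MSR position when one ADB load is
    shared by all M*N MSR positions (an assumption of this sketch)."""
    rows, cols = adb_size(m, n, M, N)
    return rows * cols / (M * N)


if __name__ == "__main__":
    m, n, M, N = 3, 3, 2, 4
    print("ADB size (blocks):", adb_size(m, n, M, N))
    print("MSR positions:", list(msr_positions(m, n, M, N)))
    print("Blocks loaded per MSR position:", blocks_loaded_per_window(m, n, M, N))
    print("Blocks per MSR position without reuse:", m * n)
```

Under these assumptions, a larger M and N spread each load from external memory over more MSR positions, at the cost of a larger ADB in on-chip memory, which is the trade-off the claims tie to the data transfer bandwidth.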
June 25, 2007
January 3, 2012