Tile-based interleaving and de-interleaving of row-column interleaved data is described. In one example, the de-interleaving is divided into two memory transfer stages, the first from an on-chip memory to a DRAM and the second from the DRAM to an on-chip memory. Each stage operates on part of a row-column interleaved block of data and re-orders the data items, such that the output of the second stage comprises de-interleaved data. In the first stage, data items are read from the on-chip memory according to a non-linear sequence of memory read addresses and written to the DRAM. In the second stage, data items are read from the DRAM according to bursts of linear address sequences, which make efficient use of the DRAM interface, and are written back to on-chip memory according to a non-linear sequence of memory write addresses.
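The two-stage flow described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware: the memories are plain Python lists, the interleaved input is assumed to be stored column by column, the tiles are assumed to divide the grid exactly, and each tile is assumed to occupy one contiguous DRAM burst. The names `interleave`, `deinterleave_two_stage`, `sram_in`, `sram_out` and `dram` are hypothetical.

```python
def interleave(items, n_rows, n_cols):
    """Row-column interleave: fill the grid row by row, read it column by column."""
    return [items[r * n_cols + c] for c in range(n_cols) for r in range(n_rows)]


def deinterleave_two_stage(sram_in, n_rows, n_cols, R, C):
    """De-interleave through a modelled DRAM, one R x C tile at a time."""
    # Assumption: tiles divide the grid exactly.
    assert n_rows % R == 0 and n_cols % C == 0
    X = R * C  # data items per tile; used here as the burst length
    dram = [None] * (n_rows * n_cols)
    sram_out = [None] * (n_rows * n_cols)
    tiles = [(r0, c0) for r0 in range(0, n_rows, R) for c0 in range(0, n_cols, C)]

    # Stage 1: non-linear reads from on-chip memory, linear burst writes to DRAM.
    for t, (r0, c0) in enumerate(tiles):
        burst = []
        for r in range(r0, r0 + R):       # start address moves by one per tile row
            for c in range(c0, c0 + C):   # successive reads are n_rows addresses apart
                burst.append(sram_in[c * n_rows + r])
        dram[t * X:(t + 1) * X] = burst   # assumption: tiles stored back to back

    # Stage 2: linear burst reads from DRAM, non-linear writes to on-chip memory.
    for t, (r0, c0) in enumerate(tiles):
        burst = dram[t * X:(t + 1) * X]
        for i in range(R):
            start = (r0 + i) * n_cols + c0  # C consecutive addresses, then a jump
            sram_out[start:start + C] = burst[i * C:(i + 1) * C]
    return sram_out


original = list(range(24))                 # a 4 x 6 block of data items
interleaved = interleave(original, 4, 6)
restored = deinterleave_two_stage(interleaved, 4, 6, R=2, C=3)
```

In this model every DRAM access is a contiguous run of `X` items, while all of the scattered accesses fall on the on-chip memories, which is the division of work the abstract describes.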
Legal claims defining the scope of protection, as filed with the USPTO.
1. A digital signal processing system-on-chip, comprising: a first memory storing a plurality of data items arranged in a first sequence, each data item having an associated memory address on the first memory; a second memory; and a transfer engine coupled to the first memory and the second memory and comprising a port to a dynamic random access memory (DRAM) wherein the transfer engine is configured to transfer the plurality of data items directly from the first memory to the DRAM in a first memory transfer stage and to transfer the plurality of data items directly from the DRAM to the second memory in a second memory transfer stage, wherein in the first memory transfer stage, the transfer engine is arranged to read the plurality of data items from the first memory according to a predefined non-linear sequence of memory read addresses and to write the plurality of data items to the DRAM, and wherein in the second memory transfer stage, the transfer engine is arranged to read the plurality of data items from the DRAM according to bursts of linear address sequences, each burst of linear address sequences having a length selected based on a DRAM interface burst size, and to write the plurality of data items to the second memory according to a predefined non-linear sequence of memory write addresses, such that the plurality of data items are arranged in a second sequence on the second memory that is different from the first sequence and wherein one of the first sequence and the second sequence comprises row-column interleaved data and wherein the second sequence is either row-column interleaved or de-interleaved with respect to the first sequence.
2. A digital signal processing system-on-chip according to claim 1, wherein the first memory and the second memory are both static random access memory.
3. A digital signal processing system-on-chip according to claim 1, wherein the first memory and the second memory are the same on-chip memory.
4. A digital signal processing system-on-chip according to claim 1, further comprising the DRAM.
5. A digital signal processing system-on-chip according to claim 1, wherein the plurality of data items comprises a subset of a block of data items and the transfer engine is further arranged to repeat the first and second memory transfer stages until all the block of data items has been written to the second memory.
6. A digital signal processing system-on-chip according to claim 1, further comprising at least one address generating element arranged to generate the predefined non-linear sequence of memory read addresses and the predefined non-linear sequence of memory write addresses.
7. A digital signal processing system-on-chip according to claim 1, wherein the plurality of data items comprises a subset of a block of data items and the block of data items is defined as being arranged as a grid comprising a number of rows of data items and a number of columns of data items.
8. A digital signal processing system-on-chip according to claim 7, wherein the grid further comprises a plurality of tiles, each tile comprising a rectangular portion of the grid and further comprising R rows and C columns of data items and wherein the plurality of data items comprises one or more tiles.
9. A digital signal processing system-on-chip according to claim 8, wherein the predefined non-linear sequence of memory read addresses comprises, for each tile in the first plurality of data items: a sequence of non-consecutive memory addresses separated by a fixed number of memory addresses and starting at an initial starting address, the fixed number corresponding to one less than the number of rows in the grid, until a boundary of the tile is reached, followed by one or more additional sequences of non-consecutive memory addresses, each additional sequence starting at an offset initial starting address.
10. A digital signal processing system-on-chip according to claim 8, wherein the predefined non-linear sequence of memory write addresses comprises: a sequence of groups of C consecutive memory addresses separated by a fixed number of memory addresses in the second memory and starting at an initial starting address in the second memory, the fixed number corresponding to C less than the number of columns in the grid.
11. A digital signal processing system-on-chip according to claim 8, wherein the plurality of data items comprises a tile of the grid.
12. A digital signal processing system-on-chip according to claim 8, wherein in the second memory transfer stage, the bursts of linear address sequences comprises a sequence of bursts of X consecutive memory addresses separated by a fixed number of memory addresses in the second memory and starting at an initial starting address in the second memory, where X is equal to the number of data items in a tile of the grid.
13. A digital signal processing system-on-chip according to claim 8, wherein in the first memory transfer stage, the transfer engine is arranged to write the plurality of data items to the DRAM according to bursts of linear address sequences, each burst of linear address sequences having a length selected based on a DRAM interface burst size.
14. A digital signal processing system-on-chip according to claim 13, wherein in the first memory transfer stage, the bursts of linear address sequences comprises a sequence of bursts of X consecutive memory addresses separated by a fixed number of memory addresses in the second memory and starting at an initial starting address in the second memory, where X is equal to the number of data items in a tile of the grid.
15. A digital signal processing system-on-chip according to claim 8, wherein a tile is sized based on a size of the DRAM interface burst.
16. A method of performing an interleaving or de-interleaving operation on data items in a digital signal processing system, the method comprising: reading, from a first on-chip memory, a first plurality of data items stored in a first sequence according to a predefined non-linear sequence of memory read addresses; writing the first plurality of data items to a dynamic random access memory (DRAM); reading, from the DRAM, the first plurality of data items according to bursts of linear address sequences, each burst of linear address sequences having a length selected based on a DRAM interface burst size; and writing the first plurality of data items to a second on-chip memory according to a predefined non-linear sequence of memory write addresses, such that the data items are arranged in a second sequence on the second on-chip memory that is different from the first sequence and wherein one of the first sequence and the second sequence comprises row-column interleaved data and wherein the second sequence is either row-column interleaved or de-interleaved with respect to the first sequence.
17. A method according to claim 16, wherein the first plurality of data items comprises a subset of a block of data items, wherein the block of data items is defined as being arranged as a grid comprising a number of rows of data items and a number of columns of data items, the grid further comprising a plurality of tiles, each tile comprising a rectangular portion of the grid and further comprising R rows and C columns of data items and wherein the first plurality of data items comprises one or more tiles, and wherein reading, from a first on-chip memory, a first plurality of data items stored in a first sequence according to a predefined non-linear sequence of memory read addresses comprises, for each tile in the first plurality of data items: (i) reading a data item at an initial starting address in the first on-chip memory; (ii) skipping a fixed number of data items, the fixed number corresponding to one less than the number of rows in the grid; (iii) reading a data item; (iv) repeating steps (ii) and (iii) until a boundary of the tile is reached; (v) adding an offset to the initial starting address; and (vi) repeating steps (i)-(v) until each data item in the tile has been read.
18. A method according to claim 16, wherein the first plurality of data items comprises a subset of a block of data items, wherein the block of data items is defined as being arranged as a grid comprising a number of rows of data items and a number of columns of data items, the grid further comprising a plurality of tiles, each tile comprising a rectangular portion of the grid and further comprising R rows and C columns of data items and wherein the first plurality of data items comprises one or more tiles, and wherein writing the first plurality of data items to a second on-chip memory according to a predefined non-linear sequence of memory write addresses comprises: (i) writing C data items from the first plurality of data items to a plurality of consecutive addresses in the second on-chip memory, starting at an initial starting address in the second on-chip memory for the tile; (ii) skipping a fixed number of addresses in the second on-chip memory, the fixed number corresponding to C less than the number of columns in the grid; (iii) writing C data items from the first plurality of data items to a plurality of consecutive addresses in the second on-chip memory; and (iv) repeating steps (ii) and (iii).
19. A method according to claim 16, wherein the first plurality of data items comprises a subset of a block of data items, wherein the block of data items is defined as being arranged as a grid comprising a number of rows of data items and a number of columns of data items, the grid further comprising a plurality of tiles, each tile comprising a rectangular portion of the grid and further comprising R rows and C columns of data items and wherein the first plurality of data items comprises one or more tiles, and wherein writing the first plurality of data items to the DRAM comprises: (i) writing X data items from the first plurality of data items to a plurality of consecutive addresses in the DRAM, starting at an initial starting address in the DRAM for the tile; (ii) skipping a fixed number of addresses in the DRAM; (iii) writing X data items from the first plurality of data items to a plurality of consecutive addresses in the DRAM; and (iv) repeating steps (ii) and (iii), wherein X is equal to the number of data items in a tile of the grid.
20. A method according to claim 16, wherein the first plurality of data items comprises a subset of a block of data items, wherein the block of data items is defined as being arranged as a grid comprising a number of rows of data items and a number of columns of data items, the grid further comprising a plurality of tiles, each tile comprising a rectangular portion of the grid and further comprising R rows and C columns of data items and wherein the first plurality of data items comprises one or more tiles, and wherein reading the first plurality of data items from the DRAM according to bursts of linear address sequences comprises: (i) reading X data items from the first plurality of data items from a plurality of consecutive addresses in the DRAM, starting at an initial starting address in the DRAM; (ii) skipping a fixed number of addresses in the DRAM; (iii) reading X data items from the first plurality of data items from a plurality of consecutive addresses in the DRAM; and (iv) repeating steps (ii) and (iii), wherein X is equal to the number of data items in a tile of the grid.
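The address sequences recited in the method claims above can be enumerated for a small grid to make the skip counts concrete. The following is an illustrative reading only, not an authoritative implementation: the function names are hypothetical, the per-row offset of one address in step (v) of claim 17 is an assumption (the claim recites only "an offset"), and the DRAM gap between bursts is left as a free parameter because the claims recite only "a fixed number".

```python
def tile_read_addresses(start, n_rows, R, C):
    """On-chip read addresses for one R x C tile, in the manner of claim 17."""
    addrs = []
    for row in range(R):
        addr = start + row        # (v) assumed offset of one address per tile row
        for _ in range(C):        # tile boundary is reached after C reads
            addrs.append(addr)    # (i)/(iii) read one data item
            addr += n_rows        # (ii) skip n_rows - 1 data items
    return addrs


def tile_write_addresses(start, n_cols, R, C):
    """On-chip write addresses for one R x C tile, in the manner of claim 18."""
    addrs = []
    addr = start
    for _ in range(R):
        addrs.extend(range(addr, addr + C))  # (i)/(iii) write C consecutive items
        addr += C + (n_cols - C)             # (ii) skip n_cols - C addresses
    return addrs


def burst_addresses(start, X, gap, n_bursts):
    """DRAM bursts of X consecutive addresses with `gap` addresses skipped between
    bursts, in the manner of claims 19 and 20. `gap` is a hypothetical parameter."""
    return [[start + b * (X + gap) + i for i in range(X)] for b in range(n_bursts)]


# Example: a grid of 4 rows and 6 columns with tiles of R = 2 rows and C = 3 columns.
reads = tile_read_addresses(0, n_rows=4, R=2, C=3)
writes = tile_write_addresses(0, n_cols=6, R=2, C=3)
bursts = burst_addresses(0, X=6, gap=6, n_bursts=2)
```

For this example the read sequence is 0, 4, 8, 1, 5, 9 and the write sequence is 0, 1, 2, 6, 7, 8. Both are non-linear in the on-chip memory, while each DRAM burst covers six consecutive addresses (0 to 5, then 12 to 17).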
March 12, 2013
May 21, 2019