Legal claims defining the scope of protection, as filed with the USPTO.
1. A method of copying unaligned data comprising: performing a pipelined loop by overlapping execution of instructions in a repeated sequence of operations, wherein an iteration of the pipelined loop includes: loading in a pipelined operation unaligned data from a string of unaligned data to form units of unaligned data, wherein loading includes: loading into a first temporary data register, a remainder of unaligned data; loading into a second temporary data register, a first unit of unaligned data from the string of unaligned data; and loading into a third temporary data register, a second unit of unaligned data from the string of unaligned data; shifting in a pipelined operation portions of the units of unaligned data to form aligned portions of data; merging in a pipelined operation the aligned portions of data into units of aligned data; storing in a pipelined operation the units of aligned data to form a string of aligned data; and moving into the first temporary data register, data from the third temporary data register, to become the remainder on a next iteration of the pipelined loop; and wherein the method includes executing one iteration of the pipelined loop in one processor cycle.
2. The method of claim 1, wherein the shifting includes: shifting into a first shift data register, data from the first temporary data register, shifted by a shift amount; shifting into a second shift data register, data from the second temporary data register, shifted by a shift complement amount.
3. The method of claim 2, wherein the shifting includes: shifting into a third shift data register, data from the second temporary data register, shifted by the shift amount; and shifting into a fourth shift data register, data from the third temporary data register, shifted by the shift complement amount.
4. The method of claim 1, wherein the merging includes merging into a first merge data register, data from a first shift data register merged with data from a second shift data register.
5. The method of claim 4, wherein the merging includes merging into a second merge data register, data from a third shift data register merged with data from a fourth shift data register.
6. The method of claim 1, wherein the storing includes storing into a first store data register, data from a first merge data register.
7. The method of claim 6, wherein the storing includes storing into a second store data register, data from a second merge data register.
8. A computer readable medium having instructions for causing a pipelined machine to perform a method comprising: performing a pipelined loop by overlapping execution of instructions in a repeated sequence of operations, wherein an iteration of the pipelined loop includes: loading in a pipelined operation unaligned data from an unaligned data item to form units of unaligned data, wherein loading includes: loading into a first temporary data register, a remainder of unaligned data; loading into a second temporary data register, a first unit of unaligned data from the unaligned data item; and loading into a third temporary data register, a second unit of unaligned data from the unaligned data item; positioning in a pipelined operation portions of the units of unaligned data to form aligned portions of data; merging in a pipelined operation the aligned portions of data into units of aligned data; storing in a pipelined operation the units of aligned data to form a string of aligned data; and moving into the first temporary data register, data from the third temporary data register to become the remainder on a next iteration of the pipelined loop; and wherein the method includes executing one iteration of the pipelined loop in one processor cycle.
9. The medium of claim 8, wherein the method includes: performing the loading as a first set of pipelined tasks; performing the positioning and the merging as a second set and a third set of pipelined tasks; and performing the storing as a fourth set of pipelined tasks.
10. The medium of claim 8, wherein the method includes: performing the loop wherein the positioning includes: positioning into a first rotating position data register, data from the first temporary data register, positioned by a position amount; positioning into a second rotating position data register, data from the second temporary data register, positioned by a position complement amount; positioning into a third rotating position data register, data from the second temporary data register, positioned by the position amount; and positioning into a fourth rotating position data register, data from the third temporary data register, positioned by the position complement amount; and rotating the rotating position data registers each time the loop is performed.
11. The medium of claim 10, wherein the method includes: performing the loop wherein the merging includes: merging into a first rotating merge data register, data from the first rotating position data register merged with data from the second rotating position data register; merging into a second rotating merge data register, data from the third rotating position data register merged with data from the fourth rotating position data register; and rotating the rotating merge data registers each time the loop is performed.
12. The medium of claim 11, wherein the method includes performing the loop wherein the storing includes: storing into memory pointed to by a first store data register, data from a first rotating merge data register; and storing into memory pointed to by a second store data register, data from a second rotating merge data register.
13. A computer readable medium having instructions for causing a device to perform a method, comprising: performing a pipelined loop by overlapping execution of instructions in a repeated sequence of operations, wherein an iteration of the pipelined loop includes: loading as a first set of pipelined tasks unaligned data from an unaligned data item to form units of unaligned data, wherein loading includes: loading into a first temporary data register, a remainder of unaligned data; loading into a second temporary data register, a first unit of unaligned data from the unaligned data item; and loading into a third temporary data register, a second unit of unaligned data from the unaligned data item; shifting as a second set of pipelined tasks portions of the units of unaligned data to form aligned portions of data; merging as a third set of pipelined tasks the aligned portions of data into units of aligned data; storing as a fourth set of pipelined tasks the units of aligned data to form a string of aligned data; and moving into the first temporary data register, data from the third temporary data register to become the remainder on a next iteration of the pipelined loop; and wherein the method includes executing one iteration of the pipelined loop in one processor cycle.
14. The medium of claim 13, wherein the method includes performing the loop until all of the sets of pipelined tasks are false.
15. The medium of claim 13, wherein the method includes: performing the loop wherein the shifting includes: shifting into a fourth temporary data register, data from the first temporary data register, shifted by a shift amount; shifting into a first rotating shift data register, data from the second temporary data register, shifted by a shift complement amount; shifting into a second rotating shift data register, data from the second temporary data register, shifted by a shift amount; and shifting into a third rotating shift data register, data from the third temporary data register, shifted by a shift complement amount; and rotating the rotating shift data registers each time the loop is performed.
16. The medium of claim 15, wherein the method includes: performing the loop wherein the merging includes: merging into a first rotating merge data register, data from the fourth temporary data register merged with data from the first rotating shift data register; merging into a second rotating merge data register, data from the second rotating shift data register merged with data from a third rotating shift data register; and rotating the rotating merge data registers and the rotating shift data registers each time the loop is performed.
17. The medium of claim 16, wherein the method includes: performing the loop wherein the storing includes: storing into memory pointed to by a first store data register, data from a first rotating merge data register; and storing into memory pointed to by a second store data register, data from a second rotating merge data register.
18. The medium of claim 13, the method including performing the loop in a sequence.
19. A computing device comprising: a processor; a memory, connected to the processor; program instructions storable in the memory and executable by the processor to: perform a pipelined loop by overlapping execution of instructions in a repeated sequence of operations, wherein an iteration of the pipelined loop includes pipelined operations to: load unaligned data from a string of unaligned data to form units of unaligned data, including operations to: load a remainder of unaligned data into a first temporary data register; load a first unit of unaligned data from the unaligned data item into a second temporary data register; and load a second unit of unaligned data from the unaligned data item into a third temporary data register; shift portions of the units of unaligned data to form aligned portions of data; merge the aligned portions of data into units of aligned data; store the units of aligned data to form a string of aligned data; move into the first temporary data register, data from the third temporary data register to become the remainder on a next iteration of the pipelined loop; and execute one iteration of the pipelined loop in one processor cycle.
20. The device of claim 19, including pipelined program instructions storable in the memory and executable by the processor to perform the pipelined operations in a loop.
21. The device of claim 20, including pipelined program instructions storable in the memory and executable by the processor to perform the loop until the entire string of unaligned data is loaded, shifted, merged, and stored.
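Illustrative sketch (not part of the claims): the load, shift, merge, store, and carry-the-remainder steps recited in claims 1 through 7 can be modeled in a few lines of Python. This is only a sequential model under stated assumptions: units are taken to be 64-bit words read in big-endian byte order, the string is assumed to start `offset` bytes (1 to 7) into the first word, and the names `copy_unaligned`, `words_from_bytes`, `t1` to `t3`, `s1` to `s4`, and `m1`, `m2` are hypothetical labels chosen to mirror the temporary, shift, and merge data registers. It does not model the overlapped (pipelined) execution, the rotating registers, or the one-iteration-per-cycle property, which depend on the target processor.

```python
WORD_BYTES = 8                    # assumed unit size: one 64-bit word
WORD_BITS = 8 * WORD_BYTES
MASK = (1 << WORD_BITS) - 1       # keeps left shifts within 64 bits


def words_from_bytes(data):
    """Split a byte string into big-endian 64-bit words (length must be a multiple of 8)."""
    return [int.from_bytes(data[i:i + WORD_BYTES], "big")
            for i in range(0, len(data), WORD_BYTES)]


def copy_unaligned(src_words, offset, n_pairs):
    """Return 2 * n_pairs aligned words for a string starting `offset` bytes into src_words[0]."""
    if not 0 < offset < WORD_BYTES:
        raise ValueError("offset must be between 1 and 7 bytes")
    if len(src_words) < 2 * n_pairs + 1:
        raise ValueError("need one remainder word plus two words per iteration")

    shift = 8 * offset            # "shift amount", in bits
    comp = WORD_BITS - shift      # "shift complement amount", in bits

    out = []
    t1 = src_words[0]             # first temporary register: the remainder
    for i in range(n_pairs):
        # Load: two further units of unaligned data.
        t2 = src_words[2 * i + 1]
        t3 = src_words[2 * i + 2]

        # Shift: form the aligned portions.
        s1 = (t1 << shift) & MASK
        s2 = t2 >> comp
        s3 = (t2 << shift) & MASK
        s4 = t3 >> comp

        # Merge: combine portions into whole aligned units.
        m1 = s1 | s2
        m2 = s3 | s4

        # Store: append the aligned units to the output string.
        out.append(m1)
        out.append(m2)

        # Move: the last loaded unit becomes the next iteration's remainder.
        t1 = t3
    return out


# Usage: a 40-byte buffer whose payload starts 3 bytes in.
data = bytes(range(40))
aligned = copy_unaligned(words_from_bytes(data), offset=3, n_pairs=2)
result = b"".join(w.to_bytes(WORD_BYTES, "big") for w in aligned)
```

In this model each iteration produces two aligned words from three source words, and only two of the three are newly loaded because the remainder is carried over from the previous iteration.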
October 7, 2008