Out-Of-Order Cache Returns

Published: February 5, 2019

Assignee: not available in the USPTO data on hand

Inventors: Daniel Schneider, Fataneh Ghodrat

Technical Abstract

Patent Claims

20 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A method for performing out-of-order cache returns, the method comprising: determining that a first entry at a head of a first return ordering queue of a plurality of return ordering queues is available for return to a wavefront, wherein the first entry corresponds to a first cache access request, wherein the first return ordering queue stores entries for cache access requests of a first cache access type but not a second cache access type, and wherein a second return ordering queue, of the plurality of return ordering queues, stores entries for cache access requests of the second cache access type but not the first cache access type; and directing a cache return corresponding to the first entry to be transmitted to the wavefront, responsive to the determining, without waiting for cache access requests corresponding to entries in the second return ordering queue that are older than the first cache access request to become available for return to the wavefront, wherein the cache return includes data indicative of a corresponding cache access request that has been completed.
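The ordering behavior recited in claim 1 can be illustrated with a minimal Python sketch that keeps one FIFO per cache access type. The class and method names (`ReturnOrderingQueues`, `issue`, `mark_available`, `pop_returns`) and the `seq` field used to represent request age are assumptions made for illustration; the claims do not specify them.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Entry:
    seq: int                 # issue order; lower means older (hypothetical field)
    access_type: str         # e.g. "read", "write", "sampler"
    available: bool = False  # set once the cache can return this request


class ReturnOrderingQueues:
    """One FIFO per cache access type; order is kept only within a type."""

    def __init__(self, access_types):
        self.queues = {t: deque() for t in access_types}
        self.next_seq = 0

    def issue(self, access_type):
        entry = Entry(self.next_seq, access_type)
        self.next_seq += 1
        self.queues[access_type].append(entry)
        return entry

    def mark_available(self, entry):
        entry.available = True

    def pop_returns(self):
        """Drain every queue whose head is available, without consulting
        the heads of the other queues."""
        returned = []
        for q in self.queues.values():
            while q and q[0].available:
                returned.append(q.popleft())
        return returned


roq = ReturnOrderingQueues(["read", "write", "sampler"])
old_sampler = roq.issue("sampler")  # seq 0: older, still pending
new_read = roq.issue("read")        # seq 1: newer
roq.mark_available(new_read)
first_returns = [e.seq for e in roq.pop_returns()]
```

In this sketch the newer read is returned while the older sampler request is still pending, because the two requests sit in different queues. Two requests of the same type would still return in issue order.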

2. The method of claim 1, wherein: the first cache access type includes one of a read type, a write type, and a texture sampler type; the second cache access type includes one of the read type, the write type, and the texture sampler type; and the second cache access type is different than the first cache access type.

3. The method of claim 2, wherein: the read type comprises an access type that requests data from a memory system and receives data in return; the write type comprises an access type that writes data to the memory system and receives an acknowledgment signal in return; and the texture sampler type comprises an access type that requests texture data via texture coordinates and receives the texture data in return.

4. The method of claim 3, wherein: the texture sample type comprises an access type that requests one or more of converting the texture coordinates to one or more memory addresses, fetching data from the one or more memory addresses, decompressing the fetched data, and applying filtering to the fetched data.

5. The method of claim 1, further comprising: selecting a mode for the plurality of return ordering queues, the mode defining a number of return ordering queues in the plurality of return ordering queues and one or more cache access types that are stored and ordered in each return ordering queue of the plurality of return ordering queues.
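One way to picture the mode selection of claim 5 is a table that maps each mode to its queues and the access types each queue holds. The mode names and groupings below are hypothetical; the claim only requires that a mode define the number of queues and the types ordered in each.

```python
# Hypothetical mode table: each mode lists its queues, and each queue
# lists the cache access types it stores and orders.
MODES = {
    "single": [("read", "write", "sampler")],
    "sampler_split": [("read", "write"), ("sampler",)],
    "per_type": [("read",), ("write",), ("sampler",)],
}


def select_mode(name):
    """Return (number of queues, access type -> queue index) for a mode."""
    groups = MODES[name]
    type_to_queue = {t: i for i, group in enumerate(groups) for t in group}
    return len(groups), type_to_queue


count, routing = select_mode("sampler_split")
```

Under the assumed "sampler_split" mode, reads and writes share one queue and stay ordered relative to each other, while sampler requests are ordered separately.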

6. The method of claim 5, wherein: the plurality of return ordering queues comprise virtual queues that are stored within a monolithic memory, wherein the virtual queues are resizeable to accommodate the selected mode.
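The virtual queues of claim 6 could be modeled as ring buffers carved out of one backing array. The sketch below is an assumption-laden illustration: the class `MonolithicQueueMemory`, its `configure` method, and the rule that resizing happens only while the queues are empty are all hypothetical choices, not details from the claims.

```python
class MonolithicQueueMemory:
    """A single backing array carved into per-queue ring buffers."""

    def __init__(self, capacity):
        self.memory = [None] * capacity
        self.regions = []  # per queue: base, size, head, count

    def configure(self, sizes):
        """Resize the virtual queues (only while empty, in this sketch)."""
        if sum(sizes) > len(self.memory):
            raise ValueError("virtual queues exceed the backing memory")
        if any(r["count"] for r in self.regions):
            raise RuntimeError("resize requires empty queues in this sketch")
        self.regions, base = [], 0
        for size in sizes:
            self.regions.append(
                {"base": base, "size": size, "head": 0, "count": 0}
            )
            base += size

    def push(self, q, value):
        r = self.regions[q]
        if r["count"] == r["size"]:
            raise OverflowError("virtual queue is full")
        tail = (r["head"] + r["count"]) % r["size"]
        self.memory[r["base"] + tail] = value
        r["count"] += 1

    def pop(self, q):
        r = self.regions[q]
        if r["count"] == 0:
            raise IndexError("virtual queue is empty")
        value = self.memory[r["base"] + r["head"]]
        r["head"] = (r["head"] + 1) % r["size"]
        r["count"] -= 1
        return value


mem = MonolithicQueueMemory(capacity=8)
mem.configure([8])        # one queue using the whole memory
mem.configure([4, 2, 2])  # three queues sharing the same memory
mem.push(0, "read A")
mem.push(2, "sampler B")
```

Switching from one mode to another only changes the region boundaries; the backing storage itself is never reallocated.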

7. The method of claim 6, further comprising: copying entries from a head of each virtual queue to a head of corresponding physical queues, wherein directing the cache return corresponding to the first entry to the wavefront comprises: removing an entry from a head of a physical queue that corresponds to the first return ordering queue, modifying a next-oldest entry of the physical queue to be at the head of the physical queue, and copying an entry from the first return ordering queue to the physical queue that corresponds to the first return ordering queue.
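The three steps of claim 7 (remove the head, advance the next-oldest entry, refill from the virtual queue) can be sketched as follows. The class name `HeadCachedQueue` and the physical queue depth of 2 are assumptions. As a simplification, the sketch moves entries out of the virtual queue rather than keeping a copy in both places.

```python
from collections import deque


class HeadCachedQueue:
    """A small physical queue mirroring the oldest entries of a virtual queue."""

    def __init__(self, depth=2):
        self.depth = depth       # hypothetical physical queue depth
        self.virtual = deque()   # entries not yet held in the physical queue
        self.physical = deque()  # oldest entries, head at index 0

    def _refill(self):
        # Bring entries from the head of the virtual queue into the
        # physical queue until the physical queue is full.
        while len(self.physical) < self.depth and self.virtual:
            self.physical.append(self.virtual.popleft())

    def push(self, entry):
        self.virtual.append(entry)
        self._refill()

    def return_head(self):
        # Remove the head; the next-oldest entry becomes the new head.
        entry = self.physical.popleft()
        # Top up the physical queue from the virtual queue.
        self._refill()
        return entry


q = HeadCachedQueue(depth=2)
for name in ["r0", "r1", "r2"]:
    q.push(name)
first = q.return_head()
```

After the return, the physical queue holds the two oldest remaining entries, so the availability check in this sketch only ever has to look at the small physical queue.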

8. The method of claim 1, further comprising: executing a cache access type-based barrier instruction in the wavefront.

9. The method of claim 8, wherein executing the cache access type-based barrier instruction comprises: stalling the wavefront until outstanding cache accesses of a particular cache access type are completed.
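The type-based barrier of claims 8 and 9 can be pictured as a per-type counter of outstanding accesses. This is a minimal sketch under assumed names (`Wavefront`, `barrier_stalled`); the claims do not say how outstanding accesses are tracked.

```python
class Wavefront:
    """Tracks outstanding cache accesses per type for one wavefront."""

    def __init__(self):
        self.outstanding = {"read": 0, "write": 0, "sampler": 0}

    def issue(self, access_type):
        self.outstanding[access_type] += 1

    def on_cache_return(self, access_type):
        self.outstanding[access_type] -= 1

    def barrier_stalled(self, access_type):
        """True while a barrier on this access type must keep stalling."""
        return self.outstanding[access_type] > 0


wf = Wavefront()
wf.issue("sampler")
wf.issue("read")
wf.on_cache_return("read")                     # the read returns first
read_stall = wf.barrier_stalled("read")        # no reads outstanding
sampler_stall = wf.barrier_stalled("sampler")  # sampler still pending
```

A barrier on the read type lets the wavefront proceed as soon as its reads have returned, even though the older sampler access is still outstanding.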

10. A compute unit for performing out-of-order cache returns, the compute unit comprising: a single-instruction-multiple-data unit configured to execute a wavefront; and a cache system configured to: store a plurality of return ordering queues that includes a first return ordering queue and a second return ordering queue, wherein the first return ordering queue stores entries for cache access requests of a first cache access type but not a second cache access type, and wherein the second return ordering queue stores entries for cache access requests of the second cache access type but not the first cache access type; determine that a first entry at a head of the first return ordering queue is available for return to the wavefront, wherein the first entry corresponds to a first cache access request; and direct a cache return corresponding to the first entry to be transmitted to the wavefront, responsive to the determining, without waiting for cache access requests corresponding to entries in the second return ordering queue that are older than the first cache access request to become available for return to the wavefront, wherein the cache return includes data indicative of a corresponding cache access request that has been completed.

11. The compute unit of claim 10, wherein: the first cache access type includes one of a read type, a write type, and a texture sampler type; the second cache access type includes one of the read type, the write type, and the texture sampler type; and the second cache access type is different than the first cache access type.

12. The compute unit of claim 11, wherein: the read type comprises an access type that requests data from a memory system and receives data in return; the write type comprises an access type that writes data to the memory system and receives an acknowledgment signal in return; and the texture sampler type comprises an access type that requests texture data via texture coordinates and receives the texture data in return.

13. The compute unit of claim 12, wherein: the texture sample type comprises an access type that requests one or more of converting the texture coordinates to one or more memory addresses, fetching data from the one or more memory addresses, decompressing the fetched data, and applying filtering to the fetched data.

14. The compute unit of claim 10, wherein the cache system is further configured to: select a mode for the plurality of return ordering queues, the mode defining a number of return ordering queues in the plurality of return ordering queues and one or more cache access types that are stored and ordered in each return ordering queue of the plurality of return ordering queues.

15. The compute unit of claim 14, wherein: the plurality of return ordering queues comprise virtual queues that are stored within a monolithic memory, wherein the virtual queues are resizeable to accommodate the selected mode.

16. The compute unit of claim 15, wherein the cache system is further configured to: copy entries from a head of each virtual queue to a head of corresponding physical queues, wherein directing the cache return corresponding to the first entry to the wavefront comprises: removing an entry from a head of a physical queue that corresponds to the first return ordering queue, modifying a next-oldest entry of the physical queue to be at the head of the physical queue, and copying an entry from the first return ordering queue to the physical queue that corresponds to the first return ordering queue.

17. The compute unit of claim 10, wherein the wavefront is configured to: execute a cache access type-based barrier instruction.

18. The compute unit of claim 17, wherein: in response to executing the cache access type-based barrier instruction, the wavefront is stalled until outstanding cache accesses of a particular cache access type are completed.

19. A computer system comprising: an accelerated processing device including a compute unit; and a processor configured to cause the accelerated processing device to execute a wavefront in the compute unit, wherein the compute unit comprises: a single-instruction-multiple-data unit configured to execute the wavefront; and a cache system configured to: store a plurality of return ordering queues that includes a first return ordering queue and a second return ordering queue, wherein the first return ordering queue stores entries for cache access requests of a first cache access type but not a second cache access type, and wherein the second return ordering queue stores entries for cache access requests of the second cache access type but not the first cache access type; determine that a first entry at a head of the first return ordering queue is available for return to the wavefront, wherein the first entry corresponds to a first cache access request; and direct a cache return corresponding to the first entry to be transmitted to the wavefront, responsive to the determining, without waiting for cache access requests corresponding to entries in the second return ordering queue that are older than the first cache access request to become available for return to the wavefront, wherein the cache return includes data indicative of a corresponding cache access request that has been completed.

20. The computer system of claim 19, wherein: the first cache access type includes one of a read type, a write type, and a texture sampler type; the second cache access type includes one of the read type, the write type, and the texture sampler type; and the second cache access type is different than the first cache access type.

Patent Metadata

Filing Date

Unknown

Publication Date

February 5, 2019

Inventors

Daniel Schneider

Fataneh Ghodrat
