Power consumed by heavily used on-die local memories in general-purpose graphics processing unit (GPGPU) applications may be reduced by using low-latency read operations and high-latency write operations. In some embodiments, power consumption in read-heavy graphics operations can be reduced using a design with a small memory footprint, with a possible reduction in hot spotting.
Legal claims defining the scope of protection, as filed with the USPTO.
1. A method comprising: intercepting, by a mode selection unit of a graphics processor, write requests to a memory of the graphics processor; determining, by the mode selection unit, whether the intercepted write requests are associated with processing a texture in the graphics processor; in response to a determination that a first intercepted write request is not associated with processing the texture in the graphics processor, implementing the first intercepted write request as a faster write operation; and in response to a determination that a second intercepted write request is associated with processing the texture in the graphics processor, implementing the second intercepted write request as a slower write operation.
2. The method of claim 1 wherein the faster write operation consumes one clock cycle per write, and the slower write operation consumes two clock cycles per write.
3. The method of claim 1 including performing all reads to the memory of the graphics processor at a same speed, wherein the same speed is one clock cycle per read.
4. The method of claim 1 including implementing the second intercepted write request as the slower write operation in response to a determination that the second intercepted write request is a memory fill for a table.
5. The method of claim 1 including implementing the second intercepted write request as the slower write operation in response to a determination that the second intercepted write request is a memory fill for a design state.
6. The method of claim 1 including implementing the second intercepted write request as the slower write operation in response to a determination that the second intercepted write request is a memory fill for a constant.
7. One or more non-transitory computer readable media storing instructions executed by a processor to perform a sequence comprising: intercepting, by a mode selection unit of a graphics processor, a write request to a memory of the graphics processor; determining, by the mode selection unit, whether the intercepted write request is associated with processing a texture in the graphics processor; in response to a determination that the intercepted write request is not associated with processing the texture in the graphics processor, implementing the intercepted write request as a faster write operation; and in response to a determination that the intercepted write request is associated with processing the texture in the graphics processor, implementing the intercepted write request as a slower write operation.
8. The one or more non-transitory media of claim 7, wherein the faster write operation consumes one clock cycle per write, and the slower write operation consumes two clock cycles per write.
9. The one or more non-transitory media of claim 7, further storing instructions to perform a sequence including performing all reads to the memory of the graphics processor at a same speed, wherein the same speed is one clock cycle per read.
10. The one or more non-transitory media of claim 7, further storing instructions to perform a sequence including implementing the intercepted write request as the slower write operation in response to a determination that the intercepted write request is a memory fill for a table.
11. The one or more non-transitory media of claim 7, further storing instructions to perform a sequence including implementing the intercepted write request as the slower write operation in response to a determination that the intercepted write request is a memory fill for a design state.
12. The one or more non-transitory media of claim 7, further storing instructions to perform a sequence including implementing the intercepted write request as the slower write operation in response to a determination that the intercepted write request is a memory fill for a constant.
13. An apparatus comprising: a graphics processing unit to: intercept a write request to a memory, determine whether the intercepted write request is associated with processing a texture in the graphics processing unit; in response to a determination that the intercepted write request is not associated with processing the texture in the graphics processing unit, implement the intercepted write request as a faster write operation, and in response to a determination that the intercepted write request is associated with processing the texture in the graphics processing unit, implement the intercepted write request as a slower write operation; and a storage coupled to said graphics processing unit.
14. The apparatus of claim 13, wherein the faster write operation consumes one clock cycle per write, and the slower write operation consumes two clock cycles per write.
15. The apparatus of claim 13, said graphics processing unit to perform all reads to the memory at a same speed, wherein the same speed is one clock cycle per read.
16. The apparatus of claim 13, said graphics processing unit to implement the intercepted write request as the slower write operation in response to a determination that the intercepted write request is a memory fill for a table.
17. The apparatus of claim 13, said graphics processing unit to implement the intercepted write request as the slower write operation in response to a determination that the intercepted write request is a memory fill for a design state.
18. The apparatus of claim 13, said graphics processing unit to implement the intercepted write request as the slower write operation in response to a determination that the intercepted write request is a memory fill for a constant.
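The write-mode selection recited in the claims above can be illustrated with a minimal behavioral sketch. It is not the patented hardware design. The names `WriteKind`, `WriteRequest`, and `ModeSelectionUnit` are hypothetical, and the sketch assumes each request already carries a tag identifying what it is for. Treating table, design-state, and constant fills as separate slow categories alongside texture writes is a simplification; in the dependent claims they refine the texture-associated request. The cycle counts follow claims 2 and 3: one cycle per faster write, two cycles per slower write, and one cycle per read.

```python
from dataclasses import dataclass
from enum import Enum, auto


class WriteKind(Enum):
    """Hypothetical request tags; not named in the claims."""
    TEXTURE = auto()
    TABLE_FILL = auto()
    DESIGN_STATE_FILL = auto()
    CONSTANT_FILL = auto()
    OTHER = auto()


# Assumption: every kind listed here is routed to the slower write path.
SLOW_WRITE_KINDS = frozenset({
    WriteKind.TEXTURE,
    WriteKind.TABLE_FILL,
    WriteKind.DESIGN_STATE_FILL,
    WriteKind.CONSTANT_FILL,
})

FAST_WRITE_CYCLES = 1  # claim 2: faster write, one clock cycle
SLOW_WRITE_CYCLES = 2  # claim 2: slower write, two clock cycles
READ_CYCLES = 1        # claim 3: all reads at one clock cycle


@dataclass(frozen=True)
class WriteRequest:
    address: int
    kind: WriteKind


class ModeSelectionUnit:
    """Intercepts memory accesses and tallies the cycles they consume."""

    def __init__(self) -> None:
        self.cycles = 0

    def select_write_cycles(self, request: WriteRequest) -> int:
        # Texture-associated requests take the slower path; others the faster.
        if request.kind in SLOW_WRITE_KINDS:
            return SLOW_WRITE_CYCLES
        return FAST_WRITE_CYCLES

    def write(self, request: WriteRequest) -> int:
        cost = self.select_write_cycles(request)
        self.cycles += cost
        return cost

    def read(self, address: int) -> int:
        # Reads are never slowed, regardless of what was written.
        self.cycles += READ_CYCLES
        return READ_CYCLES
```

In a read-heavy workload modeled this way, the extra cycle on texture-related fills is paid once per fill while every subsequent read still costs a single cycle, which is the trade-off the abstract describes.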
April 17, 2017
March 3, 2020