A processing system executes wavefronts at multiple arithmetic logic unit (ALU) pipelines of a single instruction multiple data (SIMD) unit in a single execution cycle. The ALU pipelines each include a number of ALUs that execute instructions on wavefront operands, which are collected from vector general purpose register (VGPR) banks at a cache, and output results of the instructions executed on the wavefronts at a buffer. By storing wavefronts supplied by the VGPR banks at the cache, a greater number of wavefronts can be made available to the SIMD unit without increasing the VGPR bandwidth, enabling multiple ALU pipelines to execute instructions during a single execution cycle.
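The data flow described above can be illustrated with a minimal Python sketch. It is a toy software model only, not the patented hardware design: the names `OperandCache`, `alu_pipeline`, and `execute_dual`, the two supported operations, and the four-lane wavefronts are all hypothetical choices made for illustration. The sketch assumes operands have already been read from the VGPR banks into the cache, so both pipelines can be fed in the same modeled cycle without extra register reads.

```python
from dataclasses import dataclass, field


@dataclass
class OperandCache:
    """Hypothetical cache holding wavefront operands read earlier from VGPR banks."""
    entries: dict = field(default_factory=dict)

    def fill(self, wave_id, operands):
        # Models operands arriving from the VGPR banks ahead of execution.
        self.entries[wave_id] = operands

    def read(self, wave_id):
        return self.entries[wave_id]


def alu_pipeline(op, operands):
    """Apply one instruction lane by lane across a wavefront's two operand vectors."""
    a, b = operands
    if op == "add":
        return [x + y for x, y in zip(a, b)]
    if op == "mul":
        return [x * y for x, y in zip(a, b)]
    raise ValueError(f"unsupported op: {op}")


def execute_dual(cache, first, second):
    """Model one execution cycle in which both ALU pipelines run.

    `first` and `second` are (op, wave_id) pairs; the returned dict stands in
    for the output buffer. This assumes the two wave IDs are distinct.
    """
    (op1, wave1), (op2, wave2) = first, second
    return {
        wave1: alu_pipeline(op1, cache.read(wave1)),
        wave2: alu_pipeline(op2, cache.read(wave2)),
    }


cache = OperandCache()
cache.fill("wave0", ([1, 2, 3, 4], [10, 20, 30, 40]))
cache.fill("wave1", ([1, 2, 3, 4], [2, 2, 2, 2]))

# One "dual instruction": an add on wave0 and a multiply on wave1.
results = execute_dual(cache, ("add", "wave0"), ("mul", "wave1"))
print(results)
```

In this model both pipelines read only from the cache during the modeled cycle, which mirrors the abstract's point that more wavefronts can be made available without increasing VGPR bandwidth.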
Legal claims defining the scope of protection, as filed with the USPTO.
11. The method of claim 7, wherein the dual instruction comprises a first instruction to execute on a second wavefront at the first ALU pipeline and a second instruction to execute on a third wavefront at the second ALU pipeline in the first execution cycle.
17. The device of claim 13, wherein the dual instruction comprises a first instruction to execute on a first wavefront at the first ALU pipeline and a second instruction to execute on a second wavefront at the second ALU pipeline in the first execution cycle.
December 14, 2020
June 13, 2023