Iterating Group Sum of Multiple Accumulate Operations

Published: February 28, 2023

Assignee: not available in the USPTO data we have

Patent Claims

10 claims

Legal claims defining the scope of protection, as filed with the USPTO.

3. The system of claim 1, wherein the GSOMAC instruction includes a GDP (Group Dot Product) operative to convolve an entire M×N block against an (M+a)×(N+b) block to generate an a×b output, wherein M, N, a, and b are positive integers.
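As an illustration only, the shape arithmetic in claim 3 can be sketched in plain Python. This is not the patented hardware instruction: the function name `gdp`, the list-of-lists layout, and the choice of window offsets are assumptions. In particular, sliding an M×N block over an (M+a)×(N+b) block allows (a+1)×(b+1) placements; the sketch evaluates only offsets 0..a-1 and 0..b-1 so that the output has the a×b shape the claim states. The claim text shown here does not say how the boundary is handled.

```python
def gdp(block, source, a, b):
    """Hypothetical model of a Group Dot Product.

    block  : M x N list of lists (e.g. weights)
    source : (M + a) x (N + b) list of lists (e.g. input samples)
    Returns an a x b list of lists; each entry is the sum of products of
    `block` with the M x N window of `source` at that offset.
    """
    m, n = len(block), len(block[0])
    if len(source) != m + a or len(source[0]) != n + b:
        raise ValueError("source must be (M+a) x (N+b)")
    out = [[0] * b for _ in range(a)]
    for dy in range(a):          # assumed: offsets 0..a-1 only
        for dx in range(b):      # assumed: offsets 0..b-1 only
            out[dy][dx] = sum(
                block[i][j] * source[dy + i][dx + j]
                for i in range(m)
                for j in range(n)
            )
    return out


# M = N = 2, a = b = 2, so the source block is 4 x 4.
weights = [[1, 0],
           [0, 1]]
samples = [[1, 2, 3, 4],
           [5, 6, 7, 8],
           [9, 10, 11, 12],
           [13, 14, 15, 16]]
print(gdp(weights, samples, 2, 2))  # [[7, 9], [15, 17]]
```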

4. The system of claim 1, wherein the GSOMAC instruction includes a GCONV (Group Convolve) operative to convolve an entire M×N block against an M×N×P block to generate a 1×P output, wherein M, N, and P are positive integers.
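The data flow in claim 4 can be sketched the same way: one M×N block is multiplied element-wise against each of P planes, and each plane contributes one summed value to a 1×P result. The function name `gconv` and the representation of the M×N×P block as a list of P planes are assumptions made for illustration; the claim does not specify a memory layout.

```python
def gconv(block, planes):
    """Hypothetical model of a Group Convolve.

    block  : M x N list of lists
    planes : list of P planes, each an M x N list of lists
             (assumed layout for the M x N x P block)
    Returns a list of P sums of products (the 1 x P output).
    """
    m, n = len(block), len(block[0])
    out = []
    for plane in planes:
        if len(plane) != m or len(plane[0]) != n:
            raise ValueError("every plane must be M x N")
        out.append(sum(block[i][j] * plane[i][j]
                       for i in range(m)
                       for j in range(n)))
    return out


# M = N = 2, P = 3.
inputs = [[1, 2],
          [3, 4]]
filters = [
    [[1, 0], [0, 0]],   # picks the top-left term
    [[0, 0], [0, 1]],   # picks the bottom-right term
    [[1, 1], [1, 1]],   # sums all four terms
]
print(gconv(inputs, filters))  # [1, 4, 10]
```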

7. The system of claim 6, wherein each bit of the plurality of bits corresponding to a set of the plurality of sets of the second source operand having all terms of zero are reset, and each bit of the plurality of bits corresponding to a set of the plurality of sets of the second source operand having at least one term non-zero are set.
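The bit rule in claim 7 can be modelled in a few lines. Claim 6, which introduces the plurality of bits, is not reproduced on this page, so the sketch covers only the set/reset rule stated here. The name `nonzero_set_mask` and the mapping of set index i to bit i (least significant bit first) are assumptions.

```python
def nonzero_set_mask(sets):
    """Build one bit per set of the second source operand.

    A bit is set (1) when its set contains at least one non-zero term,
    and reset (0) when every term in the set is zero.
    Bit i corresponds to sets[i] (assumed ordering).
    """
    mask = 0
    for i, terms in enumerate(sets):
        if any(t != 0 for t in terms):
            mask |= 1 << i
    return mask


operand_sets = [
    [0, 0, 0],   # all zero      -> bit 0 reset
    [0, 3, 0],   # one non-zero  -> bit 1 set
    [0, 0, 0],   # all zero      -> bit 2 reset
    [1, 2, 0],   # non-zero      -> bit 3 set
]
print(bin(nonzero_set_mask(operand_sets)))  # 0b1010
```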

10. The system of claim 2, wherein the SIMD thread includes a dispatch mask, and wherein the dispatch mask indicates which channels of a plurality of channels are enabled and/or indicates which channels of the plurality of channels are disabled at an initial point in time.

11. The system of claim 2, wherein the walk instruction block involves processing data from the subset of the channels of the plurality of channels during execution of an iteration of a block of instructions for the SIMD thread.

12. The system of claim 2, wherein the logic when executed is further operable to cause the one or more processors to perform operations comprising generating, by the code block iterator, a walk mask of the SIMD thread, wherein the walk mask indicates which subset of channels that are enabled and/or disabled during a particular walk iteration of executing the plurality of instructions for the SIMD thread.

13. The system of claim 2, wherein the logic when executed is further operable to cause the one or more processors to perform operations comprising generating, by the code block iterator, a walk mask of the SIMD thread based at least on the walk size and the execution mask, wherein the execution mask is a mask that is applied when performing the plurality of instructions for the SIMD thread during a particular iteration of executing instructions for the SIMD thread.
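Claims 12 and 13 describe a walk mask derived from the walk size and the execution mask. One possible reading, sketched below, is that each walk iteration enables the next `walk_size` channels that are set in the execution mask. The grouping policy, the function name `walk_masks`, and the integer bitmask representation are all assumptions; the claims shown here state only that the walk mask is based at least on those two inputs.

```python
def walk_masks(execution_mask, walk_size, num_channels):
    """Split the enabled channels into per-iteration walk masks.

    execution_mask : int, bit c set means channel c is enabled
    walk_size      : maximum number of channels handled per walk iteration
    num_channels   : total channel count of the SIMD thread
    Returns a list of int masks, one per walk iteration (assumed policy:
    take enabled channels in ascending order, walk_size at a time).
    """
    if walk_size <= 0:
        raise ValueError("walk_size must be positive")
    masks = []
    current, count = 0, 0
    for ch in range(num_channels):
        if (execution_mask >> ch) & 1:
            current |= 1 << ch
            count += 1
            if count == walk_size:
                masks.append(current)
                current, count = 0, 0
    if current:
        masks.append(current)  # final, partially filled iteration
    return masks


# Channels 1, 2, 4, 5 and 7 are enabled; two channels per walk iteration.
for m in walk_masks(0b10110110, walk_size=2, num_channels=8):
    print(format(m, "08b"))
# 00000110
# 00110000
# 10000000
```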

14. The system of claim 2, wherein the logic when executed is further operable to cause the one or more processors to perform operations comprising utilizing a subset of walk registers of a plurality of walk registers to execute the walk instruction block.

18. The method of claim 16, wherein the GSOMAC instruction includes a GDP (Group Dot Product) operative to convolve an entire M×N block against an (M+a)×(N+b) block to generate an a×b output, wherein M, N, a, and b are positive integers.

19. The method of claim 16, wherein the GSOMAC instruction includes a GCONV (Group Convolve) operative to convolve an entire M×N block against an M×N×P block to generate a 1×P output, wherein M, N, and P are positive integers.

Patent Metadata

Filing Date: Unknown

Publication Date: February 28, 2023

Inventors: Satyaki Koneru, Kamaraj Thangam
