Device and Method for Scheduling Multiple Thread Groups on SIMD Lanes Upon Divergence in a Single Thread Group

Published: November 10, 2020

Assignee: not available in the USPTO data

Patent Claims

21 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. An apparatus for managing divergent threads based on a Single Instruction Multiple Data (SIMD) architecture, the apparatus comprising: a plurality of Front End Units (FEUs) configured to fetch instructions of thread groups of a program flow; and a controller configured to schedule a thread group based on SIMD lane availability information, activate an FEU of the plurality of FEUs, and control the activated FEU to fetch an instruction for processing the scheduled thread group, wherein scheduling the thread group by the controller comprises, determining at least one thread group, among a plurality of thread groups, based on a number of idle SIMD lanes and an idle SIMD lane number included in the SIMD lane availability information, and scheduling an SIMD width to be processed for the at least one determined thread group, an SIMD depth which is greater than one to be processed for the at least one determined thread group, and an SIMD lane number to be processed for the at least one determined thread group, based on the number of idle SIMD lanes and the idle SIMD lane number included in the SIMD lane availability information, wherein the controller is configured to, in response to thread divergence occurring in the thread group that is scheduled due to a branch instruction, schedule, based on the SIMD lane availability information that is managed, another thread group to be processed through another FEU among the plurality of FEUs, and control the another FEU to fetch another instruction for execution by one or more SIMD lanes that are made available as a result of the thread divergence, and wherein the another thread group is independent and not divergent from the scheduled thread group.
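The controller behavior recited in claim 1 can be illustrated with a small sketch. This is a simplified model, not the patented implementation: the `Controller` class, its `lane_owner` table, and the policy of handing every idle lane to the next pending thread group are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Controller:
    """Hypothetical model of a controller that tracks SIMD lane availability."""
    num_lanes: int
    num_feus: int
    # lane index -> id of the FEU feeding it (None means the lane is idle)
    lane_owner: list = field(init=False)
    # active FEU id -> name of the thread group it fetches for
    feu_group: dict = field(default_factory=dict)

    def __post_init__(self):
        self.lane_owner = [None] * self.num_lanes

    def idle_lanes(self):
        """Return the idle SIMD lane numbers (their count is the number of idle lanes)."""
        return [i for i, owner in enumerate(self.lane_owner) if owner is None]

    def schedule(self, group):
        """Activate an inactive FEU and assign it all currently idle lanes (assumed policy)."""
        lanes = self.idle_lanes()
        free_feus = [f for f in range(self.num_feus) if f not in self.feu_group]
        if not lanes or not free_feus:
            return None
        feu = free_feus[0]
        self.feu_group[feu] = group
        for lane in lanes:
            self.lane_owner[lane] = feu
        return feu, lanes

    def on_divergence(self, feu, active_lanes, pending_groups):
        """Free the lanes that `feu` no longer uses, then schedule another,
        independent thread group on them through another FEU."""
        for lane, owner in enumerate(self.lane_owner):
            if owner == feu and lane not in active_lanes:
                self.lane_owner[lane] = None
        if not pending_groups:
            return None
        result = self.schedule(pending_groups[0])
        if result is not None:
            pending_groups.pop(0)
        return result
```

For example, with 8 lanes and 2 FEUs, scheduling one thread group occupies all lanes through FEU 0. If a branch then leaves only lanes 0 to 3 active, `on_divergence` hands lanes 4 to 7 to the next pending group through FEU 1.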

2. The apparatus of claim 1, wherein the controller comprises an active thread manager configured to, in response to the thread divergence occurring in the thread group that is scheduled due to the branch instruction, manage active thread information of the thread group with the thread divergence.

3. The apparatus of claim 2, wherein the controller further comprises an SIMD manager configured to manage the SIMD lane availability information by checking any available SIMD lanes based on the active thread information that is managed.
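One way to picture how lane availability could be derived from active thread information is with bitmasks. The sketch below assumes that each thread group's active thread information is an integer mask with one bit per SIMD lane; the claims do not specify this encoding, and the function name `lane_availability` is hypothetical.

```python
def lane_availability(num_lanes, active_masks):
    """Derive (number of idle lanes, idle lane numbers) from per-group active masks.

    Assumption: bit i of a mask is 1 when SIMD lane i is running an active thread.
    """
    used = 0
    for mask in active_masks:
        used |= mask
    # Invert the used bits and keep only the bits that correspond to real lanes.
    idle_mask = ~used & ((1 << num_lanes) - 1)
    idle_lane_numbers = [i for i in range(num_lanes) if (idle_mask >> i) & 1]
    return len(idle_lane_numbers), idle_lane_numbers
```

With 8 lanes and a single group active on lanes 0 to 3 (mask `0b00001111`), this reports four idle lanes numbered 4 through 7.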

4. The apparatus of claim 2, further comprising a distribution network configured to transfer the instruction fetched by the FEU that is activated to a corresponding SIMD lane based on the active thread information that is managed.

5. The apparatus of claim 4, wherein a plurality of SIMD lanes is provided, and each SIMD lane comprises an Execution Unit (EU) configured to execute a corresponding instruction transferred through the distribution network.

6. The apparatus of claim 1, wherein the controller is configured to schedule the thread group based on at least one of memory access characteristics, computation latency, and user input information with respect to the thread group.

7. The apparatus of claim 1, wherein the controller is configured to, before threads of the thread group that is scheduled are diverged, or after divergent threads of the thread group that is scheduled are converged, activate the FEU to fetch another instruction that controls the FEU that is activated to process the thread group using all SIMD lanes.

8. A method of managing divergent threads based on Single Instruction Multiple Data (SIMD) architecture, the method comprising: fetching, at a first Front End Unit (FEU) among a plurality of FEUs, an instruction of a first thread group; determining, at the first FEU, whether threads of the first thread group are diverged due to the instruction that is fetched; in response to determining that the threads of the first thread group are diverged, activating a second FEU among the plurality of FEUs; scheduling, based on a number of idle SIMD lanes and an idle SIMD lane number included in SIMD lane availability information, a second thread group to be processed through the second FEU, the second thread group being independent and not divergent from the first thread group; scheduling, based on the number of idle SIMD lanes and the idle SIMD lane number included in the SIMD lane availability information, an SIMD width to be processed for the second thread group, an SIMD depth which is greater than one to be processed for the second thread group, and an SIMD lane number to be processed for the second thread group; and fetching, at both the first FEU and the second FEU, instructions.
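The width, depth, and lane number scheduling step can be sketched as a small calculation. The claims do not define "SIMD depth"; this sketch assumes it means the number of passes needed to run every thread of the group over a narrower set of lanes. The function name `plan_width_depth` and the returned dictionary layout are hypothetical.

```python
import math


def plan_width_depth(group_size, idle_lane_numbers):
    """Plan how a thread group of `group_size` threads maps onto the idle lanes.

    Assumed reading of the claim terms:
      width = how many lanes the group occupies at once,
      depth = how many passes are needed to cover all threads,
      lanes = which SIMD lane numbers are used.
    """
    width = min(len(idle_lane_numbers), group_size)
    if width == 0:
        return None  # nothing can be scheduled without idle lanes or threads
    depth = math.ceil(group_size / width)
    return {"width": width, "depth": depth, "lanes": idle_lane_numbers[:width]}
```

Under this reading, a 16-thread group scheduled onto idle lanes 4 to 7 gets a width of 4 and a depth of 4. A caller that wants to match the claim language would accept only plans whose depth is greater than one.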

9. The method of claim 8, further comprising managing active thread information of each of the first thread group and the second thread group.

10. The method of claim 9, wherein the managing the active thread information comprises managing SIMD lane availability information by checking an available SIMD lane based on the active thread information of each of the first thread group and the second thread group.

11. The method of claim 8, further comprising, in response to the determining that the threads of the first thread group are diverged, managing SIMD lane usage status information that indicates information about any SIMD lane that was being used at a time shortly before the threads of the first thread group are diverged.

12. The method of claim 11, further comprising: in response to the determining that the threads of the first thread group are diverged due to a conditional branch, jumping into a Taken Program Counter (PC).

13. The method of claim 11, further comprising: determining whether the instruction fetched by the first FEU is a branch-join instruction; in response to determining that the instruction fetched by the first FEU is the branch-join instruction, determining whether there is any Not-Taken Program Counter (PC) not processed due to thread divergence of the first thread group; and in response to a determination that there is no Not Taken PC not processed due to the thread divergence of the first thread group, fetching another instruction based on the SIMD lane usage status information to process the first thread group.
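The branch and branch-join handling of claims 11 through 13 can be sketched with a stack of divergence frames. The `ctx` dictionary, the frame layout, and both function names are hypothetical. The claims only recite what happens when no Not-Taken PC remains unprocessed, so the step that executes a pending Not-Taken PC is an assumption.

```python
def on_divergent_branch(ctx, taken_pc, not_taken_pc, taken_lanes):
    """Save the lanes in use just before divergence, then jump to the Taken PC."""
    taken_lanes = set(taken_lanes)
    ctx["frames"].append({
        # SIMD lane usage status information: lanes in use before the divergence
        "saved_lanes": set(ctx["lanes"]),
        # Not-Taken path still to be processed, with the lanes that follow it
        "not_taken": (not_taken_pc, ctx["lanes"] - taken_lanes),
    })
    ctx["pc"], ctx["lanes"] = taken_pc, taken_lanes


def on_branch_join(ctx, join_pc):
    """Handle a branch-join instruction for the innermost divergence."""
    frame = ctx["frames"][-1]
    if frame["not_taken"] is not None:
        # Assumed behavior: run the unprocessed Not-Taken path next.
        ctx["pc"], ctx["lanes"] = frame["not_taken"]
        frame["not_taken"] = None
    else:
        # No Not-Taken PC left: restore the saved lanes and continue past the join.
        ctx["frames"].pop()
        ctx["pc"], ctx["lanes"] = join_pc, frame["saved_lanes"]
```

Starting from `ctx = {"pc": 10, "lanes": {0, 1, 2, 3}, "frames": []}`, a divergent branch that sends lanes 0 and 1 to PC 20 leaves lanes 2 and 3 pending at the Not-Taken PC. The first branch-join switches to that pending path, and the second restores all four lanes.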

14. The method of claim 8, wherein the scheduling the second thread group comprises scheduling the second thread group based on at least one of memory access characteristics, computation latency, and user input information with respect to the second thread group.

15. A method of managing divergent threads based on Single Instruction Multiple Data (SIMD) architecture, the method comprising: scheduling, based on a number of idle SIMD lanes and an idle SIMD lane number included in SIMD lane availability information, a first thread group of a program flow; scheduling, based on the number of idle SIMD lanes and the idle SIMD lane number included in the SIMD lane availability information, an SIMD width to be processed for the first thread group of the program flow, an SIMD depth which is greater than one to be processed for the first thread group of the program flow, and an SIMD lane number to be processed for the first thread group of the program flow; activating a first FEU among a plurality of FEUs configured to fetch instructions for execution by SIMD lanes; fetching, at the first FEU that is activated, a first instruction for processing the first thread group that is scheduled; managing the SIMD lane availability information by checking any available SIMD lanes resulting from a thread divergence occurring in the first thread group that is scheduled due to a branch instruction; scheduling, based on the SIMD lane availability information that is managed, a second thread group to be processed through a second FEU among the plurality of FEUs, the second thread group being independent and not divergent from the first thread group; and fetching, at the second FEU, a second instruction for execution by one or more first SIMD lanes that are made available as a result of the thread divergence.

16. The method of claim 15, further comprising, in response to the thread divergence occurring in the first thread group that is scheduled due to the branch instruction, managing active thread information of the first thread group with the thread divergence.

17. The method of claim 16, further comprising managing the SIMD lane availability information by checking any available SIMD lanes based on the active thread information that is managed.

18. The method of claim 15, further comprising: scheduling, based on the SIMD lane availability information that is managed, a third thread group to be processed through a third FEU among the plurality of FEUs; and fetching, at the third FEU, instructions for execution by SIMD lanes that are made available as a result of the thread divergence.

19. The method of claim 16, further comprising transferring the instruction fetched by the first FEU that is activated to a corresponding SIMD lane based on the active thread information that is managed.

20. A non-transitory computer readable recording medium having recorded thereon a program executable by a computer for performing the method of claim 9.

21. A non-transitory computer readable recording medium having recorded thereon a program executable by a computer for performing the method of claim 15.

Patent Metadata

Filing Date

Unknown

Publication Date

November 10, 2020

Inventors

Seung-Hun JIN
