Enqueuing Kernels from Kernels on GPU/CPU

Published: March 23, 2021

Assignee: not available in the USPTO data we have

Technical Abstract

Patent Claims

20 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A program storage device, on which are stored instructions, comprising instructions that when executed cause one or more compute units to: enqueue a first kernel by a first compute unit for execution on a second compute unit, wherein the first and second compute units have different capabilities; determine, based on the execution of the first kernel, that a condition is met; and in response to the condition being met based on the execution of the first kernel, enqueue a second kernel for execution on the second compute unit.

2. The program storage device of claim 1, wherein the first compute unit is a central processing unit (CPU) and the second compute unit is a graphic processing unit (GPU).

3. The program storage device of claim 1, wherein the second kernel is enqueued after execution of the first kernel is complete.

4. The program storage device of claim 1, wherein the second kernel is enqueued during execution of the first kernel.

5. The program storage device of claim 1, wherein the first kernel is a data-parallel kernel, and wherein the second kernel is a data-parallel kernel with a different range than the first kernel.

6. The program storage device of claim 1, wherein the first kernel is a task-parallel kernel, and wherein the second kernel is a task-parallel kernel.

7. The program storage device of claim 1, wherein the first kernel is a task-parallel kernel, and wherein the second kernel is a data-parallel kernel.

8. The program storage device of claim 1, wherein the first kernel is a data-parallel kernel, and wherein the second kernel is a task-parallel kernel.

9. The program storage device of claim 1, wherein the second compute unit enqueues a barrier on a queue of commands to block execution of commands enqueued on the queue of commands after the barrier until the barrier completes.

10. The program storage device of claim 1, wherein the second compute unit enqueues a marker on a queue of commands that does not complete until one or more other commands complete.

11. A computing device, comprising: one or more compute units; and a global memory, coupled to the one or more compute units, on which are stored instructions that when executed cause the one or more compute units to: enqueue a first kernel by a first compute unit for execution on a second compute unit, wherein the first and second compute units have different capabilities; determine, based on the execution of the first kernel, that a condition is met; and in response to the condition being met based on the execution of the first kernel, enqueue a second kernel for execution on the second compute unit.

12. The computing device of claim 11, wherein the first compute unit is a central processing unit (CPU) and the second compute unit is a graphic processing unit (GPU).

13. The computing device of claim 11, wherein the second kernel is enqueued after execution of the first kernel is complete.

14. The computing device of claim 11, wherein the second kernel is enqueued during execution of the first kernel.

15. The computing device of claim 11, wherein the first kernel is a data-parallel kernel, and wherein the second kernel is a data-parallel kernel with a different range than the first kernel.

16. The computing device of claim 11, wherein the first kernel is a task-parallel kernel, and wherein the second kernel is a task-parallel kernel.

17. The computing device of claim 11, wherein the first kernel is a task-parallel kernel, and wherein the second kernel is a data-parallel kernel.

18. The computing device of claim 11, wherein the first kernel is a data-parallel kernel, and wherein the second kernel is a task-parallel kernel.

19. The computing device of claim 11, wherein the second compute unit enqueues a barrier on a queue of commands to block execution of commands enqueued on the queue of commands after the barrier until the barrier completes.

20. The computing device of claim 11, wherein the second compute unit enqueues a marker on a queue of commands that does not complete until one or more other commands complete.
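
Illustrative sketch (not part of the filed claims). The flow recited in claims 1, 4, and 5 (and their device counterparts 11, 14, and 15) can be pictured with a small host-side simulation in Python. This is only a toy model under stated assumptions, not the patent's implementation and not a real GPU API: `DeviceQueue`, `enqueue_kernel`, `first_kernel`, `second_kernel`, the sample data, and the threshold are all hypothetical names and values chosen for the example. Work-items run sequentially here, whereas a real device would run them in parallel.

```python
from collections import deque


class DeviceQueue:
    """Hypothetical in-order command queue for the second compute unit."""

    def __init__(self):
        self.pending = deque()  # kernels waiting to run
        self.log = []           # completed kernels, in completion order

    def enqueue_kernel(self, name, fn, global_size=1):
        # global_size > 1 models a data-parallel range;
        # global_size == 1 models a task-parallel kernel.
        self.pending.append((name, fn, global_size))

    def run(self):
        # Drain the queue; a running kernel may append more work to it.
        while self.pending:
            name, fn, global_size = self.pending.popleft()
            for gid in range(global_size):  # sequential stand-in for work-items
                fn(self, gid)
            self.log.append(f"{name}[{global_size}]")


# Assumed example data and condition.
data = [3, 9, 1, 7, 8, 2, 6, 4]
THRESHOLD = 5
selected = []  # filled in by the first kernel
results = []   # filled in by the second kernel


def second_kernel(queue, gid):
    # Data-parallel over a different (smaller) range than the first kernel.
    results.append(selected[gid] * 2)


def first_kernel(queue, gid):
    if data[gid] > THRESHOLD:
        selected.append(data[gid])
    # The last work-item checks the condition produced by this kernel's
    # execution and, if it is met, enqueues the second kernel on the same
    # queue while the first kernel is still running.
    if gid == len(data) - 1 and selected:
        queue.enqueue_kernel("second", second_kernel, global_size=len(selected))


# The "first compute unit" (for example, a CPU host) enqueues the first kernel
# for execution on the "second compute unit" (for example, a GPU).
queue = DeviceQueue()
queue.enqueue_kernel("first", first_kernel, global_size=len(data))
queue.run()

print(queue.log)
print(results)
```

In this sketch the second kernel's range depends on what the first kernel found, which is the situation where enqueuing from the device side avoids a round trip to the host. If no element exceeds the threshold, the condition is not met and no second kernel is enqueued.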

Patent Metadata

Filing Date

Unknown

Publication Date

March 23, 2021

Inventors

Aaftab A. Munshi
