11080101

Dependency Scheduling for Control Stream in Parallel Processor

Published: August 3, 2021
Assignee: not available in the USPTO data we have

Patent Claims
20 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. An apparatus, comprising: graphics processor circuitry configured to execute instructions specified by kernels; one or more storage elements configured to store a control stream that includes kernels and commands, wherein the control stream includes multiple substreams; barrier clearing circuitry; and a set of multiple substream processors, wherein ones of the substream processors are configured to: fetch and parse portions of the control stream corresponding to an assigned substream; in response to a neighbor barrier command in the assigned substream that identifies another substream, communicate the identified other substream to the barrier clearing circuitry; wherein the barrier clearing circuitry is configured to determine whether to allow the assigned substream to proceed past the neighbor barrier command based on communication of a most-recently-completed command from of a substream processor to which the other substream is assigned.

2. The apparatus of claim 1, wherein the control stream indicates command identifiers assigned to ones of the commands; wherein the substream processor is configured to, in response to the neighbor barrier command, communicate a first command identifier of the neighbor barrier command; and wherein the barrier clearing circuitry is configured to determine whether to allow the assigned substream to proceed past the neighbor barrier command based on the first command identifier and a command identifier of the most-recently-completed command from of a substream processor to which the other substream is assigned.

3. The apparatus of claim 2, wherein the command identifiers are assigned to the commands according to a monotonic function.
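
As an illustration only, claims 1 to 3 can be read as a small bookkeeping scheme: each substream processor reports the identifier of its most recently completed command, and the barrier clearing circuitry compares that against the identifier carried by a neighbor barrier. The sketch below is a software model under stated assumptions, not the patented hardware. The class name `BarrierClearing`, its methods, the use of increasing integers as command identifiers, and the "greater than or equal" comparison are all hypothetical choices made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class BarrierClearing:
    """Hypothetical software model of the barrier clearing circuitry."""

    # Most recently completed command ID per substream.
    # A missing entry means that substream has completed nothing yet.
    last_completed: dict = field(default_factory=dict)

    def report_completion(self, substream: int, command_id: int) -> None:
        # Assumption: IDs increase monotonically, so the largest ID seen
        # is the most recently completed command.
        prev = self.last_completed.get(substream, -1)
        self.last_completed[substream] = max(prev, command_id)

    def may_proceed(self, target_substream: int, required_id: int) -> bool:
        # Assumption: the barrier clears once the other substream has
        # completed a command with an ID at or beyond the required one.
        return self.last_completed.get(target_substream, -1) >= required_id


clearing = BarrierClearing()
clearing.report_completion(substream=1, command_id=4)
blocked = clearing.may_proceed(target_substream=1, required_id=7)

clearing.report_completion(substream=1, command_id=7)
cleared = clearing.may_proceed(target_substream=1, required_id=7)
```

Because the identifiers are monotonic, a single comparison per barrier suffices in this model; no per-command completion history needs to be kept.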

4. The apparatus of claim 1, wherein the control stream includes header information that indicates the substream of subsequent kernels and commands.
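
One way to picture the header information in claim 4 is as a marker that routes everything following it to a given substream until the next header appears. The following is a minimal sketch, assuming a control stream encoded as `(kind, payload)` tuples; that encoding, the `"header"` tag, and the function name `split_by_header` are assumptions introduced for illustration and do not come from the patent.

```python
from collections import defaultdict


def split_by_header(control_stream):
    """Group kernels and commands by the substream named in the latest header."""
    substreams = defaultdict(list)
    current = None
    for kind, payload in control_stream:
        if kind == "header":
            # Hypothetical encoding: the payload is the substream ID that
            # applies to all entries up to the next header.
            current = payload
        else:
            if current is None:
                raise ValueError("entry appears before any substream header")
            substreams[current].append((kind, payload))
    return dict(substreams)


stream = [
    ("header", 0), ("kernel", "k0"), ("command", "c0"),
    ("header", 1), ("kernel", "k1"),
    ("header", 0), ("kernel", "k2"),
]
by_substream = split_by_header(stream)
```

In this model each substream processor would then consume only the list keyed by its assigned substream ID.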

5. The apparatus of claim 1, wherein the control stream includes a global substream that multiple substream processors are configured to fetch and parse.

6. The apparatus of claim 5, wherein, in response to a global barrier command in the global substream, a substream processor is configured to indicate the global barrier command to the barrier clearing circuitry and wherein the barrier clearing circuitry is configured to prevent the substream processors in the set from proceeding past the global barrier command until all substream processors in the set have reached the global barrier command.
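
The global barrier in claim 6 behaves like a rendezvous: no substream processor proceeds until every processor in the set has arrived. A minimal sketch of that rule follows, assuming processors are identified by small integers; the class `GlobalBarrier` and its `arrive` method are hypothetical names for this example and do not describe the actual circuitry.

```python
class GlobalBarrier:
    """Hypothetical model: releases only when every processor has arrived."""

    def __init__(self, processor_ids):
        self.expected = set(processor_ids)
        self.arrived = set()

    def arrive(self, processor_id) -> bool:
        """Record that a processor reached the barrier.

        Returns True once all expected processors have arrived.
        """
        if processor_id not in self.expected:
            raise KeyError(f"unknown substream processor: {processor_id}")
        self.arrived.add(processor_id)
        return self.arrived == self.expected


barrier = GlobalBarrier([0, 1, 2])
after_first = barrier.arrive(0)
after_second = barrier.arrive(1)
after_last = barrier.arrive(2)
```

Arrivals are tracked as a set, so a processor that reports the same barrier twice does not release it early.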

7. The apparatus of claim 1, wherein ones of the substream processors in the set are configured to process a stream link command in its assigned substream to fetch and parse a secondary control stream.

8. The apparatus of claim 1, wherein the control stream is a compute control stream that indicates compute kernels to be executed by a graphics processor.

9. A non-transitory computer-readable medium having instructions stored thereon that include: a control stream that includes kernels and commands, wherein instructions of the kernels are executable by one or more graphics processors to perform one or more operations, wherein the control stream includes multiple substreams; wherein multiple ones of the substreams are for fetching and parsing by respective assigned substream processors in a set of substream processors; and wherein a first substream includes a neighbor barrier command that identifies a second substream, wherein the neighbor barrier command indicates to prevent a first substream processor to which the first substream is assigned from proceeding past the neighbor barrier command until a most-recently-completed command, from a second substream processor to which the second substream has assigned, meets a threshold.

10. The non-transitory computer-readable medium of claim 9, wherein the control stream indicates command identifiers assigned to ones of the commands; wherein the neighbor barrier command indicates a command identifier of the second substream; and wherein the neighbor barrier command indicates that the first substream is not to proceed past the neighbor barrier command until a command identifier of the most-recently-completed command from the second substream processor meets the command identifier indicated in the neighbor barrier command.

11. The non-transitory computer-readable medium of claim 10, wherein the command identifiers are assigned to commands according to a monotonic function.

12. The non-transitory computer-readable medium of claim 9, wherein the control stream includes header information that indicates the substream of subsequent kernels and commands.

13. The non-transitory computer-readable medium of claim 9, wherein the control stream includes a global substream for parsing by substream processors in the set; wherein the global substream includes a global barrier command which all substream processors in the set are to reach before any of the substream processors in the set proceed past the global barrier command.

14. The non-transitory computer-readable medium of claim 9, wherein multiple substreams in the control stream include respective stream link commands that indicate to fetch and parse a secondary control stream.

15. A non-transitory computer readable storage medium having stored thereon design information that specifies a design of at least a portion of a hardware integrated circuit in a format recognized by a semiconductor fabrication system that is configured to use the design information to produce the circuit according to the design, including: graphics processor circuitry configured to execute instructions specified by kernels; one or more storage elements configured to store a control stream that includes kernels and commands, wherein the control stream includes multiple substreams; barrier clearing circuitry; and a set of multiple substream processors, wherein ones of the substream processors are configured to: fetch and parse portions of the control stream corresponding to an assigned substream; in response to a neighbor barrier command in the assigned substream that identifies another substream, communicate the identified other substream to the barrier clearing circuitry; wherein the barrier clearing circuitry is configured to determine whether to allow the assigned substream to proceed past the neighbor barrier command based on communication of a most-recently-completed command from of a substream processor to which the other substream is assigned.

16. The non-transitory computer readable storage medium of claim 15, wherein the control stream indicates command identifiers assigned to ones of the commands; wherein the substream processor is configured to, in response to the neighbor barrier command, communicate a first command identifier of the neighbor barrier command; and wherein the barrier clearing circuitry is configured to determine whether to allow the assigned substream to proceed past the neighbor barrier command based on the first command identifier and a command identifier of the most-recently-completed command from of a substream processor to which the other substream is assigned.

17. The non-transitory computer readable storage medium of claim 15, wherein the control stream includes header information that indicates the substream of subsequent kernels and commands.

18. The non-transitory computer readable storage medium of claim 15, wherein the control stream includes a global substream that multiple substream processors are configured to fetch and parse.

19. The non-transitory computer readable storage medium of claim 18, wherein, in response to a global barrier command in the global substream, a substream processor is configured to indicate the global barrier to the barrier clearing circuitry and wherein the barrier clearing circuitry is configured to prevent the substream processors in the set from proceeding past the global barrier until all substream processors in the set have reached the global barrier command.

20. The non-transitory computer readable storage medium of claim 15, wherein ones of the substream processors in the set are configured to process a stream link command in its assigned substream to fetch and parse a secondary control stream.

Patent Metadata

Filing Date: Unknown
Publication Date: August 3, 2021
Inventors: Andrew M. Havlir, Jason D. Carroll, Karl D. Mann

Citation & reuse

Analysis on this page is generated by Patentable — an AI-powered patent intelligence platform. AI-generated summaries, explanations, and analysis may be reused with attribution and a visible link back to the canonical URL below. Patent abstracts and claims are USPTO public domain.

Cite as: Patentable. “Dependency Scheduling for Control Stream in Parallel Processor” (11080101). https://patentable.app/patents/11080101
