US-11301254

Instruction streaming using state migration

Published: April 12, 2022
Assignee: not available in USPTO data we have
Inventors: not available in USPTO data we have
Technical Abstract

A method, system, and/or processor for processing data is disclosed that includes processing a parent stream, detecting a branch instruction in the parent stream, activating an additional child stream, copying the content of a parent mapper copy of the parent stream to an additional child mapper copy, dispatching instructions for the parent stream and the additional child stream, and executing the parent stream and the additional child stream on different execution slices. In an aspect, a first parent mapper copy is associated and used in connection with executing the parent stream and a second different child mapper copy is associated and used in connection with executing the additional child stream. The method in an aspect includes processing one or more streams and/or one or more threads of execution on one or more execution slices.

Patent Claims
20 claims

Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.

Claim 1

Original Legal Text

1. A method of processing data in a processor, the method comprising: processing a parent stream; detecting a branch instruction in the parent stream; activating an additional child stream; copying the content of a parent mapper copy of the parent stream to an additional child mapper copy; dispatching instructions for the parent stream and the additional child stream, and executing the parent stream and the additional child stream on different execution slices.

Plain English Translation

This claim covers a way of handling branch instructions in a processor by splitting execution into more than one instruction stream. The processor processes a parent stream of instructions. When it detects a branch instruction in the parent stream, it activates an additional child stream. The content of the parent mapper copy, which holds the parent stream's register-mapping state, is copied to an additional child mapper copy so that the child stream starts from the same state. Instructions for both streams are then dispatched, and the parent and child streams execute on different execution slices. Running the two streams in parallel presumably lets the processor work on more than one path past the branch at once, which can reduce the cost of a mispredicted branch. Claim 1 itself does not say how the streams are reconciled once the branch resolves; deactivating one of the streams is introduced in dependent claim 5.
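The steps of claim 1 can be sketched as a small software model. This is only an illustration under stated assumptions, not the patented hardware: the names `MapperCopy`, `Stream`, `activate_child`, and `process` are hypothetical, the mapper is reduced to a dictionary from logical register names to physical register tags, and the choice that the parent follows the fall-through path while the child follows the taken path is an assumption the claim does not make.

```python
from dataclasses import dataclass, field


@dataclass
class MapperCopy:
    """Toy mapper copy: logical register name -> physical register tag."""
    entries: dict = field(default_factory=dict)


@dataclass
class Stream:
    stream_id: int
    mapper: MapperCopy
    slice_id: int                       # execution slice this stream runs on
    executed: list = field(default_factory=list)


def activate_child(parent: Stream, child_id: int, free_slice: int) -> Stream:
    """Activate a child stream whose mapper copy starts as a copy of the parent's."""
    child_mapper = MapperCopy(entries=dict(parent.mapper.entries))
    return Stream(stream_id=child_id, mapper=child_mapper, slice_id=free_slice)


def process(pre_branch, fallthrough_path, taken_path):
    """Run a parent stream, fork at the branch, and execute both streams."""
    parent = Stream(stream_id=0, mapper=MapperCopy({"r1": "p7"}), slice_id=0)
    parent.executed.extend(pre_branch)          # processing the parent stream

    # Branch detected: activate the child and copy the parent's mapper contents.
    child = activate_child(parent, child_id=1, free_slice=1)

    # Dispatch and execute each stream on its own execution slice.
    # Assumption: parent follows the fall-through path, child the taken path.
    parent.executed.extend(fallthrough_path)
    child.executed.extend(taken_path)
    return parent, child


parent, child = process(["add"], ["sub"], ["mul"])
print(parent.slice_id, parent.executed)
print(child.slice_id, child.executed)
```

The child's mapper copy is a separate dictionary, so later updates by either stream do not affect the other, which is the property dependent claim 2 spells out.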

Claim 2

Original Legal Text

2. The method of claim 1 , wherein a first parent mapper copy is associated and used in connection with executing the parent stream and a second different child mapper copy is associated and used in connection with executing the additional child stream.

Plain English Translation

This dependent claim adds that the parent and child streams each use their own mapper copy. A first (parent) mapper copy is associated with and used while executing the parent stream, and a second, different (child) mapper copy is associated with and used while executing the additional child stream. Because each stream has a separate mapper copy, mapping updates made by one stream do not overwrite the state the other stream relies on.

Claim 3

Original Legal Text

3. The method of claim 1 , further comprising processing one or more threads of execution on one or more execution slices.

Plain English Translation

This dependent claim adds that the method of claim 1 also includes processing one or more threads of execution on one or more execution slices. An execution slice here is a hardware portion of the processor that executes instructions (claim 11 recites that each execution slice has at least one execution unit); it is not a time slice assigned by a scheduler. The claim does not specify how threads are assigned to slices.

Claim 4

Original Legal Text

4. The method according to claim 1 , further comprising determining the number of threads of execution that the processor is executing.

Plain English Translation

This dependent claim adds a step of determining the number of threads of execution that the processor is executing. The claim does not state how the count is used. Read together with claims 9 and 12, which describe a processor that handles up to four threads and up to four streams and runs in single-thread, SMT2, or SMT4 mode, the thread count plausibly indicates whether resources are available for an additional child stream.

Claim 5

Original Legal Text

5. The method according to claim 1 , further deactivating one of the parent or child streams.

Plain English Translation

This dependent claim adds deactivating one of the parent or child streams. After both streams have been activated and executed on different execution slices under claim 1, one of the two is shut down and the other continues. The claim does not state what triggers the deactivation; in the context of claim 1, a natural reading is that the stream on the path not taken is deactivated once the branch resolves.

Claim 6

Original Legal Text

6. The method according to claim 5 , further comprising deallocating mapper copy entries for the deactivated stream.

Plain English Translation

This dependent claim builds on claim 5. When a stream is deactivated, the entries in the mapper copy that were allocated for that stream are deallocated. Freeing those entries makes them available for reuse, so the deactivated stream no longer holds mapper resources.

Claim 7

Original Legal Text

7. The method according to claim 5 , further comprising deactivating the mapper copy for the deactivated stream.

Plain English Translation

This dependent claim builds on claim 5. When a stream is deactivated, the mapper copy that served that stream is deactivated as well. The claim covers only the deactivation itself; it does not describe when or how the mapper copy is later reactivated.

Claim 8

Original Legal Text

8. The method of claim 5 , further comprising copying the mapper state of the mapper copy handling the stream that was not deactivated to a different mapper copy.

Plain English Translation

This dependent claim builds on claim 5. After one stream is deactivated, the mapper state of the mapper copy handling the surviving stream (the one that was not deactivated) is copied to a different mapper copy. This appears to be the state migration named in the patent's title: the surviving stream's mapping state is moved to another mapper copy. The claim does not say why the state is moved or which mapper copy receives it.
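Claims 5 through 8 can be read as one cleanup sequence, sketched below as a toy model. Everything here is an assumption made for illustration: `MapperCopy`, its `active` flag, and `deactivate_stream` are hypothetical names, mapper copies are plain dictionaries, and the claims do not require the three steps to happen together or in this order.

```python
from dataclasses import dataclass, field


@dataclass
class MapperCopy:
    """Toy mapper copy: logical register name -> physical register tag."""
    entries: dict = field(default_factory=dict)
    active: bool = True


def deactivate_stream(mappers, loser, survivor, migrate_to=None):
    """Deactivate one stream's mapper copy and optionally migrate the survivor's state.

    mappers    : list of MapperCopy, indexed by mapper copy number
    loser      : index of the mapper copy used by the deactivated stream
    survivor   : index of the mapper copy used by the stream that continues
    migrate_to : optional index of a different mapper copy to receive the state
    """
    mappers[loser].entries.clear()      # claim 6: deallocate the entries
    mappers[loser].active = False       # claim 7: deactivate the mapper copy
    if migrate_to is not None:          # claim 8: copy the surviving state elsewhere
        mappers[migrate_to].entries = dict(mappers[survivor].entries)
        mappers[migrate_to].active = True
    return mappers


mappers = [
    MapperCopy({"r1": "p7"}),           # parent stream's copy
    MapperCopy({"r1": "p9"}),           # child stream's copy
    MapperCopy(active=False),           # an unused copy
]
deactivate_stream(mappers, loser=0, survivor=1, migrate_to=2)
```

In this run the parent's copy is emptied and marked inactive, and the child's mapping state is copied into the third mapper copy.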

Claim 9

Original Legal Text

9. The method of claim 1 , wherein the processor has four execution slices that can process four threads of instructions and the processor is further configured to process up to four streams of instructions.

Plain English Translation

This dependent claim narrows claim 1 to a processor with four execution slices that can process four threads of instructions. The processor is also configured to process up to four streams of instructions. The claim sets these capacities only; it does not describe how threads or streams are scheduled onto the slices.

Claim 10

Original Legal Text

10. The method of claim 9 , where the processor has two super slices where each super slice has two execution slices and two mapper copies and a register file free list.

Plain English Translation

This dependent claim builds on claim 9 and describes how the four execution slices are organized. The processor has two super slices. Each super slice has two execution slices, two mapper copies, and a register file free list. The execution slices execute instructions. The mapper copies hold register-mapping state for the streams being executed, and the free list, as the name suggests, tracks register file entries that are available for allocation.
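The organization in claims 9 and 10 can be pictured with a small data-structure sketch. It is a toy model under assumptions: `SuperSlice`, `build_processor`, and `allocate` are hypothetical names, the eight physical registers per super slice is an arbitrary number chosen for the example, and taking a register from the free list when a mapper entry is written is a guess at how the two structures interact rather than something the claim states.

```python
from dataclasses import dataclass


@dataclass
class SuperSlice:
    execution_slices: list      # ids of the two execution slices
    mapper_copies: list         # two mapper copies, each a dict
    free_list: list             # free physical register tags


def build_processor(regs_per_super_slice: int = 8) -> list:
    """Build two super slices, each with two execution slices and two mapper copies."""
    super_slices = []
    for ss in range(2):
        super_slices.append(SuperSlice(
            execution_slices=[2 * ss, 2 * ss + 1],
            mapper_copies=[{}, {}],
            free_list=[f"ss{ss}_p{i}" for i in range(regs_per_super_slice)],
        ))
    return super_slices


def allocate(super_slice: SuperSlice, copy_index: int, logical_reg: str) -> str:
    """Take a register from the free list and record it in one mapper copy."""
    phys = super_slice.free_list.pop(0)
    super_slice.mapper_copies[copy_index][logical_reg] = phys
    return phys


processor = build_processor()
first = allocate(processor[0], copy_index=0, logical_reg="r1")
```

Two super slices of two execution slices each give the four execution slices recited in claim 9.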

Claim 11

Original Legal Text

11. A system for processing data, the system comprising: at least one processor having at least one super slice; the at least one super slice having at least two execution slices for processing instructions, and a mapper having two mapper file copies, each mapper file copy having entries for storing data; each execution slice having at least one execution unit; one or more computer readable non-transitory storage media; and programming instructions stored on the one or more computer readable non-transitory storage media for execution by the at least one processor, wherein the programming instructions when executed by the processor cause the processor to: process a parent stream; detect a branch instruction in the parent stream; activate an additional child stream; copy the contents of the parent mapper file copy of the parent stream to an additional child mapper file copy; dispatch instructions for the parent stream and the additional child stream, and execute the parent stream and additional child stream on different execution slices using different execution units.

Plain English Translation

This independent claim covers a system counterpart to the method of claim 1. The system has at least one processor with at least one super slice. Each super slice has at least two execution slices for processing instructions and a mapper with two mapper file copies, and each mapper file copy has entries for storing data. Each execution slice has at least one execution unit. The system also includes non-transitory computer-readable storage media holding programming instructions. When executed, those instructions cause the processor to process a parent stream, detect a branch instruction in it, activate an additional child stream, copy the contents of the parent mapper file copy to an additional child mapper file copy, dispatch instructions for both streams, and execute the two streams on different execution slices using different execution units.

Claim 12

Original Legal Text

12. The system according to claim 11 , wherein the processor is configured to operate in a number of modes of operation including single thread mode, double thread mode (SMT2) and four-threaded mode (SMT4) and the system further comprises programming instructions that when executed by the processor cause the processor to determine the mode in which the processor is operating.

Plain English Translation

This dependent claim adds that the processor can operate in several modes: single-thread mode, double-thread mode (SMT2), and four-threaded mode (SMT4). The system further includes programming instructions that cause the processor to determine which mode it is operating in. The claim covers determining the current mode only; it does not claim switching between modes or the criteria for doing so.
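As a rough illustration of the three modes, the sketch below derives a mode from a count of active threads. The claim only says the processor determines its mode; deriving it from the thread count, the names `Mode` and `determine_mode`, and the treatment of three active threads as SMT4 are all assumptions made for this example.

```python
from enum import Enum


class Mode(Enum):
    ST = 1      # single-thread mode
    SMT2 = 2    # double-thread mode
    SMT4 = 4    # four-threaded mode


def determine_mode(active_threads: int) -> Mode:
    """Map an active-thread count to a mode (assumed rule, not from the claim)."""
    if not 1 <= active_threads <= 4:
        raise ValueError("this toy model supports 1 to 4 active threads")
    if active_threads == 1:
        return Mode.ST
    if active_threads == 2:
        return Mode.SMT2
    return Mode.SMT4    # assumption: 3 or 4 threads both run in SMT4


modes = [determine_mode(n) for n in (1, 2, 3, 4)]
```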

Claim 13

Original Legal Text

13. The system according to claim 11 , further comprising programming instructions that when executed by the processor cause the processor to deactivate one of the streams.

Plain English Translation

This dependent claim adds programming instructions that cause the processor to deactivate one of the streams. It is the system counterpart of method claim 5: after the parent and child streams have been activated and executed, one of them is shut down. The claim does not state what triggers the deactivation.

Claim 14

Original Legal Text

14. The system according to claim 13 , further comprising programming instructions that when executed by the processor cause the processor to deallocate mapper file copy entries for the deactivated stream.

Plain English Translation

This dependent claim builds on claim 13 and is the system counterpart of method claim 6. The system further includes programming instructions that cause the processor to deallocate the mapper file copy entries for the deactivated stream, so that those entries are released for reuse. The mapper file copy here is a processor structure for register mapping, not a file in a file system.

Claim 15

Original Legal Text

15. The system according to claim 13 , further comprising programming instructions that when executed by the processor cause the processor to deactivate the mapper copy for the deactivated stream.

Plain English Translation

This dependent claim builds on claim 13 and is the system counterpart of method claim 7. The system further includes programming instructions that cause the processor to deactivate the mapper copy for the deactivated stream. The claim does not recite any further components or describe how the mapper copy is later reactivated.

Claim 16

Original Legal Text

16. The system of claim 13 , further comprising programming instructions that when executed by the processor cause the processor to copy the mapper state of the mapper file copy handling the stream that was not deactivated to a different mapper file copy.

Plain English Translation

This dependent claim builds on claim 13 and is the system counterpart of method claim 8. The system further includes programming instructions that cause the processor to copy the mapper state of the mapper file copy handling the stream that was not deactivated to a different mapper file copy. In other words, the surviving stream's mapping state is migrated to another mapper file copy. The claim does not say what triggers the copy or which mapper file copy receives the state.

Claim 17

Original Legal Text

17. The system of claim 11 , wherein the processor has four execution slices that can process four threads of instructions and the processor is further configured to process up to four streams of instructions.

Plain English Translation

This dependent claim narrows the system of claim 11, mirroring method claim 9. The processor has four execution slices that can process four threads of instructions, and it is further configured to process up to four streams of instructions.

Claim 18

Original Legal Text

18. The system of claim 11 , wherein a first mapper file copy is associated and used in connection with executing the parent stream and a second different mapper file copy is associated and used in connection with executing the additional child stream.

Plain English Translation

This dependent claim narrows the system of claim 11, mirroring method claim 2. A first mapper file copy is associated with and used while executing the parent stream, and a second, different mapper file copy is associated with and used while executing the additional child stream. Each stream therefore works from its own mapper file copy rather than sharing one.

Claim 19

Original Legal Text

19. A system for processing data, the system comprising: at least one processor having at least one super slice; the at least one super slice having at least two execution slices for processing instructions, each execution slice having at least one execution unit; at least one physical register file per super slice; at least one mapper per super slice for tracking associations between the physical register file and logical register files, each mapper having at least two mapper file copies, each mapper file copy having a plurality of entries for storing data, at least one mapper file copy associated with each execution slice, wherein the system is configured to execute multiple threads of execution and multiple streams of one or more threads of execution, wherein a stream identification is used to determine which mapper copy to use while executing multiple streams of one or more threads of execution.

Plain English Translation

This independent claim covers a system built around stream-aware register mapping. The system has at least one processor with at least one super slice. Each super slice contains at least two execution slices for processing instructions, and each execution slice has at least one execution unit. There is at least one physical register file per super slice and at least one mapper per super slice. The mapper tracks associations between the physical register file and logical register files. Each mapper has at least two mapper file copies, each with a plurality of entries for storing data, and at least one mapper file copy is associated with each execution slice. The system is configured to execute multiple threads and multiple streams of one or more threads. While multiple streams are executing, a stream identification determines which mapper copy is used.
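The last element of claim 19, using a stream identification to pick a mapper copy, can be sketched as follows. This is a minimal model under assumptions: the class `Mapper` and its methods `bind`, `rename`, and `lookup` are hypothetical, each mapper copy is a dictionary from logical register names to physical register tags, and the explicit binding table from stream ID to copy index is one possible way to realize the selection.

```python
class Mapper:
    """Toy mapper with several copies; a stream ID picks the copy to use."""

    def __init__(self, num_copies: int = 2):
        self.copies = [{} for _ in range(num_copies)]
        self.copy_for_stream = {}           # stream_id -> mapper copy index

    def bind(self, stream_id: int, copy_index: int) -> None:
        """Associate a stream with one mapper copy."""
        if not 0 <= copy_index < len(self.copies):
            raise IndexError("no such mapper copy")
        self.copy_for_stream[stream_id] = copy_index

    def rename(self, stream_id: int, logical_reg: str, physical_reg: str) -> None:
        """Record a logical-to-physical mapping in the stream's own copy."""
        self.copies[self.copy_for_stream[stream_id]][logical_reg] = physical_reg

    def lookup(self, stream_id: int, logical_reg: str) -> str:
        """Read a mapping from the copy selected by the stream ID."""
        return self.copies[self.copy_for_stream[stream_id]][logical_reg]


mapper = Mapper()
mapper.bind(stream_id=0, copy_index=0)
mapper.bind(stream_id=1, copy_index=1)
mapper.rename(0, "r1", "p7")
mapper.rename(1, "r1", "p9")
```

Both streams map the same logical register `r1`, but each lookup resolves through the copy bound to its stream ID, so the two streams see different physical registers.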

Claim 20

Original Legal Text

20. The system of claim 19 , wherein the processor comprises two super slices, each super slice having two execution slices, the processor configured to process a single thread of execution, two threads of execution simultaneously, or four threads of execution simultaneously, and the processor is further configured to process up to four streams of execution, wherein the processor is configured to activate one of the mapper file copies to process an additional stream and to copy the contents of a parent mapper file copy to an additional child mapper file copy to process the additional stream using the additional child mapper file copy.

Plain English Translation

This dependent claim narrows the system of claim 19 to a specific configuration. The processor has two super slices, each with two execution slices. It is configured to process a single thread of execution, two threads simultaneously, or four threads simultaneously, and it can process up to four streams of execution. To process an additional stream, the processor activates one of the mapper file copies and copies the contents of a parent mapper file copy to an additional child mapper file copy. The additional stream is then processed using that child mapper file copy.

Patent Metadata

Filing Date

July 25, 2019

Publication Date

April 12, 2022

Citation & reuse

Analysis on this page is generated by Patentable — an AI-powered patent intelligence platform. AI-generated summaries, explanations, FAQs, and analysis may be reused with attribution and a visible link back to the canonical URL below. Patent abstracts and claims are USPTO public domain.

Cite as: Patentable. “Instruction streaming using state migration” (US-11301254). https://patentable.app/patents/US-11301254

© 2026 Nomic Interactive Technology LLC. Machine-readable context available at /api/llm-context/US-11301254. See llms.txt for full attribution policy.