Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.
1. An apparatus for managing divergent threads based on a Single Instruction Multiple Data (SIMD) architecture, the apparatus comprising: a plurality of Front End Units (FEUs) configured to fetch instructions of thread groups of a program flow; and a controller configured to schedule a thread group based on SIMD lane availability information, activate an FEU of the plurality of FEUs, and control the activated FEU to fetch an instruction for processing the scheduled thread group, wherein scheduling the thread group by the controller comprises, determining at least one thread group, among a plurality of thread groups, based on a number of idle SIMD lanes and an idle SIMD lane number included in the SIMD lane availability information, and scheduling an SIMD width to be processed for the at least one determined thread group, an SIMD depth which is greater than one to be processed for the at least one determined thread group, and an SIMD lane number to be processed for the at least one determined thread group, based on the number of idle SIMD lanes and the idle SIMD lane number included in the SIMD lane availability information, wherein the controller is configured to, in response to thread divergence occurring in the thread group that is scheduled due to a branch instruction, schedule, based on the SIMD lane availability information that is managed, another thread group to be processed through another FEU among the plurality of FEUs, and control the another FEU to fetch another instruction for execution by one or more SIMD lanes that are made available as a result of the thread divergence, and wherein the another thread group is independent and not divergent from the scheduled thread group.
This invention relates to managing divergent threads in a Single Instruction Multiple Data (SIMD) architecture to improve processing efficiency. The problem addressed is the inefficiency caused by thread divergence, where different threads in a SIMD group take different execution paths due to branch instructions, leading to underutilized processing lanes. The apparatus includes multiple Front End Units (FEUs) that fetch instructions for thread groups in a program flow. A controller schedules thread groups based on SIMD lane availability, activating an FEU to fetch instructions for the scheduled group. The controller determines which thread group to schedule by analyzing the number of idle SIMD lanes and their specific lane numbers. It then assigns an SIMD width, depth (greater than one), and lane number for processing the selected thread group. If thread divergence occurs due to a branch instruction, the controller reschedules another independent thread group through a different FEU. This new thread group utilizes the now-available SIMD lanes freed by the divergence, ensuring efficient lane utilization. The system dynamically adapts to thread divergence by reallocating resources to non-divergent, independent thread groups, optimizing overall processing performance.
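The scheduling step described above can be pictured with a small sketch. The claim does not fix a selection policy, so everything here is an assumption made for illustration: the names `ThreadGroup`, `Schedule`, and `schedule` are hypothetical, "width" is taken to mean the number of lanes used per pass, and "depth" the number of sequential passes needed to cover all threads of the group.

```python
from dataclasses import dataclass
from math import ceil


@dataclass
class ThreadGroup:
    name: str
    num_threads: int


@dataclass
class Schedule:
    group: str
    lanes: list  # SIMD lane numbers assigned to the group
    width: int   # lanes used per pass (assumed meaning of "SIMD width")
    depth: int   # sequential passes (assumed meaning of "SIMD depth")


def schedule(groups, idle_lanes):
    """Pick the first waiting group that needs more than one pass over the
    idle lanes, so that the resulting depth is greater than one."""
    width = len(idle_lanes)          # number of idle SIMD lanes
    if width == 0:
        return None
    for g in groups:
        depth = ceil(g.num_threads / width)
        if depth > 1:
            # the idle lane numbers themselves decide where the group runs
            return Schedule(g.name, sorted(idle_lanes), width, depth)
    return None
```

With four idle lanes (4 to 7), a 16-thread group would be scheduled at width 4 and depth 4 under these assumptions, while a 4-thread group would be skipped because its depth would be one.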
2. The apparatus of claim 1, wherein the controller comprises an active thread manager configured to, in response to the thread divergence occurring in the thread group that is scheduled due to the branch instruction, manage active thread information of the thread group with the thread divergence.
This invention relates to parallel processing systems, specifically managing thread execution in a multi-threaded environment where thread divergence occurs due to branch instructions. The problem addressed is the inefficiency in handling thread divergence, where threads in a group take different execution paths, leading to wasted computational resources and reduced performance. The apparatus includes a controller with an active thread manager that dynamically tracks and manages thread groups experiencing divergence. When a branch instruction causes threads in a group to diverge, the active thread manager updates and maintains accurate information about which threads remain active and which are temporarily inactive. This ensures that only the active threads consume processing resources, while inactive threads are efficiently suspended until their execution paths reconverge. The system optimizes resource allocation by dynamically adjusting thread execution based on real-time divergence status, improving overall processing efficiency and performance in parallel computing environments. The invention enhances thread management in multi-threaded architectures, particularly in graphics processing units (GPUs) or other parallel processing units where thread divergence is common.
3. The apparatus of claim 2, wherein the controller further comprises an SIMD manager configured to manage the SIMD lane availability information by checking any available SIMD lanes based on the active thread information that is managed.
The invention relates to a computing apparatus with a controller that manages Single Instruction Multiple Data (SIMD) lane availability. SIMD is a parallel processing technique where a single instruction operates on multiple data points simultaneously, improving computational efficiency. The problem addressed is optimizing SIMD lane utilization in multi-threaded environments, where threads may compete for SIMD resources, leading to inefficiencies. The apparatus includes a controller with an SIMD manager that tracks SIMD lane availability. The SIMD manager checks for available SIMD lanes by analyzing active thread information, which includes details about threads currently using or requesting SIMD lanes. This ensures that SIMD lanes are allocated efficiently, reducing idle time and improving performance. The controller also manages thread scheduling, ensuring that threads are assigned to available SIMD lanes based on their computational needs. The system dynamically adjusts lane assignments as threads complete or new threads are introduced, maintaining optimal resource utilization. This approach enhances parallel processing efficiency in multi-threaded applications, particularly in workloads requiring intensive SIMD operations.
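As a rough illustration of how lane availability could be derived from active thread information, the sketch below treats the active thread information as a mapping from thread group to the lane numbers that currently hold an active thread. That representation and the function name `idle_lanes` are assumptions, not taken from the patent.

```python
def idle_lanes(num_lanes, active_info):
    """Derive SIMD lane availability from active thread information.

    active_info maps a thread-group name to the set of lane numbers that
    currently hold an active thread of that group (assumed representation).
    """
    busy = set()
    for lanes in active_info.values():
        busy |= lanes
    idle = [lane for lane in range(num_lanes) if lane not in busy]
    # availability information: how many lanes are idle, and which ones
    return {"count": len(idle), "lane_numbers": idle}
```

The result carries both pieces of information the independent claims rely on: the number of idle lanes and the idle lane numbers.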
4. The apparatus of claim 2, further comprising a distribution network configured to transfer the instruction fetched by the FEU that is activated to a corresponding SIMD lane based on the active thread information that is managed.
This claim adds a distribution network to the apparatus of claim 2. The problem addressed is routing instructions correctly in a SIMD architecture where several Front End Units (FEUs) may be fetching for different thread groups at the same time. The distribution network transfers the instruction fetched by an activated FEU to the corresponding SIMD lane, using the active thread information maintained by the active thread manager of claim 2. Because that information records which threads of a group remain active after a divergence, an instruction reaches only the lanes that hold active threads of the FEU's thread group. Lanes freed by the divergence can then receive instructions fetched by another FEU for an independent thread group. Routing by thread activity keeps instructions away from inactive threads and lets otherwise idle lanes do useful work.
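The routing role of the distribution network can be sketched as a simple mapping step. The data shapes and the name `distribute` are hypothetical; the patent does not describe the network at this level of detail.

```python
def distribute(fetched, active_info, num_lanes):
    """Route each FEU's fetched instruction to the lanes of its active threads.

    fetched:     FEU id -> instruction fetched this cycle
    active_info: FEU id -> lane numbers holding active threads of the
                 thread group that FEU is serving (assumed representation)
    """
    lanes = [None] * num_lanes  # None means the lane receives nothing
    for feu, instr in fetched.items():
        for lane in active_info.get(feu, ()):
            if lanes[lane] is not None:
                raise ValueError(f"lane {lane} claimed by two FEUs")
            lanes[lane] = instr
    return lanes
```

In this sketch two FEUs can feed disjoint sets of lanes in the same cycle, which is the situation that arises once a divergence has freed some lanes for a second thread group.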
5. The apparatus of claim 4, wherein a plurality of SIMD lanes is provided, and each SIMD lane comprises an Execution Unit (EU) configured to execute a corresponding instruction transferred through the distribution network.
The invention relates to a parallel processing apparatus designed to enhance computational efficiency in systems requiring simultaneous execution of multiple data streams. The apparatus addresses the challenge of efficiently distributing and executing instructions across multiple processing units in a Single Instruction Multiple Data (SIMD) architecture, where a single instruction operates on multiple data points simultaneously. The apparatus includes a distribution network that transfers instructions to multiple SIMD lanes, each containing an Execution Unit (EU). Each EU is configured to execute the corresponding instruction received through the distribution network. The SIMD lanes operate in parallel, allowing the apparatus to process multiple data streams concurrently, improving throughput and reducing latency. The distribution network ensures that instructions are correctly routed to the appropriate EUs, enabling synchronized execution across the lanes. This design is particularly useful in applications requiring high-performance parallel processing, such as graphics rendering, scientific computing, and machine learning, where large datasets must be processed efficiently. By leveraging SIMD architecture, the apparatus maximizes computational resources while minimizing overhead, making it suitable for real-time and high-throughput environments. The invention enhances processing efficiency by ensuring that each EU receives and executes its assigned instruction without bottlenecks, thereby optimizing overall system performance.
6. The apparatus of claim 1, wherein the controller is configured to schedule the thread group based on at least one of memory access characteristics, computation latency, and user input information with respect to the thread group.
This invention relates to a computing apparatus with an improved thread scheduling system. The apparatus includes a controller that manages the execution of thread groups within a processor. The primary problem addressed is inefficient thread scheduling, which can lead to suboptimal performance, increased latency, and poor responsiveness in computing systems. Traditional scheduling methods often fail to account for dynamic factors such as memory access patterns, computation delays, and user interactions, resulting in inefficient resource utilization. The controller is configured to optimize thread scheduling by considering at least one of three key factors: memory access characteristics, computation latency, and user input information. Memory access characteristics refer to how threads interact with memory, including access patterns, bandwidth usage, and potential bottlenecks. Computation latency involves the time taken for threads to complete their tasks, which can vary based on workload complexity and processor availability. User input information includes real-time interactions, such as keyboard or touch inputs, which may require prioritization to ensure system responsiveness. By analyzing these factors, the controller dynamically adjusts the scheduling of thread groups to improve overall system efficiency. For example, threads with high memory access demands may be scheduled to minimize conflicts, while latency-sensitive computations may be prioritized to reduce delays. User input-related threads can be given higher priority to ensure immediate responsiveness. This adaptive approach enhances performance, reduces idle time, and improves the user experience in computing environments.
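One way to picture scheduling on these three factors is a sort key over candidate thread groups. The claim names the factors but not how they are weighed, so the ordering below (user priority first, then compute-bound before memory-bound, then shorter latency) and all field names are purely illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    memory_bound: bool      # assumed summary of memory access characteristics
    latency_cycles: int     # assumed estimate of computation latency
    user_priority: int = 0  # assumed encoding of user input information


def pick(candidates):
    """Choose the next thread group under a hypothetical ordering:
    higher user priority, then not memory-bound, then lower latency."""
    return min(
        candidates,
        key=lambda c: (-c.user_priority, c.memory_bound, c.latency_cycles),
    )
```

Because the claim requires only "at least one of" the factors, a real controller could use any single field of this key on its own.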
7. The apparatus of claim 1, wherein the controller is configured to, before threads of the thread group that is scheduled are diverged, or after divergent threads of the thread group that is scheduled are converged, activate the FEU to fetch another instruction that controls the FEU that is activated to process the thread group using all SIMD lanes.
This invention relates to parallel processing systems, specifically instruction fetching in Single Instruction Multiple Data (SIMD) architectures. The problem addressed is underutilized processing lanes when threads within a group follow different execution paths. The apparatus includes a controller and a plurality of Front End Units (FEUs). The controller schedules thread groups, each consisting of multiple threads processed in parallel on SIMD lanes. While the threads of a group are diverged, an FEU fetches instructions only for the active threads, which leaves some lanes available to other thread groups. Before the threads diverge, or after the divergent threads have converged again, no such split exists, so the controller activates the FEU to fetch an instruction that processes the thread group using all SIMD lanes. Full-lane processing by a single FEU is therefore the normal state, and lane sharing between FEUs applies only during the divergent interval.
8. A method of managing divergent threads based on Simple Instruction Multiple Data (SIMD) architecture, the method comprising: fetching, at a first Front End Unit (FEU) among a plurality of FEUs, an instruction of a first thread group; determining, at the first FEU, whether threads of the first thread group are diverged due to the instruction that is fetched; in response to determining that the threads of the first thread group are diverged, activating a second FEU among the plurality of FEUs; scheduling, based on a number of idle SIMD lanes and an idle SIMD lane number included in SIMD lane availability information, a second thread group to be processed through the second FEU, the second thread group being independent and not divergent from the first thread group; scheduling, based on the number of idle SIMD lanes and the idle SIMD lane number included in the SIMD lane availability information, an SIMD width to be processed for the second thread group, an SIMD depth which is greater than one to be processed for the second thread group, and an SIMD lane number to be processed for the second thread group; and fetching, at both the first FEU and the second FEU, instructions.
The invention relates to managing divergent threads in a computing system that uses a Single Instruction Multiple Data (SIMD) architecture. In SIMD systems, threads within a group may diverge at a branch instruction, which leaves some processing lanes idle while others execute one of the divergent paths. The invention addresses this by activating an additional Front End Unit (FEU) to handle an independent, non-divergent thread group, improving resource utilization. The method fetches an instruction of a first thread group at a first FEU and determines whether the threads of the group have diverged because of that instruction. If divergence is detected, a second FEU is activated. The method then schedules a second thread group, which is independent of and not divergent from the first, to be processed through the second FEU. Scheduling is based on SIMD lane availability information, namely the number of idle lanes and the idle lane numbers. From the same information the method also schedules the SIMD width, the SIMD depth (greater than one), and the SIMD lane number for the second thread group. Both FEUs then fetch instructions, so the divergent first group and the independent second group are processed in parallel.
9. The method of claim 8, further comprising managing active thread information of each of the first thread group and the second thread group.
This claim adds bookkeeping to the method of claim 8. After the first thread group diverges and a second, independent thread group is scheduled through the second FEU, the method manages active thread information for each of the two groups. Active thread information identifies which threads of a group are currently active; for the first group it changes when a branch instruction splits the threads across execution paths. The claim does not assign priorities or roles to the two groups: the second group is simply independent of, and not divergent from, the first. Maintaining this information for both groups is what allows dependent claim 10 to derive SIMD lane availability from it.
10. The method of claim 9, wherein the managing the active thread information comprises managing SIMD lane availability information by checking an available SIMD lane based on the active thread information of each of the first thread group and the second thread group.
This invention relates to managing thread execution in a processor, particularly for systems using Single Instruction Multiple Data (SIMD) parallelism. The problem addressed is efficiently utilizing SIMD lanes by dynamically tracking and allocating available lanes across multiple thread groups to maximize computational throughput. The method involves managing active thread information for at least two thread groups, where each thread group contains multiple threads. The active thread information includes details about which threads are currently executing and their associated SIMD lane requirements. When a thread requires a SIMD lane, the system checks the availability of lanes by analyzing the active thread information of both thread groups. This ensures that lanes are allocated only when they are free, preventing conflicts and improving resource utilization. The system dynamically updates the active thread information as threads start or complete execution, maintaining an accurate record of lane availability. By cross-referencing the active thread information of both thread groups, the method ensures that SIMD lanes are efficiently shared, reducing idle time and enhancing overall processing efficiency. This approach is particularly useful in multi-threaded environments where SIMD parallelism is critical for performance.
11. The method of claim 8, further comprising, in response to the determining that the threads of the first thread group are diverged, managing SIMD lane usage status information that indicates information about any SIMD lane that was being used at a time shortly before the threads of the first thread group are diverged.
This invention relates to Single Instruction Multiple Data (SIMD) processing in parallel computing environments, particularly the handling of thread divergence. In SIMD architectures, threads within a group execute the same instruction on different data, and performance degrades when threads diverge (take different execution paths). This claim adds a record-keeping step to the method of claim 8. When the threads of the first thread group are determined to have diverged, the method manages SIMD lane usage status information, which indicates which SIMD lanes were in use shortly before the divergence. The claim does not specify what else, if anything, is stored alongside the lane identities. Keeping this record matters because lanes freed by the divergence may be handed to a second thread group in the meantime; dependent claim 13 uses the record to fetch the next instruction for the first thread group once its divergent paths have been processed.
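A minimal sketch of such a record is shown below. The class name `LaneUsageStatus` and the use of a stack (so that nested divergences unwind in order) are assumptions added for illustration; the claim itself says only that information about the lanes in use shortly before divergence is managed.

```python
class LaneUsageStatus:
    """Remembers which SIMD lanes a thread group was using just before
    each divergence, so the same lanes can be resumed afterwards."""

    def __init__(self):
        self._saved = []  # stack of lane sets (assumed, for nested branches)

    def on_diverge(self, lanes_in_use):
        # snapshot the lanes in use at the moment divergence is detected
        self._saved.append(frozenset(lanes_in_use))

    def on_converge(self):
        # return the lane set recorded for the most recent divergence
        return set(self._saved.pop())
```

In use, `on_diverge` would be called when the first FEU detects divergence and `on_converge` when the group's divergent paths have all been processed.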
12. The method of claim 11, further comprising: in response to the determining that the threads of the first thread group are diverged due to a conditional branch, jumping into a Taken Program Counter (PC).
This claim adds branch handling to the method of claim 11. In SIMD execution, the threads of a group run the same instruction in lockstep, but a conditional branch can cause divergence, where some threads take the branch and others do not. When the method determines that the threads of the first thread group have diverged because of a conditional branch, execution jumps to the Taken Program Counter (PC), that is, the address at which the taken path begins. The claim states only this jump. Read together with claim 13, the Not-Taken PC appears to be deferred and processed later, since claim 13 checks at a branch-join instruction whether any Not-Taken PC remains unprocessed. The SIMD lane usage status information of claim 11 meanwhile records which lanes were in use shortly before the divergence, so the lanes not needed on the taken path can be identified.
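The branch step can be sketched as follows. The function name, the `pending` list used to hold a deferred Not-Taken PC, and the per-lane predicate mapping are all hypothetical; the claim specifies only that execution jumps to the Taken PC when a conditional branch causes divergence.

```python
def on_conditional_branch(active_lanes, predicate, taken_pc, not_taken_pc, pending):
    """Return (next_pc, lanes_to_run) after a conditional branch.

    predicate maps lane number -> True if that lane's thread takes the branch.
    pending collects (not_taken_pc, lanes) pairs deferred for later (assumed).
    """
    taken = {lane for lane in active_lanes if predicate[lane]}
    not_taken = set(active_lanes) - taken
    if taken and not_taken:
        # divergence: defer the not-taken path and jump to the Taken PC
        pending.append((not_taken_pc, not_taken))
        return taken_pc, taken
    if taken:
        return taken_pc, taken        # all threads took the branch
    return not_taken_pc, not_taken    # no thread took the branch
```

The lanes in the deferred set are the ones that become idle on the taken path and could be offered to another thread group.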
13. The method of claim 11, further comprising: determining whether the instruction fetched by the first FEU is a branch-join instruction; in response to determining that the instruction fetched by the first FEU is the branch-join instruction, determining whether there is any Not-Taken Program Counter (PC) not processed due to thread divergence of the first thread group; and in response to a determination that there is no Not Taken PC not processed due to the thread divergence of the first thread group, fetching another instruction based on the SIMD lane usage status information to process the first thread group.
This invention relates to a method for optimizing instruction processing in a processor, particularly in systems handling thread divergence where multiple threads within a thread group follow different execution paths. The problem addressed is inefficient instruction fetching and processing when threads in a group diverge, leading to underutilized processing resources and performance bottlenecks. The method involves a first Front-End Unit (FEU) fetching instructions for a first thread group. If the fetched instruction is a branch-join instruction, the system checks whether any Not-Taken Program Counter (PC) values remain unprocessed due to thread divergence. If no Not-Taken PCs are pending, the system uses SIMD (Single Instruction, Multiple Data) lane usage status information to fetch and process another instruction for the thread group. This ensures that processing resources are fully utilized by dynamically adjusting instruction fetching based on thread convergence status, avoiding idle cycles and improving efficiency. The method leverages SIMD lane usage data to prioritize instructions that maximize parallel execution, particularly after thread divergence resolves. This approach enhances performance in multi-threaded environments by dynamically adapting instruction flow to the current execution state of the thread group.
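The branch-join check can be sketched in a few lines. The claim covers only the case where no Not-Taken PC remains; what happens when one does remain (here, switching to it) is an assumption, as are the function name and the unit instruction size implied by `join_pc + 1`.

```python
def on_branch_join(pending, saved_lanes, join_pc):
    """Decide what the first FEU fetches next at a branch-join instruction.

    pending:     deferred (not_taken_pc, lanes) pairs (assumed structure)
    saved_lanes: lanes in use shortly before divergence, i.e. the SIMD lane
                 usage status information
    """
    if pending:
        # assumption: an unprocessed Not-Taken PC is run before rejoining
        return pending.pop()
    # no Not-Taken PC left: resume the whole group on its pre-divergence lanes
    return join_pc + 1, set(saved_lanes)
```

The second return path corresponds to the claimed step: the next instruction is fetched using the lane usage status information, so the first thread group regains the lanes it held before diverging.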
14. The method of claim 8, wherein the scheduling the second thread group comprises scheduling the second thread group based on at least one of memory access characteristics, computation latency, and user input information with respect to the second thread group.
This invention relates to optimizing thread scheduling in computing systems to improve performance and efficiency. The problem addressed is inefficient thread execution due to suboptimal scheduling, leading to wasted computational resources, increased latency, and poor responsiveness to user inputs. The solution involves dynamically scheduling thread groups based on key factors such as memory access patterns, computation latency, and user interaction data to enhance system performance. The method includes analyzing memory access characteristics of a second thread group to determine how data is retrieved and stored, which impacts cache efficiency and bandwidth usage. Computation latency is assessed to prioritize threads with longer processing times or dependencies, ensuring critical tasks complete faster. User input information, such as real-time interactions, is used to prioritize threads that directly affect user experience, reducing perceived lag. By considering these factors, the scheduling system dynamically adjusts thread execution order and resource allocation, improving overall system responsiveness and throughput. This approach is particularly useful in multi-core processors, real-time applications, and user-facing systems where performance and interactivity are critical. The invention ensures threads are executed in a manner that minimizes idle time, optimizes resource utilization, and adapts to varying workload demands.
15. A method of managing divergent threads based on Simple Instruction Multiple Data (SIMD) architecture, the method comprising: scheduling, based on a number of idle SIMD lanes and an idle SIMD lane number included in SIMD lane availability information, a first thread group of a program flow; scheduling, based on the number of idle SIMD lanes and the idle SIMD lane number included in the SIMD lane availability information, an SIMD width to be processed for the first thread group of the program flow, an SIMD depth which is greater than one to be processed for the first thread group of the program flow, and an SIMD lane number to be processed for the first thread group of the program flow; activating a first FEU among a plurality of FEUs configured to fetch instructions for execution by SIMD lanes; fetching, at the first FEU that is activated, a first instruction for processing the first thread group that is scheduled; managing the SIMD lane availability information by checking any available SIMD lanes resulting from a thread divergence occurring in the first thread group that is scheduled due to a branch instruction; scheduling, based on the SIMD lane availability information that is managed, a second thread group to be processed through a second FEU among the plurality of FEUs, the second thread group being independent and not divergent from the first thread group; and fetching, at the second FEU, a second instruction for execution by one or more first SIMD lanes that are made available as a result of the thread divergence.
This invention relates to managing divergent threads in a Single Instruction Multiple Data (SIMD) architecture to improve processing efficiency. The problem addressed is the inefficiency caused by thread divergence, where different threads in a SIMD group take different execution paths due to branch instructions, leading to underutilized processing lanes. The method involves dynamically scheduling thread groups and adjusting SIMD processing parameters based on available SIMD lanes. A first thread group is scheduled based on the number of idle SIMD lanes and their specific lane numbers, as indicated by SIMD lane availability information. The method then schedules the SIMD width, depth (greater than one), and lane number for processing this group. A first Front End Unit (FEU) is activated to fetch instructions for the scheduled thread group. If thread divergence occurs due to a branch instruction, the method updates the SIMD lane availability information to reflect the newly available lanes. It then schedules a second, independent thread group for processing through a second FEU, utilizing the lanes freed by the divergence. The second FEU fetches instructions for execution by those lanes. The approach keeps SIMD resources in use by reallocating lanes to independent thread groups when divergence occurs, thereby improving overall processing efficiency.
16. The method of claim 15, further comprising, in response to the thread divergence occurring in the first thread group that is scheduled due to the branch instruction, managing active thread information of the first thread group with the thread divergence.
This invention relates to parallel processing systems, specifically managing thread execution where branch instructions cause thread divergence. The problem addressed is the inefficiency in handling thread divergence, where threads in a group take different execution paths due to conditional branches, leading to wasted computational resources and reduced performance. This claim adds one step to the method of claim 15: when a branch instruction causes divergence in the first thread group, the method manages the active thread information of that group. This means tracking which threads are active and which are inactive as a result of the branch outcome, so that only the threads on the path currently being processed continue execution. The claim does not describe branch prediction or priority changes; it covers only the maintenance of this information. The information is what later steps rely on, such as deriving SIMD lane availability in claim 17 and routing instructions to lanes in claim 19. This approach is particularly relevant to graphics processing units (GPUs) and other parallel architectures where thread divergence can significantly reduce efficiency.
17. The method of claim 16, further comprising managing the SIMD lane availability information by checking any available SIMD lanes based on the active thread information that is managed.
This invention relates to the use of Single Instruction Multiple Data (SIMD) lanes in a processor. The problem addressed is inefficient lane utilization: after a thread divergence, some lanes sit idle because the threads assigned to them are inactive. This claim links two pieces of state from the earlier claims. The active thread information of claim 16 shows which threads of the first thread group are active, and therefore which SIMD lanes are in use. The method checks for available lanes on that basis and updates the SIMD lane availability information accordingly. Since claim 15 schedules the second thread group from the SIMD lane availability information, keeping it consistent with thread activity is what lets freed lanes be offered to an independent thread group promptly. The claim does not address preemption or context switching; it covers only the derivation of lane availability from active thread information.
18. The method of claim 15, further comprising: scheduling, based on the SIMD lane availability information that is managed, a third thread group to be processed through a third FEU among the plurality of FEUs; and fetching, at the third FEU, instructions for execution by SIMD lanes that are made available as a result of the thread divergence.
This invention relates to optimizing the execution of thread groups in a processor architecture that supports Single Instruction Multiple Data (SIMD) operations, particularly in scenarios where thread divergence occurs. The problem addressed is inefficient utilization of SIMD lanes when threads within a group diverge, leading to underutilized processing resources. The method involves managing SIMD lane availability information to track which lanes are free or occupied due to thread divergence. This information is used to schedule a third thread group for execution on a third Front-End Unit (FEU) among multiple FEUs. The third FEU then fetches instructions for execution by the SIMD lanes that have become available as a result of the divergence in previously executed thread groups. This dynamic scheduling ensures that idle SIMD lanes are repurposed efficiently, improving overall processor throughput. The method builds on a system where thread groups are initially scheduled based on SIMD lane availability, and instructions are fetched for execution by available lanes. The additional step of scheduling a third thread group and fetching instructions for available lanes further optimizes resource utilization by leveraging the freed-up lanes caused by thread divergence. This approach minimizes idle cycles and maximizes the efficiency of SIMD execution in parallel processing environments.
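The scheduling step can be sketched as follows. This is an illustration under stated assumptions, not the patented scheduler: it uses a simple first-fit policy, and the `ThreadGroup` type, the `schedule_on_free_lanes` function, and the integer FEU identifiers are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ThreadGroup:
    name: str
    width: int  # number of SIMD lanes the group needs (assumed field)


def schedule_on_free_lanes(
    idle_lane_numbers: List[int],
    idle_feus: List[int],
    pending: List[ThreadGroup],
) -> Optional[Tuple[str, int, List[int]]]:
    """Pick a pending thread group that fits the lanes freed by divergence.

    First-fit policy (an assumption): take the first pending group whose
    width fits the idle lanes, and pair it with the first idle FEU.
    Returns (group name, FEU id, assigned lanes), or None if nothing fits.
    """
    if not idle_feus:
        return None  # no front end unit is free to fetch for a new group
    for group in pending:
        if group.width <= len(idle_lane_numbers):
            return group.name, idle_feus[0], idle_lane_numbers[: group.width]
    return None
```

If lanes 4 to 7 are idle and FEU 2 is free, a pending 8-wide group is skipped and a 4-wide group is assigned to lanes 4 to 7 through FEU 2.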
19. The method of claim 16, further comprising transferring the instruction fetched by the first FEU that is activated to a corresponding SIMD lane based on the active thread information that is managed.
This invention relates to instruction delivery in a Single Instruction Multiple Data (SIMD) processor that has multiple Front End Units (FEUs). The problem addressed is making sure that, after a thread group diverges at a branch, a fetched instruction reaches only the SIMD lanes that should execute it. The method builds on claim 16, in which the system manages active thread information for a thread group whose threads have diverged. Here, the first FEU, once activated, fetches an instruction, and that instruction is transferred to the corresponding SIMD lane or lanes according to the managed active thread information. Lanes running threads that are active on the current path receive the instruction; lanes freed by the divergence do not, which leaves them available for instructions fetched by other FEUs for other thread groups. Routing by active thread information keeps each FEU's instruction stream matched to the lanes of its own thread group.
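The routing of a fetched instruction to lanes can be sketched as below. It assumes one thread per lane and a bitmask of active threads, neither of which the claim specifies. The `dispatch` function and the string form of the instruction are hypothetical.

```python
from typing import Dict


def dispatch(instruction: str, active_mask: int, num_lanes: int) -> Dict[int, str]:
    """Deliver a fetched instruction only to lanes with active threads.

    Assumes thread i runs on lane i. Lanes whose bit is cleared in
    active_mask receive nothing and stay free for other thread groups.
    """
    return {
        lane: instruction
        for lane in range(num_lanes)
        if (active_mask >> lane) & 1
    }
```

With an active mask of `0b0101` on a 4-lane unit, the instruction is delivered to lanes 0 and 2 only.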
20. A non-transitory computer readable recording medium having recorded thereon a program executable by a computer for performing the method of claim 9.
This claim covers a non-transitory computer-readable recording medium that stores a program which, when executed by a computer, performs the method of claim 9. It adds no new method steps. Instead, it extends protection from the method itself to software recorded on a storage medium that carries out that method. As with the patent's other claims, the underlying method concerns managing divergent threads in a Single Instruction Multiple Data (SIMD) architecture.
21. A non-transitory computer readable recording medium having recorded thereon a program executable by a computer for performing the method of claim 15.
This claim covers a non-transitory computer-readable recording medium that stores a program which, when executed by a computer, performs the method of claim 15. It adds no new method steps. Instead, it extends protection to software recorded on a storage medium that carries out that method, in which thread groups are scheduled through Front End Units (FEUs) based on SIMD lane availability information and lanes freed by thread divergence are reused.
November 10, 2020