Provided are a computer program product, system, and method for using a machine learning module to determine an allocation of stage and destage tasks. Storage performance information related to processing of Input/Output (I/O) requests with respect to the storage unit is provided to a machine learning module. The machine learning module receives a computed number of stage tasks and a computed number of destage tasks. A current number of stage tasks allocated to stage tracks from the storage unit to the cache is adjusted based on the computed number of stage tasks. A current number of destage tasks allocated to destage tracks from the cache to the storage unit is adjusted based on the computed number of destage tasks.
Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.
1. A computer program product for allocating tasks to stage tracks from a storage unit to a cache and destage tracks from the cache to the storage unit, comprising a computer readable storage medium having computer readable program code embodied therein that when executed performs operations, the operations comprising: providing a machine learning module that receives as input storage performance information related to processing of Input/Output (I/O) requests with respect to the storage unit; determining an adjusted number of stage tasks; determining an adjusted number of destage tasks; retraining the machine learning module with the storage performance information to produce the adjusted number of stage tasks and the adjusted number of destage tasks; and using the retrained machine learning module to produce a computed number of stage tasks to allocate to staging operations and a computed number of destage tasks to allocate to destaging operations.
This invention relates to optimizing task allocation in data storage systems, specifically balancing the staging (moving data from storage to cache) and destaging (moving data from cache to storage) operations to improve performance. The problem addressed is inefficient task allocation, which can lead to bottlenecks, increased latency, or suboptimal resource utilization in storage systems handling Input/Output (I/O) requests. The solution involves a machine learning module that dynamically adjusts the number of staging and destaging tasks based on storage performance metrics. The system collects storage performance information related to I/O request processing, such as latency, throughput, or cache hit rates. Using this data, the machine learning module determines an adjusted number of stage and destage tasks. The module is periodically retrained with updated performance information to refine these adjustments. The retrained model then outputs a computed number of stage and destage tasks, which are allocated to staging and destaging operations, respectively. This adaptive approach ensures that task allocation aligns with real-time system demands, improving overall efficiency and performance. The invention may be implemented as a computer program product with executable code for performing these operations.
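The retrain-then-compute cycle described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the patented implementation: the class name LinearTaskModel, the two normalised input features, the learning rate, and the use of a gradient step on squared error as the "retraining" are all hypothetical, since the claim does not specify a model type.

```python
from typing import List, Tuple


class LinearTaskModel:
    """Hypothetical stand-in for the machine learning module (two linear outputs)."""

    def __init__(self, n_features: int, lr: float = 0.1) -> None:
        self.lr = lr
        self.w_stage = [0.0] * n_features
        self.w_destage = [0.0] * n_features

    def predict(self, perf: List[float]) -> Tuple[float, float]:
        """Return the computed (stage, destage) task numbers for the given inputs."""
        stage = sum(w * x for w, x in zip(self.w_stage, perf))
        destage = sum(w * x for w, x in zip(self.w_destage, perf))
        return stage, destage

    def retrain(self, perf: List[float], adj_stage: float, adj_destage: float) -> None:
        """Take one gradient step on squared error toward the adjusted task numbers."""
        stage, destage = self.predict(perf)
        for i, x in enumerate(perf):
            self.w_stage[i] -= self.lr * (stage - adj_stage) * x
            self.w_destage[i] -= self.lr * (destage - adj_destage) * x


# Assumed, normalised storage performance inputs (illustrative values only).
perf_info = [1.0, 0.5]
model = LinearTaskModel(n_features=len(perf_info))

# Retrain repeatedly toward assumed adjusted targets of 8 stage and 4 destage tasks.
for _ in range(200):
    model.retrain(perf_info, adj_stage=8.0, adj_destage=4.0)

# Use the retrained model to produce the computed task numbers to allocate.
computed_stage, computed_destage = model.predict(perf_info)
stage_tasks, destage_tasks = round(computed_stage), round(computed_destage)
```

In this sketch the adjusted numbers act as training targets, and the retrained model's output is what is actually allocated, mirroring the order of operations in the claim.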
2. The computer program product of claim 1 , wherein the determining the adjusted number of stage tasks and the adjusted number of destage tasks and retraining the machine learning module are performed in response to completing one of a staging operation and a destaging operation.
This invention relates to a computer program product for optimizing data movement operations in a storage system using machine learning. The system addresses the problem of inefficient data staging and destaging operations, which can lead to performance bottlenecks and resource underutilization. The invention involves a machine learning module that predicts optimal numbers of stage tasks and destage tasks based on historical performance data. The module is periodically retrained to adapt to changing workload patterns. The retraining and adjustment of task numbers occur dynamically in response to the completion of either a staging operation or a destaging operation, ensuring continuous optimization. The system monitors performance metrics such as latency, throughput, and resource utilization to refine its predictions. By dynamically adjusting task allocation, the invention improves storage system efficiency, reduces latency, and balances workload distribution. The machine learning module may use regression models, neural networks, or other predictive algorithms to analyze historical data and forecast optimal task distributions. The retraining process ensures the model remains accurate over time, adapting to shifts in data access patterns. This approach minimizes manual intervention and enhances overall system responsiveness.
3. The computer program product of claim 1 , wherein the determining the adjusted number of stage tasks and the adjusted number of destage tasks comprises: determining a margin of error of a threshold storage parameter value of a storage parameter and a current value of the storage parameter; determining the adjusted number of stage tasks as a function of the computed number of stage tasks and the margin of error; and determining the adjusted number of destage tasks as a function of the computed number of destage tasks and the margin of error.
This invention relates to data storage systems, specifically optimizing task scheduling for staging and destaging data to improve performance and resource utilization. The problem addressed is balancing stage tasks (moving data from storage to cache) against destage tasks (moving data from cache to storage) so that a monitored storage parameter stays near its threshold. The machine learning module first outputs a computed number of stage tasks and a computed number of destage tasks. To refine these values, the system calculates a margin of error between a threshold value of a storage parameter (e.g., response time or device adaptor bandwidth) and the parameter's current value. The adjusted number of stage tasks is a function of the computed number of stage tasks and this margin of error, and the adjusted number of destage tasks is a function of the computed number of destage tasks and the same margin of error. These adjusted numbers are the targets used when the machine learning module is retrained, so task allocation follows real-time workload changes while avoiding both underutilization and overutilization of system resources.
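The single-margin adjustment can be illustrated with a short Python sketch. The claim only says the adjusted number is "a function of" the computed number and the margin of error; the normalised signed difference and the multiplicative scaling used here are assumptions chosen for illustration, as are the function names.

```python
def margin_of_error(threshold: float, current: float) -> float:
    """Assumed form: signed gap to the threshold, normalised by the threshold."""
    if threshold == 0:
        raise ValueError("threshold must be non-zero")
    return (threshold - current) / threshold


def adjust_tasks(computed: float, margin: float, minimum: int = 1) -> int:
    """Assumed form: scale the computed task number by (1 + margin)."""
    return max(minimum, round(computed * (1.0 + margin)))


# Illustrative values: threshold of 100 units, current value of 80 units.
margin = margin_of_error(threshold=100.0, current=80.0)
adjusted_stage = adjust_tasks(computed=10, margin=margin)
adjusted_destage = adjust_tasks(computed=5, margin=margin)
```

The same margin is applied to both task types, as in the claim; only the computed starting numbers differ.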
4. The computer program product of claim 1 , wherein the operations further comprise: determining a first margin of error of a first threshold storage parameter value of a first storage parameter and a first current value of the first storage parameter; determining a second margin of error of a second threshold storage parameter value of a second storage parameter and a second current value of the second storage parameter, wherein the first and the second storage parameters comprise different performance metrics; determining the adjusted number of stage tasks as a function of the computed number of stage tasks and at least one of the first margin of error and the second margin of error; and determining the adjusted number of destage tasks as a function of the computed number of destage tasks and at least one of the first margin of error and the second margin of error.
This invention relates to data storage systems, specifically optimizing task distribution in storage controllers to balance performance and resource utilization. The problem addressed is ensuring efficient data handling by dynamically adjusting the number of stage and destage tasks based on storage parameter deviations. The system monitors two distinct storage performance metrics, such as latency or throughput, comparing their current values against predefined threshold values. For each metric, a margin of error is calculated, representing the difference between the current value and the threshold. These margins of error are then used to adjust the number of stage tasks (tasks that move data from storage to cache) and destage tasks (tasks that move data from cache to storage). The adjustments are made as a function of the initially computed task numbers and the calculated margins of error, allowing the system to dynamically respond to performance fluctuations. This ensures that storage operations remain within acceptable performance bounds while optimizing resource allocation. The approach helps maintain system stability and efficiency by preventing overloading or underutilization of storage resources.
5. The computer program product of claim 4 , wherein the function performs one of: alternating using the first margin of error and the second margin of error to determine the adjusted number of stage tasks and the adjusted number of destage tasks during different iterations of performing the retraining the machine learning module; and applying both the first margin of error and the second margin of error to the computed number of stage tasks and the computed number of destage tasks to determine the adjusted number of stage tasks and the adjusted number of destage tasks, respectively.
This invention relates to optimizing task allocation in a data storage system using machine learning. Building on claim 4, it specifies how two margins of error are combined with the computed numbers of stage tasks (which move tracks from the storage unit to the cache) and destage tasks (which move tracks from the cache to the storage unit). Each margin of error compares the current value of a storage parameter with its threshold value, and the two storage parameters are different performance metrics. The function that produces the adjusted task numbers works in one of two ways. First, it may alternate between the first and second margins of error across different iterations of retraining the machine learning module, so that a given iteration adjusts the task numbers using only one of the two margins. Second, it may apply both margins of error to the computed number of stage tasks and the computed number of destage tasks. In either case, the adjusted numbers are what the machine learning module is retrained to produce, which steers the model's output toward keeping both performance metrics near their thresholds.
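The two options in claim 5 can be contrasted in a small Python sketch. The names Mode and adjust_pair, the even/odd alternation rule, and the additive combination of the two margins are assumptions for illustration; the claim does not fix any of them.

```python
from enum import Enum
from typing import Tuple


class Mode(Enum):
    ALTERNATE = "alternate"  # use one margin per retraining iteration
    BOTH = "both"            # apply both margins in the same iteration


def adjust_pair(
    computed_stage: int,
    computed_destage: int,
    first_margin: float,
    second_margin: float,
    mode: Mode,
    iteration: int,
) -> Tuple[int, int]:
    """Return (adjusted stage tasks, adjusted destage tasks)."""
    if mode is Mode.ALTERNATE:
        # Assumed rule: even iterations use the first margin, odd use the second.
        factor = 1.0 + (first_margin if iteration % 2 == 0 else second_margin)
    else:
        # Assumed rule: the two margins are combined additively.
        factor = 1.0 + first_margin + second_margin
    return (
        max(1, round(computed_stage * factor)),
        max(1, round(computed_destage * factor)),
    )


even_result = adjust_pair(10, 20, 0.2, -0.1, Mode.ALTERNATE, iteration=0)
odd_result = adjust_pair(10, 20, 0.2, -0.1, Mode.ALTERNATE, iteration=1)
both_result = adjust_pair(10, 20, 0.2, -0.1, Mode.BOTH, iteration=0)
```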
6. The computer program product of claim 4 , wherein a device adaptor transfers data between the storage unit and the cache, wherein the first storage parameter comprises device adaptor bandwidth, wherein the first threshold storage parameter value comprises an optimum adaptor bandwidth for the device adaptor and wherein the first current value of the first storage parameter comprises a current adaptor bandwidth of the device adaptor, wherein the second storage parameter comprises a response time of I/O requests to tracks in the storage unit, wherein the second threshold storage parameter value comprises a maximum acceptable response time for I/O requests and wherein the second current value of the second storage parameter comprises a current response time.
This invention relates to optimizing data transfer between a storage unit and a cache in a computing system. The problem addressed is inefficient data handling due to suboptimal device adaptor performance and excessive I/O request response times, which can degrade system performance. Building on claim 4, this claim names the two storage parameters whose margins of error adjust the stage and destage task numbers. A device adaptor transfers data between the storage unit and the cache. The first parameter is device adaptor bandwidth: its threshold value is an optimum adaptor bandwidth for the device adaptor, and its current value is the adaptor's current bandwidth. The second parameter is the response time of I/O requests to tracks in the storage unit: its threshold value is a maximum acceptable response time, and its current value is the current response time. Because the adjusted task numbers depend on how far each current value is from its threshold, task allocation balances data transfer speed through the adaptor against I/O responsiveness.
7. The computer program product of claim 6 , wherein the function uses the first margin of error and the second margin of error to increase the computed number of stage tasks and the computed number of destage tasks such that the adjusted number of stage tasks and the adjusted number of destage tasks are greater than the computed number of stage tasks and the computed number of destage tasks, respectively, in response to the first margin of error and the second margin of error being positive, wherein the function uses the first margin of error and the second margin of error to decrease the computed number of stage tasks and the computed number of destage tasks such that the adjusted number of stage tasks and the adjusted number of destage tasks are less than the computed number of stage tasks and the computed number of destage tasks, respectively, in response to the first margin of error and the second margin of error being negative.
The invention relates to data storage systems, specifically optimizing task scheduling for staging and destaging data between a storage unit and a cache. Building on claim 6, it specifies the direction of the adjustment. The machine learning module outputs a computed number of stage tasks (moving tracks from the storage unit to the cache) and a computed number of destage tasks (moving tracks from the cache to the storage unit). The first margin of error compares the current device adaptor bandwidth with the optimum adaptor bandwidth, and the second compares the current I/O response time with the maximum acceptable response time. When both margins of error are positive, the function increases both computed numbers, so the adjusted numbers of stage and destage tasks are greater than the computed numbers. When both margins are negative, the function decreases both computed numbers, so the adjusted numbers are less than the computed numbers. The claim does not state what happens when the two margins have opposite signs. In effect, the sign of the margins determines whether tasks are added or removed, which helps reduce bottlenecks and improve resource utilization.
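The sign rule of claim 7 can be sketched in Python using the two parameters named in claim 6 (device adaptor bandwidth and I/O response time). The normalised margin formula, the step size, and leaving the number unchanged for mixed signs are assumptions for illustration; the function names are hypothetical, and the computed number is assumed to be at least 1.

```python
def normalised_margin(threshold: float, current: float) -> float:
    """Assumed form: positive when the current value is below its threshold."""
    return (threshold - current) / threshold


def adjust_by_sign(computed: int, bw_margin: float, rt_margin: float) -> int:
    """Increase when both margins are positive, decrease when both are negative."""
    total = bw_margin + rt_margin
    if bw_margin > 0 and rt_margin > 0:
        # Step of at least 1 so the adjusted number is strictly greater.
        return computed + max(1, round(computed * total))
    if bw_margin < 0 and rt_margin < 0:
        # Step of at least 1 so the adjusted number is strictly smaller.
        return max(0, computed - max(1, round(computed * -total)))
    # Mixed or zero margins: not covered by the claim, left unchanged here.
    return computed


# Illustrative values: optimum bandwidth 1000, maximum response time 10.
headroom = adjust_by_sign(
    10, normalised_margin(1000.0, 800.0), normalised_margin(10.0, 8.0)
)
overloaded = adjust_by_sign(
    10, normalised_margin(1000.0, 1200.0), normalised_margin(10.0, 12.0)
)
mixed = adjust_by_sign(
    10, normalised_margin(1000.0, 800.0), normalised_margin(10.0, 12.0)
)
```

The same function would be called once for the computed number of stage tasks and once for the computed number of destage tasks.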
8. The computer program product of claim 1 , wherein the storage unit is configured as Redundant Array of Independent Disk (RAID) ranks, wherein each of the RAID ranks is comprised of storage devices, wherein there is storage performance information for each of the RAID ranks, wherein the adjusted number of stage tasks and the adjusted number of destage tasks are determined separately for each of the RAID ranks, and wherein the machine learning module comprises one of at least one machine learning module that is retrained, for each RAID rank of the RAID ranks, with the storage performance information for the RAID rank to produce the adjusted number of stage tasks and the adjusted number of destage tasks for the RAID rank.
This invention relates to optimizing storage performance in a Redundant Array of Independent Disk (RAID) system. The problem addressed is efficiently managing data staging and destaging operations to improve overall storage performance, particularly in environments where multiple RAID ranks are used. Each RAID rank consists of storage devices, and performance metrics are tracked for each rank. The system uses a machine learning module to dynamically adjust the number of staging and destaging tasks based on the specific performance characteristics of each RAID rank. The machine learning module is retrained individually for each RAID rank using its performance data, allowing for tailored optimization. This approach ensures that staging and destaging operations are balanced according to the unique performance needs of each RAID rank, enhancing overall system efficiency and responsiveness. The solution leverages machine learning to adapt to changing workloads and storage conditions, providing a more intelligent and responsive storage management system.
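One way to picture per-rank retraining is a registry that holds a separate model for each RAID rank. The Python sketch below is an assumption-laden illustration: RankModel is a placeholder that merely remembers its last adjusted targets rather than a real learning algorithm, and the names PerRankAllocator and update_rank are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


class RankModel:
    """Placeholder model: remembers the last adjusted targets it was retrained on."""

    def __init__(self) -> None:
        self.stage, self.destage = 1.0, 1.0  # assumed initial task numbers

    def retrain(self, perf: List[float], adj_stage: float, adj_destage: float) -> None:
        # A real module would fit perf -> targets; this stub only stores the targets.
        self.stage, self.destage = adj_stage, adj_destage

    def predict(self, perf: List[float]) -> Tuple[float, float]:
        return self.stage, self.destage


class PerRankAllocator:
    """Keeps one model per RAID rank so each rank is tuned on its own data."""

    def __init__(self, rank_ids: Iterable[str]) -> None:
        self.models: Dict[str, RankModel] = {rid: RankModel() for rid in rank_ids}

    def update_rank(
        self, rank_id: str, perf: List[float], adj_stage: float, adj_destage: float
    ) -> Tuple[int, int]:
        """Retrain only this rank's model, then return its computed task numbers."""
        model = self.models[rank_id]
        model.retrain(perf, adj_stage, adj_destage)
        stage, destage = model.predict(perf)
        return round(stage), round(destage)


allocator = PerRankAllocator(["rank0", "rank1"])
rank0_tasks = allocator.update_rank("rank0", [0.3], adj_stage=6.0, adj_destage=3.0)
rank1_tasks = allocator.models["rank1"].predict([0.3])
```

Retraining rank0 leaves rank1's model untouched, which is the separation the claim describes.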
9. A system for allocating tasks to destaging and staging operations with respect to a storage unit, comprising: a processor; a cache implemented in at least one memory device; a machine learning module; and a computer readable storage medium having computer readable program code embodied therein that when executed by the processor performs operations, the operations comprising: providing the machine learning module that receives as input storage performance information related to processing of Input/Output (I/O) requests with respect to the storage unit; determining an adjusted number of stage tasks; determining an adjusted number of destage tasks; retraining the machine learning module with the storage performance information to produce the adjusted number of stage tasks and the adjusted number of destage tasks; and using the retrained machine learning module to produce a computed number of stage tasks to allocate to staging operations and a computed number of destage tasks to allocate to destaging operations.
A system optimizes task allocation for staging and destaging operations in a storage system to improve I/O performance. The system includes a processor, a cache, a machine learning module, and a storage medium with executable code. The machine learning module analyzes storage performance data, such as I/O request processing metrics, to dynamically adjust the number of staging and destaging tasks. The system periodically retrains the machine learning model using updated performance information to refine task allocation decisions. The retrained model then generates optimized task counts for staging (retrieving data from storage to cache) and destaging (writing data from cache to storage). This adaptive approach ensures efficient resource utilization and minimizes latency by balancing the workload between staging and destaging operations based on real-time performance conditions. The system automates task allocation decisions, reducing manual tuning and improving overall storage system responsiveness.
10. The system of claim 9 , wherein the operations further comprise: determining a first margin of error of a first threshold storage parameter value of a first storage parameter and a first current value of the first storage parameter; determining a second margin of error of a second threshold storage parameter value of a second storage parameter and a second current value of the second storage parameter, wherein the first and the second storage parameters comprise different performance metrics; determining the adjusted number of stage tasks as a function of the computed number of stage tasks and at least one of the first margin of error and the second margin of error; and determining the adjusted number of destage tasks as a function of the computed number of destage tasks and at least one of the first margin of error and the second margin of error.
The invention relates to a data storage system that dynamically adjusts task allocation based on storage performance metrics. The system monitors multiple storage parameters, such as latency, throughput, or error rates, to optimize task distribution between stage tasks (data retrieval operations) and destage tasks (data write operations). The system calculates a margin of error for each storage parameter by comparing its current value to a predefined threshold. These margins of error are used to adjust the number of stage and destage tasks dynamically. For example, if a storage parameter indicates degraded performance, the system may increase the number of tasks allocated to mitigate the issue. The system ensures balanced resource utilization by considering different performance metrics, allowing for real-time adjustments to maintain optimal storage efficiency and reliability. This approach prevents bottlenecks and improves overall system responsiveness by dynamically adapting task allocation based on real-time performance conditions.
11. The system of claim 10 , further comprising: a device adaptor to transfer data between the storage unit and the cache, wherein the first storage parameter comprises device adaptor bandwidth, wherein the first threshold storage parameter value comprises an optimum adaptor bandwidth for the device adaptor and wherein the first current value of the first storage parameter comprises a current adaptor bandwidth of the device adaptor, wherein the second storage parameter comprises a response time of I/O requests to tracks in the storage unit, wherein the second threshold storage parameter value comprises a maximum acceptable response time for I/O requests and wherein the second current value of the second storage parameter comprises a current response time.
A data storage system adjusts stage and destage task allocation by comparing two storage parameters against threshold values. The system includes a storage unit, a cache, and a device adaptor that transfers data between them. The first parameter is device adaptor bandwidth: the system compares the current adaptor bandwidth against an optimum adaptor bandwidth for the device adaptor. The second parameter is the response time of input/output (I/O) requests to tracks in the storage unit: the system compares the current response time against a maximum acceptable response time. Under claim 10, the margin of error for each parameter is used to determine the adjusted number of stage tasks and the adjusted number of destage tasks. Tying the adjustment to these two metrics keeps data transfer through the device adaptor efficient while holding I/O response time within acceptable limits.
12. A method for allocating tasks to stage tracks from a storage unit to a cache and destage tracks from the cache to the storage unit, comprising: providing a machine learning module that receives as input storage performance information related to processing of Input/Output (I/O) requests with respect to the storage unit; determining an adjusted number of stage tasks; determining an adjusted number of destage tasks; retraining the machine learning module with the storage performance information to produce the adjusted number of stage tasks and the adjusted number of destage tasks; and using the retrained machine learning module to produce a computed number of stage tasks to allocate to staging operations and a computed number of destage tasks to allocate to destaging operations.
This invention relates to optimizing task allocation in data storage systems, specifically balancing the staging (moving data from storage to cache) and destaging (moving data from cache to storage) operations to improve performance. The problem addressed is inefficient task allocation, which can lead to bottlenecks, increased latency, or resource underutilization in storage systems handling Input/Output (I/O) requests. The method uses a machine learning module trained on storage performance data, such as I/O request patterns, latency metrics, and cache hit rates. The module dynamically adjusts the number of staging and destaging tasks based on real-time performance information. The system periodically retrains the machine learning model with updated performance data to refine task allocation decisions. The retrained model then outputs optimized task counts for staging and destaging operations, ensuring efficient resource utilization and reduced latency. By continuously adapting to changing workload conditions, the system improves overall storage performance, minimizes delays, and balances the load between cache and storage units. The approach leverages predictive analytics to automate task allocation, reducing manual intervention and enhancing system responsiveness.
13. The method of claim 12 , further comprising: determining a first margin of error of a first threshold storage parameter value of a first storage parameter and a first current value of the first storage parameter; determining a second margin of error of a second threshold storage parameter value of a second storage parameter and a second current value of the second storage parameter, wherein the first and the second storage parameters comprise different performance metrics; determining the adjusted number of stage tasks as a function of the computed number of stage tasks and at least one of the first margin of error and the second margin of error; and determining the adjusted number of destage tasks as a function of the computed number of destage tasks and at least one of the first margin of error and the second margin of error.
This invention relates to data storage systems, specifically optimizing task allocation for data staging and destaging operations. The problem addressed is efficiently managing storage performance by dynamically adjusting task distribution based on real-time storage parameter deviations. The method monitors two storage parameters that represent different performance metrics, such as response time and bandwidth. For each parameter, it calculates a margin of error between the current value and a predefined threshold value. The machine learning module supplies a computed number of stage tasks (which move data from the storage unit to the cache) and a computed number of destage tasks (which move data from the cache to the storage unit). The method then determines the adjusted number of each as a function of the computed number and at least one of the two margins of error. This adjustment aligns task distribution with current storage performance conditions, helping to prevent bottlenecks and inefficiencies.
14. The method of claim 13 , wherein a device adaptor transfers data between the storage unit and the cache, wherein the first storage parameter comprises device adaptor bandwidth, wherein the first threshold storage parameter value comprises an optimum adaptor bandwidth for the device adaptor and wherein the first current value of the first storage parameter comprises a current adaptor bandwidth of the device adaptor, wherein the second storage parameter comprises a response time of I/O requests to tracks in the storage unit, wherein the second threshold storage parameter value comprises a maximum acceptable response time for I/O requests and wherein the second current value of the second storage parameter comprises a current response time.
This invention relates to data storage systems, specifically optimizing performance by monitoring and adjusting storage parameters. The problem addressed is inefficient data transfer between storage units and caches, leading to degraded system performance. A device adaptor transfers data between a storage unit and a cache, and performance is evaluated on two parameters: adaptor bandwidth and I/O request response time. The current adaptor bandwidth is compared to an optimum adaptor bandwidth for the device adaptor. The current response time of I/O requests to tracks in the storage unit is compared to a maximum acceptable response time. Under claim 13, the margin of error for each parameter feeds into the adjusted numbers of stage and destage tasks, so a deviation from either threshold changes how many tasks are allocated. This approach aims to keep bandwidth utilization near its optimum while ensuring I/O operations meet the response time limit.
15. The computer program product of claim 1 , wherein the storage unit comprises a Redundant Array of Independent Disk (RAID) rank of a plurality of RAID ranks, wherein each of the RAID ranks is comprised of storage devices, and wherein there is storage performance information for each of the RAID ranks, wherein a non-volatile storage (NVS) unit stores modified data in the cache, wherein the storage performance information for each RAID rank of the RAID ranks comprises at least one of: number of tasks queued for staging operations; speed of at least one storage device in which the RAID rank is stored; overall NVS usage of tracks from all the RAID ranks; NVS usage by the RAID rank; maximum NVS usage allowed for the RAID rank; RAID rank response time for processing I/O requests with respect to the storage unit; device adaptor bandwidth utilized in a device adaptor that transfers data between the cache and the storage unit; maximum device adaptor bandwidth that is available for transfer of data; and optimum device adaptor bandwidth.
This invention relates to a computer program product for managing data storage in a system with a Redundant Array of Independent Disks (RAID) configuration. The system includes multiple RAID ranks, each composed of storage devices, and a non-volatile storage (NVS) unit that stores modified data in a cache. The invention addresses the challenge of optimizing storage performance by tracking and utilizing storage performance information for each RAID rank. This information includes metrics such as the number of tasks queued for staging operations, the speed of storage devices within a RAID rank, overall NVS usage across all RAID ranks, NVS usage specific to a RAID rank, the maximum allowed NVS usage for a RAID rank, RAID rank response time for processing I/O requests, device adaptor bandwidth utilized for data transfer between the cache and storage, the maximum available device adaptor bandwidth, and the optimum device adaptor bandwidth. By monitoring these metrics, the system can dynamically adjust storage operations to improve efficiency and performance. The invention ensures that storage resources are allocated optimally, reducing bottlenecks and enhancing overall system responsiveness.
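The per-rank metrics listed above could be gathered into a single record before being passed to the machine learning module. The Python sketch below assumes all nine metrics are available, although the claim requires only at least one of them; the class name, field names, and units are hypothetical.

```python
from dataclasses import astuple, dataclass
from typing import List


@dataclass(frozen=True)
class RankPerformanceInfo:
    """Hypothetical container for one RAID rank's storage performance information."""

    queued_stage_tasks: int           # tasks queued for staging operations
    device_speed: float               # speed of a storage device in the rank
    overall_nvs_usage: float          # NVS usage by tracks from all ranks
    rank_nvs_usage: float             # NVS usage by this rank
    max_rank_nvs_usage: float         # maximum NVS usage allowed for this rank
    rank_response_time: float         # response time for I/O requests
    adaptor_bandwidth_used: float     # device adaptor bandwidth in use
    max_adaptor_bandwidth: float      # maximum available adaptor bandwidth
    optimum_adaptor_bandwidth: float  # optimum adaptor bandwidth

    def to_features(self) -> List[float]:
        """Flatten the record, in field order, into a model input vector."""
        return [float(value) for value in astuple(self)]


# Illustrative values only; units are assumptions.
info = RankPerformanceInfo(12, 7200.0, 0.55, 0.10, 0.25, 4.2, 800.0, 1600.0, 1000.0)
features = info.to_features()
```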
16. The system of claim 9 , wherein the determining the adjusted number of stage tasks and the adjusted number of destage tasks and retraining the machine learning module are performed in response to completing one of a staging operation and a destaging operation.
A system for managing staging and destaging operations in a storage system specifies when task allocation is recalculated. The system includes a machine learning module that takes storage performance information as input and outputs the number of stage tasks and destage tasks to allocate. Under this claim, determining the adjusted number of stage tasks, determining the adjusted number of destage tasks, and retraining the machine learning module are all performed in response to completing either a staging operation or a destaging operation. Each completed operation therefore triggers a recalculation, so the model is retrained on current performance information and task allocation follows changing workload conditions without manual tuning.
17. The system of claim 9 , wherein the determining the adjusted number of stage tasks and the adjusted number of destage tasks comprises: determining a margin of error of a threshold storage parameter value of a storage parameter and a current value of the storage parameter; determining the adjusted number of stage tasks as a function of the computed number of stage tasks and the margin of error; and determining the adjusted number of destage tasks as a function of the computed number of destage tasks and the margin of error.
The invention relates to a storage system that adjusts the number of staging and destaging tasks to balance performance and resource use. Staging tasks move tracks from the storage unit into the cache, and destaging tasks write tracks from the cache back to the storage unit. Too many or too few of either can overload the system or leave resources idle. The system takes a storage parameter (for example, response time), compares its current value to a threshold value, and determines the margin of error between the two. The adjusted number of stage tasks is then a function of the stage task count computed by the machine learning module and this margin of error, and the adjusted number of destage tasks is a function of the computed destage task count and the same margin of error. These adjusted numbers are what the machine learning module is retrained to produce.
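As a worked illustration of the margin-of-error step, the sketch below assumes the margin is the threshold minus the current value and that the adjustment scales the computed count in proportion to the relative margin. Both choices, and the `gain` parameter, are assumptions; the claim only requires that the adjusted count be some function of the computed count and the margin.

```python
def margin_of_error(threshold: float, current: float) -> float:
    """Signed gap between threshold and current value (assumed convention)."""
    return threshold - current


def adjust_task_count(computed: int, margin: float, threshold: float,
                      gain: float = 1.0) -> int:
    """Scale the computed count by the relative margin (hypothetical rule)."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    scaled = computed * (1.0 + gain * margin / threshold)
    return max(0, round(scaled))   # a task count cannot be negative


# Response time threshold of 10 ms, module computed 20 tasks of each kind.
headroom = margin_of_error(threshold=10.0, current=8.0)     # 2.0
overshoot = margin_of_error(threshold=10.0, current=12.0)   # -2.0
more_tasks = adjust_task_count(20, headroom, threshold=10.0)
fewer_tasks = adjust_task_count(20, overshoot, threshold=10.0)
```

The same margin is applied to both the stage count and the destage count, matching the claim language.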
18. The system of claim 10 , wherein the function performs one of: alternating using the first margin of error and the second margin of error to determine the adjusted number of stage tasks and the adjusted number of destage tasks during different iterations of performing the retraining the machine learning module; and applying both the first margin of error and the second margin of error to the computed number of stage tasks and the computed number of destage tasks to determine the adjusted number of stage tasks and the adjusted number of destage tasks, respectively.
The system relates to allocating stage and destage tasks in a data storage system using machine learning. Staging tasks move data from storage to cache, while destaging tasks move data from cache back to storage. A machine learning module computes a number of stage tasks and a number of destage tasks, and a function then adjusts those computed numbers using two margins of error. The function works in one of two ways. It can alternate between the first margin of error and the second margin of error across different iterations of retraining the machine learning module, so each iteration uses only one margin. Alternatively, it can apply both margins of error to the computed numbers of stage and destage tasks in the same iteration. Either way, the result is the adjusted number of stage tasks and the adjusted number of destage tasks used to retrain the module.
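The two modes can be sketched as a selector that returns the margins to use in a given retraining iteration, followed by a multiplicative update. Treating the margins as normalized fractions, switching on even and odd iterations, and the `gain` value are all assumptions made for illustration.

```python
from typing import Tuple


def select_margins(first: float, second: float,
                   iteration: int, mode: str) -> Tuple[float, ...]:
    """Pick which margins of error apply in this retraining iteration."""
    if mode == "alternate":
        # Even iterations use the first margin, odd ones the second.
        return (first,) if iteration % 2 == 0 else (second,)
    if mode == "both":
        return (first, second)
    raise ValueError(f"unknown mode: {mode!r}")


def apply_margins(computed: int, margins: Tuple[float, ...],
                  gain: float = 0.1) -> int:
    """Scale the computed count once per selected margin (assumed rule)."""
    value = float(computed)
    for margin in margins:
        value *= 1.0 + gain * margin
    return max(0, round(value))


even = select_margins(0.5, -0.5, iteration=0, mode="alternate")
odd = select_margins(0.5, -0.5, iteration=1, mode="alternate")
both = select_margins(0.5, -0.5, iteration=0, mode="both")
adjusted_even = apply_margins(100, even)   # uses only the first margin
adjusted_both = apply_margins(100, both)   # uses both margins together
```

In a real system the same selection would be applied to the stage count and the destage count in each iteration.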
19. The system of claim 11 , wherein the function uses the first margin of error and the second margin of error to increase the computed number of stage tasks and the computed number of destage tasks such that the adjusted number of stage tasks and the adjusted number of destage tasks are greater than the computed number of stage tasks and the computed number of destage tasks, respectively, in response to the first margin of error and the second margin of error being positive, wherein the function uses the first margin of error and the second margin of error to decrease the computed number of stage tasks and the computed number of destage tasks such that the adjusted number of stage tasks and the adjusted number of destage tasks are less than the computed number of stage tasks and the computed number of destage tasks, respectively, in response to the first margin of error and the second margin of error being negative.
The invention relates to a data storage system that adjusts the number of staging and destaging tasks based on two margins of error. The machine learning module computes a number of stage tasks for moving data from the storage unit to the cache and a number of destage tasks for moving data from the cache back to the storage unit. A function then applies the first and second margins of error to those computed numbers. When both margins of error are positive, the function increases the counts, so the adjusted numbers of stage and destage tasks are greater than the computed numbers. When both margins of error are negative, the function decreases the counts, so the adjusted numbers are less than the computed numbers. The claim fixes only the direction of the adjustment; it does not fix how large the adjustment is.
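One practical wrinkle with integer task counts is that a small margin can round back to the computed number, which would not satisfy "greater than" or "less than". The sketch below shows one way to guarantee the direction; the scaling rule, the `gain` value, and the choice to leave the count unchanged for mixed-sign margins are assumptions, since the claim does not address them.

```python
import math


def adjust_strict(computed: int, first: float, second: float,
                  gain: float = 0.1) -> int:
    """Adjust so the result is strictly above or below the computed count."""
    scaled = computed * (1.0 + gain * (first + second))
    if first > 0 and second > 0:
        # Round up and move by at least one task.
        return max(computed + 1, math.ceil(scaled))
    if first < 0 and second < 0:
        # Round down and move by at least one task, but never below zero.
        return max(0, min(computed - 1, math.floor(scaled)))
    # Mixed or zero margins: not covered by the claim; left unchanged here.
    return computed


up = adjust_strict(10, 0.01, 0.01)      # tiny positive margins still add a task
down = adjust_strict(10, -0.01, -0.01)  # tiny negative margins still remove one
same = adjust_strict(10, 0.5, -0.5)
```

A computed count of zero cannot be decreased further, so the floor at zero is the one case where the sketch cannot produce a strictly smaller number.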
20. The system of claim 9 , wherein the storage unit is configured as Redundant Array of Independent Disk (RAID) ranks, wherein each of the RAID ranks is comprised of storage devices, wherein there is storage performance information for each of the RAID ranks, wherein the adjusted number of stage tasks and the adjusted number of destage tasks are determined separately for each of the RAID ranks, and wherein the machine learning module comprises one of at least one machine learning module that is retrained, for each RAID rank of the RAID ranks, with the storage performance information for the RAID rank to produce the adjusted number of stage tasks and the adjusted number of destage tasks for the RAID rank.
A storage system manages data using a Redundant Array of Independent Disks (RAID) configuration, where multiple RAID ranks are formed from storage devices. Storage performance information is kept for each RAID rank. The system determines the adjusted number of stage tasks (which move data from the storage unit to the cache) and the adjusted number of destage tasks (which move data from the cache to the storage unit) separately for each rank. At least one machine learning module is retrained, for each RAID rank, with that rank's storage performance information so that it produces the adjusted numbers of stage and destage tasks for that rank. Task allocation is therefore tailored to the conditions of each RAID rank instead of being set once for the whole storage unit.
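One reading of the per-rank arrangement is a separate module for each RAID rank, each retrained only with its own rank's data. The sketch below uses a tiny linear model with a gradient-style update as a stand-in; the patent text shown here does not specify the model type, and `TinyLinearModule`, `module_for`, and the rank identifiers are hypothetical.

```python
from typing import Dict, List, Tuple


class TinyLinearModule:
    """Stand-in for a machine learning module with two linear outputs."""

    def __init__(self, n_features: int, learning_rate: float = 0.01) -> None:
        self.w_stage = [0.0] * n_features
        self.w_destage = [0.0] * n_features
        self.learning_rate = learning_rate

    def compute(self, x: List[float]) -> Tuple[float, float]:
        """Return the computed (stage, destage) task numbers for input x."""
        stage = sum(w * v for w, v in zip(self.w_stage, x))
        destage = sum(w * v for w, v in zip(self.w_destage, x))
        return stage, destage

    def retrain(self, x: List[float],
                adjusted_stage: float, adjusted_destage: float) -> None:
        """Nudge the weights so the outputs move toward the adjusted numbers."""
        stage, destage = self.compute(x)
        for i, v in enumerate(x):
            self.w_stage[i] += self.learning_rate * (adjusted_stage - stage) * v
            self.w_destage[i] += self.learning_rate * (adjusted_destage - destage) * v


modules: Dict[str, TinyLinearModule] = {}


def module_for(rank_id: str, n_features: int) -> TinyLinearModule:
    """Return the module for a rank, creating it on first use."""
    if rank_id not in modules:
        modules[rank_id] = TinyLinearModule(n_features)
    return modules[rank_id]


x = [1.0, 0.5]                       # toy performance features for one rank
rank0 = module_for("rank0", 2)
before = rank0.compute(x)
rank0.retrain(x, adjusted_stage=8.0, adjusted_destage=4.0)
after = rank0.compute(x)
untouched = module_for("rank1", 2).compute(x)
```

Retraining the module for one rank leaves the module for another rank unchanged, which is the isolation the per-rank wording describes.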
21. The method of claim 12 , wherein the storage unit comprises a Redundant Array of Independent Disk (RAID) rank of a plurality of RAID ranks, wherein each of the RAID ranks is comprised of storage devices, and wherein there is storage performance information for each of the RAID ranks, wherein a non-volatile storage (NVS) unit stores modified data in the cache, wherein the storage performance information for each RAID rank of the RAID ranks comprises at least one of: number of tasks queued for staging operations; speed of at least one storage device in which the RAID rank is stored; overall NVS usage of tracks from all the RAID ranks; NVS usage by the RAID rank; maximum NVS usage allowed for the RAID rank; RAID rank response time for processing I/O requests with respect to the storage unit; device adaptor bandwidth utilized in a device adaptor that transfers data between the cache and the storage unit; maximum device adaptor bandwidth that is available for transfer of data; and optimum device adaptor bandwidth.
This invention relates to data storage systems, specifically the storage performance information used for a storage unit in a Redundant Array of Independent Disks (RAID) configuration. The storage unit is one RAID rank among multiple RAID ranks, each composed of storage devices, and storage performance information is kept for each rank. That information comprises at least one of: queued staging tasks, storage device speed, Non-Volatile Storage (NVS) usage (both overall and per RAID rank), the maximum NVS usage allowed for the rank, RAID rank response time, device adaptor bandwidth utilization, maximum available bandwidth, and optimum bandwidth. The NVS unit stores the modified data held in the cache, which is later destaged to the RAID ranks. This performance information is the input to the machine learning module that computes the numbers of stage and destage tasks.
22. The method of claim 12 , wherein the determining the adjusted number of stage tasks and the adjusted number of destage tasks and retraining the machine learning module are performed in response to completing one of a staging operation and a destaging operation.
A method for allocating staging and destaging tasks in a storage system specifies when adjustment and retraining occur. The storage system uses a machine learning module to compute the number of stage and destage tasks, and those computed numbers may need correction over time. After a staging operation or a destaging operation completes, the method determines an adjusted number of stage tasks and an adjusted number of destage tasks and retrains the machine learning module with the storage performance information so that it produces those adjusted numbers. Retraining on each completion lets the module follow changing workload patterns instead of relying on a static task allocation.
23. The method of claim 12 , wherein the determining the adjusted number of stage tasks and the adjusted number of destage tasks comprises: determining a margin of error of a threshold storage parameter value of a storage parameter and a current value of the storage parameter; determining the adjusted number of stage tasks as a function of the computed number of stage tasks and the margin of error; and determining the adjusted number of destage tasks as a function of the computed number of destage tasks and the margin of error.
This invention relates to data storage systems, specifically adjusting the number of stage and destage tasks in response to a measured storage parameter. Stage tasks move data from the storage unit to the cache, and destage tasks move data from the cache to the storage unit. The method determines a margin of error between a threshold value of a storage parameter and the parameter's current value. The adjusted number of stage tasks is determined as a function of the computed number of stage tasks and this margin of error. Similarly, the adjusted number of destage tasks is determined as a function of the computed number of destage tasks and the same margin of error. Because the margin reflects how far the parameter currently sits from its threshold, the task counts respond to changes in storage performance.
24. The method of claim 13 , wherein the function performs one of: alternating using the first margin of error and the second margin of error to determine the adjusted number of stage tasks and the adjusted number of destage tasks during different iterations of performing the retraining the machine learning module; and applying both the first margin of error and the second margin of error to the computed number of stage tasks and the computed number of destage tasks to determine the adjusted number of stage tasks and the adjusted number of destage tasks, respectively.
This invention relates to allocating stage and destage tasks in a storage system, specifically how two margins of error are used to adjust the task counts that a machine learning module computes. A function determines the adjusted number of stage tasks and the adjusted number of destage tasks in one of two ways. The first approach alternates between a first margin of error and a second margin of error in different iterations of retraining the machine learning module, so a given iteration uses only one of the two margins. The second approach applies both margins of error to the computed number of stage tasks and to the computed number of destage tasks to produce the respective adjusted numbers. In both approaches the adjusted numbers are the targets used when the machine learning module is retrained.
25. The method of claim 24 , wherein the function uses the first margin of error and the second margin of error to increase the computed number of stage tasks and the computed number of destage tasks such that the adjusted number of stage tasks and the adjusted number of destage tasks are greater than the computed number of stage tasks and the computed number of destage tasks, respectively, in response to the first margin of error and the second margin of error being positive, wherein the function uses the first margin of error and the second margin of error to decrease the computed number of stage tasks and the computed number of destage tasks such that the adjusted number of stage tasks and the adjusted number of destage tasks are less than the computed number of stage tasks and the computed number of destage tasks, respectively, in response to the first margin of error and the second margin of error being negative.
This invention relates to data storage systems, specifically a method for adjusting the number of stage and destage tasks in the direction indicated by two margins of error. A machine learning module computes a number of stage tasks and a number of destage tasks from storage performance information. A function then applies a first margin of error and a second margin of error to those computed numbers. If both margins are positive, the function increases the counts, so the adjusted numbers of stage and destage tasks are greater than the computed numbers. Conversely, if both margins are negative, the function decreases the counts, so the adjusted numbers are less than the computed numbers. The claim specifies the direction of the adjustment and leaves its magnitude to the function.
26. The method of claim 12 , wherein the storage unit is configured as Redundant Array of Independent Disk (RAID) ranks, wherein each of the RAID ranks is comprised of storage devices, wherein there is storage performance information for each of the RAID ranks, wherein the adjusted number of stage tasks and the adjusted number of destage tasks are determined separately for each of the RAID ranks, and wherein the machine learning module comprises one of at least one machine learning module that is retrained, for each RAID rank of the RAID ranks, with the storage performance information for the RAID rank to produce the adjusted number of stage tasks and the adjusted number of destage tasks for the RAID rank.
This invention relates to allocating stage and destage tasks in a Redundant Array of Independent Disk (RAID) system. The storage unit is configured as multiple RAID ranks, each composed of storage devices, with storage performance information kept for each rank. The adjusted number of stage tasks (moving data from the storage unit to the cache) and the adjusted number of destage tasks (moving data from the cache to the storage unit) are determined separately for each RAID rank. At least one machine learning module is retrained, for each RAID rank, with that rank's storage performance information to produce the adjusted numbers for that rank. This gives each RAID rank a task allocation based on its own operating conditions.
27. The method of claim 12 , wherein the storage unit comprises a Redundant Array of Independent Disk (RAID) rank of a plurality of RAID ranks, wherein each of the RAID ranks is comprised of storage devices, and wherein there is storage performance information for each of the RAID ranks, wherein a non-volatile storage (NVS) unit stores modified data in the cache, wherein the storage performance information for each RAID rank of the RAID ranks comprises at least one of: number of tasks queued for staging operations; speed of at least one storage device in which the RAID rank is stored; overall NVS usage of tracks from all the RAID ranks; NVS usage by the RAID rank; maximum NVS usage allowed for the RAID rank; RAID rank response time for processing I/O requests with respect to the storage unit; device adaptor bandwidth utilized in a device adaptor that transfers data between the cache and the storage unit; maximum device adaptor bandwidth that is available for transfer of data; and optimum device adaptor bandwidth.
This invention relates to data storage systems, specifically optimizing storage performance in a Redundant Array of Independent Disks (RAID) configuration. The problem addressed is efficiently managing data staging operations and I/O request processing across multiple RAID ranks to improve overall system performance. The system includes a storage unit with multiple RAID ranks, each composed of storage devices. Each RAID rank has associated storage performance information, which may include metrics such as the number of tasks queued for staging operations, the speed of storage devices within the rank, overall and rank-specific NVS (Non-Volatile Storage) usage, maximum allowed NVS usage per rank, RAID rank response time for I/O requests, device adaptor bandwidth utilization, maximum available bandwidth, and optimum bandwidth for data transfer between the cache and storage unit. An NVS unit stores modified data in the cache, and the system monitors these performance metrics to optimize data handling. By tracking these parameters, the system can dynamically adjust operations to balance workloads, prevent bottlenecks, and ensure efficient data staging and retrieval across the RAID ranks. This approach enhances storage system responsiveness and throughput by leveraging real-time performance data to guide decision-making.
September 20, 2018
March 22, 2022