US-11281497

Determining an allocation of stage and destage tasks by training a machine learning module

Published: March 22, 2022

Assignee: not available in the USPTO data

Inventors: not available in the USPTO data

Technical Abstract

Provided are a computer program product, system, and method for using a machine learning module to determine an allocation of stage and destage tasks. Storage performance information related to processing of Input/Output (I/O) requests with respect to a storage unit is provided to a machine learning module. The machine learning module outputs a computed number of stage tasks and a computed number of destage tasks. A current number of stage tasks allocated to stage tracks from the storage unit to a cache is adjusted based on the computed number of stage tasks. A current number of destage tasks allocated to destage tracks from the cache to the storage unit is adjusted based on the computed number of destage tasks.
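As an illustration of the last two sentences of the abstract, the following minimal Python sketch shows one way a current allocation could be moved toward the computed counts. The names `TaskCounts` and `tasks_to_change` are hypothetical and do not come from the patent; the sample numbers are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskCounts:
    stage: int    # tasks staging tracks from the storage unit into the cache
    destage: int  # tasks destaging tracks from the cache to the storage unit


def tasks_to_change(current: TaskCounts, computed: TaskCounts) -> TaskCounts:
    """Return signed deltas: positive means allocate more tasks, negative means release some."""
    return TaskCounts(
        stage=computed.stage - current.stage,
        destage=computed.destage - current.destage,
    )


current = TaskCounts(stage=8, destage=12)
computed = TaskCounts(stage=10, destage=9)  # hypothetical machine learning module output
delta = tasks_to_change(current, computed)
print(delta)  # TaskCounts(stage=2, destage=-3)
```

Here two more stage tasks would be allocated and three destage tasks released.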

Patent Claims

27 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A computer program product for allocating tasks to stage tracks from a storage unit to a cache and destage tracks from the cache to the storage unit, comprising a computer readable storage medium having computer readable program code embodied therein that when executed performs operations, the operations comprising: providing a machine learning module that receives as input storage performance information related to processing of Input/Output (I/O) requests with respect to the storage unit; determining an adjusted number of stage tasks; determining an adjusted number of destage tasks; retraining the machine learning module with the storage performance information to produce the adjusted number of stage tasks and the adjusted number of destage tasks; and using the retrained machine learning module to produce a computed number of stage tasks to allocate to staging operations and a computed number of destage tasks to allocate to destaging operations.
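The retrain-then-use cycle recited in claim 1 can be pictured with a toy model. This is only a sketch under stated assumptions: the claim does not specify a model type, so `LinearTaskModel` (two linear outputs trained by gradient steps on squared error), the feature values, and the target counts are all hypothetical.

```python
class LinearTaskModel:
    """Toy stand-in for the machine learning module: two linear outputs."""

    def __init__(self, n_features: int, learning_rate: float = 0.1):
        self.weights = [[0.0] * n_features for _ in range(2)]  # rows: [stage, destage]
        self.bias = [0.0, 0.0]
        self.learning_rate = learning_rate

    def predict(self, features):
        """Return [computed stage tasks, computed destage tasks] as floats."""
        return [
            sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(self.weights, self.bias)
        ]

    def retrain(self, features, adjusted):
        """Take one gradient step on squared error toward the adjusted task counts."""
        predicted = self.predict(features)
        for k in range(2):
            error = predicted[k] - adjusted[k]
            for i, x in enumerate(features):
                self.weights[k][i] -= self.learning_rate * error * x
            self.bias[k] -= self.learning_rate * error


features = [0.5, 0.8]   # hypothetical normalized storage performance information
adjusted = [12.0, 6.0]  # adjusted stage / destage task counts used as training targets

model = LinearTaskModel(n_features=2)
before = model.predict(features)
for _ in range(50):
    model.retrain(features, adjusted)
after = model.predict(features)

# Use the retrained model to produce the computed task counts.
computed_stage, computed_destage = (round(v) for v in after)
```

After retraining, the model's output for the same performance information lies closer to the adjusted counts than before, which is the behavior the claim language describes.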

2. The computer program product of claim 1, wherein the determining the adjusted number of stage tasks and the adjusted number of destage tasks and retraining the machine learning module are performed in response to completing one of a staging operation and a destaging operation.

3. The computer program product of claim 1, wherein the determining the adjusted number of stage tasks and the adjusted number of destage tasks comprises: determining a margin of error of a threshold storage parameter value of a storage parameter and a current value of the storage parameter; determining the adjusted number of stage tasks as a function of the computed number of stage tasks and the margin of error; and determining the adjusted number of destage tasks as a function of the computed number of destage tasks and the margin of error.
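Claim 3 leaves the margin-of-error formula and the adjusting function open. The sketch below assumes, purely for illustration, a relative margin (threshold minus current value, divided by the threshold) and a multiplicative adjustment; `margin_of_error` and `adjusted_tasks` are hypothetical names.

```python
def margin_of_error(threshold: float, current: float) -> float:
    """Relative gap between threshold and current value; positive means headroom."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return (threshold - current) / threshold


def adjusted_tasks(computed: int, margin: float) -> int:
    """Scale the computed count by (1 + margin), never going below zero."""
    return max(0, round(computed * (1.0 + margin)))


headroom = margin_of_error(threshold=100.0, current=80.0)   # 0.2
overload = margin_of_error(threshold=100.0, current=125.0)  # -0.25

more_stage = adjusted_tasks(10, headroom)   # 12
fewer_stage = adjusted_tasks(8, overload)   # 6
```

The same function would be applied to the computed number of destage tasks.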

4. The computer program product of claim 1 , wherein the operations further comprise: determining a first margin of error of a first threshold storage parameter value of a first storage parameter and a first current value of the first storage parameter; determining a second margin of error of a second threshold storage parameter value of a second storage parameter and a second current value of the second storage parameter, wherein the first and the second storage parameters comprise different performance metrics; determining the adjusted number of stage tasks as a function of the computed number of stage tasks and at least one of the first margin of error and the second margin of error; and determining the adjusted number of destage tasks as a function of the computed number of destage tasks and at least one of the first margin of error and the second margin of error.

5. The computer program product of claim 4 , wherein the function performs one of: alternating using the first margin of error and the second margin of error to determine the adjusted number of stage tasks and the adjusted number of destage tasks during different iterations of performing the retraining the machine learning module; and applying both the first margin of error and the second margin of error to the computed number of stage tasks and the computed number of destage tasks to determine the adjusted number of stage tasks and the adjusted number of destage tasks, respectively.
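The two alternatives in claim 5 can be contrasted in a few lines. This sketch assumes relative margins applied as multiplicative factors, which the claim does not prescribe; the helper names and the even/odd alternation rule are hypothetical.

```python
def scale(computed: int, margins) -> int:
    """Apply each relative margin as a factor (1 + margin); floor the result at zero."""
    factor = 1.0
    for margin in margins:
        factor *= 1.0 + margin
    return max(0, round(computed * factor))


def adjusted_alternating(computed: int, first: float, second: float, iteration: int) -> int:
    """Use the first margin on even retraining iterations and the second on odd ones."""
    chosen = first if iteration % 2 == 0 else second
    return scale(computed, [chosen])


def adjusted_both(computed: int, first: float, second: float) -> int:
    """Apply both margins in the same retraining iteration."""
    return scale(computed, [first, second])


first_margin, second_margin = 0.2, -0.5  # hypothetical relative margins of error

even_iteration = adjusted_alternating(10, first_margin, second_margin, iteration=0)  # 12
odd_iteration = adjusted_alternating(10, first_margin, second_margin, iteration=1)   # 5
combined = adjusted_both(10, first_margin, second_margin)                            # 6
```

Alternating lets each metric steer the allocation on its own in turn, while applying both lets a negative margin on one metric offset a positive margin on the other within a single iteration.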

6. The computer program product of claim 4 , wherein a device adaptor transfers data between the storage unit and the cache, wherein the first storage parameter comprises device adaptor bandwidth, wherein the first threshold storage parameter value comprises an optimum adaptor bandwidth for the device adaptor and wherein the first current value of the first storage parameter comprises a current adaptor bandwidth of the device adaptor, wherein the second storage parameter comprises a response time of I/O requests to tracks in the storage unit, wherein the second threshold storage parameter value comprises a maximum acceptable response time for I/O requests and wherein the second current value of the second storage parameter comprises a current response time.

7. The computer program product of claim 6 , wherein the function uses the first margin of error and the second margin of error to increase the computed number of stage tasks and the computed number of destage tasks such that the adjusted number of stage tasks and the adjusted number of destage tasks are greater than the computed number of stage tasks and the computed number of destage tasks, respectively, in response to the first margin of error and the second margin of error being positive, wherein the function uses the first margin of error and the second margin of error to decrease the computed number of stage tasks and the computed number of destage tasks such that the adjusted number of stage tasks and the adjusted number of destage tasks are less than the computed number of stage tasks and the computed number of destage tasks, respectively, in response to the first margin of error and the second margin of error being negative.
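Claims 6 and 7 together name two concrete metrics and a sign rule. A minimal sketch, assuming each margin is the threshold minus the current value and a fixed step of one task (both assumptions, as are all function names and sample values):

```python
def bandwidth_margin(optimum_bw: float, current_bw: float) -> float:
    """Positive when the device adaptor is running below its optimum bandwidth."""
    return optimum_bw - current_bw


def response_time_margin(max_acceptable_rt: float, current_rt: float) -> float:
    """Positive when I/O response time is under the maximum acceptable value."""
    return max_acceptable_rt - current_rt


def adjust_by_sign(computed: int, bw_margin: float, rt_margin: float, step: int = 1) -> int:
    """Raise the count when both margins are positive, lower it when both are negative."""
    if bw_margin > 0 and rt_margin > 0:
        return computed + step
    if bw_margin < 0 and rt_margin < 0:
        return max(0, computed - step)  # a task count cannot go below zero
    return computed  # mixed signs are not addressed by claim 7; left unchanged here


increased = adjust_by_sign(10, bandwidth_margin(800.0, 600.0), response_time_margin(5.0, 3.0))
decreased = adjust_by_sign(10, bandwidth_margin(800.0, 900.0), response_time_margin(5.0, 7.5))
unchanged = adjust_by_sign(10, bandwidth_margin(800.0, 600.0), response_time_margin(5.0, 7.5))
```

With these sample values the three results are 11, 9, and 10.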

8. The computer program product of claim 1 , wherein the storage unit is configured as Redundant Array of Independent Disk (RAID) ranks, wherein each of the RAID ranks is comprised of storage devices, wherein there is storage performance information for each of the RAID ranks, wherein the adjusted number of stage tasks and the adjusted number of destage tasks are determined separately for each of the RAID ranks, and wherein the machine learning module comprises one of at least one machine learning module that is retrained, for each RAID rank of the RAID ranks, with the storage performance information for the RAID rank to produce the adjusted number of stage tasks and the adjusted number of destage tasks for the RAID rank.
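One possible reading of claim 8 keeps a separate model per RAID rank. The sketch below uses a deliberately trivial placeholder model that only remembers the adjusted counts it was last retrained on; `LastTargetModel`, `model_for`, and the rank identifiers are hypothetical.

```python
class LastTargetModel:
    """Placeholder model: remembers the adjusted counts it was last trained on."""

    def __init__(self, initial=(1, 1)):
        self.counts = initial  # (stage tasks, destage tasks)

    def retrain(self, performance_info, adjusted):
        self.counts = adjusted

    def compute(self, performance_info):
        return self.counts


models = {}  # RAID rank identifier -> that rank's own model


def model_for(rank_id: str) -> LastTargetModel:
    """Return the model for a rank, creating it on first use."""
    if rank_id not in models:
        models[rank_id] = LastTargetModel()
    return models[rank_id]


# Retrain only rank "R0" with its own (hypothetical) performance information.
model_for("R0").retrain(performance_info={"response_time_ms": 4.0}, adjusted=(12, 6))

rank0_counts = model_for("R0").compute({"response_time_ms": 4.0})  # (12, 6)
rank1_counts = model_for("R1").compute({"response_time_ms": 9.0})  # (1, 1), untouched
```

Retraining one rank's model leaves the other ranks' allocations unaffected.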

9. A system for allocating tasks to destaging and staging operations with respect to a storage unit, comprising: a processor; a cache implemented in at least one memory device; a machine learning module; and a computer readable storage medium having computer readable program code embodied therein that when executed by the processor performs operations, the operations comprising: providing the machine learning module that receives as input storage performance information related to processing of Input/Output (I/O) requests with respect to the storage unit; determining an adjusted number of stage tasks; determining an adjusted number of destage tasks; retraining the machine learning module with the storage performance information to produce the adjusted number of stage tasks and the adjusted number of destage tasks; and using the retrained machine learning module to produce a computed number of stage tasks to allocate to staging operations and a computed number of destage tasks to allocate to destaging operations.

10. The system of claim 9 , wherein the operations further comprise: determining a first margin of error of a first threshold storage parameter value of a first storage parameter and a first current value of the first storage parameter; determining a second margin of error of a second threshold storage parameter value of a second storage parameter and a second current value of the second storage parameter, wherein the first and the second storage parameters comprise different performance metrics; determining the adjusted number of stage tasks as a function of the computed number of stage tasks and at least one of the first margin of error and the second margin of error; and determining the adjusted number of destage tasks as a function of the computed number of destage tasks and at least one of the first margin of error and the second margin of error.

11. The system of claim 10 , further comprising: a device adaptor to transfer data between the storage unit and the cache, wherein the first storage parameter comprises device adaptor bandwidth, wherein the first threshold storage parameter value comprises an optimum adaptor bandwidth for the device adaptor and wherein the first current value of the first storage parameter comprises a current adaptor bandwidth of the device adaptor, wherein the second storage parameter comprises a response time of I/O requests to tracks in the storage unit, wherein the second threshold storage parameter value comprises a maximum acceptable response time for I/O requests and wherein the second current value of the second storage parameter comprises a current response time.

12. A method for allocating tasks to stage tracks from a storage unit to a cache and destage tracks from the cache to the storage unit, comprising: providing a machine learning module that receives as input storage performance information related to processing of Input/Output (I/O) requests with respect to the storage unit; determining an adjusted number of stage tasks; determining an adjusted number of destage tasks; retraining the machine learning module with the storage performance information to produce the adjusted number of stage tasks and the adjusted number of destage tasks; and using the retrained machine learning module to produce a computed number of stage tasks to allocate to staging operations and a computed number of destage tasks to allocate to destaging operations.

13. The method of claim 12 , further comprising: determining a first margin of error of a first threshold storage parameter value of a first storage parameter and a first current value of the first storage parameter; determining a second margin of error of a second threshold storage parameter value of a second storage parameter and a second current value of the second storage parameter, wherein the first and the second storage parameters comprise different performance metrics; determining the adjusted number of stage tasks as a function of the computed number of stage tasks and at least one of the first margin of error and the second margin of error; and determining the adjusted number of destage tasks as a function of the computed number of destage tasks and at least one of the first margin of error and the second margin of error.

14. The method of claim 13 , wherein a device adaptor transfers data between the storage unit and the cache, wherein the first storage parameter comprises device adaptor bandwidth, wherein the first threshold storage parameter value comprises an optimum adaptor bandwidth for the device adaptor and wherein the first current value of the first storage parameter comprises a current adaptor bandwidth of the device adaptor, wherein the second storage parameter comprises a response time of I/O requests to tracks in the storage unit, wherein the second threshold storage parameter value comprises a maximum acceptable response time for I/O requests and wherein the second current value of the second storage parameter comprises a current response time.

15. The computer program product of claim 1 , wherein the storage unit comprises a Redundant Array of Independent Disk (RAID) rank of a plurality of RAID ranks, wherein each of the RAID ranks is comprised of storage devices, and wherein there is storage performance information for each of the RAID ranks, wherein a non-volatile storage (NVS) unit stores modified data in the cache, wherein the storage performance information for each RAID rank of the RAID ranks comprises at least one of: number of tasks queued for staging operations; speed of at least one storage device in which the RAID rank is stored; overall NVS usage of tracks from all the RAID ranks; NVS usage by the RAID rank; maximum NVS usage allowed for the RAID rank; RAID rank response time for processing I/O requests with respect to the storage unit; device adaptor bandwidth utilized in a device adaptor that transfers data between the cache and the storage unit; maximum device adaptor bandwidth that is available for transfer of data; and optimum device adaptor bandwidth.
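The per-rank inputs listed in claim 15 map naturally onto a record type. The sketch below is one hypothetical layout: the field names, units, and the derived `nvs_fraction` value are assumptions, and a real system would likely normalize the values before passing them to a model.

```python
from dataclasses import dataclass, astuple


@dataclass
class RankPerformanceInfo:
    queued_stage_tasks: int      # number of tasks queued for staging operations
    device_speed: float          # speed of a storage device holding the rank (assumed unit)
    overall_nvs_usage: float     # NVS usage by tracks from all RAID ranks
    rank_nvs_usage: float        # NVS usage by this RAID rank
    max_rank_nvs_usage: float    # maximum NVS usage allowed for this RAID rank
    response_time: float         # rank response time for I/O requests
    adaptor_bw_used: float       # device adaptor bandwidth currently utilized
    max_adaptor_bw: float        # maximum device adaptor bandwidth available
    optimum_adaptor_bw: float    # optimum device adaptor bandwidth

    def as_features(self):
        """Flatten the record into a list of floats for use as model input."""
        return [float(value) for value in astuple(self)]

    def nvs_fraction(self) -> float:
        """Share of this rank's NVS allowance currently in use (derived, hypothetical)."""
        return self.rank_nvs_usage / self.max_rank_nvs_usage


info = RankPerformanceInfo(
    queued_stage_tasks=4, device_speed=7200.0, overall_nvs_usage=0.6,
    rank_nvs_usage=2.0, max_rank_nvs_usage=8.0, response_time=3.5,
    adaptor_bw_used=600.0, max_adaptor_bw=1000.0, optimum_adaptor_bw=800.0,
)
features = info.as_features()
```

The claim requires only "at least one of" these items, so a narrower record with a subset of the fields would also fit its wording.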

16. The system of claim 9, wherein the determining the adjusted number of stage tasks and the adjusted number of destage tasks and retraining the machine learning module are performed in response to completing one of a staging operation and a destaging operation.

17. The system of claim 9, wherein the determining the adjusted number of stage tasks and the adjusted number of destage tasks comprises: determining a margin of error of a threshold storage parameter value of a storage parameter and a current value of the storage parameter; determining the adjusted number of stage tasks as a function of the computed number of stage tasks and the margin of error; and determining the adjusted number of destage tasks as a function of the computed number of destage tasks and the margin of error.

18. The system of claim 10 , wherein the function performs one of: alternating using the first margin of error and the second margin of error to determine the adjusted number of stage tasks and the adjusted number of destage tasks during different iterations of performing the retraining the machine learning module; and applying both the first margin of error and the second margin of error to the computed number of stage tasks and the computed number of destage tasks to determine the adjusted number of stage tasks and the adjusted number of destage tasks, respectively.

19. The system of claim 11 , wherein the function uses the first margin of error and the second margin of error to increase the computed number of stage tasks and the computed number of destage tasks such that the adjusted number of stage tasks and the adjusted number of destage tasks are greater than the computed number of stage tasks and the computed number of destage tasks, respectively, in response to the first margin of error and the second margin of error being positive, wherein the function uses the first margin of error and the second margin of error to decrease the computed number of stage tasks and the computed number of destage tasks such that the adjusted number of stage tasks and the adjusted number of destage tasks are less than the computed number of stage tasks and the computed number of destage tasks, respectively, in response to the first margin of error and the second margin of error being negative.

20. The system of claim 9 , wherein the storage unit is configured as Redundant Array of Independent Disk (RAID) ranks, wherein each of the RAID ranks is comprised of storage devices, wherein there is storage performance information for each of the RAID ranks, wherein the adjusted number of stage tasks and the adjusted number of destage tasks are determined separately for each of the RAID ranks, and wherein the machine learning module comprises one of at least one machine learning module that is retrained, for each RAID rank of the RAID ranks, with the storage performance information for the RAID rank to produce the adjusted number of stage tasks and the adjusted number of destage tasks for the RAID rank.

21. The method of claim 12 , wherein the storage unit comprises a Redundant Array of Independent Disk (RAID) rank of a plurality of RAID ranks, wherein each of the RAID ranks is comprised of storage devices, and wherein there is storage performance information for each of the RAID ranks, wherein a non-volatile storage (NVS) unit stores modified data in the cache, wherein the storage performance information for each RAID rank of the RAID ranks comprises at least one of: number of tasks queued for staging operations; speed of at least one storage device in which the RAID rank is stored; overall NVS usage of tracks from all the RAID ranks; NVS usage by the RAID rank; maximum NVS usage allowed for the RAID rank; RAID rank response time for processing I/O requests with respect to the storage unit; device adaptor bandwidth utilized in a device adaptor that transfers data between the cache and the storage unit; maximum device adaptor bandwidth that is available for transfer of data; and optimum device adaptor bandwidth.

22. The method of claim 12, wherein the determining the adjusted number of stage tasks and the adjusted number of destage tasks and retraining the machine learning module are performed in response to completing one of a staging operation and a destaging operation.

23. The method of claim 12, wherein the determining the adjusted number of stage tasks and the adjusted number of destage tasks comprises: determining a margin of error of a threshold storage parameter value of a storage parameter and a current value of the storage parameter; determining the adjusted number of stage tasks as a function of the computed number of stage tasks and the margin of error; and determining the adjusted number of destage tasks as a function of the computed number of destage tasks and the margin of error.

24. The method of claim 13 , wherein the function performs one of: alternating using the first margin of error and the second margin of error to determine the adjusted number of stage tasks and the adjusted number of destage tasks during different iterations of performing the retraining the machine learning module; and applying both the first margin of error and the second margin of error to the computed number of stage tasks and the computed number of destage tasks to determine the adjusted number of stage tasks and the adjusted number of destage tasks, respectively.

25. The method of claim 24 , wherein the function uses the first margin of error and the second margin of error to increase the computed number of stage tasks and the computed number of destage tasks such that the adjusted number of stage tasks and the adjusted number of destage tasks are greater than the computed number of stage tasks and the computed number of destage tasks, respectively, in response to the first margin of error and the second margin of error being positive, wherein the function uses the first margin of error and the second margin of error to decrease the computed number of stage tasks and the computed number of destage tasks such that the adjusted number of stage tasks and the adjusted number of destage tasks are less than the computed number of stage tasks and the computed number of destage tasks, respectively, in response to the first margin of error and the second margin of error being negative.

26. The method of claim 12 , wherein the storage unit is configured as Redundant Array of Independent Disk (RAID) ranks, wherein each of the RAID ranks is comprised of storage devices, wherein there is storage performance information for each of the RAID ranks, wherein the adjusted number of stage tasks and the adjusted number of destage tasks are determined separately for each of the RAID ranks, and wherein the machine learning module comprises one of at least one machine learning module that is retrained, for each RAID rank of the RAID ranks, with the storage performance information for the RAID rank to produce the adjusted number of stage tasks and the adjusted number of destage tasks for the RAID rank.

27. The method of claim 12 , wherein the storage unit comprises a Redundant Array of Independent Disk (RAID) rank of a plurality of RAID ranks, wherein each of the RAID ranks is comprised of storage devices, and wherein there is storage performance information for each of the RAID ranks, wherein a non-volatile storage (NVS) unit stores modified data in the cache, wherein the storage performance information for each RAID rank of the RAID ranks comprises at least one of: number of tasks queued for staging operations; speed of at least one storage device in which the RAID rank is stored; overall NVS usage of tracks from all the RAID ranks; NVS usage by the RAID rank; maximum NVS usage allowed for the RAID rank; RAID rank response time for processing I/O requests with respect to the storage unit; device adaptor bandwidth utilized in a device adaptor that transfers data between the cache and the storage unit; maximum device adaptor bandwidth that is available for transfer of data; and optimum device adaptor bandwidth.

Classification Codes (CPC)

Cooperative Patent Classification codes for this invention.

G06F G06N

Patent Metadata

Filing Date

September 20, 2018

Publication Date

March 22, 2022
