Legal claims defining the scope of protection, as filed with the USPTO.
1. A method of assigning data to storage device nodes in a data storage system, wherein the data storage system stores a number M of replicas of the data, wherein M is greater than or equal to 2, the method comprising: dividing the data into a plurality of groups of segments and for each group of segments, identifying storage device nodes that have sufficient resources available to accommodate a requirement of the data, the requirement including at least one of a reliability requirement, a capacity requirement and a performance requirement, and when a number of the storage device nodes identified by said identifying is greater than M, assigning the data to M randomly selected storage device nodes from among those identified, and when the number of the identified storage device nodes is equal to M, assigning the data to the M identified storage device nodes, and when the number of the identified storage device nodes is less than M, dividing the group of data segments thereby forming a group of data segments having a reduced requirement and identifying storage device nodes that have sufficient resources available to accommodate the reduced requirement, and the method further comprising adding a new storage device node to the data storage system including identifying an existing storage device node that is heavily loaded in comparison to other ones of existing storage device nodes; moving data stored at the identified existing storage device node to the new storage device node; and determining whether the new storage device node is sufficiently loaded in comparison to the existing storage device nodes and when the new storage device node is not sufficiently loaded, repeating said steps of identifying the existing storage device node and moving the data until the new storage device node is sufficiently loaded.
2. The method according to claim 1, wherein said determining comprises determining an average loading of the existing storage device nodes and when the loading of the new storage device node is at least as great as the average loading, then the new storage device node is sufficiently loaded.
3. The method according to claim 1, wherein said determining comprises determining an average loading of the existing storage device nodes and when the loading of the new storage device node is within a range bounded by the lowest and highest loading of the existing storage device nodes, then the new storage device node is sufficiently loaded.
4. The method according to claim 3, wherein the data stored at the identified existing storage device node includes a particular group of data segments selected from among a plurality of groups of segments stored at the identified existing node.
5. The method according to claim 4, wherein the particular group of data segments is selected according to its size.
6. The method of claim 1, further comprising: removing data from a particular storage device node in the data storage system, wherein removing the data comprises: selecting data from the particular storage device node to be removed; identifying other storage device nodes of the data storage system having sufficient resources available to accommodate a requirement of the selected data; moving the selected data to a randomly selected storage device node from among the identified other storage device nodes; and repeating said steps of selecting data, identifying other storage device nodes, and moving the selected data until the particular storage device node to be removed is empty.
7. The method according to claim 6, further comprising removing the particular storage device node from the data storage system.
8. The method according to claim 6, wherein when another storage device node of the data storage system having sufficient resources available to accommodate a requirement of the selected data is not identified, dividing the selected data thereby forming a group of data segments having a reduced requirement.
9. The method according to claim 8, wherein a storage device node is identified as having sufficient resources for the selected data only when available capacity of the storage device node is at least as great as a capacity requirement of the selected data.
10. The method according to claim 9, wherein a storage device node is identified as having sufficient resources for the selected data only when an available performance parameter of the storage device node is at least as great as a corresponding performance requirement of the selected data.
11. The method according to claim 1, wherein identifying the existing storage device node is performed by comparing utilization of the existing storage device nodes.
12. The method of claim 1, further comprising: removing data from a particular storage device node in the data storage system, wherein removing the data comprises: selecting data from the particular storage device node to be removed; randomly selecting at least one other storage device node in the data storage system; determining whether the at least one other randomly selected storage device node has sufficient resources available to accommodate a requirement of the selected data; moving the selected data to one of the at least one randomly selected storage device node having sufficient resources available to accommodate a requirement of the selected data; and repeating said steps of selecting data, randomly selecting, determining and moving the selected data until the particular storage device node to be removed is empty.
13. The method according to claim 12, wherein when another storage device node of the data storage system having sufficient resources available to accommodate a requirement of the selected data is not identified, dividing the selected data thereby forming a group of data segments having a reduced requirement.
14. The method according to claim 13, wherein a storage device node is identified as having sufficient resources for the data only when available capacity of the storage device node is at least as great as a capacity requirement of the data.
15. The method according to claim 14, wherein a storage device node is identified as having sufficient resources for the selected data only when an available performance parameter of the storage device node is at least as great as a corresponding performance requirement of the selected data.
16. The method according to claim 14, wherein a storage device node is identified as having sufficient resources for the selected data only when availability of the storage device node and other storage device nodes to which the data is assigned is at least as great as a corresponding availability requirement of the selected data.
17. The method according to claim 14, wherein a storage device node is identified as having sufficient resources for the selected data only when reliability of the storage device node and other storage device nodes to which the data is assigned is at least as great as a corresponding reliability requirement of the selected data.
18. The method of claim 1, wherein dividing the group of data segments to have the reduced requirement comprises dividing the group of data segments into two or more smaller groups.
19. The method of claim 1, wherein dividing the group of data segments to have the reduced requirement comprises reassigning one or more of the segments in the group to a different group.
20. A data storage system comprising: storage device nodes to store M replicas of data, wherein M is greater than or equal to 2; at least one central processing unit (CPU) configured to: divide the data into a plurality of groups of segments and for each group of segments, identify storage device nodes that have sufficient resources available to accommodate a requirement of the data, the requirement including at least one of a reliability requirement, a capacity requirement and a performance requirement, and when a number of the storage device nodes identified by said identifying is greater than M, assign the data to M randomly selected storage device nodes from among those identified, and when the number of the identified storage device nodes is equal to M, assign the data to the M identified storage device nodes, and when the number of the identified storage device nodes is less than M, divide the group of data segments thereby forming a group of data segments having a reduced requirement and identifying storage device nodes that have sufficient resources available to accommodate the reduced requirement, and in response to addition of a new storage device node, the at least one CPU is configured to further: identify an existing storage device node that is heavily loaded in comparison to other ones of existing storage device nodes; move data stored at the identified existing storage device node to the new storage device node; and determine whether the new storage device node is sufficiently loaded in comparison to the existing storage device nodes and when the new storage device node is not sufficiently loaded, repeating identifying the existing storage device node and moving the data until the new storage device node is sufficiently loaded.
21. The data storage system of claim 20, wherein dividing the group of data segments to have the reduced requirement comprises dividing the group of data segments into two or more smaller groups.
22. The data storage system of claim 20, wherein dividing the group of data segments to have the reduced requirement comprises reassigning one or more of the segments in the group to a different group.
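The placement and node-addition steps recited in claims 1, 2 and 18 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the names `Group`, `Node`, `place` and `add_node` are hypothetical, capacity is the only requirement modeled, "heavily loaded" is taken to mean the highest used-to-capacity ratio, the group to move is simply the first one that fits, and a group that cannot be placed is split into two halves.

```python
import random
from dataclasses import dataclass, field


@dataclass(eq=False)  # identity comparison: replicas share one Group object
class Group:
    """A group of data segments; each segment is represented by its size."""
    segments: list

    @property
    def size(self):
        return sum(self.segments)


@dataclass(eq=False)
class Node:
    name: str
    capacity: int
    groups: list = field(default_factory=list)

    @property
    def used(self):
        return sum(g.size for g in self.groups)

    def fits(self, group):
        # Assumption: capacity is the only requirement checked in this sketch.
        return self.capacity - self.used >= group.size


def place(group, nodes, m, rng):
    """Assign `group` to m nodes, splitting it when too few nodes qualify."""
    candidates = [n for n in nodes if n.fits(group)]
    if len(candidates) >= m:
        # More than m candidates: choose m at random.
        # Exactly m candidates: sample() returns all of them.
        chosen = rng.sample(candidates, m)
        for node in chosen:
            node.groups.append(group)
        return [(group, chosen)]
    if len(group.segments) < 2:
        # Simplification: earlier sub-groups stay placed when this is raised.
        raise RuntimeError("cannot reduce the requirement any further")
    # Fewer than m candidates: split into two smaller groups and retry each.
    half = len(group.segments) // 2
    left, right = Group(group.segments[:half]), Group(group.segments[half:])
    return place(left, nodes, m, rng) + place(right, nodes, m, rng)


def add_node(new_node, nodes):
    """Move groups from heavily loaded nodes until `new_node` reaches the
    average load of the existing nodes (the criterion of claim 2)."""
    def load(node):
        return node.used / node.capacity

    while load(new_node) < sum(load(n) for n in nodes) / len(nodes):
        donors = sorted(nodes, key=load, reverse=True)  # most loaded first
        move = next(((d, g) for d in donors for g in d.groups
                     if g not in new_node.groups and new_node.fits(g)), None)
        if move is None:
            break  # nothing can be moved; stop instead of looping forever
        donor, group = move
        donor.groups.remove(group)
        new_node.groups.append(group)
    nodes.append(new_node)
```

A caller would build a list of `Node` objects, call `place(Group([...]), nodes, m, random.Random())` for each group, and call `add_node(Node("new", capacity), nodes)` when a node joins. The check `g not in new_node.groups` keeps two replicas of the same group off one node, which is a design choice of this sketch rather than something the claims state.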
July 8, 2014