Legal claims defining the scope of protection, as filed with the USPTO.
1. A method comprising: identifying, by a storage controller of a first storage node in a plurality of storage nodes coupled via a network within a distributed storage system, a failing storage device of the first storage node; determining that an inaccessible data segment of the failing storage device cannot be recovered using a low-level data protection scheme; responsive to the determination, identifying a first chunk of data associated with the inaccessible data segment and a group of data chunks associated with the first chunk of data; selectively retrieving, via the network, a second chunk of data from a second storage node in the plurality of storage nodes, wherein the second chunk of data is also associated with the group of data chunks, and wherein the selectively retrieved data does not include data associated with an accessible data segment of the first storage node; recovering the first chunk of data using an upper-level data protection scheme and the second chunk of data; storing the recovered inaccessible data segment on a replacement storage device; and transferring the accessible data segment from the failing storage device to the replacement storage device.
2. The method of claim 1, wherein storing the recovered inaccessible data segment comprises: storing the recovered inaccessible data segment on the replacement storage device at a location corresponding to a logical block address of the inaccessible data segment.
3. The method of claim 1, wherein the first storage node further has an inaccessible protection segment stored thereupon, the method further comprising recovering the inaccessible protection segment using the low-level data protection scheme and the recovered inaccessible data segment.
4. The method of claim 1, wherein the inaccessible data segment is stored on a RAID array of storage devices of the first storage node, and wherein the low-level data protection scheme is a RAID data protection scheme.
5. The method of claim 4, wherein the RAID data protection scheme is selected from the group consisting of RAID 1, RAID 5, and RAID 6.
6. The method of claim 1, wherein the upper-level data protection scheme is selected from the group consisting of: a Reed-Solomon erasure code protection scheme and a Tornado erasure code protection scheme.
7. The method of claim 1, wherein selectively retrieving the second chunk of data includes retrieving a subset of chunks of the group, wherein the subset includes the second chunk, and wherein the subset includes less than all chunks of the group.
8. The method of claim 1, wherein the upper-level data protection scheme defines a minimum number of chunks to reconstruct an inaccessible chunk.
9. The method of claim 1, wherein the second chunk of data is part of a subset of data chunks in the group associated with the first chunk of data, and the subset is determined based on at least one of storage node load, storage node capacity, storage node health, or a network quality of service factor.
10. The method of claim 1, wherein the plurality of storage nodes are physically arranged across different facilities at different sites within the distributed storage system, and node arrangement is determined based on at least one of a cost, a fault tolerance, a network infrastructure, or a geography of hosts.
11. A non-transitory machine-readable medium having stored thereon instructions for performing a method of data recovery, comprising machine executable code which when executed by at least one machine, causes the machine to: identify, by a storage controller of a first storage node in a plurality of storage nodes coupled via a network within a distributed storage system, an inaccessible data segment of the first storage node; identify an accessible data segment of the first storage node; determine that a number of failed storage devices exceeds a maximum supported by a low-level data protection scheme; identify a first chunk of data associated with the inaccessible data segment and a group of data chunks associated with the first chunk of data; selectively retrieve, via the network, a second chunk of data from at least one second storage node in the plurality of storage nodes, wherein the second chunk of data is also associated with the group of data chunks and the selectively retrieved data does not include data associated with the accessible data segment of the first storage node; recover the inaccessible data segment using an upper-level data protection scheme and the second chunk of data; store the inaccessible data segment on a replacement storage device; and transfer the accessible data segment from the first storage node to the replacement storage device.
12. The non-transitory machine-readable medium of claim 11 comprising further machine executable code which causes the machine to: recover a recovery segment of the first storage node using the recovered inaccessible data segment and the low-level data protection scheme.
13. The non-transitory machine-readable medium of claim 11, wherein the low-level data protection scheme is a RAID data protection scheme.
14. The non-transitory machine-readable medium of claim 13, wherein the upper-level data protection scheme is selected from the group consisting of: a Reed-Solomon erasure code protection scheme and a Tornado erasure code protection scheme.
15. The non-transitory machine-readable medium of claim 11, wherein the upper-level data protection scheme defines a minimum number of data chunks to recover the inaccessible data segment, and wherein the first chunk of data includes more than the minimum number.
16. The non-transitory machine-readable medium of claim 15, comprising further machine executable code which causes the machine to verify the inaccessible data segment using the first set of data structures.
17. A computing device comprising: a memory containing a machine-readable medium comprising machine executable code having stored thereon instructions for performing a method of data recovery; and a processor coupled to the memory, the processor configured to execute the machine executable code to: identify, by a storage controller of a first storage node in a plurality of storage nodes coupled via a network within a distributed storage system, a failing storage device of the first storage node; determine that an inaccessible data segment of the failing storage device cannot be recovered using a RAID data recovery technique; identify a first chunk of data associated with the inaccessible data segment and a group of data chunks associated with the first chunk of data within the distributed storage system; selectively retrieve a second chunk of data from a second storage node in the plurality of storage nodes via the network, wherein the second chunk of data is also associated with the group of data chunks and the selectively retrieved data does not include data associated with an accessible data segment of the first storage node; recreate the inaccessible data segment using an upper-level data protection scheme and the second chunk of data; write the recreated inaccessible data segment to a replacement storage device; and transfer the accessible data segment from the failing storage device to the replacement storage device.
18. The computing device of claim 17, wherein the RAID data recovery technique includes one of a RAID 1, RAID 5, or RAID 6 data recovery technique.
19. The computing device of claim 17, wherein the upper-level data protection scheme is at least one of a Reed-Solomon erasure code technique or a Tornado erasure code technique.
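The recovery flow recited in claims 1 and 2 can be illustrated with a minimal Python sketch. It is not the patented implementation, and it makes several simplifying assumptions that do not come from the claims: each segment maps to exactly one chunk, storage devices and peer nodes are modeled as in-memory dictionaries, and a single XOR parity chunk stands in for the upper-level scheme (the claims name Reed-Solomon and Tornado erasure codes instead). The names `rebuild_device`, `lba_to_group`, and `fetched` are hypothetical.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings (stand-in for erasure decoding)."""
    return bytes(x ^ y for x, y in zip(a, b))


def rebuild_device(failing, lba_to_group, peers):
    """Rebuild a failing device onto a replacement.

    failing:      dict mapping logical block address -> bytes, or None when
                  the segment is inaccessible (hypothetical model).
    lba_to_group: dict mapping logical block address -> chunk group id.
    peers:        list of dicts, one per remote node, mapping group id -> chunk.

    Returns the replacement device contents and a log of the
    (peer index, group id) pairs that were read from remote nodes.
    """
    replacement = {}
    fetched = []
    for lba, segment in failing.items():
        if segment is not None:
            # Accessible segment: copy it locally, no network retrieval.
            replacement[lba] = segment
            continue
        # Inaccessible segment: fetch only the chunks of its own group.
        group = lba_to_group[lba]
        remote_chunks = []
        for index, peer in enumerate(peers):
            remote_chunks.append(peer[group])
            fetched.append((index, group))
        # With single XOR parity, the lost chunk is the XOR of the rest.
        replacement[lba] = reduce(xor_bytes, remote_chunks)
    return replacement, fetched


# Example data: LBA 0 is still readable, LBA 1 is lost.
lost_segment = b"BBBB"
failing_device = {0: b"AAAA", 1: None}
groups = {0: "g0", 1: "g1"}
peer_a = {"g0": b"CCCC", "g1": b"DDDD"}
peer_b = {  # holds the XOR parity chunk of each group
    "g0": xor_bytes(b"AAAA", b"CCCC"),
    "g1": xor_bytes(lost_segment, b"DDDD"),
}

new_device, network_reads = rebuild_device(failing_device, groups, [peer_a, peer_b])
```

In this sketch the recovered segment is written at the same logical block address it occupied on the failing device, and `network_reads` contains entries only for group `g1`, reflecting the limitation that the selectively retrieved data excludes data associated with accessible segments.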
July 23, 2019