Legal claims defining the scope of protection, as filed with the USPTO.
1. A method of recovering information in a distributed storage system, comprising: maintaining a list of entries corresponding to values for storing on a first storage device in the distributed storage system; defining a range of wait time intervals between convergence rounds on the list of entries, wherein each convergence round attempts to converge values corresponding to the entries to an At Maximum Redundancy (AMR) state; performing a first convergence round by a first processing device on the list of entries; and scheduling a second convergence round for the first processing device to perform on the list of entries by selecting a wait time interval from the defined range of wait time intervals.
2. The method as recited in claim 1, wherein the wait time interval is selected by the first processing device at the end of the first convergence round.
3. The method as recited in claim 1, further comprising: executing, by the first processing device, a convergence step on a first value corresponding to a first entry on the list of entries; and if the convergence step does not converge the first value to an AMR state, then setting a step-wait time for starting a subsequent convergence step on the first value.
4. The method as recited in claim 3, further comprising: performing the second convergence round; and executing the subsequent convergence step on the first value during the second convergence round only if the step-wait time has expired.
5. The method as recited in claim 1, further comprising: executing by the first processing device, convergence steps on a first value corresponding to a first entry on the list of entries; and after each unsuccessful convergence step, increasing a maximum step-wait time for starting a subsequent convergence step on the first value, and selecting a step-wait time for the subsequent convergence step that is less than or equal to the increased maximum step-wait time.
6. The method as recited in claim 1, further comprising: encoding a first value into a plurality of fragments for storage across a first storage domain and a second storage domain remote from the first storage domain; and executing, at the first storage domain, a convergence step on the first value, wherein the executing comprises: determining, at the first storage domain, whether sibling fragments are missing from the second storage domain; if the sibling fragments are missing from the second storage domain, notifying the second storage domain of an intent to recover the missing sibling fragments at the first storage domain; and in response to the notification, delaying, at the second storage domain, execution of a corresponding convergence step on the first value.
7. The method as recited in claim 6, further comprising: recovering missing sibling fragments at the first storage domain; and transmitting only a sufficient number of missing sibling fragments to the second storage domain to allow the second storage domain to recover all sibling fragments missing from the second storage domain.
8. A distributed storage system, comprising: a local storage domain; and a remote storage domain in communication with the local storage domain via a network, each of the local and remote storage domains comprising: a fragment server having a plurality of storage devices to store encoded fragments corresponding to a plurality of values inserted into the distributed storage system, the fragment server configured to: perform a first round of convergence on values having fragments for storing on the fragment server's storage devices; and schedule a second round of convergence on values that did not achieve an At Maximum Redundancy (AMR) state in the first round of convergence, wherein the fragment server is configured to schedule the second round of convergence to start after a selected time interval from start of the first round of convergence.
9. The system as recited in claim 8, wherein the fragment server is configured to select the selected time interval from a range of time intervals, and wherein the fragment server is configured to perform the selection after conclusion of the first round of convergence.
10. The system as recited in claim 8, wherein the fragment server is further configured to: execute a convergence step on a first value corresponding to a first entry on a list of entries; and if the convergence step does not converge the first value to an AMR state, then set a step-wait time for starting a subsequent convergence step on the first value.
11. The system as recited in claim 10, wherein the fragment server is further configured to: start the second round of convergence; and execute the subsequent convergence step on the first value during the second round of convergence only if the step-wait time has expired.
12. The system as recited in claim 8, wherein the fragment server is further configured to: execute convergence steps on a first value; and after each unsuccessful convergence step, increase a maximum step-wait time for starting a subsequent convergence step on the first value, and select a step-wait time for the subsequent convergence step that is less than or equal to the increased maximum step-wait time.
13. The system as recited in claim 8, wherein the fragment server at the local storage domain is further configured to: execute a convergence step on a first value, wherein the executing comprises: determining whether fragments corresponding to the first value are missing from the fragment server at the remote storage domain; and if the fragments corresponding to the first value are missing from the fragment server at the remote storage domain, notifying the fragment server at the remote storage domain of an intent to recover the missing fragments at the local storage domain, wherein, in response to the notification, the fragment server at the remote storage domain is configured to delay execution of a corresponding convergence step on the first value.
14. The system as recited in claim 13, wherein, during the convergence step for the first value, the fragment server at the local storage domain is configured to: recover fragments corresponding to the first value; and transmit over the network to the remote storage domain only a sufficient number of fragments to allow the fragment server at the remote storage domain to recover missing fragments for the remote storage domain.
15. A distributed storage system, comprising: a first fragment server having a plurality of first storage devices to store encoded fragments corresponding to values inserted into the distributed storage system; and a second fragment server having a plurality of second storage devices to store encoded fragments corresponding to values inserted into the distributed storage system, the second fragment server in communication with the first fragment server via a network, wherein each of the first fragment server and the second fragment server is configured to execute respective convergence steps on a first value having encoded fragments assigned for storage on the first storage devices and the second storage devices, and wherein the second fragment server is configured to delay execution of its respective convergence step on the first value in response to notification that the first fragment server is executing its respective convergence step on the first value.
16. The system as recited in claim 15, wherein, while executing its respective convergence step on the first value, the first fragment server is configured to determine whether encoded fragments for the first value are missing from the second storage devices and, if the encoded fragments for the first value are missing from the second storage devices, recover the missing encoded fragments for the second storage devices.
17. The system as recited in claim 16, wherein, if the number of encoded fragments missing from the second storage devices exceeds the minimum number of fragments needed to recover the missing fragments, the first fragment server is configured to transmit to the second fragment server only the minimum number of recovered missing fragments.
18. The system as recited in claim 15, wherein if execution of its respective convergence step on the first value is unsuccessful, the first fragment server is configured to increase a delay time for executing a subsequent convergence step on the first value.
19. The system as recited in claim 18, wherein the first fragment server is configured to: maintain a list of values having fragments for storing on the first storage devices; execute a first convergence round on the list; and schedule a start of a second convergence round on the list by selecting a wait time between the first and second convergence rounds from a predefined range of wait times.
20. The system as recited in claim 19, wherein the first fragment server is configured to execute the subsequent convergence step on the first value during the second convergence round only if the delay time has expired.
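Illustrative sketch (not part of the claims as filed): the round scheduling of claims 1 and 19 and the step-wait behavior of claims 3 to 5 might be modeled roughly as follows in Python. The claims say only that a wait time is "selected" from a range and that the maximum step-wait time is "increased"; the uniform random selection, the multiplicative growth factor, the initial ceiling of 1.0 seconds, and all names here (Entry, pick_round_wait, run_round, try_converge) are assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Entry:
    """One list entry for a value that has not yet reached AMR (hypothetical model)."""
    key: str
    max_step_wait: float = 1.0  # assumed initial ceiling on the step-wait, in seconds
    next_step_at: float = 0.0   # earliest time at which the next convergence step may start


def pick_round_wait(rng, lo, hi):
    """Select the wait before the next convergence round from the defined range.

    Uniform random selection is an assumption; the claims only require a selection.
    """
    return rng.uniform(lo, hi)


def run_round(entries, now, try_converge, rng, growth=2.0):
    """Run one convergence round and return the entries still short of AMR.

    try_converge(key) is a caller-supplied stand-in for a convergence step;
    it returns True when the value reaches the AMR state.
    """
    remaining = []
    for entry in entries:
        if now < entry.next_step_at:
            # Step-wait has not expired: skip this value for this round.
            remaining.append(entry)
            continue
        if try_converge(entry.key):
            # Value converged to AMR: drop it from the list.
            continue
        # Unsuccessful step: raise the ceiling, then pick a wait no larger than it.
        entry.max_step_wait *= growth
        entry.next_step_at = now + rng.uniform(0.0, entry.max_step_wait)
        remaining.append(entry)
    return remaining
```

In this sketch a caller would run run_round, then wait for the interval returned by pick_round_wait before starting the next round. Choosing the round wait from a range rather than using a fixed period would tend to keep different servers from starting their rounds in lockstep, which is one plausible reading of why the claims recite a range.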
June 4, 2013