Legal claims defining the scope of protection, as filed with the USPTO.
1. A distributed computing system for storing a set of client data blocks, the system comprising: a capacity storage tier including a first plurality of storage disks storing a capacity data object structuring the set of client data blocks as a plurality of data stripes that are erasure coded (EC) and distributed across the first plurality of disks, wherein each of the plurality of data stripes includes a subset of the set of client data blocks and corresponding parity data for the subset of client data blocks; a performance storage tier including a second plurality of storage disks storing a metadata object structuring log data as a B-tree that includes a plurality of leaf nodes and a plurality of index nodes that include pointers to each of the plurality of leaf nodes, wherein the plurality of leaf nodes encodes an address map indicating, for each client data block of the set of client data blocks, a correspondence between a logical address associated with a first layer of the system and a physical address associated with a second layer of the system; one or more processors; and a memory storing one or more programs configured to be executed by the one or more processors, the one or more programs including instructions for: determining a storage volume associated with the plurality of leaf nodes that are stored on the performance tier; in response to the storage volume of the plurality of leaf nodes stored on the performance tier being greater than a predetermined volume threshold, migrating at least a portion of the plurality of leaf nodes to one or more of the plurality of data stripes of the capacity storage; updating a portion of the plurality of index nodes that include pointers to the migrated portion of the plurality of leaf nodes to include updated pointers to physical addresses of the one or more of the plurality of data stripes of the capacity storage that store the migrated portion of the plurality of leaf nodes; and re-allocating a portion of the performance tier that stored the migrated portion of the leaf nodes to store additional log data.
2. The system of claim 1, wherein a first leaf node of the plurality of leaf nodes encodes a key-value pair indicating an entry in the address map for one or more client data blocks of the set of client data blocks, the key of the key-value pair indicating the logical address of a first client data block of the one or more client data blocks, and the value of the key-value pair indicating the corresponding physical address of the first client data block.
3. The system of claim 2, wherein the value of the key-value pair further indicates a number of the one or more client data blocks that have contiguous logical addresses and contiguous physical addresses.
4. The system of claim 2, wherein the value of the key-value pair further indicates checksum data for each of the one or more client data blocks.
5. The system of claim 1, wherein the capacity tier is managed by a log-structured file system (LFS) and the performance tier is managed by another file system that enables overwriting the log data.
6. The system of claim 1, wherein the B-tree is a B++tree.
7. The system of claim 1, wherein a storage volume of each of the plurality of leaf pages is 512 bytes.
8. The system of claim 1, wherein prior to migration to the one or more of the plurality of data stripes, each leaf node in the migrated portion of leaf nodes was stored in a memory bank of the capacity tier.
9. The system of claim 1, wherein the input/output (I/O) throughput of the performance tier is at least seven times greater than an I/O throughput of the capacity tier.
10. The system of claim 1, wherein the first plurality of disks is arranged in a 4+2 RAID 6 configuration and the second plurality of disks is arranged in a 3-way mirroring configuration.
11. A method for employing a distributed-computing system to store a set of client data blocks, wherein the system includes: a capacity storage tier including a first plurality of storage disks storing a capacity data object structuring the set of client data blocks as a plurality of data stripes that are erasure coded (EC) and distributed across the first plurality of disks, wherein each of the plurality of data stripes includes a subset of the set of client data blocks and corresponding parity data for the subset of client data blocks; and a performance storage tier including a second plurality of storage disks storing a metadata object structuring log data as a B-tree that includes a plurality of leaf nodes and a plurality of index nodes that include pointers to each of the plurality of leaf nodes, wherein the plurality of leaf nodes encodes an address map indicating, for each client data block of the set of client data blocks, a correspondence between a logical address associated with a first layer of the system and a physical address associated with a second layer of the system, wherein the method comprises: determining a storage volume associated with the plurality of leaf nodes that are stored on the performance tier; in response to the storage volume of the plurality of leaf nodes stored on the performance tier being greater than a predetermined volume threshold, migrating at least a portion of the plurality of leaf nodes to one or more of the plurality of data stripes of the capacity storage; updating a portion of the plurality of index nodes that include pointers to the migrated portion of the plurality of leaf nodes to include updated pointers to physical addresses of the one or more of the plurality of data stripes of the capacity storage that store the migrated portion of the plurality of leaf nodes; and re-allocating a portion of the performance tier that stored the migrated portion of the leaf nodes to store additional log data.
12. The method of claim 11, wherein a first leaf node of the plurality of leaf nodes encodes a key-value pair indicating an entry in the address map for one or more client data blocks of the set of client data blocks, the key of the key-value pair indicating the logical address of a first client data block of the one or more client data blocks, and the value of the key-value pair indicating the corresponding physical address of the first client data block.
13. The method of claim 12, wherein the value of the key-value pair further indicates a number of the one or more client data blocks that have contiguous logical addresses and contiguous physical addresses.
14. The method of claim 12, wherein the value of the key-value pair further indicates checksum data for each of the one or more client data blocks.
15. The method of claim 11, wherein the capacity tier is managed by a log-structured file system (LFS) and the performance tier is managed by another file system that enables overwriting the log data.
16. The method of claim 11, wherein the B-tree is a B++tree.
17. The method of claim 11, wherein a storage volume of each of the plurality of leaf pages is 512 bytes.
18. The method of claim 11, wherein prior to migration to the one or more of the plurality of data stripes, each leaf node in the migrated portion of leaf nodes was stored in a memory bank of the capacity tier.
19. The method of claim 11, wherein the input/output (I/O) throughput of the performance tier is at least seven times greater than an I/O throughput of the capacity tier.
20. The method of claim 11, wherein the first plurality of disks is arranged in a 4+2 RAID 6 configuration and the second plurality of disks is arranged in a 3-way mirroring configuration.
21. A non-transitory computer-readable storage medium storing one or more programs configured to be executed by one or more components operating in a distributed-computing system, the one or more components having one or more processors and memory, the one or more programs including instructions and the system includes: a capacity storage tier including a first plurality of storage disks storing a capacity data object structuring the set of client data blocks as a plurality of data stripes that are erasure coded (EC) and distributed across the first plurality of disks, wherein each of the plurality of data stripes includes a subset of the set of client data blocks and corresponding parity data for the subset of client data blocks; and a performance storage tier including a second plurality of storage disks storing a metadata object structuring log data as a B-tree that includes a plurality of leaf nodes and a plurality of index nodes that include pointers to each of the plurality of leaf nodes, wherein the plurality of leaf nodes encodes an address map indicating, for each client data block of the set of client data blocks, a correspondence between a logical address associated with a first layer of the system and a physical address associated with a second layer of the system, wherein the instructions are for: determining a storage volume associated with the plurality of leaf nodes that are stored on the performance tier; in response to the storage volume of the plurality of leaf nodes stored on the performance tier being greater than a predetermined volume threshold, migrating at least a portion of the plurality of leaf nodes to one or more of the plurality of data stripes of the capacity storage; updating a portion of the plurality of index nodes that include pointers to the migrated portion of the plurality of leaf nodes to include updated pointers to physical addresses of the one or more of the plurality of data stripes of the capacity storage that store the migrated portion of the plurality of leaf nodes; and re-allocating a portion of the performance tier that stored the migrated portion of the leaf nodes to store additional log data.
22. The computer-readable storage medium of claim 21, wherein a first leaf node of the plurality of leaf nodes encodes a key-value pair indicating an entry in the address map for one or more client data blocks of the set of client data blocks, the key of the key-value pair indicating the logical address of a first client data block of the one or more client data blocks, and the value of the key-value pair indicating the corresponding physical address of the first client data block.
23. The computer-readable storage medium of claim 22, wherein the value of the key-value pair further indicates a number of the one or more client data blocks that have contiguous logical addresses and contiguous physical addresses.
24. The computer-readable storage medium of claim 22, wherein the value of the key-value pair further indicates checksum data for each of the one or more client data blocks.
25. The computer-readable storage medium of claim 21, wherein the capacity tier is managed by a log-structured file system (LFS) and the performance tier is managed by another file system that enables overwriting the log data.
26. The computer-readable storage medium of claim 21, wherein the binary B-tree is a B++tree.
27. The computer-readable storage medium of claim 21, wherein a storage volume of each of the plurality of leaf pages is 512 bytes.
28. The computer-readable storage medium of claim 21, wherein prior to migration to the one or more of the plurality of data stripes, each leaf node in the migrated portion of leaf nodes was stored in a memory bank of the capacity tier.
29. The computer-readable storage medium of claim 21, wherein the input/output (I/O) throughput of the performance tier is at least seven times greater than an I/O throughput of the capacity tier.
30. The computer-readable storage medium of claim 21, wherein the first plurality of disks is arranged in a 4+2 RAID 6 configuration and the second plurality of disks is arranged in a 3-way mirroring configuration.
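Illustrative sketch (not part of the claims). The steps recited in claims 1, 11, and 21 (measure the leaf-node volume on the performance tier, migrate leaves once a threshold is exceeded, repoint the index entries, and reuse the freed space) can be pictured with the toy Python model below. Everything named here is hypothetical: the class TieredAddressMap, the flat dictionary standing in for the B-tree index nodes, the lowest-slot-first selection policy, and the batch size are assumptions made for illustration only, and erasure coding of the stripes is not modeled. The Extent value follows claims 2 to 4 (physical address, count of contiguous blocks, per-block checksums), and the 512-byte leaf size follows claims 7, 17, and 27.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

LEAF_SIZE_BYTES = 512  # leaf size recited in claims 7, 17, and 27


@dataclass
class Extent:
    """Value of a leaf key-value pair (claims 2-4)."""
    physical_addr: int           # physical address of the first block
    num_blocks: int              # run of contiguous logical/physical blocks
    checksums: Tuple[int, ...]   # one checksum per block


@dataclass
class LeafNode:
    node_id: int
    # key: logical address of the first block, value: extent
    entries: Dict[int, Extent] = field(default_factory=dict)


@dataclass
class Pointer:
    tier: str     # "performance" or "capacity"
    address: int  # slot on the performance tier, or stripe id on the capacity tier


class TieredAddressMap:
    """Hypothetical model; a flat dict stands in for the B-tree index nodes."""

    def __init__(self, threshold_bytes: int):
        self.threshold_bytes = threshold_bytes
        self.index: Dict[int, Pointer] = {}             # leaf id -> pointer
        self.performance: Dict[int, LeafNode] = {}      # slot -> leaf
        self.capacity: Dict[int, List[LeafNode]] = {}   # stripe id -> leaves
        self.free_slots: List[int] = []
        self._next_slot = 0
        self._next_stripe = 0

    def add_leaf(self, leaf: LeafNode) -> int:
        # Reuse a slot freed by an earlier migration before growing the tier.
        if self.free_slots:
            slot = self.free_slots.pop()
        else:
            slot = self._next_slot
            self._next_slot += 1
        self.performance[slot] = leaf
        self.index[leaf.node_id] = Pointer("performance", slot)
        return slot

    def leaf_volume(self) -> int:
        """Storage volume of the leaves currently on the performance tier."""
        return len(self.performance) * LEAF_SIZE_BYTES

    def migrate_if_needed(self, batch: int = 4) -> int:
        """Migrate up to `batch` leaves if over threshold; return the count moved."""
        if self.leaf_volume() <= self.threshold_bytes:
            return 0
        slots = sorted(self.performance)[:batch]  # assumed policy: lowest slots first
        stripe_id = self._next_stripe
        self._next_stripe += 1
        self.capacity[stripe_id] = []
        for slot in slots:
            leaf = self.performance.pop(slot)
            self.capacity[stripe_id].append(leaf)                       # migrate
            self.index[leaf.node_id] = Pointer("capacity", stripe_id)   # repoint
            self.free_slots.append(slot)                                # re-allocate
        return len(slots)

    def lookup(self, leaf_id: int, logical_addr: int) -> int:
        """Resolve a logical address to a physical address through the index."""
        ptr = self.index[leaf_id]
        if ptr.tier == "performance":
            leaf = self.performance[ptr.address]
        else:
            leaf = next(l for l in self.capacity[ptr.address] if l.node_id == leaf_id)
        for start, ext in leaf.entries.items():
            if start <= logical_addr < start + ext.num_blocks:
                return ext.physical_addr + (logical_addr - start)
        raise KeyError(logical_addr)
```

In this model a lookup returns the same physical address before and after migration, because only the index pointer to the leaf changes, not the extent stored inside the leaf.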
October 12, 2021