Legal claims defining the scope of protection, as filed with the USPTO.
1. A computer-implemented method for garbage collection of a storage system, the method comprising: scanning, by a garbage collector executed by a processor, a plurality of containers in a storage device of a storage system, each of the containers containing a plurality of segments that constitute a plurality of files, wherein each file is represented by a file tree having a plurality of segments in a plurality of levels in a hierarchical structure; creating a plurality of container live segment records (LSRs) corresponding to one of the containers, each of the container LSRs including a plurality of segment LSRs corresponding to a plurality of segments contained therein; after the segment LSRs of the container LSRs have been created for all segments of the containers, sequentially traversing the segment LSRs of the container LSRs based on levels of segments specified in the corresponding segment LSRs to determine and indicate in the corresponding segment LSRs whether the segments are live segments; and after all of the segment LSRs of the container LSRs have been traversed, performing a garbage collection operation to reclaim storage space of segments that are not live segments indicated in the segment LSRs of the container LSRs, without traversing the file system namespace again.
2. The method of claim 1, wherein the segments of the file are deduplicated segments contained in one or more containers stored in the persistent storage, and wherein at least a portion of the segments are shared by a plurality of files in the file system namespace.
3. The method of claim 1, further comprising: translating the segment LSRs of the container LSRs into a plurality of persistent LSRs, each persistent LSR corresponding to one of the containers of the storage system, wherein each persistent LSR includes information indicating whether each of the segments contained in a corresponding container is a dead segment; and storing the persistent LSRs in a persistent LSR file in a persistent storage device of the storage system, wherein the garbage collection operation is performed based on the persistent LSRs to reclaim the storage space of segments that are dead.
4. The method of claim 3, wherein each persistent LSR includes a container identifier (ID) identifying a corresponding container and a dead bitmap having a plurality of bits, wherein each bit of the dead bitmap corresponds to one of a plurality segments contained in a corresponding container.
5. The method of claim 4, wherein a bit having a predetermined logical value indicates that a corresponding segment is a dead segment.
6. The method of claim 1, wherein traversing the segment LSRs of the container LSRs based on levels of segments comprises, for each of the levels, iteratively performing: identifying a first segment of a current level from a wanted vector associated with the current level; retrieving the first segment from a first container containing the first segment to determine whether the first segment exists in the first container; and marking a first live flag of a first segment LSR corresponding to the first segment to indicate that the first segment is a live segment.
7. The method of claim 6, further comprising: identifying one or more child segments from a data section of the first segment; and for each of the identified child segments, adding the child segment in a wanted vector of a child level of the current level.
8. The method of claim 6, further comprising: adding the first segment into a found vector corresponding to the current level; and comparing the wanted vector and the found vector of the current level to identify any segment that is missing.
9. The method of claim 8, further comprising recovering the missing segment from a redundant storage system.
10. A non-transitory machine-readable medium having instructions stored therein, which when executed by a processor, cause the processor to perform operations for garbage collection of a storage system, the operations comprising: scanning, by a garbage collector executed by a processor, a plurality of containers in a storage device of a storage system, each of the containers containing a plurality of segments that constitute a plurality of files, wherein each file is represented by a file tree having a plurality of segments in a plurality of levels in a hierarchical structure; creating a plurality of container live segment records (LSRs) corresponding to one of the containers, each of the container LSRs including a plurality of segment LSRs corresponding to a plurality of segments contained therein; after the segment LSRs of the container LSRs have been created for all segments of the containers, sequentially traversing the segment LSRs of the container LSRs based on levels of segments specified in the corresponding segment LSRs to determine and indicate in the corresponding segment LSRs whether the segments are live segments; and after all of the segment LSRs of the container LSRs have been traversed, performing a garbage collection operation to reclaim storage space of segments that are not live segments indicated in the segment LSRs of the container LSRs, without traversing the file system namespace again.
11. The non-transitory machine-readable medium of claim 10, wherein the segments of the file are deduplicated segments contained in one or more containers stored in the persistent storage, and wherein at least a portion of the segments are shared by a plurality of files in the file system namespace.
12. The non-transitory machine-readable medium of claim 10, wherein the operations further comprise: translating the segment LSRs of the container LSRs into a plurality of persistent LSRs, each persistent LSR corresponding to one of the containers of the storage system, wherein each persistent LSR includes information indicating whether each of the segments contained in a corresponding container is a dead segment; and storing the persistent LSRs in a persistent LSR file in a persistent storage device of the storage system, wherein the garbage collection operation is performed based on the persistent LSRs to reclaim the storage space of segments that are dead.
13. The non-transitory machine-readable medium of claim 12, wherein each persistent LSR includes a container identifier (ID) identifying a corresponding container and a dead bitmap having a plurality of bits, wherein each bit of the dead bitmap corresponds to one of a plurality segments contained in a corresponding container.
14. The non-transitory machine-readable medium of claim 13, wherein a bit having a predetermined logical value indicates that a corresponding segment is a dead segment.
15. The non-transitory machine-readable medium of claim 10, wherein traversing the segment LSRs of the container LSRs based on levels of segments comprises, for each of the levels, iteratively performing: identifying a first segment of a current level from a wanted vector associated with the current level; retrieving the first segment from a first container containing the first segment to determine whether the first segment exists in the first container; and marking a first live flag of a first segment LSR corresponding to the first segment to indicate that the first segment is a live segment.
16. The non-transitory machine-readable medium of claim 15, wherein the operations further comprise: identifying one or more child segments from a data section of the first segment; and for each of the identified child segments, adding the child segment in a wanted vector of a child level of the current level.
17. The non-transitory machine-readable medium of claim 15, wherein the operations further comprise: adding the first segment into a found vector corresponding to the current level; and comparing the wanted vector and the found vector of the current level to identify any segment that is missing.
18. The non-transitory machine-readable medium of claim 17, wherein the operations further comprise recovering the missing segment from a redundant storage system.
19. A storage system, comprising: a processor; a memory coupled to the processor; a garbage collector executed in the memory by the processor to perform operations of garbage collection, the operations including scanning a plurality of containers in a storage device of a storage system, each of the containers containing a plurality of segments that constitute a plurality of files, wherein each file is represented by a file tree having a plurality of segments in a plurality of levels in a hierarchical structure, creating a plurality of container live segment records (LSRs) corresponding to one of the containers, each of the container LSRs including a plurality of segment LSRs corresponding to a plurality of segments contained therein, after the segment LSRs of the container LSRs have been created for all segments of the containers, sequentially traversing the segment LSRs of the container LSRs based on levels of segments specified in the corresponding segment LSRs to determine and indicate in the corresponding segment LSRs whether the segments are live segments, and after all of the segment LSRs of the container LSRs have been traversed, performing a garbage collection operation to reclaim storage space of segments that are not live segments indicated in the segment LSRs of the container LSRs, without traversing the file system namespace again.
20. The storage system of claim 19, wherein the segments of the file are deduplicated segments contained in one or more containers stored in the persistent storage, and wherein at least a portion of the segments are shared by a plurality of files in the file system namespace.
21. The storage system of claim 19, wherein the operations further comprise: translating the segment LSRs of the container LSRs into a plurality of persistent LSRs, each persistent LSR corresponding to one of the containers of the storage system, wherein each persistent LSR includes information indicating whether each of the segments contained in a corresponding container is a dead segment; and storing the persistent LSRs in a persistent LSR file in a persistent storage device of the storage system, wherein the garbage collection operation is performed based on the persistent LSRs to reclaim the storage space of segments that are dead.
22. The storage system of claim 21, wherein each persistent LSR includes a container identifier (ID) identifying a corresponding container and a dead bitmap having a plurality of bits, wherein each bit of the dead bitmap corresponds to one of a plurality segments contained in a corresponding container.
23. The storage system of claim 22, wherein a bit having a predetermined logical value indicates that a corresponding segment is a dead segment.
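For illustration only, the flow recited in claims 1 and 3 through 8 can be sketched in Python. This is a minimal in-memory sketch under stated assumptions, not the patented implementation: the names Segment, SegmentLSR, ContainerLSR, build_lsrs, mark_live and to_persistent are hypothetical, segments are identified by a fingerprint string, children of a segment are assumed to sit exactly one level below it, the root fingerprints are assumed to be supplied by the caller, and a bit value of 1 is assumed to be the "predetermined logical value" that marks a dead segment.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    fingerprint: str
    level: int                                    # 0 is the lowest (data) level
    children: list = field(default_factory=list)  # fingerprints of child segments


@dataclass
class SegmentLSR:
    fingerprint: str
    level: int
    live: bool = False                            # live flag, set during traversal


@dataclass
class ContainerLSR:
    container_id: int
    segment_lsrs: list


def build_lsrs(containers):
    """Scan every container once and create one container LSR per container."""
    return [
        ContainerLSR(cid, [SegmentLSR(s.fingerprint, s.level) for s in segs])
        for cid, segs in containers.items()
    ]


def mark_live(containers, lsrs, root_fingerprints, top_level):
    """Walk the levels top-down with wanted/found vectors; return missing segments."""
    lsr_index = {l.fingerprint: l for c in lsrs for l in c.segment_lsrs}
    segments = {s.fingerprint: s for segs in containers.values() for s in segs}
    wanted = {level: set() for level in range(top_level + 1)}
    wanted[top_level] = set(root_fingerprints)
    missing = set()
    for level in range(top_level, -1, -1):
        found = set()
        for fp in wanted[level]:
            if fp not in lsr_index:
                continue                          # segment is not in any container
            lsr_index[fp].live = True             # mark the live flag
            found.add(fp)
            if level > 0:
                # Children go into the wanted vector of the child level.
                wanted[level - 1].update(segments[fp].children)
        # Wanted but not found means the segment is missing.
        missing |= wanted[level] - found
    return missing


def to_persistent(lsrs):
    """Translate container LSRs into (container ID, dead bitmap) pairs."""
    persistent = []
    for c in lsrs:
        dead_bitmap = 0
        for i, l in enumerate(c.segment_lsrs):
            if not l.live:
                dead_bitmap |= 1 << i             # assumed convention: 1 = dead
        persistent.append((c.container_id, dead_bitmap))
    return persistent
```

In this sketch the reclaim step would only need the (container ID, dead bitmap) pairs returned by to_persistent, which mirrors the claim language that storage space is reclaimed from the recorded results rather than from a second namespace traversal. The set of missing fingerprints returned by mark_live corresponds to the wanted/found comparison of claim 8; recovery from a redundant storage system (claim 9) is left out.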
March 14, 2017