A method, article of manufacture, and apparatus for populating an index cache on a deduplicated storage system is discussed. A determination to flush an in-memory index to a hard drive (“HDD”) on the deduplicated storage system is made, wherein the in-memory index comprises in-memory hash buckets containing fingerprint identifiers and container identifiers. A first HDD index is loaded from the HDD into a memory, wherein the first HDD index includes a plurality of HDD buckets. The fingerprint identifiers and the container identifiers are merged from the in-memory hash buckets into the HDD buckets. The HDD buckets are mapped to a plurality of solid state drive (“SSD”) buckets, the SSD buckets together comprising an SSD index. The fingerprint identifiers and container identifiers are inserted into the plurality of SSD buckets.
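The flush, merge, and map sequence summarized above can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `bucket_id` and `flush_index`, the use of SHA-1 with a modulo to pick a bucket, and the rule that contiguous groups of HDD buckets fold into one SSD bucket are all assumptions made for illustration; the text only states that a scaling factor between the HDD buckets and the SSD buckets is determined.

```python
import hashlib


def bucket_id(fingerprint: bytes, num_buckets: int) -> int:
    """Hypothetical bucket function: hash the fingerprint, reduce modulo the bucket count."""
    digest = hashlib.sha1(fingerprint).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def flush_index(mem_buckets, hdd_buckets, num_ssd_buckets):
    """Merge in-memory buckets into HDD buckets, then map HDD buckets to SSD buckets.

    mem_buckets and hdd_buckets are lists of dicts: fingerprint -> container id.
    Assumption: the HDD bucket count is an exact multiple of the SSD bucket count.
    """
    num_hdd = len(hdd_buckets)
    if num_hdd % num_ssd_buckets != 0:
        raise ValueError("sketch assumes HDD bucket count is a multiple of SSD bucket count")

    # Merge step: re-bucket every in-memory entry into the loaded HDD index.
    for mem_bucket in mem_buckets:
        for fingerprint, container_id in mem_bucket.items():
            hdd_buckets[bucket_id(fingerprint, num_hdd)][fingerprint] = container_id

    # Mapping step: the scaling factor says how many HDD buckets share one SSD bucket.
    scale = num_hdd // num_ssd_buckets
    ssd_buckets = [{} for _ in range(num_ssd_buckets)]
    for hdd_id, hdd_bucket in enumerate(hdd_buckets):
        # Insert step: copy the merged entries into the mapped SSD bucket.
        ssd_buckets[hdd_id // scale].update(hdd_bucket)
    return ssd_buckets, scale
```

Under these assumptions, a fingerprint that lands in HDD bucket `h` can later be found in SSD bucket `h // scale`, so a lookup needs only one hash computation to address either index.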
Legal claims defining the scope of protection, as filed with the USPTO.
1. A method for populating an index cache on a deduplicated storage system, the method comprising: determining to flush an in-memory index to a hard drive (“HDD”) on the deduplicated storage system, wherein the in-memory index comprises a plurality of in-memory hash buckets, each in-memory hash bucket containing a plurality of fingerprints of a plurality of deduplicated segments and container identifiers identifying containers storing the deduplicated segments, wherein each of the in-memory hash buckets is identified by a hash bucket identifier that is generated by hashing each fingerprint stored in the corresponding in-memory hash bucket; loading a first HDD index from the HDD into a memory, wherein the first HDD index includes a plurality of HDD buckets; merging the fingerprints and the container identifiers from the in-memory hash buckets into the HDD buckets; mapping the HDD buckets to a plurality of solid state drive (“SSD”) buckets including determining a scaling factor between the HDD buckets and the SSD buckets, the SSD buckets together comprising a SSD index, wherein the SSD index is utilized to access deduplicated segments stored in an SSD device operating as a data cache device; and inserting the fingerprints and container identifiers into the plurality of SSD buckets.
2. The method of claim 1, further comprising writing the HDD index back to the HDD.
3. The method of claim 1, further comprising writing the SSD index to the SSD device.
4. The method of claim 1, further comprising storing a lowest container identifier in the SSD buckets.
5. The method of claim 4, wherein container identifiers stored in the SSD buckets comprise an offset from the lowest container identifier stored in the SSD buckets.
6. A system for populating an index cache on a deduplicated storage system, the system comprising a non-transitory computer readable medium and processor enabled to execute instructions for: determining to flush an in-memory index to a hard drive (“HDD”) on the deduplicated storage system, wherein the in-memory index comprises in-memory hash buckets, each in-memory hash bucket containing a plurality of fingerprints of a plurality of deduplicated segments and container identifiers identifying containers storing the deduplicated segments, wherein each of the in-memory hash buckets is identified by a hash bucket identifier that is generated by hashing each fingerprint stored in the corresponding in-memory hash bucket; loading a first HDD index from the HDD into a memory, wherein the first HDD index includes a plurality of HDD buckets; merging the fingerprints and the container identifiers from the in-memory hash buckets into the HDD buckets; mapping the HDD buckets to a plurality of solid state drive (“SSD”) buckets including determining a scaling factor between the HDD buckets and the SSD buckets, the SSD buckets together comprising a SSD index, wherein the SSD index is utilized to access deduplicated segments stored in an SSD device operating as a data cache device; and inserting the fingerprints and container identifiers into the plurality of SSD buckets.
7. The system of claim 6, further comprising writing the HDD index back to the HDD.
8. The system of claim 6, further comprising writing the SSD index to the SSD device.
9. The system of claim 6, further comprising storing a lowest container identifier in the SSD buckets.
10. The system of claim 9, wherein container identifiers stored in the SSD buckets comprise an offset from the lowest container identifier stored in the SSD buckets.
11. A non-transitory computer readable storage medium comprising processor instructions for populating an index cache on a deduplicated storage system, the instructions comprising: determining to flush an in-memory index to a hard drive (“HDD”) on the deduplicated storage system, wherein the in-memory index comprises in-memory hash buckets, each in-memory hash bucket containing a plurality of fingerprints of a plurality of deduplicated segments and container identifiers identifying containers storing the deduplicated segments, wherein each of the in-memory hash buckets is identified by a hash bucket identifier that is generated by hashing each fingerprint stored in the corresponding in-memory hash bucket; loading a first HDD index from the HDD into a memory, wherein the first HDD index includes a plurality of HDD buckets; merging the fingerprints and the container identifiers from the in-memory hash buckets into the HDD buckets; mapping the HDD buckets to a plurality of solid state drive (“SSD”) buckets including determining a scaling factor between the HDD buckets and the SSD buckets, the SSD buckets together comprising a SSD index, wherein the SSD index is utilized to access deduplicated segments stored in an SSD device operating as a data cache device; and inserting the fingerprints and container identifiers into the plurality of SSD buckets.
12. The computer readable storage medium of claim 11, further comprising writing the HDD index back to the HDD.
13. The computer readable storage medium of claim 11, further comprising writing the SSD index to the SSD device.
14. The computer readable storage medium of claim 11, further comprising storing a lowest container identifier in the SSD buckets.
15. The computer readable storage medium of claim 14, wherein container identifiers stored in the SSD buckets comprise an offset from the lowest container identifier stored in the SSD buckets.
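The dependent claims on storing a lowest container identifier and expressing the other container identifiers as offsets from it can be illustrated with a short Python sketch. It is one possible reading, not the claimed implementation. The helper names `encode_bucket` and `lookup_container` are hypothetical, and the sketch assumes the lowest identifier is kept per SSD bucket, which the claims do not spell out.

```python
def encode_bucket(entries):
    """Encode one SSD bucket as (lowest container id, {fingerprint: offset}).

    entries maps fingerprint -> absolute container id.
    Assumption: the base value is stored once per bucket.
    """
    if not entries:
        return None, {}
    base = min(entries.values())
    # Offsets are non-negative and typically much smaller than absolute ids,
    # so they can fit in fewer bytes on the SSD.
    offsets = {fp: cid - base for fp, cid in entries.items()}
    return base, offsets


def lookup_container(base, offsets, fingerprint):
    """Recover the absolute container id for a fingerprint, or None if absent."""
    offset = offsets.get(fingerprint)
    return None if offset is None else base + offset
```

For example, a bucket holding container identifiers 1000, 1003, and 1001 would store the base 1000 once and the offsets 0, 3, and 1 alongside the fingerprints.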
March 24, 2016
January 8, 2019