US-9229657

Redistributing data in a distributed storage system based on attributes of the data

Published: January 5, 2016
Assignee: not available in USPTO data we have
Inventors: not available in USPTO data we have
Technical Abstract

Accesses to a number of data blocks stored in a distributed storage are observed. Following observation of the accesses, the stored data blocks are redistributed. In one aspect, redistribution of the data blocks includes determining the access patterns for one or more of the data blocks based on the observed accesses, and determining the storage sizes for the one or more data blocks. Thereafter, based on the determined access patterns and determined storage sizes, the one or more data blocks are sorted. Subsequently, the one or more data blocks are redistributed or rebalanced across a number of storage devices of the distributed storage based on the sorting. In one aspect, the one or more data blocks are redistributed according to either a uniform distribution scheme or a proportional distribution scheme.
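
The observation step in the abstract can be illustrated with a small sketch. It is not taken from the patent: the `AccessObserver` and `AccessStats` names, and the choice to summarize an access pattern as an access count plus the most recent access time, are assumptions made only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class AccessStats:
    """Per-block summary of observed accesses (hypothetical shape)."""
    count: int = 0
    last_access: float = 0.0  # timestamp of the most recent observed access


class AccessObserver:
    """Records accesses and summarizes them into a simple access pattern."""

    def __init__(self):
        self._stats = defaultdict(AccessStats)

    def record(self, block_id, timestamp):
        """Note one observed access to a block."""
        stats = self._stats[block_id]
        stats.count += 1
        stats.last_access = max(stats.last_access, timestamp)

    def pattern(self, block_id):
        """Return (access count, last access time); never-seen blocks give (0, 0.0)."""
        stats = self._stats.get(block_id, AccessStats())
        return stats.count, stats.last_access


observer = AccessObserver()
observer.record("blk-a", timestamp=100.0)
observer.record("blk-a", timestamp=250.0)
observer.record("blk-b", timestamp=180.0)
print(observer.pattern("blk-a"))
```

A later redistribution pass could read `pattern()` for each block, together with the block's storage size, to decide how the block is sorted.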

Patent Claims
9 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A computer-implemented method for redistributing a plurality of data blocks stored in a distributed storage, the method comprising:
  observing accesses to the plurality of data blocks stored in the distributed storage; and
  redistributing the plurality of data blocks in the distributed storage, wherein redistribution of the plurality of data blocks in the distributed storage comprises:
    determining access patterns for one or more data blocks from the plurality of data blocks based on the observed accesses;
    determining storage sizes for the one or more data blocks from the plurality of data blocks;
    sorting the one or more data blocks based at least in part on the determined access patterns and on the determined storage sizes, wherein the sorting comprising:
      assigning the one or more data blocks to a plurality of buckets each bucket associated with a particular access pattern level and data block storage size requirement, wherein assigning the one or more data blocks to the plurality of buckets comprises:
        matching a determined access pattern and a determined storage size of a particular data block to an access pattern level and a data block storage size requirement of a particular bucket from the plurality of buckets; and
        assigning the particular data block to the particular bucket based on the matching; and
    redistributing the one or more data blocks across a plurality of storage devices of the distributed storage based on the sorting of the one or more data blocks, wherein the redistributing comprises:
      determining a total number of data blocks assigned to a particular bucket from the plurality of buckets;
      calculating a target number of data blocks for each of the plurality of storage devices for the particular bucket by dividing the determined total number of data blocks by a number of the plurality of storage devices; and
      redistributing the data blocks assigned to the particular bucket across the plurality of storage devices based on the calculated target number of data blocks for each of the plurality of storage devices for the particular bucket.
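
The bucketing and per-bucket target calculation recited in claim 1 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the `Block` type, the two access levels ("hot"/"cold"), the two size classes, their thresholds, and the way a remainder is spread across devices are all hypothetical. The sketch also assumes every block's `device` index is in `range(num_devices)`.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    block_id: str
    last_access: float  # timestamp; stands in for the determined access pattern
    size: int           # storage size in bytes
    device: int         # index of the device currently holding the block


def bucket_key(block, now, hot_window=3600.0, small_limit=1 << 20):
    """Match a block to a bucket: (access pattern level, size requirement).

    Levels and thresholds are illustrative assumptions only.
    """
    level = "hot" if now - block.last_access <= hot_window else "cold"
    size_class = "small" if block.size <= small_limit else "large"
    return level, size_class


def assign_to_buckets(blocks, now):
    """Assign each block to the bucket whose level and size class it matches."""
    buckets = defaultdict(list)
    for block in blocks:
        buckets[bucket_key(block, now)].append(block)
    return buckets


def rebalance_bucket(bucket_blocks, num_devices):
    """Plan moves so each device holds about total / num_devices blocks."""
    total = len(bucket_blocks)
    base, extra = divmod(total, num_devices)
    # Assumption: any remainder goes to the first `extra` devices.
    targets = [base + (1 if d < extra else 0) for d in range(num_devices)]

    per_device = defaultdict(list)
    for block in bucket_blocks:
        per_device[block.device].append(block)

    # Blocks beyond a device's target are candidates to move elsewhere.
    surplus = []
    for d in range(num_devices):
        surplus.extend(per_device[d][targets[d]:])

    moves = []  # (block_id, source device, destination device)
    for d in range(num_devices):
        deficit = targets[d] - len(per_device[d])
        for _ in range(max(deficit, 0)):
            block = surplus.pop()
            moves.append((block.block_id, block.device, d))
    return targets, moves


blocks = [
    Block("a", 9990.0, 100, 0),
    Block("b", 9995.0, 200, 0),
    Block("c", 9999.0, 300, 0),
    Block("d", 9998.0, 100, 0),
    Block("e", 0.0, 5 << 20, 1),
]
buckets = assign_to_buckets(blocks, now=10000.0)
targets, moves = rebalance_bucket(buckets[("hot", "small")], num_devices=2)
print(targets, sorted(moves))
```

In this example all four hot, small blocks start on device 0, so the plan moves two of them to device 1 to reach a target of two per device. The cold, large block sits in its own bucket and is balanced separately.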

2. The computer-implemented method of claim 1, wherein redistributing the one or more data blocks across the plurality of storage devices based on the sorting comprises uniformly redistributing data blocks having similar access patterns and storage sizes across the plurality of storage devices.

3. The computer-implemented method of claim 1, wherein redistributing the data blocks across the plurality of storage devices comprises: selecting a bucket from the plurality of buckets, the bucket having an access pattern level specifying an access time that is more recent than an access time specified by an access pattern level for another bucket from the plurality of data buckets; and redistributing data blocks assigned to the selected bucket prior to redistributing data blocks assigned to the another bucket.
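
The ordering recited in claim 3 (buckets with more recent access times are redistributed first) can be sketched in a few lines. The bucket names and the numeric recency values below are hypothetical, and a full sort is only one possible way to satisfy the "more recent bucket first" condition.

```python
def redistribution_order(bucket_levels):
    """Return bucket names ordered from most to least recently accessed.

    `bucket_levels` maps a bucket name to the access time its level
    specifies (larger means more recent). Names and values are hypothetical.
    """
    return sorted(bucket_levels, key=bucket_levels.get, reverse=True)


levels = {"older": 1, "last_hour": 3, "last_day": 2}
for name in redistribution_order(levels):
    print("redistribute bucket:", name)
```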

4. A non-transitory computer readable storage medium executing computer program instructions for storing data based on access patterns, the computer program instructions comprising instructions for:
  observing accesses to the plurality of data blocks stored in the distributed storage; and
  redistributing the plurality of data blocks in the distributed storage, wherein redistribution of the plurality of data blocks in the distributed storage comprises:
    determining access patterns for one or more data blocks from the plurality of data blocks based on the observed accesses;
    determining storage sizes for the one or more data blocks from the plurality of data blocks;
    sorting the one or more data blocks based at least in part on the determined access patterns and on the determined storage sizes, wherein the sorting comprising:
      assigning the one or more data blocks to a plurality of buckets each bucket associated with a particular access pattern level and data block storage size requirement, wherein assigning the one or more data blocks to the plurality of buckets comprises:
        matching a determined access pattern and a determined storage size of a particular data block to an access pattern level and a data block storage size requirement of a particular bucket from the plurality of buckets; and
        assigning the particular data block to the particular bucket based on the matching; and
    redistributing the one or more data blocks across a plurality of storage devices of the distributed storage based on the sorting of the one or more data blocks, wherein the redistributing comprises:
      determining a total number of data blocks assigned to a particular bucket from the plurality of buckets;
      calculating a target number of data blocks for each of the plurality of storage devices for the particular bucket by dividing the determined total number of data blocks by a number of the plurality of storage devices; and
      redistributing the data blocks assigned to the particular bucket across the plurality of storage devices based on the calculated target number of data blocks for each of the plurality of storage devices for the particular bucket.

5. The medium of claim 4, wherein redistributing the one or more data blocks across the plurality of storage devices based on the sorting comprises uniformly redistributing data blocks having similar access patterns and storage sizes across the plurality of storage devices.

6. The medium of claim 4, wherein redistributing the data blocks across the plurality of storage devices comprises: selecting a bucket from the plurality of buckets, the bucket having an access pattern level specifying an access time that is more recent than an access time specified by an access pattern level for another bucket from the plurality of data buckets; and redistributing data blocks assigned to the selected bucket prior to redistributing data blocks assigned to the another bucket.

7. A system comprising:
  a non-transitory computer readable storage medium storing processor-executable computer program instructions for redistributing data, the instructions comprising instructions for:
    observing accesses to the plurality of data blocks stored in the distributed storage; and
    redistributing the plurality of data blocks in the distributed storage, wherein redistribution of the plurality of data blocks in the distributed storage comprises:
      determining access patterns for one or more data blocks from the plurality of data blocks based on the observed accesses;
      determining storage sizes for the one or more data blocks from the plurality of data blocks;
      sorting the one or more data blocks based at least in part on the determined access patterns and on the determined storage sizes, wherein the sorting comprising:
        assigning the one or more data blocks to a plurality of buckets each bucket associated with a particular access pattern level and data block storage size requirement, wherein assigning the one or more data blocks to the plurality of buckets comprises:
          matching a determined access pattern and a determined storage size of a particular data block to an access pattern level and a data block storage size requirement of a particular bucket from the plurality of buckets; and
          assigning the particular data block to the particular bucket based on the matching; and
      redistributing the one or more data blocks across a plurality of storage devices of the distributed storage based on the sorting of the one or more data blocks, wherein the redistributing comprises:
        determining a total number of data blocks assigned to a particular bucket from the plurality of buckets;
        calculating a target number of data blocks for each of the plurality of storage devices for the particular bucket by dividing the determined total number of data blocks by a number of the plurality of storage devices; and
        redistributing the data blocks assigned to the particular bucket across the plurality of storage devices based on the calculated target number of data blocks for each of the plurality of storage devices for the particular bucket; and
  a processor for executing the computer program instructions.

8. The system of claim 7, wherein redistributing the one or more data blocks across the plurality of storage devices based on the sorting comprises uniformly redistributing data blocks having similar access patterns and storage sizes across the plurality of storage devices.

9. The system of claim 7, wherein redistributing the data blocks across the plurality of storage devices comprises: selecting a bucket from the plurality of buckets, the bucket having an access pattern level specifying an access time that is more recent than an access time specified by an access pattern level for another bucket from the plurality of data buckets; and redistributing data blocks assigned to the selected bucket prior to redistributing data blocks assigned to the another bucket.

Patent Metadata

Filing Date: November 1, 2012
Publication Date: January 5, 2016

Citation & reuse

Analysis on this page is generated by Patentable — an AI-powered patent intelligence platform. AI-generated summaries, explanations, and analysis may be reused with attribution and a visible link back to the canonical URL below. Patent abstracts and claims are USPTO public domain.

Cite as: Patentable. “Redistributing data in a distributed storage system based on attributes of the data” (US-9229657). https://patentable.app/patents/US-9229657
