Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.
1. A method of managing data received for writing, the method comprising: during a first cache processing cycle, (i) placing M compressed blocks of the data in a segment of storage space, (ii) populating M entries of a mapping structure in cache for mapping the M compressed blocks in the segment, the mapping structure having space for N entries, N>M, with N-M locations of the mapping structure remaining unpopulated, and (iii), applying a hold to the mapping structure in the cache to ensure that the mapping structure is retained in the cache; during a second cache processing cycle, (i) placing between 1 and N-M additional compressed blocks in the segment and (ii) populating an additional location in the mapping structure for each additional compressed block placed or to be placed in the segment; and during or after the second cache processing cycle, releasing the hold on the mapping structure.
This claim covers a method for writing compressed data blocks while keeping their mapping metadata in cache across two cache processing cycles. The problem addressed is inefficient use of cached mapping metadata when a cache processing cycle supplies fewer blocks than a mapping structure can map. In the first cycle, the method places M compressed blocks in a segment of storage space and populates M entries of a mapping structure in cache to map those blocks. The mapping structure has space for N entries, where N is greater than M, so N-M entries remain unpopulated. The method then applies a hold to the mapping structure so that it is retained in the cache. In the second cycle, the method places between 1 and N-M additional compressed blocks in the same segment and populates one additional mapping entry for each of them. During or after the second cycle, the hold is released and the cache can manage the mapping structure normally. The apparent benefit is that a partially filled mapping structure is filled further before it leaves the cache, so fewer of its entries go unused.
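The two cycles of claim 1 can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class names `MappingStructure` and `Segment`, the `(offset, length)` entry layout, and the boolean `held` flag are hypothetical and are not recited in the claim.

```python
from dataclasses import dataclass, field


@dataclass
class MappingStructure:
    """Hypothetical mapping structure with room for n entries."""
    n: int
    entries: list = field(default_factory=list)  # one (offset, length) per block
    held: bool = False  # stand-in for the claimed "hold"

    def free_slots(self) -> int:
        return self.n - len(self.entries)


@dataclass
class Segment:
    """Hypothetical segment: a contiguous region of storage space."""
    capacity: int
    data: bytearray = field(default_factory=bytearray)


def place_blocks(segment, mapping, blocks):
    """Append each compressed block to the segment and record one entry for it."""
    for block in blocks:
        if mapping.free_slots() == 0:
            raise ValueError("mapping structure is full")
        if len(segment.data) + len(block) > segment.capacity:
            raise ValueError("segment is full")
        mapping.entries.append((len(segment.data), len(block)))
        segment.data.extend(block)


N = 4
segment = Segment(capacity=64)
mapping = MappingStructure(n=N)

# First cycle: place M = 2 blocks, leaving N - M = 2 entries unpopulated,
# then apply the hold so the structure stays in cache.
place_blocks(segment, mapping, [b"aaaa", b"bbbbbb"])
mapping.held = True

# Second cycle: place between 1 and N - M additional blocks in the same segment.
place_blocks(segment, mapping, [b"cc"])

# During or after the second cycle: release the hold.
mapping.held = False
```

In this sketch the hold is only a flag; a real cache would presumably consult it before evicting or flushing the structure, which the claim leaves open.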
2. The method of claim 1, wherein the M compressed blocks have a collective size in the segment, and wherein the method further comprises, during the first cache processing cycle and prior to placing the M compressed blocks in the segment, allocating the segment with a size that exceeds the collective size of the M compressed blocks by a margin, the margin based upon a predicted size of N-M additional compressed blocks.
3. The method of claim 2, further comprising predicting the size of the N-M additional compressed blocks based on (i) a block size of uncompressed blocks in the cache and (ii) a compression ratio achieved when compressing previously processed blocks.
Claims 2 and 3 address how large to make the segment when it will be filled over more than one cycle. Under claim 2, the segment is allocated before the M compressed blocks are placed, with a size that exceeds the collective size of those M blocks by a margin. The margin is based on the predicted size of the N-M additional compressed blocks that may arrive later. Claim 3 specifies the inputs to that prediction: (i) the block size of uncompressed blocks in the cache and (ii) the compression ratio achieved when compressing previously processed blocks. Reserving the margin up front leaves room in the segment for the blocks placed in the second cycle. The claims do not recite a specific formula for combining the two inputs, and they do not recite adjusting N-M over time.
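One plausible reading of the margin calculation in claims 2 and 3 is sketched below. The formula is an assumption: the claims name the two inputs but not how they are combined. Here `compression_ratio` is assumed to mean uncompressed size divided by compressed size, and the function names are hypothetical.

```python
import math


def predict_margin(n, m, block_size, compression_ratio):
    """Predicted total size of the N - M blocks still to come.

    Assumes compression_ratio = uncompressed size / compressed size,
    observed on previously processed blocks (e.g. 4.0 means 4:1).
    """
    predicted_block = math.ceil(block_size / compression_ratio)
    return (n - m) * predicted_block


def segment_size(compressed_sizes, n, block_size, compression_ratio):
    """Collective size of the M blocks plus the predicted margin."""
    m = len(compressed_sizes)
    return sum(compressed_sizes) + predict_margin(n, m, block_size, compression_ratio)


# 8 blocks already compressed to 2048 bytes each, mapping structure holds N = 12,
# uncompressed blocks are 8192 bytes, earlier blocks compressed at 4:1.
size = segment_size([2048] * 8, n=12, block_size=8192, compression_ratio=4.0)
print(size)  # 16384 bytes of data + 4 * 2048 bytes of margin = 24576
```

Rounding each predicted block up with `math.ceil` is a conservative choice made for this sketch, not something the claims require.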
4. The method of claim 2, wherein one or more of the M compressed blocks represent overwrites of data elements already stored, and wherein the method further comprises, for at least one of the M compressed blocks: identifying a current storage location of the respective data element; and confirming that the compressed block does not fit in a space provided for the respective data element at the current storage location.
5. The method of claim 4, wherein, during the first cache processing cycle, the M compressed blocks belong to a batch of N compressed blocks, and wherein the method further comprises, for each of the N compressed blocks in the batch that is not one of the M compressed blocks, placing the compressed block in another segment that has available space for accommodating the compressed block.
6. The method of claim 4, wherein the cache operates repeatedly on successive cache processing cycles, and wherein during each cache processing cycle, the method includes (i) allocating a new segment for placing compressed blocks during that cache processing cycle and (ii) allocating a new mapping structure for mapping the compressed blocks in the respective new segment.
Claim 6 describes how the method repeats. The cache operates on successive cache processing cycles, and in each cycle the method allocates (i) a new segment for the compressed blocks placed during that cycle and (ii) a new mapping structure for mapping the compressed blocks in that segment. Each cycle therefore starts with its own pairing of segment and mapping structure. Read together with claim 1, a cycle that populates fewer than N entries can leave behind a held mapping structure whose segment still has room, so that a later cycle can add blocks to it.
7. The method of claim 6, wherein allocating each new segment includes over-allocating space within the new segment when fewer than N compressed blocks are being placed in the new segment during the cache processing cycle.
8. The method of claim 6, further comprising running multiple computing threads that manage data in the cache, each of the computing threads configured to place holds on certain mapping structures, and wherein the method further comprises maintaining a hold list of mapping structures on which holds have been placed by the computing threads.
9. The method of claim 8, further comprising: obtaining, by one of the computing threads, a new batch of N compressed blocks from the cache; accessing the hold list; identifying a mapping structure on the hold list that maps to a segment that has available space for storing one or more of the N compressed blocks, and placing said one or more of the N compressed blocks in the segment mapped by the identified mapping structure.
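The hold list of claims 8 and 9 can be pictured as a shared list that worker threads consult before allocating anything new. The sketch below is illustrative only: `HeldMapping`, `HoldList`, the lock, and the first-match search order are all assumptions, since the claims do not recite a data layout, a locking scheme, or a selection policy.

```python
import threading
from dataclasses import dataclass


@dataclass
class HeldMapping:
    """Hypothetical record for a mapping structure that is under a hold."""
    name: str
    free_slots: int  # unpopulated entries remaining
    free_bytes: int  # space remaining in the mapped segment


class HoldList:
    """Hypothetical hold list shared by the computing threads."""

    def __init__(self):
        self._lock = threading.Lock()
        self._held = []

    def add(self, mapping):
        with self._lock:
            self._held.append(mapping)

    def place(self, block_sizes):
        """Place as many blocks as fit into the first held segment with room.

        Returns (mapping name or None, sizes placed, sizes left over).
        """
        with self._lock:
            for mapping in self._held:
                placed, leftover = [], []
                for size in block_sizes:
                    if mapping.free_slots > 0 and size <= mapping.free_bytes:
                        mapping.free_slots -= 1
                        mapping.free_bytes -= size
                        placed.append(size)
                    else:
                        leftover.append(size)
                if placed:
                    return mapping.name, placed, leftover
            return None, [], list(block_sizes)


hold_list = HoldList()
hold_list.add(HeldMapping("map-A", free_slots=0, free_bytes=4096))
hold_list.add(HeldMapping("map-B", free_slots=2, free_bytes=3000))

# A thread obtains a new batch and checks the hold list before allocating.
result = hold_list.place([2000, 1500, 900])
```

Here `map-A` is skipped because it has no unpopulated entries, and `map-B` accepts the 2000-byte and 900-byte blocks; the 1500-byte block would need another segment.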
10. A computerized apparatus, comprising control circuitry that includes a set of processing units coupled to memory, the control circuitry constructed and arranged to: during a first cache processing cycle, (i) place M compressed blocks of the data in a segment of storage space, (ii) populate M entries of a mapping structure in cache for mapping the M compressed blocks in the segment, the mapping structure having space for N entries, N>M, with N-M locations of the mapping structure remaining unpopulated, and (iii), apply a hold to the mapping structure in the cache to ensure that the mapping structure is retained in the cache; during a second cache processing cycle, (i) place between 1 and N-M additional compressed blocks in the segment and (ii) populate an additional location in the mapping structure for each additional compressed block placed or to be placed in the segment; and during or after the second cache processing cycle, release the hold on the mapping structure.
11. The computerized apparatus of claim 10, wherein the M compressed blocks have a collective size in the segment, and wherein, during the first cache processing cycle and prior to placing the M compressed blocks in the segment, the control circuitry is further constructed and arranged to allocate the segment with a size that exceeds the collective size of the M compressed blocks by a margin, the margin based upon a predicted size of N-M additional compressed blocks.
12. A computer program product including a set of non-transitory, computer-readable media having instructions which, when executed by control circuitry of a computerized apparatus, cause the control circuitry to perform a method of managing data received for writing, the method comprising: during a first cache processing cycle, (i) placing M compressed blocks of the data in a segment of storage space, (ii) populating M entries of a mapping structure in cache for mapping the M compressed blocks in the segment, the mapping structure having space for N entries, N>M, with N-M locations of the mapping structure remaining unpopulated, and (iii), applying a hold to the mapping structure in the cache to ensure that the mapping structure is retained in the cache; during a second cache processing cycle, (i) placing between 1 and N-M additional compressed blocks in the segment and (ii) populating an additional location in the mapping structure for each additional compressed block placed or to be placed in the segment; and during or after the second cache processing cycle, releasing the hold on the mapping structure.
13. The computer program product of claim 12, wherein the M compressed blocks have a collective size in the segment, and wherein the method further comprises, during the first cache processing cycle and prior to placing the M compressed blocks in the segment, allocating the segment with a size that exceeds the collective size of the M compressed blocks by a margin, the margin based upon a predicted size of N-M additional compressed blocks.
14. The computer program product of claim 13, wherein the method further comprises predicting the size of the N-M additional compressed blocks based on (i) a block size of uncompressed blocks in the cache and (ii) a compression ratio achieved when compressing previously processed blocks.
15. The computer program product of claim 13, wherein one or more of the M compressed blocks represent overwrites of data elements already stored, and wherein the method further comprises, for at least one of the M compressed blocks: identifying a current storage location of the respective data element; and confirming that the compressed block does not fit in a space provided for the respective data element at the current storage location.
Claim 15 covers the case where some of the M compressed blocks are overwrites of data elements that are already stored. For at least one such block, the method identifies the current storage location of the data element and confirms that the compressed block does not fit in the space provided for that element at that location. In plain terms, new data that compresses to more than the space its old version occupies cannot be written in place, so it is among the blocks placed in the newly allocated segment described in the parent claims. The claim recites only the location lookup and the fit check; it does not recite applying further compression or any other fallback.
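The fit check for overwrites can be sketched as a simple partition of a batch. The dictionary layout, the function name `split_overwrites`, and the idea that blocks which do fit are rewritten in place are assumptions for illustration; the claim itself recites only identifying the current location and confirming that the block does not fit.

```python
def split_overwrites(blocks, current_slots):
    """Partition blocks into those that fit their current slot and those that do not.

    blocks:        {logical address: compressed bytes}           (hypothetical layout)
    current_slots: {logical address: (offset, length of slot)}   (hypothetical layout)
    """
    fits_in_place, needs_new_segment = [], []
    for addr, data in blocks.items():
        slot = current_slots.get(addr)  # identify the current storage location
        if slot is not None and len(data) <= slot[1]:
            fits_in_place.append(addr)
        else:
            # Either not an overwrite, or confirmed not to fit in the existing space.
            needs_new_segment.append(addr)
    return fits_in_place, needs_new_segment


blocks = {10: b"x" * 300, 11: b"y" * 700, 12: b"z" * 100}
current_slots = {10: (0, 512), 11: (512, 512)}  # address 12 is a first-time write

fits, relocate = split_overwrites(blocks, current_slots)
```

In this example address 10 fits its 512-byte slot, address 11 has grown past its slot, and address 12 has no existing slot, so 11 and 12 would be candidates for the new segment.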
16. The computer program product of claim 15, wherein, during the first cache processing cycle, the M compressed blocks belong to a batch of N compressed blocks, and wherein the method further comprises, for each of the N compressed blocks in the batch that is not one of the M compressed blocks, placing the compressed block in another segment that has available space for accommodating the compressed block.
Claim 16 covers what happens to the rest of a batch. During the first cache processing cycle, the M compressed blocks belong to a batch of N compressed blocks. Each block in the batch that is not one of the M blocks is placed in another segment that has available space to accommodate it. The M blocks therefore go to the segment described in the parent claims, and the remaining N-M blocks are directed to other segments of storage space that have room for them. The claim does not recite how those other segments are chosen or any tracking of utilization metrics.
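As a rough illustration of directing the remaining blocks to other segments, the sketch below uses a first-fit search. First-fit is a swapped-in, assumed policy, and the names `place_elsewhere` and `segments_free` are hypothetical; the claim only requires that the chosen segment has available space for the block.

```python
def place_elsewhere(block_sizes, segments_free):
    """Assign each remaining block to the first other segment with enough free space.

    segments_free: {segment name: free bytes}, updated in place.
    Returns a list of (block size, segment name or None if nothing has room).
    """
    placements = []
    for size in block_sizes:
        target = None
        for name, free in segments_free.items():
            if size <= free:
                target = name
                segments_free[name] = free - size
                break
        placements.append((size, target))
    return placements


# The N - M blocks of the batch that were not placed in the new segment.
remaining = [400, 300, 600]
segments_free = {"seg-1": 100, "seg-2": 900}

placements = place_elsewhere(remaining, segments_free)
```

With these numbers the 400-byte and 300-byte blocks land in `seg-2`, after which no listed segment has room for the 600-byte block; what happens to such a block is outside what the claim recites.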
17. The computer program product of claim 15, wherein the cache operates repeatedly on successive cache processing cycles, and wherein during each cache processing cycle, the method includes (i) allocating a new segment for placing compressed blocks during that cache processing cycle and (ii) allocating a new mapping structure for mapping the compressed blocks in the respective new segment.
18. The computer program product of claim 17, wherein allocating each new segment includes over-allocating space within the new segment when fewer than N compressed blocks are being placed in the new segment during the cache processing cycle.
19. The computer program product of claim 17, further comprising running multiple computing threads that manage data in the cache, each of the computing threads configured to place holds on certain mapping structures, and wherein the method further comprises maintaining a hold list of mapping structures on which holds have been placed by the computing threads.
20. The computer program product of claim 19, wherein the method further comprises: obtaining, by one of the computing threads, a new batch of N compressed blocks from the cache; accessing the hold list; identifying a mapping structure on the hold list that maps to a segment that has available space for storing one or more of the N compressed blocks, and placing said one or more of the N compressed blocks in the segment mapped by the identified mapping structure.
April 6, 2021