Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.
1. A computer program product for managing data in a computer readable cache system comprising a first cache, a second cache, and a storage system, the computer program product comprising a computer readable storage medium having computer readable program code embodied therein that executes to perform operations, the operations comprising: maintaining information on strides configured in the second cache and occupancy counts for the strides indicating an extent to which the strides are populated with valid tracks and invalid tracks, wherein a stride having no valid tracks is empty, wherein the strides comprise data strides populated with tracks of data, and wherein the tracks are maintained in the storage system; determining tracks to demote from the first cache, wherein the first cache comprises a dynamic random access memory (DRAM) and wherein the second cache comprises n solid state storage devices; forming a first stride including the determined tracks to demote; stripping the first stride of tracks from the first cache across the n solid state storage devices to form a second stride in the second cache having an occupancy count indicating the stride is empty, wherein the first cache stores tracks comprising modified or unmodified data and sequential or non-sequential data, and wherein tracks formed in strides in the first cache to promote to the second cache comprise unmodified non-sequential data; determining a target stride in the second cache based on the occupancy counts of the strides in the second cache; determining at least two source strides in the second cache having valid tracks based on the occupancy counts of the strides in the second cache; and populating the target stride with the valid tracks from the source strides.
A system manages data across a DRAM cache (the first cache), an SSD cache (the second cache, made up of n solid state storage devices), and a backing storage system that holds the tracks. It keeps a record of the "strides" (groups of tracks striped across the devices) configured in the SSD cache, along with an "occupancy count" for each stride that indicates how many of its tracks are valid; a stride with no valid tracks is empty. When tracks are selected for demotion from the DRAM cache, the system groups them into a "first stride" and stripes that stride across the n SSDs into an empty stride of the SSD cache, forming a "second stride". To reclaim space in the SSD cache, the system uses the occupancy counts to pick a "target stride" and at least two "source strides" that still hold valid tracks, then populates the target stride with the valid tracks from the source strides, in effect defragmenting the SSD cache. The DRAM cache may hold modified or unmodified, sequential or non-sequential data, but only unmodified non-sequential tracks are formed into strides for the SSD cache.
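The bookkeeping and striping step of claim 1 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the `Stride` class, the `stripe_into_empty_stride` function, the string track IDs, and the round-robin placement over devices are all hypothetical choices made for the example.

```python
from typing import Dict, List, Optional


class Stride:
    """A stride in the second (SSD) cache. A slot holding None has no valid track."""

    def __init__(self, capacity: int) -> None:
        self.slots: List[Optional[str]] = [None] * capacity
        self.occupancy = 0  # number of valid tracks in this stride


def stripe_into_empty_stride(tracks: List[str], strides: List[Stride],
                             n_devices: int) -> Dict[int, List[str]]:
    """Write demoted tracks into the first empty stride, round-robin over devices.

    Returns a map of device index -> tracks placed on that device.
    """
    target = next((s for s in strides if s.occupancy == 0), None)
    if target is None:
        raise RuntimeError("no empty stride available")
    if len(tracks) > len(target.slots):
        raise ValueError("more tracks than the stride can hold")
    placement: Dict[int, List[str]] = {d: [] for d in range(n_devices)}
    for pos, track in enumerate(tracks):
        target.slots[pos] = track
        placement[pos % n_devices].append(track)  # assumed round-robin layout
    target.occupancy = len(tracks)
    return placement


# Stride 0 already holds one valid track, so the demoted tracks go to stride 1.
strides = [Stride(4), Stride(4)]
strides[0].slots[0], strides[0].occupancy = "T0", 1
placement = stripe_into_empty_stride(["T7", "T8", "T9"], strides, n_devices=2)
```

In this sketch the occupancy count is simply the number of valid tracks; claim 6 shows that a percentage could be used instead.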
2. The computer program product of claim 1 , wherein the operations further comprise; invalidating the tracks in the source strides added to the target stride; and reducing the occupancy count of each of the source strides by a number of the valid tracks added to the target stride from the source stride.
In the system of claim 1, after valid tracks are copied from the source strides into the target stride, the original copies in the source strides are marked invalid. The occupancy count of each source stride is then reduced by the number of tracks that stride contributed to the target stride. This keeps the occupancy counts consistent with where the valid tracks actually reside after consolidation.
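The consolidation step, together with the invalidation and count adjustment of claim 2, might look like the sketch below. The dictionary layout (`"slots"` and `"count"` keys) and the `consolidate` function are assumptions made for illustration; the claims do not prescribe any data structure.

```python
def consolidate(target, sources):
    """Move valid tracks from the source strides into free slots of the target.

    Each stride is a dict with "slots" (None = invalid or free slot) and
    "count" (number of valid tracks). Both are hypothetical names.
    """
    free = [i for i, t in enumerate(target["slots"]) if t is None]
    for src in sources:
        moved = 0
        for j, track in enumerate(src["slots"]):
            if track is None:
                continue              # nothing valid in this slot
            if not free:
                break                 # target stride is full
            target["slots"][free.pop(0)] = track
            src["slots"][j] = None    # invalidate the copy in the source stride
            moved += 1
        src["count"] -= moved         # reduce by the number of tracks contributed
        target["count"] += moved


target = {"slots": ["A", None, None, None], "count": 1}
s1 = {"slots": [None, "B", None, "C"], "count": 2}
s2 = {"slots": ["D", None, None, None], "count": 1}
consolidate(target, [s1, s2])
```

After the call, both source strides in this example are empty and could receive a newly demoted stride.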
3. The computer program product of claim 1 , wherein determining the target stride comprises: selecting one of the strides in the second cache having a lowest occupancy count of the strides in the second cache to be the target stride.
In the system of claim 1, the target stride for consolidation is the stride in the SSD cache with the lowest occupancy count. Consolidating into the least-occupied stride is presumably meant to free up the source strides, so that more completely empty strides are available for tracks demoted from the DRAM cache.
4. The computer program product of claim 3 , wherein the operations further comprise: selecting the at least two strides in the second cache having lowest occupancy counts of the strides in the second cache other than the target stride to be the source strides.
Building on claim 3, where the target stride is the stride with the lowest occupancy count, the source strides (the strides that contribute valid tracks to the target stride) are the at least two strides with the next-lowest occupancy counts. The target stride itself is excluded when the source strides are selected.
5. The computer program product of claim 4 , wherein selecting the at least two strides in the second cache to be the source strides further comprises: selecting a number of source strides having the lowest occupancy counts of the strides in the second cache other than the target stride until a number of the valid tracks in the selected source strides is at least equal to a number of tracks in the target stride.
Building on claim 4, the system keeps selecting source strides in order of lowest occupancy count (excluding the target stride) until the total number of valid tracks in the selected source strides is at least equal to the number of tracks in the target stride. This ensures that enough valid tracks are available to fill the target stride completely.
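The selection rules of claims 3 through 5 can be sketched as one ranking pass. The function name `select_target_and_sources` and the `stride_size` parameter are hypothetical, and the sketch assumes that "a number of tracks in the target stride" means the stride's full capacity; the claim language could be read differently.

```python
def select_target_and_sources(counts, stride_size):
    """Pick a target stride and source strides from occupancy counts.

    counts maps a stride ID to its number of valid tracks.
    """
    ranked = sorted(counts, key=lambda sid: (counts[sid], sid))
    target = ranked[0]                      # lowest occupancy count (claim 3)
    sources, valid = [], 0
    for sid in ranked[1:]:                  # target itself is excluded (claim 4)
        if counts[sid] == 0:
            continue                        # a source must hold valid tracks
        if len(sources) >= 2 and valid >= stride_size:
            break                           # enough tracks to fill the target (claim 5)
        sources.append(sid)
        valid += counts[sid]
    return target, sources


counts = {"s0": 0, "s1": 1, "s2": 2, "s3": 3, "s4": 4}
target, sources = select_target_and_sources(counts, stride_size=4)
```

Here `s1` and `s2` together hold only three valid tracks, so `s3` is also selected before the loop stops.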
6. The computer program product of claim 4 , wherein the lowest occupancy count is determined as one of a lowest number of valid tracks and a lowest percentage of valid tracks.
Building on claim 4, the "lowest occupancy count" used to select the target and source strides can be measured either as the lowest absolute number of valid tracks in a stride or as the lowest percentage of the stride's tracks that are valid. The claim covers both measures and does not say when one is preferred over the other.
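A small sketch shows how the two measures can disagree. It assumes, purely for illustration, that strides may have different capacities; when all strides are the same size, the two measures rank them identically. The function names are hypothetical.

```python
def lowest_by_count(strides):
    """strides maps a stride ID to (valid_tracks, capacity)."""
    return min(strides, key=lambda sid: strides[sid][0])


def lowest_by_percentage(strides):
    return min(strides, key=lambda sid: strides[sid][0] / strides[sid][1])


# Stride "a" holds 3 of 4 tracks (75%); stride "b" holds 4 of 16 tracks (25%).
strides = {"a": (3, 4), "b": (4, 16)}
by_count = lowest_by_count(strides)
by_percentage = lowest_by_percentage(strides)
```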
7. The computer program product of claim 1 , wherein the operations further comprise: determining one of the tracks in one of the strides in the second cache to demote from the second cache; demoting the determined track to demote from the second cache; invalidating the determined track to demote in the second cache; and decrementing the occupancy count of the stride including the determined track to demote.
In the system of claim 1, individual tracks can also be demoted from the SSD cache. The system picks a track in one of the strides, demotes it, marks it invalid in the SSD cache, and decrements the occupancy count of the stride that contained it. The claim does not specify how the track is chosen.
8. The computer program product of claim 1 , wherein the first cache is a faster access device than the second cache and wherein the second cache is a faster access device than the storage devices.
In the system of claim 1, the DRAM cache is faster to access than the SSD cache, and the SSD cache is faster to access than the devices of the underlying storage system. This forms a tiered hierarchy in which each level is slower than the one above it.
9. The computer program product of claim 1 , wherein the storage system is comprised of a plurality of slower access devices than the solid state storage devices.
In the system of claim 1, the storage system consists of multiple storage devices that are slower to access than the solid state storage devices of the SSD cache.
10. The computer program product of claim 1 , wherein the operations further comprise: receiving a write to a track in the first cache; determining whether the track receiving the write is included in the second cache; invalidating the track in the second cache updated in the first cache in response to determining that the track written to in the first cache is included in the second cache; and decrementing the occupancy count for the stride in the second cache including the invalidated track in the second cache.
In the system of claim 1, when a write is received for a track in the DRAM cache, the system checks whether that track is also held in the SSD cache. If it is, the copy in the SSD cache is now stale, so it is invalidated and the occupancy count of the stride that contained it is decremented. This keeps the SSD cache from serving an out-of-date copy of the track.
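The write path of claim 10 can be sketched as a lookup followed by an invalidation. The `location` index, the `slots` and `counts` dictionaries, and the `on_write` function are all hypothetical structures chosen for the example.

```python
def on_write(track, location, slots, counts):
    """Invalidate the second-cache copy of a track written in the first cache.

    location maps a track ID to (stride_id, slot_index) for tracks that are
    valid in the second cache. Returns True if a copy was invalidated.
    """
    entry = location.pop(track, None)
    if entry is None:
        return False                  # track is not in the second cache
    stride_id, slot = entry
    slots[stride_id][slot] = None     # mark the stale copy invalid
    counts[stride_id] -= 1            # decrement that stride's occupancy count
    return True


slots = {"s0": ["T1", "T2", None, None]}
counts = {"s0": 2}
location = {"T1": ("s0", 0), "T2": ("s0", 1)}
hit = on_write("T2", location, slots, counts)
miss = on_write("T9", location, slots, counts)
```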
11. The computer program product of claim 1 , wherein the operations further comprise: determining whether a number of strides having an occupancy count indicating the stride is empty is below a predetermined free stride threshold; and wherein the determining of the target stride, the determining of the at least two source strides, and the populating are performed in response to determining that the number of strides having an occupancy count indicating the stride is empty is below the predetermined free threshold.
In the system of claim 1, the system monitors the number of empty strides (strides with no valid tracks) in the SSD cache. Consolidation (determining the target stride, determining the source strides, and populating the target stride) is triggered when that number falls below a predetermined free stride threshold. This is intended to keep enough empty strides available for tracks being demoted from the DRAM cache.
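The trigger condition of claim 11 reduces to counting empty strides and comparing against the threshold. In this sketch the function name `needs_consolidation` and the threshold values are made up for illustration.

```python
def needs_consolidation(counts, free_stride_threshold):
    """Return True when the number of empty strides is below the threshold.

    counts maps a stride ID to its number of valid tracks; 0 means empty.
    """
    empty = sum(1 for c in counts.values() if c == 0)
    return empty < free_stride_threshold


counts = {"a": 0, "b": 2, "c": 1, "d": 0}      # two empty strides
run_now = needs_consolidation(counts, free_stride_threshold=3)
run_later = needs_consolidation(counts, free_stride_threshold=2)
```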
12. The computer program product of claim 1 , wherein forming the first stride of tracks comprising forming a stride for a Redundant Array of Independent Disk (RAID) configuration based on a RAID configuration defined for the second cache as having n devices including m devices for storing tracks of data and at least one parity device to store parity data calculated from the tracks of data for the m devices.
In the system of claim 1, the first stride of tracks demoted from the DRAM cache is formed to match the RAID (Redundant Array of Independent Disks) configuration defined for the SSD cache. Of the n devices, m store tracks of data and at least one stores parity data calculated from the tracks on the m data devices. The parity provides redundancy for the strides held in the SSD cache.
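As an illustration of parity calculated from the data tracks, the sketch below assumes a single XOR parity device (so n = m + 1, as in RAID 5). The claim only requires "at least one parity device" and does not name a parity scheme, so XOR is an assumption here, as are the function names.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def form_raid_stride(tracks, m):
    """Return the m data tracks followed by one XOR parity block."""
    if len(tracks) != m:
        raise ValueError("expected exactly m data tracks")
    if len({len(t) for t in tracks}) != 1:
        raise ValueError("tracks must be the same length")
    return list(tracks) + [xor_blocks(tracks)]


stride = form_raid_stride([b"\x01\x02", b"\x04\x08", b"\x10\x20"], m=3)
parity = stride[-1]

# If the device holding the second track fails, XOR of the rest rebuilds it.
rebuilt = xor_blocks([stride[0], stride[2], parity])
```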
13. A system in communication with a storage system, comprising: a processor; a first cache comprising a dynamic random access memory (DRAM) accessible to the processor; a second cache comprising n solid state storage devices accessible to the processor; a computer readable storage medium having computer readable program code embodied therein executed by the processor to perform operations, the operations comprising: maintaining information on strides configured in the second cache and occupancy counts for the strides indicating an extent to which the strides are populated with valid tracks and invalid tracks, wherein a stride having no valid tracks is empty, wherein the strides comprise data strides populated with tracks of data, and wherein the tracks are maintained in the storage system; determining tracks to demote from the first cache; forming a first stride including the determined tracks to demote; stripping the first stride of tracks from the first cache across the n solid state storage devices to form a second stride in the second cache having an occupancy count indicating the stride is empty, wherein the first cache stores tracks comprising modified or unmodified data and sequential or non-sequential data, and wherein tracks formed in strides in the first cache to promote to the second cache comprise unmodified non-sequential data; determining a target stride in the second cache based on the occupancy counts of the strides in the second cache; determining at least two source strides in the second cache having valid tracks based on the occupancy counts of the strides in the second cache; and populating the target stride with the valid tracks from the source strides.
A system in communication with a storage system includes a processor, a DRAM cache (the first cache), an SSD cache (the second cache, made up of n solid state storage devices), and program code executed by the processor. The code keeps a record of the "strides" configured in the SSD cache, along with an "occupancy count" for each stride that indicates how many of its tracks are valid; a stride with no valid tracks is empty. When tracks are selected for demotion from the DRAM cache, they are grouped into a "first stride", which is striped across the n SSDs into an empty stride of the SSD cache to form a "second stride". The system then consolidates data by using the occupancy counts to pick a "target stride" and at least two "source strides" that hold valid tracks, and populating the target stride with the valid tracks from the source strides. The DRAM cache may hold modified or unmodified, sequential or non-sequential data, but only unmodified non-sequential tracks are formed into strides for the SSD cache.
14. The system of claim 13 , wherein the operations further comprise; invalidating the tracks in the source strides added to the target stride; and reducing the occupancy count of each of the source strides by a number of the valid tracks added to the target stride from the source stride.
In the system of claim 13, after valid tracks are copied from the source strides into the target stride, the original copies in the source strides are marked invalid. The occupancy count of each source stride is then reduced by the number of tracks that stride contributed to the target stride. This keeps the occupancy counts consistent with where the valid tracks actually reside after consolidation.
15. The system of claim 13 , wherein determining the target stride comprises: selecting one of the strides in the second cache having a lowest occupancy count of the strides in the second cache to be the target stride.
In the system of claim 13, the target stride for consolidation is the stride in the SSD cache with the lowest occupancy count. Consolidating into the least-occupied stride is presumably meant to free up the source strides, so that more completely empty strides are available for tracks demoted from the DRAM cache.
16. The system of claim 15 , wherein the operations further comprise: selecting the at least two strides in the second cache having lowest occupancy counts of the strides in the second cache other than the target stride to be the source strides.
Building on claim 15, where the target stride is the stride with the lowest occupancy count, the source strides (the strides that contribute valid tracks to the target stride) are the at least two strides with the next-lowest occupancy counts. The target stride itself is excluded when the source strides are selected.
17. The system of claim 13 , wherein the operations further comprise: determining one of the tracks in one of the strides in the second cache to demote from the second cache; demoting the determined track to demote from the second cache; invalidating the determined track to demote in the second cache; and decrementing the occupancy count of the stride including the determined track to demote.
In the system of claim 13, individual tracks can also be demoted from the SSD cache. The system picks a track in one of the strides, demotes it, marks it invalid in the SSD cache, and decrements the occupancy count of the stride that contained it. The claim does not specify how the track is chosen.
18. The system of claim 13 , wherein the operations further comprise: determining whether a number of strides having an occupancy count indicating the stride is empty is below a predetermined free stride threshold; and wherein the determining of the target stride, the determining of the at least two source strides, and the populating are performed in response to determining that the number of strides having an occupancy count indicating the stride is empty is below the predetermined free threshold.
In the system of claim 13, the system monitors the number of empty strides (strides with no valid tracks) in the SSD cache. Consolidation (determining the target stride, determining the source strides, and populating the target stride) is triggered when that number falls below a predetermined free stride threshold. This is intended to keep enough empty strides available for tracks being demoted from the DRAM cache.
19. The system of claim 15 , wherein selecting the at least two strides in the second cache to be the source strides further comprises: selecting a number of source strides having the lowest occupancy counts of the strides in the second cache other than the target stride until a number of the valid tracks in the selected source strides is at least equal to a number of tracks in the target stride.
In the system of claim 15, the system keeps selecting source strides in order of lowest occupancy count (excluding the target stride) until the total number of valid tracks in the selected source strides is at least equal to the number of tracks in the target stride. This ensures that enough valid tracks are available to fill the target stride completely.
20. The system of claim 15 , wherein the lowest occupancy count is determined as one of a lowest number of valid tracks and a lowest percentage of valid tracks.
In the system of claim 15, the "lowest occupancy count" used to select the target stride can be measured either as the lowest absolute number of valid tracks in a stride or as the lowest percentage of the stride's tracks that are valid. The claim covers both measures and does not say when one is preferred over the other.
21. The system of claim 13 , wherein the first cache is a faster access device than the second cache and wherein the second cache is a faster access device than the storage devices.
In the system of claim 13, the DRAM cache is faster to access than the SSD cache, and the SSD cache is faster to access than the devices of the underlying storage system. This forms a tiered hierarchy in which each level is slower than the one above it.
22. The system of claim 13 , wherein the storage system is comprised of a plurality of slower access devices than the solid state storage devices.
In the system of claim 13, the storage system consists of multiple storage devices that are slower to access than the solid state storage devices of the SSD cache.
23. The system of claim 13 , wherein the operations further comprise: receiving a write to a track in the first cache; determining whether the track receiving the write is included in the second cache; invalidating the track in the second cache updated in the first cache in response to determining that the track written to in the first cache is included in the second cache; and decrementing the occupancy count for the stride in the second cache including the invalidated track in the second cache.
In the system of claim 13, when a write is received for a track in the DRAM cache, the system checks whether that track is also held in the SSD cache. If it is, the copy in the SSD cache is now stale, so it is invalidated and the occupancy count of the stride that contained it is decremented. This keeps the SSD cache from serving an out-of-date copy of the track.
24. The system of claim 13 , wherein forming the first stride of tracks comprising forming a stride for a Redundant Array of Independent Disk (RAID) configuration based on a RAID configuration defined for the second cache as having n devices including m devices for storing tracks of data and at least one parity device to store parity data calculated from the tracks of data for the m devices.
In the system of claim 13, the first stride of tracks demoted from the DRAM cache is formed to match the RAID (Redundant Array of Independent Disks) configuration defined for the SSD cache. Of the n devices, m store tracks of data and at least one stores parity data calculated from the tracks on the m data devices. The parity provides redundancy for the strides held in the SSD cache.
September 2, 2014