US-11513912

Application discovery using access pattern history

Published: November 29, 2022

Assignee: not available in the USPTO data we have

Inventors: not available in the USPTO data we have

Technical Abstract

Application discovery from access patterns is disclosed. Access histories from multiple servers are collected and stored at a warehouse, which may be part of a data protection system. A time series analysis is performed on the access history to identify consistency groups and applications from the perspective of devices and storage arrays. Data protection operations such as backup operations can then be performed on the basis of devices or storage in storage arrays or other arrangements that pertain to specific consistency groups or to specific applications.

Patent Claims

18 claims

Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.

Claim 2

Original Legal Text

2. The method of claim 1, further comprising generating, by each server, the local access history, wherein at least one application is a distributed application that executes on more than one of the servers.

Plain English Translation

This invention relates to distributed computing systems where multiple servers interact with applications, including distributed applications that run across multiple servers. The problem addressed is efficiently managing and accessing application data in such environments, particularly when applications are distributed across multiple servers. The method involves generating a local access history for each server, which records how applications interact with the server. This history includes details such as which applications accessed the server, the frequency of access, and the type of operations performed. The key innovation is that at least one of the applications is a distributed application, meaning it executes across multiple servers rather than being confined to a single server. This distributed nature complicates access tracking, as the application's operations span multiple servers, requiring a coordinated approach to maintain accurate access histories. The local access history helps optimize resource allocation, improve security by monitoring access patterns, and enhance performance by identifying bottlenecks or inefficiencies in distributed application execution. By tracking access at each server, the system can better manage distributed workloads and ensure consistent behavior across the network. The method ensures that even in complex, multi-server environments, access patterns are accurately recorded and utilized for system improvements.

Claim 3

Original Legal Text

3. The method of claim 2, further comprising transmitting a local access history, by each of the servers, to a corresponding storage array, and aggregating the local access histories at the corresponding storage array, wherein each storage array is associated with an aggregated local history.

Plain English Translation

This invention relates to distributed storage systems where multiple servers access shared storage arrays. The problem addressed is efficiently tracking and aggregating access histories across distributed servers to optimize storage performance and management. Each server maintains a local access history of its interactions with a storage array, including read/write operations, timestamps, and data locations. These local histories are periodically transmitted to their corresponding storage arrays. The storage arrays then aggregate these local histories, creating a comprehensive access history for each array. This aggregated data enables better decision-making for caching, load balancing, and predictive maintenance. The system ensures that access patterns are centrally recorded without requiring continuous synchronization, reducing overhead while maintaining accurate historical records. The aggregated histories can be used to identify frequently accessed data, optimize storage tiering, and detect anomalies in access behavior. This approach improves storage efficiency and reliability in distributed environments by providing a unified view of access patterns across multiple servers.
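The aggregation step in claim 3 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `AccessRecord` fields, the `aggregate_by_array` helper, and the server, array, and device names are all hypothetical, and the sketch assumes each record already names the storage array that served the IO.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRecord:
    """One IO observed by a server (hypothetical record layout)."""
    timestamp: float  # seconds since an arbitrary epoch
    server: str       # server that issued the IO
    array: str        # storage array that served the IO
    device: str       # storage device within the array
    lba: int          # logical block address
    op: str           # "R" for read, "W" for write


def aggregate_by_array(local_histories):
    """Merge per-server histories into one time-ordered history per array."""
    aggregated = defaultdict(list)
    for history in local_histories:
        for record in history:
            aggregated[record.array].append(record)
    for records in aggregated.values():
        records.sort(key=lambda r: r.timestamp)
    return dict(aggregated)


# Two servers, each with its own local access history.
server_a = [
    AccessRecord(1.0, "srv-a", "array-1", "dev-0", 100, "R"),
    AccessRecord(3.0, "srv-a", "array-2", "dev-7", 40, "W"),
]
server_b = [AccessRecord(2.0, "srv-b", "array-1", "dev-1", 500, "W")]

per_array = aggregate_by_array([server_a, server_b])
```

Here `per_array["array-1"]` holds the records from both servers in timestamp order, which is one plausible shape for the "aggregated local history" the claim associates with each storage array.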

Claim 4

Original Legal Text

4. The method of claim 3, further comprising sending the aggregated local history of each storage array to a data warehouse associated with a data protection system, wherein the aggregated access history at the data warehouse includes the aggregated local histories from each of the storage arrays.

Plain English Translation

This invention relates to data storage systems and methods for aggregating and analyzing access history data across multiple storage arrays. The problem addressed is the lack of centralized visibility into data access patterns across distributed storage systems, which complicates data protection, compliance, and performance optimization. The method involves collecting local access history data from each storage array in a distributed storage environment. This local history includes records of read and write operations, timestamps, and user or application identifiers. The collected data is then aggregated at a central location, such as a data warehouse associated with a data protection system. The aggregated data combines the local histories from all storage arrays, providing a comprehensive view of access patterns across the entire storage infrastructure. This centralized aggregation enables advanced analytics, such as identifying frequently accessed data, detecting anomalies, or enforcing access policies. The aggregated data can also support data protection functions like backup, replication, and retention policies by correlating access patterns with protection requirements. The solution improves operational efficiency by reducing the need for manual data collection and analysis across disparate systems.

Claim 5

Original Legal Text

5. The method of claim 1, further comprising performing a time series analysis on the access history.

Plain English Translation

This claim adds a time series analysis to the method of claim 1. The access history, which per the abstract is collected from multiple servers and stored at a warehouse, is treated as a sequence of access events and analyzed over time. According to the abstract, the purpose of this analysis is to identify consistency groups and applications from the perspective of devices and storage arrays, so that data protection operations such as backups can be performed per consistency group or per application. The claim itself does not specify which time series technique is used.
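One simple way to prepare an access history for time series analysis is to bucket IO events into fixed time windows per device. The sketch below assumes events arrive as `(timestamp, device)` pairs and uses a hypothetical `io_counts_per_window` helper; the claim does not prescribe this or any other technique.

```python
from collections import Counter


def io_counts_per_window(events, window_seconds):
    """Count IOs per device in fixed-size time windows.

    events: iterable of (timestamp_in_seconds, device_name) pairs.
    Returns {device_name: Counter({window_index: io_count})}.
    """
    series = {}
    for timestamp, device in events:
        window = int(timestamp // window_seconds)
        series.setdefault(device, Counter())[window] += 1
    return series


events = [(0.5, "dev-0"), (1.2, "dev-0"), (1.4, "dev-1"), (12.0, "dev-0")]
series = io_counts_per_window(events, window_seconds=10)
```

With a 10-second window, `dev-0` has two IOs in window 0 and one in window 1. The resulting per-device series can then be compared with one another, which is the kind of input a grouping step would need.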

Claim 6

Original Legal Text

6. The method of claim 5, further comprising identifying patterns across storage arrays to identify the consistency groups, wherein each of the consistency groups associates one or more of the storage devices with a server, and to identify the applications, wherein each of the applications is associated with one or more of the storage devices.

Plain English Translation

This invention relates to data storage management in distributed systems, specifically addressing the challenge of organizing and tracking relationships between storage devices, servers, and applications to ensure data consistency and efficient resource allocation. The method involves analyzing storage arrays to detect patterns that define consistency groups, where each group links one or more storage devices to a specific server. Additionally, the method identifies applications and their associations with storage devices, enabling better coordination between computational workloads and storage resources. By mapping these relationships, the system can optimize data consistency, improve performance, and simplify management in complex storage environments. The approach helps prevent data corruption and ensures that applications access the correct storage devices, particularly in multi-server or virtualized environments where dependencies between components are dynamic. The solution automates the discovery of these relationships, reducing manual configuration and enhancing scalability. This method is particularly useful in enterprise storage systems where maintaining consistency across distributed storage arrays is critical for reliability and performance.
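As a rough illustration of how patterns across devices might suggest consistency groups, the sketch below groups devices whose per-window IO counts are strongly correlated. Pearson correlation, the 0.9 threshold, and the `group_devices` helper are assumptions chosen for illustration; the claim does not say how patterns are identified.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def group_devices(activity, threshold=0.9):
    """Group devices whose activity series correlate at or above the threshold.

    activity: {device: [io_count_per_window, ...]} with equal-length lists.
    Returns a sorted list of sorted device-name lists (candidate groups).
    """
    parent = {device: device for device in activity}

    def find(device):
        # Union-find lookup with path halving.
        while parent[device] != device:
            parent[device] = parent[parent[device]]
            device = parent[device]
        return device

    for a, b in combinations(activity, 2):
        if pearson(activity[a], activity[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for device in activity:
        groups.setdefault(find(device), set()).add(device)
    return sorted(sorted(group) for group in groups.values())


activity = {
    "dev-0": [10, 0, 12, 1, 11],
    "dev-1": [9, 1, 13, 0, 10],
    "dev-2": [0, 8, 1, 9, 0],
}
candidate_groups = group_devices(activity)
```

In this toy data `dev-0` and `dev-1` are busy in the same windows and end up in one candidate group, while `dev-2` is busy in the opposite windows and stays alone.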

Claim 7

Original Legal Text

7. The method of claim 1, further comprising tuning the analysis with a tuning factor.

Plain English Translation

This claim adds a tuning step to the method of claim 1: the analysis of the access history is adjusted using a tuning factor. The claim does not define the tuning factor. It can be read as a numerical value, a parameter, or a set of rules that influences how the analysis behaves, for example by adjusting sensitivity thresholds, weighting factors, or decision criteria. Tuning allows the analysis to adapt to different data characteristics or user preferences, improving the accuracy and relevance of its output. The tuning factor may also be adjusted over time based on feedback or external inputs.
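To make the idea of a tuning factor concrete, the sketch below treats it as a minimum-overlap threshold that decides whether two devices count as active together. This interpretation, the Jaccard overlap measure, and the `co_active` helper are all assumptions; the claim does not define what the tuning factor is.

```python
def active_windows(timestamps, window_seconds):
    """Return the set of window indices in which at least one IO occurred."""
    return {int(ts // window_seconds) for ts in timestamps}


def co_active(ts_a, ts_b, window_seconds, min_overlap):
    """True if the Jaccard overlap of active windows is at least min_overlap.

    min_overlap plays the role of a tuning factor here: raising it makes
    the analysis stricter, lowering it makes it more permissive.
    """
    a = active_windows(ts_a, window_seconds)
    b = active_windows(ts_b, window_seconds)
    if not a or not b:
        return False
    return len(a & b) / len(a | b) >= min_overlap


dev0 = [1, 11, 21, 31]  # IO timestamps in seconds
dev1 = [2, 12, 22, 45]

strict = co_active(dev0, dev1, window_seconds=10, min_overlap=0.9)
loose = co_active(dev0, dev1, window_seconds=10, min_overlap=0.5)
```

The two devices share three of five active windows (overlap 0.6), so they are treated as related under the looser setting but not under the stricter one.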

Claim 8

Original Legal Text

8. The method of claim 1, further comprising performing a backup operation based on the consistency groups or the applications.

Plain English Translation

A system and method for managing data consistency in storage environments involves organizing data into consistency groups, where each group ensures that data remains in a synchronized state across multiple storage devices. This addresses the problem of maintaining data integrity during operations such as backups, replication, or failover, where inconsistencies can arise due to asynchronous updates or failures. The method includes identifying applications or data sets that require consistent treatment, grouping them into consistency groups, and enforcing synchronization rules to ensure that all data within a group is updated or backed up as a single logical unit. This prevents partial or corrupted data states during critical operations. Additionally, the method performs backup operations based on these consistency groups or the associated applications, ensuring that backups are taken in a consistent state. The system may also monitor the status of consistency groups, detect failures, and trigger corrective actions to maintain data integrity. This approach is particularly useful in enterprise environments where multiple applications share storage resources, and consistency is critical for business continuity.
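Once consistency groups are known, a backup can be planned per group rather than per device. The sketch below is a hypothetical planner: `plan_backups`, the group names, and the job layout are illustrative assumptions, not something described in the claim.

```python
def plan_backups(consistency_groups):
    """Return one backup job per consistency group, covering all of its devices.

    consistency_groups: {group_name: iterable_of_device_names}.
    """
    jobs = []
    for name in sorted(consistency_groups):
        jobs.append({"group": name, "devices": sorted(consistency_groups[name])})
    return jobs


groups = {"cg-db": {"dev-1", "dev-0"}, "cg-web": {"dev-2"}}
jobs = plan_backups(groups)
```

Each job lists every device in its group, so a backup tool consuming these jobs could capture all devices of a group at the same point in time.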

Claim 9

Original Legal Text

9. The method of claim 1, further comprising identifying fingerprints within the access history, wherein the fingerprints include a delta logical block address and a delta operation type.

Plain English Translation

A method for analyzing access patterns in a storage system involves identifying unique fingerprints within an access history to detect specific data access behaviors. The fingerprints are defined by a delta logical block address (LBA), representing the difference in storage addresses between consecutive access operations, and a delta operation type, indicating the change in operation type (e.g., read, write) between consecutive operations. This technique helps distinguish between different access patterns, such as sequential reads, random writes, or repeated access to specific blocks. By extracting these fingerprints, the system can classify access behaviors, optimize storage performance, or detect anomalies. The method may also involve preprocessing the access history to filter or normalize data before fingerprint extraction. The fingerprints can be used to improve caching strategies, predict future access patterns, or enhance security by identifying suspicious activity. This approach is particularly useful in storage systems where understanding access patterns is critical for performance tuning or threat detection.
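The fingerprint described above can be sketched as a count of (delta LBA, delta operation type) pairs over consecutive operations. The input layout, the `"R->W"` string encoding of the operation-type change, and the `fingerprints` helper are assumptions made for illustration.

```python
from collections import Counter


def fingerprints(history):
    """Count (delta_lba, delta_op) pairs between consecutive operations.

    history: time-ordered list of (lba, op) pairs, with op "R" or "W".
    delta_op is encoded as "<previous op>-><current op>", e.g. "R->W".
    """
    counts = Counter()
    for (prev_lba, prev_op), (lba, op) in zip(history, history[1:]):
        counts[(lba - prev_lba, prev_op + "->" + op)] += 1
    return counts


history = [(100, "R"), (108, "R"), (116, "R"), (116, "W"), (124, "R")]
fp = fingerprints(history)
```

In this example the pair `(8, "R->R")` occurs twice, reflecting a sequential read with a stride of 8 blocks, while the read-to-write transition at the same address shows up as `(0, "R->W")`.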

Claim 10

Original Legal Text

10. The method of claim 9, further comprising creating a ranked list of a cumulative frequency distribution, which corresponds to a total percentage of IO per delta operation type.

Plain English Translation

This claim builds on claim 9, in which fingerprints in the access history include a delta logical block address and a delta operation type, that is, the change in operation type (for example, read to write) between consecutive operations. The method of claim 10 creates a ranked list of a cumulative frequency distribution, which corresponds to the total percentage of input/output (IO) per delta operation type. In practice, this means counting how much of the observed IO falls under each delta operation type, ordering the types by their share, and accumulating those shares down the list. The ranked list shows which delta operation types account for most of the IO, which helps characterize the access pattern. The list may also be visualized or passed to other tools for further analysis.
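A ranked cumulative frequency distribution over delta operation types can be sketched as below. The `"R->W"` labels and the `ranked_cumulative` helper are illustrative assumptions; the claim only states that the ranked list corresponds to a total percentage of IO per delta operation type.

```python
from collections import Counter


def ranked_cumulative(delta_ops):
    """Rank delta operation types by share of IO, with a running total.

    delta_ops: iterable of delta operation type labels, one per IO transition.
    Returns a list of (label, percent_of_io, cumulative_percent) tuples,
    ordered from most to least frequent.
    """
    counts = Counter(delta_ops)
    total = sum(counts.values())
    ranked, running = [], 0
    for label, n in counts.most_common():
        running += n
        ranked.append((label, 100.0 * n / total, 100.0 * running / total))
    return ranked


delta_ops = ["R->R"] * 6 + ["R->W"] * 3 + ["W->R"] * 1
table = ranked_cumulative(delta_ops)
```

For this input the table ranks `R->R` first at 60% of IO, then `R->W` at 30% (90% cumulative), then `W->R` at 10% (100% cumulative).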

Claim 12

Original Legal Text

12. The non-transitory storage medium of claim 11, the operations further comprising generating, by each server, the local access history, wherein at least one application is a distributed application that executes on more than one of the servers.

Plain English Translation

This invention relates to distributed computing systems where multiple servers interact with shared resources, such as databases or files, and need to track access history for security, auditing, or performance optimization. The challenge is efficiently generating and maintaining accurate access histories across distributed applications that run on multiple servers, ensuring consistency and minimizing overhead. The invention involves a non-transitory storage medium storing executable instructions that, when executed by a server in a distributed system, perform operations to generate and manage local access histories. Each server independently records its own access history, which includes details of how applications interact with shared resources. The system supports distributed applications that execute across multiple servers, ensuring that access history is accurately captured even when an application spans multiple nodes. The operations may include tracking which applications accessed which resources, the timing of those accesses, and any modifications made. The local access histories can later be aggregated or analyzed to enforce security policies, detect anomalies, or optimize resource usage. The solution ensures that access history is maintained efficiently without requiring centralized coordination, reducing latency and improving scalability in distributed environments.

Claim 13

Original Legal Text

13. The non-transitory storage medium of claim 12, the operations further comprising transmitting a local access history, by each of the servers, to a corresponding storage array, and aggregating the local access histories at the corresponding storage array, wherein each storage array is associated with an aggregated local history.

Plain English Translation

This invention relates to distributed storage systems, specifically improving data access efficiency by tracking and aggregating access histories across multiple servers and storage arrays. The problem addressed is the lack of centralized access history data in distributed systems, which can lead to inefficient data placement, caching, and load balancing. The system includes multiple servers and storage arrays, where each server maintains a local access history of data requests it processes. These local histories are periodically transmitted to their corresponding storage arrays. Each storage array then aggregates the local access histories from its associated servers, creating a comprehensive aggregated local history for that array. This aggregated data helps optimize data distribution, caching strategies, and load balancing decisions across the system. The aggregated histories can be used to predict access patterns, pre-fetch frequently accessed data, or dynamically adjust storage allocation to reduce latency and improve performance. The invention enhances distributed storage systems by providing a mechanism to collect and centralize access history data, enabling more informed decision-making for data management and resource allocation. This approach improves overall system efficiency and responsiveness in handling data access requests.

Claim 14

Original Legal Text

14. The non-transitory storage medium of claim 13, the operations further comprising sending the aggregated local history of each storage array to a data warehouse associated with a data protection system, wherein the aggregated access history at the data warehouse includes the aggregated local histories from each of the storage arrays.

Plain English Translation

This invention relates to data storage systems and the management of access history data across multiple storage arrays. The problem addressed is the need to efficiently collect, aggregate, and analyze access history data from distributed storage arrays to support data protection and management functions. Traditional systems often struggle with centralized tracking of access patterns, leading to inefficiencies in data protection, performance optimization, and compliance monitoring. The invention involves a non-transitory storage medium containing instructions for a storage array to perform operations that include aggregating local access history data from the array. This local history includes records of data access operations, such as read and write activities, performed on the storage array. The aggregated local history is then sent to a centralized data warehouse associated with a data protection system. The data warehouse collects and consolidates the aggregated local histories from multiple storage arrays, creating a comprehensive access history dataset. This centralized dataset enables advanced analytics, such as identifying usage trends, detecting anomalies, and optimizing data protection strategies. The system

Claim 15

Original Legal Text

15. The non-transitory storage medium of claim 11, the operations further comprising performing a time series analysis on the access history.

Plain English Translation

A system and method for analyzing access history data involves storing access records in a non-transitory storage medium, where each record includes identifiers for a user, a resource, and a timestamp. The system processes these records to generate an access history, which is then used to perform a time series analysis. The time series analysis identifies patterns, trends, or anomalies in the access history, such as frequency of access, temporal distribution, or deviations from expected behavior. This analysis can be applied to various domains, including cybersecurity, user behavior monitoring, or resource utilization tracking. The system may also include additional operations such as filtering the access history based on specific criteria, such as time ranges or user roles, to refine the analysis. The time series analysis may employ statistical methods, machine learning models, or other analytical techniques to extract meaningful insights from the access data. The results of the analysis can be used to detect unauthorized access attempts, optimize resource allocation, or improve system performance. The system ensures that the access history is stored securely and can be retrieved efficiently for analysis.

Claim 16

Original Legal Text

16. The non-transitory storage medium of claim 15, the operations further comprising identifying patterns across storage arrays to identify the consistency groups, wherein each of the consistency groups associates one or more of the storage devices with a server, and to identify the applications, wherein each of the applications is associated with one or more of the storage devices.

Plain English Translation

The invention relates to data storage management in distributed computing environments, specifically addressing the challenge of tracking and managing relationships between storage devices, servers, and applications to ensure data consistency and efficient resource allocation. The system analyzes storage arrays to detect patterns that define consistency groups, where each group links one or more storage devices to a specific server. Additionally, the system identifies applications and their associations with storage devices, enabling the mapping of how applications utilize storage resources across servers. This approach helps maintain data integrity by ensuring that related storage operations are grouped and managed together, reducing the risk of inconsistencies during data access or replication. The solution automates the discovery of these relationships, improving operational efficiency and simplifying storage administration in complex environments. By correlating storage devices with servers and applications, the system provides a clearer understanding of data dependencies, supporting better resource allocation and troubleshooting. The invention is particularly useful in large-scale data centers or cloud environments where tracking storage relationships manually is impractical.

Claim 17

Original Legal Text

17. The non-transitory storage medium of claim 11, the operations further comprising tuning the analysis with a tuning factor.

Plain English Translation

A system and method for analyzing data involves processing input data to generate an output, where the analysis is adjusted using a tuning factor. The system includes a data processing module that receives input data and applies one or more analysis operations to produce an output. The analysis operations may include filtering, transformation, or other data manipulation techniques. The tuning factor is used to modify the behavior of the analysis operations, allowing for adjustments to the output based on specific requirements or conditions. For example, the tuning factor may adjust the sensitivity of a filter, the weight of a transformation, or the parameters of a machine learning model. The system may also include a feedback mechanism to dynamically update the tuning factor based on the output or external inputs. This approach enables flexible and adaptive data analysis, improving accuracy and relevance in applications such as signal processing, data mining, or predictive modeling. The tuning factor can be manually set or automatically optimized to enhance performance.

Claim 18

Original Legal Text

18. The non-transitory storage medium of claim 11, the operations further comprising performing a backup operation based on the consistency groups or the applications.

Plain English Translation

A system and method for managing data consistency in storage environments involves organizing data into consistency groups or application-based groups to ensure data integrity during backup operations. The technology addresses the challenge of maintaining consistent data states across multiple storage volumes or applications, which is critical for reliable backups and disaster recovery. By grouping related data sets or applications, the system ensures that all relevant data is captured in a synchronized manner, preventing inconsistencies that could arise from separate backup operations. The backup operation is performed based on these predefined consistency groups or application groupings, allowing for efficient and reliable data protection. This approach simplifies backup management, reduces the risk of data corruption, and ensures that backups accurately reflect the state of the system at a specific point in time. The solution is particularly useful in environments where multiple applications or storage volumes must be backed up together to maintain data integrity. The system may also include features for monitoring and verifying the consistency of the groups before initiating backups, further enhancing reliability.

Claim 19

Original Legal Text

19. The non-transitory storage medium of claim 11, the operations further comprising identifying fingerprints within the access history, wherein the fingerprints include a delta logical block address and a delta operation type.

Plain English Translation

A system for analyzing storage access patterns identifies and processes fingerprints within access history data to detect and classify specific storage operations. The system captures access history data from a storage device, which includes records of read and write operations along with their associated logical block addresses (LBAs). The system then processes this access history to identify fingerprints, which are defined by a delta LBA and a delta operation type. The delta LBA represents the difference in LBA between consecutive operations, while the delta operation type indicates the change in operation type (e.g., from read to write or vice versa). By analyzing these fingerprints, the system can detect patterns indicative of specific storage behaviors, such as sequential access, random access, or repeated operations. This analysis helps optimize storage performance, predict wear levels, or identify potential security threats by recognizing anomalous access patterns. The system may also correlate these fingerprints with other metadata, such as timestamps or user identifiers, to provide a comprehensive view of storage activity. The fingerprint-based approach allows for efficient pattern recognition without requiring extensive preprocessing or complex heuristics.

Claim 20

Original Legal Text

20. The non-transitory storage medium of claim 11, the operations further comprising creating a ranked list of a cumulative frequency distribution, which corresponds to a total percentage of IO per delta operation type.

Plain English Translation

A system and method for analyzing input/output (IO) operations in a storage environment. The technology addresses the challenge of efficiently monitoring and optimizing IO performance by categorizing and quantifying different types of IO operations. The invention involves tracking IO operations, classifying them into distinct operation types, and calculating a cumulative frequency distribution for each type. This distribution represents the total percentage of IO operations corresponding to each delta operation type, allowing for performance analysis and optimization. The ranked list of cumulative frequency distributions enables identification of the most frequent or impactful IO operation types, facilitating targeted improvements in storage system efficiency. The system may also include visualizing the ranked list to provide actionable insights for administrators. By quantifying IO operation types and their contributions to overall system performance, the invention helps optimize storage resource allocation and reduce bottlenecks. The method ensures accurate tracking and analysis of IO operations, supporting data-driven decision-making for storage system management.

Classification Codes (CPC)

Cooperative Patent Classification codes for this invention.

G06F

Patent Metadata

Filing Date

March 20, 2020

Publication Date

November 29, 2022
