An improved technique involves generating a predictive model for data storage system configuration management. A customer support center generates such a predictive model from detailed customer configuration and transaction history. For example, a population of customers submits transaction logs to the customer support center; such transaction logs provide details as to how the customers responded to various events. The population of customers may also submit data including various statistics such as load intensity, workload characteristics, data access patterns, data change patterns, and data fingerprints to the customer support center. The customer support center then performs an analysis on the data and, from the analysis, computes values of model parameters that define a predictive model. This predictive model is configured to take in a particular state of any data storage system and produce a configuration that optimizes performance of that data storage system.
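As a rough illustration of the idea in the abstract, the sketch below maps a storage system state to a recommended configuration using previously submitted customer data. It is not the method of the patent: the nearest-neighbor lookup is a simple stand-in for whatever analysis the customer support center performs, and the `Observation` record, the three state features, the configuration labels, and the performance scores are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from math import dist
from typing import Tuple


@dataclass(frozen=True)
class Observation:
    # Hypothetical state vector: (load intensity, read fraction, daily change rate)
    state: Tuple[float, ...]
    # Hypothetical label for the configuration the customer applied
    config: str
    # Observed outcome after applying the configuration; higher is better
    performance: float


def recommend(history, state, k=3):
    """Return the configuration with the best mean performance
    among the k recorded states closest to the queried state."""
    if not history:
        raise ValueError("history is empty")
    nearest = sorted(history, key=lambda obs: dist(obs.state, state))[:k]
    scores = defaultdict(list)
    for obs in nearest:
        scores[obs.config].append(obs.performance)
    return max(scores, key=lambda c: sum(scores[c]) / len(scores[c]))


# Made-up history standing in for data submitted by a population of customers.
history = [
    Observation((0.90, 0.20, 0.50), "raid10-large-cache", 0.92),
    Observation((0.80, 0.30, 0.40), "raid10-large-cache", 0.88),
    Observation((0.85, 0.25, 0.45), "raid5-small-cache", 0.55),
    Observation((0.10, 0.90, 0.05), "raid5-small-cache", 0.90),
    Observation((0.20, 0.80, 0.10), "raid5-small-cache", 0.85),
]

print(recommend(history, (0.88, 0.22, 0.50)))
```

Here the "model parameters" are simply the stored observations. A real system would presumably fit a more compact model, but the query flow is the same: a state goes in, a configuration comes out.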
Legal claims defining the scope of protection, as filed with the USPTO.
1. A method of generating a configuration advisory tool constructed and arranged to provide optimized configurations for data storage systems located at remote sites on a network in response to configuration queries from the data storage systems, the method comprising: receiving current storage system data from a particular data storage system located at a particular remote site on the network; storing the current storage system data in a database that stores previous storage system data that had been received from previous data storage systems located at the remote sites on the network prior to receiving the current storage system data; and generating, on a host computer, a predictive model configured to output particular values of configuration management parameters to the remote site on the network in response to the host computer receiving values of input parameters that are indicative of a configuration query, the predictive model including model parameters based on the current storage system data and the previous storage system data, the particular values of the configuration parameters being indicative of an optimal configuration of the data storage system located at the remote site on the network.
2. A method as in claim 1 , wherein the database includes an unstructured database configured to store unstructured data; wherein the current storage system data includes current administrator log data indicative of actions performed by an administrator of the particular data storage system in reaction to changing conditions in the particular data storage system; wherein the previous customer storage data including previous administrator log data indicative of actions performed by administrators of the previous data storage systems in reaction to changing conditions in the previous data storage systems; and wherein storing the current storage system data includes: storing the current administrator log data in the unstructured database, the unstructured database also storing the previous administrator log data.
3. A method as in claim 2 , wherein the database further includes a structured database configured to store structured data; wherein generating the predictive model includes: extracting a sequence of entries from the unstructured database, each entry of the sequence of entries including an administrator action, identifying triads of entries of the sequence of entries indicative of an event, a reaction to the event, and an outcome of the reaction, generating, from a machine learning algorithm, a model from the triads, the model seeking to automatically duplicate the reaction to the event, forming the model parameters based on the triads, and storing the model parameters in the structured database.
4. A method as in claim 2 , wherein the current storage system data further includes a particular product number of a set of product numbers, each product number of the set of product numbers being indicative of a product type of the particular data storage system; wherein generating the predictive model includes: forming the model parameters based on the class identifier of the particular class of customers, the predictive model being enabled to output the particular values of the configuration management parameters for any product number of the set of product numbers.
5. A method as in claim 2 , wherein the particular data storage system belongs to a particular customer; wherein the particular customer is a member of a particular class of customers, the particular class of customers belonging to a set of classes of customers, each class of customers of the set of classes of customers having a class identifier; wherein the current storage system data further includes the class identifier of the particular class of customers; wherein generating the predictive model includes: forming the model parameters based on the class identifier of the particular class of customers.
6. A method as in claim 5 , wherein the current storage system data includes a set of particular identifiers indicating an identity of the particular customer; wherein receiving the current storage system data includes: prior to storing the current storage system data, encrypting each particular identifier of the set of particular identifiers.
7. A method as in claim 1 , wherein the current storage system data includes a first backup dataset and a second backup dataset, the first backup dataset including a first timestamp and the second backup dataset including a second timestamp; wherein receiving the current storage system data includes: performing a differencing operation on the first backup dataset and the second backup dataset to produce a differenced dataset; wherein storing the current storage system data in the database includes: writing the differenced dataset to the database; and wherein the model parameters are further based on the differenced dataset.
8. A system constructed and arranged to generate a configuration advisory tool constructed and arranged to provide optimized configurations for data storage systems located at remote sites on a network in response to configuration queries from the data storage systems, the system comprising: a network interface; memory; and a controller including controlling circuitry coupled to the memory, the controlling circuitry being constructed and arranged to: receive current storage system data from a particular data storage system located at a particular remote site on the network; store the current storage system data in a database that stores previous storage system data that had been received from previous data storage systems located at the remote sites on the network prior to receiving the current storage system data; and generate, on a host computer, a predictive model configured to output particular values of configuration management parameters to the remote site on the network in response to the host computer receiving values of input parameters that are indicative of a configuration query, the predictive model including model parameters based on the current storage system data and the previous storage system data, the particular values of the configuration parameters being indicative of an optimal configuration of the data storage system located at the remote site on the network.
9. A system as in claim 8 , wherein the database includes an unstructured database configured to store unstructured data; wherein the current storage system data includes current administrator log data indicative of actions performed by an administrator of the particular data storage system in reaction to changing conditions in the particular data storage system; wherein the previous customer storage data including previous administrator log data indicative of actions performed by administrators of the previous data storage systems in reaction to changing conditions in the previous data storage systems; wherein storing the current storage system data includes: storing the current administrator log data in the unstructured database, the unstructured database also storing the previous administrator log data.
10. A system as in claim 9 , wherein the database further includes a structured database configured to store structured data; wherein generating the predictive model includes: extracting a sequence of entries from the unstructured database, each entry of the sequence of entries including an administrator action, identifying triads of entries of the sequence of entries indicative of an event, a reaction to the event, and an outcome of the reaction, generating, from a machine learning algorithm, a model from the triads, the model seeking to automatically duplicate the reaction to the event, forming the model parameters based on the triads, and storing the model parameters in the structured database.
11. A system as in claim 9 , wherein the current storage system data further includes a particular product number of a set of product numbers, each product number of the set of product numbers being indicative of a product type of the particular data storage system; wherein generating the predictive model includes: forming the model parameters based on the class identifier of the particular class of customers, the predictive model being enabled to output the particular values of the configuration management parameters for any product number of the set of product numbers.
12. A system as in claim 9 , wherein the particular data storage system belongs to a particular customer; wherein the particular customer is a member of a particular class of customers, the particular class of customers belonging to a set of classes of customers, each class of customers of the set of classes of customers having a class identifier; wherein the current storage system data further includes the class identifier of the particular class of customers; wherein generating the predictive model includes: forming the model parameters based on the class identifier of the particular class of customers.
13. A system as in claim 12 , wherein the current storage system data includes a set of particular identifiers indicating an identity of the particular customer; wherein receiving the current storage system data includes: prior to storing the current storage system data, encrypting each particular identifier of the set of particular identifiers.
14. A system as in claim 8 , wherein the current storage system data includes a first backup dataset and a second backup dataset, the first backup dataset including a first timestamp and the second backup dataset including a second timestamp; wherein receiving the current storage system data includes: performing a differencing operation on the first backup dataset and the second backup dataset to produce a differenced dataset; wherein storing the current storage system data in the database includes: writing the differenced dataset to the database; and wherein the model parameters are further based on the differenced dataset.
15. A computer program product having a non-transitory, computer-readable storage medium which stores code to generate a configuration advisory tool constructed and arranged to provide optimized configurations for data storage systems located at remote sites on a network in response to configuration queries from the data storage systems, the code including instructions to: receive current storage system data from a particular data storage system located at a particular remote site on the network; store the current storage system data in a database that stores previous storage system data that had been received from previous data storage systems located at the remote sites on the network prior to receiving the current storage system data; and generate, on a host computer, a predictive model configured to output particular values of configuration management parameters to the remote site on the network in response to the host computer receiving values of input parameters that are indicative of a configuration query, the predictive model including model parameters based on the current storage system data and the previous storage system data, the particular values of the configuration parameters being indicative of an optimal configuration of the data storage system located at the remote site on the network.
16. A computer program product as in claim 15 , wherein the database includes an unstructured database configured to store unstructured data; wherein the current storage system data includes current administrator log data indicative of actions performed by an administrator of the particular data storage system in reaction to changing conditions in the particular data storage system; wherein the previous customer storage data including previous administrator log data indicative of actions performed by administrators of the previous data storage systems in reaction to changing conditions in the previous data storage systems; wherein storing the current storage system data includes: storing the current administrator log data in the unstructured database, the unstructured database also storing the previous administrator log data.
17. A computer program product as in claim 16 , wherein the database further includes a structured database configured to store structured data; wherein generating the predictive model includes: extracting a sequence of entries from the unstructured database, each entry of the sequence of entries including an administrator action, identifying triads of entries of the sequence of entries indicative of an event, a reaction to the event, and an outcome of the reaction, generating, from a machine learning algorithm, a model from the triads, the model seeking to automatically duplicate the reaction to the event, forming the model parameters based on the triads, and storing the model parameters in the structured database.
18. A computer program product as in claim 16 , wherein the current storage system data further includes a particular product number of a set of product numbers, each product number of the set of product numbers being indicative of a product type of the particular data storage system; wherein generating the predictive model includes: forming the model parameters based on the class identifier of the particular class of customers, the predictive model being enabled to output the particular values of the configuration management parameters for any product number of the set of product numbers.
19. A computer program product as in claim 16 , wherein the particular data storage system belongs to a particular customer; wherein the particular customer is a member of a particular class of customers, the particular class of customers belonging to a set of classes of customers, each class of customers of the set of classes of customers having a class identifier; wherein the current storage system data further includes the class identifier of the particular class of customers; wherein generating the predictive model includes: forming the model parameters based on the class identifier of the particular class of customers.
20. A computer program product as in claim 19 , wherein the current storage system data includes a set of particular identifiers indicating an identity of the particular customer; wherein receiving the current storage system data includes: prior to storing the current storage system data, encrypting each particular identifier of the set of particular identifiers.
21. A method as in claim 1 , wherein each of the previous data storage systems are located at a plurality of distinct remote sites; and wherein the method further comprises, prior to receiving the current storage system data from the particular data storage system, storing the previous storage system data from each of the previous data storage systems located at the plurality of distinct remote sites.
22. A method as in claim 1 , wherein storing the current storage system data in the database that stores the previous storage system data includes (i) storing previous administrator log data indicative of actions performed by administrators of the previous data storage systems in reaction to changing conditions in the previous data storage systems and (ii) current administrator log data indicative of actions performed by an administrator of the particular data storage system in reaction to changing conditions in the particular data storage system.
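The triad step recited in claims 3, 10, and 17 (an event, a reaction to the event, and an outcome of the reaction) can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the `Entry` record and its `kind` tags are hypothetical, triads are assumed to appear as three consecutive log entries, and a simple frequency count stands in for the machine learning algorithm.

```python
from collections import Counter, defaultdict
from typing import NamedTuple


class Entry(NamedTuple):
    kind: str    # hypothetical tag: "event", "action", or "outcome"
    detail: str  # free-text description taken from the administrator log


def find_triads(entries):
    """Return (event, reaction, outcome) details for every run of three
    consecutive entries tagged event, action, outcome, in that order."""
    triads = []
    for first, second, third in zip(entries, entries[1:], entries[2:]):
        if (first.kind, second.kind, third.kind) == ("event", "action", "outcome"):
            triads.append((first.detail, second.detail, third.detail))
    return triads


def fit_reactions(triads, good_outcome="resolved"):
    """For each event, keep the reaction most often followed by a good outcome.
    The resulting dict plays the role of the model parameters in this sketch."""
    counts = defaultdict(Counter)
    for event, reaction, outcome in triads:
        if outcome == good_outcome:
            counts[event][reaction] += 1
    return {event: c.most_common(1)[0][0] for event, c in counts.items()}


# Made-up log entries standing in for rows extracted from the unstructured database.
log = [
    Entry("event", "latency_spike"),
    Entry("action", "enable_cache"),
    Entry("outcome", "resolved"),
    Entry("action", "rotate_logs"),
    Entry("event", "latency_spike"),
    Entry("action", "add_spindles"),
    Entry("outcome", "unresolved"),
    Entry("event", "pool_full"),
    Entry("action", "expand_pool"),
    Entry("outcome", "resolved"),
]

triads = find_triads(log)
model_parameters = fit_reactions(triads)
print(model_parameters)
```

In this sketch only reactions followed by a good outcome are retained, so the learned mapping reproduces the administrator reactions that worked. Storing `model_parameters` in a structured database, as the claims describe, is left out.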
June 29, 2012
December 30, 2014