Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.
1. A computer-implemented method for backing-up an eventually-consistent database in a production cluster, at least a portion of the method being performed by a computing device comprising at least one processor, the method comprising: forming, on a production node, a stable copy of production data; provisioning storage on a backup node based on an amount of data in the stable copy and a replication factor; transferring information from the stable copy to a backup copy on the backup node; performing record synthesis on the backup copy to merge record updates into complete backup records; identifying and discarding any stale records and any redundant records in the complete backup records; and transferring the complete backup records from the backup node to a cloud storage device.
This invention relates to computer systems and data management, specifically the reliable backup of an eventually-consistent database running in a production cluster. The method first creates a stable copy of the production data on a production node. Storage is then provisioned on a separate backup node, sized according to the amount of data in the stable copy and a replication factor. Information from the stable copy is transferred to a backup copy on the backup node. Record synthesis is then performed on the backup copy, merging record updates into complete backup records. Stale and redundant records among those complete records are identified and discarded. Finally, the complete backup records are transferred from the backup node to a cloud storage device.
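The record synthesis and cleanup steps can be illustrated with a minimal sketch. The claim does not say how updates are represented or merged, so everything here is an assumption: each update is modeled as a hypothetical `Update` tuple touching one column of one row, the newest timestamp wins per column, and a value of `None` stands in for a delete marker. The function name `synthesize_records` is likewise illustrative.

```python
from typing import Any, NamedTuple


class Update(NamedTuple):
    """Hypothetical shape of one record update (not taken from the patent)."""
    key: str        # row identifier
    column: str     # column touched by this update
    value: Any      # new value; None is treated here as a delete marker
    timestamp: int  # write time; higher means newer


def synthesize_records(updates):
    """Merge partial updates into complete records.

    Stale updates (older writes to the same cell) and redundant updates
    (identical copies of a write already seen, e.g. from another replica)
    are dropped along the way.
    """
    latest = {}  # (key, column) -> newest Update seen so far
    for u in updates:
        slot = (u.key, u.column)
        current = latest.get(slot)
        # Keep the update only if it is strictly newer than what we hold.
        if current is None or u.timestamp > current.timestamp:
            latest[slot] = u

    records = {}
    for (key, column), u in latest.items():
        if u.value is not None:  # skip cells whose newest write is a delete
            records.setdefault(key, {})[column] = u.value
    return records
```

With this sketch, two writes to the same column collapse to the newer one, and a duplicate of the same write arriving from a second replica is ignored. A real eventually-consistent database would need its own conflict-resolution rules in place of the simple timestamp comparison.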
2. The computer-implemented method of claim 1, further comprising: identifying a topology of a production cluster of which the production node is a constituent part to identify the production node as requiring backup.
This claim adds a discovery step to the method of claim 1. The topology of the production cluster that contains the production node is identified, and that topology is used to identify the production node as one that requires backup. Deriving backup targets from the cluster's actual layout avoids manual selection of nodes, which is impractical in large distributed systems.
3. The computer-implemented method of claim 1, further comprising: determining the amount of data in the stable copy.
This claim adds a measurement step to the method of claim 1: determining the amount of data in the stable copy. Claim 1 provisions storage on the backup node based on this amount and a replication factor, so the measurement determines how much backup storage is allocated.
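Determining the amount of data in a stable copy could be as simple as summing file sizes. The sketch below assumes the stable copy is a directory of files on the production node. The claim does not say how the copy is stored, and the function name `stable_copy_size` is hypothetical.

```python
from pathlib import Path


def stable_copy_size(snapshot_dir):
    """Return the total size in bytes of all regular files under snapshot_dir.

    Assumes the stable copy is laid out as ordinary files in a directory tree.
    """
    return sum(
        p.stat().st_size
        for p in Path(snapshot_dir).rglob("*")
        if p.is_file()
    )
```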
4. The computer-implemented method of claim 1, further comprising: provisioning the backup node in a backup cluster based on the amount of data in the stable copy.
This claim adds a provisioning step to the method of claim 1: the backup node itself is provisioned in a backup cluster based on the amount of data in the stable copy. Sizing the node to the data helps avoid both over-provisioning and under-provisioning, which matters in large distributed systems where backup resources must scale with data growth.
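A sizing calculation of this kind might look like the sketch below. The claims say only that provisioning is "based on" the amount of data and, in claim 1, a replication factor. They do not say how the values are combined. This sketch assumes the required storage is the data amount multiplied by the replication factor, and that every backup node has the same fixed capacity. The name `backup_nodes_needed` and its parameters are hypothetical.

```python
def backup_nodes_needed(stable_copy_bytes, replication_factor, node_capacity_bytes):
    """Estimate how many backup nodes to provision.

    Assumption (not from the patent): required storage is the stable copy
    size multiplied by the replication factor.
    """
    if stable_copy_bytes < 0 or replication_factor < 1 or node_capacity_bytes < 1:
        raise ValueError("sizes must be non-negative and factors at least 1")
    required = stable_copy_bytes * replication_factor
    # Integer ceiling division; always provision at least one node.
    return max(1, -(-required // node_capacity_bytes))
```

Integer arithmetic is used instead of floating-point division so that byte counts in the terabyte range do not lose precision.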
5. The computer-implemented method of claim 1, further comprising: reverting the backup node to a pre-transfer state.
This claim adds a cleanup step to the method of claim 1: the backup node is reverted to its pre-transfer state, that is, the state it was in before information from the stable copy was transferred to it. The claim does not specify what triggers the reversion. Returning the node to a known state means it no longer holds the transferred backup copy, so that, for example, it can be reused for a later backup.
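Reverting a backup node to its pre-transfer state can be sketched with a checkpoint taken before the transfer starts. This is an illustration only: `BackupNode` is a hypothetical in-memory stand-in for a node's storage, and the choice to revert whenever the block exits, whether it succeeded or failed, is an assumption, since the claim does not say when reversion happens.

```python
import copy
from contextlib import contextmanager


class BackupNode:
    """Hypothetical in-memory stand-in for a backup node's storage."""

    def __init__(self):
        self.files = {}  # file name -> bytes


@contextmanager
def reverting_transfer(node):
    """Record the node's pre-transfer state and restore it on exit.

    The restore runs whether the body finishes normally or raises.
    """
    checkpoint = copy.deepcopy(node.files)
    try:
        yield node
    finally:
        node.files = checkpoint
```

Inside the `with` block the node can receive the backup copy, run record synthesis, and upload the result. On exit it holds exactly what it held before the transfer.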
6. The computer-implemented method of claim 1, wherein a number of production nodes in a production cluster of which the production node is a constituent part does not equal a number of backup nodes in a backup cluster of which the backup node is a constituent part.
This claim narrows the method of claim 1 to the case where the production cluster and the backup cluster have different numbers of nodes. The production node belongs to a production cluster, the backup node belongs to a backup cluster, and the two node counts are not equal. The backup cluster can therefore be sized independently of the production cluster, for example to balance cost against capacity, instead of mirroring it node for node.
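When the two clusters differ in size, each production node still needs somewhere to send its stable copy. The sketch below assigns production nodes to backup nodes round-robin. The patent does not describe an assignment scheme, so the round-robin policy and the name `assign_backup_targets` are assumptions made purely for illustration.

```python
from itertools import cycle


def assign_backup_targets(production_nodes, backup_nodes):
    """Map each production node to a backup node, round-robin.

    Works for any non-empty backup cluster, whether it is smaller or
    larger than the production cluster.
    """
    if not backup_nodes:
        raise ValueError("at least one backup node is required")
    targets = cycle(backup_nodes)
    return {prod: next(targets) for prod in production_nodes}
```

A capacity-aware policy would likely be preferable in practice, since round-robin ignores how much data each production node holds.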
7. The computer-implemented method of claim 1, further comprising: restoring the backup copy from the cloud storage device to the production node.
This claim adds a recovery step to the method of claim 1: the backup copy is restored from the cloud storage device to the production node. The backup produced by the method can thus be used to recover production data after a failure or data loss.
8. A system for backing-up an eventually-consistent database in a production cluster, the system comprising: a forming module, stored in a memory, that forms, on a production node, a stable copy of production data; a provisioning module, stored in the memory, that provisions storage on a backup node based on an amount of data in the stable copy and a replication factor; a first transferring module, stored in the memory, that transfers information from the stable copy to a backup copy on the backup node; a performing module, stored in the memory, that performs record synthesis on the backup copy to merge record updates into complete backup records; an identifying and discarding module, stored in the memory, that identifies and discards any stale records and any redundant records in the complete backup records; a second transferring module, stored in the memory, that transfers the complete backup records from the backup node to a cloud storage device; and at least one physical processor that executes the forming module, the provisioning module, the first transferring module, the performing module, the identifying and discarding module, and the second transferring module.
The system addresses the challenge of efficiently backing up an eventually-consistent database in a production cluster, where data consistency is not guaranteed at all times. In such environments, traditional backup methods may fail to capture accurate or complete data states. The system creates a stable copy of production data on a production node, ensuring a consistent snapshot before backup operations begin. Storage is provisioned on a backup node based on the data size and a specified replication factor, optimizing resource allocation. The system then transfers this stable copy to the backup node, where record synthesis is performed to merge incremental updates into complete, coherent backup records. Stale or redundant records are identified and discarded to maintain data integrity. Finally, the refined backup records are transferred to a cloud storage device for long-term retention. The system ensures reliable backups by handling eventual consistency, reducing storage overhead, and maintaining data accuracy throughout the process.
9. The system of claim 8, further comprising: an identifying module, stored in the memory, that identifies a topology of a production cluster of which the production node is a constituent part to identify the production node as requiring backup.
This claim adds an identifying module to the system of claim 8. The module, stored in the memory, identifies the topology of the production cluster that contains the production node and uses it to identify the production node as requiring backup. It is the system counterpart of the method step in claim 2.
10. The system of claim 8, further comprising: a determining module, stored in the memory, that determines the amount of data in the stable copy.
This claim adds a determining module to the system of claim 8. The module, stored in the memory, determines the amount of data in the stable copy, which is the quantity the provisioning module of claim 8 uses, together with the replication factor, to provision storage on the backup node. It is the system counterpart of claim 3.
11. The system of claim 8, further comprising: a provisioning module, stored in the memory, that provisions the backup node in a backup cluster based on the amount of data in the stable copy.
This claim adds a further provisioning module to the system of claim 8. The module, stored in the memory, provisions the backup node in a backup cluster based on the amount of data in the stable copy, so that the node is neither over-provisioned nor under-provisioned relative to the data being protected. It is the system counterpart of claim 4.
12. The system of claim 8, further comprising: a reverting module, stored in the memory, that reverts the backup node to a pre-transfer state.
This claim adds a reverting module to the system of claim 8. The module, stored in the memory, reverts the backup node to its pre-transfer state, that is, the state it was in before information from the stable copy was transferred to it. It is the system counterpart of claim 5.
13. The system of claim 8, wherein a number of production nodes in a production cluster of which the production node is a constituent part does not equal a number of backup nodes in a backup cluster of which the backup node is a constituent part.
This claim narrows the system of claim 8 to the case where the number of production nodes in the production cluster does not equal the number of backup nodes in the backup cluster. The backup cluster can therefore be sized independently of the production cluster instead of mirroring it node for node. It is the system counterpart of claim 6.
14. The system of claim 8, further comprising: a restoring module, stored in the memory, that restores the backup copy from the cloud storage device to the production node.
This claim adds a restoring module to the system of claim 8. The module, stored in the memory, restores the backup copy from the cloud storage device to the production node, so that production data can be recovered after a failure or data loss. It is the system counterpart of claim 7.
15. A non-transitory computer-readable medium comprising one or more computer-executable instructions that, when executed by at least one processor of a computing device, cause the computing device to: form, on a production node, a stable copy of production data; provision storage on a backup node based on an amount of data in the stable copy and a replication factor; transfer information from the stable copy to a backup copy on the backup node; perform record synthesis on the backup copy to merge record updates into complete backup records; identify and discard any stale records and any redundant records in the complete backup records; and transfer the complete backup records from the backup node to a cloud storage device.
This invention relates to data backup and replication systems, specifically addressing the challenge of efficiently storing and managing backup data in cloud storage environments. The system automates the process of creating and maintaining reliable backups by first forming a stable copy of production data on a production node. Storage is then provisioned on a backup node based on the data size and a specified replication factor, ensuring sufficient capacity for redundancy. The system transfers the stable copy to the backup node, where it undergoes record synthesis to merge incremental updates into complete records. Stale or redundant records are identified and removed to optimize storage usage. Finally, the refined backup records are transferred to a cloud storage device, ensuring data durability and accessibility. The invention improves backup efficiency by reducing storage overhead and ensuring data consistency through automated record management.
16. The non-transitory computer-readable medium of claim 15, wherein the computer-executable instructions further cause the computing device to: identify a topology of a production cluster of which the production node is a constituent part to identify the production node as requiring backup.
This claim extends the instructions on the medium of claim 15. When executed, they additionally cause the computing device to identify the topology of the production cluster that contains the production node and to use that topology to identify the production node as requiring backup. It is the computer-readable-medium counterpart of claim 2.
17. The non-transitory computer-readable medium of claim 15, wherein the computer-executable instructions further cause the computing device to: determine the amount of data in the stable copy.
This claim extends the instructions on the medium of claim 15. When executed, they additionally cause the computing device to determine the amount of data in the stable copy, which is the quantity claim 15 uses, together with the replication factor, to provision storage on the backup node. It is the computer-readable-medium counterpart of claim 3.
18. The non-transitory computer-readable medium of claim 15, wherein the computer-executable instructions further cause the computing device to: provision the backup node in a backup cluster based on the amount of data in the stable copy.
This claim extends the instructions on the medium of claim 15. When executed, they additionally cause the computing device to provision the backup node in a backup cluster based on the amount of data in the stable copy. Sizing the node to the data gives it enough space to hold the backup copy while avoiding the cost of over-provisioning. It is the computer-readable-medium counterpart of claim 4.
19. The non-transitory computer-readable medium of claim 15, wherein the computer-executable instructions further cause the computing device to: revert the backup node to a pre-transfer state.
This claim extends the instructions on the medium of claim 15. When executed, they additionally cause the computing device to revert the backup node to its pre-transfer state, that is, the state it was in before information from the stable copy was transferred to it. The claim does not specify what triggers the reversion. It is the computer-readable-medium counterpart of claim 5.
20. The non-transitory computer-readable medium of claim 15, wherein a number of production nodes in a production cluster of which the production node is a constituent part does not equal a number of backup nodes in a backup cluster of which the backup node is a constituent part.
This claim narrows the medium of claim 15 to the case where the number of production nodes in the production cluster does not equal the number of backup nodes in the backup cluster. The backup cluster can therefore be sized independently of the production cluster instead of mirroring it node for node. It is the computer-readable-medium counterpart of claim 6.
December 15, 2020