A system and method can support federation replication in a distributed computing environment. The system can provide one or more federation replication channels between a plurality of members in a first cluster and a plurality of members in a second cluster. Furthermore, a replication request can be transmitted from a federation coordinator to the plurality of members in the first cluster, wherein each said member in the first cluster owns a set of partitions. Then, the aggregated data for each said partition in the first cluster can be sent to the plurality of members in the second cluster via said one or more federation replication channels. Additionally, using the second cluster, the system can take a persistent snapshot of information on the plurality of members in the first cluster while the first cluster is operational.
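The flow summarized above (a coordinator issues a replication request, each member of the first cluster aggregates the data of the partitions it owns, and the aggregated data is sent over a replication channel to seed the second cluster) can be sketched roughly as follows. This is only an illustrative, single-process sketch: the names `Member`, `ReplicationChannel`, and `FederationCoordinator.replicate_all` are hypothetical and not taken from the patent or from any product API, and the sketch runs synchronously, whereas the claims describe asynchronous aggregation.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Member:
    """Hypothetical cluster member that owns a set of partitions.

    Each partition maps keys to values (the key-value pairs of the claims).
    """
    name: str
    partitions: Dict[int, Dict[str, str]] = field(default_factory=dict)

    def aggregate(self) -> Dict[int, Dict[str, str]]:
        # Copy the entries of every owned partition so the originals are
        # neither locked nor mutated while the copy is shipped.
        return {pid: dict(entries) for pid, entries in self.partitions.items()}

    def seed(self, pid: int, entries: Dict[str, str]) -> None:
        # Seed (or merge into) the local copy of the given partition.
        self.partitions.setdefault(pid, {}).update(entries)


class ReplicationChannel:
    """Hypothetical channel delivering partition data to the second cluster."""

    def __init__(self, destination_members: List[Member]) -> None:
        self.destination_members = destination_members

    def send(self, pid: int, entries: Dict[str, str]) -> None:
        # Assumption: destination ownership is chosen by partition id modulo
        # the member count; the source does not specify a placement rule.
        target = self.destination_members[pid % len(self.destination_members)]
        target.seed(pid, entries)


class FederationCoordinator:
    """Hypothetical coordinator that fans a replication request out to members."""

    def __init__(self, source_members: List[Member], channel: ReplicationChannel) -> None:
        self.source_members = source_members
        self.channel = channel

    def replicate_all(self) -> int:
        # Ask every source member for its aggregated partitions and forward
        # each one over the channel. Returns the number of partitions sent.
        sent = 0
        for member in self.source_members:
            for pid, entries in member.aggregate().items():
                self.channel.send(pid, entries)
                sent += 1
        return sent
```

A caller would construct the members of both clusters, wrap the second cluster's members in a `ReplicationChannel`, and call `replicate_all()` once; the first cluster's data is only read and copied, never paused.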
Legal claims defining the scope of protection, as filed with the USPTO.
1. A method for supporting federation replication in a distributed computing environment, the method comprising: providing a first plurality of members in a first cluster of the distributed computing environment and a second plurality of members in a second cluster of the distributed computing environment, wherein each member of the first and second plurality of members in the first and second clusters comprises a distributed cache, and wherein each distributed cache of each member of the first and second plurality of members in the first and second clusters comprises a plurality of data partitions that store a datum and a key as a key-value pair; providing one or more federation replication channels between the first plurality of members in the first cluster and the second plurality of members in the second cluster; providing a federated cache service to each of the first plurality of members in the first cluster, the federated cache service comprising federation data features operable to maintain selected distributed caches of the first plurality of members as federated caches; initiating, by a federation coordinator, a replication request for replicating each of the data partitions of each distributed cache of each member in the first plurality of members; transmitting the replication request from the federation coordinator to each of the first plurality of members in the first cluster; intercepting the replication request by federation service event interceptors of the federated cache service; asynchronously aggregating, by the federated cache service, the key-value pair of each data partition of each of the federated caches; sending by the federated cache service the aggregated key-value pairs of each data partition of each of the federated caches in the first cluster to the second plurality of members in the second cluster via said one or more federation replication channels without: quiescing the federated cache service in the first cluster, or locking data of the federated caches in the first cluster; and seeding the data partitions of each distributed cache of each of the second plurality of members in the second cluster with the aggregated key-value pairs of each data partition of each of the federated caches in the first cluster.
2. The method of claim 1, further comprising: taking a snapshot for the plurality of members in the first cluster while the first cluster is operational.
3. The method of claim 1, further comprising: taking a snapshot for the plurality of members in the first cluster while the first cluster is operational by suspending the second cluster after it receives the aggregated data from the first cluster and then performing a snapshot operation on the second cluster.
4. The method of claim 1, further comprising: taking a snapshot for the plurality of members in the first cluster while the first cluster is operational by suspending the second cluster after it receives the aggregated data from the first cluster and then performing a snapshot operation on the second cluster; and resuming the second cluster after completion of the snapshot operation and resynchronizing the second cluster with the first cluster.
5. The method of claim 1, further comprising: initiating federation replication upon cold start up of said second cluster.
6. The method of claim 1, further comprising: initiating federation replication upon detecting an initial connection event from said second cluster.
7. The method of claim 1, further comprising: initiating federation replication in response to administrator input to a management console.
8. The method of claim 1, wherein the distributed computing environment is a distributed data grid.
9. A system for supporting federation replication in a distributed data grid, the system comprising: a first cluster comprising a first plurality of server nodes operating on a first plurality of computer systems each comprising a microprocessor and a memory, wherein each of the first plurality of server nodes owns a set of data partitions, and wherein each data partition in the set stores a datum and a key as a key-value pair; a second cluster comprising a second plurality of server nodes operating on a second plurality of computer systems each comprising a microprocessor and a memory; one or more federation replication channels between the plurality of server nodes in the first cluster and the plurality of server nodes in the second cluster; a federated cache service at each of the first plurality of server nodes of the first cluster, the federated cache service comprising federation data features operable to maintain selected caches of the first plurality of server nodes as federated caches; and a federation coordinator configured to send a replication request to the plurality of server nodes in the first cluster, wherein a federation service event interceptor of the federated cache service intercepts the replication request, wherein the federated cache service asynchronously aggregates the key-value pair of each data partition of each of the federated caches, wherein each of said first plurality of server nodes is configured such that, in response to receiving the replication request, said each of the federated cache service sends the aggregated key-value pair data for every partition owned by said each of said first plurality of server nodes to one of the second plurality of server nodes in said second cluster via said one or more federation replication channels without: quiescing the cache service in the first cluster, or locking data of the federated caches in the first cluster, wherein the data partitions of each distributed cache of each of the second plurality of members in the second cluster are seeded with the aggregated key-value pairs of each data partition of each of the federated caches in the first cluster.
10. The system of claim 9, wherein: said second cluster is configured to support taking a snapshot for the plurality of server nodes in the first cluster while the first cluster is operational.
11. The system of claim 9, wherein: said second cluster is configured to support taking a snapshot for the plurality of server nodes in the first cluster while the first cluster is operational by suspending the second cluster after it receives the aggregated data from the first cluster and then performing a snapshot operation on the second cluster.
12. The system of claim 11, wherein: said second cluster is configured to support taking a snapshot for the plurality of server nodes in the first cluster while the first cluster is operational by suspending the second cluster after it receives the aggregated data from the first cluster and then performing a snapshot operation on the second cluster; said second cluster is configured to resume operation of the second cluster after completion of the snapshot operation; and said second cluster is configured to resynchronize with the first cluster after resuming operation.
13. The system of claim 9, wherein the system is configured to initiate federation replication upon cold start-up of said second cluster.
14. The system of claim 9, wherein the system is configured to initiate federation replication upon detecting an initial connection event from said second cluster.
15. The system of claim 9, wherein the system is configured to initiate federation replication in response to administrator input to a management console.
16. A non-transitory computer readable medium including instructions stored thereon for supporting federation replication in a distributed computing environment, which instructions, when executed, configure nodes in the distributed computing environment to perform steps comprising: providing a first plurality of members in a first cluster of the distributed computing environment and a second plurality of members in a second cluster of the distributed computing environment, wherein each member of the first and second plurality of members in the first and second clusters comprises a distributed cache, and wherein each distributed cache of each member of the first and second plurality of members in the first and second clusters comprises a plurality of data partitions that store a datum and a key as a key-value pair; providing one or more federation replication channels between the first plurality of members in the first cluster and the second plurality of members in the second cluster; providing a federated cache service to each of the first plurality of members in the first cluster, the federated cache service comprising federation data features operable to maintain selected distributed caches of the first plurality of members as federated caches; initiating, by a federation coordinator, a replication request for replicating each of the data partitions of each distributed cache of each member in the first plurality of members; transmitting the replication request from the federation coordinator to each of the first plurality of members in the first cluster; intercepting the replication request by federation service event interceptors of the federated cache service; asynchronously aggregating, by the federated cache service, the key-value pair of each data partition of each of the federated caches; sending by the federated cache service the aggregated key-value pairs of each data partition of each of the federated caches in the first cluster to the second plurality of members in the second cluster via said one or more federation replication channels without: quiescing the cache service in the first cluster, or locking data of the federated caches in the first cluster; and seeding the data partitions of each distributed cache of each of the second plurality of members in the second cluster with the aggregated key-value pairs of each data partition of each of the federated caches in the first cluster.
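The snapshot procedure recited in claims 3, 4, 11, and 12 (replicate to the second cluster, suspend it, take the snapshot there, resume it, then resynchronize with the first cluster) can be illustrated with a small in-memory sketch. Everything here is a hypothetical stand-in: `Cluster`, `replicate`, `snapshot_via_replica`, and the `while_suspended` hook are invented for illustration, and a whole-map copy stands in for partition-level federation replication.

```python
import copy
from typing import Any, Callable, Dict, Optional


class Cluster:
    """Hypothetical cluster reduced to a single key-value map."""

    def __init__(self, data: Optional[Dict[str, Any]] = None) -> None:
        self.data: Dict[str, Any] = dict(data or {})
        self.suspended = False

    def put(self, key: str, value: Any) -> None:
        if self.suspended:
            raise RuntimeError("cluster is suspended")
        self.data[key] = value


def replicate(source: Cluster, destination: Cluster) -> None:
    # Stand-in for federation replication: copy the source's entries into
    # the destination. The source is only read, never suspended or locked.
    if destination.suspended:
        raise RuntimeError("destination is suspended")
    destination.data.update(copy.deepcopy(source.data))


def snapshot_via_replica(
    first: Cluster,
    second: Cluster,
    while_suspended: Optional[Callable[[], None]] = None,
) -> Dict[str, Any]:
    """Snapshot the first cluster's data by way of the second cluster."""
    replicate(first, second)              # second receives the aggregated data
    second.suspended = True               # suspend only the second cluster
    if while_suspended is not None:
        while_suspended()                 # first cluster keeps serving writes
    snapshot = copy.deepcopy(second.data)  # snapshot operation on the second cluster
    second.suspended = False              # resume the second cluster
    replicate(first, second)              # resynchronize with the first cluster
    return snapshot
```

For example, passing `while_suspended=lambda: first.put("b", 2)` shows a write landing on the first cluster during the snapshot: the returned snapshot omits `"b"`, while the resynchronized second cluster contains it.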
September 25, 2015
May 26, 2020