Federation
Overview
AMKO uses federation to replicate AMKO configuration to a set of member clusters. This ensures a seamless recovery of AMKO configuration during disasters.
Federation Set
The set of clusters registered on AMKO is called a federation set. One cluster in the federation set must be designated as the leader; all the other clusters are designated as followers.
Responsibilities
- Leader Cluster: Is responsible for distributing the AMKO configuration to all the follower clusters in the federation set. AMKO in the leader cluster is also responsible for creating and updating GSLB Services on the Avi Controller.
- Follower Cluster: AMKO in a follower cluster runs passively and, unlike the leader, does not carry out federation activities or GslbService sync operations.
After a disaster, the user must manually pick one of the erstwhile follower clusters and designate it as the new leader.
Note: At any point in time, ensure that only one AMKO is designated as the leader.
Federated Objects
The following objects are federated from the leader cluster to the follower clusters:
- GSLBConfig
- GlobalDeploymentPolicy or GDP
Add/Update/Delete events of these objects are federated to the follower clusters.
Flow
Assume the following topology (a cluster is a Kubernetes/OpenShift cluster):
- Cluster 1 in Site 1
- Cluster 2 in Site 2
- Cluster 3 in Site 3
Each site has an Avi Controller deployed. The Avi Controller in site 1 is chosen as the GSLB leader, and the controllers in all the other sites are GSLB followers. AMKO is deployed on all three sites. Cluster 1 (in site 1) is marked as the leader. Clusters 2 and 3 are marked as followers.
On adding or updating the GSLBConfig and GDP objects in cluster 1, AMKO’s federator in cluster 1 federates the changes to these objects to all the follower clusters.
AMKOCluster CRD to control federation
A CRD called AMKOCluster governs the federation. A typical AMKOCluster object for a leader AMKO appears as shown below:
    apiVersion: amko.vmware.com/v1alpha1
    kind: AMKOCluster
    metadata:
      name: amkocluster-sample
      namespace: avi-system
    spec:
      isLeader: true
      clusterContext: cluster1
      version: 1.6.1
      clusters:
      - cluster1
      - cluster2
    status:
      conditions:
      - status: valid AMKOCluster object
        type: current AMKOCluster Validation
      - status: all cluster clients fetched
        type: member cluster initialisation
      - status: validated all member clusters
        type: member cluster validation
      - status: federated to all valid clusters successfully
        type: GSLBConfig Federation
      - status: federated to all valid clusters successfully
        type: GDP Federation
Here,
- namespace: The namespace of this object is avi-system.
- isLeader: Specifies whether the AMKO in the current cluster is the leader. By default, this is set to false. If set to false, AMKO will not sync any objects to the Avi Controller, and the AMKO federator will not federate the objects to the member clusters.
- clusterContext: Specifies the current cluster’s context. Providing the wrong cluster context can cause undefined behavior.
- version: This is the current cluster’s AMKO version. If installed via Helm, this field gets automatically populated.
- clusters: The list of member clusters on which federation will be performed. The current cluster, if present in this list, will be ignored.
- status: Indicates the current state of federation. The following types are reflected in the status:
  - current AMKOCluster Validation: Indicates the validity of the current AMKOCluster object.
  - member cluster initialisation: Indicates whether the cluster contexts given in spec.clusters were fetched and initialised from the gslb-config-secret secret. If a member cluster given in spec.clusters is not found in the gslb-config-secret, this step fails.
  - member cluster validation: The federator validates all the member clusters in the spec.clusters list and indicates a success or an error. Validation includes sanity checks, version mismatch checks, leader checks, and so on.
  - GSLBConfig Federation: The federator indicates whether it was able to federate the GSLBConfig object to all the clusters in spec.clusters successfully.
  - GDP Federation: The federator indicates whether it was able to federate the GDP/GlobalDeploymentPolicy object to all the clusters in spec.clusters successfully.
If Helm is used to deploy AMKO, this Custom Resource will be installed, and these values have to be provided via values.yaml.
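For comparison with the leader sample above, a follower's AMKOCluster object might look like the sketch below. It assumes a follower whose cluster context is cluster2 and reuses the object name, version, and member list from the leader sample; these values are illustrative and are not taken from a real deployment. The status section is omitted because AMKO populates it.

```yaml
apiVersion: amko.vmware.com/v1alpha1
kind: AMKOCluster
metadata:
  name: amkocluster-sample   # assumed: same object name as in the leader sample
  namespace: avi-system
spec:
  isLeader: false            # follower: does not federate or sync to the Avi Controller
  clusterContext: cluster2   # assumed context name of this follower cluster
  version: 1.6.1             # has to match the leader's AMKO version
  clusters:                  # assumed: same member list as on the leader
  - cluster1
  - cluster2
```

Keeping the member list identical to the leader's is an assumption made here so that the object needs only the isLeader change if this cluster is later promoted.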
Notes:
- The federation set can be a subset of the overall member cluster set used for GSLB.
- AMKO has to be deployed on all clusters in the federation set.
- spec.clusterContext must contain the current cluster’s context.
- spec.version is compared against the versions of all member clusters. All AMKO clusters must have the same version as the leader cluster. The federation logic will not work if there’s a version mismatch.
- spec.clusters contains the federation cluster set.
- Only one AMKO can be a leader; all other AMKOs have to be followers. If there are two leaders at any point, federation will stop and the error will be written to the AMKOCluster’s status.
Disasters and Recovery
During a cluster-down event on the leader AMKO, the federation of config objects stops. However, at this point, all the other clusters participating in federation are already synced with the up-to-date configuration of the erstwhile leader. Hence, switching to a new AMKO leader does not require any manual steps to recover the AMKO config objects.
Assume that there are 3 sites with one cluster in each of them:
- Cluster 1 in Site 1
- Cluster 2 in Site 2
- Cluster 3 in Site 3
A disaster can affect either an entire site or just the cluster in that site. A site failure means that both the Kubernetes/OpenShift cluster and the Avi Controller in that site are down, whereas a cluster failure means that only the Kubernetes/OpenShift cluster is down.
If the site that hosted both the leader AMKO and the Avi GSLB leader fails, do the following:
- Select a follower site to be the new GSLB leader.
- Choose a follower AMKO to be the new leader, and follow the steps given below on the cluster where this follower AMKO is deployed:
  - Edit the GSLBConfig object and change the leader IP address:
        $ kubectl edit gslbconfig -n avi-system gc-1
        # Set the field spec.gslbLeader.controllerIP to the new leader's IP address
  - Set the isLeader field in the AMKOCluster object to true on this cluster:
        $ kubectl edit amkocluster amkocluster-federation -n avi-system
  - This reboots the new leader AMKO. After the reboot, the new leader takes over the responsibilities of the previous leader.
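As a rough illustration of the two edits above, the affected fields might look as follows once the changes are saved. Only the edited fields are shown, and 192.0.2.10 is a placeholder address chosen for this sketch; substitute the IP address of the actual new leader.

```yaml
# GSLBConfig gc-1 in avi-system (fragment; all other fields stay unchanged)
spec:
  gslbLeader:
    controllerIP: 192.0.2.10   # placeholder for the new leader's IP address
---
# AMKOCluster amkocluster-federation in avi-system (fragment; all other fields stay unchanged)
spec:
  isLeader: true               # promotes this former follower AMKO to leader
```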
Scenario: Cluster Failure
Consider the scenario where the cluster on which the leader AMKO was deployed fails.
Since the Avi GSLB leader is still active, the user only has to choose a new AMKO leader from among the followers. The steps to recover AMKO and designate the new leader remain the same as shown in the section above.
$ kubectl edit amkocluster amkocluster-federation -n avi-system
Old Leader AMKO Boots Up
At any point in time, the architecture allows only a single AMKO leader. Conflicts arising from more than one leader must be resolved manually by the admin. Such a conflict does not have any traffic impact on the existing GslbService objects.
To resolve this situation, the admin must convert one of the leader AMKOs to a follower by setting the spec.isLeader field to false in the AMKOCluster object:
$ kubectl edit amkocluster amkocluster-federation -n avi-system
This is especially important when a cluster that hosted a leader instance of AMKO previously failed and the user has since switched a follower AMKO to be the new leader. When the failed cluster recovers, it brings back the old leader AMKO. In this case, set the old leader to follower.
Caveats
Federation is currently a one-way communication from the leader AMKO to the follower AMKOs. The AMKO federator on the leader cluster reacts to create/update/delete operations on the GSLBConfig and GDP objects on the leader cluster. Modifying these objects on a follower cluster does not prompt the federator on the leader to update them.
Date | Change Summary |
---|---|
July 29, 2021 | Created the article for Federation (version 1.4.2) |