High Availability Agones

Learn how to configure your Agones services for high availability and resiliency to disruptions.

High Availability for Agones Controller

When SplitControllerAndExtensions is enabled, the agones-controller responsibility is split between agones-controller, which enacts the Agones control loop, and agones-extensions, which acts as a service endpoint for webhooks and the allocation extension API. Splitting these responsibilities allows the agones-extensions pod to be horizontally scaled, making the Agones control plane highly available and more resilient to disruption.

Extension Pod Configurations

The agones-extensions binary has a similar helm configuration to agones-controller (see here). If you previously overrode agones.controller.* settings, you may need to override the same agones.extensions.* settings.

To change controller.numWorkers from 100 to 200 using helm --set, add the following to the helm command:

 ...
 --set agones.controller.numWorkers=200
 ...
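
If the same override should also apply to the extensions pods, one way to derive the matching flag is to swap the key prefix. The sketch below assumes the chart exposes numWorkers under agones.extensions.* as well, which the text above implies but does not confirm; the variable names are illustrative, and the script only prints the flags rather than calling helm.

```shell
#!/bin/sh
# Existing controller override, as in the example above.
CONTROLLER_FLAG="agones.controller.numWorkers=200"

# Derive the extensions key by swapping the prefix.
# Assumption: the same setting exists under agones.extensions.*.
EXTENSIONS_FLAG=$(echo "${CONTROLLER_FLAG}" | sed 's/^agones\.controller\./agones.extensions./')

# Print both flags so they can be appended to an existing helm command.
echo "--set ${CONTROLLER_FLAG} --set ${EXTENSIONS_FLAG}"
```

Check the derived key against the helm configuration reference before using it, since not every controller setting necessarily has an extensions counterpart.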

Important settings to note are the PodDisruptionBudget fields, agones.extensions.pdb.minAvailable and agones.extensions.pdb.maxUnavailable. Currently, agones.extensions.pdb.minAvailable is set to 1.
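
To see what a minAvailable of 1 means in practice: a PodDisruptionBudget permits voluntary evictions only while at least minAvailable pods remain available, so the number of pods that can be evicted at once is the healthy pod count minus minAvailable. The sketch below works this out for the default of 2 agones-extensions replicas, assuming both are healthy; the variable names are illustrative.

```shell
#!/bin/sh
REPLICAS=2        # default agones.extensions.replicas
MIN_AVAILABLE=1   # current agones.extensions.pdb.minAvailable

# Pods that may be evicted at the same time without violating the budget.
ALLOWED_DISRUPTIONS=$((REPLICAS - MIN_AVAILABLE))

echo "allowed voluntary disruptions: ${ALLOWED_DISRUPTIONS}"
```

With these values, a node drain can evict one agones-extensions pod at a time while the other keeps serving. Note that Kubernetes accepts only one of minAvailable and maxUnavailable in a single PodDisruptionBudget.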

Deployment Considerations

When SplitControllerAndExtensions is enabled, what was previously a single agones-controller pod is deployed as one agones-controller pod and two agones-extensions pods. For example:

NAME                                 READY   STATUS    RESTARTS   AGE
agones-allocator-78c6b8c79-h9nqc     1/1     Running   0          23h
agones-allocator-78c6b8c79-l2bzp     1/1     Running   0          23h
agones-allocator-78c6b8c79-rw75j     1/1     Running   0          23h
agones-controller-fbf944f4-vs9xx     1/1     Running   0          23h
agones-extensions-5648fc7dcf-hm6lk   1/1     Running   0          23h
agones-extensions-5648fc7dcf-qbc6h   1/1     Running   0          23h
agones-ping-5b9647874-2rrl6          1/1     Running   0          27h
agones-ping-5b9647874-rksgg          1/1     Running   0          27h

The number of replicas for agones-extensions can be set using the helm variable agones.extensions.replicas; the default is 2.
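
As a minimal sketch, the replica count can be overridden with --set like any other helm value. The value 3 and the variable names below are illustrative assumptions, not recommendations, and the script only prints the flag rather than calling helm.

```shell
#!/bin/sh
# Illustrative replica count; the chart default is 2.
EXTENSIONS_REPLICAS=3

# Build the --set flag for the agones.extensions.replicas helm variable.
REPLICAS_FLAG="--set agones.extensions.replicas=${EXTENSIONS_REPLICAS}"

# Print the flag so it can be appended to an existing helm command.
echo "${REPLICAS_FLAG}"
```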

We expect the aggregate memory consumption of the pods to be slightly higher than that of the previous singleton pod. Because the responsibilities are now split across the pods, the aggregate CPU consumption should be similar.

Feature Design

SplitControllerAndExtensions represents phase 1 of HA Agones. The remaining phases are not yet implemented.


Last modified February 15, 2023: Add documentation for SplitControllerAndExtensions (#2961) (3f7dd4296)