Flux can detect and auto-remediate Kubernetes configuration drift by comparing the desired state defined in your Git repository against the actual state of your cluster.
Let’s see Flux in action. Imagine you have a deployment defined in your Git repository like this:
    # git/apps/my-app.yaml
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: my-app
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: my-app
      template:
        metadata:
          labels:
            app: my-app
        spec:
          containers:
            - name: main
              image: nginx:1.21.0
Assume Flux is configured to watch this Git repository. Once the first reconciliation completes, your cluster runs 3 replicas of my-app.
Now, someone manually scales the deployment in the cluster:
    kubectl scale deployment my-app --replicas=5
Flux reconciles continuously, so on its next run it detects the discrepancy: Git specifies replicas: 3, but the cluster reports replicas: 5. Flux then reapplies the manifest from Git, and the Deployment scales back to 3 replicas.
The core problem Flux solves here is configuration drift: the divergence between your intended system state (what’s in Git) and the actual state of your Kubernetes cluster. Manual changes, accidental or intentional, can break your deployments, introduce security vulnerabilities, or cause unexpected behavior. Flux acts as a continuous, automated auditor and enforcer of your Git-defined desired state.
Internally, Flux uses a reconciliation loop. source-controller periodically fetches the latest revision from your Git repository, and kustomize-controller builds the manifests and compares them with the live objects using a server-side apply dry run. Any managed resource whose live state differs from its definition in Git counts as drifted, and the controller applies the Git version to the cluster, overwriting the drifted fields.
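The Git-fetching half of this loop is configured on a GitRepository object, which source-controller polls. A minimal sketch, assuming the source.toolkit.fluxcd.io/v1 API; the URL and branch are hypothetical placeholders, and the name my-flux-repo is chosen to match the sourceRef used in the Kustomization later in this article:

```yaml
# Sketch only: url and branch are placeholders, not values from this article.
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: my-flux-repo
  namespace: flux-system
spec:
  interval: 1m      # How often source-controller checks Git for new commits
  url: https://example.com/org/my-flux-repo.git   # hypothetical
  ref:
    branch: main    # hypothetical
```

This interval only controls how quickly new commits are noticed. How often the cluster is compared against Git is set separately on the Kustomization or HelmRelease.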
The key levers you control are:
- The Git Repository: This is your single source of truth. All desired configurations for your applications and cluster infrastructure should live here.
- Flux Controllers: Specifically, kustomize-controller (for Kustomize) or helm-controller (for Helm) are responsible for applying the desired state.
- Reconciliation Interval: You configure how often Flux checks for changes in Git and reconciles the cluster. A shorter interval means faster drift detection and remediation but higher resource usage.
- Health Checks and Suspend: You can define health checks so that a reconciliation only counts as successful once the applied workloads become ready; a failed check marks the Kustomization as not ready, which you can alert on. You can also temporarily suspend reconciliation for specific resources or entire applications if you need to perform manual maintenance.
- Notifications: Flux can send notifications (e.g., to Slack or Teams) when drift is detected or when remediation occurs, keeping you informed.
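The notification lever in the list above is handled by notification-controller through Provider and Alert objects. A hedged sketch, assuming the notification.toolkit.fluxcd.io/v1beta3 API; the channel name and the slack-webhook Secret are hypothetical and would need to exist in your cluster:

```yaml
# Sketch only: channel and Secret names are illustrative assumptions.
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Provider
metadata:
  name: slack
  namespace: flux-system
spec:
  type: slack
  channel: platform-alerts    # hypothetical
  secretRef:
    name: slack-webhook       # hypothetical Secret with the webhook URL under the `address` key
---
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Alert
metadata:
  name: reconciliation-events
  namespace: flux-system
spec:
  providerRef:
    name: slack
  eventSeverity: info         # use `error` to be notified about failures only
  eventSources:
    - kind: Kustomization
      name: '*'               # all Kustomizations in this namespace
```

With eventSeverity: info, the alert forwards regular reconciliation events, including those emitted when an object is reconfigured to match Git, so a reverted manual change shows up in the channel.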
With kustomize-controller, drift correction needs no dedicated switch: every reconciliation of a Kustomization reapplies the manifests from Git, so drift is corrected at the cadence set by spec.interval. The remediation settings that do exist in Flux belong to HelmRelease objects, where they control retries and rollbacks for failed installs and upgrades. For a Kustomization, the configuration might look like this:

    # git/flux-system/gotk-sync.yaml
    apiVersion: kustomize.toolkit.fluxcd.io/v1
    kind: Kustomization
    metadata:
      name: my-app-kustomization
      namespace: flux-system
    spec:
      interval: 10m          # How often to reconcile and correct drift
      sourceRef:
        kind: GitRepository
        name: my-flux-repo
      path: ./apps/my-app    # Path within the Git repo
      prune: true            # Delete cluster objects that were removed from Git
      # Optional: retry sooner than `interval` after a failed reconciliation
      # retryInterval: 2m

The interval field is the crucial part. It sets how often Flux compares the cluster against Git and reapplies the manifests, whether or not a new commit has arrived. A new revision reported by the source also triggers a reconciliation, so commits do not have to wait for the full interval. prune: true extends the same guarantee to deletions: objects removed from Git are removed from the cluster.
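For workloads managed by helm-controller, drift correction is opt-in and configured on the HelmRelease itself. A minimal sketch, assuming the helm.toolkit.fluxcd.io/v2 API; the chart name, version range, and HelmRepository are hypothetical:

```yaml
# Sketch only: chart, version, and repository names are illustrative assumptions.
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: my-app
  namespace: flux-system
spec:
  interval: 10m
  chart:
    spec:
      chart: my-app             # hypothetical
      version: "1.x"            # hypothetical
      sourceRef:
        kind: HelmRepository
        name: my-charts         # hypothetical
  driftDetection:
    mode: enabled               # `warn` reports drift without correcting it
    ignore:
      - paths: ["/spec/replicas"]   # e.g. leave replica counts to an autoscaler
        target:
          kind: Deployment
```

Starting with mode: warn is a cautious way to see what would be corrected before letting the controller overwrite live objects.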
A notable aspect of Flux’s drift detection is its granularity. It doesn’t just check whether a resource exists; it compares every field defined in Git against the live object. Even a minor change to such a field, like an edited annotation or label, is flagged as drift and corrected on the next reconciliation. Fields that Git does not specify and that another controller owns, such as a replica count managed by a HorizontalPodAutoscaler when spec.replicas is omitted from the manifest, are left alone. For everything you do declare, the cluster remains a precise reflection of your Git repository, down to the smallest configuration detail.
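When a change must deliberately survive reconciliation, for example during manual maintenance, Flux offers two escape hatches. The sketch below reuses the object names from this article and shows suspending a whole Kustomization via spec.suspend, and excluding a single object with the kustomize.toolkit.fluxcd.io/reconcile annotation. Both are partial manifests meant to be merged into the existing definitions, not complete objects:

```yaml
# Option 1: pause the whole Kustomization (fragment of its spec)
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: my-app-kustomization
  namespace: flux-system
spec:
  suspend: true    # no reconciliation, and so no drift correction, until set back to false
---
# Option 2: exclude one object from reconciliation (fragment of the Deployment)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  annotations:
    kustomize.toolkit.fluxcd.io/reconcile: disabled
```

Either setting should be reverted once maintenance is over; otherwise later commits to Git will not reach the affected objects.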
The next challenge you’ll likely face is managing multiple Git repositories and complex dependency chains between your applications and cluster infrastructure.