When Flux reconciliation starts failing right after an upgrade, the usual reason is that the new controllers expect CRDs, resources, or configuration in a different format from what the old version left in the cluster.

Here’s how to fix it, covering the most common causes:

  1. Outdated CRDs: The Flux controllers (source-controller, kustomize-controller, etc.) might be running a newer version than the Custom Resource Definitions (CRDs) installed in the cluster. This is the most frequent culprit.

    • Diagnosis: Check the CRD versions against the controller versions. For example, run kubectl get crd gitrepositories.source.toolkit.fluxcd.io -o yaml and look at the spec.versions[*].name, spec.versions[*].served, and spec.versions[*].storage fields, then compare them to the versions of your running Flux controllers (e.g., kubectl get deployments -n flux-system -o wide, which lists each controller’s image tag, or flux check).
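    • Fix: Apply the CRDs from the new version of Flux. For instance, if you’re upgrading to Flux v0.37.0, you’d run kubectl apply -f https://github.com/fluxcd/flux2/releases/download/v0.37.0/install.yaml, which bundles the CRDs for all controllers together with the controller Deployments and RBAC; flux install --version=v0.37.0 does the same through the CLI. Per-controller CRD files (source-controller.crds.yaml, kustomize-controller.crds.yaml, and so on) are published in each controller’s own repository under that controller’s version number, not under the flux2 release.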
    • Why it works: CRDs define the schema for custom resources. If the controller code expects a schema that doesn’t exist or is different from what the CRDs define, it can’t parse or store its state correctly, leading to reconciliation errors.
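
The diagnosis for outdated CRDs can be scripted. The following is a minimal sketch, assuming kubectl points at the affected cluster and Flux runs in the flux-system namespace; the helper names crd_versions and controller_images are made up for this example.

```shell
# Print every version a CRD defines, with its served and storage flags.
# $1 is the full CRD name, e.g. gitrepositories.source.toolkit.fluxcd.io
crd_versions() {
  kubectl get crd "$1" \
    -o jsonpath='{range .spec.versions[*]}{.name}{" served="}{.served}{" storage="}{.storage}{"\n"}{end}'
}

# Print each Flux controller Deployment with the image (and tag) it runs.
controller_images() {
  kubectl get deployments -n flux-system \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.template.spec.containers[0].image}{"\n"}{end}'
}

# Usage (needs cluster access):
#   crd_versions gitrepositories.source.toolkit.fluxcd.io
#   controller_images
```

If the controller image tags come from the new release but the CRD does not list the API version that release documents, the CRDs were probably not upgraded.
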
  2. Incompatible Kustomization or HelmRelease API Versions: You might have applied Kustomization or HelmRelease resources using an API version that’s no longer supported or has been changed in the new Flux version.

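    • Diagnosis: Inspect your Kustomization and HelmRelease definitions and check their apiVersion fields. For example, a manifest might still use kustomize.toolkit.fluxcd.io/v1beta1 while the new release serves kustomize.toolkit.fluxcd.io/v1. To see which versions the cluster actually serves, run kubectl api-versions | grep toolkit.fluxcd.io.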
    • Fix: Update the apiVersion in your Kustomization and HelmRelease manifests to match the new version. For example, change apiVersion: kustomize.toolkit.fluxcd.io/v1beta1 to apiVersion: kustomize.toolkit.fluxcd.io/v1. Then re-apply them: kubectl apply -f my-kustomization.yaml.
    • Why it works: The controllers specifically look for resources matching certain API versions. If you’re using an outdated API version, the controllers might not even see or process your resources correctly.
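
The apiVersion bump can be applied to many manifests with a small filter. This is a sketch only: it assumes the apiVersion line is the only change a manifest needs (check the release notes for field changes as well), and the function name migrate_api_version is hypothetical.

```shell
# Rewrite one apiVersion value to another in a manifest read from stdin.
# Only lines of the form "apiVersion: <old>" are changed; everything else
# passes through untouched. Dots in <old> act as regex wildcards, which is
# harmless for this purpose.
migrate_api_version() {
  old=$1
  new=$2
  sed "s|^\([[:space:]]*apiVersion:[[:space:]]*\)${old}[[:space:]]*\$|\1${new}|"
}

# Usage:
#   migrate_api_version kustomize.toolkit.fluxcd.io/v1beta1 kustomize.toolkit.fluxcd.io/v1 \
#     < my-kustomization.yaml > my-kustomization.new.yaml
#   kubectl apply --dry-run=server -f my-kustomization.new.yaml   # validate first
```

Writing to a new file and validating it with a server-side dry run keeps the original manifest intact until the cluster has accepted the new version.
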
  3. Stale GitRepository or HelmRepository Definitions: As with CRDs, the source-controller might expect a newer schema for the resources that describe your external sources.

    • Diagnosis: Examine your GitRepository and HelmRepository custom resources. Look for fields that might have been deprecated or changed in the latest release notes.
    • Fix: Consult the Flux v2 changelog for the specific version you’re upgrading to. It will detail any necessary changes to these resource definitions. For instance, if the changelog lists a spec field as renamed or removed, update every GitRepository or HelmRepository that still uses it, then apply the updated manifests.
    • Why it works: The source-controller uses these resources to fetch code and charts. If the structure it expects to read from these resources has changed, it won’t be able to authenticate, connect, or download the correct content.
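
To find manifests that still use a field the changelog lists as renamed or removed, a recursive search is usually enough. This is a sketch under the assumption that your manifests live in a local directory; find_field_usage and the field name someRemovedField are placeholders, so take the real field name from the release notes.

```shell
# List every line under a directory where a given YAML key is set.
# $1 is the key name (without the colon), $2 is the directory to search.
# Output format is grep's usual path:line-number:content.
find_field_usage() {
  field=$1
  dir=$2
  grep -rnE "^[[:space:]]*${field}:" "$dir"
}

# Usage (someRemovedField is a placeholder):
#   find_field_usage someRemovedField ./clusters
```

The function returns a non-zero status when nothing matches, so it can also serve as a check in CI before an upgrade.
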
  4. RBAC Permissions Issues: The upgrade process might have introduced new RBAC roles or role bindings that are missing, or old ones might be interfering.

    • Diagnosis: Check the logs of the Flux controllers in the flux-system namespace, for example with kubectl logs -n flux-system <pod-name>. Look for "forbidden", "permission denied", or "unauthorized" errors when they try to access Kubernetes API resources.
    • Fix: Re-apply the install manifests for the new Flux version, which include the default RBAC. This ensures all necessary permissions are granted. For example: kubectl apply -f https://github.com/fluxcd/flux2/releases/download/v0.37.0/install.yaml.
    • Why it works: Flux controllers need specific permissions to interact with the Kubernetes API (e.g., to create/update Deployments, Services, etc., or to read Secrets). If these permissions are missing, the controllers can’t perform their reconciliation tasks.
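
The RBAC checks can be sketched as two helpers. They assume the default installation, where each controller runs in flux-system under a service account named after the controller; the function names rbac_errors and can_controller are invented for this example.

```shell
# Show recent log lines from one controller that look like authorization
# failures. $1 is the controller Deployment name, e.g. kustomize-controller.
rbac_errors() {
  kubectl logs -n flux-system "deployment/$1" --since=1h 2>&1 |
    grep -iE 'forbidden|unauthorized|permission denied'
}

# Ask the API server whether a controller's service account may perform
# an action. $1 = controller name, $2 = verb, $3 = resource, $4 = namespace.
can_controller() {
  kubectl auth can-i "$2" "$3" -n "$4" \
    --as="system:serviceaccount:flux-system:$1"
}

# Usage (needs cluster access and permission to impersonate):
#   rbac_errors kustomize-controller
#   can_controller kustomize-controller create deployments default
```

A "no" from the second helper for an action the controller needs points at a missing role or role binding rather than at a problem in your manifests.
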
  5. Corrupted Controller State: In rare cases, the internal state of a Flux controller might become corrupted, especially if there were abrupt shutdowns or network issues during previous reconciliations.

    • Diagnosis: This is harder to diagnose directly. If all other steps fail and logs show persistent, unexplainable errors for a specific controller, this is a possibility.
    • Fix: You can try to reset a specific controller by deleting its Deployment and associated resources and recreating them from the Flux install manifests. Be very careful with this, and make sure you understand the implications. If that doesn’t help, a more thorough option is to uninstall and reinstall Flux, then re-apply your GitOps configurations; keep in mind that uninstalling also removes the Flux CRDs and the custom resources that depend on them.
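    • Why it works: Flux controllers persist their reconciliation state in the status of their custom resources (stored in etcd through the Kubernetes API) and keep caches in memory; the source-controller also keeps fetched artifacts on its local volume. If this state becomes inconsistent, a controller can get stuck. Recreating the controller clears its in-memory and local state and forces a fresh reconciliation.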
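
Before deleting anything, a rolling restart is a less destructive way to give a controller a clean start. This sketch swaps in kubectl rollout restart for the delete-and-recreate step described above; it assumes the default flux-system namespace, and restart_controller is a made-up helper name.

```shell
# Restart one Flux controller and wait until the new pod is ready.
# $1 is the controller Deployment name, e.g. source-controller.
restart_controller() {
  kubectl rollout restart "deployment/$1" -n flux-system &&
    kubectl rollout status "deployment/$1" -n flux-system --timeout=120s
}

# Usage (needs cluster access):
#   restart_controller source-controller
```

If the errors persist after the restart, the problem is more likely in the stored resources than in the controller process.
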
  6. Incorrect flux bootstrap Command: If you used flux bootstrap to install or upgrade, an incorrect command or flag might have led to a misconfiguration that’s now breaking reconciliation.

    • Diagnosis: Review the exact flux bootstrap command you used. Compare it against the documentation for the version you upgraded to.
    • Fix: Re-run flux bootstrap with the correct parameters for your desired setup and the new Flux version. This will re-provision the Flux controllers with the correct configuration.
    • Why it works: The bootstrap command sets up all the initial Flux components, including their configurations, RBAC, and CRDs. An incorrect bootstrap can leave Flux in a state where it’s not properly configured to reconcile.
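
As an illustration of a re-run, here is a sketch for a GitHub-hosted repository. The provider, the main branch, and the helper name rebootstrap_github are assumptions, not something the steps above prescribe; flux bootstrap github also expects a GITHUB_TOKEN in the environment, and personal accounts additionally need --personal.

```shell
# Run the pre-flight checks, bootstrap again, then verify the installation.
# $1 = GitHub owner, $2 = repository name, $3 = path inside the repository.
rebootstrap_github() {
  flux check --pre &&
    flux bootstrap github \
      --owner="$1" \
      --repository="$2" \
      --branch=main \
      --path="$3" &&
    flux check
}

# Usage (placeholder values):
#   export GITHUB_TOKEN=<token>
#   rebootstrap_github my-org fleet-infra clusters/production
```

Use the same owner, repository, and path as the original installation; pointing bootstrap at a different path creates a second, separate configuration instead of repairing the existing one.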

Once these are fixed, the next error you may run into is a reconciliation failure on a specific Kustomization or HelmRelease that can’t fetch its source, which is the next logical thing to debug.
