Rotating the root CA for Linkerd’s mTLS is less about ceremony and more about maintaining the integrity of your service mesh’s identity system.
Let’s see what happens when Linkerd is configured to use a specific trust anchor, which is essentially a root certificate. We’ll simulate a basic Linkerd setup and then demonstrate how a certificate that’s about to expire forces a rotation.
Imagine a linkerd-config.yaml that defines a custom trust anchor:
    apiVersion: linkerd.io/v1alpha1
    kind: Config
    identityContext:
      trustAnchors:
      - cert: |
          -----BEGIN CERTIFICATE-----
          MIIDPTCCAiYCCQCXv+Y88p/lPjANBgkqhkiG9w0BAQsFADBLMQswCQYDVQQGEwJVUzEL
          MAkGA1UECBMCT0gxEDAOBgNVBAoTB0F1dGhvcnMxETAPBgNVBAMTCExpbmtlcmQxMB4G
          A1UdDwQWE... (truncated for brevity)
          -----END CERTIFICATE-----
This trustAnchors section tells Linkerd to use this specific certificate as the ultimate source of truth for verifying other certificates in the mesh. When a pod’s identity proxy (the linkerd-proxy container) starts, it receives a workload certificate signed by a CA. This workload certificate must be verifiable by one of the trustAnchors defined in the control plane’s configuration.
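To make the verification step concrete, here is a minimal local sketch using only openssl, with no cluster involved. It assumes a single self-signed root signing a leaf directly, and the names mesh-root, other-root, and workload are hypothetical. It illustrates the general trust-anchor check, not Linkerd's own implementation.

```shell
set -eu
dir="$(mktemp -d)"

# Minimal OpenSSL config so the generated roots carry CA:TRUE.
cat > "$dir/ca.cnf" <<'EOF'
[req]
distinguished_name = dn
x509_extensions = v3_ca
prompt = no
[dn]
CN = placeholder
[v3_ca]
basicConstraints = critical,CA:TRUE
keyUsage = critical,keyCertSign,cRLSign
EOF

# make_root NAME: write a self-signed ECDSA P-256 root to $dir/NAME.key and $dir/NAME.crt.
make_root() {
  openssl ecparam -name prime256v1 -genkey -noout -out "$dir/$1.key"
  openssl req -x509 -new -config "$dir/ca.cnf" -key "$dir/$1.key" \
    -subj "/CN=$1" -days 365 -out "$dir/$1.crt"
}

# sign_leaf CA NAME: issue a one-day leaf certificate NAME signed by root CA.
sign_leaf() {
  openssl ecparam -name prime256v1 -genkey -noout -out "$dir/$2.key"
  openssl req -new -config "$dir/ca.cnf" -key "$dir/$2.key" \
    -subj "/CN=$2" -out "$dir/$2.csr"
  openssl x509 -req -in "$dir/$2.csr" -CA "$dir/$1.crt" -CAkey "$dir/$1.key" \
    -CAcreateserial -days 1 -out "$dir/$2.crt" 2>/dev/null
}

# trusts ANCHORS CERT: succeed if CERT chains to a certificate in ANCHORS.
trusts() {
  openssl verify -CAfile "$1" "$2" >/dev/null 2>&1
}

make_root mesh-root      # plays the role of the configured trust anchor
make_root other-root     # an unrelated root that is NOT a trust anchor
sign_leaf mesh-root workload

if trusts "$dir/mesh-root.crt" "$dir/workload.crt"; then anchored=yes; else anchored=no; fi
if trusts "$dir/other-root.crt" "$dir/workload.crt"; then foreign=yes; else foreign=no; fi

echo "verified by its own anchor:      $anchored"
echo "verified by an unrelated anchor: $foreign"
```

The leaf verifies only against the root that signed it, which is why every proxy in the mesh has to agree on the trust anchors before certificates from a new root can be accepted.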
Now, what happens when that root CA certificate has an expiration date that’s fast approaching?
The Problem: Expiring Trust
Linkerd’s control plane components, particularly the identity service, are responsible for issuing and managing workload certificates. These certificates are short-lived, but the root CA they are signed by is typically long-lived. When the root CA approaches its expiration, Linkerd needs to transition to a new root CA without disrupting existing mTLS connections or preventing new ones. If the root CA expires, all existing workload certificates signed by it become untrustworthy, and new certificates cannot be issued.
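One common way to make that transition non-disruptive is to trust the old and new roots together for a while. The sketch below is a local openssl illustration of the idea, assuming a simple bundle made by concatenating two PEM files. The names old-root, new-root, leaf-old, leaf-new, and bundle.crt are hypothetical.

```shell
set -eu
dir="$(mktemp -d)"

# Minimal OpenSSL config so the generated roots carry CA:TRUE.
cat > "$dir/ca.cnf" <<'EOF'
[req]
distinguished_name = dn
x509_extensions = v3_ca
prompt = no
[dn]
CN = placeholder
[v3_ca]
basicConstraints = critical,CA:TRUE
keyUsage = critical,keyCertSign,cRLSign
EOF

# make_root NAME: self-signed ECDSA P-256 root in $dir/NAME.key and $dir/NAME.crt.
make_root() {
  openssl ecparam -name prime256v1 -genkey -noout -out "$dir/$1.key"
  openssl req -x509 -new -config "$dir/ca.cnf" -key "$dir/$1.key" \
    -subj "/CN=$1" -days 365 -out "$dir/$1.crt"
}

# sign_leaf CA NAME: one-day leaf certificate NAME signed by root CA.
sign_leaf() {
  openssl ecparam -name prime256v1 -genkey -noout -out "$dir/$2.key"
  openssl req -new -config "$dir/ca.cnf" -key "$dir/$2.key" \
    -subj "/CN=$2" -out "$dir/$2.csr"
  openssl x509 -req -in "$dir/$2.csr" -CA "$dir/$1.crt" -CAkey "$dir/$1.key" \
    -CAcreateserial -days 1 -out "$dir/$2.crt" 2>/dev/null
}

# check ANCHORS CERT: print "ok" if CERT chains to ANCHORS, otherwise "fail".
check() {
  if openssl verify -CAfile "$dir/$1" "$dir/$2" >/dev/null 2>&1; then
    echo ok
  else
    echo fail
  fi
}

make_root old-root
make_root new-root
sign_leaf old-root leaf-old   # certificate issued before the rotation
sign_leaf new-root leaf-new   # certificate issued after the rotation

# Transitional bundle: both roots are trusted at the same time.
cat "$dir/old-root.crt" "$dir/new-root.crt" > "$dir/bundle.crt"

old_on_new_only="$(check new-root.crt leaf-old.crt)"
old_on_bundle="$(check bundle.crt leaf-old.crt)"
new_on_bundle="$(check bundle.crt leaf-new.crt)"

echo "old leaf, new root only: $old_on_new_only"
echo "old leaf, bundle:        $old_on_bundle"
echo "new leaf, bundle:        $new_on_bundle"
```

Swapping the anchor outright rejects every certificate issued under the old root, while the bundle accepts both generations until the old root can be dropped.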
Common Causes and Fixes
- Root CA Certificate Expiration Imminent:
  - Diagnosis: Check the expiration date of your current root CA. You can often find this in your linkerd-config.yaml or by inspecting the certificate directly if you have access to the file. For example, if your CA is stored as ca.crt in a Kubernetes secret, you can extract it and print its validity dates (substitute the secret name and key your installation actually uses):

        kubectl get secret linkerd-identity-tls -n linkerd -o jsonpath='{.data.ca\.crt}' | base64 --decode > ca.crt
        openssl x509 -in ca.crt -noout -dates

    Look for notAfter.
  - Fix: Generate a new root CA and intermediate CA pair. Then, create a new Kubernetes secret containing the new root CA certificate (ca.crt) and its private key (ca.key). Update your linkerd-config.yaml so that identityContext.trustAnchors includes the new root. Keep the old root listed alongside it until every workload holds a certificate that chains to the new one, and remove the old root only after that.

        # linkerd-config.yaml snippet
        identityContext:
          trustAnchors:
          # Content of your NEW ca.crt
          - cert: |
              -----BEGIN CERTIFICATE-----
              ...
              -----END CERTIFICATE-----

    Apply this updated configuration.
  - Why it works: Linkerd’s identity system will start using the new root CA for issuing new workload certificates. Existing workload certificates remain valid until their own expiration, as they were signed by the original root CA, which stays trusted for as long as it remains in the trust anchors.
- Intermediate CA Expiration:
  - Diagnosis: If you use an intermediate CA to sign workload certificates, check its expiration date. The process is similar to checking the root CA, but you’d inspect the intermediate CA certificate.
  - Fix: If the intermediate CA is expiring, you need to generate a new intermediate CA and have the current root CA sign it. Then, update the Kubernetes secret with the new intermediate CA certificate and its key, and ensure the linkerd-config.yaml references this new intermediate CA.
  - Why it works: This maintains the chain of trust. The root CA remains the ultimate authority, and the new intermediate CA is now the direct signer of new workload certificates.
- Improper Secret Update:
  - Diagnosis: After updating the linkerd-config.yaml and applying it, check the linkerd-identity pod logs for errors related to loading the trust anchor or issuing certificates:

        kubectl logs -n linkerd deploy/linkerd-identity -c identity -f

    Look for messages indicating it can’t find or parse the new certificate.
  - Fix: Ensure the linkerd-config.yaml correctly references the Kubernetes secret containing the new CA certificates. Verify the secret exists in the linkerd namespace and contains ca.crt and ca.key (or tls.crt and tls.key if using a combined secret).
  - Why it works: Linkerd’s identity controller actively watches its configuration and the referenced secrets. A misconfigured reference means it never picks up the new trust anchor.
- Linkerd Control Plane Not Restarted/Reconfigured:
  - Diagnosis: Even after updating the linkerd-config.yaml, the control plane components might not have picked up the changes if they are not designed to dynamically reload this specific configuration.
  - Fix: Apply the updated configuration through linkerd upgrade rather than editing live resources. The command only prints the re-rendered manifests, so pipe its output to kubectl apply. For a CLI-managed install, the trust anchors are passed with --identity-trust-anchors-file, where bundle.crt is a placeholder for the PEM file holding your trust anchors:

        linkerd upgrade --identity-trust-anchors-file=./bundle.crt | kubectl apply -f -

  - Why it works: The linkerd upgrade command is the canonical way to apply configuration changes to the control plane, triggering necessary reloads or restarts of the relevant controllers.
- Workload Certificates Not Refreshing:
  - Diagnosis: After a successful root CA rotation, new workload certificates should be issued signed by the new CA. You can inspect a pod’s identity certificate:

        # Get a pod's identity secret (e.g., for a pod named 'webapp-12345-abcde')
        kubectl get secret webapp-12345-abcde-identity-tls -n default -o jsonpath='{.data.crt\.pem}' | base64 --decode > pod.crt
        # Inspect the issuer of this certificate
        openssl x509 -in pod.crt -noout -issuer

    The issuer should eventually reflect certificates signed by the new CA. This lookup assumes your setup exposes the workload certificate as a Secret. A stock Linkerd proxy does not store its certificate in a Kubernetes Secret, in which case linkerd check --proxy is the way to confirm that proxy certificates match the current trust anchors.
  - Fix: Ensure the linkerd-proxy within your workloads is configured to periodically refresh its identity certificate. This is usually handled automatically by the control plane’s certificate rotation mechanism. If not, investigate the identityContext.controller.workloadTls.rotation settings in your Linkerd configuration. Proxies read their trust anchors at startup, so also restart meshed workloads (for example with kubectl rollout restart) after changing the trust anchors.
  - Why it works: The proxy’s identity certificate has a limited lifespan (e.g., 24 hours) and must be renewed. The identity system ensures this renewal process uses the new trust anchor once it’s active.
- Inconsistent Trust Anchors Across Control Plane Pods:
  - Diagnosis: If your Linkerd control plane is deployed in a highly available configuration with multiple replicas of the identity service, ensure all replicas are reading the same updated trust anchor configuration.
  - Fix: Verify that the linkerd-config.yaml applied is consistent across all control plane deployments. The linkerd upgrade command should handle this, but manual interventions might cause drift.
  - Why it works: Each identity controller replica needs to agree on the current trust anchor to correctly validate incoming certificate requests and issue new ones.
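Two of the diagnosis steps above, checking how close a certificate is to expiry and checking that two sources hold the same trust anchor, can be scripted with openssl alone. This is a minimal sketch run against locally generated certificates. The helper names expires_within and fingerprint, the 60-day and 7-day thresholds, and the file names are assumptions for illustration. In practice you would point the helpers at certificates extracted from your cluster.

```shell
set -eu
dir="$(mktemp -d)"

# Minimal OpenSSL config so the generated roots carry CA:TRUE.
cat > "$dir/ca.cnf" <<'EOF'
[req]
distinguished_name = dn
x509_extensions = v3_ca
prompt = no
[dn]
CN = placeholder
[v3_ca]
basicConstraints = critical,CA:TRUE
keyUsage = critical,keyCertSign,cRLSign
EOF

# make_root NAME DAYS: self-signed ECDSA P-256 root valid for DAYS days.
make_root() {
  openssl ecparam -name prime256v1 -genkey -noout -out "$dir/$1.key"
  openssl req -x509 -new -config "$dir/ca.cnf" -key "$dir/$1.key" \
    -subj "/CN=$1" -days "$2" -out "$dir/$1.crt"
}

# expires_within CERT DAYS: succeed if CERT expires within DAYS days.
# openssl's -checkend exits 0 when the certificate is still valid at that point.
expires_within() {
  if openssl x509 -in "$1" -noout -checkend "$(( $2 * 86400 ))" >/dev/null; then
    return 1
  else
    return 0
  fi
}

# fingerprint CERT: print the SHA-256 fingerprint line for CERT.
fingerprint() {
  openssl x509 -in "$1" -noout -fingerprint -sha256
}

make_root demo-root 30                          # root with 30 days left
cp "$dir/demo-root.crt" "$dir/replica-a.crt"    # a replica holding the same anchor
make_root drifted-root 30                       # a replica that drifted to another root

if expires_within "$dir/demo-root.crt" 60; then warn_60=yes; else warn_60=no; fi
if expires_within "$dir/demo-root.crt" 7; then warn_7=yes; else warn_7=no; fi

if [ "$(fingerprint "$dir/demo-root.crt")" = "$(fingerprint "$dir/replica-a.crt")" ]; then
  same_anchor=yes
else
  same_anchor=no
fi
if [ "$(fingerprint "$dir/demo-root.crt")" = "$(fingerprint "$dir/drifted-root.crt")" ]; then
  drift_matches=yes
else
  drift_matches=no
fi

echo "expires within 60 days: $warn_60"
echo "expires within 7 days:  $warn_7"
echo "replica-a matches:      $same_anchor"
echo "drifted root matches:   $drift_matches"
```

Comparing fingerprints avoids false positives from subject names, since two different roots can share the same CN.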
After successfully rotating your root CA, the next immediate challenge you’ll face is ensuring all existing workload certificates are refreshed to be signed by the new CA, which will happen gradually as their current certificates expire.