Linkerd injects a sidecar proxy into your application pods, and Traefik, as an ingress controller, has to send traffic through those proxies to reach your services.

The Problem: Traefik Can’t Find Your Services

When Traefik tries to route traffic to a service that’s part of your Linkerd mesh, it often fails. You’ll see errors in Traefik’s logs like service not found or endpoint unavailable, even though kubectl get endpoints <your-service> shows healthy IPs. This happens because Traefik, by default, resolves Service endpoints through the Kubernetes API and connects to pod IPs itself, so it has no awareness of the Linkerd proxies that now sit in front of the application containers.

Here’s how to fix it:

1. Traefik Not Discovering Kubernetes Services:

  • Diagnosis: Check Traefik’s logs for service not found or similar errors. Verify that Traefik is configured to use the Kubernetes CRD or Kubernetes Ingress provider.
    kubectl logs <traefik-pod-name> -n <traefik-namespace>
    
  • Cause: Traefik’s Kubernetes provider isn’t enabled or configured correctly.
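  • Fix: Ensure your Traefik deployment has the --providers.kubernetescrd=true and --providers.kubernetesingress=true flags (or the equivalent in static configuration).
    # Helm values (the Traefik chart uses an "enabled" key)
    providers:
      kubernetesCRD:
        enabled: true
      kubernetesIngress:
        enabled: true
    # In a traefik.yaml static configuration file there is no "enabled" key;
    # declaring the provider is enough:
    # providers:
    #   kubernetesCRD: {}
    #   kubernetesIngress: {}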
    
  • Why it works: This tells Traefik to actively watch for Kubernetes Service and Ingress (or IngressRoute if using CRDs) resources.
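If Traefik is deployed from a plain manifest rather than Helm, the same providers can be switched on through container arguments. The excerpt below is a minimal sketch, not a complete manifest. The Deployment name, namespace, labels, service account, and image tag are assumptions chosen for illustration.

```yaml
# Excerpt of a Traefik Deployment. Names, namespace, and image tag are assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: traefik
  namespace: traefik
spec:
  replicas: 1
  selector:
    matchLabels:
      app: traefik
  template:
    metadata:
      labels:
        app: traefik
    spec:
      # The service account needs RBAC permissions to watch Services,
      # Endpoints, Ingresses, and the Traefik CRDs.
      serviceAccountName: traefik
      containers:
        - name: traefik
          image: traefik:v2.10
          args:
            - --providers.kubernetescrd=true
            - --providers.kubernetesingress=true
            - --entrypoints.web.address=:80
```

If the flags are present and Traefik still reports missing services, the service account’s RBAC rules are a reasonable next thing to check.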

2. Linkerd Proxies Not Being Targeted:

  • Diagnosis: Traefik finds the service, but traffic still fails. kubectl describe service <your-service> shows the correct selector, and the endpoints list contains your pod IPs, yet requests don’t reach the application.
  • Cause: The Service’s targetPort doesn’t match the port the application container actually listens on. Linkerd’s proxy does not take over the application port. Instead, iptables rules set up at pod start transparently redirect inbound traffic to the proxy’s inbound listener (4143 by default), and the proxy forwards it to the application on its original port.
  • Fix: Keep the Service pointed at the application’s own port. Don’t set targetPort to one of the proxy’s ports (4143 inbound, 4140 outbound); the redirection is transparent to Traefik. It is also worth confirming that the pod contains the linkerd-proxy container and that the application port isn’t listed in the config.linkerd.io/skip-inbound-ports annotation.
    apiVersion: v1
    kind: Service
    metadata:
      name: my-app-service
      namespace: my-namespace
    spec:
      selector:
        app: my-app
      ports:
      - protocol: TCP
        port: 80  # The Service port that your IngressRoute or Ingress references
        targetPort: 80 # Must match the port the application container listens on; Linkerd intercepts it transparently
    
  • Why it works: With targetPort set to the application’s port, Traefik connects to the pod IP on that port, the iptables rules hand the connection to Linkerd’s proxy, and the proxy forwards it to the application container.
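To make the port relationship concrete, here is a sketch of a workload that the Service above could select. The Deployment name and image are placeholders. The linkerd.io/inject annotation is Linkerd’s standard way of requesting sidecar injection.

```yaml
# Hypothetical workload behind my-app-service. The image name is a placeholder.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  namespace: my-namespace
spec:
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app                # matched by the Service selector
      annotations:
        linkerd.io/inject: enabled # asks Linkerd's injector to add the proxy sidecar
    spec:
      containers:
        - name: my-app
          image: example/my-app:latest
          ports:
            - containerPort: 80    # the Service's targetPort should equal this
```

After a rollout, kubectl get pod should show one more container per pod than the manifest declares; the extra one is linkerd-proxy.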

3. Linkerd Service Profile Mismatch:

  • Diagnosis: Intermittent failures, timeouts, or incorrect routing for specific requests, especially if you’re using Linkerd features like retries or traffic splitting.
  • Cause: The ServiceProfile for the service is misnamed, lives in the wrong namespace, or defines routes whose conditions don’t match the requests Traefik sends.
  • Fix: Name the ServiceProfile after the service’s fully qualified DNS name and give every route a condition (method and pathRegex) that matches real requests. Routes have no port field; Linkerd associates the profile with the service by name.
    apiVersion: linkerd.io/v1alpha2
    kind: ServiceProfile
    metadata:
      # Must be the service's fully qualified DNS name
      name: my-app-service.my-namespace.svc.cluster.local
      namespace: my-namespace
    spec:
      routes:
      - name: GET /api/users
        condition:
          method: GET
          pathRegex: /api/users
        responseClasses:
        - condition:
            status:
              min: 200
              max: 299
          isFailure: false
        # ... other route configurations
    
  • Why it works: The ServiceProfile tells Linkerd how to interpret traffic for that service. When the profile name and route conditions line up with the requests actually being sent, Linkerd can apply its per-route policies and telemetry.
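Since retries and timeouts are the features most often involved here, the following sketch shows where they live in a ServiceProfile. The route, timeout, and budget values are illustrative assumptions and should be tuned to your service.

```yaml
# Illustrative values only. Tune the timeout and retry budget for your service.
apiVersion: linkerd.io/v1alpha2
kind: ServiceProfile
metadata:
  name: my-app-service.my-namespace.svc.cluster.local
  namespace: my-namespace
spec:
  routes:
  - name: GET /api/users
    condition:
      method: GET
      pathRegex: /api/users
    isRetryable: true      # only mark idempotent routes as retryable
    timeout: 300ms         # per-request timeout for this route
  retryBudget:
    retryRatio: 0.2        # retries may add at most 20% extra load
    minRetriesPerSecond: 10
    ttl: 10s
```

Retries and timeouts are applied by the proxy on the calling side, so for requests coming from Traefik they only take effect if Traefik’s own pod is meshed.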

4. Traefik IP Filter / Rate Limiting Blocking Mesh Traffic:

  • Diagnosis: Legitimate requests are rejected at Traefik before they reach your services. You might see 403 errors on routes that have an IP filter or rate limit attached.
  • Cause: Traefik’s security middleware (such as IP filtering or rate limiting) only allows specific source addresses, and the source address Traefik observes is not in that list. Once traffic passes through proxies or load balancers, that address may be a pod or node address rather than the original client.
  • Fix: Adjust Traefik’s middleware configuration to allow the source ranges Traefik actually sees, for example the Kubernetes pod CIDR range. This often means updating middlewares.yaml or the Middleware referenced by your IngressRoute definitions.
    # Example Traefik IngressRoute with IP filter
    apiVersion: traefik.containo.us/v1alpha1
    kind: IngressRoute
    metadata:
      name: my-app-ingress
      namespace: my-namespace
    spec:
      entryPoints:
        - websecure
      routes:
        - match: Host(`my-app.example.com`)
          kind: Rule
          services:
            - name: my-app-service
              port: 80
          middlewares:
            - name: ip-whitelist # Assuming you have an IPAllowList middleware with this name
              namespace: traefik # Cross-namespace references require allowCrossNamespace on the CRD provider
    # The allow list itself is dynamic configuration (a Middleware CRD), not static config.
    # For example, to allow traffic from the cluster's pod network:
    # spec:
    #   ipAllowList:
    #     sourceRange:
    #       - 10.244.0.0/16 # Replace with your cluster's pod CIDR
    
  • Why it works: The allow list now includes the source addresses Traefik actually observes, so legitimate requests pass the filter and are forwarded to your meshed services.
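The allow list referenced above lives in its own Middleware resource. A minimal sketch follows, assuming the middleware is named ip-whitelist in the traefik namespace (matching the IngressRoute reference) and that 10.244.0.0/16 is your pod CIDR. Both are assumptions to replace with your own values.

```yaml
# Middleware name, namespace, and CIDR are assumptions. Adjust to your cluster.
# ipAllowList needs Traefik v2.10 or later; older v2 releases call it ipWhiteList.
apiVersion: traefik.containo.us/v1alpha1 # traefik.io/v1alpha1 on Traefik v3
kind: Middleware
metadata:
  name: ip-whitelist
  namespace: traefik
spec:
  ipAllowList:
    sourceRange:
      - 10.244.0.0/16
```

Allowing the whole pod CIDR is broad. If only a few sources are legitimate, listing narrower ranges keeps the filter meaningful.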

5. DNS Resolution Issues:

  • Diagnosis: Traefik can’t resolve the Kubernetes service name, even though other pods in the cluster can. Logs might show no such host or DNS lookup failures.
  • Cause: The Traefik pod isn’t using the cluster’s DNS service (such as CoreDNS), typically because its dnsPolicy or dnsConfig was overridden or because it runs with hostNetwork.
  • Fix: Ensure the Traefik pod uses the cluster’s DNS. This is the default (dnsPolicy: ClusterFirst) when running inside Kubernetes. The resolver comes from the pod spec rather than from Traefik’s static configuration. If you’re overriding DNS settings, explicitly set the nameserver to your cluster’s DNS Service IP (e.g., 10.43.0.10; check kubectl get svc -n kube-system for yours).
    # In the Traefik Deployment's pod template (Traefik's static configuration has no DNS server setting)
    spec:
      template:
        spec:
          dnsPolicy: ClusterFirst # Use ClusterFirstWithHostNet if hostNetwork is true
          # Only if you must pin the resolver explicitly:
          # dnsPolicy: "None"
          # dnsConfig:
          #   nameservers:
          #     - "10.43.0.10" # Replace with your cluster's DNS Service IP
    
  • Why it works: This guarantees Traefik uses the same DNS resolution mechanism as the rest of your Kubernetes cluster, correctly finding internal service names.

6. Linkerd l5d-dst-canonical Header Issues:

  • Diagnosis: Traffic reaches the Linkerd proxy, but the proxy doesn’t know how to route it to the correct application container, leading to 503 errors from the proxy itself.
  • Cause: Traefik might be adding or modifying headers that interfere with Linkerd’s internal routing mechanism, specifically the l5d-dst-canonical header which Linkerd uses to identify the target service.
  • Fix: Configure Traefik not to add or modify the l5d-dst-canonical header. In practice this means checking that no headers Middleware (customRequestHeaders) or other configuration attached to the route sets it explicitly.
    # Example IngressRoute - ensure no custom headers are set that conflict
    apiVersion: traefik.containo.us/v1alpha1
    kind: IngressRoute
    metadata:
      name: my-app-ingress
      namespace: my-namespace
    spec:
      entryPoints:
        - web
      routes:
        - match: Host(`my-app.example.com`)
          kind: Rule
          services:
            - name: my-app-service
              port: 80
              # IMPORTANT: Do NOT attach a headers Middleware to this route that sets l5d-dst-canonical
    
  • Why it works: By leaving the l5d-dst-canonical header untouched, Linkerd’s proxy can correctly identify and route the incoming request to the appropriate application container without misinterpretation.
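There is a related header you may want to set on purpose. Linkerd’s ingress documentation describes l5d-dst-override, which tells the Linkerd proxy next to Traefik which Service a request is meant for, so that service-level features such as ServiceProfiles and traffic splits can apply. The sketch below assumes Traefik’s own pod is meshed; the middleware name is hypothetical, and the header value uses the service name and port from the examples above.

```yaml
# Hypothetical middleware name. The header value is <service>.<namespace>.svc.cluster.local:<port>.
apiVersion: traefik.containo.us/v1alpha1 # traefik.io/v1alpha1 on Traefik v3
kind: Middleware
metadata:
  name: l5d-dst-override
  namespace: my-namespace
spec:
  headers:
    customRequestHeaders:
      l5d-dst-override: "my-app-service.my-namespace.svc.cluster.local:80"
```

Attach it under middlewares on the IngressRoute route. Newer Linkerd releases also offer an ingress mode for the Traefik pod (the linkerd.io/inject: ingress annotation) that covers the same need; which of the two fits depends on your Linkerd version.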

After applying these fixes, you should see traffic flowing correctly from Traefik through the Linkerd mesh to your applications. A likely next step is defining a ServiceProfile for any service that needs per-route metrics, retries, or timeouts; Linkerd still routes traffic without one.
