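The Loki ingester is rejecting write requests from Promtail because the incoming log lines would create more unique label sets than the configured limit allows. Every distinct combination of label names and values is tracked as a separate stream, so a single label with many possible values can exhaust the limit quickly.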

Common Causes and Fixes:

  1. High Cardinality Labels Introduced by Application:

    • Diagnosis: Run promtail -config.file=/etc/promtail/config.yaml -print-config-stderr to print the configuration Promtail actually loaded, including the static labels set under scrape_configs and any external_labels under clients. Then use Loki’s query API, or a command such as logcli series '{job="myapp"}' --analyze-labels, to find labels with an unusually high number of distinct values. Look for labels like user_id, request_id, session_id, or dynamically generated IDs.
    • Fix: In your Promtail configuration (/etc/promtail/config.yaml), add a labeldrop stage (to remove named labels) or a labelallow stage (to keep only named labels) to the relevant pipeline.
      scrape_configs:
      - job_name: myapp
        static_configs:
        - targets:
          - localhost
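          labels:
            job: myapp
            __path__: /var/log/myapp.log # Example path; point this at your log file
        pipeline_stages:
        - match:
            selector: '{job="myapp"}' # Select on a label that reaches the pipeline, such as job or filename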
            stages:
            - labeldrop:
              - user_id
              - request_id
      
    • Why it works: This tells Promtail to discard specific high-cardinality labels before sending them to Loki, reducing the number of unique label sets Loki needs to index.
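To judge which labels are worth dropping, it can help to count distinct values per label and see how the stream count changes without them. The following is a minimal sketch, assuming the label sets have already been fetched (for example from Loki's /loki/api/v1/series endpoint) into a list of dictionaries. The function names and the sample data are hypothetical.

```python
from collections import defaultdict


def label_cardinality(series):
    """Map each label name to the number of distinct values it takes."""
    values = defaultdict(set)
    for labels in series:
        for name, value in labels.items():
            values[name].add(value)
    return {name: len(seen) for name, seen in values.items()}


def stream_count(series, dropped=()):
    """Count unique label sets (streams) after removing the `dropped` labels."""
    streams = {
        frozenset((k, v) for k, v in labels.items() if k not in dropped)
        for labels in series
    }
    return len(streams)


# Hypothetical label sets, shaped like the entries in a series listing.
sample_series = [
    {"job": "myapp", "level": "info", "user_id": "1001"},
    {"job": "myapp", "level": "info", "user_id": "1002"},
    {"job": "myapp", "level": "error", "user_id": "1002"},
    {"job": "myapp", "level": "info", "user_id": "1003"},
]

print(label_cardinality(sample_series))
print(stream_count(sample_series))
print(stream_count(sample_series, dropped=("user_id",)))
```

In this sample, user_id has the most distinct values, and removing it halves the number of streams (from 4 to 2). On real data the reduction is usually far larger, because per-user or per-request values grow without bound.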
  2. Incorrectly Configured Static Labels in Promtail:

    • Diagnosis: Examine your promtail configuration file (/etc/promtail/config.yaml). Look for static labels applied to static_configs that are inherently high cardinality (e.g., including environment-specific IPs, pod names that change frequently, or instance IDs).
    • Fix: Remove or modify these static labels to be less granular. If you need to differentiate instances, consider using a common label like instance_group instead of individual instance IDs.
      scrape_configs:
      - job_name: myapp_service
        static_configs:
        - targets:
          - localhost
          labels:
            job: myapp_service
            # Remove or change this to something less cardinal:
            # instance: ${HOSTNAME} # Expanded only when Promtail runs with -config.expand-env=true
            environment: production
      
    • Why it works: Reducing the number of unique static labels applied across many Promtail agents directly decreases the total number of label sets Loki must manage.
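The effect of a granular static label can be estimated with simple arithmetic: the number of streams can never exceed the product of the distinct values of each label. The sketch below illustrates that upper bound. The function name and the value counts (200 instances, 4 instance groups, 3 environments) are made-up assumptions.

```python
from math import prod


def stream_upper_bound(values_per_label):
    """Upper bound on streams: the product of distinct values per label."""
    return prod(values_per_label.values())


# Hypothetical fleet: one job, three environments, 200 individual instances.
per_instance = {"job": 1, "environment": 3, "instance": 200}

# Same fleet, but labelled by a coarse instance_group instead of instance.
per_group = {"job": 1, "environment": 3, "instance_group": 4}

print(stream_upper_bound(per_instance))
print(stream_upper_bound(per_group))
```

The real stream count is often lower than this bound, because not every combination occurs in practice. The comparison still shows how one granular label multiplies everything else.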
  3. Timestamp Issues Leading to Reingestion:

    • Diagnosis: Loki identifies a stream by its label set alone, so resending the same log lines with slightly different timestamps (e.g., due to clock skew or retry mechanisms) does not by itself create new streams. It does produce duplicate or out-of-order entries, and it turns into a cardinality problem as soon as a timestamp or another per-event value ends up in a label. Check the loki_ingester_memory_streams metric to see how many streams the ingesters currently hold, and look at loki_ingester_client_request_duration_seconds for spikes corresponding to write operations.
    • Fix: Ensure your system clocks are synchronized using NTP. In Promtail, you can add a timestamp stage with source and format to explicitly parse and use a timestamp from the log line itself, overriding the default (the time Promtail read the line). The source must name a value that an earlier stage has already extracted.
      scrape_configs:
      - job_name: myapp
        static_configs:
        - targets:
          - localhost
          labels:
            job: myapp
        pipeline_stages:
        - json:
            expressions:
              time: time # Extract the 'time' field; assumes JSON-formatted log lines
        - timestamp:
            source: time # The value extracted by the json stage above
            format: RFC3339Nano # Or the correct format of your log timestamp
      
    • Why it works: Consistently using the log line’s own timestamp keeps an event’s time stable across retries and restarts, which avoids duplicate and out-of-order entries. It does not lower the stream count on its own, so treat it as a complement to the label fixes in the other items.
  4. Dynamic Label Extraction Errors:

    • Diagnosis: If you are using regex, json, or labels stages in Promtail to extract labels from log content, an error in the regex or parsing logic can produce unexpected, high-cardinality labels. For example, a regex that matches too broadly might capture timestamps or UUIDs as labels.
    • Fix: Carefully review and test your regex patterns in the pipeline_stages. Use tools like regex101.com to validate your expressions against sample log lines. Every name listed in a labels stage becomes a label, so list only fields with a small, bounded set of values. If you also use relabel_configs, confirm that each rule’s action (for example replace, keep, or drop) is the one you intend.
      pipeline_stages:
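      - regex:
          # Assumes each line starts with a timestamp, followed by the level and the message
          expression: '^(?P<time>\S+)\s+(?P<level>\w+)\s+(?P<message>.*)$'
      - labels:
          level: # Only 'level' becomes a label; 'time' and 'message' stay out of the index
      - timestamp:
          source: time # The 'time' capture from the regex stage
          format: RFC3339 # Or the correct format of your log timestamp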
      
    • Why it works: Correctly defined label extraction ensures that only meaningful, low-cardinality data is turned into labels, preventing noise from inflating the index.
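Before deploying an extraction regex, it can be checked offline against sample lines to see how many distinct values each capture group produces. The sketch below is a rough check under stated assumptions: the sample lines and the function name are hypothetical, and the pattern assumes lines that begin with a timestamp. Python's re module shares the (?P<name>...) syntax with the RE2 engine Promtail uses, but the two differ in other features, so treat the result as an approximation.

```python
import re
from collections import defaultdict

# Pattern with named groups; assumes "<time> <level> <message>" log lines.
PATTERN = re.compile(r'^(?P<time>\S+)\s+(?P<level>\w+)\s+(?P<message>.*)$')


def capture_stats(pattern, lines):
    """Return (matched line count, {group name: distinct value count})."""
    seen = defaultdict(set)
    matched = 0
    for line in lines:
        m = pattern.match(line)
        if m is None:
            continue  # Lines that do not match contribute nothing
        matched += 1
        for name, value in m.groupdict().items():
            seen[name].add(value)
    return matched, {name: len(vals) for name, vals in seen.items()}


# Hypothetical sample log lines.
sample_lines = [
    "2024-05-01T10:00:00Z INFO user 1 logged in",
    "2024-05-01T10:00:01Z INFO user 2 logged in",
    "2024-05-01T10:00:02Z ERROR db timeout",
]

matched, counts = capture_stats(PATTERN, sample_lines)

# Groups that take a new value on every line are poor label candidates.
unbounded = sorted(name for name, n in counts.items() if n == matched)

print(counts)
print(unbounded)
```

Here time and message change on every line, so neither should appear in a labels stage, while level stays within a small set of values. A sample of three lines is only for illustration; a realistic check needs enough lines to cover normal variation.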
  5. Loki Stream Limit Configured Too Low:

    • Diagnosis: While less common for just hitting the "too many streams" error without other performance issues, the per-tenant stream limit in Loki’s limits_config section (max_global_streams_per_user across the cluster, or max_streams_per_user per ingester) might be set too low for your current workload. Check Loki’s logs for messages related to stream limits.
    • Fix: Increase the limit in your loki-distributed.yaml or loki-single-binary.yaml configuration file. The value below is only an example; choose one that fits your cluster size and expected cardinality, since every active stream consumes ingester memory.
      limits_config:
        max_global_streams_per_user: 1000000 # Example value; adjust as needed
      
    • Why it works: This directly raises the system’s capacity to track unique label sets, allowing more distinct streams before hitting the limit.
  6. Promtail Client Configuration:

    • Diagnosis: Examine the clients section in your promtail configuration. Ensure the tenant_id is correctly set if you are using multi-tenancy. Incorrect or missing tenant_id can sometimes lead to unexpected behavior and higher-than-intended stream counts per tenant.
    • Fix: Verify and correct the tenant_id in the clients section of /etc/promtail/config.yaml.
      clients:
      - url: http://loki:3100/loki/api/v1/push
        tenant_id: my-tenant-id # Ensure this is correct
      
    • Why it works: Correct tenant identification ensures that streams are attributed to the right logical tenant in Loki, preventing cross-tenant stream pollution and maintaining accurate cardinality counts per tenant.

If the underlying cardinality problem isn’t fully resolved at the source, the next error you’ll likely encounter is the Loki distributor rejecting requests because of ingestion rate limits or other resource constraints.
