Loki’s query engine splits long time range searches into smaller chunks to avoid overwhelming the system, but this can sometimes lead to slower-than-expected results.

Let’s watch this in action. Imagine you have logs spanning a week and you want to find all error messages. A naive approach would fetch all logs for that week and then filter. Loki, however, breaks this down. It might issue one sub-query for day 1, another for day 2, and so on, processing each chunk independently and, where capacity allows, in parallel. This prevents a single, massive read but introduces overhead from scheduling multiple requests and merging their results.

The core problem Loki is trying to solve is bounded scalability. If a long-range query ran as a single unit of work, its latency would grow roughly linearly with the volume of data in that range. By splitting queries, Loki aims for a more predictable performance profile. The system works by identifying the total time range of a query and then dividing it into smaller, manageable intervals. For each interval, it queries the relevant index and object store. Finally, it aggregates the results from all intervals.

The primary lever you control here is the split interval. This is the duration of each smaller chunk Loki uses. You don’t directly set this in the query itself, but rather in the Loki configuration. The query-frontend component, specifically, is responsible for this splitting logic.

Here’s a snippet from a typical Loki configuration (loki-local.yaml or similar) showing where this is set:

# In current Loki releases this is a per-tenant limit under limits_config;
# the query frontend reads it when it splits a query.
limits_config:
  # ... other settings ...
  split_queries_by_interval: 1h # This is the key setting
  # ... other settings ...

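In this example, split_queries_by_interval: 1h tells the query frontend to break down any query spanning more than an hour into 1-hour chunks. If your query covers 24 hours, it will be split into roughly 24 smaller queries (one more if the range does not start on an hour boundary, since sub-queries are aligned to the interval).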
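To make the arithmetic concrete, here is a minimal Python sketch of interval splitting. It is an illustration, not Loki’s actual implementation: the function name split_by_interval is hypothetical, and it assumes sub-ranges are aligned to multiples of the interval measured from the Unix epoch.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def split_by_interval(start, end, interval):
    """Split [start, end) into sub-ranges that end on multiples of `interval`.

    Hypothetical helper: alignment to the Unix epoch is an assumption.
    """
    pieces = []
    cursor = start
    while cursor < end:
        # First interval boundary strictly after the cursor.
        next_boundary = EPOCH + ((cursor - EPOCH) // interval + 1) * interval
        piece_end = min(next_boundary, end)
        pieces.append((cursor, piece_end))
        cursor = piece_end
    return pieces

midnight = datetime(2024, 1, 1, tzinfo=timezone.utc)
hour = timedelta(hours=1)

aligned = split_by_interval(midnight, midnight + 24 * hour, hour)
shifted = split_by_interval(midnight + hour / 2, midnight + 24.5 * hour, hour)
print(len(aligned), len(shifted))  # prints "24 25"
```

The second call covers the same 24 hours but starts at 00:30, so it produces a short piece at each end and 25 sub-ranges in total.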

The surprising thing about split_queries_by_interval is that setting it too low can actually hurt performance. If you set the interval to, say, 5 minutes, a 7-day query becomes about 2,016 tiny sub-queries. The overhead of scheduling, executing, and merging each of these small queries can outweigh the benefit of processing them in parallel, especially when each one has little data to scan. Finding the sweet spot is crucial and depends heavily on your ingestion rate and query patterns.

Consider a query like {job="my_app"} |= "exception". If your Loki is configured with split_queries_by_interval: 6h and you search across 24 hours that line up with 6-hour boundaries, Loki will execute 4 sub-queries. Each sub-query will fetch index entries and then the corresponding log chunks for that 6-hour window. The query frontend then merges the results. This parallelism is what makes long-range queries feasible.
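The merge step can be pictured as a k-way merge over per-window results that are each already sorted by timestamp. The sketch below uses Python’s heapq.merge. The function name merge_subquery_results and the (timestamp, line) tuple layout are assumptions made for illustration, not Loki’s internal data structures, and it only handles ascending order.

```python
import heapq

def merge_subquery_results(results, limit=None):
    """Merge per-window result lists (each sorted by timestamp) into one list.

    Hypothetical helper: `results` is a list of lists of (timestamp, line).
    """
    merged = heapq.merge(*results, key=lambda entry: entry[0])
    out = []
    for entry in merged:
        # Stop early once the caller's entry limit is reached.
        if limit is not None and len(out) >= limit:
            break
        out.append(entry)
    return out

window_1 = [(1, "exception in handler"), (4, "exception: timeout")]
window_2 = [(7, "exception in worker")]

# Sub-queries can finish in any order; the merge restores time order.
print(merge_subquery_results([window_2, window_1], limit=2))
# prints [(1, 'exception in handler'), (4, 'exception: timeout')]
```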

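The query-frontend also has a max_outstanding_per_tenant setting (in the frontend block). It caps how many requests a tenant can have queued at once; requests beyond that limit are rejected with HTTP 429. It does not control concurrency directly. How many sub-queries actually run at the same time is governed by max_query_parallelism in limits_config and by max_concurrent on each querier. If you have many cores and fast storage, you might want to increase those to allow more parallelism, and a very small split interval may also call for a larger queue limit.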

When you’re tuning split_queries_by_interval, you’re essentially trading off the latency of individual chunk processing against the overhead of managing many small requests. A longer interval means fewer requests but potentially longer waits for each chunk to complete if that chunk is particularly large or slow to access. A shorter interval means more requests, potentially faster individual chunk completion if they are small, but higher overall management overhead.
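A toy cost model makes the trade-off easier to see. Everything in the sketch below is an assumption chosen for illustration: the function name estimated_latency, the idea that sub-queries run in equal-sized waves, and the numbers (16 parallel slots, 0.2 s of fixed overhead per sub-query, 0.5 s of scan time per hour of logs). It is not a measurement of any real Loki cluster.

```python
import math

def estimated_latency(range_s, interval_s, parallelism, overhead_s, scan_s_per_hour):
    """Toy model: sub-queries run in waves of `parallelism`,
    and each one pays a fixed overhead plus scan time for its window."""
    subqueries = math.ceil(range_s / interval_s)
    waves = math.ceil(subqueries / parallelism)
    per_subquery = overhead_s + scan_s_per_hour * (interval_s / 3600)
    return waves * per_subquery

DAY = 24 * 3600
for label, interval_s in [("5m", 300), ("1h", 3600), ("6h", 21600), ("24h", 86400)]:
    latency = estimated_latency(DAY, interval_s, parallelism=16,
                                overhead_s=0.2, scan_s_per_hour=0.5)
    print(f"{label:>3}: {latency:.2f}s")
```

With these made-up inputs the 1h interval comes out fastest: the 5m setting pays the fixed overhead 288 times across 18 waves, while the 24h setting gets no parallelism at all. Real numbers will differ, so treat this as a way to reason about the shape of the curve rather than a tuning formula.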

If you’ve optimized split_queries_by_interval and are still seeing slow queries, investigate the query-scheduler component. When deployed, it holds the queue of split queries and hands them to the queriers, which in turn read recent data from the ingesters and older data from the store. It can become a bottleneck if not configured appropriately.

The next concept you’ll likely encounter is query aggregation and how Loki handles duplicate log lines across split intervals.
