Loki’s query engine doesn’t scan every log line you have stored; it uses a label index to jump directly to the chunks that belong to the log streams you’re interested in.

Let’s see this in action. Imagine you have logs from a Kubernetes cluster, and you want to find all logs from pods in the production namespace that have the app label set to webserver.

Here’s how you might query that in Loki:

{namespace="production", app="webserver"}
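If you want to send that selector to Loki programmatically, one option is its HTTP query_range endpoint. The snippet below is only a sketch that builds the request URL without sending it; the base URL, the time range, and the helper name build_query_range_url are assumptions made for illustration.

```python
from urllib.parse import urlencode

# Assumed address of a local Loki instance; replace with your own.
LOKI_BASE_URL = "http://localhost:3100"


def build_query_range_url(logql: str, start_ns: int, end_ns: int, limit: int = 100) -> str:
    """Build a URL for Loki's /loki/api/v1/query_range endpoint (hypothetical helper)."""
    params = {
        "query": logql,     # the LogQL expression, URL-encoded below
        "start": start_ns,  # range start, Unix epoch in nanoseconds
        "end": end_ns,      # range end, Unix epoch in nanoseconds
        "limit": limit,     # maximum number of entries to return
    }
    return f"{LOKI_BASE_URL}/loki/api/v1/query_range?{urlencode(params)}"


# Arbitrary example timestamps, six minutes apart.
url = build_query_range_url(
    '{namespace="production", app="webserver"}',
    1_700_000_000_000_000_000,
    1_700_000_360_000_000_000,
)
print(url)
```

Fetching the URL with any HTTP client would then run the query against that instance.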

When Loki receives this query, it doesn’t read through every stored log line. Instead, it consults its index, which is essentially a map from label sets to the physical locations (chunks) of the log data carrying those labels. Loki finds the index entries that match both namespace="production" and app="webserver", and then retrieves only the chunks those entries point to. This is what makes Loki fast for label-based queries: the amount of data it has to open depends on how many streams match the selector, not on the total volume of logs stored.

The problem Loki solves is efficiently querying massive volumes of log data based on metadata (labels) rather than full-text search. Traditional systems often index every word in every log line, leading to enormous index sizes and slow query performance as the dataset grows. Loki’s approach, inspired by Prometheus’s label-based model, keeps the index relatively small and queries lightning-fast by focusing on the labels attached to log streams.

Internally, Loki stores logs in "chunks." A chunk is an immutable block of compressed log lines that all belong to a single stream, that is, to one unique set of labels. Each chunk has associated metadata, which includes those labels. Chunks are built by the ingester as logs arrive, and the matching index entries are written as the chunks are flushed to storage. The index maps label sets to the locations of the chunks that contain logs with those labels, and it is what Loki consults at query time to find the relevant chunks.
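To make the chunk and index relationship concrete, here is a toy model in Python. It is a sketch under simplifying assumptions, not Loki's storage format: gzip stands in for whatever compression is configured, chunk IDs are made-up strings, and the Chunk class and write_chunk helper are hypothetical names.

```python
import gzip
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """Toy chunk: one stream's label set plus its compressed log lines."""
    labels: frozenset  # e.g. frozenset({("app", "webserver"), ...})
    payload: bytes     # gzip-compressed, newline-joined log lines

    def lines(self) -> list:
        return gzip.decompress(self.payload).decode().split("\n")


chunk_store = {}  # chunk ID -> Chunk (stands in for object storage)
index = {}        # label set -> list of chunk IDs (stands in for the index)


def write_chunk(chunk_id: str, labels: dict, lines: list) -> None:
    """Compress the lines into a chunk and record where it lives in the index."""
    chunk = Chunk(frozenset(labels.items()), gzip.compress("\n".join(lines).encode()))
    chunk_store[chunk_id] = chunk
    index.setdefault(chunk.labels, []).append(chunk_id)


write_chunk("c1", {"namespace": "production", "app": "webserver"}, ["GET / 200", "GET /login 500"])
write_chunk("c2", {"namespace": "staging", "app": "webserver"}, ["GET / 200"])

# Query time: go from a label set to chunk IDs, then open only those chunks.
key = frozenset({"namespace": "production", "app": "webserver"}.items())
for chunk_id in index[key]:
    print(chunk_id, chunk_store[chunk_id].lines())
```

Note that the staging chunk is never decompressed: the index lookup rules it out before any log data is read.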

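The key lever you control is the set of labels you attach to your log streams. The more consistent and deliberate your labeling strategy, the more efficiently Loki can narrow a query down. For example, labels like namespace, pod, container, app, environment, and level allow for very granular filtering. Keep in mind that every unique combination of label values defines a separate stream, so labels with many distinct values make the index larger and the chunks smaller.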

The actual mechanism that translates your query into retrieved data involves the "query frontend" and the "querier." The query frontend parses your query, breaks it down into smaller sub-queries (for example, by time range), and hands them to queriers. Each querier uses the index to locate the relevant chunks, fetches them, and returns its results to the frontend, which assembles the final result.
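One way a frontend can break a query down is by time: a long range is cut into shorter intervals that can be executed in parallel. The sketch below shows only that splitting step; the function name split_by_interval and the 30-minute interval are assumptions for illustration, and a real frontend may align or size the intervals differently.

```python
from datetime import datetime, timedelta, timezone


def split_by_interval(start: datetime, end: datetime, interval: timedelta) -> list:
    """Split [start, end) into consecutive sub-ranges of at most `interval`."""
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    ranges = []
    cursor = start
    while cursor < end:
        upper = min(cursor + interval, end)  # the last range may be shorter
        ranges.append((cursor, upper))
        cursor = upper
    return ranges


start = datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc)
end = datetime(2024, 1, 1, 1, 15, tzinfo=timezone.utc)

# Each sub-range could be handed to a different querier.
sub_ranges = split_by_interval(start, end, timedelta(minutes=30))
for lo, hi in sub_ranges:
    print(lo.isoformat(), "->", hi.isoformat())
```

The sub-ranges cover the original range exactly once, so the frontend can merge the partial results without deduplicating across boundaries.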

When you’re querying, line filter operators such as |= are used for filtering within the log lines themselves, after Loki has already located the relevant streams using labels. For instance, {namespace="production"} |= "error" first finds all log streams in the production namespace using the index, and then filters the lines of those streams for ones containing the string "error". This two-stage process is crucial: label-based selection happens first (fast, index-driven), followed by content filtering (slower, because it has to read the lines, but applied only to the selected streams).
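The two stages can be sketched in a few lines of Python. This is a toy model, not Loki's implementation: the streams list is made-up data, the function names are hypothetical, and stage 1 is a plain loop standing in for the index lookup.

```python
# Toy data: each stream is a label set plus its log lines.
streams = [
    ({"namespace": "production", "app": "webserver"}, ["GET / 200", "error: upstream timeout"]),
    ({"namespace": "production", "app": "worker"}, ["job done", "error: retry limit"]),
    ({"namespace": "staging", "app": "webserver"}, ["error: bad config"]),
]


def select_streams(streams, matchers):
    """Stage 1: keep streams whose labels satisfy every equality matcher."""
    return [
        (labels, lines)
        for labels, lines in streams
        if all(labels.get(name) == value for name, value in matchers.items())
    ]


def filter_lines(selected, needle):
    """Stage 2: keep only lines that contain the substring, like |= does."""
    return [line for _, lines in selected for line in lines if needle in line]


# Rough equivalent of: {namespace="production"} |= "error"
selected = select_streams(streams, {"namespace": "production"})
matches = filter_lines(selected, "error")
print(matches)
```

The staging stream also contains an "error" line, but it is never inspected because stage 1 already excluded it.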

Most people understand that {label="value"} filters by labels. What they often miss is how Loki applies multiple label matchers and how that translates to index lookups. When you use {ns="prod", app="web"}, the matchers are combined with AND: only streams that carry both labels with those values match. Resolving that can involve intersecting the sets of chunk pointers the index returns for each label matcher, or, if the index is sophisticated enough, it might hold composite entries that map directly to ns="prod", app="web". The efficiency here depends heavily on the index implementation.
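The set-intersection strategy can be illustrated with a small inverted index. This is a sketch of the general technique only, not a description of how any particular Loki index format works; the postings dictionary, the chunk IDs, and the lookup function are all invented for the example.

```python
# Hypothetical inverted index: (label, value) -> set of chunk IDs.
postings = {
    ("ns", "prod"): {"c1", "c2", "c3"},
    ("ns", "dev"): {"c4"},
    ("app", "web"): {"c2", "c3", "c4"},
    ("app", "db"): {"c1"},
}


def lookup(postings, matchers):
    """Return the chunk IDs that satisfy every label=value matcher."""
    sets = [postings.get(pair, set()) for pair in matchers.items()]
    if not sets:
        return set()
    # Start from the smallest set so intermediate results stay small.
    sets.sort(key=len)
    result = set(sets[0])
    for other in sets[1:]:
        result &= other
    return result


# Rough equivalent of the selector {ns="prod", app="web"}
print(sorted(lookup(postings, {"ns": "prod", "app": "web"})))
```

Starting with the smallest posting set is a common design choice: a very selective matcher shrinks the candidate set early, so the remaining intersections have less work to do.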

The next concept you’ll likely explore is how to optimize query performance when dealing with high cardinality labels or complex regular expressions within your log content filtering.
