Loki’s json, logfmt, and unpack parsers turn JSON or logfmt-encoded log lines into labels you can filter on, but they’re often mixed up, leading to missed data.
Here’s a look at how to use them effectively, and a common pitfall. The examples use these two log lines:
{"level": "info", "message": "user logged in", "user": {"id": 123, "name": "Alice", "details": {"email": "alice@example.com", "subscription": "premium"}}}
{"level": "warn", "message": "invalid input", "data": {"query": "user=Bob", "error": "missing field"}}
Let’s say you want to extract user.id and user.details.email.
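The logfmt parser is designed for the key=value format and does not understand JSON. If your logs are JSON, the json parser is the one to use: it handles nested objects on its own. The unpack parser is narrower; it’s meant for JSON lines written by Promtail’s pack stage.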
The Problem: Direct logfmt on JSON
A common mistake is to try and use logfmt directly on JSON, expecting it to understand the nested structure.
{job="my_app"} | logfmt "user.id"
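This won’t work. logfmt expects key=value pairs, not JSON objects, and its parameters are label names (optionally written as label="key"), not quoted paths. Even a bare | logfmt on a JSON line extracts nothing useful: depending on the Loki version and whether --strict is set, the line either gets an __error__ label of LogfmtParserErr or passes through without the fields you wanted.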
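To make the mismatch concrete, here is a minimal Python sketch of a key=value tokenizer run against both formats. The parse_logfmt function is hypothetical and much simpler than Loki's real logfmt decoder; it only illustrates that a JSON line contains no key=value tokens to find.

```python
import shlex


def parse_logfmt(line: str) -> dict[str, str]:
    """Tiny key=value tokenizer (an illustration, not Loki's decoder)."""
    pairs = {}
    # shlex.split honours double quotes, so message="user logged in" stays one token.
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep and key:  # keep only tokens that really look like key=value
            pairs[key] = value
    return pairs


logfmt_line = 'level=info message="user logged in" user.id=123'
json_line = '{"level": "info", "message": "user logged in", "user": {"id": 123}}'

print(parse_logfmt(logfmt_line))
# {'level': 'info', 'message': 'user logged in', 'user.id': '123'}
print(parse_logfmt(json_line))
# {}
```

The JSON line yields an empty result because none of its tokens contain an equals sign, which is the same reason a logfmt stage cannot find user.id in it.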
Solution 1: json with extraction expressions
The most targeted way is to tell the json parser exactly which fields you want and what to call them.
{job="my_app"} | json user_id="user.id", user_email="user.details.email"
Let’s break this down:
{job="my_app"}: This is your basic Loki query to select logs from a specific job.
| json: This is the crucial step. The json parser reads each log line as JSON and turns fields into labels.
user_id="user.id", user_email="user.details.email": Each parameter is a label name followed by a quoted path into the JSON document. "user.id" tells json to look for a field named user and, within that, a field named id. Similarly, "user.details.email" traverses user -> details -> email.
Why this works: With parameters, json extracts only the listed paths instead of every field, so you control the label names and keep the label set small.
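The dotted-path traversal can be sketched in Python as follows. This is an illustration under stated assumptions, not Loki's implementation: the extract function and its keyword-argument style are hypothetical, chosen to resemble the LogQL form json user_id="user.id". Skipping paths that do not resolve is also an assumption of this sketch.

```python
import json


def extract(line: str, **paths: str) -> dict[str, str]:
    """Map label names to dotted JSON paths, e.g. user_id="user.id"."""
    doc = json.loads(line)
    labels = {}
    for label, path in paths.items():
        node = doc
        for key in path.split("."):
            # Stop as soon as the path leaves the document.
            if not isinstance(node, dict) or key not in node:
                node = None
                break
            node = node[key]
        if node is not None:
            # Label values are strings; non-string JSON values are serialized.
            labels[label] = node if isinstance(node, str) else json.dumps(node)
    return labels


login = ('{"level": "info", "message": "user logged in", "user": {"id": 123, '
         '"name": "Alice", "details": {"email": "alice@example.com", '
         '"subscription": "premium"}}}')

print(extract(login, user_id="user.id", user_email="user.details.email"))
# {'user_id': '123', 'user_email': 'alice@example.com'}
```

Note that the numeric id comes back as the string 123, which matches how label values behave when you filter on them later.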
Solution 2: json without parameters (and where unpack fits)
If you want every field, run the json parser with no parameters. It flattens nested objects into labels, joining the keys with underscores.
{job="my_app"} | json
Here’s the breakdown:
{job="my_app"}: Your log stream selector.
| json: With no parameters, the json parser extracts every property as a label. Nested keys are joined with _, so user.id becomes user_id and user.details.email becomes user_details_email. Arrays are skipped. If the log line is not valid JSON, the line is not dropped; it is kept and gets an __error__ label set to JSONParserErr.
Where unpack fits: unpack takes no parameters and does not accept paths. It is designed for lines produced by Promtail’s pack stage, where labels are embedded as top-level JSON keys and the original log line is stored under the _entry key. unpack turns those keys back into labels and restores _entry as the log line. For ordinary application JSON like the examples here, json is the right parser.
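To see which label names underscore flattening would give for the first sample line, here is a rough Python sketch. The flatten function is hypothetical and is not Loki's code; it only mimics the documented naming of the parameterless json parser, where nested keys are joined with _ and arrays are skipped.

```python
import json


def flatten(obj: dict, prefix: str = "") -> dict[str, str]:
    """Flatten nested JSON objects into underscore-joined label names."""
    labels = {}
    for key, value in obj.items():
        name = f"{prefix}_{key}" if prefix else key
        if isinstance(value, dict):
            labels.update(flatten(value, name))  # descend into nested objects
        elif isinstance(value, list):
            continue  # arrays are skipped in this sketch
        else:
            labels[name] = value if isinstance(value, str) else json.dumps(value)
    return labels


login = ('{"level": "info", "message": "user logged in", "user": {"id": 123, '
         '"name": "Alice", "details": {"email": "alice@example.com", '
         '"subscription": "premium"}}}')

for name, value in flatten(json.loads(login)).items():
    print(f"{name}={value}")
# level=info
# message=user logged in
# user_id=123
# user_name=Alice
# user_details_email=alice@example.com
# user_details_subscription=premium
```

The names printed here (user_id, user_details_email) are the ones you would reference in later filter stages.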
Handling Mixed Log Formats
What if you have both JSON and logfmt in the same stream?
Example log lines:
{"level": "info", "message": "user logged in", "user": {"id": 123}}
level=info message="user logged in" user.id=123
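If you use json, it extracts labels from the JSON lines. The logfmt lines are not dropped; they are kept with an __error__ label set to JSONParserErr, and nothing is extracted from them.
If you use logfmt, the reverse happens: the logfmt lines are parsed, and user.id is exposed as user_id because label names cannot contain dots, while the JSON lines yield nothing useful.
If you use unpack, it will only parse the JSON lines, and it is only meaningful for lines written by the pack stage.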
To handle mixed formats, you often need to be more explicit or use multiple stages. A common pattern is to try parsing as JSON, and if that fails, try parsing as logfmt. However, for extraction of nested fields, you’re usually dealing with structured logs where one format dominates.
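The "try JSON, then fall back to logfmt" idea can be sketched outside Loki, for example in a script that post-processes exported log lines. Everything here is an assumption for illustration: parse_line, flatten, and parse_logfmt are hypothetical helpers, and replacing characters that are not valid in label names with _ is modeled on how Loki sanitizes label names.

```python
import json
import re
import shlex


def flatten(obj: dict, prefix: str = "") -> dict[str, str]:
    """Underscore-join nested JSON keys; arrays are skipped."""
    labels = {}
    for key, value in obj.items():
        name = f"{prefix}_{key}" if prefix else key
        if isinstance(value, dict):
            labels.update(flatten(value, name))
        elif not isinstance(value, list):
            labels[name] = value if isinstance(value, str) else json.dumps(value)
    return labels


def parse_logfmt(line: str) -> dict[str, str]:
    """Tiny key=value tokenizer with label-name sanitizing."""
    labels = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep and key:
            labels[re.sub(r"[^A-Za-z0-9_]", "_", key)] = value
    return labels


def parse_line(line: str) -> dict[str, str]:
    """Try JSON first; fall back to logfmt when the line is not a JSON object."""
    try:
        doc = json.loads(line)
    except json.JSONDecodeError:
        doc = None
    return flatten(doc) if isinstance(doc, dict) else parse_logfmt(line)


json_line = '{"level": "info", "message": "user logged in", "user": {"id": 123}}'
logfmt_line = 'level=info message="user logged in" user.id=123'

print(parse_line(json_line))
# {'level': 'info', 'message': 'user logged in', 'user_id': '123'}
print(parse_line(logfmt_line))
# {'level': 'info', 'message': 'user logged in', 'user_id': '123'}
```

Both formats end up with the same user_id label, which is what lets a single downstream filter cover a mixed stream.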
The Counterintuitive Detail: Flattened Label Names
When using json with no parameters, Loki does not keep nested objects around as JSON strings. A line like {"user": {"id": 123}} produces a single label, user_id, with the value 123. There is no label named user and no label named id.
Adding logfmt after json does not drill into those labels. Parser stages read the log line, not the labels created by earlier stages, and json leaves the log line unchanged. So in json | logfmt, the logfmt stage sees the original JSON text again and finds no key=value pairs in it.
This is why json alone is enough for nested JSON: the flattening, or the explicit paths you pass as parameters, already reaches every nested value. A filter on a label named user or id will match nothing, because the label is called user_id.
The Next Step: Filtering Extracted Fields
Once you’ve successfully extracted nested fields, you’ll likely want to filter based on them.
{job="my_app"} | json user_id="user.id", user_email="user.details.email" | user_id = "123"
Or with the parameterless json parser, which produces the same user_id label:
{job="my_app"} | json | user_id = "123"
Extracted labels are used under their own names, with no prefix. The one exception is a name clash: if an extracted label has the same name as an existing stream label, Loki adds an _extracted suffix (for example, job_extracted). You can then use these labels in subsequent | stages for filtering. This is where you’d filter for specific user IDs, email domains, or any other nested data you’ve pulled out.