Docker’s default json-file logging driver writes each container’s logs to a local file on the host that runs it, so you can’t easily search or aggregate them across hosts.
Let’s see this in action. Imagine you have a simple web application running in a Docker container.
version: '3.8'
services:
  webapp:
    image: nginx:latest
    ports:
      - "8080:80"
    logging:
      driver: "fluentd"
      options:
        fluentd-address: "localhost:24224"
        tag: "docker.webapp.{{.Name}}.{{.ID}}"
Here, we’ve told Docker to send logs from the webapp service to a Fluentd collector running on localhost:24224. Note that the address is resolved on the Docker host, because the daemon, not the container, opens the connection. The tag option is powerful; it allows us to dynamically assign a label to each log message based on the container’s name and ID, making it easy to filter and route logs later.
The core problem Fluentd solves for Docker logging is centralization and structured access. Instead of logs being scattered across individual container files on potentially many different machines, Fluentd acts as a collection point. It receives logs, parses them into structured data (if configured to do so), and then forwards them to various destinations like Elasticsearch, S3, or even another logging system. This transforms raw, unstructured text into searchable, analyzable data.
Internally, the logging configuration in daemon.json (the daemon-wide default) or docker-compose.yml (per service) tells the Docker daemon where to send logs. When a container writes to stdout/stderr, the Docker daemon captures the output. If a logging driver such as fluentd is specified, the daemon wraps each log line in a structured record and sends it over the network to the configured Fluentd endpoint. The fluentd-address option specifies the host and port of the Fluentd collector. The tag option is crucial for routing and filtering; Docker expands the template and attaches the result to each record, and Fluentd uses that tag to categorize incoming logs.
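To make the packaging step concrete, here is a rough Python sketch of the shape of one event as it leaves the daemon. The four record fields are the ones Docker documents for this driver; the function name build_event, the sample values, and the JSON rendering are illustrative assumptions (the real driver encodes events with MessagePack over the forward protocol, not JSON).

```python
import json
import time


def build_event(tag, container_id, container_name, source, line):
    """Return one [tag, time, record] event, the basic shape used by the forward protocol."""
    record = {
        "container_id": container_id,      # full 64-character container ID
        "container_name": container_name,  # Docker reports this with a leading slash
        "source": source,                  # "stdout" or "stderr"
        "log": line,                       # the raw line the container wrote
    }
    return [tag, int(time.time()), record]


# Hypothetical values for illustration only.
event = build_event(
    tag="docker.webapp.webapp_1.0123456789ab",
    container_id="0123456789ab" + "0" * 52,
    container_name="/webapp_1",
    source="stdout",
    line='172.17.0.1 - - "GET / HTTP/1.1" 200 612',
)
print(json.dumps(event, indent=2))
```

Everything you later filter or route on in Fluentd is either the tag or one of these record fields.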
The primary lever you control is the logging section within your docker-compose.yml or daemon.json. You specify the driver (e.g., fluentd) and then provide driver-specific options. For Fluentd, the most common options are fluentd-address and tag. The driver has no option for choosing an encoding: it always speaks Fluentd’s forward protocol and sends each line as a structured record with the fields container_id, container_name, source, and log, so Fluentd’s forward input can read it without extra format configuration.
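The Compose file earlier covers the per-service case. For the daemon-wide case, the sketch below generates an equivalent daemon.json. The keys log-driver and log-opts are the ones the Docker daemon reads; the tag template and the idea of generating the file from Python are assumptions made for illustration.

```python
import json

# Daemon-wide defaults; every value under "log-opts" must be a string.
daemon_config = {
    "log-driver": "fluentd",
    "log-opts": {
        "fluentd-address": "localhost:24224",
        "tag": "docker.{{.Name}}.{{.ID}}",
    },
}

daemon_json = json.dumps(daemon_config, indent=2)
print(daemon_json)
```

On most Linux hosts this content would go in /etc/docker/daemon.json, followed by a daemon restart. Containers created before the change keep the driver they were created with.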
Consider the tag option: tag: "docker.webapp.{{.Name}}.{{.ID}}". This isn’t just a static string. {{.Name}} and {{.ID}} are Go template variables that Docker replaces with the container’s name and the first 12 characters of its ID when the container starts. This means every log message from that container carries a tag that identifies it, allowing you to trace specific container output within Fluentd, even if you have hundreds of containers running.
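As a rough illustration of that substitution, the following Python sketch mimics what Docker does for these two placeholders. The helper expand_tag and the sample name and ID are hypothetical; Docker itself uses Go’s template engine and supports more placeholders than shown here.

```python
def expand_tag(template, name, container_id):
    """Mimic Docker's substitution for the two placeholders used in this article."""
    return (
        template
        .replace("{{.Name}}", name)
        .replace("{{.ID}}", container_id[:12])  # {{.ID}} is the short, 12-character ID
    )


full_id = "4f2a9c0d1b7e" + "0" * 52  # hypothetical 64-character container ID
tag = expand_tag("docker.webapp.{{.Name}}.{{.ID}}", "webapp_1", full_id)
print(tag)  # docker.webapp.webapp_1.4f2a9c0d1b7e
```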
When Fluentd receives these logs, its own configuration takes over. A typical Fluentd configuration (fluentd.conf) has a <source> section to receive logs from Docker (usually via the in_forward plugin, which implements the forward protocol that the Docker fluentd driver speaks), and then <match> sections to process and route these logs. For example, a minimal configuration might look like:
<source>
  @type forward
  port 24224
  bind 0.0.0.0
</source>
<match docker.**>
  @type stdout
</match>
This simple configuration tells Fluentd to listen on port 24224 on all interfaces for incoming logs and then print any event whose tag starts with docker to its own standard output. In a real-world scenario, you’d replace @type stdout with a plugin for your chosen destination, like elasticsearch or s3.
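To see which tags a pattern like docker.** selects, here is a small Python sketch of Fluentd’s two wildcard rules: * matches exactly one tag part, and ** matches zero or more parts. The function tag_matches is a simplified stand-in written for this article, not Fluentd’s own matcher, and it ignores other pattern forms such as {a,b}.

```python
def tag_matches(pattern, tag):
    """Return True if a dot-separated tag matches a pattern using * and ** wildcards."""

    def match(p, t):
        if not p:
            return not t  # pattern exhausted: match only if the tag is too
        head, rest = p[0], p[1:]
        if head == "**":
            # ** may absorb zero or more tag parts
            return any(match(rest, t[i:]) for i in range(len(t) + 1))
        if not t:
            return False
        if head == "*" or head == t[0]:
            return match(rest, t[1:])
        return False

    return match(pattern.split("."), tag.split("."))


for candidate in ["docker.webapp.webapp_1.4f2a9c0d1b7e", "docker", "app.docker.web"]:
    print(candidate, tag_matches("docker.**", candidate))
```

The first two tags match and the third does not, because the pattern is anchored at the first tag part.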
A common point of confusion is which transports fluentd-address accepts. By default, Docker’s fluentd driver connects over TCP, and the address can be written as host:port or tcp://host:port. If your Fluentd collector listens on a Unix domain socket instead, give the socket path in the form unix:///path/to/fluentd.sock. The driver does not send logs over UDP and has no fluentd-protocol option, so the collector’s forward input must be reachable over TCP or a Unix socket.
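Because the driver connects over TCP by default, a quick connectivity check against the collector can save debugging time: unless asynchronous connection is enabled, a container whose daemon cannot reach Fluentd will not start. The sketch below is one way to run that check from Python; the helper name fluentd_reachable and the two-second timeout are assumptions, and a successful connection only shows that something is listening on the port, not that it is Fluentd.

```python
import socket


def fluentd_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Same address as the fluentd-address option in the Compose file.
    print(fluentd_reachable("localhost", 24224))
```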
The next step after successfully routing logs to Fluentd is to configure Fluentd to parse and enrich those logs, often by adding metadata like the Kubernetes pod name if running in a cluster, or by parsing JSON payloads within the log message itself to create searchable fields.
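As a conceptual preview of the JSON case, the Python sketch below shows the transformation involved: if the log field holds a JSON object, its keys are lifted into the record so they become searchable fields. In Fluentd itself this is normally done with a parser filter rather than custom code; the function parse_log_field and the sample record are illustrative assumptions.

```python
import json


def parse_log_field(record):
    """Merge a JSON object found in record["log"] into the record; leave other lines untouched."""
    try:
        payload = json.loads(record["log"])
    except (KeyError, TypeError, ValueError):
        return record  # missing field or not valid JSON: keep the record as-is
    if not isinstance(payload, dict):
        return record  # valid JSON, but not an object we can merge
    enriched = {k: v for k, v in record.items() if k != "log"}
    enriched.update(payload)
    return enriched


# Hypothetical record, shaped like what the Docker driver sends.
raw = {
    "container_name": "/webapp_1",
    "source": "stdout",
    "log": '{"level": "info", "status": 200}',
}
print(parse_log_field(raw))
```

Plain-text lines pass through unchanged, so the same step can sit in front of containers that log JSON and containers that do not.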