Fluent Bit can collect Docker container logs, but it doesn’t attach to a container’s standard output and standard error streams itself. Docker’s logging driver captures those streams and writes them to files on the host, and Fluent Bit tails those files.
Here’s how a typical Docker container log collection setup with Fluent Bit looks:
# docker-compose.yml
version: '3.8'
services:
  app:
    image: your-app-image
    logging:
      driver: "json-file" # Or "none" if Fluent Bit is sidecar
      options:
        max-size: "10m"
        max-file: "3"
  fluent-bit:
    image: fluent/fluent-bit:latest
    volumes:
      - ./fluent-bit.conf:/fluent-bit/etc/fluent-bit.conf
      - /var/run/docker.sock:/var/run/docker.sock # Docker API socket (the tail input itself does not use it)
    ports:
      - "24224:24224" # Built-in HTTP server (HTTP_Port in fluent-bit.conf)
# fluent-bit.conf
[SERVICE]
    Daemon        Off
    Log_Level     info
    Parsers_File  parsers.conf
    HTTP_Server   On
    HTTP_Listen   0.0.0.0
    HTTP_Port     24224
[INPUT]
    Name          tail
    Tag           docker.*
    Path          /var/log/containers/*.log
    Parser        docker
    DB            /var/log/flb_docker.db
    Mem_Buf_Limit 5MB
[OUTPUT]
    Name          stdout
    Match         docker.*
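This configuration tells Fluent Bit to:
- Run in the foreground (Daemon Off).
- Load parser definitions from a parsers.conf file (which we’ll define next).
- Enable the built-in HTTP server on port 24224. This server exposes monitoring endpoints such as metrics; it is not a log input. Port 24224 is conventionally the forward input’s port and the HTTP server’s default is 2020, so any port works as long as it matches the one published in docker-compose.yml.
- Tail log files matching /var/log/containers/*.log. Note that this path is the convention on Kubernetes nodes. On a plain Docker host the json-file logging driver writes to /var/lib/docker/containers/<container_id>/<container_id>-json.log, so that directory has to be mounted into the Fluent Bit container and Path adjusted to match.
- Use a parser named docker to interpret each line.
- Keep a database (flb_docker.db) of tracked files and read offsets, so that a restart resumes where it left off instead of re-reading or skipping lines.
- Limit the memory this input may use for buffered records to 5MB. When the limit is reached, the input pauses until buffered data has been flushed.
- Send all records whose tag matches docker.* to stdout.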
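The offset tracking described in the DB bullet can be illustrated with a small Python sketch. This is not Fluent Bit’s implementation: the offsets table, its columns, and the read_new_lines function are hypothetical names chosen for the example, and the only assumption carried over from Fluent Bit is the idea of persisting a per-file read position in a SQLite database.

```python
import sqlite3
from pathlib import Path


def read_new_lines(log_path: Path, db: sqlite3.Connection) -> list[str]:
    """Return lines appended since the previous call and persist the new offset."""
    # Hypothetical schema: one row per tailed file, storing the byte offset read so far.
    db.execute(
        "CREATE TABLE IF NOT EXISTS offsets (path TEXT PRIMARY KEY, pos INTEGER)"
    )
    row = db.execute(
        "SELECT pos FROM offsets WHERE path = ?", (str(log_path),)
    ).fetchone()
    start = row[0] if row else 0

    with log_path.open("rb") as fh:
        fh.seek(start)
        data = fh.read()

    # Consume only complete lines; a partial trailing line waits for the next call.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8").splitlines()

    db.execute(
        "INSERT OR REPLACE INTO offsets (path, pos) VALUES (?, ?)",
        (str(log_path), start + end),
    )
    db.commit()
    return lines
```

Because the offset lives in the database rather than in memory, a second process that opens the same database file continues from the stored position, which is the property the DB setting is there to provide.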
The parsers.conf would look something like this:
[PARSER]
    Name        docker
    Format      json
    Time_Key    time
    Time_Format %Y-%m-%dT%H:%M:%S.%LZ
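This tells Fluent Bit that each line is a JSON object and that its timestamp is in the time field, formatted as YYYY-MM-DDTHH:MM:SS followed by fractional seconds (%L) and a literal Z (e.g., 2023-10-27T10:30:00.123Z). This is the envelope that Docker’s json-file driver writes around every line of container output, alongside the log and stream fields.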
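To make the parser’s job concrete, here is a minimal Python sketch of the same steps: decode the JSON line, pull out the time key, and convert it to a timestamp. The function name parse_docker_line is hypothetical, and the sample line in the usage note is an illustrative json-file entry rather than captured output. Python’s %f accepts at most six fractional digits, so the sketch truncates longer fractions, which is a limitation of this example and not of Fluent Bit.

```python
import json
from datetime import datetime, timezone


def parse_docker_line(line: str) -> tuple[datetime, dict]:
    """Split a json-file log line into (timestamp, remaining record)."""
    record = json.loads(line)
    # Mirrors Time_Key time: the key is taken out of the record once parsed
    # (assumes the default behaviour where the time key is not kept).
    raw_time = record.pop("time")
    base, _, frac = raw_time.rstrip("Z").partition(".")
    micros = (frac + "000000")[:6]  # pad or truncate to microseconds
    ts = datetime.strptime(f"{base}.{micros}", "%Y-%m-%dT%H:%M:%S.%f")
    return ts.replace(tzinfo=timezone.utc), record


if __name__ == "__main__":
    sample = '{"log":"hello\\n","stream":"stdout","time":"2023-10-27T10:30:00.123456789Z"}'
    timestamp, rest = parse_docker_line(sample)
    print(timestamp.isoformat(), rest)
```

The remaining record keeps log and stream, which is why downstream filters and outputs see those two keys.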
When you run this with docker-compose up, your application’s logs (anything it prints to stdout/stderr) are read by Fluent Bit and printed again on Fluent Bit’s own stdout.
The most surprising thing about collecting Docker logs with Fluent Bit is that it doesn’t directly intercept container output; it relies on the Docker daemon’s configured logging driver to write logs to files that Fluent Bit can then read.
Let’s see it in action. Imagine a simple Python app that logs to stdout:
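# app.py
import datetime
import json
import time

for i in range(5):
    # UTC timestamp with millisecond precision, e.g. 2023-10-27T10:30:00.123Z
    now = datetime.datetime.now(datetime.timezone.utc)
    timestamp = now.isoformat(timespec="milliseconds").replace("+00:00", "Z")
    message = {"time": timestamp, "message": f"Log entry number {i}"}
    # json.dumps emits valid JSON; print(message) would print a Python dict repr
    print(json.dumps(message), flush=True)  # flush so each line reaches Docker immediately
    time.sleep(1)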
And a Dockerfile:
FROM python:3.9-slim
WORKDIR /app
COPY app.py .
CMD ["python", "app.py"]
If you build and run this with Docker Compose, and Fluent Bit can see the container’s log file, it will pick up these logs. Since OUTPUT is set to stdout, each record is printed in the stdout plugin’s own layout: an index, the tag, then the timestamp and the record. The line below shows the approximate shape with placeholders rather than captured output, and the exact layout varies between Fluent Bit versions:
[0] docker.var.log.containers.<log-file-name>.log: [<epoch-seconds>.<nanoseconds>, {"log"=>"<line printed by the app>", "stream"=>"stdout"}]
Notice that the docker parser decodes the JSON envelope written by the json-file driver: the time field becomes the record’s timestamp, while log and stream stay in the record. The application’s own JSON remains an unparsed string inside log unless a further parser or filter decodes it. The docker.* tag in the INPUT section matters for routing: the tail input replaces the * with the path of the file being read (slashes become dots), and the OUTPUT section’s Match docker.* then selects those records. The tag does not restrict which files are read; that is determined solely by Path.
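The relationship between Tag docker.* on the input and Match docker.* on the output can be sketched in a few lines of Python. The helper names expand_tag and matches are hypothetical, and fnmatch is used only as an approximation of Fluent Bit’s wildcard matching; the sketch assumes the tail input’s documented behaviour of substituting the monitored file’s path, with slashes replaced by dots, for the * in the tag.

```python
from fnmatch import fnmatchcase


def expand_tag(tag_template: str, file_path: str) -> str:
    """Replace '*' in the tag with the file path, slashes turned into dots."""
    dotted_path = file_path.strip("/").replace("/", ".")
    return tag_template.replace("*", dotted_path)


def matches(match_pattern: str, tag: str) -> bool:
    """Approximate an OUTPUT section's Match rule with shell-style wildcards."""
    return fnmatchcase(tag, match_pattern)


if __name__ == "__main__":
    tag = expand_tag("docker.*", "/var/log/containers/app.log")
    print(tag, matches("docker.*", tag))
```

A record read from a different input with, say, a syslog.* tag would not match docker.* and would need its own OUTPUT section.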
The core problem this solves is centralizing logs from ephemeral containers. Instead of SSHing into each node and tailing logs, or relying on Docker’s built-in drivers to send logs to a specific destination (which can be complex to configure for multiple destinations), Fluent Bit acts as a lightweight agent that can read these logs and forward them to various backends like Elasticsearch, Splunk, S3, or another Fluent Bit instance.
The docker.sock mount is not what lets the tail input find log files. The tail input does not query the Docker API; it follows whatever files match its Path glob. Docker’s json-file driver writes each container’s log to /var/lib/docker/containers/<container_id>/<container_id>-json.log, while /var/log/containers/ is the Kubernetes convention, where symlinks to the container log files are maintained for each pod. The socket is only needed by plugins that talk to the Docker API, such as the docker_events input.
The Parser docker in the INPUT section refers to a parser definition by name; Fluent Bit looks that name up in the file given by Parsers_File, here the parsers.conf shown above. This allows Fluent Bit to correctly interpret the structured JSON output that Docker’s json-file driver produces, extracting fields like time, log (the actual message), and stream (stdout/stderr).
A common pitfall is forgetting to mount the host’s log directory into the Fluent Bit container. The tail input only sees files inside its own filesystem, so if the directory isn’t mounted, the Path glob matches nothing and no logs are collected, usually without an obvious error. Because Path is a glob that is re-evaluated periodically (see the tail input’s Refresh_Interval option), log files for containers started later are picked up as they appear.
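Since everything hinges on which files the Path glob matches, it can help to check a pattern against the per-container directory layout mentioned above. The Python sketch below builds a throwaway directory that mimics that layout and compares two patterns. The container IDs and helper names are made up for the example, and Python’s glob module is assumed to behave like the shell-style globbing used for Path.

```python
import glob
import os
import tempfile


def make_docker_layout(root: str, container_ids: list[str]) -> None:
    """Create <root>/<id>/<id>-json.log for each (made-up) container ID."""
    for cid in container_ids:
        os.makedirs(os.path.join(root, cid))
        open(os.path.join(root, cid, f"{cid}-json.log"), "w").close()


def matched_files(root: str, pattern: str) -> list[str]:
    """Return paths under root matching pattern, relative to root and sorted."""
    hits = glob.glob(os.path.join(root, pattern))
    return sorted(os.path.relpath(path, root) for path in hits)


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    make_docker_layout(root, ["abc123", "def456"])
    print("*.log   ->", matched_files(root, "*.log"))
    print("*/*.log ->", matched_files(root, "*/*.log"))
```

With one subdirectory per container, a pattern of the form *.log finds nothing at the top level, whereas */*.log reaches the files one level down.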
If you want Fluent Bit to run as a sidecar container next to your application (a common pattern in Kubernetes pods, but adaptable to Docker Compose), you would set your application’s logging driver to none or syslog. With none, Docker discards stdout/stderr, so the application has to write its logs to files on a volume shared with Fluent Bit, which tails them there. With syslog, Docker forwards each line to a syslog endpoint, which Fluent Bit can provide through its syslog input. However, the most common Docker Compose setup involves Fluent Bit as a separate service that tails the log files written by the Docker daemon.
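For the shared-volume variant of the sidecar pattern, the application has to write log files itself. A minimal sketch using Python’s standard logging module follows. The JsonLineFormatter class, the build_file_logger helper, the logger name, and the /shared-logs/app.log path in the usage note are all hypothetical; the only assumption is that the file sits on a volume mounted into both containers and that Fluent Bit tails it with a JSON parser.

```python
import json
import logging
from datetime import datetime, timezone


class JsonLineFormatter(logging.Formatter):
    """Format each log record as a single JSON line with a UTC timestamp."""

    def format(self, record: logging.LogRecord) -> str:
        created = datetime.fromtimestamp(record.created, timezone.utc)
        return json.dumps({
            "time": created.isoformat(timespec="milliseconds").replace("+00:00", "Z"),
            "level": record.levelname,
            "message": record.getMessage(),
        })


def build_file_logger(path: str) -> logging.Logger:
    """Return a logger that appends JSON lines to the given file."""
    logger = logging.getLogger("app")  # hypothetical logger name
    handler = logging.FileHandler(path)
    handler.setFormatter(JsonLineFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

In the application this might be used as build_file_logger("/shared-logs/app.log").info("started"), with the same directory mounted into the Fluent Bit container and referenced by the tail input’s Path.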
The next concept you’ll likely run into is configuring different OUTPUT plugins to send logs to a remote destination, rather than just stdout, and learning how to filter or modify logs before they are sent.