Fluent Bit can ship logs to CloudWatch, but it’s not just a simple pipe; it’s a sophisticated buffering and retry mechanism designed to handle network instability and backpressure.
Let’s see it in action. Imagine you have a Kubernetes cluster and you want to send your application logs to CloudWatch.
First, you need Fluent Bit running as a DaemonSet in your cluster. Each Fluent Bit pod tails the container logs on the node it runs on.
Here’s a typical Fluent Bit configuration snippet for CloudWatch output:
[SERVICE]
    Flush         1
    Daemon        Off
    Log_Level     info
    Parsers_File  parsers.conf
    HTTP_Server   On
    HTTP_Listen   0.0.0.0
    HTTP_Port     2020

[INPUT]
    Name              tail
    Path              /var/log/containers/*.log
    multiline.parser  docker, cri
    Tag               kube.*
    Mem_Buf_Limit     5MB
    Skip_Long_Lines   On
    Refresh_Interval  10

[OUTPUT]
    Name               cloudwatch_logs
    Match              kube.*
    region             us-east-1
    log_group_name     /aws/containerlogs/${HOSTNAME}
    log_stream_prefix  from-fluent-bit-
    auto_create_group  On
    Retry_Limit        10
    tls                On

In this setup:

- [SERVICE] defines general Fluent Bit settings. Flush 1 means buffered records are flushed to the outputs every second. Daemon Off keeps Fluent Bit in the foreground, which is what a container needs; a daemonized process would make the container exit.
- [INPUT] is configured to tail container logs from /var/log/containers/*.log. multiline.parser docker, cri handles both the Docker JSON format and the CRI format. Tag kube.* is used for routing; for the tail input, the * is replaced by the path of the monitored file, with slashes turned into dots. Mem_Buf_Limit 5MB sets a memory buffer limit for incoming logs.
- [OUTPUT] is the star. Name cloudwatch_logs tells Fluent Bit to use the AWS CloudWatch Logs output plugin. Match kube.* ensures only records whose tag matches kube.* are sent here. region us-east-1 specifies your AWS region. log_group_name and log_stream_prefix define how your logs are organized in CloudWatch. ${HOSTNAME} is an environment variable that Fluent Bit expands once, when it loads the configuration; in a DaemonSet it normally resolves to the Fluent Bit pod’s hostname, so each node’s agent writes to its own log group. auto_create_group On simplifies setup by creating the log group if it doesn’t exist. Retry_Limit 10 is crucial for reliability.

The log stream naming is particularly interesting. Because ${...} references are resolved from Fluent Bit’s own environment at startup, they cannot vary per container, and shell-style substring syntax such as ${CONTAINER_ID:0:8} is not available. Per-container streams come from log_stream_prefix instead: the plugin appends each record’s tag to the prefix, and the tag is built from the log file name, which contains the pod name, namespace, container name, and container ID. Each container therefore gets its own stream. If a pod restarts and gets a new container ID, its logs appear in a new stream, and the old stream keeps the historical data. This is a common pattern for keeping the output of different container instances from interleaving in a single stream.

The core problem Fluent Bit solves here is getting unstructured or semi-structured container logs into CloudWatch reliably, even under variable network conditions. It doesn’t just send data; it queues it, batches it, retries on failure, and applies backpressure if CloudWatch is slow to respond, so a slow destination does not turn into unbounded memory growth in the agent. Retry_Limit 10 means that if a batch of logs fails to send to CloudWatch, Fluent Bit retries it up to 10 times, with increasing delays between attempts, before dropping the logs. This significantly increases the chances of your logs reaching their destination.
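The retry behaviour described above (a capped number of attempts with growing delays) can be illustrated with a small model. This is a sketch of generic exponential backoff with jitter, not Fluent Bit’s actual scheduler code; the function names, the 5-second base, and the 2000-second cap are assumptions chosen for illustration.

```python
import random


def backoff_delays(retry_limit=10, base=5.0, cap=2000.0, rng=None):
    """Return one randomized wait (in seconds) per retry attempt.

    The upper bound doubles with each attempt until it reaches `cap`.
    Because of the jitter, individual waits are not strictly increasing;
    only the range they are drawn from grows.
    """
    rng = rng or random.Random()
    delays = []
    for attempt in range(1, retry_limit + 1):
        upper = min(cap, base * 2 ** attempt)
        delays.append(rng.uniform(base, upper))
    return delays


def send_with_retries(send, batch, retry_limit=10, sleep=lambda s: None, **kw):
    """Send once, then retry up to `retry_limit` times.

    `send` is a hypothetical callable returning True on success.
    Returns True if the batch was delivered, False if it was dropped.
    """
    if send(batch):
        return True
    for delay in backoff_delays(retry_limit, **kw):
        sleep(delay)  # pass time.sleep here to actually wait
        if send(batch):
            return True
    return False
```

With a retry limit of 10, a batch that never succeeds is attempted 11 times in total (the first try plus 10 retries) before it is dropped.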
What most people don’t realize is how Fluent Bit handles backpressure. If the CloudWatch endpoint becomes unresponsive or starts returning errors, Fluent Bit won’t just keep hammering it. The Mem_Buf_Limit in the input plugin acts as a first line of defense. If that buffer fills up, Fluent Bit pauses the input (here, the tail plugin) until enough buffered data has been flushed, then lets it resume. Because the log files stay on disk, tail can pick up from where it stopped as long as the files have not been rotated away in the meantime. This prevents Fluent Bit itself from running out of memory and crashing, and it gives CloudWatch time to recover.
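To make the buffer-limit behaviour concrete, here is a toy model of a tail-style input that stops reading when its in-memory buffer reaches a limit and continues from its saved offset after a flush. It is a simplified sketch, not Fluent Bit’s implementation; the class name, its methods, and the byte accounting are all assumptions.

```python
class TailInput:
    """Toy model of a file-tailing input with a memory buffer limit."""

    def __init__(self, lines, mem_buf_limit):
        self.lines = lines          # stands in for the log file on disk
        self.offset = 0             # next line to read; survives pauses
        self.limit = mem_buf_limit  # maximum bytes held in memory
        self.buffer = []            # records waiting for the output
        self.paused = False

    def buffered_bytes(self):
        return sum(len(line) for line in self.buffer)

    def read(self):
        """Read until the file is exhausted or the buffer limit is reached."""
        self.paused = False
        while self.offset < len(self.lines):
            if self.buffered_bytes() >= self.limit:
                self.paused = True  # stop reading; the file keeps the rest
                break
            self.buffer.append(self.lines[self.offset])
            self.offset += 1

    def flush(self, n):
        """The output delivered up to n records; free them and return them."""
        sent = self.buffer[:n]
        del self.buffer[:n]
        self.paused = self.buffered_bytes() >= self.limit
        return sent
```

Alternating `flush` and `read` drains the whole file without the buffer ever growing past the limit, which is the property the memory limit is there to provide.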
The next hurdle is understanding how to parse and structure logs before they even hit Fluent Bit, especially when dealing with complex application logs.