Fluent Bit’s chunk size is more than a buffering detail: it determines how much data your output plugins have to process at once, which makes it a critical performance factor.

Let’s see it in action. Imagine you have a high-volume log stream going to Elasticsearch.

[SERVICE]
    Flush        5
    Daemon       off
    Log_Level    info
    Parsers_File parsers.conf
    HTTP_Server  on
    HTTP_Listen  127.0.0.1
    HTTP_Port    2020

[INPUT]
    Name              tail
    Path              /var/log/app/*.log
    Tag               app.*
    Refresh_Interval  1
    Buffer_Chunk_Size 1M
    Buffer_Max_Size   5M

[FILTER]
    Name               kubernetes
    Match              app.*
    Kube_URL           https://kubernetes.default.svc:443
    Kube_CA_File       /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    Kube_Token_File    /var/run/secrets/kubernetes.io/serviceaccount/token
    Merge_Log          On
    Merge_Log_Key      log_processed
    K8S-Logging.Parser On
    Labels             On
    Annotations        On

[OUTPUT]
    Name            es
    Match           app.*
    Host            elasticsearch.example.com
    Port            9200
    Logstash_Format On
    Replace_Dots    On
    Retry_Limit     False

In this setup, Buffer_Chunk_Size is 1M and Buffer_Max_Size is 5M. Both are input-side options, documented for plugins such as tail: Buffer_Chunk_Size is the initial size of the buffer used to read data from each monitored file, and Buffer_Max_Size is a hard limit on how far that per-file buffer may grow (for example, to hold a very long line), not on the data queued for an output. The engine groups the records read this way into "chunks," which it aims to keep at roughly 2MB. Every 5 seconds (controlled by Flush in the [SERVICE] section), the engine hands the pending chunks to the es output plugin.

The fundamental problem Fluent Bit solves is efficiently collecting, processing, and forwarding logs from numerous sources to various destinations. It achieves this through a plugin-based architecture and a robust buffering mechanism, and the chunk is the unit that mechanism works in. Smaller chunks mean more frequent, smaller writes to the output. This can be good for real-time visibility, but the per-request overhead adds up and can overwhelm an output that can’t keep up, leading to increased latency or even dropped data. Conversely, larger chunks mean fewer, larger writes. This is more efficient for the output, but it increases the latency between a log event occurring and it appearing at the destination.

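The [SERVICE] section’s Flush setting is the interval at which the engine hands buffered records to the outputs, so it bounds how long a record waits before its first delivery attempt. If your logs are sparse, each flush carries a small, partly filled chunk, and Flush is the primary driver of both request size and latency. If logs are abundant, chunks fill up before the timer fires, and the data for one interval is spread across several chunks.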
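To make the sparse-versus-abundant distinction concrete, here is a back-of-the-envelope sketch. It is not how Fluent Bit computes anything internally: the helper `data_per_flush` is hypothetical, the log rates are made up, and the 2 MB chunk target is an approximate figure assumed for illustration.

```python
import math

# Assumption: the engine aims for chunks of roughly 2 MB (approximate figure).
CHUNK_TARGET_BYTES = 2 * 1024 * 1024


def data_per_flush(rate_bytes_per_s: float, flush_s: float) -> tuple[int, int]:
    """Return (bytes accumulated in one flush interval, approximate chunk count)."""
    total = int(rate_bytes_per_s * flush_s)
    chunks = math.ceil(total / CHUNK_TARGET_BYTES)
    return total, chunks


# Sparse stream: 100 kB/s with Flush 5 fits in one partly filled chunk.
sparse = data_per_flush(100_000, 5)
# Busy stream: 2 MB/s with Flush 5 spills over several chunks per interval.
busy = data_per_flush(2_000_000, 5)
print(sparse, busy)
```

In the sparse case the flush timer decides how much each request carries; in the busy case the chunk target does.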

On tail, Buffer_Chunk_Size is the initial size of the per-file read buffer, and Buffer_Max_Size is the limit that buffer may grow to. A line longer than Buffer_Max_Size causes the file to be dropped from monitoring unless Skip_Long_Lines is On. Neither option caps the data queued for an output. That job belongs to Mem_Buf_Limit on the input, which pauses ingestion when the input’s in-memory data reaches the limit, and, with filesystem buffering, to storage.total_limit_size on the output, which discards the oldest queued chunks once the limit is reached. Retry_Limit False (as shown in the example) removes the cap on retries, so a chunk that keeps failing is retried indefinitely and stays buffered. Without one of those limits, a long outage at the destination can therefore lead to memory exhaustion.
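Whichever setting enforces the cap, how quickly it is reached depends on the gap between the ingest rate and the delivery rate. The sketch below is a simplified model with a hypothetical helper and made-up numbers; real backlogs also depend on retries, compression, and bursts.

```python
def seconds_until_limit(limit_bytes: int, in_rate: float, out_rate: float) -> float:
    """Seconds until queued data reaches limit_bytes; inf if the output keeps up."""
    backlog_rate = in_rate - out_rate  # bytes per second piling up in the buffer
    if backlog_rate <= 0:
        return float("inf")
    return limit_bytes / backlog_rate


# Hypothetical numbers: 50 MB limit, 2 MB/s ingested, 1.5 MB/s delivered.
slow_output = seconds_until_limit(50_000_000, 2_000_000, 1_500_000)
# An output that drains faster than the input never hits the limit.
healthy_output = seconds_until_limit(50_000_000, 2_000_000, 3_000_000)
print(slow_output, healthy_output)
```

A modest shortfall in delivery rate fills even a generous buffer within minutes, which is why the limit and the retry policy need to be chosen together.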

Many users, especially those dealing with high-throughput scenarios, instinctively raise buffer limits to prevent data loss. However, the real performance bottleneck is often the number of chunks being processed, not their total size. The official documentation does not describe a setting for the engine’s chunk size, so chunks much larger than about 2MB are not something you can configure; the practical lever is Flush. When each flush carries only a small chunk, a longer interval lets chunks fill closer to that target before they are sent, provided your output can handle requests of that size. This means fewer API calls to your Elasticsearch cluster, fewer network connections, and less CPU work for both Fluent Bit and the output. The trade-off is increased latency, as each chunk now represents a larger window of log events.
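The trade-off between request count and latency can be estimated with simple arithmetic. The following is a rough sketch, not a benchmark: `batching_tradeoff` is a hypothetical helper, and the log rate, batch sizes, and 10 ms per-request overhead are assumed values chosen only to show the shape of the curve.

```python
import math


def batching_tradeoff(volume_bytes: int, batch_bytes: int,
                      rate_bytes_per_s: int, per_request_overhead_ms: int) -> dict:
    """Estimate request count, fixed overhead, and fill time for a batch size."""
    requests = math.ceil(volume_bytes / batch_bytes)
    return {
        "requests": requests,
        # Fixed cost paid once per request (connection, headers, bulk parsing).
        "overhead_ms": requests * per_request_overhead_ms,
        # Time a batch needs to fill, i.e. the latency added by batching.
        "fill_time_s": batch_bytes / rate_bytes_per_s,
    }


# One hour of logs at 1 MB/s, with an assumed 10 ms of overhead per request.
hour_volume = 3_600 * 1_000_000
small = batching_tradeoff(hour_volume, 250_000, 1_000_000, 10)
large = batching_tradeoff(hour_volume, 2_000_000, 1_000_000, 10)
print(small)
print(large)
```

Under these assumptions, batches eight times larger cut the request count and the fixed overhead by a factor of eight, while the time each batch takes to fill grows from 0.25 s to 2 s.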

The key is to find the sweet spot where your output can ingest data without being overwhelmed, while minimizing the overhead Fluent Bit incurs by sending data in excessively small chunks.

The next logical step after tuning buffer sizes is to investigate how Fluent Bit’s internal threading and I/O management interact with your output’s concurrency settings.
