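The most surprising thing about sending Fluentd logs to Datadog is that the output plugin (fluent-plugin-datadog) is essentially a buffered, asynchronous HTTP client: it collects events into batches and posts them to Datadog's log intake endpoint, rather than sending each line the moment it arrives.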

Let’s see it in action. Imagine you have a simple Fluentd configuration file (fluent.conf) like this:

<source>
  @type tail
  path /var/log/myapp.log
  tag myapp.logs
  <parse>
    @type json
  </parse>
</source>

<match myapp.logs>
  @type datadog
  api_key <YOUR_DATADOG_API_KEY>
  dd_source myapp
  dd_tags environment:production,service:webserver
</match>

When Fluentd reads a line from /var/log/myapp.log such as {"message": "User logged in", "user_id": "123"}, the tail source parses it as JSON and tags it myapp.logs, so the <match myapp.logs> block picks it up. The datadog output plugin then adds Datadog-specific fields to the structured event, serializes it as JSON, and places it in a buffer for sending.

Internally, the datadog plugin relies on Fluentd's standard output buffering. In current versions of the plugin the default buffer is in-memory, which rides out brief outages but is lost if the Fluentd process stops. For durability, configure a file-based buffer: chunks are then written to local disk, so if Datadog is temporarily unreachable, Fluentd keeps them and resumes sending once the endpoint is reachable again. This protects against data loss as long as the buffer's size limits are not exceeded.
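
To make the on-disk behavior explicit rather than depend on the plugin's default, the buffer type can be set in the <buffer> section. The fragment below is a minimal sketch using Fluentd's standard buffer parameters; the buffer path and the size limit are illustrative assumptions, so adjust them to your host.

```
<match myapp.logs>
  @type datadog
  api_key <YOUR_DATADOG_API_KEY>
  dd_source myapp
  <buffer>
    # Persist chunks on disk instead of in memory.
    @type file
    # Hypothetical directory; it must be writable by the Fluentd user.
    path /var/log/fluentd/buffer/datadog
    # Try to flush queued chunks every 5 seconds.
    flush_interval 5s
    # Keep retrying failed chunks instead of discarding them.
    retry_forever true
    # Cap disk usage; once the cap is reached, overflow_action decides what happens to new events.
    total_limit_size 1GB
  </buffer>
</match>
```

With retry_forever enabled, a long outage fills the buffer up to total_limit_size, so the cap should reflect how much backlog the disk can actually hold.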

The plugin’s configuration options give you fine-grained control. api_key is your Datadog API key, which authenticates the logs against your account. dd_source sets the ddsource attribute, the primary identifier Datadog uses for where a log comes from (e.g., webserver, database, worker); matching it to an integration name such as nginx lets Datadog apply that integration's processing pipeline. dd_tags attaches a comma-separated list of key:value tags to every log event sent, which enables filtering and aggregation within Datadog. You can add service, env, region, or any custom tag relevant to your infrastructure.
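
As a sketch of how these options fit together, the fragment below reads the API key from an environment variable instead of hard-coding it. It relies on Fluentd's support for embedded Ruby in double-quoted values; the variable name DD_API_KEY and the tag values are assumptions for illustration. The service option comes from the plugin's README, so confirm it against your installed version.

```
<match myapp.logs>
  @type datadog
  # Assumed environment variable; it must be set for the Fluentd process.
  api_key "#{ENV['DD_API_KEY']}"
  # Source name attached to every event.
  dd_source myapp
  # Dedicated option for the service name.
  service webserver
  # Comma-separated key:value tags applied to every event.
  dd_tags env:production,region:us-east-1
</match>
```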

Here’s a peek at the kind of JSON payload the plugin constructs for Datadog:

{
  "message": "User logged in",
  "user_id": "123",
  "ddsource": "myapp",
  "ddtags": "environment:production,service:webserver",
  "hostname": "your-fluentd-host",
  "@timestamp": "2023-10-27T10:30:00.000Z"
  // ... other metadata
}

Notice how the original log fields (message, user_id) are preserved, while ddsource and ddtags are injected from the configuration. The hostname field is added by the plugin, defaulting to the hostname of the machine running Fluentd.

Beyond basic log forwarding, the plugin supports HTTP compression (use_compression) and TLS encryption (use_ssl) for efficient and secure data transfer. Retry behavior is tunable as well: max_retries sets how many times a failed send is retried, and max_backoff caps the wait between attempts. Flush timing comes from Fluentd's buffer settings. For instance, flush_interval 5s in the <buffer> section means Fluentd attempts to send buffered logs every 5 seconds, balancing latency against network efficiency.
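
A configuration that spells these transport settings out might look like the sketch below. The option names and values follow the plugin's README as I understand it (they mirror its documented defaults), so treat them as assumptions to verify against the version you have installed.

```
<match myapp.logs>
  @type datadog
  api_key <YOUR_DATADOG_API_KEY>
  dd_source myapp
  # Send over HTTPS with gzip-compressed request bodies.
  use_http true
  use_ssl true
  use_compression true
  # 1 to 9; higher values trade CPU for a better compression ratio.
  compression_level 6
  # -1 retries indefinitely; max_backoff caps the wait between attempts, in seconds.
  max_retries -1
  max_backoff 30
  <buffer>
    # Attempt to send buffered logs every 5 seconds.
    flush_interval 5s
  </buffer>
</match>
```

A shorter flush_interval lowers delivery latency at the cost of more, smaller requests.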

A note on dd_prefix: it is sometimes described as prepending a value to the ddsource field, so that dd_source myapp with dd_prefix staging would appear in Datadog as ddsource: staging.myapp. That option is not among the parameters documented in the fluent-plugin-datadog README, so verify it against your installed version before relying on it. For environments with multiple identical services deployed across different stages, the documented route is a tag such as env:staging in dd_tags, which lets you differentiate the stages cleanly within Datadog without altering the core dd_source value.
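
Stages can also be separated with tags alone, leaving dd_source untouched. The sketch below assumes each deployment sets an environment variable, here called DEPLOY_ENV (a hypothetical name), and uses Fluentd's embedded Ruby in double-quoted values to turn it into an env tag.

```
<match myapp.logs>
  @type datadog
  api_key <YOUR_DATADOG_API_KEY>
  dd_source myapp
  # DEPLOY_ENV is an assumed variable, e.g. "staging" or "production".
  dd_tags "env:#{ENV['DEPLOY_ENV']},service:webserver"
</match>
```

The same file can then be shipped to every stage, and logs can be filtered in Datadog by env while ddsource stays myapp everywhere.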

The next step in a robust logging pipeline often involves integrating metrics derived from these logs, using Datadog’s log-based metrics feature.
