Splunk’s HTTP Event Collector (HEC) is a surprisingly flexible way to get logs from Fluentd into Splunk, but it’s not a simple push-and-forget.

Let’s see it in action. Imagine you have a web server spitting out access logs. You want these to go straight to Splunk for analysis.

Here’s a minimal fluent.conf to get this done:

<source>
  @type tail
  path /var/log/nginx/access.log
  pos_file /var/log/td-agent/nginx-access.log.pos
  tag nginx.access
  <parse>
    @type nginx
  </parse>
</source>

<match nginx.access>
  @type http
  endpoint https://your-splunk-hec-host:8088/services/collector
  # The http output has no token parameter; HEC reads the token from the Authorization header.
  headers {"Authorization":"Splunk YOUR_SPLUNK_HEC_TOKEN"}
  <buffer>
    flush_interval 10s
    chunk_limit_size 10m
  </buffer>
</match>

This setup does a few things:

  1. Source: The <source> block tells Fluentd to watch /var/log/nginx/access.log for new lines. The tail plugin keeps track of its position in nginx-access.log.pos so it doesn’t re-read old data. The tag nginx.access is a label for these logs. The <parse> block tells Fluentd to interpret these lines as Nginx logs.
  2. Match: The <match nginx.access> block says "if a log record has the tag nginx.access, do this."
  3. HTTP Output: The @type http specifies that we’re sending these logs over HTTP.
  4. Endpoint & Token: endpoint is the URL of your Splunk HEC. The token is your authentication key; because the http output has no token parameter of its own, it is sent through headers as an Authorization header of the form Splunk <token>.
  5. Buffering: The <buffer> section is crucial. It tells Fluentd to collect logs for up to 10 seconds (flush_interval) or until a chunk reaches 10 MB (chunk_limit_size) before sending them to Splunk. Batching cuts the number of HTTP requests, which reduces overhead on both Fluentd and Splunk.

The problem this solves is bridging the gap between applications generating logs in various formats and Splunk’s structured ingestion. Fluentd acts as a universal adapter, parsing, filtering, and transforming logs before sending them to Splunk via HEC.
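As a rough illustration of what the parsing step produces, here is a minimal Python sketch that turns one access-log line into a structured record. The regular expression is a simplification written for this example, not the pattern the nginx parser actually uses, and the function name is hypothetical; the field names are chosen to resemble the ones the parser emits.

```python
import re

# Simplified pattern for the common nginx access-log layout (an assumption
# for this sketch; real logs with a custom log_format will need changes).
NGINX_LINE = re.compile(
    r'^(?P<remote>\S+) (?P<host>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<code>\d{3}) (?P<size>\S+)'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?$'
)


def parse_access_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = NGINX_LINE.match(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    sample = (
        '127.0.0.1 - - [10/Oct/2023:13:55:36 +0000] '
        '"GET /index.html HTTP/1.1" 200 612 "-" "curl/8.0.1"'
    )
    print(parse_access_line(sample))
```

Once a line is a record like this, each field can be searched and filtered on its own in Splunk, which is the point of parsing before shipping.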

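Internally, the http output plugin buffers log events into chunks. When a chunk fills up or flush_interval elapses, the plugin sends that chunk’s events in a single HTTP POST request to the Splunk HEC endpoint, and Splunk indexes them. One caveat: the /services/collector endpoint expects each event wrapped in a JSON object with an event key, so records have to be shaped that way before they leave Fluentd.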
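To make the request shape concrete, here is a minimal Python sketch of the kind of body and headers HEC accepts for a batch of events. It illustrates the wire format only and is not what the http plugin does internally; the helper names and the nginx:access sourcetype are assumptions made for the example.

```python
import json


def build_hec_body(records, sourcetype="nginx:access"):
    """Wrap each record in a HEC envelope and stack the envelopes in one body."""
    envelopes = []
    for record in records:
        # HEC looks for the payload under the "event" key.
        envelope = {"event": record, "sourcetype": sourcetype}
        envelopes.append(json.dumps(envelope, separators=(",", ":")))
    # Several event objects can be sent back to back in a single POST.
    return "\n".join(envelopes)


def build_hec_headers(token):
    """HEC authenticates with an Authorization header of the form 'Splunk <token>'."""
    return {"Authorization": f"Splunk {token}"}


if __name__ == "__main__":
    batch = [
        {"remote": "127.0.0.1", "path": "/index.html", "code": "200"},
        {"remote": "127.0.0.1", "path": "/missing", "code": "404"},
    ]
    print(build_hec_body(batch))
    print(build_hec_headers("YOUR_SPLUNK_HEC_TOKEN"))
```

Sending many envelopes per request is what makes the buffer settings matter: a larger chunk means fewer, bigger POSTs.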

The exact levers you control are:

  • path: Where Fluentd looks for logs.
  • tag: How you categorize and route logs within Fluentd.
  • endpoint: The address of your Splunk HEC.
  • token: Your HEC authentication key, which Splunk expects in the Authorization header.
  • <parse>: The format of the incoming logs.
  • <buffer> settings: How often and how much data is sent in a batch.

A common gotcha is how the token is handled. A HEC token can be mistyped, disabled, or restricted to particular indexes on the Splunk side. In any of those cases Fluentd connects to the HEC endpoint successfully, but Splunk rejects the request with an HTTP error (typically 401 or 403 for token problems, 400 for issues such as a disallowed index) and the events are never indexed. This isn’t a network issue; it’s a token configuration issue on the Splunk side.
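One way to separate a token problem from a network problem is to send a single test event outside Fluentd and look at the status code. The sketch below uses only the Python standard library; the function names are hypothetical, and the mapping from status codes to causes is a rough guide rather than a complete list of HEC responses.

```python
import json
import urllib.error
import urllib.request


def classify_hec_status(status):
    """Map an HTTP status returned by HEC to a rough cause."""
    if status == 200:
        return "accepted"
    if status in (401, 403):
        return "token problem: missing, malformed, invalid, or disabled"
    if status == 400:
        return "request problem: bad payload or disallowed index"
    if status >= 500:
        return "Splunk-side failure or overload"
    return "unexpected status"


def send_test_event(endpoint, token):
    """POST one tiny event and return the HTTP status HEC answers with."""
    body = json.dumps({"event": "fluentd token check"}).encode("utf-8")
    request = urllib.request.Request(
        endpoint,
        data=body,
        headers={"Authorization": f"Splunk {token}"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=5) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # Reaching this branch means the network path works; Splunk refused the request.
        return err.code
```

Passing the result of send_test_event("https://your-splunk-hec-host:8088/services/collector", "YOUR_SPLUNK_HEC_TOKEN") to classify_hec_status shows which side to investigate. A urllib.error.URLError, by contrast, means the request never reached HEC at all (DNS, firewall, or TLS trouble).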

Once logs are flowing, you’ll likely want to explore how to enrich them with metadata before they hit Splunk.
