Splunk’s HTTP Event Collector (HEC) is a surprisingly flexible way to get logs from Fluentd into Splunk, but it’s not a simple push-and-forget.
Let’s see it in action. Imagine you have a web server spitting out access logs. You want these to go straight to Splunk for analysis.
Here’s a minimal fluent.conf to get this done:
<source>
  @type tail
  path /var/log/nginx/access.log
  pos_file /var/log/td-agent/nginx-access.log.pos
  tag nginx.access
  <parse>
    @type nginx
  </parse>
</source>

<match nginx.access>
  @type http
  endpoint https://your-splunk-hec-host:8088/services/collector
  # The http output has no token parameter; HEC reads the token from this header.
  headers {"Authorization":"Splunk YOUR_SPLUNK_HEC_TOKEN"}
  <buffer>
    flush_interval 10s
    chunk_limit_size 10m
  </buffer>
</match>
This setup does a few things:
- Source: The `<source>` block tells Fluentd to watch `/var/log/nginx/access.log` for new lines. The `tail` plugin keeps track of its position in `nginx-access.log.pos` so it doesn’t re-read old data. The `tag nginx.access` line labels these logs. The `<parse>` block tells Fluentd to interpret these lines as Nginx logs.
- Match: The `<match nginx.access>` block says "if a log record has the tag `nginx.access`, do this."
- HTTP Output: `@type http` specifies that we’re sending these logs over HTTP.
- Endpoint & Token: `endpoint` is the URL for your Splunk HEC, and the token is your authentication key, which HEC expects in an `Authorization: Splunk <token>` header.
- Buffering: The `<buffer>` section is crucial. It tells Fluentd to collect logs for up to 10 seconds (`flush_interval`) or until a chunk reaches 10MB (`chunk_limit_size`) before sending them to Splunk. This batching avoids making one HTTP request per log line.
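To make the parsing step concrete, here is a minimal Python sketch of what a parser like `nginx` does to one access-log line. The regex is a simplified approximation written for illustration, not the plugin’s actual pattern, and the sample line and the `parse_access_line` helper are hypothetical.

```python
import re

# Simplified approximation of an nginx access-log pattern (an assumption for
# illustration; the real Fluentd parser handles more edge cases).
NGINX_LINE = re.compile(
    r'^(?P<remote>[^ ]*) (?P<host>[^ ]*) (?P<user>[^ ]*) '
    r'\[(?P<time>[^\]]*)\] '
    r'"(?P<method>\S+) (?P<path>[^" ]*)[^"]*" '
    r'(?P<code>[^ ]*) (?P<size>[^ ]*)'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def parse_access_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = NGINX_LINE.match(line)
    return match.groupdict() if match else None


# Hypothetical sample line in the default nginx "combined" layout.
sample = (
    '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] '
    '"GET /index.html HTTP/1.1" 200 612 "-" "curl/8.0.1"'
)
record = parse_access_line(sample)
```

Each named group becomes a field on the record, which is what later ends up as a searchable field in Splunk. Lines that do not match yield no record here.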
The problem this solves is bridging the gap between applications generating logs in various formats and Splunk’s structured ingestion. Fluentd acts as a universal adapter, parsing, filtering, and transforming logs before sending them to Splunk via HEC.
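Internally, the `http` output plugin buffers log events in chunks. When a chunk reaches `chunk_limit_size` or `flush_interval` elapses, the plugin sends the chunk’s events to the Splunk HEC endpoint in a single HTTP POST request. Splunk then indexes these events. The `/services/collector` endpoint expects each event wrapped in a JSON object with an `event` key, so records may need to be reshaped (for example with a filter) before Splunk accepts them.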
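The flush logic can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the plugin’s real implementation: the `Chunk` class and its methods are hypothetical, and the only HEC detail it relies on is that events may be stacked in one request body, each wrapped in an `event` envelope.

```python
import json
import time


class Chunk:
    """Toy stand-in for a Fluentd buffer chunk (hypothetical, for illustration)."""

    def __init__(self, flush_interval=10.0, chunk_limit_size=10 * 1024 * 1024,
                 clock=time.monotonic):
        self.flush_interval = flush_interval      # seconds, like flush_interval 10s
        self.chunk_limit_size = chunk_limit_size  # bytes, like chunk_limit_size 10m
        self.clock = clock
        self.lines = []
        self.size = 0
        self.created = clock()

    def append(self, record):
        # Wrap each record in the {"event": ...} envelope HEC expects.
        line = json.dumps({"event": record})
        self.lines.append(line)
        # Approximate body size: encoded line plus one newline separator.
        self.size += len(line.encode("utf-8")) + 1

    def should_flush(self):
        # Flush when the chunk is old enough or large enough, never when empty.
        aged = self.clock() - self.created >= self.flush_interval
        full = self.size >= self.chunk_limit_size
        return bool(self.lines) and (aged or full)

    def body(self):
        # One POST body: event envelopes stacked one per line.
        return "\n".join(self.lines).encode("utf-8")
```

Passing the clock in as a parameter keeps the sketch easy to exercise without waiting ten real seconds.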
The exact levers you control are:
- `path`: Where Fluentd looks for logs.
- `tag`: How you categorize and route logs within Fluentd.
- `endpoint`: The address of your Splunk HEC.
- The HEC token: Your authentication key, sent in the `Authorization` header.
- `<parse>`: The format of the incoming logs.
- `<buffer>` settings: How often and how much data is sent in a batch.
A common gotcha is how the token is handled. A HEC token is configured on the Splunk side: it can be enabled or disabled, and it can be restricted to a set of allowed indexes. If the token is mistyped, disabled, or not allowed to write to the target index, Fluentd will successfully connect to the HEC endpoint but Splunk will reject the incoming events with a 4xx error response. This isn’t a network issue; it’s a configuration issue on the Splunk side.
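One way to separate a token problem from a network problem is to send a single test event outside Fluentd. The sketch below uses only the Python standard library; `build_hec_request` and `check_token` are hypothetical helper names, and the endpoint and token are the same placeholders used in the config.

```python
import json
import urllib.error
import urllib.request


def build_hec_request(endpoint, token, event):
    """Build (but do not send) a POST carrying one HEC event envelope."""
    body = json.dumps({"event": event}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Splunk {token}",
            "Content-Type": "application/json",
        },
    )


def check_token(endpoint, token):
    """Send one test event; return (HTTP status, response body text)."""
    req = build_hec_request(endpoint, token, "fluentd hec token check")
    try:
        with urllib.request.urlopen(req, timeout=5) as resp:
            return resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # Splunk was reached but refused the request: a token or index
        # problem rather than a connectivity problem.
        return err.code, err.read().decode("utf-8")
```

Calling `check_token("https://your-splunk-hec-host:8088/services/collector", "YOUR_SPLUNK_HEC_TOKEN")` returns a 2xx status if the token was accepted. A 401 or 403 with a JSON error body means Splunk was reached and rejected the token, while a connection failure surfaces as an unhandled `urllib.error.URLError` instead.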
Once logs are flowing, you’ll likely want to explore how to enrich them with metadata before they hit Splunk.