Fluentd is deceptively simple, but its syslog input treats TCP and UDP quite differently: over TCP it keeps a connection open per client and reads a stream of messages from it, while over UDP it handles each datagram as an independent message with no connection state at all.
Let’s see it in action. Imagine you have a syslog client forwarding messages to a Fluentd collector.
Scenario: Sending logs from a Linux rsyslog client to Fluentd
First, on your Fluentd collector, you need the in_syslog plugin. It ships with Fluentd itself, so there is nothing extra to install.
Now, configure Fluentd to listen for syslog messages. Here’s a fluentd.conf snippet:
<source>
  @type syslog
  port 5140
  bind 0.0.0.0
  protocol_type tcp
  tag syslog.tcp
</source>
<source>
  @type syslog
  port 5141
  bind 0.0.0.0
  protocol_type udp
  tag syslog.udp
</source>
<match syslog.**>
  @type stdout
</match>
Start Fluentd with this configuration:
/usr/local/bin/fluentd -c fluentd.conf
On your rsyslog client (e.g., a Linux server), configure /etc/rsyslog.conf to forward messages:
For TCP:
*.* @@your_fluentd_collector_ip:5140
For UDP:
*.* @your_fluentd_collector_ip:5141
Restart the rsyslog service:
systemctl restart rsyslog
Now, any log message generated on the rsyslog client will appear in Fluentd’s output, tagged with the configured prefix followed by the message’s facility and severity (e.g., syslog.tcp.auth.info or syslog.udp.kern.info). The sender’s hostname ends up in the record’s host field, not in the tag. You’ll see output roughly like this in your Fluentd terminal (the exact facility and severity depend on what the client sends):
2023-10-27 10:30:00.000000000 +0000 syslog.tcp.auth.info: {"host":"myclient","ident":"sshd","pid":"12345","message":"Accepted publickey for user from 192.168.1.100 port 54321 ssh2"}
2023-10-27 10:30:01.000000000 +0000 syslog.udp.kern.info: {"host":"myclient","ident":"kernel","message":"[12345.67890] eth0: Link up"}
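To exercise both listeners without waiting for real traffic, you can hand-craft a message. The sketch below is a minimal Python example and not part of Fluentd or rsyslog: the helper names (build_rfc3164, send_udp, send_tcp) are made up for illustration, and it assumes the collector accepts newline-terminated RFC 3164 style lines on the ports configured above.

```python
import socket
from datetime import datetime
from typing import Optional


def build_rfc3164(facility: int, severity: int, host: str, ident: str,
                  text: str, when: Optional[datetime] = None) -> bytes:
    """Return one RFC 3164 style syslog line, terminated by a newline."""
    when = when or datetime.now()
    # PRI encodes both values in one number: facility * 8 + severity.
    pri = facility * 8 + severity
    # RFC 3164 pads single-digit days with a space ("Oct  7"), not a zero.
    stamp = f"{when.strftime('%b')} {when.day:2d} {when.strftime('%H:%M:%S')}"
    return f"<{pri}>{stamp} {host} {ident}: {text}\n".encode("utf-8")


def send_udp(payload: bytes, host: str, port: int) -> None:
    """Send the payload as a single datagram; delivery is not confirmed."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))


def send_tcp(payload: bytes, host: str, port: int) -> None:
    """Open a connection, write the payload, and close the connection."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(payload)
```

For example, send_tcp(build_rfc3164(4, 6, "myclient", "sshd", "test message"), "your_fluentd_collector_ip", 5140) should produce one event on the TCP source, and passing the same payload to send_udp with port 5141 should produce one on the UDP source. Facility 4 and severity 6 correspond to auth and info.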
The core problem this solves is centralizing logs from diverse sources that might already be using the syslog protocol. Instead of reconfiguring every application to send logs in a different format, you can have Fluentd act as a universal syslog receiver.
Internally, the in_syslog plugin parses messages with Fluentd’s own syslog parser, which understands both RFC 3164 and RFC 5424 formats (selected with message_format, which defaults to rfc3164). For TCP, it maintains a connection for each client, reading newline-delimited messages until the connection closes or errors. For UDP, there is no connection: it listens on the port and processes incoming datagrams as they arrive, and a datagram lost in transit is simply gone. The protocol_type parameter is critical here, dictating whether it accepts TCP connections or just listens for UDP packets; it defaults to udp, and recent Fluentd releases deprecate it in favor of a <transport tcp> or <transport udp> section.
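The difference between the two transports is easiest to see in how message boundaries are found. The following Python sketch is an illustration only, not Fluentd’s actual code: TcpFramer and handle_datagram are hypothetical names, and it assumes newline-delimited framing on TCP, which is what rsyslog sends by default with @@.

```python
from typing import List


class TcpFramer:
    """Per-connection state: bytes received so far that do not yet end a message."""

    def __init__(self) -> None:
        self._pending = b""

    def feed(self, chunk: bytes) -> List[bytes]:
        """Add a chunk read from the socket and return every completed message."""
        self._pending += chunk
        # Everything before the last newline is complete; the tail waits for more data.
        *complete, self._pending = self._pending.split(b"\n")
        return [message for message in complete if message]


def handle_datagram(datagram: bytes) -> bytes:
    """UDP needs no buffering: one datagram is one message."""
    return datagram.rstrip(b"\n")
```

A TCP read can end in the middle of a message, so the framer has to remember the unfinished tail between reads. That remembered tail is the per-connection state; the UDP path has nothing to remember.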
The tag parameter isn’t the final tag; it’s a prefix. in_syslog appends the facility and severity decoded from each message’s priority value. So, syslog.tcp combined with an auth-facility, info-level message becomes syslog.tcp.auth.info. This allows for granular routing and filtering in your <match> directives, for example sending everything under syslog.*.auth.* to a separate destination. The sender’s hostname is available in the record’s host field if you need to route on it.
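Here is a sketch of how a tag such as syslog.tcp.auth.info can be derived from the leading <PRI> value of a message. It is written in Python for illustration and is not Fluentd’s code; derive_tag is a hypothetical helper, and the facility and severity names follow common syslog conventions, so treat the exact spellings as an assumption.

```python
import re
from typing import Tuple

# Index in each list is the numeric code carried in the PRI value.
FACILITIES = [
    "kern", "user", "mail", "daemon", "auth", "syslog", "lpr", "news",
    "uucp", "cron", "authpriv", "ftp", "ntp", "audit", "alert", "at",
    "local0", "local1", "local2", "local3", "local4", "local5", "local6", "local7",
]
SEVERITIES = ["emerg", "alert", "crit", "err", "warn", "notice", "info", "debug"]

PRI_RE = re.compile(r"<(\d{1,3})>(.*)", re.DOTALL)


def derive_tag(prefix: str, line: str) -> Tuple[str, str]:
    """Return (tag, rest_of_message) for a line that starts with <PRI>."""
    match = PRI_RE.match(line)
    if match is None:
        raise ValueError("message does not start with a <PRI> value")
    # PRI = facility * 8 + severity, so divmod recovers both parts.
    facility, severity = divmod(int(match.group(1)), 8)
    if facility >= len(FACILITIES):
        raise ValueError("facility code out of range")
    tag = f"{prefix}.{FACILITIES[facility]}.{SEVERITIES[severity]}"
    return tag, match.group(2)
```

A message beginning with <38> decodes to facility 4 and severity 6, so with the prefix syslog.tcp the resulting tag is syslog.tcp.auth.info.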
What most people miss is how the in_syslog plugin behaves once you step outside the standard syslog format. You can swap the parser, for example with format none, so the message body is kept as a single unparsed string, and you can raise message_length_limit (which defaults to 2048 bytes) to something like 1048576 to accommodate very long lines. The input still expects each message to begin with a <PRI> value such as <13>, though, so it will not accept arbitrary text. For non-standard log sources that just spit out bare lines of text, the in_tcp and in_udp inputs with a none parser are the better fit: they treat each line (TCP) or datagram (UDP) as a distinct event. You just need to ensure your sender is configured correctly to send line-delimited data.
The next concept you’ll likely encounter is how to handle message parsing and structuring once Fluentd has received these raw syslog strings.