Fluentd became the default logging agent for many Kubernetes clusters, but you still see td-agent in older setups or specific distributions, and you may wonder what the deal is. The core difference isn’t architectural: td-agent is a bundled, opinionated distribution of Fluentd maintained by Treasure Data, whereas Fluentd is the open-source project itself.

Let’s see it in action. Imagine you have a simple web server spitting out logs.

Scenario 1: Using td-agent (The "Batteries Included" Approach)

If you installed td-agent on a server, your configuration typically lives in /etc/td-agent/td-agent.conf.

# /etc/td-agent/td-agent.conf

<source>
  @type tail
  path /var/log/nginx/access.log
  pos_file /var/log/td-agent/nginx.pos
  tag nginx.access
  <parse>
    @type nginx
  </parse>
</source>

<match nginx.**>
  @type stdout
</match>

When you start td-agent (e.g., sudo systemctl start td-agent), it reads this configuration. The tail input plugin watches /var/log/nginx/access.log and records how far it has read in pos_file. The <parse> block tells it to interpret each line using the nginx parser. Finally, the <match> directive sends any event whose tag matches nginx.** (which nginx.access does) to stdout, meaning it’s printed to your terminal or, when td-agent runs as a service, written to td-agent’s own log file.

Scenario 2: Using Fluentd (The "Build Your Own" Approach)

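If you’re using core Fluentd directly, you manage more of the setup yourself. Core Fluentd is distributed as a Ruby gem (gem install fluentd) and can also be built from source; the official deb and rpm packages are the td-agent distribution. No packaged layout is imposed on you, so you choose where the configuration lives and pass that path at startup. This example uses /etc/fluentd/fluentd.conf, and a larger setup might split the configuration into multiple files under /etc/fluentd/conf.d/.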

Let’s say you keep everything in a single fluentd.conf:

# /etc/fluentd/fluentd.conf

<source>
  @id input_nginx_access
  @type tail
  path /var/log/nginx/access.log
  pos_file /var/log/fluentd/nginx.pos
  tag nginx.access
  <parse>
    @type nginx
  </parse>
</source>

<match nginx.**>
  @id output_stdout
  @type stdout
</match>

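You’d then start Fluentd with: fluentd -c /etc/fluentd/fluentd.conf. The behavior is identical to the td-agent example: tailing a file, parsing it, and sending it to stdout. The only additions are the optional @id parameters, which give each plugin a stable name in Fluentd’s own logs and monitoring output.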
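If you would rather use the split layout mentioned earlier, Fluentd’s @include directive can pull in the files under conf.d. The following is a minimal sketch, assuming the <source> block has been moved into a file named sources.conf; the file names are illustrative choices, not something Fluentd requires.

```text
# /etc/fluentd/fluentd.conf
# Pull in every .conf file under conf.d (the path is relative to this file).
@include conf.d/*.conf

<match nginx.**>
  @id output_stdout
  @type stdout
</match>

# /etc/fluentd/conf.d/sources.conf  (hypothetical file name)
<source>
  @id input_nginx_access
  @type tail
  path /var/log/nginx/access.log
  pos_file /var/log/fluentd/nginx.pos
  tag nginx.access
  <parse>
    @type nginx
  </parse>
</source>
```

Files matched by a glob are included in alphabetical order, so anything order-sensitive, such as <match> blocks, is usually safer kept in the main file or included explicitly by name.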

The Mental Model: Fluentd as the Engine, td-agent as the Car

Fluentd is the core engine. It’s a highly extensible, plugin-driven data collector. It defines the fundamental concepts:

  • Sources: Where logs come from (files, network sockets, other services).
  • Filters: How to modify logs in transit (add/remove fields, rewrite values, drop messages).
  • Outputs (Matches): Where logs go (files, Elasticsearch, S3, Kafka, stdout). Each output is configured in a <match> block.
  • Tags: A hierarchical label system (like nginx.access, apache.error, system.syslog) that routes logs through the system.
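To see how these four concepts fit together, here is a small sketch of a complete pipeline that adds two filters to the nginx example. It assumes the field names produced by the built-in nginx parser (such as path), and the /healthz endpoint is a hypothetical health-check URL chosen purely for illustration.

```text
# Source: tail the access log and tag each event nginx.access.
<source>
  @type tail
  path /var/log/nginx/access.log
  pos_file /var/log/fluentd/nginx.pos
  tag nginx.access
  <parse>
    @type nginx
  </parse>
</source>

# Filter 1: drop health-check requests.
# "path" is the request-path field set by the nginx parser;
# /healthz is a hypothetical endpoint.
<filter nginx.access>
  @type grep
  <exclude>
    key path
    pattern /^\/healthz/
  </exclude>
</filter>

# Filter 2: add the collecting host's name to every remaining record.
<filter nginx.**>
  @type record_transformer
  <record>
    hostname "#{Socket.gethostname}"
  </record>
</filter>

# Output: print whatever survives the filters.
<match nginx.**>
  @type stdout
</match>
```

Filters run in the order they appear and must come before the <match> block that consumes the event, since an event stops travelling through the configuration once an output has taken it.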

td-agent is a distribution that bundles Fluentd with its own Ruby runtime, a curated set of plugins, and sensible defaults, aimed at making general-purpose server logging work out of the box. It ships plugins for common destinations such as Elasticsearch, Kafka, and S3, and its default configuration file is laid out for common use cases. td-agent also includes a systemd service or init script for easier management.

What Changed?

The primary "change" is the increasing adoption of the core Fluentd project, especially within cloud-native environments like Kubernetes. Kubernetes logging often involves deploying Fluentd as a DaemonSet, where each node runs a Fluentd instance to collect logs from containers on that node. In this context, you’re usually running a container image built on core Fluentd rather than installing the td-agent bundle. The td-agent package line itself has since been continued under the name fluent-package, maintained by the Fluentd project, so check the current Fluentd documentation before starting a new td-agent install. Meanwhile, the open-source Fluentd project has gained broader community traction and integration into other projects.
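As a rough sketch of what the per-node configuration in such a DaemonSet can look like, the source below tails the container log files that the node exposes under /var/log/containers. It assumes the logs are in Docker’s JSON format; nodes using a CRI runtime such as containerd write a different line format and need a different parser. The pos_file location is an arbitrary choice.

```text
<source>
  @type tail
  path /var/log/containers/*.log
  pos_file /var/log/fluentd-containers.log.pos
  # The "*" is replaced by the file path, with "/" turned into ".".
  tag kubernetes.*
  read_from_head true
  <parse>
    # Assumption: Docker json-file log format.
    @type json
  </parse>
</source>

<match kubernetes.**>
  # Placeholder output: swap in a real destination plugin.
  # Avoid stdout here, because Fluentd's own container log would be
  # tailed by the source above and fed back into the pipeline.
  @type null
</match>
```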

Which to Use?

For new deployments, especially in Kubernetes or when you need fine-grained control over your logging pipeline and want to leverage the latest community plugins, Fluentd is generally the way to go. You install the core Fluentd package and build your configuration from there.

For simpler, traditional server logging scenarios where you want a quick setup with common destinations pre-configured, td-agent remains a solid choice. It’s often easier to get started with its bundled plugins and default configuration. If you’re working with an older system that already uses td-agent, it’s perfectly fine to continue using it.

The underlying mechanics of how logs are collected, parsed, tagged, and routed are the same because td-agent is Fluentd, just with a specific packaging and set of defaults. The choice often boils down to your environment, your need for pre-built integrations, and your preference for a "batteries-included" versus a "build-it-yourself" approach.

The most surprising aspect is how the tagging system, which seems like a simple labeling mechanism, forms the backbone of Fluentd’s routing logic. A single source can emit logs with multiple tags, and a single match block can capture logs from multiple tags using wildcards (* matches exactly one tag part, ** matches zero or more). Because each event goes to the first match block whose pattern fits its tag, the order of the blocks matters. The result is a surprisingly flexible, declarative routing graph without explicit conditional logic in the configuration itself.
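A short sketch makes the routing behavior concrete. The tags other than nginx.access and the output path are hypothetical, picked only to show how patterns and ordering interact.

```text
# <match> blocks are checked top to bottom; an event is handed to the
# first block whose pattern fits its tag and goes no further.

# Most specific first: only events tagged exactly nginx.access.
<match nginx.access>
  @type stdout
</match>

# Everything else under nginx (e.g. nginx.error) plus anything under
# apache. Several patterns in one block are separated by spaces.
<match nginx.** apache.**>
  @type file
  path /var/log/fluentd/web
</match>

# Catch-all: discard whatever was not matched above.
<match **>
  @type null
</match>
```

If the catch-all block were moved to the top, it would swallow every event and the two blocks below it would never receive anything, which is why broader patterns belong at the bottom.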

The next concept to dive into is the rich ecosystem of Fluentd plugins, particularly those for outputting to distributed tracing systems.
