Fluent Bit, a log processor, is failing to send logs to Datadog because the Datadog agent is not properly configured to receive them.
Here are the common causes and how to fix them:
- Datadog Agent Input Plugin Not Enabled or Misconfigured: Fluent Bit needs a reachable endpoint to send logs to. The Datadog agent, running on your infrastructure, typically acts as a local aggregator and forwarder. If its Fluent Bit input isn’t active, or is listening on a different address or port than the one Fluent Bit targets, Fluent Bit’s connection attempts will fail.
  - Diagnosis: Check the Datadog agent configuration file. On Linux systems, this is usually /etc/datadog-agent/datadog.yaml. Look for a section related to fluentbit or logs.
  - Fix: Ensure the fluentbit integration is enabled. You’ll need to add or uncomment a section like this in datadog.yaml (confirm the exact key names against the documentation for your Agent version):

        logs_enabled: true
        fluentbit:
          enabled: true
          port: 24224 # This is the default port Fluent Bit sends to
          # You might also see a 'host' parameter, usually '127.0.0.1'

    After modifying datadog.yaml, restart the Datadog agent: sudo systemctl restart datadog-agent. This tells the Datadog agent to listen for Fluent Bit traffic on the specified port.
- Fluent Bit Output Plugin Misconfigured: Fluent Bit itself needs to be told to send logs to the Datadog agent. If its output plugin is missing, has the wrong address, or is using incorrect authentication, delivery will fail.
  - Diagnosis: Examine your Fluent Bit configuration files (often /etc/fluent-bit/fluent-bit.conf or /etc/td-agent-bit/td-agent-bit.conf). Look for an [OUTPUT] section of type forward or datadog.
  - Fix: Ensure you have an output plugin configured to send to the Datadog agent. A common setup uses the forward plugin to send to the agent’s local endpoint.

        [OUTPUT]
            Name        forward
            Match       *
            Host        127.0.0.1
            Port        24224
            Retry_Limit false

    Alternatively, if you are using Fluent Bit’s datadog output plugin to send directly to Datadog’s intake, the configuration will look different and will include your Datadog API key. Comments in the classic configuration format must sit on their own line.

        [OUTPUT]
            Name    datadog
            Match   *
            # Or your Datadog EU endpoint
            Host    http-intake.logs.datadoghq.com
            Port    443
            TLS     On
            apikey  YOUR_DATADOG_API_KEY

    Restart Fluent Bit after changes: sudo systemctl restart fluent-bit. This ensures Fluent Bit attempts to send logs to the configured endpoint.
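
The diagnosis step for the output plugin can be scripted. The following is a minimal sketch, assuming the classic (INI-style) Fluent Bit configuration format; the function names parse_outputs and datadog_bound_outputs are hypothetical helpers, not part of Fluent Bit. It does not follow @INCLUDE directives or handle the YAML configuration format.

```python
from typing import Dict, List


def parse_outputs(config_text: str) -> List[Dict[str, str]]:
    """Return one dict per [OUTPUT] section, with lower-cased keys."""
    outputs: List[Dict[str, str]] = []
    current = None  # dict for the [OUTPUT] section being read, if any
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and full-line comments
        if line.startswith("[") and line.endswith("]"):
            # A new section starts; only [OUTPUT] sections are collected.
            if line[1:-1].strip().upper() == "OUTPUT":
                current = {}
                outputs.append(current)
            else:
                current = None
            continue
        if current is not None:
            parts = line.split(None, 1)  # key first, the rest is the value
            current[parts[0].lower()] = parts[1].strip() if len(parts) > 1 else ""
    return outputs


def datadog_bound_outputs(config_text: str) -> List[Dict[str, str]]:
    """Keep only outputs whose Name is 'forward' or 'datadog'."""
    return [
        out for out in parse_outputs(config_text)
        if out.get("name", "").lower() in ("forward", "datadog")
    ]
```

Reading the configuration file and passing its text to datadog_bound_outputs returns the matching sections, so an empty list suggests that no output is pointed at the agent or at Datadog's intake.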
- Network Connectivity Issues Between Fluent Bit and Datadog Agent: Even if configured correctly, Fluent Bit can’t send logs if it can’t reach the Datadog agent on the network. This is common if Fluent Bit and the Datadog agent are running in different containers or on different hosts without proper network routing.
  - Diagnosis: From the host/container running Fluent Bit, ping the host where the Datadog agent is listening, then telnet to its host and port (e.g., telnet 127.0.0.1 24224) to confirm the port itself accepts connections.
  - Fix: If telnet fails, check your network configuration. This might involve adjusting Docker network settings, Kubernetes network policies, firewall rules (e.g., ufw or iptables), or cloud provider security groups. Ensure that the port (default 24224) is open and accessible from the Fluent Bit process to the Datadog agent process.
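
Where telnet is not installed (common in slim container images), the same port check can be approximated with Python's standard library. This is a minimal sketch; the function name can_connect is a hypothetical helper, and the host and port are the example values used above.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable or unreachable hosts.
        return False
```

Calling can_connect("127.0.0.1", 24224) from the Fluent Bit host or container returns False when nothing is listening or when a firewall drops the traffic. A True result only shows that the port accepts TCP connections, not that logs are being processed.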
- Datadog Agent Not Running or Crashed: If the Datadog agent process is not active, there’s no listener for Fluent Bit to connect to.
  - Diagnosis: Check the status of the Datadog agent. On most Linux systems: sudo systemctl status datadog-agent.
  - Fix: Start or restart the Datadog agent. If it’s not active, run sudo systemctl start datadog-agent. Ensure the agent service is enabled to start on boot: sudo systemctl enable datadog-agent.
- Incorrect Log Collection Configuration in Datadog: While not strictly a sending error, if Datadog isn’t configured to accept and process the logs from Fluent Bit, it will appear as if they aren’t arriving. This is more about downstream processing.
  - Diagnosis: In your Datadog account, navigate to Logs -> Configuration -> Pipelines. Check if a pipeline is configured to process the logs coming from your Fluent Bit source. Also, check Log Indexes and Processing Rules.
  - Fix: Ensure you have a Datadog pipeline set up to handle the logs. This might involve creating a new pipeline or modifying an existing one to match the source and attributes of the logs being sent by Fluent Bit. For example, you might need to set a service tag on your logs in Fluent Bit and have a Datadog pipeline that targets that service.
- TLS/SSL Configuration Mismatch: If you’ve configured TLS for the communication between Fluent Bit and the Datadog agent (or Datadog’s intake directly), an incorrect certificate, cipher mismatch, or a disabled TLS setting will cause connection failures.
  - Diagnosis: Check both Fluent Bit and Datadog agent configurations for any TLS-related parameters. Look for errors in Fluent Bit logs or Datadog agent logs mentioning TLS handshake failures.
  - Fix: Ensure that if TLS is enabled on one side, it’s also enabled on the other, and that the correct certificates and keys are in place and readable by each process. For direct Datadog intake, ensure TLS On is set in the Fluent Bit output. For Fluent Bit-to-agent communication, both the agent input and the Fluent Bit output need to agree on TLS settings and have valid certificates.
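
To separate a TLS problem from a plain connectivity problem, it can help to attempt a handshake outside Fluent Bit. The sketch below uses Python's ssl module with default certificate verification; tls_handshake is a hypothetical helper name, and the sketch assumes the endpoint presents a certificate that chains to a CA in the system trust store (true for a public intake endpoint, usually not for a self-signed agent certificate).

```python
import socket
import ssl
from typing import Tuple


def tls_handshake(host: str, port: int = 443, timeout: float = 5.0) -> Tuple[bool, str]:
    """Try a verified TLS handshake; return (ok, protocol version or error text)."""
    context = ssl.create_default_context()  # verifies certificate chain and hostname
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return True, tls.version() or "unknown"
    except ssl.SSLError as exc:
        # Handshake-level failures, including certificate verification errors.
        return False, f"TLS error: {exc}"
    except OSError as exc:
        # The TCP connection itself failed (refused, timed out, DNS failure).
        return False, f"connection error: {exc}"
```

A result starting with "connection error" points back to the network checks, while "TLS error" points to certificates or protocol settings. For example, tls_handshake("http-intake.logs.datadoghq.com") exercises the direct-intake endpoint named earlier.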
- Datadog API Key or Site Configuration Error (for direct intake): If your Fluent Bit is configured to send logs directly to Datadog’s intake API (bypassing the agent), an incorrect API key or specifying the wrong Datadog site (e.g., US vs. EU) will prevent logs from being accepted.
  - Diagnosis: Review the API key (apikey) and Host parameters in your Fluent Bit output configuration.
  - Fix: Verify your Datadog API key is correct and that the Host parameter matches your Datadog account’s region (e.g., http-intake.logs.datadoghq.com for US1, http-intake.logs.datadoghq.eu for EU1).
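
An API key can be checked independently of Fluent Bit. The sketch below assumes Datadog's key-validation endpoint (GET /api/v1/validate on api.<site>, authenticated with the DD-API-KEY header); the helper names and the INTAKE_HOSTS table are illustrative, and only the two sites mentioned above are included.

```python
import json
import urllib.error
import urllib.request

# Intake hosts named in the text above, keyed by Datadog site.
INTAKE_HOSTS = {
    "datadoghq.com": "http-intake.logs.datadoghq.com",  # US1
    "datadoghq.eu": "http-intake.logs.datadoghq.eu",    # EU1
}


def build_validate_request(api_key: str, site: str = "datadoghq.com") -> urllib.request.Request:
    """Build (but do not send) a key-validation request for the given site."""
    if site not in INTAKE_HOSTS:
        raise ValueError(f"unknown site: {site!r}")
    return urllib.request.Request(
        f"https://api.{site}/api/v1/validate",
        headers={"DD-API-KEY": api_key},
        method="GET",
    )


def key_is_valid(api_key: str, site: str = "datadoghq.com", timeout: float = 10.0) -> bool:
    """Send the validation request; an HTTP error status counts as invalid."""
    try:
        request = build_validate_request(api_key, site)
        with urllib.request.urlopen(request, timeout=timeout) as resp:
            return bool(json.load(resp).get("valid"))
    except urllib.error.HTTPError:
        return False
```

If key_is_valid returns True for one site and False for the other, the key itself is fine and the Host value in the Fluent Bit output is the more likely culprit; INTAKE_HOSTS[site] gives the value to compare against.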
After resolving these, the next error you might encounter is a 400 Bad Request from Datadog’s intake API if the log payload itself is malformed. It is worth checking at the same time that your logs carry attributes like service and source, since pipelines commonly match on them.