InfluxDB retention policies are fundamentally a form of time-based data pruning, not a proactive garbage collection mechanism.

Let’s see this in action. Imagine we have a measurement called cpu with a tag host and a field usage_system. We want to keep data for only 7 days.

package main

import (
	"fmt"
	"time"

	client "github.com/influxdata/influxdb1-client/v2"
)

func main() {
	c, err := client.NewHTTPClient(client.HTTPConfig{
		Addr: "http://localhost:8086",
	})
	if err != nil {
		panic(err)
	}
	defer c.Close()

	// Create a database. CREATE DATABASE is idempotent in InfluxDB 1.x,
	// so no "already exists" handling is needed.
	dbName := "test_db"
	dbResp, err := c.Query(client.Query{
		Command: fmt.Sprintf("CREATE DATABASE \"%s\"", dbName),
	})
	if err != nil {
		panic(err)
	}
	// Statement-level errors are reported in the response, not in err.
	if dbResp.Error() != nil {
		panic(dbResp.Error())
	}

	// Create a retention policy for 7 days
	rpName := "seven_day_retention"
	duration := "7d"
	shardDuration := "1d" // Each shard group covers one day of data
	// In CREATE RETENTION POLICY, REPLICATION must come before SHARD DURATION.
	rpResp, err := c.Query(client.Query{
		Command:  fmt.Sprintf("CREATE RETENTION POLICY \"%s\" ON \"%s\" DURATION %s REPLICATION 1 SHARD DURATION %s", rpName, dbName, duration, shardDuration),
		Database: dbName,
	})
	if err != nil {
		panic(err)
	}
	// Re-creating an identical policy is not an error; a same-named policy
	// with different settings is reported in the response.
	if rpResp.Error() != nil {
		panic(rpResp.Error())
	}

	// Set the newly created policy as the default for the database.
	// InfluxQL does this with ALTER RETENTION POLICY ... DEFAULT.
	defResp, err := c.Query(client.Query{
		Command:  fmt.Sprintf("ALTER RETENTION POLICY \"%s\" ON \"%s\" DEFAULT", rpName, dbName),
		Database: dbName,
	})
	if err != nil {
		panic(err)
	}
	if defResp.Error() != nil {
		panic(defResp.Error())
	}

	// Write some sample data
	bp, err := client.NewBatchPoints(client.BatchPointsConfig{
		Database: dbName,
		Precision: "ns",
	})
	if err != nil {
		panic(err)
	}

	// Build one point per day, from now back to 9 days ago
	now := time.Now()
	for i := 0; i < 10; i++ {
		t := now.Add(time.Duration(-i) * 24 * time.Hour)
		tags := map[string]string{"host": fmt.Sprintf("server-%d", i%2)}
		fields := map[string]interface{}{
			"usage_system": float64(i),
		}
		pt, err := client.NewPoint("cpu", tags, fields, t)
		if err != nil {
			panic(err)
		}
		bp.AddPoint(pt)
	}

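	// InfluxDB rejects points older than the retention period at write time
	// and reports them as a partial write, so an error is expected here.
	err = c.Write(bp)
	if err != nil {
		fmt.Printf("Write returned an error (expected for points older than %s): %v\n", duration, err)
	}

	fmt.Printf("Submitted %d points to database '%s' with retention policy '%s' (%s duration).\n", len(bp.Points()), dbName, rpName, duration)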

	// Retention enforcement runs in the background on an interval (30 minutes
	// by default in InfluxDB 1.x), so this pause does not guarantee a check has run.
	// In a real scenario, you'd monitor InfluxDB logs or check data counts.
	fmt.Println("Waiting for InfluxDB to prune data...")
	time.Sleep(2 * time.Minute) // This sleep is just for demo purposes.

	// Query data to see what's left
	resp, err := c.Query(client.Query{
		Command:  "SELECT count(*) FROM \"cpu\"",
		Database: dbName,
	})
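	if err != nil {
		panic(err)
	}
	if resp.Error() != nil {
		panic(resp.Error())
	}

	if len(resp.Results) > 0 && len(resp.Results[0].Series) > 0 {
		// The client decodes numbers as json.Number, so print the value
		// with %v instead of asserting it to int64 (which would panic).
		count := resp.Results[0].Series[0].Values[0][1]
		fmt.Printf("After retention, count of points in 'cpu' measurement: %v\n", count)
	} else {
		fmt.Println("No data found after retention.")
	}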
}

This code does a few things: it creates a database, defines a retention policy named seven_day_retention that keeps data for 7d (7 days) in 1d shard groups, and makes that policy the database default. It then attempts to write one point per day going back 9 days. InfluxDB rejects points that already fall outside the retention window at write time and reports them as a partial write, so the oldest points never reach disk. After a short pause, the code queries the point count, which should reflect only the data within the last 7 days.

The core problem InfluxDB retention policies solve is managing storage growth in time-series databases. Without them, your disk would fill up indefinitely with old, potentially irrelevant data.

Internally, InfluxDB partitions data into "shards," organized into shard groups that each cover a fixed time window. A retention policy dictates how long a shard group lives. Once every point in a shard group is older than the retention period, InfluxDB drops the whole group. This is an asynchronous, background process. The SHARD DURATION setting is crucial; it sets the width of that window and so determines how data points are grouped into physical files on disk. Smaller shard durations mean more, smaller files, which can impact query performance but also make retention enforcement more granular. Larger shard durations mean fewer, larger files, potentially better for query performance but less flexible for retention.

The actual "deletion" isn’t immediate upon policy creation. In InfluxDB 1.x, a background retention enforcement service periodically checks shard groups (every 30 minutes by default, configurable through check-interval in the [retention] section of the configuration file). When a shard group’s data is entirely older than the retention period, the entire group is eligible for removal. This means you might see data somewhat older than your retention policy until the group that holds it expires and the next check runs. The REPLICATION clause determines how many copies of each shard are stored across a cluster for high availability; setting it to 1 means no replication, and the setting has no effect on a single-node instance.

The most critical lever you control is the DURATION. This is a duration literal such as 1h, 2d, 30d, or 52w (InfluxQL has no y unit, and the minimum is 1h). It’s not a precise, point-in-time deletion. It’s a policy applied to shard groups. With DURATION 7d and SHARD DURATION 1d, a shard group is dropped only once its newest point is more than 7 days old, so its oldest points can be nearly 8 days old by then. SHARD DURATION must not exceed DURATION; InfluxDB 1.x rejects a policy whose shard duration is longer than its retention period.

The "auto-delete" aspect is a bit of a misnomer. It’s more of an automatic pruning based on time windows. InfluxDB, not the policy author, decides when the data files are physically removed. You can influence this by setting SHARD DURATION appropriately, but a retention policy cannot force an immediate delete; removing specific data right away takes an explicit statement such as DELETE or DROP SHARD.

The next concept you’ll likely grapple with is how to query data across different retention policies, especially if you have a tiered storage strategy (e.g., keeping recent data on fast storage and older data on slower, cheaper storage, though InfluxDB Enterprise handles this more explicitly).
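
As a starting point, a query can target a specific retention policy by fully qualifying the measurement as "database"."retention_policy"."measurement". The sketch below builds such a query string; qualifiedMeasurement and quoteIdent are hypothetical helpers, the escaping is simplified to double quotes only, and the database and policy names are the ones used earlier in this article.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteIdent wraps an identifier in double quotes, escaping embedded quotes.
// Simplified for illustration; it does not handle every special character.
func quoteIdent(s string) string {
	return `"` + strings.ReplaceAll(s, `"`, `\"`) + `"`
}

// qualifiedMeasurement builds a "db"."rp"."measurement" identifier so a query
// reads from one specific retention policy instead of the database default.
func qualifiedMeasurement(db, rp, measurement string) string {
	return strings.Join([]string{quoteIdent(db), quoteIdent(rp), quoteIdent(measurement)}, ".")
}

func main() {
	from := qualifiedMeasurement("test_db", "seven_day_retention", "cpu")
	query := fmt.Sprintf(`SELECT mean("usage_system") FROM %s WHERE time > now() - 1d`, from)
	fmt.Println(query)
}
```

The resulting string can be passed as the Command of a client.Query, the same way the statements in the main example are.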
