InfluxDB stores timestamps as nanoseconds, but that alone doesn’t guarantee nanosecond fidelity from write to query, despite what its documentation might lead you to believe.
Let’s see it in action. Suppose we have a simple InfluxDB setup and we want to write some data with very precise timestamps.
# Start InfluxDB (assuming it's already installed and running)
# For simplicity, we'll use the CLI to interact.
# Create a database
influx -execute "CREATE DATABASE mydb"
# Pass -database on each command instead of opening an interactive shell
# Write some data with high precision timestamps
influx -database mydb -execute 'INSERT mymeasurement,tagset=tagvalue fieldset=123.45 1678886400123456789'
influx -database mydb -execute 'INSERT mymeasurement,tagset=tagvalue fieldset=678.90 1678886400987654321'
# Query the data, asking the CLI to render timestamps as RFC3339
# (without -precision rfc3339 the CLI prints epoch nanoseconds)
influx -database mydb -precision rfc3339 -execute 'SELECT * FROM mymeasurement'
The output you’ll see will look something like this:
name: mymeasurement
time                           fieldset tagset
----                           -------- ------
2023-03-15T13:20:00.123456789Z 123.45   tagvalue
2023-03-15T13:20:00.987654321Z 678.9    tagvalue
On the surface, it appears InfluxDB is storing and returning nanosecond precision, and for these two points it is. The nuance lies elsewhere: timestamps can be coarsened before they are written, and queries can regroup them into larger intervals. The key is understanding that seeing nine fractional digits in the output doesn’t guarantee that nanosecond-level differences survived every step between the data source and the query result.
This matters whenever data points are extremely close in time and you need to distinguish them and reason about their order. That is crucial for high-frequency trading, network monitoring, or any application where the exact timing of events is paramount.
Internally, InfluxDB stores each timestamp as a 64-bit integer: the number of nanoseconds since the Unix epoch (January 1, 1970, 00:00:00 UTC). When you write data, InfluxDB parses the timestamp on each point into that integer. The confusion often arises because the display format is configurable (epoch integers or RFC3339 strings) while the internal storage is always nanoseconds. How much of that resolution is meaningful, however, depends on the precision at which the data was written and the granularity at which queries aggregate it.
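To make the integer representation concrete, here is a minimal Python sketch of how a nanosecond epoch value maps to the RFC3339 string shown in query output. It is an illustration only, not InfluxDB’s code; the helper name ns_to_rfc3339 is made up for this example, and it assumes timestamps at or after the epoch. It also shows why client code should keep such values as integers: a 64-bit float cannot represent them exactly.

```python
from datetime import datetime, timezone

NS_PER_SECOND = 1_000_000_000


def ns_to_rfc3339(ns: int) -> str:
    """Render integer nanoseconds since the Unix epoch as an RFC3339 UTC string.

    Hypothetical helper for illustration; not part of any InfluxDB client.
    """
    seconds, frac_ns = divmod(ns, NS_PER_SECOND)
    # datetime only resolves microseconds, so the nine fractional digits
    # are formatted separately from the whole-second part.
    whole = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return f"{whole:%Y-%m-%dT%H:%M:%S}.{frac_ns:09d}Z"


ts = 1678886400123456789
print(ns_to_rfc3339(ts))     # 2023-03-15T13:20:00.123456789Z

# A float has a 53-bit significand, too few bits for a present-day
# nanosecond timestamp, so a round trip through float changes the value.
print(int(float(ts)) == ts)  # False
```

The practical takeaway is to carry timestamps as integers (or strings) all the way to the write request; converting through floating point silently discards the low-order nanoseconds.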
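Strictly speaking, that precision is not stored on the database or retention policy: in InfluxDB 1.x, whose InfluxQL syntax this walkthrough uses, neither CREATE DATABASE nor CREATE RETENTION POLICY accepts a PRECISION clause. The lever you actually control is the precision of each write. The HTTP /write endpoint takes a precision parameter of ns (nanoseconds, the default), u (microseconds), ms (milliseconds), s (seconds), m (minutes), or h (hours), and InfluxDB interprets the integer timestamps in that request in the chosen unit.
# Example: creating a database and a retention policy (neither takes a precision)
influx -execute "CREATE DATABASE mydb_ms WITH DURATION 1w"
influx -execute "CREATE RETENTION POLICY rp_30d ON mydb DURATION 30d REPLICATION 1 DEFAULT"
# Example: writing a point with millisecond precision over the HTTP API
# (assumes an unauthenticated server on the default localhost:8086)
curl -i -XPOST 'http://localhost:8086/write?db=mydb_ms&precision=ms' --data-binary 'mymeasurement,tagset=tagvalue fieldset=123.45 1678886400123'
# Example: writing a point with microsecond precision
curl -i -XPOST 'http://localhost:8086/write?db=mydb&precision=u' --data-binary 'mymeasurement,tagset=tagvalue fieldset=123.45 1678886400123456'
If a client reduces nanosecond timestamps to a coarser unit before writing (for example, sending milliseconds with precision=ms), the sub-millisecond digits are gone before InfluxDB ever sees them. Whether the value is truncated or rounded is the client’s decision. This is where you lose fidelity. Note that InfluxDB does not truncate for you: a nanosecond integer sent with precision=ms is read as a count of milliseconds, which is not the instant you meant.
For instance, take two timestamps roughly 531 microseconds apart that a client truncates to milliseconds:
Timestamp 1: 1678886400123456789 (nanoseconds) -> 1678886400123 (milliseconds)
Timestamp 2: 1678886400123987654 (nanoseconds) -> 1678886400123 (milliseconds)
Both are stored as 2023-03-15T13:20:00.123Z. They are worse than indistinguishable: if the two points share a measurement and tag set, InfluxDB treats them as the same point, and the later write replaces the earlier value of the same field.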
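The collision is plain integer arithmetic, which the following Python sketch reproduces. It is a hedged illustration: the helper names are invented here, and the dictionary is only a toy stand-in for InfluxDB’s rule that points with the same measurement, tag set, and timestamp are one point, so a later write replaces the earlier value of the same field.

```python
NS_PER_MS = 1_000_000


def truncate_to_ms(ns: int) -> int:
    """Drop everything below one millisecond (hypothetical helper)."""
    return ns // NS_PER_MS


def round_to_ms(ns: int) -> int:
    """Round to the nearest millisecond, halves rounding up (hypothetical helper)."""
    return (ns + NS_PER_MS // 2) // NS_PER_MS


t1 = 1678886400123456789
t2 = 1678886400123987654

# Toy model of one series: timestamp -> field value.
# Writing to an existing key replaces the value, like a duplicate point.
stored = {}
for ts_ns, value in [(t1, 123.45), (t2, 678.90)]:
    stored[truncate_to_ms(ts_ns)] = value

print(t2 - t1)                            # 530865 (ns, about 531 microseconds)
print(stored)                             # {1678886400123: 678.9}
print(round_to_ms(t1), round_to_ms(t2))   # 1678886400123 1678886400124
```

Rounding happens to keep these two points apart, but that is luck rather than a fix: any two events that land in the same millisecond after conversion will still collapse into one stored point.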
The most surprising thing is that even when every point is written with nanosecond timestamps, query results can carry coarser times, because the query itself defines the granularity of what it returns. This is not a bug, but a consequence of how time-series data is aggregated. The GROUP BY time(1s) clause, for example, explicitly tells InfluxDB to aggregate data into one-second buckets: each result row is stamped with the start of its bucket rather than the time of any individual point, so the finer precision of the underlying points is no longer visible in the output.
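The effect of GROUP BY time(1s) on timestamps can be imitated in a few lines of Python. This is a sketch of the bucketing idea under the assumption that bucket boundaries are aligned to the Unix epoch with no offset; it is not InfluxDB’s implementation, and bucket_start and mean_by_bucket are names made up for this example. It roughly mirrors a query such as SELECT MEAN(fieldset) FROM mymeasurement GROUP BY time(1s) over the two points written earlier.

```python
from collections import defaultdict

NS_PER_SECOND = 1_000_000_000


def bucket_start(ts_ns: int, interval_ns: int) -> int:
    """Start of the epoch-aligned bucket that contains ts_ns."""
    return ts_ns - ts_ns % interval_ns


def mean_by_bucket(points, interval_ns):
    """Group (timestamp, value) pairs into buckets and average each bucket."""
    buckets = defaultdict(list)
    for ts_ns, value in points:
        buckets[bucket_start(ts_ns, interval_ns)].append(value)
    return {start: sum(vals) / len(vals) for start, vals in sorted(buckets.items())}


points = [
    (1678886400123456789, 123.45),
    (1678886400987654321, 678.90),
]

result = mean_by_bucket(points, NS_PER_SECOND)
for start, mean in result.items():
    # Both points fall into the bucket starting at 1678886400000000000,
    # so a single row comes back and the original nanoseconds are gone.
    print(start, mean)
```

If the original timestamps matter, query the raw points without GROUP BY time(), or choose an interval no coarser than the resolution you need to preserve.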
The next concept you’ll likely encounter is how to effectively query and downsample data while preserving the highest possible temporal resolution that was actually stored.