Flux, the query language for InfluxDB, is surprisingly powerful when it comes to crunching numbers. You can do more than just retrieve data; you can transform it on the fly.
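Let’s see it in action. Imagine we have a sensor that reports temperature and humidity every second, and we want to calculate the dew point. Flux’s standard library does not appear to include a dew point function, so we define a small dewpoint helper with the math package and apply it directly within the query. The helper below uses the Magnus approximation and assumes temperature in °C and relative humidity in percent, both stored as floats.
import "math"
// Magnus approximation: t in °C, h in % relative humidity (both floats).
dewpoint = (t, h) => {
    gamma = math.log(x: h / 100.0) + 17.62 * t / (243.12 + t)
    return 243.12 * gamma / (17.62 - gamma)
}
from(bucket: "my_bucket")
|> range(start: -1h)
|> filter(fn: (r) => r["_measurement"] == "sensor_data")
|> filter(fn: (r) => r["_field"] == "temperature" or r["_field"] == "humidity")
|> pivot(rowKey: ["_time"], columnKey: ["_field"], valueColumn: "_value")
|> map(fn: (r) => ({ r with dewpoint: dewpoint(t: r.temperature, h: r.humidity) }))
|> yield(name: "dewpoint_calculation")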
This query first fetches the raw temperature and humidity readings. Then, it uses pivot to reshape the data so that temperature and humidity are in separate columns for each timestamp. Finally, the map function applies the dewpoint calculation, taking the temperature and humidity from each row and producing a new dewpoint column.
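For reference, one widely used approximation for dew point is the Magnus formula. The coefficients shown are one commonly quoted set and are an assumption here rather than something the query prescribes; T is the air temperature in °C and RH is the relative humidity in percent.

```latex
\gamma(T, RH) = \ln\!\left(\frac{RH}{100}\right) + \frac{b\,T}{c + T},
\qquad
T_d = \frac{c\,\gamma(T, RH)}{b - \gamma(T, RH)},
\qquad
b = 17.62,\quad c = 243.12\ ^{\circ}\mathrm{C}
```

With T = 25 °C and RH = 60 %, this gives a dew point of roughly 16.7 °C.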
The core problem Flux solves here is bringing computation closer to the data source. Instead of pulling raw sensor readings into a separate application and performing calculations there, you can do it directly in Flux. This reduces data transfer, simplifies your architecture, and makes your queries self-contained.
Internally, Flux executes a query as a directed acyclic graph (DAG). Each function in your script is a node, and data flows between them. When you pass a function to map, Flux evaluates it once for each row passing through that node. For mathematical operations, this means the arithmetic is performed on values as they stream through the pipeline, rather than in a separate pass over stored intermediate results.
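The row-by-row evaluation can be seen in a minimal sketch that needs no bucket. The inline rows built with array.from and the column name temperature are hypothetical stand-ins for real sensor data.

```flux
import "array"

// Hypothetical inline readings standing in for real sensor data.
array.from(
    rows: [
        {_time: 2024-01-01T00:00:00Z, temperature: 21.5},
        {_time: 2024-01-01T00:00:01Z, temperature: 22.0},
        {_time: 2024-01-01T00:00:02Z, temperature: 22.4},
    ],
)
    // First node: keep only rows at or above 22.0 °C.
    |> filter(fn: (r) => r.temperature >= 22.0)
    // Second node: the function body runs once for each remaining row.
    |> map(fn: (r) => ({r with fahrenheit: r.temperature * 9.0 / 5.0 + 32.0}))
```

Only the two rows that pass the filter reach map, so the conversion is computed twice, once per row.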
The levers you control are the standard arithmetic operators (+, -, *, /, %) and Flux’s built-in functions. The math package (import "math") provides trigonometric functions (math.sin, math.cos, math.tan) and logarithms (math.log for the natural logarithm, plus math.log10 and math.log2), while aggregate functions such as mean, median, and stddev cover common statistics. You can also define your own functions and call them from map or reduce for more complex, reusable logic.
You can compute across columns by passing map a function that references them directly. For example, to calculate the difference between two temperature readings from different sensors, you could join the data and then use map:
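As a rough illustration of these building blocks, the sketch below combines an operator, a few math package functions, and a user-defined helper. The inline rows, the column names, and the toRadians helper are all hypothetical.

```flux
import "array"
import "math"

// Hypothetical helper: the math trigonometric functions expect radians.
toRadians = (deg) => deg * math.pi / 180.0

array.from(
    rows: [
        {_time: 2024-01-01T00:00:00Z, angle_deg: 30.0, power_w: 120.0, counter: 17},
        {_time: 2024-01-01T00:00:01Z, angle_deg: 45.0, power_w: 980.0, counter: 18},
    ],
)
    // math.log is the natural logarithm; math.log10 is base 10.
    // The % operator is applied to an integer column here.
    |> map(
        fn: (r) => ({r with
            sine: math.sin(x: toRadians(deg: r.angle_deg)),
            ln_power: math.log(x: r.power_w),
            log10_power: math.log10(x: r.power_w),
            is_even: r.counter % 2 == 0,
        }),
    )
```

Defining toRadians once at the top keeps the map body short and lets the same conversion be reused in other queries.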
temp1 = from(bucket: "my_bucket")
|> range(start: -1h)
|> filter(fn: (r) => r["_measurement"] == "sensor_data" and r["sensor_id"] == "sensor_A")
|> filter(fn: (r) => r["_field"] == "temperature")
|> keep(columns: ["_time", "_value"])
|> rename(columns: {_value: "temp_A"})
temp2 = from(bucket: "my_bucket")
|> range(start: -1h)
|> filter(fn: (r) => r["_measurement"] == "sensor_data" and r["sensor_id"] == "sensor_B")
|> filter(fn: (r) => r["_field"] == "temperature")
|> keep(columns: ["_time", "_value"])
|> rename(columns: {_value: "temp_B"})
join(
tables: {t1: temp1, t2: temp2},
on: ["_time"]
)
|> map(fn: (r) => ({ r with temp_difference: r.temp_A - r.temp_B }))
|> yield(name: "temperature_difference")
This query explicitly fetches data for two sensors, renames the value column to avoid conflicts, and then joins them on the timestamp. The map function then subtracts temp_B from temp_A for each matching timestamp, creating a new column representing the difference.
A common point of confusion is how Flux handles data types during calculations. Flux is statically typed with type inference, and arithmetic operators expect both operands to have the same numeric type. If you try to perform arithmetic on a string value, or mix an integer with a float, you’ll encounter a type error. It’s crucial to ensure your data is in a suitable numeric format before attempting calculations, using conversion functions such as float(), int(), toFloat(), or toInt() where necessary.
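A minimal sketch of converting values before doing arithmetic is shown below. The inline rows and the column names raw and samples are hypothetical; float() is the built-in conversion function.

```flux
import "array"

array.from(
    rows: [
        // raw arrives as a string and samples as an integer.
        {_time: 2024-01-01T00:00:00Z, raw: "21.5", samples: 4},
        {_time: 2024-01-01T00:00:01Z, raw: "22.0", samples: 5},
    ],
)
    // Convert both operands to float so the multiplication type-checks.
    |> map(
        fn: (r) => ({r with
            celsius: float(v: r.raw),
            weighted: float(v: r.raw) * float(v: r.samples),
        }),
    )
```

When the value to convert lives in the _value column, toFloat() and toInt() do the same job as a pipeline step instead of inside map.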
Once you’re comfortable with basic arithmetic and built-in functions, you’ll likely want to explore how to combine multiple calculations or apply conditional logic within your Flux queries.