InfluxDB’s Line Protocol is a text-based format for writing data points, and it is far more compact and cheaper to parse than JSON for time-series metrics.
Let’s see it in action. Imagine we have a sensor that reports temperature and humidity. Here’s how you’d write that data to InfluxDB using Line Protocol:
sensor,location=living_room,device_id=abc123 temp=22.5,humidity=45.2 1678886400000000000
This single line packs a lot of information.
The first part, sensor, is the measurement. Think of it as a table name in SQL, but optimized for time-series data. It groups related data points.
Next, location=living_room,device_id=abc123 are tags. Tags are key-value pairs that are indexed and allow for highly efficient filtering and querying. You can have multiple tags, separated by commas. They are best used for metadata that you’ll frequently group or filter by, like sensor location, device ID, or sensor type.
Then comes temp=22.5,humidity=45.2. These are the fields. Fields are the actual data values you’re recording. Here, we have two fields: temp with a value of 22.5 and humidity with a value of 45.2. Fields are not indexed, so they are best for the actual metrics you want to store and analyze. Multiple fields can be written in a single line, separated by commas.
Finally, 1678886400000000000 is the timestamp. This is a Unix epoch timestamp in nanoseconds (here, 2023-03-15 13:20:00 UTC). If you omit the timestamp, InfluxDB assigns the time at which it received the data. However, it’s generally best practice to provide your own timestamps, especially if you’re writing data from historical logs or multiple sources, to ensure accuracy.
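As a rough illustration of producing such a timestamp yourself, here is a minimal Python sketch. The helper name to_epoch_ns is made up for this example, and it assumes you already have timezone-aware datetime values; it uses integer arithmetic so nothing is lost to float rounding at nanosecond scale.

```python
import time
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def to_epoch_ns(dt: datetime) -> int:
    """Convert a timezone-aware datetime to Unix epoch nanoseconds.

    Hypothetical helper for illustration, not part of any InfluxDB client.
    """
    if dt.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    delta = dt - EPOCH
    # Integer math only: whole seconds scaled to ns, plus microseconds scaled to ns.
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * 1_000_000_000 + delta.microseconds * 1_000

# The timestamp used in the example line above.
print(to_epoch_ns(datetime(2023, 3, 15, 13, 20, 0, tzinfo=timezone.utc)))
# 1678886400000000000

# For "now", the standard library already returns nanoseconds directly.
now_ns = time.time_ns()
```

Python's datetime only carries microsecond resolution, so the last three digits of a converted value are always zero; time.time_ns() is the better choice when you just need the current time.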
The whole line is structured as: measurement[,tag_key=tag_value...] field_key=field_value[,field_key=field_value...] [timestamp].
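To make that structure concrete, the following Python sketch assembles a line from its parts. The function name build_line is hypothetical, and the sketch deliberately assumes float field values and keys and values that contain no commas, spaces, or equals signs, so it performs no escaping or type handling.

```python
def build_line(measurement, tags, fields, timestamp_ns=None):
    """Assemble one Line Protocol line from its four parts.

    Simplified sketch: assumes float fields and no characters that need escaping.
    """
    if not fields:
        raise ValueError("a line needs at least one field")
    # Measurement, then optional tags, joined by commas with no spaces.
    head = measurement
    if tags:
        head += "," + ",".join(f"{key}={value}" for key, value in tags.items())
    # Field set, separated from the head by a single space.
    field_set = ",".join(f"{key}={value}" for key, value in fields.items())
    line = f"{head} {field_set}"
    # The timestamp is optional and follows a second space.
    if timestamp_ns is not None:
        line += f" {timestamp_ns}"
    return line

line = build_line(
    "sensor",
    {"location": "living_room", "device_id": "abc123"},
    {"temp": 22.5, "humidity": 45.2},
    1678886400000000000,
)
print(line)
# sensor,location=living_room,device_id=abc123 temp=22.5,humidity=45.2 1678886400000000000
```

The two unescaped spaces are what separate the three sections, which is why spaces inside names and tag values need special treatment.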
The magic of Line Protocol lies in its simplicity and the way InfluxDB parses it. By using a text-based format with specific delimiters, InfluxDB can ingest data extremely quickly. Unlike JSON, which requires more complex parsing and serialization, Line Protocol is designed for high-throughput writes. The separation of indexed tags from unindexed fields allows for both efficient storage and rapid retrieval of specific data subsets.
When writing data, ensure your field values are of the correct type. For example, floats are written as field=1.23, integers as field=123i (note the trailing i), booleans as field=true or field=false, and strings as field="some string". A numeric value without a type suffix is treated as a float, so field=123 is stored as a float, not an integer.
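One way to apply these type rules from Python is sketched below. The helper name format_field_value is invented for this example; the rejection of NaN and infinity reflects the assumption that InfluxDB does not accept them as field values.

```python
import math

def format_field_value(value):
    """Render a Python value as a Line Protocol field value (illustrative helper)."""
    # bool must be checked before int, because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"  # the trailing i marks an integer
    if isinstance(value, float):
        # Assumption: non-finite floats are not valid field values.
        if not math.isfinite(value):
            raise ValueError("NaN and infinity cannot be written as field values")
        return repr(value)
    if isinstance(value, str):
        # Strings are double-quoted; backslashes and double quotes are escaped.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    raise TypeError(f"unsupported field type: {type(value).__name__}")

print(format_field_value(1.23))           # 1.23
print(format_field_value(123))            # 123i
print(format_field_value(True))           # true
print(format_field_value("some string"))  # "some string"
```

Checking bool first matters: without it, True would be rendered as an integer-style value instead of true.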
InfluxDB does not escape special characters for you; the writer has to do it. In tag keys, tag values, and field keys, a comma, space, or equals sign must be escaped with a backslash (\). In measurement names, commas and spaces must be escaped. For example, if your location was "Living Room, Upstairs", the tag would be location=Living\ Room\,\ Upstairs. String field values follow a different rule: they are enclosed in double quotes, and any double quotes or backslashes inside the string are escaped with a backslash (e.g., description="This is a \"quoted\" string").
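The escaping rules can be sketched as three small Python helpers. All three names are hypothetical and chosen for this example; the sketch leaves backslashes in names and tag values untouched, which is an assumption about your data rather than a complete treatment.

```python
def escape_measurement(name: str) -> str:
    """Escape commas and spaces in a measurement name."""
    return name.replace(",", "\\,").replace(" ", "\\ ")

def escape_key_or_tag_value(text: str) -> str:
    """Escape commas, equals signs, and spaces in a tag key, tag value, or field key."""
    return text.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")

def quote_string_field(text: str) -> str:
    """Double-quote a string field value, escaping backslashes and double quotes."""
    # Backslashes first, so the backslashes added for quotes are not doubled.
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'

print("location=" + escape_key_or_tag_value("Living Room, Upstairs"))
# location=Living\ Room\,\ Upstairs

print("description=" + quote_string_field('This is a "quoted" string'))
# description="This is a \"quoted\" string"
```

Keeping the string-field helper separate from the tag helper mirrors the protocol itself: spaces and commas inside a quoted string field are literal and need no backslash.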
Part of Line Protocol’s efficiency comes not from the text format itself but from how closely it matches InfluxDB’s data model and its storage engine, which is built on the Time-Structured Merge Tree (TSM). Each line identifies a series by its measurement and tag set and adds field values at a timestamp, so lines can be batched and written to disk efficiently, minimizing disk seeks and maximizing write throughput without the overhead of complex serialization formats.
The next hurdle you’ll encounter is understanding how to batch these lines for even greater write performance.