MongoDB time-series collections store metrics data efficiently: you read and write them like an ordinary collection, but behind the scenes MongoDB uses a storage layout optimized for time-ordered data.

Let’s see it in action. Imagine you’re collecting temperature readings from sensors.

// Connect to MongoDB
const { MongoClient } = require('mongodb');
const client = new MongoClient('mongodb://localhost:27017');

async function run() {
  await client.connect();
  const db = client.db('metricsDB');
  const measurements = db.collection('measurements');

  // Create a time-series collection if it doesn't exist
  try {
    await db.createCollection('measurements', {
      timeseries: {
        timeField: 'timestamp',
        metaField: 'sensorId',
        granularity: 'seconds'
      }
    });
    console.log('Time-series collection "measurements" created.');
  } catch (e) {
    // An existing collection is reported as NamespaceExists (error code 48), which is fine here.
    if (e.codeName !== 'NamespaceExists') {
      console.error('Error creating time-series collection:', e);
    }
  }

  // Insert some data
  const now = new Date();
  await measurements.insertMany([
    { timestamp: new Date(now.getTime() - 10000), sensorId: 'sensor-A', temp: 22.5, humidity: 45 },
    { timestamp: new Date(now.getTime() - 9000), sensorId: 'sensor-B', temp: 23.1, humidity: 48 },
    { timestamp: new Date(now.getTime() - 8000), sensorId: 'sensor-A', temp: 22.7, humidity: 46 },
    { timestamp: new Date(now.getTime() - 7000), sensorId: 'sensor-C', temp: 21.9, humidity: 42 },
    { timestamp: new Date(now.getTime() - 6000), sensorId: 'sensor-B', temp: 23.3, humidity: 49 }
  ]);
  console.log('Inserted sample data.');

  // Query the data (the newest sample is 6 seconds old, so look back 8 seconds)
  const recentReadings = await measurements.find({
    timestamp: { $gte: new Date(now.getTime() - 8000) }
  }).toArray();
  console.log('Recent readings:', recentReadings);

  // Aggregate data (e.g., average temperature per sensor in the last minute)
  const avgTempLastMinute = await measurements.aggregate([
    {
      $match: {
        timestamp: { $gte: new Date(now.getTime() - 60000) }
      }
    },
    {
      $group: {
        _id: '$sensorId',
        avgTemp: { $avg: '$temp' }
      }
    }
  ]).toArray();
  console.log('Average temperature per sensor (last minute):', avgTempLastMinute);

  await client.close();
}

run().catch(console.error);

This code demonstrates inserting data with a timestamp and sensorId, and then querying and aggregating it. The timestamp field is crucial as it defines the time axis, while sensorId acts as metadata.

The core problem time-series collections solve is the inefficient storage and querying of high-volume, time-ordered data. Traditional collections, even with indexing, can struggle with the sheer number of inserts and the performance of range queries over vast datasets. Time-series collections are optimized for this specific workload.

Internally, MongoDB groups measurements that are close in time and share the same metaField value into buckets. Each bucket is a single stored document that holds many individual measurements. When you insert data, MongoDB places it into an existing bucket or creates a new one if necessary. Bucketing greatly reduces the number of documents and index entries MongoDB has to manage, which leads to faster inserts and more compact storage. Queries benefit too, because MongoDB can often scan a small number of buckets instead of many individual documents.
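To make the bucketing idea concrete, here is a toy model in plain JavaScript. It is not MongoDB's real algorithm or storage format: the bucketMeasurements function, the fixed one-hour window, and the sample readings are all assumptions chosen for illustration. It only shows how grouping by source and time window shrinks the number of stored units.

```javascript
// Toy model of bucketing. NOT MongoDB's actual algorithm or on-disk format:
// measurements that share a meta value and fall into the same fixed time
// window are placed in the same bucket.
function bucketMeasurements(measurements, { timeField, metaField, spanMs }) {
  const buckets = new Map();
  for (const m of measurements) {
    // Round the timestamp down to the start of its window.
    const windowStart = Math.floor(m[timeField].getTime() / spanMs) * spanMs;
    const key = `${JSON.stringify(m[metaField])}|${windowStart}`;
    if (!buckets.has(key)) {
      buckets.set(key, {
        meta: m[metaField],
        windowStart: new Date(windowStart),
        measurements: []
      });
    }
    buckets.get(key).measurements.push(m);
  }
  return [...buckets.values()];
}

// Hypothetical sample data: five readings from three sensors within one hour.
const base = Date.UTC(2024, 0, 1, 12, 0, 0);
const sample = [
  { timestamp: new Date(base), sensorId: 'sensor-A', temp: 22.5 },
  { timestamp: new Date(base + 1000), sensorId: 'sensor-B', temp: 23.1 },
  { timestamp: new Date(base + 2000), sensorId: 'sensor-A', temp: 22.7 },
  { timestamp: new Date(base + 3000), sensorId: 'sensor-C', temp: 21.9 },
  { timestamp: new Date(base + 4000), sensorId: 'sensor-B', temp: 23.3 }
];

const buckets = bucketMeasurements(sample, {
  timeField: 'timestamp',
  metaField: 'sensorId',
  spanMs: 60 * 60 * 1000 // assumed one-hour window for the illustration
});
console.log(`${sample.length} measurements -> ${buckets.length} buckets`);
```

In this model the five readings collapse into three buckets, one per sensor. The real server also compresses bucket contents and applies its own limits on bucket size, which this sketch ignores.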

You control the behavior of a time-series collection through its options during creation:

  • timeField: The field that contains the timestamp for each measurement. It is mandatory and must hold a BSON date.
  • metaField: An optional field that stores metadata identifying the source or context of the measurements (e.g., a sensor ID, a server name). MongoDB can index this field to optimize queries that filter or group by metadata.
  • granularity: A hint about how far apart consecutive measurements from the same source arrive. Options are seconds (the default), minutes, and hours; pick the one closest to your ingestion interval. Granularity controls how much time a single bucket may span, and coarser settings let each bucket cover a longer period. For example, granularity: 'minutes' suits a sensor that reports every few minutes, not one that reports several times per second. Choosing it is a balance: too fine for sparse data, and each bucket holds only a few measurements, so you lose most of the storage benefit; too coarse for dense data, and queries over short time ranges have to read buckets that cover far more time than they need.
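As a sketch of how these options fit together, the helper below assembles the options object passed to createCollection. MongoDB's guidance is to pick the granularity closest to the interval between consecutive measurements from one source; the chooseGranularity thresholds here are a rule of thumb assumed for illustration, and both function names are hypothetical rather than part of the driver.

```javascript
// Hypothetical helper: the thresholds are an illustrative rule of thumb,
// not values defined by MongoDB.
function chooseGranularity(expectedIntervalSeconds) {
  if (expectedIntervalSeconds < 60) return 'seconds';
  if (expectedIntervalSeconds < 3600) return 'minutes';
  return 'hours';
}

// Hypothetical helper that builds the options object for createCollection.
function buildTimeseriesOptions({ timeField, metaField, expectedIntervalSeconds }) {
  if (!timeField) {
    throw new Error('timeField is required for a time-series collection');
  }
  const timeseries = {
    timeField,
    granularity: chooseGranularity(expectedIntervalSeconds)
  };
  // metaField is optional, so only set it when the caller provides one.
  if (metaField) timeseries.metaField = metaField;
  return { timeseries };
}

// Example: sensors that report roughly every five minutes.
const options = buildTimeseriesOptions({
  timeField: 'timestamp',
  metaField: 'sensorId',
  expectedIntervalSeconds: 300
});
console.log(JSON.stringify(options));
```

The resulting object can be handed to the driver as in the earlier example: await db.createCollection('measurements', options).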

When you query a time-series collection, MongoDB’s query planner is aware of the bucketing mechanism. For queries that target a time range, it can efficiently identify which buckets are relevant and only scan those. For queries that include metaField filters, it can use indexes on the metadata to further narrow down the search. The aggregate command is particularly powerful here, allowing you to perform operations like $match on time ranges and $group by metadata fields, all while leveraging the underlying bucket optimization.
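A query that follows this pattern might look like the sketch below: a $match on the time range (and optionally on the metadata) comes first so that irrelevant buckets can be skipped, followed by a $group per sensor and per minute. The buildAvgTempPipeline function is a hypothetical helper, the field names come from the example above, and $dateTrunc assumes MongoDB 5.0 or later.

```javascript
// Hypothetical helper that builds an aggregation pipeline for the
// 'measurements' collection used earlier in this article.
function buildAvgTempPipeline({ since, sensorIds }) {
  // Filter on the timeField first so only relevant buckets are read.
  const match = { timestamp: { $gte: since } };
  // Optionally narrow by metaField as well.
  if (sensorIds && sensorIds.length > 0) {
    match.sensorId = { $in: sensorIds };
  }
  return [
    { $match: match },
    {
      $group: {
        // One group per sensor per minute.
        _id: {
          sensorId: '$sensorId',
          minute: { $dateTrunc: { date: '$timestamp', unit: 'minute' } }
        },
        avgTemp: { $avg: '$temp' },
        count: { $sum: 1 }
      }
    },
    { $sort: { '_id.minute': 1, '_id.sensorId': 1 } }
  ];
}

const pipeline = buildAvgTempPipeline({
  since: new Date(Date.now() - 60 * 60 * 1000), // last hour
  sensorIds: ['sensor-A', 'sensor-B']
});
console.log(JSON.stringify(pipeline, null, 2));
```

To run it against a live collection, pass the result to the driver: await measurements.aggregate(pipeline).toArray().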

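The metaField names a single top-level field, but its value can be any BSON type, including a subdocument, which lets you store several identifying attributes for each data point. The value should rarely, if ever, change for a given source, because it determines which bucket a measurement lands in. When you filter or group by metadata, reference the individual subfields with dot notation (for example, a hypothetical metadata.sensorId) rather than matching the whole subdocument, since whole-document equality also depends on field order. If you routinely query by several metadata attributes, a secondary index on those subfields together with the timeField usually serves better than concatenating them into one string.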
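Here is a minimal sketch of a subdocument metaField. The field name metadata, the site attribute, and the helpers toMeasurement and metaFilter are all hypothetical; the point is only that filters address individual subfields with dot notation instead of comparing the whole subdocument.

```javascript
// Hypothetical document shape: the collection is assumed to have been created
// with metaField: 'metadata', and each measurement carries a subdocument there.
function toMeasurement({ sensorId, site, timestamp, temp }) {
  return { timestamp, metadata: { sensorId, site }, temp };
}

// Build a filter that targets metaField subfields with dot notation,
// e.g. { 'metadata.site': 'plant-1' }, instead of matching the whole
// subdocument (which would also depend on field order).
function metaFilter(attrs, metaField = 'metadata') {
  const filter = {};
  for (const [key, value] of Object.entries(attrs)) {
    filter[`${metaField}.${key}`] = value;
  }
  return filter;
}

const doc = toMeasurement({
  sensorId: 'sensor-A',
  site: 'plant-1',
  timestamp: new Date(),
  temp: 22.5
});
const filter = metaFilter({ site: 'plant-1', sensorId: 'sensor-A' });
console.log(doc, filter);
```

With a live collection, the filter goes straight into find or a $match stage, and a secondary index such as { 'metadata.sensorId': 1, timestamp: 1 } can be created with createIndex to support it.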

The next step in optimizing time-series data is exploring rollups and continuous aggregation.
