Time Series Collections

New in version 5.0.

Time series collections efficiently store sequences of measurements over a period of time. Time series data is any data that is collected over time and is uniquely identified by one or more unchanging parameters. The unchanging parameters that identify your time series data are generally your data source's metadata.

Example           Measurement    Metadata
Weather data      Temperature    Sensor identifier, location
Stock data        Stock price    Stock ticker, exchange
Website visitors  View count     URL

Compared to normal collections, storing time series data in time series collections improves query efficiency and reduces the disk usage for time series data and secondary indexes.

Procedures

Create a Time Series Collection

Note

You can only create time series collections on a system with featureCompatibilityVersion set to 5.0.

Before you can insert data into a time series collection, you must explicitly create the collection using either the db.createCollection() method or the create command:

db.createCollection(
    "weather",
    {
       timeseries: {
          timeField: "timestamp",
          metaField: "metadata",
          granularity: "hours"
       }
    }
)
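
The same collection can also be created with the create command mentioned above. The following is a minimal sketch that assumes the same weather collection and field names. The variable name createCommand is only illustrative, and the runCommand call is left as a comment so that the snippet also runs outside mongosh:

```javascript
// Command document equivalent to the db.createCollection() call above.
// "createCommand" is an illustrative variable name, not part of any API.
const createCommand = {
   create: "weather",
   timeseries: {
      timeField: "timestamp",
      metaField: "metadata",
      granularity: "hours"
   }
};

// In mongosh, connected to a deployment that supports time series collections:
// db.runCommand(createCommand)
```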

When creating a time series collection, specify the following options:

Field: timeseries.timeField
Type: string
Description: Required. The name of the field which contains the date in each time series document. Documents in a time series collection must have a valid BSON date as the value for the timeField.

Field: timeseries.metaField
Type: string
Description: Optional. The name of the field which contains metadata in each time series document. The metadata in the specified field should be data that is used to label a unique series of documents. The metadata should rarely, if ever, change.

The name of the specified field may not be _id or the same as the timeseries.timeField. The field can be of any type.

Field: timeseries.granularity
Type: string
Description: Optional. Possible values are:

  • "seconds"
  • "minutes"
  • "hours"

By default, MongoDB sets the granularity to "seconds" for high-frequency ingestion.

Manually set the granularity parameter to improve performance by optimizing how data in the time series collection is stored internally. To select a value for granularity, choose the closest match to the time span between consecutive incoming measurements.

If you specify the timeseries.metaField, consider the time span between consecutive incoming measurements that have the same unique value for the metaField field. Measurements often have the same unique value for the metaField field if they come from the same source.

If you do not specify timeseries.metaField, consider the time span between all measurements that are inserted in the collection.

Field: expireAfterSeconds
Type: number
Description: Optional. Enable the automatic deletion of documents in a time series collection by specifying the number of seconds after which documents expire. MongoDB deletes expired documents automatically. See Set up Automatic Removal for Time Series Collections (TTL) for more information.
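
The granularity guidance above can be sketched as a small helper. This is only an illustration: suggestGranularity is a hypothetical function, it expects timestamps that all come from the same source (the same metaField value), and it interprets "closest match" as closest on a logarithmic scale, which is an assumption of this sketch rather than a documented rule.

```javascript
// Hypothetical helper: suggest a granularity from the timestamps of
// measurements that share the same metaField value.
function suggestGranularity(timestamps) {
   if (timestamps.length < 2) {
      return "seconds"; // nothing to measure, fall back to the default
   }

   // Gaps between consecutive measurements, in seconds.
   const sorted = timestamps.map((t) => t.getTime()).sort((a, b) => a - b);
   const gaps = [];
   for (let i = 1; i < sorted.length; i++) {
      gaps.push((sorted[i] - sorted[i - 1]) / 1000);
   }
   gaps.sort((a, b) => a - b);

   // Upper median of the gaps, clamped to 1 second to keep the logarithm finite.
   const typicalGap = Math.max(gaps[Math.floor(gaps.length / 2)], 1);

   const units = [
      { name: "seconds", length: 1 },
      { name: "minutes", length: 60 },
      { name: "hours", length: 3600 }
   ];

   // Assumption of this sketch: compare distances on a log scale.
   let best = units[0];
   let bestDistance = Infinity;
   for (const unit of units) {
      const distance = Math.abs(Math.log(typicalGap / unit.length));
      if (distance < bestDistance) {
         best = unit;
         bestDistance = distance;
      }
   }
   return best.name;
}

// Readings taken every four hours, as in the weather example.
const readings = [
   new Date("2021-05-18T00:00:00.000Z"),
   new Date("2021-05-18T04:00:00.000Z"),
   new Date("2021-05-18T08:00:00.000Z")
];
console.log(suggestGranularity(readings)); // prints "hours"
```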

Other options allowed with the timeseries option are:

  • storageEngine
  • indexOptionDefaults
  • collation
  • writeConcern
  • comment
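
As a rough sketch of how the options above fit together, the helper below assembles an options document and checks the naming rule for the metaField. The function buildTimeSeriesOptions and its expireAfterDays parameter are hypothetical conveniences introduced for this example; only the resulting document uses option names from this page.

```javascript
// Hypothetical helper that assembles the options document for
// db.createCollection() and enforces the rules described above.
function buildTimeSeriesOptions({ timeField, metaField, granularity, expireAfterDays }) {
   if (!timeField) {
      throw new Error("timeField is required");
   }
   if (metaField !== undefined && (metaField === "_id" || metaField === timeField)) {
      throw new Error("metaField may not be _id or the same as timeField");
   }
   const allowed = ["seconds", "minutes", "hours"];
   if (granularity !== undefined && !allowed.includes(granularity)) {
      throw new Error("granularity must be one of: " + allowed.join(", "));
   }

   const timeseries = { timeField: timeField };
   if (metaField !== undefined) timeseries.metaField = metaField;
   if (granularity !== undefined) timeseries.granularity = granularity;

   const options = { timeseries: timeseries };
   if (expireAfterDays !== undefined) {
      // expireAfterSeconds is expressed in seconds.
      options.expireAfterSeconds = expireAfterDays * 24 * 60 * 60;
   }
   return options;
}

const weatherOptions = buildTimeSeriesOptions({
   timeField: "timestamp",
   metaField: "metadata",
   granularity: "hours",
   expireAfterDays: 30
});

// In mongosh:
// db.createCollection("weather", weatherOptions)
```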

Insert Measurements into a Time Series Collection

Each document you insert should contain a single measurement. To insert multiple documents at once, issue the following command:

db.weather.insertMany( [
   {
      "metadata": { "sensorId": 5578, "type": "temperature" },
      "timestamp": ISODate("2021-05-18T00:00:00.000Z"),
      "temp": 12
   },
   {
      "metadata": { "sensorId": 5578, "type": "temperature" },
      "timestamp": ISODate("2021-05-18T04:00:00.000Z"),
      "temp": 11
   },
   {
      "metadata": { "sensorId": 5578, "type": "temperature" },
      "timestamp": ISODate("2021-05-18T08:00:00.000Z"),
      "temp": 11
   },
   {
      "metadata": { "sensorId": 5578, "type": "temperature" },
      "timestamp": ISODate("2021-05-18T12:00:00.000Z"),
      "temp": 12
   },
   {
      "metadata": { "sensorId": 5578, "type": "temperature" },
      "timestamp": ISODate("2021-05-18T16:00:00.000Z"),
      "temp": 16
   },
   {
      "metadata": { "sensorId": 5578, "type": "temperature" },
      "timestamp": ISODate("2021-05-18T20:00:00.000Z"),
      "temp": 15
   },
   {
      "metadata": { "sensorId": 5578, "type": "temperature" },
      "timestamp": ISODate("2021-05-19T00:00:00.000Z"),
      "temp": 13
   },
   {
      "metadata": { "sensorId": 5578, "type": "temperature" },
      "timestamp": ISODate("2021-05-19T04:00:00.000Z"),
      "temp": 12
   },
   {
      "metadata": { "sensorId": 5578, "type": "temperature" },
      "timestamp": ISODate("2021-05-19T08:00:00.000Z"),
      "temp": 11
   },
   {
      "metadata": { "sensorId": 5578, "type": "temperature" },
      "timestamp": ISODate("2021-05-19T12:00:00.000Z"),
      "temp": 12
   },
   {
      "metadata": { "sensorId": 5578, "type": "temperature" },
      "timestamp": ISODate("2021-05-19T16:00:00.000Z"),
      "temp": 17
   },
   {
      "metadata": { "sensorId": 5578, "type": "temperature" },
      "timestamp": ISODate("2021-05-19T20:00:00.000Z"),
      "temp": 12
   }
] )

To insert a single document, use the db.collection.insertOne() method.
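
A minimal sketch of a single-document insert, assuming the same document shape as the sample data above. The helper makeMeasurement and the reading it builds (a temperature of 14 on 2021-05-20) are made up for illustration:

```javascript
// Hypothetical helper that builds one measurement document in the same
// shape as the sample data above.
function makeMeasurement(sensorId, isoTimestamp, temp) {
   return {
      metadata: { sensorId: sensorId, type: "temperature" },
      timestamp: new Date(isoTimestamp), // the timeField must hold a date
      temp: temp
   };
}

// Made-up reading, used only to show the call.
const measurement = makeMeasurement(5578, "2021-05-20T00:00:00.000Z", 14);

// In mongosh:
// db.weather.insertOne(measurement)
```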

Query a Time Series Collection

To retrieve one document from a time series collection, issue the following command:

db.weather.findOne({
   "timestamp": ISODate("2021-05-18T00:00:00.000Z")
})
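
Queries often target one source over a time window rather than a single timestamp. The sketch below builds such a filter for the sample data; sensorRangeFilter is a hypothetical helper, and the find call is left as a comment so the snippet also runs outside mongosh:

```javascript
// Hypothetical helper: filter for one sensor's measurements in [start, end).
function sensorRangeFilter(sensorId, start, end) {
   return {
      "metadata.sensorId": sensorId,
      timestamp: { $gte: start, $lt: end }
   };
}

// All readings from sensor 5578 on 2021-05-18 (UTC).
const dayFilter = sensorRangeFilter(
   5578,
   new Date("2021-05-18T00:00:00.000Z"),
   new Date("2021-05-19T00:00:00.000Z")
);

// In mongosh:
// db.weather.find(dayFilter).sort({ timestamp: 1 })
```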

Run Aggregations on a Time Series Collection

For additional query functionality, use an aggregation pipeline such as:

db.weather.aggregate( [
   {
      $project: {
         date: {
            $dateToParts: { date: "$timestamp" }
         },
         temp: 1
      }
   },
   {
      $group: {
         _id: {
            date: {
               year: "$date.year",
               month: "$date.month",
               day: "$date.day"
            }
         },
         avgTmp: { $avg: "$temp" }
      }
   }
] )

The example aggregation pipeline groups all documents by the date of the measurement and then returns the average of all temperature measurements that day:

{
  "_id" : {
    "date" : {
      "year" : 2021,
      "month" : 5,
      "day" : 18
    }
  },
  "avgTmp" : 12.833333333333334
}
{
  "_id" : {
    "date" : {
      "year" : 2021,
      "month" : 5,
      "day" : 19
    }
  },
  "avgTmp" : 12.833333333333334
}
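
To make the two pipeline stages concrete, the following plain JavaScript sketch mirrors them on the twelve sample measurements inserted earlier. It is an illustration only, not how the server evaluates the pipeline: averageByDay is a hypothetical function, and it assumes that $dateToParts splits dates in UTC when no timezone is given.

```javascript
// The twelve sample measurements inserted earlier, as [timestamp, temp] pairs.
const samples = [
   ["2021-05-18T00:00:00.000Z", 12], ["2021-05-18T04:00:00.000Z", 11],
   ["2021-05-18T08:00:00.000Z", 11], ["2021-05-18T12:00:00.000Z", 12],
   ["2021-05-18T16:00:00.000Z", 16], ["2021-05-18T20:00:00.000Z", 15],
   ["2021-05-19T00:00:00.000Z", 13], ["2021-05-19T04:00:00.000Z", 12],
   ["2021-05-19T08:00:00.000Z", 11], ["2021-05-19T12:00:00.000Z", 12],
   ["2021-05-19T16:00:00.000Z", 17], ["2021-05-19T20:00:00.000Z", 12]
];

// Hypothetical mirror of the pipeline: split each date into UTC parts
// (the $project stage), then average temp per year/month/day (the $group stage).
function averageByDay(pairs) {
   const groups = new Map();
   for (const [iso, temp] of pairs) {
      const d = new Date(iso);
      const key = d.getUTCFullYear() + "-" + (d.getUTCMonth() + 1) + "-" + d.getUTCDate();
      const group = groups.get(key) || { sum: 0, count: 0 };
      group.sum += temp;
      group.count += 1;
      groups.set(key, group);
   }

   const results = [];
   for (const [key, group] of groups) {
      const [year, month, day] = key.split("-").map(Number);
      results.push({
         _id: { date: { year: year, month: month, day: day } },
         avgTmp: group.sum / group.count
      });
   }
   return results;
}

console.log(JSON.stringify(averageByDay(samples), null, 2));
```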

Check if a Collection is of Type Time Series

To determine if a collection is of type time series, use the listCollections command:

db.runCommand( { listCollections: 1.0 } )

For a time series collection, the command output includes an entry whose type is 'timeseries':

{
    cursor: {
       id: <number>,
       ns: 'test.$cmd.listCollections',
       firstBatch: [
         {
            name: <string>,
            type: 'timeseries',
            options: {
               expireAfterSeconds: <number>,
               timeseries: { ... }
            },
            ...
         },
         ...
       ]
    }
}
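
A reply of this shape can be scanned for time series collections with a few lines of JavaScript. The sketch below is illustrative: timeSeriesCollectionNames is a hypothetical helper, the sample reply is made up, and the helper only looks at the first batch of results.

```javascript
// Hypothetical helper: names of the time series collections found in a
// listCollections reply shaped like the output above.
function timeSeriesCollectionNames(reply) {
   return reply.cursor.firstBatch
      .filter((info) => info.type === "timeseries")
      .map((info) => info.name);
}

// Made-up reply for illustration; real replies carry more fields.
const sampleReply = {
   cursor: {
      id: 0,
      ns: "test.$cmd.listCollections",
      firstBatch: [
         {
            name: "weather",
            type: "timeseries",
            options: { timeseries: { timeField: "timestamp" } }
         },
         { name: "users", type: "collection", options: {} }
      ]
   }
};

console.log(timeSeriesCollectionNames(sampleReply)); // prints [ 'weather' ]

// In mongosh:
// timeSeriesCollectionNames(db.runCommand({ listCollections: 1 }))
```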

Behavior

Time series collections behave like normal collections. You can insert and query your data as you normally would. MongoDB treats time series collections as writable non-materialized views on internal collections that automatically organize time series data into an optimized storage format on insert.

When you query time series collections, you operate on one document per measurement. Queries on time series collections take advantage of the optimized internal storage format and return results faster.

Index

The implementation of time series collections uses internal collections that reduce disk usage and improve query efficiency. Time series collections automatically order and index data by time. The internal index for a time series collection is not displayed by listIndexes.

Tip

To improve query performance, you can manually add secondary indexes on the fields specified as the metaField and the timeField.
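
For example, a compound secondary index for the weather collection might look like the sketch below. The choice of the metadata.sensorId sub-field and the key order (source first, then time) are assumptions that suit queries for one sensor over a time range; the variable name is illustrative.

```javascript
// Illustrative index specification: metadata sub-field first, then time.
const sensorTimeIndex = { "metadata.sensorId": 1, "timestamp": 1 };

// In mongosh:
// db.weather.createIndex(sensorTimeIndex)
```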

Default Compression Algorithm

Time series collections ignore the global default compression algorithm, snappy, in favor of zstd, unless you specify a different compression algorithm with the storageEngine option when you create the collection. For example, to change the compression algorithm to snappy for a new weather collection, add the following option:

db.createCollection(
  "weather",
  {
     timeseries: {
        timeField: "timestamp"
     },
     storageEngine: {
        wiredTiger: {
           configString: "block_compressor=snappy"
        }
     }
  }
)

Valid block_compressor options are:

  • snappy
  • zlib
  • zstd (default)
  • none
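
A small sketch of how the storageEngine option could be generated while guarding against a mistyped compressor name. The helper storageEngineFor is hypothetical; it only accepts the four values listed above.

```javascript
// The block_compressor values listed above.
const VALID_BLOCK_COMPRESSORS = ["snappy", "zlib", "zstd", "none"];

// Hypothetical helper: build the storageEngine option for a given compressor.
function storageEngineFor(compressor) {
   if (!VALID_BLOCK_COMPRESSORS.includes(compressor)) {
      throw new Error("unknown block_compressor: " + compressor);
   }
   return {
      wiredTiger: { configString: "block_compressor=" + compressor }
   };
}

const zlibOptions = {
   timeseries: { timeField: "timestamp" },
   storageEngine: storageEngineFor("zlib")
};

// In mongosh:
// db.createCollection("weather", zlibOptions)
```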

Compression Improvements

Starting in MongoDB 5.2, time series collection data is further compressed to save database space. This compression does not affect query results, nor does it negatively affect performance.
