
Migrate Data into a Time Series Collection with an Aggregation Pipeline

Starting in MongoDB version 7.0, you can use the $out aggregation stage to migrate data from an existing collection into a time series collection.

Note

MongoDB does not guarantee output order when you use $out to migrate data into a time series collection. To maintain order, sort your data before you migrate with an aggregation pipeline.
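One way to read the note above is to place a $sort stage directly ahead of $out. The sketch below only builds such a pipeline as a plain array and does not run it. The helper name buildSortedMigration and the choice to sort ascending on the time field are assumptions made for illustration, and the target names match the ones used in the steps on this page.

```javascript
// Hypothetical helper: returns a pipeline that applies the given transform
// stages, sorts on the time field, and then writes with $out.
function buildSortedMigration(transformStages, timeField, outStage) {
  return [
    ...transformStages,
    { $sort: { [timeField]: 1 } }, // ascending by time (an assumption)
    outStage                       // $out must be the last stage
  ];
}

const sortedPipeline = buildSortedMigration(
  [], // transform stages, if any, go here
  "ts",
  {
    $out: {
      db: "mydatabase",
      coll: "weathernew",
      timeseries: { timeField: "ts", metaField: "metaData" }
    }
  }
);

// In mongosh, the pipeline could then be passed to aggregate(), for example:
// db.weatherdata.aggregate(sortedPipeline)
console.log(JSON.stringify(sortedPipeline, null, 2));
```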

Before You Begin

Consider a weatherdata collection that contains time and metadata information:

db.weatherdata.insertOne(
   {
      _id: ObjectId("5553a998e4b02cf7151190b8"),
      st: "x+47600-047900",
      ts: ISODate("1984-03-05T13:00:00Z"),
      position: {
         type: "Point",
         coordinates: [ -47.9, 47.6 ]
      },
      elevation: 9999,
      callLetters: "VCSZ",
      qualityControlProcess: "V020",
      dataSource: "4",
      type: "FM-13",
      airTemperature: { value: -3.1, quality: "1" },
      dewPoint: { value: 999.9, quality: "9" },
      pressure: { value: 1015.3, quality: "1" },
      wind: {
         direction: { angle: 999, quality: "9" },
         type: "9",
         speed: { rate: 999.9, quality: "9" }
      },
      visibility: {
         distance: { value: 999999, quality: "9" },
         variability: { value: "N", quality: "9" }
      },
      skyCondition: {
         ceilingHeight: { value: 99999, quality: "9", determination: "9" },
         cavok: "N"
      },
      sections: [ "AG1" ],
      precipitationEstimatedObservation: {
         discrepancy: "2",
         estimatedWaterDepth: 999
      }
   }
)

Steps

1

Create a metadata field.

If your collection doesn't include a field you can use to identify each series, transform your data to define one. In this example, the pipeline groups several identifying fields into a new metaData field, which becomes the metaField of the time series collection that you create.

Note

Choosing the right field as your time series metaField and the right granularity optimizes both storage and query performance. For more information on field selection and best practices, see metaField and Granularity Best Practices.

The pipeline below performs the following operations:

  • Uses $addFields to add a metaData field to each document in the weatherdata collection.
  • Uses $project to keep the listed fields and drop the top-level copies of the fields now stored in metaData.

db.weatherdata.aggregate([
   {
      $addFields: {
         metaData: {
            "st": "$st",
            "position": "$position",
            "elevation": "$elevation",
            "callLetters": "$callLetters",
            "qualityControlProcess": "$qualityControlProcess",
            "type": "$type"
         }
      }
   },
   {
      $project: {
         _id: 1,
         ts: 1,
         metaData: 1,
         dataSource: 1,
         airTemperature: 1,
         dewPoint: 1,
         pressure: 1,
         wind: 1,
         visibility: 1,
         skyCondition: 1,
         sections: 1,
         precipitationEstimatedObservation: 1
      }
   }
])
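To make the effect of these two stages easier to picture, the following sketch imitates them on a plain JavaScript object, without a server. It is only an illustration of the resulting document shape, not how MongoDB executes the stages; the function name reshape and the trimmed sample document are assumptions.

```javascript
// Fields that the $addFields stage copies into metaData.
const META_KEYS = ["st", "position", "elevation", "callLetters",
  "qualityControlProcess", "type"];

// Top-level fields that the $project stage keeps.
const KEPT_KEYS = ["_id", "ts", "metaData", "dataSource", "airTemperature",
  "dewPoint", "pressure", "wind", "visibility", "skyCondition", "sections",
  "precipitationEstimatedObservation"];

// Hypothetical local imitation of the two stages for a single document.
function reshape(doc) {
  // $addFields: copy the identifying fields into a metaData subdocument.
  const metaData = {};
  for (const key of META_KEYS) {
    if (key in doc) metaData[key] = doc[key];
  }
  const withMeta = { ...doc, metaData };

  // $project: keep only the listed top-level fields.
  const result = {};
  for (const key of KEPT_KEYS) {
    if (key in withMeta) result[key] = withMeta[key];
  }
  return result;
}

// Trimmed sample document (an assumption; the real document has more fields).
const sample = {
  _id: 1,
  st: "x+47600-047900",
  ts: new Date("1984-03-05T13:00:00Z"),
  elevation: 9999,
  type: "FM-13",
  dataSource: "4"
};

const reshaped = reshape(sample);
console.log(reshaped);
```

In the result, st, elevation, and type appear only inside metaData, while ts and dataSource stay at the top level.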
2

Create your time series collection and insert your data.

Add an $out aggregation stage to your pipeline to create a time series collection and insert your data into it. The stage below performs the following operations:

  • Uses $out with the timeseries option to create a weathernew time series collection in the mydatabase database.
  • Defines the metaData field as the metaField of the weathernew collection.
  • Defines the ts field as the timeField of the weathernew collection.

    Note

    The timeField of a time series collection must be a date type.

{
   $out: {
      db: "mydatabase",
      coll: "weathernew",
      timeseries: {
         timeField: "ts",
         metaField: "metaData",
         granularity: "seconds"
      }
   }
}
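The stages from step 1 and the $out stage can be assembled into a single pipeline, with $out as the last stage. The sketch below does this with plain variables; the variable names are illustrative, and it assumes the source collection is weatherdata, as populated in the Before You Begin section. The guard on db lets the same snippet load outside mongosh without running anything.

```javascript
// Step 1: group the identifying fields under metaData.
const addMeta = {
  $addFields: {
    metaData: {
      st: "$st",
      position: "$position",
      elevation: "$elevation",
      callLetters: "$callLetters",
      qualityControlProcess: "$qualityControlProcess",
      type: "$type"
    }
  }
};

// Step 1: keep the time field, metaData, and the measurement fields.
const keepFields = {
  $project: {
    _id: 1, ts: 1, metaData: 1, dataSource: 1, airTemperature: 1,
    dewPoint: 1, pressure: 1, wind: 1, visibility: 1, skyCondition: 1,
    sections: 1, precipitationEstimatedObservation: 1
  }
};

// Step 2: write the results into a new time series collection.
const writeOut = {
  $out: {
    db: "mydatabase",
    coll: "weathernew",
    timeseries: { timeField: "ts", metaField: "metaData", granularity: "seconds" }
  }
};

const migrationPipeline = [addMeta, keepFields, writeOut];

// Runs only inside mongosh, where the global `db` exists.
if (typeof db !== "undefined") {
  db.weatherdata.aggregate(migrationPipeline);
}
```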

For the aggregation stage syntax, see $out. For a full explanation of the time series options, see the Time Series Field Reference.
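Because the timeField must be a date type, it can help to look for offending documents in the source collection before running $out. The following sketch counts documents whose ts field is missing or is not a BSON date. The pipeline variable name and the output field name nonDateTs are assumptions for illustration.

```javascript
// Matches documents where ts is absent or has a non-date type,
// then counts them.
const nonDateCheck = [
  { $match: { ts: { $not: { $type: "date" } } } },
  { $count: "nonDateTs" }
];

// Runs only inside mongosh, where the global `db` exists.
if (typeof db !== "undefined") {
  // An empty array means no offending documents were found.
  printjson(db.weatherdata.aggregate(nonDateCheck).toArray());
}
```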

3

Review your data.

After you run this aggregation pipeline, you can use findOne() to view a document in your weathernew time series collection:

db.weathernew.findOne()

The operation returns the following document:

{
   _id: ObjectId("5553a998e4b02cf7151190b8"),
   ts: ISODate("1984-03-05T13:00:00Z"),
   metaData: {
      st: "x+47600-047900",
      position: {
         type: "Point",
         coordinates: [ -47.9, 47.6 ]
      },
      elevation: 9999,
      callLetters: "VCSZ",
      qualityControlProcess: "V020",
      type: "FM-13"
   },
   dataSource: "4",
   airTemperature: { value: -3.1, quality: "1" },
   dewPoint: { value: 999.9, quality: "9" },
   pressure: { value: 1015.3, quality: "1" },
   wind: {
      direction: { angle: 999, quality: "9" },
      type: "9",
      speed: { rate: 999.9, quality: "9" }
   },
   visibility: {
      distance: { value: 999999, quality: "9" },
      variability: { value: "N", quality: "9" }
   },
   skyCondition: {
      ceilingHeight: { value: 99999, quality: "9", determination: "9" },
      cavok: "N"
   },
   sections: [ "AG1" ],
   precipitationEstimatedObservation: { discrepancy: "2", estimatedWaterDepth: 999 }
}
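Beyond inspecting one document, a quick sanity check is to compare document counts between the source and the new collection. The sketch below assumes both collections are reachable from the current database in mongosh, where countDocuments() returns a number directly; the helper name compareCounts is hypothetical.

```javascript
// Hypothetical helper: compares document counts in two collections.
// `database` is expected to behave like the mongosh `db` object.
function compareCounts(database, sourceName, targetName) {
  const source = database.getCollection(sourceName).countDocuments({});
  const target = database.getCollection(targetName).countDocuments({});
  return { source, target, match: source === target };
}

// Runs only inside mongosh, where the global `db` exists.
if (typeof db !== "undefined") {
  printjson(compareCounts(db, "weatherdata", "weathernew"));
}
```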

Next Steps

If your original collection had secondary indexes, manually recreate them now.
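When recreating indexes, keep in mind that step 1 moved several fields under metaData, so an index key such as st would need to become metaData.st on the new collection. The sketch below lists the source indexes and prints remapped key patterns for review. The helper name remapIndexKey is hypothetical, index options are not carried over, and some index types may not be supported on time series collections, so treat the output as a starting point only.

```javascript
// Fields that step 1 moved under metaData.
const MOVED_FIELDS = ["st", "position", "elevation", "callLetters",
  "qualityControlProcess", "type"];

// Hypothetical helper: rewrites an index key pattern so that fields moved
// under the metaField still point at the same data.
function remapIndexKey(key, movedFields, metaField) {
  const remapped = {};
  for (const [path, direction] of Object.entries(key)) {
    const root = path.split(".")[0];
    const newPath = movedFields.includes(root) ? `${metaField}.${path}` : path;
    remapped[newPath] = direction;
  }
  return remapped;
}

// Runs only inside mongosh, where the global `db` exists.
if (typeof db !== "undefined") {
  for (const index of db.weatherdata.getIndexes()) {
    if (index.name === "_id_") continue; // default index, nothing to recreate
    // After reviewing the key, it could be created with:
    // db.weathernew.createIndex(remappedKey)
    printjson(remapIndexKey(index.key, MOVED_FIELDS, "metaData"));
  }
}
```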

If your time series collection includes timeField values before 1970-01-01T00:00:00.000Z or after 2038-01-19T03:14:07.000Z, MongoDB logs a warning and disables some query optimizations that make use of the internal clustered index. To regain query performance and resolve the log warning, create a secondary index on the timeField.
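As a minimal sketch of this check, the snippet below tests whether a date falls outside the range given above and, in mongosh, creates an ascending index on ts only if such a document exists. The constant and function names are assumptions, and ascending order for the index is an illustrative choice.

```javascript
// Range boundaries taken from the paragraph above.
const RANGE_START = new Date("1970-01-01T00:00:00.000Z");
const RANGE_END = new Date("2038-01-19T03:14:07.000Z");

// Returns true when the date is before the start or after the end of the range.
function isOutsideRange(date) {
  return date < RANGE_START || date > RANGE_END;
}

console.log(isOutsideRange(new Date("1984-03-05T13:00:00Z"))); // false
console.log(isOutsideRange(new Date("1969-12-31T23:59:59Z"))); // true

// Runs only inside mongosh, where the global `db` exists.
if (typeof db !== "undefined") {
  const outlier = db.weathernew.findOne({
    $or: [{ ts: { $lt: RANGE_START } }, { ts: { $gt: RANGE_END } }]
  });
  if (outlier !== null) {
    db.weathernew.createIndex({ ts: 1 });
  }
}
```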