Docs HomeMongoDB Manual

$densify (aggregation)

Definition定义

$densify

New in version 5.1. 5.1版新增。

Creates new documents in a sequence of documents where certain values in a field are missing.在字段中缺少某些值的文档序列中创建新文档。

You can use $densify to:您可以使用$densitify来:

  • Fill gaps in time series data.填补时间序列数据中的空白。
  • Add missing values between groups of data.在数据组之间添加缺少的值。
  • Populate your data with a specified range of values.使用指定的值范围填充数据。

Syntax语法

The $densify stage has this syntax:$densitify阶段具有以下语法:

{
$densify: {
field: <fieldName>,
partitionByFields: [ <field 1>, <field 2> ... <field n> ],
range: {
step: <number>,
unit: <time unit>,
bounds: < "full" || "partition" > || [ < lower bound >, < upper bound > ]
}
}
}

The $densify stage takes a document with these fields:$densitify阶段获取具有以下字段的文档:

Field字段Necessity必要性Description描述
fieldRequired必要的The field to densify. The values of the specified field must either be all numeric values or all dates.要稠密化的字段。指定field的值必须全部为数值或全部为日期。
Documents that do not contain the specified field continue through the pipeline unmodified.不包含指定field的文档将继续通过管道,而不会被修改。
To specify a <field> in an embedded document or in an array, use dot notation.要在嵌入文档或数组中指定<field>,请使用点表示法
For restrictions, see field Restrictions. 有关限制,请参阅field限制
partitionByFieldsOptional可选的The set of fields to act as the compound key to group the documents. 用作组合文档的复合键的字段集。In the $densify stage, each group of documents is known as a partition.$density阶段,每组文档被称为一个分区
If you omit this field, $densify uses one partition for the entire collection.如果省略此字段,$densitify将为整个集合使用一个分区。
For an example, see Densifiction with Partitions.有关示例,请参阅分区稠密化
For restrictions, see partitionByFields Restrictions. 有关限制,请参阅partitionByFields限制
rangeRequired必要的An object that specifies how the data is densified. 指定数据稠密化方式的对象。
range.boundsRequired必要的You can specify range.bounds as either: 您可以将range.bounds指定为:
  • An array: [ < lower bound >, < upper bound > ],一个数组:[ < lower bound >, < upper bound > ]
  • A string: either "full" or "partition".一个字符串:要么是"full",要么是"partition"
If bounds is an array: 如果bounds是一个数组:
  • $densify adds documents spanning the range of values within the specified bounds.添加跨越指定界限内的值范围的文档。
  • The data type for the bounds must correspond to the data type in the field being densified.边界的数据类型必须与要稠密化的field中的数据类型相对应。
  • For behavior details, see range.bounds Behavior.有关行为的详细信息,请参阅range.bounds行为。
If bounds is "full": 如果bounds"full"
  • $densify adds documents spanning the full range of values of the field being densified.添加跨越要稠密化的field的全部值范围的文档。
If bounds is "partition": 如果bounds"partition"
  • $densify adds documents to each partition, similar to if you had run a full range densification on each partition individually.将文档添加到每个分区,类似于在每个分区上单独运行full范围致密化。
range.stepRequired必要的The amount to increment the field value in each document. 每个文档中field值的增量。$densify creates a new document for each step between the existing documents.为现有文档之间的每个step创建一个新文档。
If range.unit is specified, step must be an integer. 如果指定了range.unit,则step必须是整数。Otherwise, step can be any numeric value. 否则,step可以是任何数值。
range.unitRequired if field is a date.如果field是日期,则为必要的。The unit to apply to the step field when incrementing date values in field.在字段中递增日期值时应用于step字段的单位。
You can specify one of the following values for unit as a string: 您可以将以下unit值之一指定为字符串:
  • millisecond
  • second
  • minute
  • hour
  • day
  • week
  • month
  • quarter
  • year
For an example, see Densify Time Series Data. 有关示例,请参阅稠密化时间序列数据

Behavior and Restrictions行为和限制

field Restrictions限制

For documents that contain the specified field, $densify errors if:对于包含指定字段的文档,$density错误,如果:

  • Any document in the collection has a field value of type date and the unit field is not specified.集合中的任何文档都具有日期类型的field值,并且未指定unit字段。
  • Any document in the collection has a field value of type numeric and the unit field is specified.集合中的任何文档都有一个数字类型的field值,并且指定了unit字段。
  • The field name begins with $. field名称以$开头。You must rename the field if you want to densify it. 如果要使字段致密化,则必须重命名字段。To rename fields, use $project.若要重命名字段,请使用$project

partitionByFields Restrictions限制

$densify errors if any field name in the partitionByFields array:如果partitionByFields数组中的任何字段名存在以下情况,则$densify出错:

  • Evaluates to a non-string value.计算为非字符串值。
  • Begins with $.$开头。

range.bounds Behavior行为

If range.bounds is an array:如果range.bounds是一个数组:

  • The lower bound indicates the start value for the added documents, irrespective of documents already in the collection.下限表示添加的文档的起始值,而与集合中已存在的文档无关。
  • The lower bound is inclusive.下限包括在内。
  • The upper bound is exclusive.上限是排他性的。
  • $densify does not filter out documents with field values outside of the specified bounds.不会筛选出field值超出指定界限的文档。

Order of Output输出顺序

$densify does not guarantee sort order of the documents it outputs.不保证它输出的文档的排序顺序。

To guarantee sort order, use $sort on the field you want to sort by.要保证排序顺序,请对要排序的字段使用$sort

Examples实例

Densify Time Series Data稠密化时间序列数据

Create a weather collection that contains temperature readings over four hour intervals.创建一个weather集合,其中包含四小时内的温度读数。

db.weather.insertMany( [
{
"metadata": { "sensorId": 5578, "type": "temperature" },
"timestamp": ISODate("2021-05-18T00:00:00.000Z"),
"temp": 12
},
{
"metadata": { "sensorId": 5578, "type": "temperature" },
"timestamp": ISODate("2021-05-18T04:00:00.000Z"),
"temp": 11
},
{
"metadata": { "sensorId": 5578, "type": "temperature" },
"timestamp": ISODate("2021-05-18T08:00:00.000Z"),
"temp": 11
},
{
"metadata": { "sensorId": 5578, "type": "temperature" },
"timestamp": ISODate("2021-05-18T12:00:00.000Z"),
"temp": 12
}
] )

This example uses the $densify stage to fill in the gaps between the four-hour intervals to achieve hourly granularity for the data points:此示例使用$density阶段来填充四个小时间隔之间的间隙,以实现数据点的每小时粒度:

db.weather.aggregate( [
{
$densify: {
field: "timestamp",
range: {
step: 1,
unit: "hour",
bounds:[ ISODate("2021-05-18T00:00:00.000Z"), ISODate("2021-05-18T08:00:00.000Z") ]
}
}
}
] )

In the example:在示例中:

  • The $densify stage fills in the gaps of time in between the recorded temperatures.$density阶段填补了记录温度之间的时间间隔。

    • field: "timestamp" densifies the timestamp field.稠密化timestamp字段。
    • range:

      • step: 1 increments the timestamp field by 1 unit.timestamp字段增加1个单位。
      • unit: hour densifies the timestamp field by the hour.按小时稠密化timestamp字段。
      • bounds: [ ISODate("2021-05-18T00:00:00.000Z"), ISODate("2021-05-18T08:00:00.000Z") ] sets the range of time that is densified.设置稠密化的时间范围。

In the following output, the $densify stage fills in the gaps of time between the hours of 00:00:00 and 08:00:00.在以下输出中,$densify阶段填充00:00:0008:00:00之间的时间间隔。

[
{
_id: ObjectId("618c207c63056cfad0ca4309"),
metadata: { sensorId: 5578, type: 'temperature' },
timestamp: ISODate("2021-05-18T00:00:00.000Z"),
temp: 12
},
{ timestamp: ISODate("2021-05-18T01:00:00.000Z") },
{ timestamp: ISODate("2021-05-18T02:00:00.000Z") },
{ timestamp: ISODate("2021-05-18T03:00:00.000Z") },
{
_id: ObjectId("618c207c63056cfad0ca430a"),
metadata: { sensorId: 5578, type: 'temperature' },
timestamp: ISODate("2021-05-18T04:00:00.000Z"),
temp: 11
},
{ timestamp: ISODate("2021-05-18T05:00:00.000Z") },
{ timestamp: ISODate("2021-05-18T06:00:00.000Z") },
{ timestamp: ISODate("2021-05-18T07:00:00.000Z") },
{
_id: ObjectId("618c207c63056cfad0ca430b"),
metadata: { sensorId: 5578, type: 'temperature' },
timestamp: ISODate("2021-05-18T08:00:00.000Z"),
temp: 11
}
{
_id: ObjectId("618c207c63056cfad0ca430c"),
metadata: { sensorId: 5578, type: 'temperature' },
timestamp: ISODate("2021-05-18T12:00:00.000Z"),
temp: 12
}
]

Densifiction with Partitions分区稠密化

Create a coffee collection that contains data for two varieties of coffee beans:创建一个coffee集合,其中包含两种咖啡豆的数据:

db.coffee.insertMany( [
{
"altitude": 600,
"variety": "Arabica Typica",
"score": 68.3
},
{
"altitude": 750,
"variety": "Arabica Typica",
"score": 69.5
},
{
"altitude": 950,
"variety": "Arabica Typica",
"score": 70.5
},
{
"altitude": 1250,
"variety": "Gesha",
"score": 88.15
},
{
"altitude": 1700,
"variety": "Gesha",
"score": 95.5,
"price": 1029
}
] )

Densify the Full Range of Values稠密化所有值

This example uses $densify to densify the altitude field for each coffee variety:本例使用$densify来稠密化每个咖啡variety(品种)的altitude(海拔)字段:

db.coffee.aggregate( [
{
$densify: {
field: "altitude",
partitionByFields: [ "variety" ],
range: {
bounds: "full",
step: 200
}
}
}
] )

The example aggregation:示例聚合:

  • Partitions the documents by variety to create one grouping for Arabica Typica and one for Gesha coffee.variety对文档进行分区,为Arabica TypicaGesha咖啡创建一个分组。
  • Specifies a full range, meaning that the data is densified across the full range of existing documents for each partition.指定一个full范围,这意味着在每个分区的整个现有文档范围内对数据进行稠密化。
  • Specifies a step of 200, meaning new documents are created at altitude intervals of 200.指定step200,这意味着以200altitude间隔创建新文档。

The aggregation outputs the following documents:聚合输出以下文档:

[
{
_id: ObjectId("618c031814fbe03334480475"),
altitude: 600,
variety: 'Arabica Typica',
score: 68.3
},
{
_id: ObjectId("618c031814fbe03334480476"),
altitude: 750,
variety: 'Arabica Typica',
score: 69.5
},
{ variety: 'Arabica Typica', altitude: 800 },
{
_id: ObjectId("618c031814fbe03334480477"),
altitude: 950,
variety: 'Arabica Typica',
score: 70.5
},
{ variety: 'Gesha', altitude: 600 },
{ variety: 'Gesha', altitude: 800 },
{ variety: 'Gesha', altitude: 1000 },
{ variety: 'Gesha', altitude: 1200 },
{
_id: ObjectId("618c031814fbe03334480478"),
altitude: 1250,
variety: 'Gesha',
score: 88.15
},
{ variety: 'Gesha', altitude: 1400 },
{ variety: 'Gesha', altitude: 1600 },
{
_id: ObjectId("618c031814fbe03334480479"),
altitude: 1700,
variety: 'Gesha',
score: 95.5,
price: 1029
},
{ variety: 'Arabica Typica', altitude: 1000 },
{ variety: 'Arabica Typica', altitude: 1200 },
{ variety: 'Arabica Typica', altitude: 1400 },
{ variety: 'Arabica Typica', altitude: 1600 }
]

This image visualizes the documents created with $densify:此图像将使用$density:

State of the coffee collection after full-range densifiction
  • The darker squares represent the original documents in the collection.较暗的正方形表示集合中的原始文档。
  • The lighter squares represent the documents created with $densify.较浅的正方形表示使用$density创建的文档。

Densify Values within Each Partition加密每个分区中的值

This example uses $densify to only densify gaps in the altitude field within each variety:此示例使用$densify仅加密每个varietyaltitude字段中的间隙:

db.coffee.aggregate( [
{
$densify: {
field: "altitude",
partitionByFields: [ "variety" ],
range: {
bounds: "partition",
step: 200
}
}
}
] )

The example aggregation:示例聚合:

  • Partitions the documents by variety to create one grouping for Arabica Typica and one for Gesha coffee.variety对文档进行分区,为Arabica TypicaGesha咖啡创建一个分组。
  • Specifies a partition range, meaning that the data is densified within each partition.指定partition范围,这意味着数据在每个分区内都被加密。

    • For the Arabica Typica partition, the range is 600-900.对于Arabica Typica分区,范围为600-900
    • For the Gesha partition, the range is 1250-1700.对于Gesha分区,范围为1250-1700
  • Specifies a step of 200, meaning new documents are created at altitude intervals of 200.指定step200,这意味着以200altitude间隔创建新文档。

The aggregation outputs the following documents:聚合输出以下文档:

[
{
_id: ObjectId("618c031814fbe03334480475"),
altitude: 600,
variety: 'Arabica Typica',
score: 68.3
},
{
_id: ObjectId("618c031814fbe03334480476"),
altitude: 750,
variety: 'Arabica Typica',
score: 69.5
},
{ variety: 'Arabica Typica', altitude: 800 },
{
_id: ObjectId("618c031814fbe03334480477"),
altitude: 950,
variety: 'Arabica Typica',
score: 70.5
},
{
_id: ObjectId("618c031814fbe03334480478"),
altitude: 1250,
variety: 'Gesha',
score: 88.15
},
{ variety: 'Gesha', altitude: 1450 },
{ variety: 'Gesha', altitude: 1650 },
{
_id: ObjectId("618c031814fbe03334480479"),
altitude: 1700,
variety: 'Gesha',
score: 95.5,
price: 1029
}
]

This image visualizes the documents created with $densify:此图像将使用$densify:

State of the coffee collection after partition range densification
  • The darker squares represent the original documents in the collection.较暗的正方形表示集合中的原始文档。
  • The lighter squares represent the documents created with $densify.较浅的正方形表示使用$densify创建的文档。