$densify (aggregation)


Definition

$densify

New in version 5.1.

Creates new documents in a sequence of documents where certain values in a field are missing.

You can use $densify to:

- Fill gaps in time series data.
- Add missing values between groups of data.
- Populate your data with a specified range of values.

Syntax

The $densify stage has this syntax:

{
   $densify: {
      field: <fieldName>,
      partitionByFields: [ <field 1>, <field 2> ... <field n> ],
      range: {
         step: <number>,
         unit: <time unit>,
         bounds: < "full" || "partition" > || [ < lower bound >, < upper bound > ]
      }
   }
}

The $densify stage takes a document with these fields:

Field	Necessity	Description
`field`	Required	The field to densify. The values of the specified `field` must either be all numeric values or all dates. Documents that do not contain the specified `field` continue through the pipeline unmodified. To specify a `<field>` in an embedded document or in an array, use dot notation. For restrictions, see `field` Restrictions.
`partitionByFields`	Optional	The set of fields to act as the compound key to group the documents. In the `$densify` stage, each group of documents is known as a partition. If you omit this field, `$densify` uses one partition for the entire collection. For an example, see Densification with Partitions. For restrictions, see `partitionByFields` Restrictions.
`range`	Required	An object that specifies how the data is densified.
`range.bounds`	Required	You can specify `range.bounds` as either an array, `[ < lower bound >, < upper bound > ]`, or a string, either `"full"` or `"partition"`. If `bounds` is an array: `$densify` adds documents spanning the range of values within the specified bounds. The data type for the bounds must correspond to the data type in the `field` being densified. For behavior details, see `range.bounds` Behavior. If `bounds` is `"full"`: `$densify` adds documents spanning the full range of values of the `field` being densified. If `bounds` is `"partition"`: `$densify` adds documents to each partition, as if you had run a `full` range densification on each partition individually.
`range.step`	Required	The amount to increment the `field` value in each document. `$densify` creates a new document for each `step` between the existing documents. If `range.unit` is specified, `step` must be an integer. Otherwise, `step` can be any numeric value.
`range.unit`	Required if `field` is a date.	The unit to apply to the `step` field when incrementing date values in `field`. You can specify one of the following values for `unit` as a string: `millisecond`, `second`, `minute`, `hour`, `day`, `week`, `month`, `quarter`, `year`. For an example, see Densify Time Series Data.
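
The field rules in the table can be checked on the client before a pipeline is sent. The following is a minimal sketch in plain JavaScript, not part of MongoDB: `buildDensifyStage` and `TIME_UNITS` are hypothetical names, and the checks cover only the constraints stated in the table (the form of `bounds`, the allowed `unit` strings, and the integer `step` requirement when `unit` is set).

```javascript
// Hypothetical helper (not a MongoDB API): assembles a $densify stage and
// applies the constraints listed in the field table.
const TIME_UNITS = ["millisecond", "second", "minute", "hour", "day",
                    "week", "month", "quarter", "year"];

function buildDensifyStage({ field, partitionByFields, step, unit, bounds }) {
  if (typeof field !== "string" || field.length === 0) {
    throw new Error("field is required");
  }
  // bounds is either one of two strings or a [lower, upper] array.
  const boundsOk = bounds === "full" || bounds === "partition" ||
    (Array.isArray(bounds) && bounds.length === 2);
  if (!boundsOk) {
    throw new Error('range.bounds must be "full", "partition", or [lower, upper]');
  }
  if (typeof step !== "number" || Number.isNaN(step)) {
    throw new Error("range.step must be numeric");
  }
  if (unit !== undefined) {
    if (!TIME_UNITS.includes(unit)) {
      throw new Error("unknown range.unit: " + unit);
    }
    // With a unit, step must be an integer.
    if (!Number.isInteger(step)) {
      throw new Error("range.step must be an integer when range.unit is set");
    }
  }
  const range = { step, bounds };
  if (unit !== undefined) range.unit = unit;
  const spec = { field, range };
  if (partitionByFields !== undefined) spec.partitionByFields = partitionByFields;
  return { $densify: spec };
}

const stage = buildDensifyStage({
  field: "timestamp",
  step: 1,
  unit: "hour",
  bounds: "full"
});
console.log(JSON.stringify(stage));
```

The helper cannot see the stored data, so it does not check whether `unit` matches the type of the values in `field`; the server still enforces that rule.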

Behavior and Restrictions

`field` Restrictions

For documents that contain the specified field, $densify returns an error if:

- Any document in the collection has a field value of type date and the unit field is not specified.
- Any document in the collection has a numeric field value and the unit field is specified.
- The field name begins with $. You must rename the field if you want to densify it. To rename fields, use $project.

`partitionByFields` Restrictions

$densify returns an error if any field name in the partitionByFields array:

- Evaluates to a non-string value.
- Begins with $.

`range.bounds` Behavior

If range.bounds is an array:

- The lower bound indicates the start value for the added documents, irrespective of documents already in the collection.
- The lower bound is inclusive.
- The upper bound is exclusive.
- $densify does not filter out documents with field values outside of the specified bounds.
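
These four rules can be illustrated with a small model in plain JavaScript. This is a sketch of the documented behavior, not MongoDB code: `densifyNumeric` is a hypothetical name, the input is a bare list of integer field values rather than documents, and the result is sorted only to make it easy to read.

```javascript
// Hypothetical model of array bounds on a numeric field; not MongoDB code.
function densifyNumeric(values, step, [lower, upper]) {
  const existing = new Set(values);
  const added = [];
  // Start at the lower bound (inclusive) and stop before the upper bound
  // (exclusive), skipping values that already exist.
  for (let v = lower; v < upper; v += step) {
    if (!existing.has(v)) added.push(v);
  }
  // Existing values are all kept, including those outside the bounds.
  // The sort is for readability only.
  return [...values, ...added].sort((a, b) => a - b);
}

const result = densifyNumeric([0, 4, 12], 2, [2, 8]);
console.log(result); // [ 0, 2, 4, 6, 12 ]
```

In this run the values 2 and 6 are added, 8 is not created because the upper bound is exclusive, and the existing values 0 and 12 remain even though they fall outside the bounds.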

Order of Output

$densify does not guarantee sort order of the documents it outputs.

To guarantee sort order, use $sort on the field you want to sort by.
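
As a sketch of that advice, the pipeline below places a $sort stage directly after $densify. It assumes the `weather` collection and `timestamp` field from the Examples section; the choice of `bounds: "full"` and ascending order is for illustration only.

```javascript
// Pipeline definition only. In mongosh, run it with:
//   db.weather.aggregate(pipeline)
const pipeline = [
  {
    $densify: {
      field: "timestamp",
      range: { step: 1, unit: "hour", bounds: "full" }
    }
  },
  // Sort explicitly, because $densify does not guarantee output order.
  { $sort: { timestamp: 1 } }
];
console.log(JSON.stringify(pipeline, null, 2));
```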

Examples

Densify Time Series Data

Create a weather collection that contains temperature readings at four-hour intervals.

db.weather.insertMany( [
   {
       "metadata": { "sensorId": 5578, "type": "temperature" },
       "timestamp": ISODate("2021-05-18T00:00:00.000Z"),
       "temp": 12
   },
   {
       "metadata": { "sensorId": 5578, "type": "temperature" },
       "timestamp": ISODate("2021-05-18T04:00:00.000Z"),
       "temp": 11
   },
   {
       "metadata": { "sensorId": 5578, "type": "temperature" },
       "timestamp": ISODate("2021-05-18T08:00:00.000Z"),
       "temp": 11
   },
   {
       "metadata": { "sensorId": 5578, "type": "temperature" },
       "timestamp": ISODate("2021-05-18T12:00:00.000Z"),
       "temp": 12
   }
] )

This example uses the $densify stage to fill in the gaps between the four-hour intervals to achieve hourly granularity for the data points:

db.weather.aggregate( [
   {
      $densify: {
         field: "timestamp",
         range: {
            step: 1,
            unit: "hour",
            bounds:[ ISODate("2021-05-18T00:00:00.000Z"), ISODate("2021-05-18T08:00:00.000Z") ]
         }
      }
   }
] )

In the example:

The $densify stage fills in the gaps of time in between the recorded temperatures.
- field: "timestamp" densifies the timestamp field.
- range:
  - step: 1 increments the timestamp field by 1 unit.
  - unit: "hour" densifies the timestamp field by the hour.
  - bounds: [ ISODate("2021-05-18T00:00:00.000Z"), ISODate("2021-05-18T08:00:00.000Z") ] sets the range of time that is densified.

In the following output, the $densify stage fills in the gaps of time between the hours of 00:00:00 and 08:00:00.

[
  {
    _id: ObjectId("618c207c63056cfad0ca4309"),
    metadata: { sensorId: 5578, type: 'temperature' },
    timestamp: ISODate("2021-05-18T00:00:00.000Z"),
    temp: 12
  },
  { timestamp: ISODate("2021-05-18T01:00:00.000Z") },
  { timestamp: ISODate("2021-05-18T02:00:00.000Z") },
  { timestamp: ISODate("2021-05-18T03:00:00.000Z") },
  {
    _id: ObjectId("618c207c63056cfad0ca430a"),
    metadata: { sensorId: 5578, type: 'temperature' },
    timestamp: ISODate("2021-05-18T04:00:00.000Z"),
    temp: 11
  },
  { timestamp: ISODate("2021-05-18T05:00:00.000Z") },
  { timestamp: ISODate("2021-05-18T06:00:00.000Z") },
  { timestamp: ISODate("2021-05-18T07:00:00.000Z") },
  {
    _id: ObjectId("618c207c63056cfad0ca430b"),
    metadata: { sensorId: 5578, type: 'temperature' },
    timestamp: ISODate("2021-05-18T08:00:00.000Z"),
    temp: 11
  },
  {
    _id: ObjectId("618c207c63056cfad0ca430c"),
    metadata: { sensorId: 5578, type: 'temperature' },
    timestamp: ISODate("2021-05-18T12:00:00.000Z"),
    temp: 12
  }
]
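
The six added documents in this output can be reproduced with a short model in plain JavaScript. This is a sketch, not MongoDB code: `missingHours` and `HOUR_MS` are hypothetical names, and the model handles only the `hour` unit used in this example.

```javascript
// Hypothetical model of the time series example; not MongoDB code.
const HOUR_MS = 60 * 60 * 1000;

function missingHours(existing, lower, upper, stepHours) {
  const seen = new Set(existing.map((d) => d.getTime()));
  const added = [];
  // Walk from the lower bound (inclusive) to the upper bound (exclusive).
  for (let t = lower.getTime(); t < upper.getTime(); t += stepHours * HOUR_MS) {
    if (!seen.has(t)) added.push(new Date(t));
  }
  return added;
}

// Timestamps of the four readings in the weather collection.
const readings = [
  new Date("2021-05-18T00:00:00.000Z"),
  new Date("2021-05-18T04:00:00.000Z"),
  new Date("2021-05-18T08:00:00.000Z"),
  new Date("2021-05-18T12:00:00.000Z")
];

const added = missingHours(
  readings,
  new Date("2021-05-18T00:00:00.000Z"),
  new Date("2021-05-18T08:00:00.000Z"),
  1
);
console.log(added.map((d) => d.toISOString()));
```

The model yields the hours 01:00 through 03:00 and 05:00 through 07:00. Nothing is generated between 08:00 and 12:00 because that gap lies outside the bounds, which matches the output above.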

Densification with Partitions

Create a coffee collection that contains data for two varieties of coffee beans:

db.coffee.insertMany( [
   {
      "altitude": 600,
      "variety": "Arabica Typica",
      "score": 68.3
   },
   {
      "altitude": 750,
      "variety": "Arabica Typica",
      "score": 69.5
   },
   {
      "altitude": 950,
      "variety": "Arabica Typica",
      "score": 70.5
   },
   {
      "altitude": 1250,
      "variety": "Gesha",
      "score": 88.15
   },
   {
      "altitude": 1700,
      "variety": "Gesha",
      "score": 95.5,
      "price": 1029
   }
] )

Densify the Full Range of Values

This example uses $densify to densify the altitude field for each coffee variety:

db.coffee.aggregate( [
   {
      $densify: {
         field: "altitude",
         partitionByFields: [ "variety" ],
         range: {
            bounds: "full",
            step: 200
         }
      }
   }
] )

The example aggregation:

- Partitions the documents by variety to create one grouping for Arabica Typica and one for Gesha coffee.
- Specifies a full range, meaning that the data is densified across the full range of existing documents for each partition.
- Specifies a step of 200, meaning new documents are created at altitude intervals of 200.

The aggregation outputs the following documents:

[
   {
     _id: ObjectId("618c031814fbe03334480475"),
     altitude: 600,
     variety: 'Arabica Typica',
     score: 68.3
   },
   {
     _id: ObjectId("618c031814fbe03334480476"),
     altitude: 750,
     variety: 'Arabica Typica',
     score: 69.5
   },
   { variety: 'Arabica Typica', altitude: 800 },
   {
     _id: ObjectId("618c031814fbe03334480477"),
     altitude: 950,
     variety: 'Arabica Typica',
     score: 70.5
   },
   { variety: 'Gesha', altitude: 600 },
   { variety: 'Gesha', altitude: 800 },
   { variety: 'Gesha', altitude: 1000 },
   { variety: 'Gesha', altitude: 1200 },
   {
     _id: ObjectId("618c031814fbe03334480478"),
     altitude: 1250,
     variety: 'Gesha',
     score: 88.15
   },
   { variety: 'Gesha', altitude: 1400 },
   { variety: 'Gesha', altitude: 1600 },
   {
     _id: ObjectId("618c031814fbe03334480479"),
     altitude: 1700,
     variety: 'Gesha',
     score: 95.5,
     price: 1029
   },
   { variety: 'Arabica Typica', altitude: 1000 },
   { variety: 'Arabica Typica', altitude: 1200 },
   { variety: 'Arabica Typica', altitude: 1400 },
   { variety: 'Arabica Typica', altitude: 1600 }
 ]

This image visualizes the documents created with $densify:

[Figure: State of the coffee collection after full-range densification]

The darker squares represent the original documents in the collection.
The lighter squares represent the documents created with $densify.
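
The set of created documents can also be modeled in plain JavaScript. This is a sketch, not MongoDB code: `densifyFull` is a hypothetical name, it supports a single partition field, and treating the collection maximum as an inclusive limit is an assumption that does not affect this data set.

```javascript
// Hypothetical model of bounds: "full" with one partition field; not MongoDB code.
function densifyFull(docs, field, partitionField, step) {
  const values = docs.map((d) => d[field]);
  const min = Math.min(...values);
  const max = Math.max(...values);
  const partitions = [...new Set(docs.map((d) => d[partitionField]))];
  const created = [];
  for (const key of partitions) {
    const existing = new Set(
      docs.filter((d) => d[partitionField] === key).map((d) => d[field])
    );
    // Every partition is stepped across the range of the whole collection.
    // Assumption: the loop includes the maximum if a step lands on it.
    for (let v = min; v <= max; v += step) {
      if (!existing.has(v)) created.push({ [partitionField]: key, [field]: v });
    }
  }
  return created;
}

// Only the fields that matter for densification are modeled here.
const coffee = [
  { altitude: 600, variety: "Arabica Typica" },
  { altitude: 750, variety: "Arabica Typica" },
  { altitude: 950, variety: "Arabica Typica" },
  { altitude: 1250, variety: "Gesha" },
  { altitude: 1700, variety: "Gesha" }
];

const created = densifyFull(coffee, "altitude", "variety", 200);
console.log(created);
```

The model produces five Arabica Typica documents (800 through 1600) and six Gesha documents (600 through 1600), the same eleven lighter squares described above. The steps start at 600, the lowest altitude in the whole collection, which is why Gesha receives documents below its own minimum of 1250.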

Densify Values within Each Partition

This example uses $densify to densify only the gaps in the altitude field within each variety:

db.coffee.aggregate( [
   {
      $densify: {
         field: "altitude",
         partitionByFields: [ "variety" ],
         range: {
            bounds: "partition",
            step: 200
         }
      }
   }
] )