$densify (aggregation)

Definition

$densify

New in version 5.1.

Creates new documents in a sequence of documents where certain values in a field are missing.

You can use $densify to:

Fill gaps in time series data.
Add missing values between groups of data.
Populate your data with a specified range of values.

Syntax

The $densify stage has this syntax:

{
   $densify: {
      field: <fieldName>,
      partitionByFields: [ <field 1>, <field 2> ... <field n> ],
      range: {
         step: <number>,
         unit: <time unit>,
         bounds: < "full" || "partition" > || [ < lower bound >, < upper bound > ]
      }
   }
}

The $densify stage takes a document with these fields:

Field	Necessity	Description
`field`	Required	The field to densify. The values of the specified `field` must be either all numeric values or all dates. Documents that do not contain the specified `field` continue through the pipeline unmodified. To specify a `<field>` in an embedded document or in an array, use dot notation. For restrictions, see `field` Restrictions.
`partitionByFields`	Optional	The set of fields that acts as the compound key to group the documents. In the `$densify` stage, each group of documents is known as a partition. If you omit this field, `$densify` uses one partition for the entire collection. For an example, see Densification with Partitions. For restrictions, see `partitionByFields` Restrictions.
`range`	Required	An object that specifies how the data is densified.
`range.bounds`	Required	You can specify `range.bounds` as either an array, `[ < lower bound >, < upper bound > ]`, or a string, either `"full"` or `"partition"`. If `bounds` is an array: `$densify` adds documents spanning the range of values within the specified bounds. The data type for the bounds must correspond to the data type in the field being densified. For behavior details, see `range.bounds` Behavior. If `bounds` is `"full"`: `$densify` adds documents spanning the full range of values of the `field` being densified. If `bounds` is `"partition"`: `$densify` adds documents to each partition, as if you had run a `full` range densification on each partition individually.
`range.step`	Required	The amount to increment the field value in each document. `$densify` creates a new document for each `step` between the existing documents. If `range.unit` is specified, `step` must be an integer. Otherwise, `step` can be any numeric value.
`range.unit`	Required if `field` is a date.	The unit to apply to the `step` field when incrementing date values in `field`. You can specify one of the following values for `unit` as a string: `millisecond`, `second`, `minute`, `hour`, `day`, `week`, `month`, `quarter`, `year`. For an example, see Densify Time Series Data.
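
As a brief illustration of how `range.step` and `range.unit` interact, the sketch below builds two stage documents: one for a numeric field and one for a date field. The field names `altitude` and `timestamp` are borrowed from the examples on this page, and the step values are arbitrary choices for illustration.

```javascript
// Numeric field: range.unit is omitted, and step may be any numeric value.
const densifyNumeric = {
  $densify: {
    field: "altitude",
    range: { step: 0.5, bounds: "full" }
  }
};

// Date field: range.unit is required, and step must be an integer.
const densifyDates = {
  $densify: {
    field: "timestamp",
    range: { step: 15, unit: "minute", bounds: "full" }
  }
};

// In mongosh, either stage could be run as:
//   db.<collection>.aggregate( [ densifyNumeric ] )
console.log(JSON.stringify(densifyNumeric), JSON.stringify(densifyDates));
```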

Behavior and Restrictions

`field` Restrictions

For documents that contain the specified field, $densify errors if:

Any document in the collection has a field value of type date and the unit field is not specified.
Any document in the collection has a field value of type numeric and the unit field is specified.
The field name begins with $. You must rename the field if you want to densify it. To rename fields, use $project.

`partitionByFields` Restrictions

$densify errors if any field name in the partitionByFields array:

Evaluates to a non-string value.
Begins with $.
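
The `field` and `partitionByFields` restrictions can be summarized as a small pre-flight check. The helper below is hypothetical: `checkDensifySpec` is not part of MongoDB, and the `fieldType` argument stands in for the caller's own knowledge of whether the field holds dates or numbers. It only mirrors the rules listed in this section and is not how the server validates the stage.

```javascript
// Hypothetical helper (not a MongoDB API): lists the documented conditions
// under which $densify would error for a given stage specification.
// fieldType is supplied by the caller: "date" or "number".
function checkDensifySpec(spec, fieldType) {
  const problems = [];
  const { field, partitionByFields = [], range = {} } = spec;

  if (typeof field === "string" && field.startsWith("$")) {
    problems.push("field name begins with $");
  }
  if (fieldType === "date" && range.unit === undefined) {
    problems.push("date field requires range.unit");
  }
  if (fieldType === "number" && range.unit !== undefined) {
    problems.push("numeric field must not specify range.unit");
  }
  for (const name of partitionByFields) {
    if (typeof name !== "string") {
      problems.push("partitionByFields entry is not a string");
    } else if (name.startsWith("$")) {
      problems.push("partitionByFields entry begins with $");
    }
  }
  return problems;
}

// A valid date specification produces no problems.
const ok = checkDensifySpec(
  { field: "timestamp", range: { step: 1, unit: "hour", bounds: "full" } },
  "date"
);

// A numeric field with a unit and a $-prefixed partition key produces two.
const bad = checkDensifySpec(
  {
    field: "altitude",
    partitionByFields: ["$variety"],
    range: { step: 200, unit: "hour", bounds: "full" }
  },
  "number"
);

console.log(ok);  // []
console.log(bad);
```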

`range.bounds` Behavior

If range.bounds is an array:

The lower bound indicates the start value for the added documents, irrespective of documents already in the collection.
The lower bound is inclusive.
The upper bound is exclusive.
$densify does not filter out documents with field values outside of the specified bounds.
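
The array-bounds rules above can be modeled in a few lines of plain JavaScript. This is an illustration for a numeric field only, not MongoDB's implementation; the function name `valuesToAdd` and the sample values are made up for the sketch.

```javascript
// Illustration only: which values would be added for array bounds
// on a numeric field, following the rules listed above.
function valuesToAdd(existing, lower, upper, step) {
  const present = new Set(existing);
  const added = [];
  // Start at the lower bound (inclusive) and stop before the upper bound (exclusive).
  for (let v = lower; v < upper; v += step) {
    // A value that already exists in the data is not added again.
    if (!present.has(v)) added.push(v);
  }
  return added;
}

// Existing values 2 and 12, bounds [ 0, 10 ], step 2.
const added = valuesToAdd([2, 12], 0, 10, 2);
console.log(added); // [ 0, 4, 6, 8 ]
```

The existing value 12 lies outside the bounds. It produces no new documents, but it would still pass through the stage because $densify does not filter.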

Order of Output

$densify does not guarantee sort order of the documents it outputs.

To guarantee sort order, use $sort on the field you want to sort by.
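
For instance, a pipeline might place a $sort stage directly after $densify. The sketch below assumes a collection with a date field named `timestamp`, such as the weather collection used in the examples on this page; the range values are illustrative.

```javascript
const pipeline = [
  {
    $densify: {
      field: "timestamp",
      range: { step: 1, unit: "hour", bounds: "full" }
    }
  },
  // Sort ascending on the densified field so the output order is defined.
  { $sort: { timestamp: 1 } }
];

// In mongosh: db.weather.aggregate( pipeline )
console.log(JSON.stringify(pipeline));
```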

Examples

Densify Time Series Data

Create a weather collection that contains temperature readings at four-hour intervals.

db.weather.insertMany( [
   {
       "metadata": { "sensorId": 5578, "type": "temperature" },
       "timestamp": ISODate("2021-05-18T00:00:00.000Z"),
       "temp": 12
   },
   {
       "metadata": { "sensorId": 5578, "type": "temperature" },
       "timestamp": ISODate("2021-05-18T04:00:00.000Z"),
       "temp": 11
   },
   {
       "metadata": { "sensorId": 5578, "type": "temperature" },
       "timestamp": ISODate("2021-05-18T08:00:00.000Z"),
       "temp": 11
   },
   {
       "metadata": { "sensorId": 5578, "type": "temperature" },
       "timestamp": ISODate("2021-05-18T12:00:00.000Z"),
       "temp": 12
   }
] )

This example uses the $densify stage to fill in the gaps between the four-hour intervals to achieve hourly granularity for the data points:

db.weather.aggregate( [
   {
      $densify: {
         field: "timestamp",
         range: {
            step: 1,
            unit: "hour",
            bounds:[ ISODate("2021-05-18T00:00:00.000Z"), ISODate("2021-05-18T08:00:00.000Z") ]
         }
      }
   }
] )

In the example:

- The $densify stage fills in the gaps of time between the recorded temperatures.
- field: "timestamp" densifies the timestamp field.
- range:
  - step: 1 increments the timestamp field by 1 unit.
  - unit: "hour" densifies the timestamp field by the hour.
  - bounds: [ ISODate("2021-05-18T00:00:00.000Z"), ISODate("2021-05-18T08:00:00.000Z") ] sets the range of time that is densified.

In the following output, the $densify stage fills in the gaps of time between the hours of 00:00:00 and 08:00:00.

[
  {
    _id: ObjectId("618c207c63056cfad0ca4309"),
    metadata: { sensorId: 5578, type: 'temperature' },
    timestamp: ISODate("2021-05-18T00:00:00.000Z"),
    temp: 12
  },
  { timestamp: ISODate("2021-05-18T01:00:00.000Z") },
  { timestamp: ISODate("2021-05-18T02:00:00.000Z") },
  { timestamp: ISODate("2021-05-18T03:00:00.000Z") },
  {
    _id: ObjectId("618c207c63056cfad0ca430a"),
    metadata: { sensorId: 5578, type: 'temperature' },
    timestamp: ISODate("2021-05-18T04:00:00.000Z"),
    temp: 11
  },
  { timestamp: ISODate("2021-05-18T05:00:00.000Z") },
  { timestamp: ISODate("2021-05-18T06:00:00.000Z") },
  { timestamp: ISODate("2021-05-18T07:00:00.000Z") },
  {
    _id: ObjectId("618c207c63056cfad0ca430b"),
    metadata: { sensorId: 5578, type: 'temperature' },
    timestamp: ISODate("2021-05-18T08:00:00.000Z"),
    temp: 11
  },
  {
    _id: ObjectId("618c207c63056cfad0ca430c"),
    metadata: { sensorId: 5578, type: 'temperature' },
    timestamp: ISODate("2021-05-18T12:00:00.000Z"),
    temp: 12
  }
]

Densification with Partitions

Create a coffee collection that contains data for two varieties of coffee beans:

db.coffee.insertMany( [
   {
      "altitude": 600,
      "variety": "Arabica Typica",
      "score": 68.3
   },
   {
      "altitude": 750,
      "variety": "Arabica Typica",
      "score": 69.5
   },
   {
      "altitude": 950,
      "variety": "Arabica Typica",
      "score": 70.5
   },
   {
      "altitude": 1250,
      "variety": "Gesha",
      "score": 88.15
   },
   {
      "altitude": 1700,
      "variety": "Gesha",
      "score": 95.5,
      "price": 1029
   }
] )

Densify the Full Range of Values

This example uses $densify to densify the altitude field for each coffee variety:

db.coffee.aggregate( [
   {
      $densify: {
         field: "altitude",
         partitionByFields: [ "variety" ],
         range: {
            bounds: "full",
            step: 200
         }
      }
   }
] )

The example aggregation:

Partitions the documents by variety to create one grouping for Arabica Typica and one for Gesha coffee.
Specifies a full range, meaning that the data is densified across the full range of existing documents for each partition.
Specifies a step of 200, meaning new documents are created at altitude intervals of 200.

The aggregation outputs the following documents:

[
   {
     _id: ObjectId("618c031814fbe03334480475"),
     altitude: 600,
     variety: 'Arabica Typica',
     score: 68.3
   },
   {
     _id: ObjectId("618c031814fbe03334480476"),
     altitude: 750,
     variety: 'Arabica Typica',
     score: 69.5
   },
   { variety: 'Arabica Typica', altitude: 800 },
   {
     _id: ObjectId("618c031814fbe03334480477"),
     altitude: 950,
     variety: 'Arabica Typica',
     score: 70.5
   },
   { variety: 'Gesha', altitude: 600 },
   { variety: 'Gesha', altitude: 800 },
   { variety: 'Gesha', altitude: 1000 },
   { variety: 'Gesha', altitude: 1200 },
   {
     _id: ObjectId("618c031814fbe03334480478"),
     altitude: 1250,
     variety: 'Gesha',
     score: 88.15
   },
   { variety: 'Gesha', altitude: 1400 },
   { variety: 'Gesha', altitude: 1600 },
   {
     _id: ObjectId("618c031814fbe03334480479"),
     altitude: 1700,
     variety: 'Gesha',
     score: 95.5,
     price: 1029
   },
   { variety: 'Arabica Typica', altitude: 1000 },
   { variety: 'Arabica Typica', altitude: 1200 },
   { variety: 'Arabica Typica', altitude: 1400 },
   { variety: 'Arabica Typica', altitude: 1600 }
 ]

This image visualizes the documents created with $densify:

State of the coffee collection after full-range densification

The darker squares represent the original documents in the collection.
The lighter squares represent the documents created with $densify.

Densify Values within Each Partition

This example uses $densify to densify only the gaps in the altitude field within each variety:
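
To see why the output above contains these particular altitudes, the following plain-JavaScript sketch recomputes the added values for each variety. It is a model of the documented behavior, not MongoDB's implementation. The function name `fullRangeAdditions` is made up, and the stopping condition at the collection maximum is an assumption, since the example data does not exercise that edge case.

```javascript
// Illustration only: a model of bounds: "full" with partitionByFields: [ "variety" ].
const docs = [
  { altitude: 600, variety: "Arabica Typica" },
  { altitude: 750, variety: "Arabica Typica" },
  { altitude: 950, variety: "Arabica Typica" },
  { altitude: 1250, variety: "Gesha" },
  { altitude: 1700, variety: "Gesha" }
];

function fullRangeAdditions(docs, step) {
  // "full" uses the minimum and maximum across the whole collection.
  const altitudes = docs.map((d) => d.altitude);
  const min = Math.min(...altitudes);
  const max = Math.max(...altitudes);

  const result = {};
  for (const variety of new Set(docs.map((d) => d.variety))) {
    const present = new Set(
      docs.filter((d) => d.variety === variety).map((d) => d.altitude)
    );
    result[variety] = [];
    // Assumption: stepping stops once the value would pass the collection maximum.
    for (let v = min; v <= max; v += step) {
      // Skip altitudes this partition already has.
      if (!present.has(v)) result[variety].push(v);
    }
  }
  return result;
}

const additions = fullRangeAdditions(docs, 200);
console.log(additions);
```

Every partition is stepped from the same starting point, 600, which is why Gesha gains documents at 600, 800, 1000, and 1200 even though its own lowest altitude is 1250.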

db.coffee.aggregate( [
   {
      $densify: {
         field: "altitude",
         partitionByFields: [ "variety" ],
         range: {
            bounds: "partition",
            step: 200
         }
      }
   }
] )