$fill (aggregation)

On this page本页内容

Definition定义

$fill

New in version 5.3.在版本5.3中新增

Populates null and missing field values within documents.填充文档中的null字段值和缺失字段值。

You can use $fill to populate missing data points:您可以使用$fill填充缺少的数据点:

  • In a sequence based on surrounding values.以基于周围值的顺序。
  • With a fixed value.具有固定值。

Syntax语法

The $fill stage has this syntax:$fill阶段具有以下语法:

{
   $fill: {
      partitionBy: <expression>,
      partitionByFields: [ <field 1>, <field 2>, ... , <field n> ],
      sortBy: {
         <sort field 1>: <sort order>,
         <sort field 2>: <sort order>,
         ...,
         <sort field n>: <sort order>
      },
      output: {
         <field 1>: { value: <expression> },
         <field 2>: { method: <string> },
         ...
      }
   }
}

The $fill stage takes a document with these fields:$fill阶段接受包含以下字段的文档:

Field字段Necessity必要性Description描述
partitionByOptional可选

Specifies an expression to group the documents. 指定用于对文档进行分组的表达式In the $fill stage, a group of documents is known as a partition.$fill阶段,一组文档称为分区

If you omit partitionBy and partitionByFields, $fill uses one partition for the entire collection.如果省略partitionBypartitionByFields$fill对整个集合使用一个分区。

partitionBy and partitionByFields are mutually exclusive.partitionBypartitionByFields是互斥的。

See an example.看一个例子

partitionByFieldsOptional可选

Specifies an array of fields as the compound key to group the documents. 将字段数组指定为组合文档的复合键。In the $fill stage, each group of documents is known as a partition.$fill阶段,每组文档称为分区

If you omit partitionBy and partitionByFields, $fill uses one partition for the entire collection.如果省略partitionBypartitionByFields$fill对整个集合使用一个分区。

partitionBy and partitionByFields are mutually exclusive.partitionBypartitionByFields是互斥的。

See partitionByFields Restrictions.请参阅partitionByFields限制

sortBy

Required if method is specified in at least one output.<field>.如果在至少一个output.<field>中指定了method,则为必需。

Otherwise, optional.否则,它是可选的。

Specifies the field or fields to sort the documents within each partition. 指定用于对每个分区中的文档进行排序的一个或多个字段。Uses the same syntax as the $sort stage.使用与$sort阶段相同的语法。

outputRequired必需

Specifies an object containing each field for which to fill missing values. 指定一个对象,该对象包含要填充缺失值的每个字段。You can specify multiple fields in the output object.可以在output对象中指定多个字段。

The object name is the name of the field to fill. 对象名称是要填充的字段的名称。The object value specifies how the field is filled.对象值指定字段的填充方式。

output.<field>Required必需

Specifies an object indicating how to fill missing values in the target field.指定一个对象,指示如何填充目标字段中缺少的值。

The object name must be either value or method. 对象名称必须是valuemethodIf the name is:如果名称为:

Behavior and Restrictions行为和限制

partitionByFields Restrictions限制

$fill returns an error if any field name in the partitionByFields array:如果partitionByFields数组中有任何字段名,则$fill返回错误:

  • Evaluates to a non-string value.计算为非字符串值。
  • Begins with $.$开始。

linear Behavior行为

The linear fill method fills null and missing fields using linear interpolation based on the surrounding non-null values in the sequence.linear填充方法使用基于序列中周围非null值的线性插值来填充null字段和缺失字段。

  • For each document where the field is null or missing, linearFill fills those fields in proportion to the missing value range between surrounding non-null values according to the sortBy order. 对于每个字段为null或缺失的文档,linearFill根据sortBy顺序按照周围非null值之间的缺失值范围按比例填充这些字段。To determine the values for missing fields, linearFill uses:要确定缺失字段的值,linearFill使用:

    • The difference of surrounding non-null values.周围非null值的差异。
    • The number of null fields to fill between the surrounding values.要在周围值之间填充的null字段数。
  • The linear method can fill multiple consecutive null values if those values are preceded and followed by non-null values according to the sortBy order.linear方法可以根据排序顺序填充多个连续的null值,如果这些值之前和之后是非null值。

    Example示例

    If a collection contains these documents:如果集合包含以下文档:

    { index: 0, value: 0 },
    { index: 1, value: null },
    { index: 2, value: null },
    { index: 3, value: null },
    { index: 4, value: 10 }

    After using the linear fill method to fill the null values, the documents become:使用linear填充方法填充null值后,文档变为:

    { index: 0, value: 0 },
    { index: 1, value: 2.5 },
    { index: 2, value: 5 },
    { index: 3, value: 7.5 },
    { index: 4, value: 10 }
  • null values that are not preceded and followed by non-null values remain null.不在非null值之前和之后的null值保持为null
  • To use the linear fill method, you must also use the sortBy field to sort your data.要使用linear填充方法,还必须使用sortBy字段对数据进行排序。

    • When using the linear fill method, $fill returns an error if there are any repeated values in the sortBy field in a single partition.当使用linear填充方法时,如果在单个分区的sortBy字段中有任何重复值,$fill将返回一个错误。

For a complete example using the linear fill method, see Fill Missing Field Values with Linear Interpolation.有关使用linear填充方法的完整示例,请参阅使用线性插值填充缺少的字段值

locf Behavior行为

locf stands for last observation carried forward.代表结转的最后观察。

  • If a field being filled contains both null and non-null values, locf sets the null and missing values to the field's last known non-null value according to the sortBy order.如果要填充的字段同时包含null值和非null值,则locf会根据sortBy顺序将null值和缺失值设置为字段的最后一个已知非空值。

    • If the field contains only null or missing values in a partition, locf sets the field value to null for that partition.如果该字段在分区中仅包含null值或缺少值,则locf将该分区的字段值设置为null
    • null and missing field values that appear before non-null values in the sort order remain null.排序顺序中出现在非null值之前的null值和缺失字段值保持为null
  • To use the locf fill method, you must also use the sortBy field to sort your data.要使用locf填充方法,还必须使用sortBy字段对数据进行排序。

For a complete example using the locf fill method, see Fill Missing Field Values Based on the Last Observed Value.有关使用locf填充方法的完整示例,请参阅基于上次观察到的值填充缺少的字段值

Comparison of $fill and Aggregation Operators$fill和聚合运算符的比较

To fill null and missing field values within a document you can use:要填充文档中的null字段值和缺失字段值,可以使用:

  • The $fill stage.$fill阶段。

    When you use the $fill stage, the field you specify in the output is the same field used as the source data.使用$fill阶段时,输出中指定的字段与源数据中使用的字段相同。

  • The $linearFill and $locf aggregation operators.$linearFill$locf聚合运算符。

    When you $linearFill or $locf, you can set values for a different field than the field used as the source data.使用$linearFill$locf时,可以为不同于用作源数据的字段设置值。

Examples示例

The examples in this section show how to use $fill to fill missing values:本节中的示例显示了如何使用$fill填充缺失的值:

Fill Missing Field Values with a Constant Value用常量值填充缺少的字段值

A shoe store maintains a dailySales collection that contains a document summarizing each day's sales. 鞋店维护一个dailySales集合,其中包含一个总结每天销售的文档。The shoe store sells these types of shoes:鞋店销售以下类型的鞋:

  • boots
  • sandals
  • sneakers

Create the following dailySales collection:创建以下dailySales集合:

db.dailySales.insertMany( [
   {
      "date": ISODate("2022-02-02"),
      "bootsSold": 10,
      "sandalsSold": 20,
      "sneakersSold": 12
   },
   {
      "date": ISODate("2022-02-03"),
      "bootsSold": 7,
      "sneakersSold": 18
   },
   {
      "date": ISODate("2022-02-04"),
      "sneakersSold": 5
   }
] )

Not all of the documents in the dailySales collection contain each shoe type. dailySales集合中的并非所有文档都包含每种鞋型。If a shoe type is missing, it means there were no shoes of that type sold on the corresponding date.如果缺少鞋型,则意味着在相应日期没有销售该类型的鞋。

The following example uses $fill to set the quantities sold to 0 for the missing shoe types for each day's sales:以下示例使用$fill将每天销售的缺失鞋型的销售数量设置为0

db.dailySales.aggregate( [
   {
      $fill:
         {
            output:
               {
                  "bootsSold": { value: 0 },
                  "sandalsSold": { value: 0 },
                  "sneakersSold": { value: 0 }
               }
         }
   }
] )

In the preceding pipeline:在前面的管道中:

  • $fill fills in values for missing fields.填写缺失字段的值。
  • output specifies:指定:

    • The names of the fields to fill in.要填写的字段的名称。
    • The value to set the filled in fields to. 将填充字段设置为的值。In this example, the output specifies a constant value of 0.在本例中,输出指定一个常量值0

Example output:示例输出:

[
   {
     _id: ObjectId("6202df9f394d47411658b51e"),
     date: ISODate("2022-02-02T00:00:00.000Z"),
     bootsSold: 10,
     sandalsSold: 20,
     sneakersSold: 12
   },
   {
     _id: ObjectId("6202df9f394d47411658b51f"),
     date: ISODate("2022-02-03T00:00:00.000Z"),
     bootsSold: 7,
     sneakersSold: 18,
     sandalsSold: 0
   },
   {
     _id: ObjectId("6202df9f394d47411658b520"),
     date: ISODate("2022-02-04T00:00:00.000Z"),
     sneakersSold: 5,
     bootsSold: 0,
     sandalsSold: 0
   }
]

Fill Missing Field Values with Linear Interpolation用线性插值填充缺失的字段值

Create a stock collection that contains tracks a single company's stock price at hourly intervals:创建一个stock集合,包含每小时跟踪一家公司的股票价格:

db.stock.insertMany( [
   {
      time: ISODate("2021-03-08T09:00:00.000Z"),
      price: 500
   },
   {
      time: ISODate("2021-03-08T10:00:00.000Z"),
   },
   {
      time: ISODate("2021-03-08T11:00:00.000Z"),
      price: 515
   },
   {
      time: ISODate("2021-03-08T12:00:00.000Z")
   },
   {
      time: ISODate("2021-03-08T13:00:00.000Z")
   },
   {
      time: ISODate("2021-03-08T14:00:00.000Z"),
      price: 485
   }
] )

The price field is missing for some of the documents in the collection.集合中的某些文档缺少price字段。

To populate the missing price values using linear interpolation, use $fill with the linear fill method:要使用线性插值填充缺失的price值,请使用$filllinear填充方法:

db.stock.aggregate( [
   {
      $fill:
         {
            sortBy: { time: 1 },
            output:
               {
                  "price": { method: "linear" }
               }
         }
   }
] )

In the preceding pipeline:在前面的管道中:

  • $fill fills in values for missing fields.填写缺失字段的值。
  • sortBy: { time: 1 } sorts the documents by the time field in ascending order, from earliest to latest.time字段按升序对文档进行排序,从最早到最晚。
  • output specifies:指定:

    • price as the field for which to fill in missing values.作为要填充缺失值的字段。
    • { method: "linear" } as the fill method. 作为填充方法。The linear fill method fills missing price values using linear interpolation based on the surrounding price values in the sequence.linear填充方法使用基于序列中周围price值的线性插值来填充缺失的price值。

Example output:示例输出:

[
   {
      _id: ObjectId("620ad41c394d47411658b5e9"),
      time: ISODate("2021-03-08T09:00:00.000Z"),
      price: 500
   },
   {
      _id: ObjectId("620ad41c394d47411658b5ea"),
      time: ISODate("2021-03-08T10:00:00.000Z"),
      price: 507.5
   },
   {
      _id: ObjectId("620ad41c394d47411658b5eb"),
      time: ISODate("2021-03-08T11:00:00.000Z"),
      price: 515
   },
   {
      _id: ObjectId("620ad41c394d47411658b5ec"),
      time: ISODate("2021-03-08T12:00:00.000Z"),
      price: 505
   },
   {
      _id: ObjectId("620ad41c394d47411658b5ed"),
      time: ISODate("2021-03-08T13:00:00.000Z"),
      price: 495
   },
   {
      _id: ObjectId("620ad41c394d47411658b5ee"),
      time: ISODate("2021-03-08T14:00:00.000Z"),
      price: 485
   }
]

Fill Missing Field Values Based on the Last Observed Value根据上次观察到的值填充缺少的字段值

Create a restaurantReviews collection that contains review scores for a single restaurant over time:创建一个restaurantReviews集合,其中包含单个餐厅随时间变化的审核分数:

db.restaurantReviews.insertMany( [
   {
      date: ISODate("2021-03-08"),
      score: 90
   },
   {
      date: ISODate("2021-03-09"),
      score: 92
   },
   {
      date: ISODate("2021-03-10")
   },
   {
      date: ISODate("2021-03-11")
   },
   {
      date: ISODate("2021-03-12"),
      score: 85
   },
   {
      date: ISODate("2021-03-13")
   }
] )

The score field is missing for some of the documents in the collection.集合中的某些文档缺少score字段。

To populate the missing score fields and ensure that there are no gaps in the data, use $fill. 要填充缺失的score字段并确保数据中没有空白,请使用$fillIn the following example, $fill uses the locf fill method to fill the missing score values with the previous score in the sequence:在以下示例中,$fill使用locf填充方法用序列中的前一个score填充缺失的score值:

db.restaurantReviews.aggregate( [
   {
      $fill:
         {
            sortBy: { date: 1 },
            output:
               {
                  "score": { method: "locf" }
               }
         }
   }
] )

In the preceding pipeline:在前面的管道中:

  • $fill fills in missing score values.填写缺失的score值。
  • sortBy: { date: 1 } sorts the documents by the date field in ascending order, from earliest to latest.date字段按升序对文档进行排序,从最早到最晚。
  • output specifies:指定:

    • score as the field for which to fill in missing values.作为要填充缺失值的字段。
    • { method: "locf" } as the fill method. 作为填充方法。The locf fill method fills missing score values with the last observed score in the sequence.locf填充方法用序列中最后观察到的score填充缺失的score值。

Example output:示例输出:

[
   {
     _id: ObjectId("62040bc9394d47411658b553"),
     date: ISODate("2021-03-08T00:00:00.000Z"),
     score: 90
   },
   {
     _id: ObjectId("62040bc9394d47411658b554"),
     date: ISODate("2021-03-09T00:00:00.000Z"),
     score: 92
   },
   {
     _id: ObjectId("62040bc9394d47411658b555"),
     date: ISODate("2021-03-10T00:00:00.000Z"),
     score: 92
   },
   {
     _id: ObjectId("62040bc9394d47411658b556"),
     date: ISODate("2021-03-11T00:00:00.000Z"),
     score: 92
   },
   {
     _id: ObjectId("62040bc9394d47411658b557"),
     date: ISODate("2021-03-12T00:00:00.000Z"),
     score: 85
   },
   {
     _id: ObjectId("62040bc9394d47411658b558"),
     date: ISODate("2021-03-13T00:00:00.000Z"),
     score: 85
   }
]

Fill Data for Distinct Partitions填充不同分区的数据

Consider the previous example with restaurant reviews but instead of tracking a single restaurant, the collection now contains reviews for multiple restaurants.考虑前面的餐馆评论示例,但该集合现在包含多家餐馆的评论,而不是跟踪单个餐馆。

Create a collection named restaurantReviewsMultiple and populate the collection with these documents:创建名为restaurantReviewsMultiple的集合,并使用以下文档填充该集合:

db.restaurantReviewsMultiple.insertMany( [
   {
      date: ISODate("2021-03-08"),
      restaurant: "Joe's Pizza",
      score: 90
   },
   {
      date: ISODate("2021-03-08"),
      restaurant: "Sally's Deli",
      score: 75
   },
   {
      date: ISODate("2021-03-09"),
      restaurant: "Joe's Pizza",
      score: 92
   },
   {
      date: ISODate("2021-03-09"),
      restaurant: "Sally's Deli"
   },
   {
      date: ISODate("2021-03-10"),
      restaurant: "Joe's Pizza"
   },
   {
      date: ISODate("2021-03-10"),
      restaurant: "Sally's Deli",
      score: 68
   },
   {
      date: ISODate("2021-03-11"),
      restaurant: "Joe's Pizza",
      score: 93
   },
   {
      date: ISODate("2021-03-11"),
      restaurant: "Sally's Deli"
   }
] )

The score field is missing for some of the documents in the collection.集合中的某些文档缺少score字段。

To populate the missing score fields and ensure that there are no gaps in the data, use $fill. 要填充缺失的score字段并确保数据中没有空白,请使用$fillIn the following example, $fill uses the locf fill method to fill the missing score values with the previous score in the sequence:在以下示例中,$fill使用locf填充方法用序列中的前一个score填充缺失的score值:

db.restaurantReviewsMultiple.aggregate( [
   {
      $fill:
         {
            sortBy: { date: 1 },
            partitionBy: { "restaurant": "$restaurant" },
            output:
               {
                  "score": { method: "locf" }
               }
         }
   }
] )

In the preceding pipeline:在前面的管道中:

  • $fill fills in missing score values.填写缺失的score值。
  • sortBy: { date: 1 } sorts the documents by the date field in ascending order, from earliest to latest.date字段按升序对文档进行排序,从最早到最晚。
  • partitionBy: { "restaurant": "$restaurant" } partitions the data by restaurant. restaurant划分数据。There are two restaurants: Joe's Pizza and Sally's Deli.有两个餐馆:乔的披萨店和莎莉的熟食店。
  • output specifies:指定:

    • score as the field for which to fill in missing values.作为要填充缺失值的字段。
    • { method: "locf" } as the fill method. 作为填充方法。The locf fill method fills missing score values with the last observed score in the sequence.locf填充方法用序列中最后观察到的score填充缺失的score值。

Example output:示例输出:

[
  {
    _id: ObjectId("620559f4394d47411658b58f"),
    date: ISODate("2021-03-08T00:00:00.000Z"),
    restaurant: "Joe's Pizza",
    score: 90
  },
  {
    _id: ObjectId("620559f4394d47411658b591"),
    date: ISODate("2021-03-09T00:00:00.000Z"),
    restaurant: "Joe's Pizza",
    score: 92
  },
  {
    _id: ObjectId("620559f4394d47411658b593"),
    date: ISODate("2021-03-10T00:00:00.000Z"),
    restaurant: "Joe's Pizza",
    score: 92
  },
  {
    _id: ObjectId("620559f4394d47411658b595"),
    date: ISODate("2021-03-11T00:00:00.000Z"),
    restaurant: "Joe's Pizza",
    score: 93
  },
  {
    _id: ObjectId("620559f4394d47411658b590"),
    date: ISODate("2021-03-08T00:00:00.000Z"),
    restaurant: "Sally's Deli",
    score: 75
  },
  {
    _id: ObjectId("620559f4394d47411658b592"),
    date: ISODate("2021-03-09T00:00:00.000Z"),
    restaurant: "Sally's Deli",
    score: 75
  },
  {
    _id: ObjectId("620559f4394d47411658b594"),
    date: ISODate("2021-03-10T00:00:00.000Z"),
    restaurant: "Sally's Deli",
    score: 68
  },
  {
    _id: ObjectId("620559f4394d47411658b596"),
    date: ISODate("2021-03-11T00:00:00.000Z"),
    restaurant: "Sally's Deli",
    score: 68
  }
]
←  $facet (aggregation)$geoNear (aggregation) →