Docs HomeMongoDB Manual

$fill (aggregation)

Definition定义

$fill

New in version 5.3. 5.3版新增。

Populates null and missing field values within documents.填充文档中的空字段值和缺少的字段值。

You can use $fill to populate missing data points:您可以使用$fill来填充缺失的数据点:

  • In a sequence based on surrounding values.在基于周围值的序列中。
  • With a fixed value.具有固定值。

Syntax语法

The $fill stage has this syntax:$fill阶段具有以下语法:

{
$fill: {
partitionBy: <expression>,
partitionByFields: [ <field 1>, <field 2>, ... , <field n> ],
sortBy: {
<sort field 1>: <sort order>,
<sort field 2>: <sort order>,
...,
<sort field n>: <sort order>
},
output: {
<field 1>: { value: <expression> },
<field 2>: { method: <string> },
...
}
}
}

The $fill stage takes a document with these fields:$fill阶段接受具有以下字段的文档:

Field字段Necessity必要性Description描述
partitionByOptional可选的Specifies an expression to group the documents. 指定用于对文档进行分组的表达式In the $fill stage, a group of documents is known as a partition.$fill阶段,一组文档被称为分区
If you omit partitionBy and partitionByFields, $fill uses one partition for the entire collection.如果省略partitionBypartitionByFields$fill将为整个集合使用一个分区。
partitionBy and partitionByFields are mutually exclusive.partitionBypartitionByFields$fill是互斥的。
See an example. 请参见示例
partitionByFieldsOptional可选的Specifies an array of fields as the compound key to group the documents. 指定一个字段数组作为复合键以对文档进行分组。In the $fill stage, each group of documents is known as a partition.$fill阶段,每组文档被称为一个分区
If you omit partitionBy and partitionByFields, $fill uses one partition for the entire collection.如果省略partitionBypartitionByFields$fill将为整个集合使用一个分区。
partitionBy and partitionByFields are mutually exclusive.partitionBypartitionByFields是互斥的。
See partitionByFields Restrictions. 请参阅partitionByFields限制
sortByRequired if method is specified in at least one output.<field>.如果至少在一个output.<field>中指定了method,则为必要的。
Otherwise, optional. 否则它是可选的。
Specifies the field or fields to sort the documents within each partition. 指定要在每个分区中对文档进行排序的一个或多个字段。Uses the same syntax as the $sort stage. 使用与$sort阶段相同的语法。
outputRequired必要的Specifies an object containing each field for which to fill missing values. 指定一个对象,该对象包含要为其填充缺失值的每个字段。You can specify multiple fields in the output object.可以在输出对象中指定多个字段。
The object name is the name of the field to fill. 对象名称是要填充的字段的名称。The object value specifies how the field is filled. 对象值指定字段的填充方式。
output.<field>Required必要的Specifies an object indicating how to fill missing values in the target field. 指定一个对象,指示如何填充目标字段中缺少的值。The object name must be either value or method. 对象名称必须是valuemethodIf the name is: 如果名称为:

Behavior and Restrictions行为和限制

partitionByFields Restrictions限制

$fill returns an error if any field name in the partitionByFields array:如果partitionByFields数组中有任何字段名存在以下情况,则$fill返回错误:

  • Evaluates to a non-string value.计算为非字符串值。
  • Begins with $.$开头。

linear Behavior行为

The linear fill method fills null and missing fields using linear interpolation based on the surrounding non-null values in the sequence.linear填充方法使用基于序列中周围非null值的线性插值来填充空字段和缺失字段。

  • For each document where the field is null or missing, linearFill fills those fields in proportion to the missing value range between surrounding non-null values according to the sortBy order. 对于字段为null或缺失的每个文档,linearFill根据sortBy顺序,按照周围非null值之间的缺失值范围的比例填充这些字段。To determine the values for missing fields, linearFill uses:要确定缺失字段的值,linearFill使用:

    • The difference of surrounding non-null values.周围非null值的差异。
    • The number of null fields to fill between the surrounding values.要在周围值之间填充的空字段数。
  • The linear method can fill multiple consecutive null values if those values are preceded and followed by non-null values according to the sortBy order.linear方法可以填充多个连续的null值,如果这些值前面和后面都是根据sortBy顺序的非null值。

    Example

    If a collection contains these documents:如果集合包含以下文档:

    { index: 0, value: 0 },
    { index: 1, value: null },
    { index: 2, value: null },
    { index: 3, value: null },
    { index: 4, value: 10 }

    After using the linear fill method to fill the null values, the documents become:使用linear填充方法填充null值后,文档变为:

    { index: 0, value: 0 },
    { index: 1, value: 2.5 },
    { index: 2, value: 5 },
    { index: 3, value: 7.5 },
    { index: 4, value: 10 }
  • null values that are not preceded and followed by non-null values remain null.前面和后面没有非null值的null值保持为null
  • To use the linear fill method, you must also use the sortBy field to sort your data.要使用linear填充方法,还必须使用sortBy字段对数据进行排序。

    • When using the linear fill method, $fill returns an error if there are any repeated values in the sortBy field in a single partition.使用linear填充方法时,如果单个分区中的sortBy字段中有任何重复值,$fill将返回一个错误。

For a complete example using the linear fill method, see Fill Missing Field Values with Linear Interpolation.有关使用linear填充方法的完整示例,请参阅使用线性插值填充缺少的字段值

locf Behavior行为

locf stands for last observation carried forward.代表最后一次观测。

  • If a field being filled contains both null and non-null values, locf sets the null and missing values to the field's last known non-null value according to the sortBy order.如果要填充的字段同时包含null值和非null值,则locf会根据sortBy顺序将null值和缺失值设置为字段最后一个已知的非null值。

    • If the field contains only null or missing values in a partition, locf sets the field value to null for that partition.如果字段在分区中只包含null或缺少值,则locf会将该分区的字段值设置为null
    • null and missing field values that appear before non-null values in the sort order remain null.排序顺序中出现在非null值之前的null和缺失字段值保持为null
  • To use the locf fill method, you must also use the sortBy field to sort your data.要使用locf填充方法,还必须使用sortBy字段对数据进行排序。

For a complete example using the locf fill method, see Fill Missing Field Values Based on the Last Observed Value.有关使用locf填充方法的完整示例,请参阅根据上次观测值填充缺少的字段值

Comparison of $fill and Aggregation Operators$fill和聚合运算符的比较

To fill null and missing field values within a document you can use:要填充文档中的空字段值和缺失字段值,可以使用:

  • The $fill stage.$fill阶段。

    When you use the $fill stage, the field you specify in the output is the same field used as the source data.使用$fill阶段时,在输出中指定的字段与源数据使用的字段相同。

  • The $linearFill and $locf aggregation operators.$linearFill$locf聚合运算符。

    When you $linearFill or $locf, you can set values for a different field than the field used as the source data.使用$linearFill$locf时,可以为不同于用作源数据的字段的字段设置值。

Examples实例

The examples in this section show how to use $fill to fill missing values:本节中的示例显示了如何使用$fill来填充缺失的值:

Fill Missing Field Values with a Constant Value用常数值填充缺少的字段值

A shoe store maintains a dailySales collection that contains a document summarizing each day's sales. 鞋店有一个dailySales集合,其中包含一份总结每天销售的文档。The shoe store sells these types of shoes:鞋店出售这些类型的鞋:

  • boots
  • sandals
  • sneakers

Create the following dailySales collection:创建以下dailySales集合:

db.dailySales.insertMany( [
{
"date": ISODate("2022-02-02"),
"bootsSold": 10,
"sandalsSold": 20,
"sneakersSold": 12
},
{
"date": ISODate("2022-02-03"),
"bootsSold": 7,
"sneakersSold": 18
},
{
"date": ISODate("2022-02-04"),
"sneakersSold": 5
}
] )

Not all of the documents in the dailySales collection contain each shoe type. 并非dailySales集合中的所有文档都包含每种鞋型。If a shoe type is missing, it means there were no shoes of that type sold on the corresponding date.如果缺少一种鞋型,则意味着在相应日期没有销售该类型的鞋。

The following example uses $fill to set the quantities sold to 0 for the missing shoe types for each day's sales:以下示例使用$fill将每天销售中缺少的鞋款的销售数量设置为0

db.dailySales.aggregate( [
{
$fill:
{
output:
{
"bootsSold": { value: 0 },
"sandalsSold": { value: 0 },
"sneakersSold": { value: 0 }
}
}
}
] )

In the preceding pipeline:在前面的管道中:

  • $fill fills in values for missing fields.填写缺失字段的值。
  • output specifies:指定:

    • The names of the fields to fill in.要填写的字段的名称。
    • The value to set the filled in fields to. In this example, the output specifies a constant value of 0.将填充字段设置为的值。在本例中,输出指定一个常量值0

Example output:示例输出:

[
{
_id: ObjectId("6202df9f394d47411658b51e"),
date: ISODate("2022-02-02T00:00:00.000Z"),
bootsSold: 10,
sandalsSold: 20,
sneakersSold: 12
},
{
_id: ObjectId("6202df9f394d47411658b51f"),
date: ISODate("2022-02-03T00:00:00.000Z"),
bootsSold: 7,
sneakersSold: 18,
sandalsSold: 0
},
{
_id: ObjectId("6202df9f394d47411658b520"),
date: ISODate("2022-02-04T00:00:00.000Z"),
sneakersSold: 5,
bootsSold: 0,
sandalsSold: 0
}
]

Fill Missing Field Values with Linear Interpolation使用线性插值填充缺失字段值

Create a stock collection that contains tracks a single company's stock price at hourly intervals:创建一个stock集合,其中包含按小时间隔跟踪单个公司股价的信息:

db.stock.insertMany( [
{
time: ISODate("2021-03-08T09:00:00.000Z"),
price: 500
},
{
time: ISODate("2021-03-08T10:00:00.000Z"),
},
{
time: ISODate("2021-03-08T11:00:00.000Z"),
price: 515
},
{
time: ISODate("2021-03-08T12:00:00.000Z")
},
{
time: ISODate("2021-03-08T13:00:00.000Z")
},
{
time: ISODate("2021-03-08T14:00:00.000Z"),
price: 485
}
] )

The price field is missing for some of the documents in the collection.集合中的某些文档缺少price字段。

To populate the missing price values using linear interpolation, use $fill with the linear fill method:要使用线性插值填充缺失的price值,请使用$filllinear填充方法:

db.stock.aggregate( [
{
$fill:
{
sortBy: { time: 1 },
output:
{
"price": { method: "linear" }
}
}
}
] )

In the preceding pipeline:在前面的管道中:

  • $fill fills in values for missing fields.填写缺失字段的值。
  • sortBy: { time: 1 } sorts the documents by the time field in ascending order, from earliest to latest.按照time字段从最早到最晚的升序对文档进行排序。
  • output specifies:指定:

    • price as the field for which to fill in missing values.作为要填充缺失值的字段。
    • { method: "linear" } as the fill method. 作为填充方法。The linear fill method fills missing price values using linear interpolation based on the surrounding price values in the sequence.linear填充方法使用基于序列中周围price值的线性插值来填充缺失的price值。

Example output:示例输出:

[
{
_id: ObjectId("620ad41c394d47411658b5e9"),
time: ISODate("2021-03-08T09:00:00.000Z"),
price: 500
},
{
_id: ObjectId("620ad41c394d47411658b5ea"),
time: ISODate("2021-03-08T10:00:00.000Z"),
price: 507.5
},
{
_id: ObjectId("620ad41c394d47411658b5eb"),
time: ISODate("2021-03-08T11:00:00.000Z"),
price: 515
},
{
_id: ObjectId("620ad41c394d47411658b5ec"),
time: ISODate("2021-03-08T12:00:00.000Z"),
price: 505
},
{
_id: ObjectId("620ad41c394d47411658b5ed"),
time: ISODate("2021-03-08T13:00:00.000Z"),
price: 495
},
{
_id: ObjectId("620ad41c394d47411658b5ee"),
time: ISODate("2021-03-08T14:00:00.000Z"),
price: 485
}
]

Fill Missing Field Values Based on the Last Observed Value根据上次观测值填充缺失的字段值

Create a restaurantReviews collection that contains review scores for a single restaurant over time:创建一个restaurantReviews集合,该集合包含单个餐厅在一段时间内的评分:

db.restaurantReviews.insertMany( [
{
date: ISODate("2021-03-08"),
score: 90
},
{
date: ISODate("2021-03-09"),
score: 92
},
{
date: ISODate("2021-03-10")
},
{
date: ISODate("2021-03-11")
},
{
date: ISODate("2021-03-12"),
score: 85
},
{
date: ISODate("2021-03-13")
}
] )

The score field is missing for some of the documents in the collection.集合中的某些文档缺少score字段。

To populate the missing score fields and ensure that there are no gaps in the data, use $fill. 要填充缺失的score字段并确保数据中没有空白,请使用$fillIn the following example, $fill uses the locf fill method to fill the missing score values with the previous score in the sequence:在以下示例中,$fill使用locf填充方法用序列中的前一个score填充缺失的分值:

db.restaurantReviews.aggregate( [
{
$fill:
{
sortBy: { date: 1 },
output:
{
"score": { method: "locf" }
}
}
}
] )

In the preceding pipeline:在前面的管道中:

  • $fill fills in missing score values.填写缺失的score值。
  • sortBy: { date: 1 } sorts the documents by the date field in ascending order, from earliest to latest.按照date字段从最早到最晚的升序对文档进行排序。
  • output specifies:指定:

    • score as the field for which to fill in missing values.作为要填充缺失值的字段。
    • { method: "locf" } as the fill method. 作为填充方法。The locf fill method fills missing score values with the last observed score in the sequence.locf填充方法使用序列中最后一次观察到的score来填充缺失的分数值。

Example output:示例输出:

[
{
_id: ObjectId("62040bc9394d47411658b553"),
date: ISODate("2021-03-08T00:00:00.000Z"),
score: 90
},
{
_id: ObjectId("62040bc9394d47411658b554"),
date: ISODate("2021-03-09T00:00:00.000Z"),
score: 92
},
{
_id: ObjectId("62040bc9394d47411658b555"),
date: ISODate("2021-03-10T00:00:00.000Z"),
score: 92
},
{
_id: ObjectId("62040bc9394d47411658b556"),
date: ISODate("2021-03-11T00:00:00.000Z"),
score: 92
},
{
_id: ObjectId("62040bc9394d47411658b557"),
date: ISODate("2021-03-12T00:00:00.000Z"),
score: 85
},
{
_id: ObjectId("62040bc9394d47411658b558"),
date: ISODate("2021-03-13T00:00:00.000Z"),
score: 85
}
]

Fill Data for Distinct Partitions填充不同分区的数据

Consider the previous example with restaurant reviews but instead of tracking a single restaurant, the collection now contains reviews for multiple restaurants.考虑一下前面的餐厅评论示例,但该集合现在包含多家餐厅的评论,而不是跟踪单个餐厅。

Create a collection named restaurantReviewsMultiple and populate the collection with these documents:创建一个名为restaurantReviewsMultiple的集合使用以下文档对集合进行填充:

db.restaurantReviewsMultiple.insertMany( [
{
date: ISODate("2021-03-08"),
restaurant: "Joe's Pizza",
score: 90
},
{
date: ISODate("2021-03-08"),
restaurant: "Sally's Deli",
score: 75
},
{
date: ISODate("2021-03-09"),
restaurant: "Joe's Pizza",
score: 92
},
{
date: ISODate("2021-03-09"),
restaurant: "Sally's Deli"
},
{
date: ISODate("2021-03-10"),
restaurant: "Joe's Pizza"
},
{
date: ISODate("2021-03-10"),
restaurant: "Sally's Deli",
score: 68
},
{
date: ISODate("2021-03-11"),
restaurant: "Joe's Pizza",
score: 93
},
{
date: ISODate("2021-03-11"),
restaurant: "Sally's Deli"
}
] )

The score field is missing for some of the documents in the collection.集合中的某些文档缺少score字段。

To populate the missing score fields and ensure that there are no gaps in the data, use $fill. 要填充缺失的score字段并确保数据中没有空白,请使用$fillIn the following example, $fill uses the locf fill method to fill the missing score values with the previous score in the sequence:在以下示例中,$fill使用locf填充方法用序列中的前一个score填充缺失的score值:

db.restaurantReviewsMultiple.aggregate( [
{
$fill:
{
sortBy: { date: 1 },
partitionBy: { "restaurant": "$restaurant" },
output:
{
"score": { method: "locf" }
}
}
}
] )

In the preceding pipeline:在前面的管道中:

  • $fill fills in missing score values.填写缺失的score值。
  • sortBy: { date: 1 } sorts the documents by the date field in ascending order, from earliest to latest.按照date字段从最早到最晚的升序对文档进行排序。
  • partitionBy: { "restaurant": "$restaurant" } partitions the data by restaurant. There are two restaurants: Joe's Pizza and Sally's Deli.restaurant划分数据。有两家餐馆:Joe's PizzaSally's Deli
  • output specifies:指定:

    • score as the field for which to fill in missing values.作为要填充缺失值的字段。
    • { method: "locf" } as the fill method. 作为填充方法。The locf fill method fills missing score values with the last observed score in the sequence.locf填充方法使用序列中最后一次观察到的score来填充缺失的分数值。

Example output:示例输出:

[
{
_id: ObjectId("620559f4394d47411658b58f"),
date: ISODate("2021-03-08T00:00:00.000Z"),
restaurant: "Joe's Pizza",
score: 90
},
{
_id: ObjectId("620559f4394d47411658b591"),
date: ISODate("2021-03-09T00:00:00.000Z"),
restaurant: "Joe's Pizza",
score: 92
},
{
_id: ObjectId("620559f4394d47411658b593"),
date: ISODate("2021-03-10T00:00:00.000Z"),
restaurant: "Joe's Pizza",
score: 92
},
{
_id: ObjectId("620559f4394d47411658b595"),
date: ISODate("2021-03-11T00:00:00.000Z"),
restaurant: "Joe's Pizza",
score: 93
},
{
_id: ObjectId("620559f4394d47411658b590"),
date: ISODate("2021-03-08T00:00:00.000Z"),
restaurant: "Sally's Deli",
score: 75
},
{
_id: ObjectId("620559f4394d47411658b592"),
date: ISODate("2021-03-09T00:00:00.000Z"),
restaurant: "Sally's Deli",
score: 75
},
{
_id: ObjectId("620559f4394d47411658b594"),
date: ISODate("2021-03-10T00:00:00.000Z"),
restaurant: "Sally's Deli",
score: 68
},
{
_id: ObjectId("620559f4394d47411658b596"),
date: ISODate("2021-03-11T00:00:00.000Z"),
restaurant: "Sally's Deli",
score: 68
}
]

Indicate if a Field was Populated Using $fill指示字段是否使用$fill填充

When you populate missing values, the output does not indicate if a value was populated with the $fill operator or if the value existed in the document originally. 填充缺少的值时,输出不会指示值是用$fill运算符填充的,还是该值最初存在于文档中。To distinguish between filled and preexisting values, you can use a $set stage before $fill and set a new field based on whether the value exists.要区分填充值和预先存在的值,可以在$fill之前使用$set阶段,并根据值是否存在设置新字段。

For example, create a restaurantReviews collection that contains review scores for a restaurant over time:例如,创建一个restaurantReviews集合,该集合包含一段时间内某家餐厅的评分:

db.restaurantReviews.insertMany( [
{
date: ISODate("2021-03-08"),
score: 90
},
{
date: ISODate("2021-03-09"),
score: 92
},
{
date: ISODate("2021-03-10")
},
{
date: ISODate("2021-03-11")
},
{
date: ISODate("2021-03-12"),
score: 85
},
{
date: ISODate("2021-03-13")
}
] )

The score field is missing for some of the documents in the collection. 集合中的某些文档缺少score字段。You can populate missing score values with the $fill operator.您可以使用$fill运算符填充缺失的score值。

Create a pipeline to perform the following actions:创建一个管道以执行以下操作:

  • Add a new field to each document (using $set) indicating if the document's score field exists prior to the $fill operator populating values. 向每个文档添加一个新字段(使用$set),指示在$fill运算符填充值之前文档的score字段是否存在。This new field is called valueExisted.这个新字段称为valueExisted
  • Populate missing score values with the last observed score in the sequence. The fill method locf stands for "last observation carried forward".用序列中最后一次观察到的score填充缺失的score值。填充法locf代表“最后一次观测结转”。

The pipeline looks like this:管道如下所示:

db.restaurantReviews.aggregate( [
{
$set: {
"valueExisted": {
"$ifNull": [
{ "$toBool": { "$toString": "$score" } },
false
]
}
}
},
{
$fill: {
sortBy: { date: 1 },
output:
{
"score": { method: "locf" }
}
}
}
] )
Note

Handling Values of Zero处理零值

In the $ifNull expression, the score values are converted to strings, then to booleans. $ifNull表达式中,score值会转换为字符串,然后转换为布尔值。The $toBool expression always converts strings to true. $toBool表达式总是将字符串转换为trueIf the score values are not converted to strings, score values of 0 will have valueExisted set to false.如果未将score值转换为字符串,则0score值的valueExisted将设置为false

Output:输出:

[
{
_id: ObjectId("63595116b1fac2ee2e957f15"),
date: ISODate("2021-03-08T00:00:00.000Z"),
score: 90,
valueExisted: true
},
{
_id: ObjectId("63595116b1fac2ee2e957f16"),
date: ISODate("2021-03-09T00:00:00.000Z"),
score: 92,
valueExisted: true
},
{
_id: ObjectId("63595116b1fac2ee2e957f17"),
date: ISODate("2021-03-10T00:00:00.000Z"),
valueExisted: false,
score: 92
},
{
_id: ObjectId("63595116b1fac2ee2e957f18"),
date: ISODate("2021-03-11T00:00:00.000Z"),
valueExisted: false,
score: 92
},
{
_id: ObjectId("63595116b1fac2ee2e957f19"),
date: ISODate("2021-03-12T00:00:00.000Z"),
score: 85,
valueExisted: true
},
{
_id: ObjectId("63595116b1fac2ee2e957f1a"),
date: ISODate("2021-03-13T00:00:00.000Z"),
valueExisted: false,
score: 85
}
]