Definition定义
$fillNew in version 5.3.在版本5.3中新增。Populates填充文档中的nulland missing field values within documents.null字段值和缺失的字段值。You can use您可以使用$fillto populate missing data points:$fill来填充缺失的数据点:In a sequence based on surrounding values.基于周围值的序列。With a fixed value.具有固定值。
Syntax语法
The $fill stage has this syntax:$fill阶段具有以下语法:
{
$fill: {
partitionBy: <expression>,
partitionByFields: [ <field 1>, <field 2>, ... , <field n> ],
sortBy: {
<sort field 1>: <sort order>,
<sort field 2>: <sort order>,
...,
<sort field n>: <sort order>
},
output: {
<field 1>: { value: <expression> },
<field 2>: { method: <string> },
...
}
}
}
The $fill stage takes a document with these fields:$fill阶段需要一个包含以下字段的文档:
partitionBy |
| |
partitionByFields |
| |
sortBy | output.<field>中指定了method,则此项为必填项。否则,可选。 |
|
output |
| |
output.<field> |
|
Behavior and Restrictions行为和限制
partitionByFields Restrictions限制
$fill returns an error if any field name in the partitionByFields array:如果partitionByFields数组中有任何字段名,则返回错误:
Evaluates to a non-string value.计算结果为非字符串值。Begins with以$.$开头。
linear Behavior行为
The linear fill method fills null and missing fields using linear interpolation based on the surrounding non-null values in the sequence.linear填充方法使用基于序列中周围非空值的线性插值来填充null字段和缺失字段。
For each document where the field is对于字段为nullor missing,linearFillfills those fields in proportion to the missing value range between surrounding non-nullvalues according to the sortBy order.null或缺失的每个文档,linearFill会根据sortBy顺序,按照周围非null值之间的缺失值范围按比例填充这些字段。To determine the values for missing fields,要确定缺失字段的值,linearFilluses:linearFill使用:The difference of surrounding non-周围非nullvalues.null值的差异。The number of要在周围值之间填充的nullfields to fill between the surrounding values.null字段数。
The如果根据linearmethod can fill multiple consecutivenullvalues if those values are preceded and followed by non-nullvalues according to the sortBy order.sortBy顺序,这些值前面和后面都是非null值,则linear方法可以填充多个连续的空值。Example示例If a collection contains these documents:如果集合包含以下文档:{ index: 0, value: 0 },
{ index: 1, value: null },
{ index: 2, value: null },
{ index: 3, value: null },
{ index: 4, value: 10 }After using the使用linearfill method to fill thenullvalues, the documents become:linear填充方法填充null值后,文档变为:{ index: 0, value: 0 },
{ index: 1, value: 2.5 },
{ index: 2, value: 5 },
{ index: 3, value: 7.5 },
{ index: 4, value: 10 }前后没有非nullvalues that are not preceded and followed by non-nullvalues remainnull.null值的null值仍为null。To use the要使用linearfill method, you must also use the sortBy field to sort your data.linear填充方法,您还必须使用sortBy字段对数据进行排序。
For a complete example using the 有关使用linear fill method, see Fill Missing Field Values with Linear Interpolation.linear填充方法的完整示例,请参阅使用线性插值填充缺失字段值。
locf Behavior行为
locf stands for last observation carried forward.代表最后的观察结果。
If a field being filled contains both如果正在填充的字段同时包含nulland non-null values,locfsets thenulland missing values to the field's last known non-null value according to the sortBy order.null和非null值,locf会根据sortBy顺序将null和缺失值设置为字段最后一个已知的非null值。To use the要使用locffill method, you must also use the sortBy field to sort your data.locf填充方法,您还必须使用sortBy字段对数据进行排序。
For a complete example using the 有关使用locf fill method, see Fill Missing Field Values Based on the Last Observed Value.locf填充方法的完整示例,请参阅基于上次观察值填充缺失字段值。
Comparison of $fill and Aggregation Operators$fill和聚合运算符的比较
$fill and Aggregation OperatorsTo fill 要填充文档中的null and missing field values within a document you can use:null字段值和缺失的字段值,可以使用:
The$填充阶段。$fillstage.When you use the当您使用$fillstage, the field you specify in the output is the same field used as the source data.$fill阶段时,您在输出中指定的字段与源数据使用的字段相同。The$linearFilland$locfaggregation operators.$linearFill和$locf聚合运算符。When you当您使用$linearFillor$locf, you can set values for a different field than the field used as the source data.$linearFill或$locf时,您可以为与用作源数据的字段不同的字段设置值。
Examples示例
MongoDB Shell
The examples in this section show how to use 本节中的示例显示了如何使用$fill to fill missing values:$fill填充缺失的值:
With a constant value具有恒定值With linear interpolation采用线性插值Based on the last observed value基于上次观测值For distinct partitions用于不同的分区With linear interpolation for identical values in different partitions对不同分区中的相同值进行线性插值
Fill Missing Field Values with a Constant Value用常数值填充缺失的字段值
A shoe store maintains a 一家鞋店维护着一个dailySales collection that contains a document summarizing each day's sales. The shoe store sells these types of shoes:dailySales集合,其中包含一份总结每天销售情况的文档。这家鞋店出售这些类型的鞋子:
bootssandalssneakers
Create the following 创建以下dailySales collection:dailySales集合:
db.dailySales.insertMany( [
{
"date": ISODate("2022-02-02"),
"bootsSold": 10,
"sandalsSold": 20,
"sneakersSold": 12
},
{
"date": ISODate("2022-02-03"),
"bootsSold": 7,
"sneakersSold": 18
},
{
"date": ISODate("2022-02-04"),
"sneakersSold": 5
}
] )
Not all of the documents in the 并非dailySales collection contain each shoe type. If a shoe type is missing, it means there were no shoes of that type sold on the corresponding date.dailySales集合中的所有文档都包含每种鞋型。如果缺少鞋子类型,则意味着在相应日期没有出售该类型的鞋子。
The following example uses 以下示例使用$fill to set the quantities sold to 0 for the missing shoe types for each day's sales:$fill将每天销售中缺失的鞋子类型的销售量设置为0:
db.dailySales.aggregate( [
{
$fill:
{
output:
{
"bootsSold": { value: 0 },
"sandalsSold": { value: 0 },
"sneakersSold": { value: 0 }
}
}
}
] )
In the preceding pipeline:在前面的管道中:
$fillfills in values for missing fields.填充缺失字段的值。outputspecifies:指定:The names of the fields to fill in.要填写的字段的名称。The value to set the filled in fields to. In this example, the output specifies a constant value of将已填写字段设置为的值。在此示例中,输出指定一个常数值0.0。
Example output:示例输出:
[
{
_id: ObjectId("6202df9f394d47411658b51e"),
date: ISODate("2022-02-02T00:00:00.000Z"),
bootsSold: 10,
sandalsSold: 20,
sneakersSold: 12
},
{
_id: ObjectId("6202df9f394d47411658b51f"),
date: ISODate("2022-02-03T00:00:00.000Z"),
bootsSold: 7,
sneakersSold: 18,
sandalsSold: 0
},
{
_id: ObjectId("6202df9f394d47411658b520"),
date: ISODate("2022-02-04T00:00:00.000Z"),
sneakersSold: 5,
bootsSold: 0,
sandalsSold: 0
}
]Fill Missing Field Values with Linear Interpolation用线性插值填充缺失的字段值
Create a 创建一个包含以下文档的stock collection that contains the following documents, which track a single company's stock price at hourly intervals:stock(股票)集合,这些文档每小时跟踪一家公司的股价:
db.stock.insertMany( [
{
time: ISODate("2021-03-08T09:00:00.000Z"),
price: 500
},
{
time: ISODate("2021-03-08T10:00:00.000Z"),
},
{
time: ISODate("2021-03-08T11:00:00.000Z"),
price: 515
},
{
time: ISODate("2021-03-08T12:00:00.000Z")
},
{
time: ISODate("2021-03-08T13:00:00.000Z")
},
{
time: ISODate("2021-03-08T14:00:00.000Z"),
price: 485
}
] )
The 集合中的某些文档缺少price field is missing for some of the documents in the collection.price字段。
To populate the missing 要使用线性插值填充缺失的price values by using linear interpolation, use $fill with the linear fill method:price值,请使用$fill和linear填充方法:
db.stock.aggregate( [
{
$fill:
{
sortBy: { time: 1 },
output:
{
"price": { method: "linear" }
}
}
}
] )
In the preceding pipeline:在前面的管道中:
$fillfills in values for missing fields.填充缺失字段的值。sortBy: { time: 1 }sorts the documents by the按timefield in ascending order, from earliest to latest.time字段从早到晚按升序对文档进行排序。output输出specifies指定:priceas the field for which to fill in missing values.作为填写缺失值的字段。{ method: "linear" }as the fill method.作为填充方法。Thelinearfill method fills missingpricevalues by using linear interpolation based on the surroundingpricevalues in the sequence.linear填充方法通过使用基于序列中周围price值的线性插值来填充缺失的price值。
Example output:示例输出:
[
{
_id: ObjectId("620ad41c394d47411658b5e9"),
time: ISODate("2021-03-08T09:00:00.000Z"),
price: 500
},
{
_id: ObjectId("620ad41c394d47411658b5ea"),
time: ISODate("2021-03-08T10:00:00.000Z"),
price: 507.5
},
{
_id: ObjectId("620ad41c394d47411658b5eb"),
time: ISODate("2021-03-08T11:00:00.000Z"),
price: 515
},
{
_id: ObjectId("620ad41c394d47411658b5ec"),
time: ISODate("2021-03-08T12:00:00.000Z"),
price: 505
},
{
_id: ObjectId("620ad41c394d47411658b5ed"),
time: ISODate("2021-03-08T13:00:00.000Z"),
price: 495
},
{
_id: ObjectId("620ad41c394d47411658b5ee"),
time: ISODate("2021-03-08T14:00:00.000Z"),
price: 485
}
]Fill Missing Field Values Based on the Last Observed Value根据上次观测值填充缺失字段值
Create a 创建一个包含以下文档的restaurantReviews collection that contains the following documents, which store review scores for a single restaurant over time:restaurantReviews集合,其中存储了单个餐厅随时间变化的评论分数:
db.restaurantReviews.insertMany( [
{
date: ISODate("2021-03-08"),
score: 90
},
{
date: ISODate("2021-03-09"),
score: 92
},
{
date: ISODate("2021-03-10")
},
{
date: ISODate("2021-03-11")
},
{
date: ISODate("2021-03-12"),
score: 85
},
{
date: ISODate("2021-03-13")
}
] )
The 集合中的某些文档缺少score field is missing for some of the documents in the collection.score字段。
To populate the missing 要填充缺失的分数字段并确保数据中没有空白,请使用score fields and ensure that there are no gaps in the data, use $fill. $fill。In the following example, 在以下示例中,$fill uses the locf fill method to fill the missing score values with the previous score in the sequence:$fill使用locf填充方法用序列中的前一个score填充缺失的score值:
db.restaurantReviews.aggregate( [
{
$fill:
{
sortBy: { date: 1 },
output:
{
"score": { method: "locf" }
}
}
}
] )
In the preceding pipeline:在前面的管道中:
$fillfills in missing填写缺失的scorevalues.score值。sortBy: { date: 1 }sorts the documents by the按datefield in ascending order, from earliest to latest.date字段从早到晚按升序对文档进行排序。outputspecifies:指定:scoreas the field for which to fill in missing values.作为填写缺失值的字段。{ method: "locf" }as the fill method.作为填充方法。Thelocffill method fills missingscorevalues with the last observedscorein the sequence.locf填充方法用序列中最后一个观察到的score填充缺失的score值。
Example output:示例输出:
[
{
_id: ObjectId("62040bc9394d47411658b553"),
date: ISODate("2021-03-08T00:00:00.000Z"),
score: 90
},
{
_id: ObjectId("62040bc9394d47411658b554"),
date: ISODate("2021-03-09T00:00:00.000Z"),
score: 92
},
{
_id: ObjectId("62040bc9394d47411658b555"),
date: ISODate("2021-03-10T00:00:00.000Z"),
score: 92
},
{
_id: ObjectId("62040bc9394d47411658b556"),
date: ISODate("2021-03-11T00:00:00.000Z"),
score: 92
},
{
_id: ObjectId("62040bc9394d47411658b557"),
date: ISODate("2021-03-12T00:00:00.000Z"),
score: 85
},
{
_id: ObjectId("62040bc9394d47411658b558"),
date: ISODate("2021-03-13T00:00:00.000Z"),
score: 85
}
]Fill Data for Distinct Partitions为不同分区填充数据
Consider the previous example with restaurant reviews but instead of tracking a single restaurant, the collection now contains reviews for multiple restaurants.考虑前面的餐厅评论示例,但该集合现在包含多家餐厅的评论,而不是跟踪单个餐厅。
Create a collection named 创建一个名为restaurantReviewsMultiple and populate the collection with these documents:restaurantReviewsMultiple的集合,并用以下文档填充该集合:
db.restaurantReviewsMultiple.insertMany( [
{
date: ISODate("2021-03-08"),
restaurant: "Joe's Pizza",
score: 90
},
{
date: ISODate("2021-03-08"),
restaurant: "Sally's Deli",
score: 75
},
{
date: ISODate("2021-03-09"),
restaurant: "Joe's Pizza",
score: 92
},
{
date: ISODate("2021-03-09"),
restaurant: "Sally's Deli"
},
{
date: ISODate("2021-03-10"),
restaurant: "Joe's Pizza"
},
{
date: ISODate("2021-03-10"),
restaurant: "Sally's Deli",
score: 68
},
{
date: ISODate("2021-03-11"),
restaurant: "Joe's Pizza",
score: 93
},
{
date: ISODate("2021-03-11"),
restaurant: "Sally's Deli"
}
] )
The 集合中的某些文档缺少score field is missing for some of the documents in the collection.score字段。
To populate the missing 要填充缺失的score fields and ensure that there are no gaps in the data, use $fill. score字段并确保数据中没有空白,请使用$fill。In the following example, 在以下示例中,$fill uses the locf fill method to fill the missing score values with the previous score in the sequence:$fill使用locf填充方法用序列中的前一个分数填充缺失的分数值:
db.restaurantReviewsMultiple.aggregate( [
{
$fill:
{
sortBy: { date: 1 },
partitionBy: { "restaurant": "$restaurant" },
output:
{
"score": { method: "locf" }
}
}
}
] )
In the preceding pipeline:在前面的管道中:
$fillfills in missing填写缺失的scorevalues.score值。sortBy: { date: 1 }sorts the documents by the按datefield in ascending order, from earliest to latest.date字段从早到晚按升序对文档进行排序。partitionBy: { "restaurant": "$restaurant" }partitions the data by按restaurant. There are two restaurants:Joe's PizzaandSally's Deli.restaurant(餐厅)对数据进行分区。有两家餐厅:Joe's Pizza(乔的披萨店)和Sally's Deli(萨莉的熟食店)。outputspecifies:指定:scoreas the field for which to fill in missing values.作为填写缺失值的字段。{ method: "locf" }as the fill method. The作为填充方法。locffill method fills missingscorevalues with the last observedscorein the sequence.locf填充方法用序列中最后一个观察到的score填充缺失的score值。
Example output:示例输出:
[
{
_id: ObjectId("620559f4394d47411658b58f"),
date: ISODate("2021-03-08T00:00:00.000Z"),
restaurant: "Joe's Pizza",
score: 90
},
{
_id: ObjectId("620559f4394d47411658b591"),
date: ISODate("2021-03-09T00:00:00.000Z"),
restaurant: "Joe's Pizza",
score: 92
},
{
_id: ObjectId("620559f4394d47411658b593"),
date: ISODate("2021-03-10T00:00:00.000Z"),
restaurant: "Joe's Pizza",
score: 92
},
{
_id: ObjectId("620559f4394d47411658b595"),
date: ISODate("2021-03-11T00:00:00.000Z"),
restaurant: "Joe's Pizza",
score: 93
},
{
_id: ObjectId("620559f4394d47411658b590"),
date: ISODate("2021-03-08T00:00:00.000Z"),
restaurant: "Sally's Deli",
score: 75
},
{
_id: ObjectId("620559f4394d47411658b592"),
date: ISODate("2021-03-09T00:00:00.000Z"),
restaurant: "Sally's Deli",
score: 75
},
{
_id: ObjectId("620559f4394d47411658b594"),
date: ISODate("2021-03-10T00:00:00.000Z"),
restaurant: "Sally's Deli",
score: 68
},
{
_id: ObjectId("620559f4394d47411658b596"),
date: ISODate("2021-03-11T00:00:00.000Z"),
restaurant: "Sally's Deli",
score: 68
}
]Indicate if a Field was Populated Using $fill指示字段是否使用$fill填充
$fillWhen you populate missing values, the output does not indicate if a value was populated with the 当您填充缺失的值时,输出不会指示值是使用$fill operator or if the value existed in the document originally. $fill运算符填充的,还是该值最初存在于文档中。To distinguish between filled and preexisting values, you can use a 为了区分填充值和预先存在的值,您可以在$set stage before $fill and set a new field based on whether the value exists.$fill之前使用$set阶段,并根据值是否存在设置一个新字段。
For example, create a 例如,创建一个包含以下文档的restaurantReviews collection that contains the following documents, which store review scores for a restaurant over time:restaurantReviews集合,其中存储了餐厅随时间变化的评论分数:
db.restaurantReviews.insertMany( [
{
date: ISODate("2021-03-08"),
score: 90
},
{
date: ISODate("2021-03-09"),
score: 92
},
{
date: ISODate("2021-03-10")
},
{
date: ISODate("2021-03-11")
},
{
date: ISODate("2021-03-12"),
score: 85
},
{
date: ISODate("2021-03-13")
}
] )
The 集合中的某些文档缺少score field is missing for some of the documents in the collection. You can populate missing score values by using the $fill operator.score字段。您可以使用$fill运算符填充缺失的score值。
Create a pipeline to perform the following actions:创建一个管道以执行以下操作:
Use使用$setto add a new field to each document indicating if the document'sscorefield exists prior to the$filloperator populating values.$set为每个文档添加一个新字段,指示在$fill运算符填充值之前文档的score字段是否存在。This new field is called这个新字段称为valueExisted.valueExisted。Populate missing用序列中最后一个观察到的scorevalues with the last observedscorein the sequence. The fill methodlocfstands for "last observation carried forward".score填充缺失的score值。填充法locf代表“最后一次观测向前推进”。
The pipeline resembles the following code:管道类似于以下代码:
db.restaurantReviews.aggregate( [
{
$set: {
"valueExisted": {
"$ifNull": [
{ "$toBool": { "$toString": "$score" } },
false
]
}
}
},
{
$fill: {
sortBy: { date: 1 },
output:
{
"score": { method: "locf" }
}
}
}
] )
Note
Handling Values of Zero处理零值
In the 在$ifNull expression, the score values are converted to strings, then to booleans. $ifNull表达式中,score值被转换为字符串,然后转换为布尔值。The $toBool expression always converts strings to true. $toBool表达式总是将字符串转换为true。If the 如果score values are not converted to strings, score values of 0 will have valueExisted set to false.score值未转换为字符串,则score值0的valueExisted将设置为false。
Output:输出:
[
{
_id: ObjectId("63595116b1fac2ee2e957f15"),
date: ISODate("2021-03-08T00:00:00.000Z"),
score: 90,
valueExisted: true
},
{
_id: ObjectId("63595116b1fac2ee2e957f16"),
date: ISODate("2021-03-09T00:00:00.000Z"),
score: 92,
valueExisted: true
},
{
_id: ObjectId("63595116b1fac2ee2e957f17"),
date: ISODate("2021-03-10T00:00:00.000Z"),
valueExisted: false,
score: 92
},
{
_id: ObjectId("63595116b1fac2ee2e957f18"),
date: ISODate("2021-03-11T00:00:00.000Z"),
valueExisted: false,
score: 92
},
{
_id: ObjectId("63595116b1fac2ee2e957f19"),
date: ISODate("2021-03-12T00:00:00.000Z"),
score: 85,
valueExisted: true
},
{
_id: ObjectId("63595116b1fac2ee2e957f1a"),
date: ISODate("2021-03-13T00:00:00.000Z"),
valueExisted: false,
score: 85
}
]Interpolate Identical Values in Different Partitions插值不同分区中的相同值
Changed in version 8.1.在版本8.1中的更改。
Starting in MongoDB 8.1, 从MongoDB 8.1开始,如果不同分区中有相同的值,$fill can use the linear method to interpolate if there are identical values in different partitions.$fill可以使用linear(线性)方法进行插值。
Create the following 创建以下restaurantReviewsMultiple collection that contains the following documents, which store restaurant review scores on identical dates:restaurantReviewsMultiple集合,其中包含以下文档,用于存储相同日期的餐厅评论分数:
db.restaurantReviewsMultiple.insertMany( [
{
date: ISODate("2021-03-08"),
restaurant: "Steve's Pizza",
score: 90
},
{
date: ISODate("2021-03-08"),
restaurant: "Sally's Deli",
score: 75
}
] )
The following aggregation example uses 以下聚合示例使用"linear" to interpolate the identical dates in the different partitions:"linear"在不同分区中插值相同的日期:
db.restaurantReviewsMultiple.aggregate( [
{
$fill: {
sortBy: { date: 1 },
partitionBy: { "restaurant": "$restaurant" },
output: { "score": { method: "linear" } }
}
}
] )
Starting in MongoDB 8.1, the example returns the following output:从MongoDB 8.1开始,该示例返回以下输出:
[
{
_id: ObjectId("620559f4394d47411658b590"),
date: ISODate("2021-03-08T00:00:00.000Z"),
restaurant: "Sally's Deli",
score: 75
},
{
_id: ObjectId("620559f4394d47411658b58f"),
date: ISODate("2021-03-08T00:00:00.000Z"),
restaurant: "Steve's Pizza",
score: 90
}
]Node.js
The Node.js examples on this page use the 此页面上的Node.js示例使用Atlas示例数据集中的sample_weatherdata.data collection from the Atlas sample datasets. sample_weatherdata.data集合。To learn how to create a free MongoDB Atlas cluster and load the sample datasets, see Get Started in the MongoDB Node.js driver documentation.要了解如何创建免费的MongoDB Atlas集群并加载示例数据集,请参阅MongoDB Node.js驱动程序文档中的入门。
To use the MongoDB Node.js driver to add a 要使用MongoDB Node.js驱动程序向聚合管道添加$fill stage to an aggregation pipeline, use the $fill operator in a pipeline object.$fill阶段,请在管道对象中使用$fill运算符。
The following example creates a pipeline that fills null or missing values. The pipeline includes the following stages:以下示例创建了一个填充null或缺失值的管道。管道包括以下阶段:
The$groupstage groups input documents by theirtsfield and computes each group's averageseaSurfaceTemperature.value.$group阶段按ts字段对输入文档进行分组,并计算每个组的平均seaSurfaceTemperature.value。The$fillstage sorts the grouped data by the_idfield in ascending order and fills null or missingseaSurfaceTemperaturevalues by using linear interpolation.$fill阶段按_id字段升序对分组数据进行排序,并使用线性插值填充空值或缺失的seaSurfaceTemperature值。
const pipeline = [
{
$group: {
_id: "$ts",
seaSurfaceTemperature: { $avg: "$seaSurfaceTemperature.value" },
}
},
{
$fill: {
sortBy: { _id: 1 },
output: { seaSurfaceTemperature: { method: "linear" } }
}
}
];
const cursor = collection.aggregate(pipeline);
return cursor;