$fill (aggregation)
On this page本页内容
Definition定义Syntax语法Behavior and Restrictions行为和限制partitionByFields
Restrictions限制linear
Behavior行为locf
Behavior行为Comparison of$fill
and Aggregation Operators$fill
和聚合运算符的比较Examples实例Fill Missing Field Values with a Constant Value用常数值填充缺少的字段值Fill Missing Field Values with Linear Interpolation使用线性插值填充缺失字段值Fill Missing Field Values Based on the Last Observed Value根据上次观测值填充缺失的字段值Fill Data for Distinct Partitions填充不同分区的数据Indicate if a Field was Populated Using指示字段是否使用$fill
$fill
填充
Definition定义
$fill
-
New in version 5.3.5.3版新增。Populates填充文档中的空字段值和缺少的字段值。null
and missing field values within documents.You can use您可以使用$fill
to populate missing data points:$fill
来填充缺失的数据点:In a sequence based on surrounding values.在基于周围值的序列中。With a fixed value.具有固定值。
Syntax语法
The $fill
stage has this syntax:$fill
阶段具有以下语法:
{
$fill: {
partitionBy: <expression>,
partitionByFields: [ <field 1>, <field 2>, ... , <field n> ],
sortBy: {
<sort field 1>: <sort order>,
<sort field 2>: <sort order>,
...,
<sort field n>: <sort order>
},
output: {
<field 1>: { value: <expression> },
<field 2>: { method: <string> },
...
}
}
}
The $fill
stage takes a document with these fields:$fill
阶段接受具有以下字段的文档:
partitionBy | $fill stage, a group of documents is known as a partition.$fill 阶段,一组文档被称为分区。$fill uses one partition for the entire collection.partitionBy 和partitionByFields ,$fill 将为整个集合使用一个分区。partitionBy 和partitionByFields ,$fill 是互斥的。 | |
partitionByFields | $fill stage, each group of documents is known as a partition.$fill 阶段,每组文档被称为一个分区。$fill uses one partition for the entire collection.partitionBy 和partitionByFields ,$fill 将为整个集合使用一个分区。partitionBy 和partitionByFields 是互斥的。partitionByFields Restrictions. partitionByFields 限制。 | |
sortBy | output.<field> 中指定了method ,则为必要的。 | $sort stage. $sort 阶段相同的语法。 |
output | ||
output.<field> | value or method . value 或method 。
|
Behavior and Restrictions行为和限制
partitionByFields
Restrictions限制
如果$fill
returns an error if any field name in the partitionByFields array:partitionByFields
数组中有任何字段名存在以下情况,则$fill
返回错误:
Evaluates to a non-string value.计算为非字符串值。Begins with以$
.$
开头。
linear
Behavior行为
The linear
fill method fills null
and missing fields using linear interpolation based on the surrounding non-
null
values in the sequence.linear
填充方法使用基于序列中周围非null
值的线性插值来填充空字段和缺失字段。
For each document where the field is对于字段为null
or missing,linearFill
fills those fields in proportion to the missing value range between surrounding non-null
values according to the sortBy order.null
或缺失的每个文档,linearFill
根据sortBy
顺序,按照周围非null
值之间的缺失值范围的比例填充这些字段。To determine the values for missing fields,要确定缺失字段的值,linearFill
uses:linearFill
使用:The difference of surrounding non-周围非null
values.null
值的差异。The number of要在周围值之间填充的空字段数。null
fields to fill between the surrounding values.
Thelinear
method can fill multiple consecutivenull
values if those values are preceded and followed by non-null
values according to the sortBy order.linear
方法可以填充多个连续的null
值,如果这些值前面和后面都是根据sortBy
顺序的非null
值。ExampleIf a collection contains these documents:如果集合包含以下文档:{ index: 0, value: 0 },
{ index: 1, value: null },
{ index: 2, value: null },
{ index: 3, value: null },
{ index: 4, value: 10 }After using the使用linear
fill method to fill thenull
values, the documents become:linear
填充方法填充null
值后,文档变为:{ index: 0, value: 0 },
{ index: 1, value: 2.5 },
{ index: 2, value: 5 },
{ index: 3, value: 7.5 },
{ index: 4, value: 10 }前面和后面没有非null
values that are not preceded and followed by non-null
values remainnull
.null
值的null
值保持为null
。To use the要使用linear
fill method, you must also use the sortBy field to sort your data.linear
填充方法,还必须使用sortBy
字段对数据进行排序。
For a complete example using the 有关使用linear
fill method, see Fill Missing Field Values with Linear Interpolation.linear
填充方法的完整示例,请参阅使用线性插值填充缺少的字段值。
locf
Behavior行为
locf
stands for last observation carried forward.代表最后一次观测。
If a field being filled contains both如果要填充的字段同时包含null
and non-null values,locf
sets thenull
and missing values to the field's last known non-null value according to the sortBy order.null
值和非null
值,则locf
会根据sortBy
顺序将null
值和缺失值设置为字段最后一个已知的非null
值。To use the要使用locf
fill method, you must also use the sortBy field to sort your data.locf
填充方法,还必须使用sortBy
字段对数据进行排序。
For a complete example using the 有关使用locf
fill method, see Fill Missing Field Values Based on the Last Observed Value.locf
填充方法的完整示例,请参阅根据上次观测值填充缺少的字段值。
Comparison of $fill
and Aggregation Operators$fill
和聚合运算符的比较
$fill
and Aggregation OperatorsTo fill 要填充文档中的空字段值和缺失字段值,可以使用:null
and missing field values within a document you can use:
The$fill
stage.$fill
阶段。When you use the使用$fill
stage, the field you specify in the output is the same field used as the source data.$fill
阶段时,在输出中指定的字段与源数据使用的字段相同。The$linearFill
and$locf
aggregation operators.$linearFill
和$locf
聚合运算符。When you使用$linearFill
or$locf
, you can set values for a different field than the field used as the source data.$linearFill
或$locf
时,可以为不同于用作源数据的字段的字段设置值。
Examples实例
The examples in this section show how to use 本节中的示例显示了如何使用$fill
to fill missing values:$fill
来填充缺失的值:
With a constant value具有常量值With linear interpolation使用线性插值Based on the last observed value基于上次观测值For distinct partitions对于不同的分区
Fill Missing Field Values with a Constant Value用常数值填充缺少的字段值
A shoe store maintains a 鞋店有一个dailySales
collection that contains a document summarizing each day's sales. dailySales
集合,其中包含一份总结每天销售的文档。The shoe store sells these types of shoes:鞋店出售这些类型的鞋:
boots
sandals
sneakers
Create the following 创建以下dailySales
collection:dailySales
集合:
db.dailySales.insertMany( [
{
"date": ISODate("2022-02-02"),
"bootsSold": 10,
"sandalsSold": 20,
"sneakersSold": 12
},
{
"date": ISODate("2022-02-03"),
"bootsSold": 7,
"sneakersSold": 18
},
{
"date": ISODate("2022-02-04"),
"sneakersSold": 5
}
] )
Not all of the documents in the 并非dailySales
collection contain each shoe type. dailySales
集合中的所有文档都包含每种鞋型。If a shoe type is missing, it means there were no shoes of that type sold on the corresponding date.如果缺少一种鞋型,则意味着在相应日期没有销售该类型的鞋。
The following example uses 以下示例使用$fill
to set the quantities sold to 0
for the missing shoe types for each day's sales:$fill
将每天销售中缺少的鞋款的销售数量设置为0
:
db.dailySales.aggregate( [
{
$fill:
{
output:
{
"bootsSold": { value: 0 },
"sandalsSold": { value: 0 },
"sneakersSold": { value: 0 }
}
}
}
] )
In the preceding pipeline:在前面的管道中:
$fill
fills in values for missing fields.填写缺失字段的值。output
specifies:指定:The names of the fields to fill in.要填写的字段的名称。The value to set the filled in fields to. In this example, the output specifies a constant value of将填充字段设置为的值。在本例中,输出指定一个常量值0
.0
。
Example output:示例输出:
[
{
_id: ObjectId("6202df9f394d47411658b51e"),
date: ISODate("2022-02-02T00:00:00.000Z"),
bootsSold: 10,
sandalsSold: 20,
sneakersSold: 12
},
{
_id: ObjectId("6202df9f394d47411658b51f"),
date: ISODate("2022-02-03T00:00:00.000Z"),
bootsSold: 7,
sneakersSold: 18,
sandalsSold: 0
},
{
_id: ObjectId("6202df9f394d47411658b520"),
date: ISODate("2022-02-04T00:00:00.000Z"),
sneakersSold: 5,
bootsSold: 0,
sandalsSold: 0
}
]
Fill Missing Field Values with Linear Interpolation使用线性插值填充缺失字段值
Create a 创建一个stock
collection that contains tracks a single company's stock price at hourly intervals:stock
集合,其中包含按小时间隔跟踪单个公司股价的信息:
db.stock.insertMany( [
{
time: ISODate("2021-03-08T09:00:00.000Z"),
price: 500
},
{
time: ISODate("2021-03-08T10:00:00.000Z"),
},
{
time: ISODate("2021-03-08T11:00:00.000Z"),
price: 515
},
{
time: ISODate("2021-03-08T12:00:00.000Z")
},
{
time: ISODate("2021-03-08T13:00:00.000Z")
},
{
time: ISODate("2021-03-08T14:00:00.000Z"),
price: 485
}
] )
The 集合中的某些文档缺少price
field is missing for some of the documents in the collection.price
字段。
To populate the missing 要使用线性插值price
values using linear interpolation, use
$fill
with the linear
fill method:填充缺失的
price
值,请使用$fill
和linear
填充方法:
db.stock.aggregate( [
{
$fill:
{
sortBy: { time: 1 },
output:
{
"price": { method: "linear" }
}
}
}
] )
In the preceding pipeline:在前面的管道中:
$fill
fills in values for missing fields.填写缺失字段的值。sortBy: { time: 1 }
sorts the documents by the按照time
field in ascending order, from earliest to latest.time
字段从最早到最晚的升序对文档进行排序。output
specifies:指定:price
as the field for which to fill in missing values.作为要填充缺失值的字段。{ method: "linear" }
as the fill method.作为填充方法。Thelinear
fill method fills missingprice
values using linear interpolationbased on the surrounding
price
values in the sequence.linear
填充方法使用基于序列中周围price
值的线性插值来填充缺失的
price
值。
Example output:示例输出:
[
{
_id: ObjectId("620ad41c394d47411658b5e9"),
time: ISODate("2021-03-08T09:00:00.000Z"),
price: 500
},
{
_id: ObjectId("620ad41c394d47411658b5ea"),
time: ISODate("2021-03-08T10:00:00.000Z"),
price: 507.5
},
{
_id: ObjectId("620ad41c394d47411658b5eb"),
time: ISODate("2021-03-08T11:00:00.000Z"),
price: 515
},
{
_id: ObjectId("620ad41c394d47411658b5ec"),
time: ISODate("2021-03-08T12:00:00.000Z"),
price: 505
},
{
_id: ObjectId("620ad41c394d47411658b5ed"),
time: ISODate("2021-03-08T13:00:00.000Z"),
price: 495
},
{
_id: ObjectId("620ad41c394d47411658b5ee"),
time: ISODate("2021-03-08T14:00:00.000Z"),
price: 485
}
]
Fill Missing Field Values Based on the Last Observed Value根据上次观测值填充缺失的字段值
Create a 创建一个restaurantReviews
collection that contains review scores for a single restaurant over time:restaurantReviews
集合,该集合包含单个餐厅在一段时间内的评分:
db.restaurantReviews.insertMany( [
{
date: ISODate("2021-03-08"),
score: 90
},
{
date: ISODate("2021-03-09"),
score: 92
},
{
date: ISODate("2021-03-10")
},
{
date: ISODate("2021-03-11")
},
{
date: ISODate("2021-03-12"),
score: 85
},
{
date: ISODate("2021-03-13")
}
] )
The 集合中的某些文档缺少score
field is missing for some of the documents in the collection.score
字段。
To populate the missing 要填充缺失的score
fields and ensure that there are no gaps in the data, use $fill
. score
字段并确保数据中没有空白,请使用$fill
。In the following example, 在以下示例中,$fill
uses the locf
fill method to fill the missing score
values with the previous score
in the sequence:$fill
使用locf
填充方法用序列中的前一个score
填充缺失的分值:
db.restaurantReviews.aggregate( [
{
$fill:
{
sortBy: { date: 1 },
output:
{
"score": { method: "locf" }
}
}
}
] )
In the preceding pipeline:在前面的管道中:
$fill
fills in missing填写缺失的score
values.score
值。sortBy: { date: 1 }
sorts the documents by the按照date
field in ascending order, from earliest to latest.date
字段从最早到最晚的升序对文档进行排序。output
specifies:指定:score
as the field for which to fill in missing values.作为要填充缺失值的字段。{ method: "locf" }
as the fill method.作为填充方法。Thelocf
fill method fills missingscore
values with the last observedscore
in the sequence.locf
填充方法使用序列中最后一次观察到的score
来填充缺失的分数值。
Example output:示例输出:
[
{
_id: ObjectId("62040bc9394d47411658b553"),
date: ISODate("2021-03-08T00:00:00.000Z"),
score: 90
},
{
_id: ObjectId("62040bc9394d47411658b554"),
date: ISODate("2021-03-09T00:00:00.000Z"),
score: 92
},
{
_id: ObjectId("62040bc9394d47411658b555"),
date: ISODate("2021-03-10T00:00:00.000Z"),
score: 92
},
{
_id: ObjectId("62040bc9394d47411658b556"),
date: ISODate("2021-03-11T00:00:00.000Z"),
score: 92
},
{
_id: ObjectId("62040bc9394d47411658b557"),
date: ISODate("2021-03-12T00:00:00.000Z"),
score: 85
},
{
_id: ObjectId("62040bc9394d47411658b558"),
date: ISODate("2021-03-13T00:00:00.000Z"),
score: 85
}
]
Fill Data for Distinct Partitions填充不同分区的数据
Consider the previous example with restaurant reviews but instead of tracking a single restaurant, the collection now contains reviews for multiple restaurants.考虑一下前面的餐厅评论示例,但该集合现在包含多家餐厅的评论,而不是跟踪单个餐厅。
Create a collection named 创建一个名为restaurantReviewsMultiple
and populate the collection with these documents:restaurantReviewsMultiple
的集合使用以下文档对集合进行填充:
db.restaurantReviewsMultiple.insertMany( [
{
date: ISODate("2021-03-08"),
restaurant: "Joe's Pizza",
score: 90
},
{
date: ISODate("2021-03-08"),
restaurant: "Sally's Deli",
score: 75
},
{
date: ISODate("2021-03-09"),
restaurant: "Joe's Pizza",
score: 92
},
{
date: ISODate("2021-03-09"),
restaurant: "Sally's Deli"
},
{
date: ISODate("2021-03-10"),
restaurant: "Joe's Pizza"
},
{
date: ISODate("2021-03-10"),
restaurant: "Sally's Deli",
score: 68
},
{
date: ISODate("2021-03-11"),
restaurant: "Joe's Pizza",
score: 93
},
{
date: ISODate("2021-03-11"),
restaurant: "Sally's Deli"
}
] )
The 集合中的某些文档缺少score
field is missing for some of the documents in the collection.score
字段。
To populate the missing 要填充缺失的score
fields and ensure that there are no gaps in the data, use $fill
. score
字段并确保数据中没有空白,请使用$fill
。In the following example, 在以下示例中,$fill
uses the locf
fill method to fill the missing score
values with the previous score
in the sequence:$fill
使用locf
填充方法用序列中的前一个score
填充缺失的score
值:
db.restaurantReviewsMultiple.aggregate( [
{
$fill:
{
sortBy: { date: 1 },
partitionBy: { "restaurant": "$restaurant" },
output:
{
"score": { method: "locf" }
}
}
}
] )
In the preceding pipeline:在前面的管道中:
$fill
fills in missing填写缺失的score
values.score
值。sortBy: { date: 1 }
sorts the documents by the按照date
field in ascending order, from earliest to latest.date
字段从最早到最晚的升序对文档进行排序。partitionBy: { "restaurant": "$restaurant" }
partitions the data by按restaurant
. There are two restaurants:Joe's Pizza
andSally's Deli
.restaurant
划分数据。有两家餐馆:Joe's Pizza
和Sally's Deli
。output
specifies:指定:score
as the field for which to fill in missing values.作为要填充缺失值的字段。{ method: "locf" }
as the fill method.作为填充方法。Thelocf
fill method fills missingscore
values with the last observedscore
in the sequence.locf
填充方法使用序列中最后一次观察到的score
来填充缺失的分数值。
Example output:示例输出:
[
{
_id: ObjectId("620559f4394d47411658b58f"),
date: ISODate("2021-03-08T00:00:00.000Z"),
restaurant: "Joe's Pizza",
score: 90
},
{
_id: ObjectId("620559f4394d47411658b591"),
date: ISODate("2021-03-09T00:00:00.000Z"),
restaurant: "Joe's Pizza",
score: 92
},
{
_id: ObjectId("620559f4394d47411658b593"),
date: ISODate("2021-03-10T00:00:00.000Z"),
restaurant: "Joe's Pizza",
score: 92
},
{
_id: ObjectId("620559f4394d47411658b595"),
date: ISODate("2021-03-11T00:00:00.000Z"),
restaurant: "Joe's Pizza",
score: 93
},
{
_id: ObjectId("620559f4394d47411658b590"),
date: ISODate("2021-03-08T00:00:00.000Z"),
restaurant: "Sally's Deli",
score: 75
},
{
_id: ObjectId("620559f4394d47411658b592"),
date: ISODate("2021-03-09T00:00:00.000Z"),
restaurant: "Sally's Deli",
score: 75
},
{
_id: ObjectId("620559f4394d47411658b594"),
date: ISODate("2021-03-10T00:00:00.000Z"),
restaurant: "Sally's Deli",
score: 68
},
{
_id: ObjectId("620559f4394d47411658b596"),
date: ISODate("2021-03-11T00:00:00.000Z"),
restaurant: "Sally's Deli",
score: 68
}
]
Indicate if a Field was Populated Using $fill
指示字段是否使用$fill
填充
$fill
When you populate missing values, the output does not indicate if a value was populated with the 填充缺少的值时,输出不会指示值是用$fill
operator or if the value existed in the document originally. $fill
运算符填充的,还是该值最初存在于文档中。To distinguish between filled and preexisting values, you can use a 要区分填充值和预先存在的值,可以在$set
stage before $fill
and set a new field based on whether the value exists.$fill
之前使用$set
阶段,并根据值是否存在设置新字段。
For example, create a 例如,创建一个restaurantReviews
collection that contains review scores for a restaurant over time:restaurantReviews
集合,该集合包含一段时间内某家餐厅的评分:
db.restaurantReviews.insertMany( [
{
date: ISODate("2021-03-08"),
score: 90
},
{
date: ISODate("2021-03-09"),
score: 92
},
{
date: ISODate("2021-03-10")
},
{
date: ISODate("2021-03-11")
},
{
date: ISODate("2021-03-12"),
score: 85
},
{
date: ISODate("2021-03-13")
}
] )
The 集合中的某些文档缺少score
field is missing for some of the documents in the collection. score
字段。You can populate missing 您可以使用score
values with the $fill
operator.$fill
运算符填充缺失的score
值。
Create a pipeline to perform the following actions:创建一个管道以执行以下操作:
Add a new field to each document (using向每个文档添加一个新字段(使用$set
) indicating if the document'sscore
field exists prior to the$fill
operator populating values.$set
),指示在$fill
运算符填充值之前文档的score
字段是否存在。This new field is called这个新字段称为valueExisted
.valueExisted
。Populate missing用序列中最后一次观察到的score
values with the last observedscore
in the sequence. The fill methodlocf
stands for "last observation carried forward".score
填充缺失的score
值。填充法locf
代表“最后一次观测结转”。
The pipeline looks like this:管道如下所示:
db.restaurantReviews.aggregate( [
{
$set: {
"valueExisted": {
"$ifNull": [
{ "$toBool": { "$toString": "$score" } },
false
]
}
}
},
{
$fill: {
sortBy: { date: 1 },
output:
{
"score": { method: "locf" }
}
}
}
] )
Handling Values of Zero处理零值
In the 在$ifNull
expression, the score
values are converted to strings, then to booleans. $ifNull
表达式中,score
值会转换为字符串,然后转换为布尔值。The $toBool
expression always converts strings to true
. $toBool
表达式总是将字符串转换为true
。If the 如果未将score
values are not converted to strings, score
values of 0
will have valueExisted
set to false
.score
值转换为字符串,则0
的score
值的valueExisted
将设置为false
。
Output:输出:
[
{
_id: ObjectId("63595116b1fac2ee2e957f15"),
date: ISODate("2021-03-08T00:00:00.000Z"),
score: 90,
valueExisted: true
},
{
_id: ObjectId("63595116b1fac2ee2e957f16"),
date: ISODate("2021-03-09T00:00:00.000Z"),
score: 92,
valueExisted: true
},
{
_id: ObjectId("63595116b1fac2ee2e957f17"),
date: ISODate("2021-03-10T00:00:00.000Z"),
valueExisted: false,
score: 92
},
{
_id: ObjectId("63595116b1fac2ee2e957f18"),
date: ISODate("2021-03-11T00:00:00.000Z"),
valueExisted: false,
score: 92
},
{
_id: ObjectId("63595116b1fac2ee2e957f19"),
date: ISODate("2021-03-12T00:00:00.000Z"),
score: 85,
valueExisted: true
},
{
_id: ObjectId("63595116b1fac2ee2e957f1a"),
date: ISODate("2021-03-13T00:00:00.000Z"),
valueExisted: false,
score: 85
}
]