$linearFill (aggregation)
On this page
Definition
$linearFillNew in version 5.3.
Fills
nulland missing fields in a window using linear interpolation based on surrounding field values.$linearFillis only available in the$setWindowFieldsstage.
Syntax
The $linearFill expression has this syntax:
{ $linearFill: <expression> }
For more information on expressions, see Expressions.
Behavior
$linearFill fills null and missing fields using linear interpolation based on surrounding non-null field values. The surrounding field values are determined by the sort order specified in $setWindowFields.
-
$linearFillfillsnulland missing values proportionally spanning the value range between surrounding non-nullvalues. To determine the values for missing fields,$linearFilluses:-
The difference of surrounding non-
nullvalues. -
The number of
nullfields to fill between the surrounding values.
-
-
$linearFillcan fill multiple consecutivenullvalues if those values are preceded and followed by non-nullvalues according to the sort order specified in$setWindowFields.Example
If a collection contains these documents:
{ index: 0, value: 0 }, { index: 1, value: null }, { index: 2, value: null }, { index: 3, value: null }, { index: 4, value: 10 }After using
$linearFillto fill thenullvalues, the documents become:{ index: 0, value: 0 }, { index: 1, value: 2.5 }, { index: 2, value: 5 }, { index: 3, value: 7.5 }, { index: 4, value: 10 }For a complete example, see Examples.
-
nullvalues that are not preceded and followed by non-nullvalues remainnull.
Comparison of $fill and $linearFill
To fill missing field values using linear interpolation, you can use:
-
The
$fillstage with{ method: "linear" }.When you use the
$fillstage, the field you specify in the output is the same field used as the source data. See Fill Missing Field Values with Linear Interpolation. -
The
$linearFilloperator inside of a$setWindowFieldsstage.When you use the
$linearFilloperator, you can set values for a different field than the field used as the source data. See Use Multiple Fill Methods in a Single Stage.
Examples
The examples on this page use a stock collection that contains tracks a single company's stock price at hourly intervals:
db.stock.insertMany( [ { time: ISODate("2021-03-08T09:00:00.000Z"), price: 500 }, { time: ISODate("2021-03-08T10:00:00.000Z"), }, { time: ISODate("2021-03-08T11:00:00.000Z"), price: 515 }, { time: ISODate("2021-03-08T12:00:00.000Z") }, { time: ISODate("2021-03-08T13:00:00.000Z") }, { time: ISODate("2021-03-08T14:00:00.000Z"), price: 485 } ] )
The price field is missing for some of the documents in the collection.
Fill Missing Values with Linear Interpolation
To populate the missing price values using linear interpolation, use $linearFill inside of a $setWindowFields stage:
db.stock.aggregate( [ { $setWindowFields: { sortBy: { time: 1 }, output: { price: { $linearFill: "$price" } } } } ] )
In the example:
-
sortBy: { time: 1 }sorts the documents by thetimefield in ascending order, from earliest to latest. -
output specifies:
-
priceas the field for which to fill in missing values. -
{ $linearFill: "$price" }as the value for the missing field.$linearFillfills missingpricevalues using linear interpolation based on the surroundingpricevalues in the sequence.
-
Example output:
[
{
_id: ObjectId("620ad555394d47411658b5ef"),
time: ISODate("2021-03-08T09:00:00.000Z"),
price: 500
},
{
_id: ObjectId("620ad555394d47411658b5f0"),
time: ISODate("2021-03-08T10:00:00.000Z"),
price: 507.5
},
{
_id: ObjectId("620ad555394d47411658b5f1"),
time: ISODate("2021-03-08T11:00:00.000Z"),
price: 515
},
{
_id: ObjectId("620ad555394d47411658b5f2"),
time: ISODate("2021-03-08T12:00:00.000Z"),
price: 505
},
{
_id: ObjectId("620ad555394d47411658b5f3"),
time: ISODate("2021-03-08T13:00:00.000Z"),
price: 495
},
{
_id: ObjectId("620ad555394d47411658b5f4"),
time: ISODate("2021-03-08T14:00:00.000Z"),
price: 485
}
]Use Multiple Fill Methods in a Single Stage
When you use the $setWindowFields stage to fill missing values, you can set values for a different field than the field you fill from. As a result, you can use multiple fill methods in a single $setWindowFields stage and output the results in distinct fields.
The following pipeline populates missing price fields using linear interpolation and the last-observation-carried-forward method:
db.stock.aggregate( [ { $setWindowFields: { sortBy: { time: 1 }, output: { linearFillPrice: { $linearFill: "$price" }, locfPrice: { $locf: "$price" } } } } ] )
In the example:
-
sortBy: { time: 1 }sorts the documents by thetimefield in ascending order, from earliest to latest. -
output specifies:
-
linearFillPriceas a target field to be filled.-
{ $linearFill: "$price" }is the value for thelinearFillPricefield.$linearFillfills missingpricevalues using linear interpolation based on the surroundingpricevalues in the sequence.
-
-
locfPriceas a target field to be filled.-
{ $locf: "$price" }is the value for thelocfPricefield.locfstands for last observation carried forward.$locffills missingpricevalues with the value from the previous document in the sequence.
-
-
Example output:
[
{
_id: ObjectId("620ad555394d47411658b5ef"),
time: ISODate("2021-03-08T09:00:00.000Z"),
price: 500,
linearFillPrice: 500,
locfPrice: 500
},
{
_id: ObjectId("620ad555394d47411658b5f0"),
time: ISODate("2021-03-08T10:00:00.000Z"),
linearFillPrice: 507.5,
locfPrice: 500
},
{
_id: ObjectId("620ad555394d47411658b5f1"),
time: ISODate("2021-03-08T11:00:00.000Z"),
price: 515,
linearFillPrice: 515,
locfPrice: 515
},
{
_id: ObjectId("620ad555394d47411658b5f2"),
time: ISODate("2021-03-08T12:00:00.000Z"),
linearFillPrice: 505,
locfPrice: 515
},
{
_id: ObjectId("620ad555394d47411658b5f3"),
time: ISODate("2021-03-08T13:00:00.000Z"),
linearFillPrice: 495,
locfPrice: 515
},
{
_id: ObjectId("620ad555394d47411658b5f4"),
time: ISODate("2021-03-08T14:00:00.000Z"),
price: 485,
linearFillPrice: 485,
locfPrice: 485
}
]Restrictions
-
To use
$linearFill, you must use the sortBy field to sort your data. -
When using
$linearFillwindow function,$setWindowFieldsreturns an error if there are any repeated values in the sortBy field in a single partition.