$percentile (aggregation)

~~On this page~~本页内容

~~Definition~~定义
~~Syntax~~语法
~~Command Fields~~命令字段
~~Behavior~~行为
~~Examples~~实例
~~Learn More~~了解更多信息

Definition定义

$percentile

~~New in version 7.0.~~ 7.0版新增。

~~Returns an array of scalar values that correspond to specified percentile values.~~返回与指定百分位数相对应的标量值数组。

~~You can use $percentile as an accumulator in the $group stage or as an aggegation expression.~~您可以将$percentage用作$group阶段中的累加器或聚合表达式。

Syntax语法

~~The syntax for $percentile is:~~$percentage的语法为：

{
   $percentile: {
      input: <expression>,
      p: [ <expression1>, <expression2>, ... ],
      method: <string>
   }
}

Command Fields命令字段

$percentile ~~takes the following fields:~~采用以下字段：

~~Field~~字段	~~Type~~类型	~~Necessity~~必要性	~~Description~~描述
`input`	Expression	~~Required~~必要的	`$percentile` ~~calculates the percentile values of this data.~~ 计算该数据的百分位数。~~`input` must be a field name or an expression that evaluates to a numeric type.~~ `input`必须是字段名或计算结果为数字类型的表达式。~~If the expression cannot be converted to a numeric type, the `$percentile` calculation ignores it~~如果表达式无法转换为数字类型，则`$percentage`计算会忽略它.
`p`	Expression	~~Required~~必要的	`$percentile` ~~calculates a percentile value for each element in `p`.~~ 计算`p`中每个元素的百分比值。~~The elements represent percentages and must evaluate to numeric values in the range `0.0` to `1.0`, inclusive.~~元素表示百分比，并且必须计算为`0.0`到`1.0`（包括`0.0`和`1.0`）范围内的数值。 `$percentile` ~~returns results in the same order as the elements in `p`.~~ 按与`p`中元素相同的顺序返回结果。
`method`	String	~~Required~~必要的	~~The method that `mongod` uses to calculate the percentile value.~~ `mongod`用于计算百分位数的方法。~~The method must be `'approximate'`.~~该方法必须是`'approximate'`。

Behavior行为

~~You can use $percentile in:~~您可以在以下位置使用$percentage：

$group ~~stages as an accumulator~~阶段作为累加器
$setWindowFields ~~stages as an accumulator~~阶段作为累加器
$project ~~stages as an aggregation expression~~阶段作为聚合表达式

$percentile ~~has the following characteristics as an accumulator, it:~~作为累加器，它具有以下特性：

~~Calculates a single result for all the documents in the stage.~~计算阶段中所有文档的单个结果。
~~Uses the t-digest algorithm to calculate approximate, percentile based metrics.~~使用t-摘要算法计算基于百分比的近似度量。
~~Uses approximate methods to scale to large volumes of data.~~使用近似方法扩展到大量数据。

$percentile ~~has the following characteristics as an aggregation expression, it:~~作为聚合表达式，它具有以下特性：

~~Accepts an array as input~~接受数组作为输入
~~Calculates a separate result for each input document~~为每个输入文档计算单独的结果

Type of Operation操作类型

~~In a $group stage, $percentile is an accumulator and calculates a value for all documents in the window.~~在$group阶段，$percentage是一个累加器，用于计算窗口中所有文档的值。

~~In a $project stage, $percentile is an aggregation expression and calculates values for each document.~~在$project阶段中，$percentage是一个聚合表达式，用于计算每个文档的值。

~~In $setWindowFields stages, $percentile returns a result for each document like an aggregation expression, but the results are computed over groups of documents like an accumulator.~~在$setWindowFields阶段中，$percentage为每个文档返回一个结果，类似于聚合表达式，但结果是在文档组上计算的，类似于累加器。

Calculation Considerations计算注意事项

~~In $group stages, $percentile always uses an approximate calculation method.~~在$group阶段，$percentage始终使用近似计算方法。

~~In $project stages, $percentile might use the discrete calculation method even when the approximate method is specified.~~在$project阶段，即使指定了近似方法，$percentage也可能使用离散计算方法。

~~In $setWindowFields stages, the workload determines the calculation method that $percentile uses.~~在$setWindowFields阶段中，工作负载决定了$percentage使用的计算方法。

~~The computed percentiles $percentile returns might vary, even on the same datasets. This is because the algorithm calculates approximate values.~~即使在相同的数据集上，计算的百分位数$percentage返回值也可能有所不同。这是因为该算法计算近似值。

~~Duplicate samples can cause ambiguity.~~ 重复的样本可能导致歧义。~~If there are a large number of duplicates, the percentile values may not represent the actual sample distribution.~~ 如果存在大量重复，则百分位数可能无法代表实际的样本分布。~~Consider a data set where all the samples are the same.~~ 考虑一个所有样本都相同的数据集。~~All of the values in the data set fall at or below any percentile.~~ 数据集中的所有值都位于或低于任何百分位数。~~A "50th percentile" value would actually represent either 0 or 100 percent of the samples.~~“第50个百分位”的值实际上代表0或100%的样本。

$percentile ~~returns the minimum value for p = 0.0.~~返回p=0.0时的最小值。

$percentile ~~returns the maximum value for p = 1.0.~~返回p=1.0时的最大值。

Array Input数组输入

~~If you use $percentile as an aggregation expression in a $project stage, you can use an array as input. The syntax is:~~如果在$project阶段中使用$percentage作为聚合表达式，则可以使用数组作为输入。语法为：

{
   $percentile: {
      input: [ <expression1, <expression2>, .., <expressionN> ],
      p: [ <expression1>, <expression2>, ... ],
      method: <string>
   }
}

Window Functions窗口函数

~~A window function lets you calculate results over a moving "window" of neighboring documents.~~ 窗口函数使您可以在相邻文档的移动“窗口”上计算结果。~~As each document passes though the pipeline, the $setWindowFields stage:~~当每个文档通过管道时，$setWindowFields阶段：

~~Recomputes the set of documents in the current window~~在当前窗口中重新计算文档集
~~calculates a value for all documents in the set~~计算集合中所有文档的值
~~returns a single value for that document~~返回该文档的单个值

~~You can use $percentile in a $setWindowFields stage to calculate rolling statistics for time series or other related data.~~您可以在$setWindowFields阶段中使用$percentage来计算时间序列或其他相关数据的滚动统计信息。

~~When you use $percentile in a $setWindowField stage, the input value must be a field name. If you enter an array instead of a field name, the operation fails.~~在$setWindowField阶段中使用$percentage时，input值必须是字段名。如果输入的是数组而不是字段名，则操作将失败。

Examples实例

~~The following examples use the testScores collection.~~ 以下示例使用testScores集合。~~Create the collection:~~创建集合：

db.testScores.insertMany( [
   { studentId: "2345", test01: 62, test02: 81, test03: 80 },
   { studentId: "2356", test01: 60, test02: 83, test03: 79 },
   { studentId: "2358", test01: 67, test02: 82, test03: 78 },
   { studentId: "2367", test01: 64, test02: 72, test03: 77 },
   { studentId: "2369", test01: 60, test02: 53, test03: 72 }
] )

Calculate a Single Value as an Accumulator将单个值计算为累加器

~~Create an accumulator that calculates a single percentile value:~~创建一个累加器，用于计算单个百分比值：

db.testScores.aggregate( [
   {
      $group: {
         _id: null,
         test01_percentiles: {
            $percentile: {
               input: "$test01",
               p: [ 0.95 ],
               method: 'approximate'
            }
         },
      }
   }
] )

~~Output:~~输出：

{ _id: null, test01_percentiles: [ 67 ] }

~~The _id field value is null so $group selects all the documents in the collection.~~_id字段值为null，因此$group选择集合中的所有文档。

~~The percentile accumulator takes its input data from the test01 field.~~percentile累加器从test01字段获取其输入数据。

~~In this example, the percentiles array, p, has one value so the $percentile operator only calculates one term for the test01 data.~~ 在本例中，百分位数数组p有一个值，因此$percentage运算符只为test01数据计算一个项。~~The 95th percentile value is 67.~~第95个百分位数为67。

Calculate Multiple Values as an Accumulator作为累加器计算多个值

~~Create an accumulator that calculates multiple percentile values:~~创建一个计算多个百分比值的累加器：

db.testScores.aggregate( [
   {
       $group: {
          _id: null,
          test01_percentiles: {
             $percentile: {
                input: "$test01",
                p: [ 0.5, 0.75, 0.9, 0.95 ],
                method: 'approximate'
             }
          },
          test02_percentiles: {
             $percentile: {
                input: "$test02",
                p: [ 0.5, 0.75, 0.9, 0.95 ],
                method: 'approximate'
             }
          },
          test03_percentiles: {
             $percentile: {
                input: "$test03",
                p: [ 0.5, 0.75, 0.9, 0.95 ],
                method: 'approximate'
             }
          },
          test03_percent_alt: {
             $percentile: {
                input: "$test03",
                p: [ 0.9, 0.5, 0.75, 0.95 ],
                method: 'approximate'
             }
          },
       }
    }
] )

~~Output:~~输出：

{
    _id: null,
   test01_percentiles: [ 62, 64, 67, 67 ],
   test02_percentiles: [ 81, 82, 83, 83 ],
   test03_percentiles: [ 78, 79, 80, 80 ],
   test03_percent_alt: [ 80, 78, 79, 80 ]
}

~~The _id field value is null so $group selects all the documents in the collection.~~_id字段值为null，因此$group选择集合中的所有文档。

~~The percentile accumulator calculates values for three fields, test01, test02, and test03.~~百分位数累加器计算test01、test02和test03三个字段的值。

~~The accumulator calculates the 50th, 75th, 90th, and 95th percentile values for each input field.~~累加器计算每个输入字段的第50、第75、第90和第95百分位数。

~~The percentile values are returned in the same order as the elements of p. The values in test03_percentiles and test03_percent_alt are the same, but their order is different.~~ 百分位数的返回顺序与p的元素相同。test03_percentiles和test03_persent_alt中的值相同，但顺序不同。~~The order of elements in each result array matches the corresponding order of elements in p.~~每个结果数组中元素的顺序与p中元素的相应顺序相匹配。

Use `$percentile` in a `$project` Stage在`$project`阶段中使用`$percentage`

~~You can use a field name or an array as input in a $project stage.~~您可以使用字段名或数组作为$project阶段中的输入。

db.testScores.aggregate( [
   {
      $project: {
         _id: 0,
         studentId: 1,
         testPercentiles: {
            $percentile: {
               input: [ "$test01", "$test02", "$test03" ],
               p: [ 0.5, 0.95 ],
               method: 'approximate'
            }
         }
      }
   }
] )

~~Output:~~输出：

{ studentId: '2345', testPercentiles: [ 80, 81 ] },
{ studentId: '2356', testPercentiles: [ 79, 83 ] },
{ studentId: '2358', testPercentiles: [ 78, 82 ] },
{ studentId: '2367', testPercentiles: [ 72, 77 ] },
{ studentId: '2369', testPercentiles: [ 60, 72 ] }

~~When $percentile is an aggregation expression there is a result for each studentId.~~当$percentage是一个聚合表达式时，每个studentId都有一个结果。

Use `$percentile` in a `$setWindowField` Stage在`$setWindowField`阶段中使用`$percentage`

~~To base your percentile values on local data trends, use $percentile in a $setWindowField aggregation pipeline stage.~~要根据本地数据趋势确定百分比值，请在$setWindowField聚合管道阶段中使用$percentage。

~~This example creates a window to filter scores:~~本例创建了一个筛选分数的窗口：

db.testScores.aggregate( [
   {
      $setWindowFields: {
         sortBy: { test01: 1 },
         output: {
            test01_95percentile: {
               $percentile: {
                  input: "$test01",
                  p: [ 0.95 ],
                  method: 'approximate'
               },
               window: {
                  range: [ -3, 3 ]
               }
            }
         }
      }
   },
   {
      $project: {
         _id: 0,
         studentId: 1,
         test01_95percentile: 1
      }
   }
] )

~~Output:~~输出：

{ studentId: '2356', test01_95percentile: [ 62 ] },
{ studentId: '2369', test01_95percentile: [ 62 ] },
{ studentId: '2345', test01_95percentile: [ 64 ] },
{ studentId: '2367', test01_95percentile: [ 67 ] },
{ studentId: '2358', test01_95percentile: [ 67 ] }

~~In this example, the percentile calculation for each document also incorporates data from the three documents before and after it.~~在本例中，每个文档的百分比计算还合并了之前和之后三个文档的数据。

Learn More了解更多信息

~~The $median operator is a special case of the $percentile operator that uses a fixed value of p: [ 0.5 ].~~$median运算符是使用固定值p:[0.5]的$percentage运算符的特殊情况。

~~For more information on window functions, see: $setWindowFields.~~有关窗口函数的详细信息，请参阅：$setWindowFields。

← $or (aggregation) $pow (aggregation) →

$percentile (aggregation)

Definition定义

Syntax语法

Command Fields命令字段

Behavior行为

Type of Operation操作类型

Calculation Considerations计算注意事项

Array Input数组输入

Window Functions窗口函数

Examples实例

Calculate a Single Value as an Accumulator将单个值计算为累加器

Calculate Multiple Values as an Accumulator作为累加器计算多个值

Use $percentile in a $project Stage在$project阶段中使用$percentage

Use $percentile in a $setWindowField Stage在$setWindowField阶段中使用$percentage

Learn More了解更多信息

Use `$percentile` in a `$project` Stage在`$project`阶段中使用`$percentage`

Use `$percentile` in a `$setWindowField` Stage在`$setWindowField`阶段中使用`$percentage`