$bucketAuto (aggregation)

Definition

$bucketAuto

Categorizes incoming documents into a specific number of groups, called buckets, based on a specified expression. Bucket boundaries are automatically determined in an attempt to evenly distribute the documents into the specified number of buckets.

Each bucket is represented as a document in the output. The document for each bucket contains:

- An _id object that specifies the bounds of the bucket.
  - The _id.min field specifies the inclusive lower bound for the bucket.
  - The _id.max field specifies the upper bound for the bucket. This bound is exclusive for all buckets except the final bucket in the series, where it is inclusive.
- A count field that contains the number of documents in the bucket. The count field is included by default when the output document is not specified.
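
The bound rules above can be expressed as a small predicate. The following is a minimal sketch in plain JavaScript; the helper name `inBucket` and its `isFinal` flag are hypothetical and not part of MongoDB, and the two sample buckets are illustrative values.

```javascript
// Hypothetical helper: tests whether a value falls inside a bucket's _id bounds.
// The lower bound is inclusive; the upper bound is exclusive unless the
// bucket is the final one in the series.
function inBucket(value, bucket, isFinal) {
  const { min, max } = bucket._id;
  if (value < min) return false;
  return isFinal ? value <= max : value < max;
}

const first = { _id: { min: 0, max: 20 }, count: 20 };
const last = { _id: { min: 80, max: 99 }, count: 20 };

console.log(inBucket(20, first, false)); // false: 20 belongs to the next bucket
console.log(inBucket(99, last, true));   // true: the final upper bound is inclusive
```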

The $bucketAuto stage has the following form:

{
  $bucketAuto: {
      groupBy: <expression>,
      buckets: <number>,
      output: {
         <output1>: { <$accumulator expression> },
         ...
      },
      granularity: <string>
  }
}

Field	Type	Description

groupBy	expression	An expression to group documents by. To specify a field path, prefix the field name with a dollar sign $ and enclose it in quotes.

buckets	integer	A positive 32-bit integer that specifies the number of buckets into which input documents are grouped.

output	document	Optional. A document that specifies the fields to include in the output documents in addition to the _id field. To specify a field to include, you must use accumulator expressions:

<outputfield1>: { <accumulator>: <expression1> },
...

The default count field is not included in the output document when output is specified. To include it, explicitly specify the count expression as part of the output document:

output: {
  <outputfield1>: { <accumulator>: <expression1> },
  ...
  count: { $sum: 1 }
}

granularity	string	Optional. A string that specifies the preferred number series to use to ensure that the calculated boundary edges end on preferred round numbers or their powers of 10. Available only if all groupBy values are numeric and none of them are NaN. The supported values of granularity are:

`"R5"` `"R10"` `"R20"` `"R40"` `"R80"` `"1-2-5"` `"E6"` `"E12"` `"E24"` `"E48"` `"E96"` `"E192"` `"POWERSOF2"`
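
As a hedged illustration of how these fields fit together, the sketch below assembles a stage document in plain JavaScript. The helper name `bucketAutoStage` is hypothetical; it adds `count` explicitly whenever custom output fields are supplied, since the default count field is dropped once output is specified.

```javascript
// Hypothetical helper: builds a $bucketAuto stage document.
// When custom output fields are given, count is added explicitly,
// because the default count field is not included once output is specified.
function bucketAutoStage(groupBy, buckets, output, granularity) {
  const spec = { groupBy, buckets };
  if (output) {
    spec.output = { ...output, count: { $sum: 1 } };
  }
  if (granularity) {
    spec.granularity = granularity;
  }
  return { $bucketAuto: spec };
}

const stage = bucketAutoStage("$price", 4, { titles: { $push: "$title" } });
console.log(JSON.stringify(stage, null, 2));
```

The resulting object can be placed directly in the pipeline array passed to aggregate().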

Considerations

`$bucketAuto` and Memory Restrictions

The $bucketAuto stage has a limit of 100 megabytes of RAM. By default, if the stage exceeds this limit, $bucketAuto returns an error. To allow more space for stage processing, use the allowDiskUse option to enable aggregation pipeline stages to write data to temporary files.
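
A minimal sketch of passing the option is shown below. The helper name `bucketAutoWithDiskUse` is hypothetical, and `collection` is assumed to expose `aggregate(pipeline, options)`, as collections do in mongosh and the Node.js driver.

```javascript
// Hypothetical helper: runs a single $bucketAuto stage with allowDiskUse
// enabled. `collection` is assumed to expose aggregate(pipeline, options).
function bucketAutoWithDiskUse(collection, spec) {
  const pipeline = [{ $bucketAuto: spec }];
  return collection.aggregate(pipeline, { allowDiskUse: true });
}
```

In mongosh, this could be called as `bucketAutoWithDiskUse(db.artwork, { groupBy: "$price", buckets: 4 })`.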

Behavior

There may be fewer than the specified number of buckets if:

- The number of input documents is less than the specified number of buckets.
- The number of unique values of the groupBy expression is less than the specified number of buckets.
- The granularity has fewer intervals than the number of buckets.
- The granularity is not fine enough to evenly distribute documents into the specified number of buckets.

If the groupBy expression refers to an array or document, the values are arranged using the same ordering as in $sort before determining the bucket boundaries.

The even distribution of documents across buckets depends on the cardinality, or the number of unique values, of the groupBy field. If the cardinality is not high enough, the $bucketAuto stage may not evenly distribute the results across buckets.
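
To illustrate why low cardinality can yield fewer or uneven buckets, here is a simplified model in plain JavaScript. It is a sketch under stated assumptions, not MongoDB's actual implementation: it handles numeric values only, ignores granularity, and assumes that equal values are never split across two buckets. The function name `autoBuckets` is hypothetical.

```javascript
// Simplified, hypothetical model of even bucketing (numeric values only,
// no granularity). Not MongoDB's actual algorithm.
function autoBuckets(values, buckets) {
  const sorted = [...values].sort((a, b) => a - b);
  const n = sorted.length;
  const base = Math.floor(n / buckets);
  let extra = n % buckets; // the first `extra` buckets take one more value
  const result = [];
  let start = 0;
  while (start < n && result.length < buckets) {
    let size = base;
    if (extra > 0) {
      size += 1;
      extra -= 1;
    }
    size = Math.max(size, 1);
    const isFinal = result.length === buckets - 1;
    let end = isFinal ? n : Math.min(start + size, n);
    // Assumption: equal values always land in the same bucket.
    while (end < n && sorted[end] === sorted[end - 1]) end += 1;
    // The upper bound is the next bucket's first value, or the largest value.
    const max = end < n ? sorted[end] : sorted[n - 1];
    result.push({ _id: { min: sorted[start], max }, count: end - start });
    start = end;
  }
  return result;
}

// Only two unique values, so four requested buckets collapse to two.
console.log(autoBuckets([1, 1, 1, 2, 2, 2], 4).length); // 2
```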

Granularity

The $bucketAuto stage accepts an optional granularity parameter which ensures that the boundaries of all buckets adhere to a specified preferred number series. Using a preferred number series provides more control over where the bucket boundaries are set among the range of values in the groupBy expression. A preferred number series can also help set bucket boundaries logarithmically and evenly when the range of the groupBy expression scales exponentially.

Renard Series

The Renard number series are sets of numbers derived by taking either the 5th, 10th, 20th, 40th, or 80th root of 10, then including various powers of the root that equate to values between 1.0 and 10.0 (10.3 in the case of R80).

Set granularity to R5, R10, R20, R40, or R80 to restrict bucket boundaries to values in the series. The values of the series are multiplied by a power of 10 when the groupBy values are outside of the 1.0 to 10.0 (10.3 for R80) range.

Example

The R5 series is based on the fifth root of 10, which is approximately 1.58, and includes various powers of this root (rounded) until 10 is reached. The R5 series is derived as follows:

10^(0/5) = 1
10^(1/5) = 1.584 ≈ 1.6
10^(2/5) = 2.511 ≈ 2.5
10^(3/5) = 3.981 ≈ 4.0
10^(4/5) = 6.309 ≈ 6.3
10^(5/5) = 10

The same approach is applied to the other Renard series to offer finer granularity, i.e., more intervals between 1.0 and 10.0 (10.3 for R80).
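
The R5 derivation can be reproduced with a few lines of plain JavaScript. This is only a sketch of the arithmetic: the helper name `rootsOfTen` is hypothetical, and rounding to two significant digits is an assumption that matches the R5 values above; published tables for the finer Renard series use conventional rounded values that can differ from plain rounding.

```javascript
// Returns the unrounded powers 10^(k/n) for k = 0..n.
function rootsOfTen(n) {
  const values = [];
  for (let k = 0; k <= n; k++) {
    values.push(Math.pow(10, k / n));
  }
  return values;
}

// Round to two significant digits, as in the R5 derivation.
const r5 = rootsOfTen(5).map((v) => Number(v.toPrecision(2)));
console.log(r5); // [ 1, 1.6, 2.5, 4, 6.3, 10 ]
```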

E Series

The E number series are similar to the Renard series in that they subdivide the interval from 1.0 to 10.0 by the 6th, 12th, 24th, 48th, 96th, or 192nd root of ten with a particular relative error.

Set granularity to E6, E12, E24, E48, E96, or E192 to restrict bucket boundaries to values in the series. The values of the series are multiplied by a power of 10 when the groupBy values are outside of the 1.0 to 10.0 range. To learn more about the E-series and their respective relative errors, see preferred number series.

1-2-5 Series

The 1-2-5 series behaves like a three-value Renard series, if such a series existed.

Set granularity to 1-2-5 to restrict bucket boundaries to various powers of the third root of 10, rounded to one significant digit.

Example

The following values are part of the 1-2-5 series: 0.1, 0.2, 0.5, 1, 2, 5, 10, 20, 50, 100, 200, 500, 1000, and so on...
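
The relationship between the third root of 10 and the values 1, 2, and 5 can be checked with a short sketch in plain JavaScript. The helper name `oneTwoFive` and its exponent parameters are hypothetical, chosen only for this illustration.

```javascript
// Powers of the cube root of 10, rounded to one significant digit.
const base = [0, 1, 2].map((k) => Number(Math.pow(10, k / 3).toPrecision(1)));
console.log(base); // [ 1, 2, 5 ]

// Scale the base values by powers of 10 to cover other decades.
function oneTwoFive(minExp, maxExp) {
  const series = [];
  for (let e = minExp; e <= maxExp; e++) {
    for (const m of base) {
      series.push(Number((m * Math.pow(10, e)).toPrecision(1)));
    }
  }
  return series;
}

console.log(oneTwoFive(-1, 1)); // [ 0.1, 0.2, 0.5, 1, 2, 5, 10, 20, 50 ]
```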

Powers of Two Series

Set granularity to POWERSOF2 to restrict bucket boundaries to numbers that are a power of two.

Example

The following numbers adhere to the powers of two series:

2^0 = 1
2^1 = 2
2^2 = 4
2^3 = 8
2^4 = 16
2^5 = 32
and so on...

A common real-world example is computer components such as memory, whose sizes often adhere to the POWERSOF2 set of preferred numbers:

1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, and so on...

Comparing Different Granularities

The following operation demonstrates how specifying different values for granularity affects how $bucketAuto determines bucket boundaries. A collection named things contains documents with _id values numbered from 0 to 99:

{ _id: 0 }
{ _id: 1 }
...
{ _id: 99 }

Different values for granularity are substituted into the following operation:

db.things.aggregate( [
  {
    $bucketAuto: {
      groupBy: "$_id",
      buckets: 5,
      granularity: <granularity>
    }
  }
] )

The results in the following table demonstrate how different values for granularity yield different bucket boundaries:

Granularity	Results	Notes
No granularity	`{ "_id" : { "min" : 0, "max" : 20 }, "count" : 20 }` `{ "_id" : { "min" : 20, "max" : 40 }, "count" : 20 }` `{ "_id" : { "min" : 40, "max" : 60 }, "count" : 20 }` `{ "_id" : { "min" : 60, "max" : 80 }, "count" : 20 }` `{ "_id" : { "min" : 80, "max" : 99 }, "count" : 20 }`
R20	`{ "_id" : { "min" : 0, "max" : 20 }, "count" : 20 }` `{ "_id" : { "min" : 20, "max" : 40 }, "count" : 20 }` `{ "_id" : { "min" : 40, "max" : 63 }, "count" : 23 }` `{ "_id" : { "min" : 63, "max" : 90 }, "count" : 27 }` `{ "_id" : { "min" : 90, "max" : 100 }, "count" : 10 }`
E24	`{ "_id" : { "min" : 0, "max" : 20 }, "count" : 20 }` `{ "_id" : { "min" : 20, "max" : 43 }, "count" : 23 }` `{ "_id" : { "min" : 43, "max" : 68 }, "count" : 25 }` `{ "_id" : { "min" : 68, "max" : 91 }, "count" : 23 }` `{ "_id" : { "min" : 91, "max" : 100 }, "count" : 9 }`
1-2-5	`{ "_id" : { "min" : 0, "max" : 20 }, "count" : 20 }` `{ "_id" : { "min" : 20, "max" : 50 }, "count" : 30 }` `{ "_id" : { "min" : 50, "max" : 100 }, "count" : 50 }`	The specified number of buckets exceeds the number of intervals in the series.
POWERSOF2	`{ "_id" : { "min" : 0, "max" : 32 }, "count" : 32 }` `{ "_id" : { "min" : 32, "max" : 64 }, "count" : 32 }` `{ "_id" : { "min" : 64, "max" : 128 }, "count" : 36 }`	The specified number of buckets exceeds the number of intervals in the series.
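
The counts in the table follow directly from the boundaries. As a sanity check, the sketch below recounts the _id values 0 to 99 for two of the rows in plain JavaScript; it takes the boundaries from the table as input and does not compute them. The helper name `countPerBucket` is hypothetical.

```javascript
// Counts how many values fall in each bucket defined by consecutive boundaries.
// Every upper bound is exclusive except the last, which is inclusive.
function countPerBucket(values, boundaries) {
  const counts = new Array(boundaries.length - 1).fill(0);
  for (const v of values) {
    for (let i = 0; i < counts.length; i++) {
      const isLast = i === counts.length - 1;
      const below = isLast ? v <= boundaries[i + 1] : v < boundaries[i + 1];
      if (v >= boundaries[i] && below) {
        counts[i] += 1;
        break;
      }
    }
  }
  return counts;
}

const ids = Array.from({ length: 100 }, (_, i) => i); // _id values 0..99

console.log(countPerBucket(ids, [0, 20, 40, 63, 90, 100])); // R20: [ 20, 20, 23, 27, 10 ]
console.log(countPerBucket(ids, [0, 32, 64, 128]));         // POWERSOF2: [ 32, 32, 36 ]
```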

Example

Consider a collection artwork with the following documents:

{ "_id" : 1, "title" : "The Pillars of Society", "artist" : "Grosz", "year" : 1926,
    "price" : NumberDecimal("199.99"),
    "dimensions" : { "height" : 39, "width" : 21, "units" : "in" } }
{ "_id" : 2, "title" : "Melancholy III", "artist" : "Munch", "year" : 1902,
    "price" : NumberDecimal("280.00"),
    "dimensions" : { "height" : 49, "width" : 32, "units" : "in" } }
{ "_id" : 3, "title" : "Dancer", "artist" : "Miro", "year" : 1925,
    "price" : NumberDecimal("76.04"),
    "dimensions" : { "height" : 25, "width" : 20, "units" : "in" } }
{ "_id" : 4, "title" : "The Great Wave off Kanagawa", "artist" : "Hokusai",
    "price" : NumberDecimal("167.30"),
    "dimensions" : { "height" : 24, "width" : 36, "units" : "in" } }
{ "_id" : 5, "title" : "The Persistence of Memory", "artist" : "Dali", "year" : 1931,
    "price" : NumberDecimal("483.00"),
    "dimensions" : { "height" : 20, "width" : 24, "units" : "in" } }
{ "_id" : 6, "title" : "Composition VII", "artist" : "Kandinsky", "year" : 1913,
    "price" : NumberDecimal("385.00"),
    "dimensions" : { "height" : 30, "width" : 46, "units" : "in" } }
{ "_id" : 7, "title" : "The Scream", "artist" : "Munch",
    "price" : NumberDecimal("159.00"),
    "dimensions" : { "height" : 24, "width" : 18, "units" : "in" } }
{ "_id" : 8, "title" : "Blue Flower", "artist" : "O'Keefe", "year" : 1918,
    "price" : NumberDecimal("118.42"),
    "dimensions" : { "height" : 24, "width" : 20, "units" : "in" } }

Single Facet Aggregation

In the following operation, input documents are grouped into four buckets according to the values in the price field:

db.artwork.aggregate( [
   {
     $bucketAuto: {
         groupBy: "$price",
         buckets: 4
     }
   }
] )

The operation returns the following documents:

{
  "_id" : {
    "min" : NumberDecimal("76.04"),
    "max" : NumberDecimal("159.00")
  },
  "count" : 2
}
{
  "_id" : {
    "min" : NumberDecimal("159.00"),
    "max" : NumberDecimal("199.99")
  },
  "count" : 2
}
{
  "_id" : {
    "min" : NumberDecimal("199.99"),
    "max" : NumberDecimal("385.00")
  },
  "count" : 2
}
{
  "_id" : {
    "min" : NumberDecimal("385.00"),
    "max" : NumberDecimal("483.00")
  },
  "count" : 2
}

Multi-Faceted Aggregation

The $bucketAuto stage can be used within the $facet stage to process multiple aggregation pipelines on the same set of input documents from artwork.

The following aggregation pipeline groups the documents from the artwork collection into buckets based on price, year, and the calculated area:

db.artwork.aggregate( [
  {
    $facet: {
      "price": [
        {
          $bucketAuto: {
            groupBy: "$price",
            buckets: 4
          }
        }
      ],
      "year": [
        {
          $bucketAuto: {
            groupBy: "$year",
            buckets: 3,
            output: {
              "count": { $sum: 1 },
              "years": { $push: "$year" }
            }
          }
        }
      ],
      "area": [
        {
          $bucketAuto: {
            groupBy: {
              $multiply: [ "$dimensions.height", "$dimensions.width" ]
            },
            buckets: 4,
            output: {
              "count": { $sum: 1 },
              "titles": { $push: "$title" }
            }
          }
        }
      ]
    }
  }
] )

The operation returns the following document:

{
  "area" : [
    {
      "_id" : { "min" : 432, "max" : 500 },
      "count" : 3,
      "titles" : [
        "The Scream",
        "The Persistence of Memory",
        "Blue Flower"
      ]
    },
    {
      "_id" : { "min" : 500, "max" : 864 },
      "count" : 2,
      "titles" : [
        "Dancer",
        "The Pillars of Society"
      ]
    },
    {
      "_id" : { "min" : 864, "max" : 1568 },
      "count" : 2,
      "titles" : [
        "The Great Wave off Kanagawa",
        "Composition VII"
      ]
    },
    {
      "_id" : { "min" : 1568, "max" : 1568 },
      "count" : 1,
      "titles" : [
        "Melancholy III"
      ]
    }
  ],
  "price" : [
    {
      "_id" : { "min" : NumberDecimal("76.04"), "max" : NumberDecimal("159.00") },
      "count" : 2
    },
    {
      "_id" : { "min" : NumberDecimal("159.00"), "max" : NumberDecimal("199.99") },
      "count" : 2
    },
    {
      "_id" : { "min" : NumberDecimal("199.99"), "max" : NumberDecimal("385.00") },
      "count" : 2
    },
    {
      "_id" : { "min" : NumberDecimal("385.00"), "max" : NumberDecimal("483.00") },
      "count" : 2
    }
  ],
  "year" : [
    { "_id" : { "min" : null, "max" : 1913 }, "count" : 3, "years" : [ 1902 ] },
    { "_id" : { "min" : 1913, "max" : 1926 }, "count" : 3, "years" : [ 1913, 1918, 1925 ] },
    { "_id" : { "min" : 1926, "max" : 1931 }, "count" : 2, "years" : [ 1926, 1931 ] }
  ]
}
