$bucket
Categorizes incoming documents into groups, called buckets, based on a specified expression and bucket boundaries, and outputs one document per bucket. Each output document contains an _id field whose value specifies the inclusive lower bound of the bucket. The output option specifies the fields included in each output document.
$bucket only produces output documents for buckets that contain at least one input document.
$bucket
The $bucket stage has a limit of 100 megabytes of RAM. By default, if the stage exceeds this limit, $bucket returns an error. To allow more space for stage processing, use the allowDiskUse option to enable aggregation pipeline stages to write data to temporary files.
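As a minimal sketch of how the allowDiskUse option might be passed, the snippet below builds a pipeline and an options document and hands both to aggregate(). The collection name artists, the field year_born, and the boundary values are assumptions chosen only for illustration.

```javascript
// Illustrative pipeline; the collection and field names are assumptions.
const pipeline = [
  {
    $bucket: {
      groupBy: "$year_born",
      boundaries: [1840, 1860, 1880],
      default: "Other"
    }
  }
];

// Options document passed as the second argument to aggregate().
const options = { allowDiskUse: true };

// `db` exists only inside mongosh; the guard lets this file load elsewhere too.
if (typeof db !== "undefined") {
  db.artists.aggregate(pipeline, options).forEach((doc) => printjson(doc));
}
```

Keeping the options in a separate variable makes it easy to reuse the same setting across several aggregations.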
{
  $bucket: {
    groupBy: <expression>,
    boundaries: [ <lowerbound1>, <lowerbound2>, ... ],
    default: <literal>,
    output: {
      <output1>: { <$accumulator expression> },
      ...
      <outputN>: { <$accumulator expression> }
    }
  }
}
The $bucket document contains the following fields:
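Field | Type | Description
groupBy | expression | The expression to group documents by. To group by a field, give its field path prefixed with a dollar sign, for example "$year_born".
boundaries | array | An array of values that define the bucket boundaries. Each adjacent pair of values is the inclusive lower bound and the exclusive upper bound of one bucket.
default | literal | The _id of an additional bucket for documents whose groupBy value does not fall into any bucket defined by boundaries.
output | document | A document that specifies the fields to include in each output document, using accumulator expressions: <outputfield1>: { <accumulator>: <expression1> }, ... <outputfieldN>: { <accumulator>: <expressionN> }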
$bucket requires at least one of the following conditions to be met or the operation throws an error:
- Each input document resolves the groupBy expression to a value within one of the bucket ranges specified by boundaries, or
- A default value is specified to bucket documents whose groupBy values are outside of the boundaries or of a different BSON type than the values in boundaries.
If the groupBy expression resolves to an array or a document, $bucket arranges the input documents into buckets using the comparison logic from $sort.
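The two conditions above can be modeled in a few lines of plain JavaScript. This is only a sketch for numeric values: assignBucket and NO_DEFAULT are hypothetical names, not MongoDB APIs, and the sketch does not attempt the BSON comparison rules that $bucket applies to arrays or documents.

```javascript
// Sentinel meaning "no default bucket was specified" (hypothetical name).
const NO_DEFAULT = Symbol("no default");

// Returns the _id of the bucket a value would fall into.
// Each adjacent pair of boundaries is [inclusive lower, exclusive upper).
function assignBucket(value, boundaries, defaultId = NO_DEFAULT) {
  if (typeof value === "number") {
    for (let i = 0; i < boundaries.length - 1; i++) {
      if (value >= boundaries[i] && value < boundaries[i + 1]) {
        return boundaries[i]; // the bucket _id is its inclusive lower bound
      }
    }
  }
  // Out of range, missing, or non-numeric: needs a default bucket.
  if (defaultId === NO_DEFAULT) {
    throw new Error("value is outside the boundaries and no default is set");
  }
  return defaultId;
}

const bounds = [1840, 1850, 1860, 1870, 1880];
console.log(assignBucket(1853, bounds, "Other")); // 1850
console.log(assignBucket(1880, bounds, "Other")); // Other
```

The upper boundary 1880 itself lands in the default bucket because the upper bound of the last range is exclusive.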
In mongosh, create a sample collection named artists with the following documents:
db.artists.insertMany([
  { "_id" : 1, "last_name" : "Bernard", "first_name" : "Emil", "year_born" : 1868, "year_died" : 1941, "nationality" : "France" },
  { "_id" : 2, "last_name" : "Rippl-Ronai", "first_name" : "Joszef", "year_born" : 1861, "year_died" : 1927, "nationality" : "Hungary" },
  { "_id" : 3, "last_name" : "Ostroumova", "first_name" : "Anna", "year_born" : 1871, "year_died" : 1955, "nationality" : "Russia" },
  { "_id" : 4, "last_name" : "Van Gogh", "first_name" : "Vincent", "year_born" : 1853, "year_died" : 1890, "nationality" : "Holland" },
  { "_id" : 5, "last_name" : "Maurer", "first_name" : "Alfred", "year_born" : 1868, "year_died" : 1932, "nationality" : "USA" },
  { "_id" : 6, "last_name" : "Munch", "first_name" : "Edvard", "year_born" : 1863, "year_died" : 1944, "nationality" : "Norway" },
  { "_id" : 7, "last_name" : "Redon", "first_name" : "Odilon", "year_born" : 1840, "year_died" : 1916, "nationality" : "France" },
  { "_id" : 8, "last_name" : "Diriks", "first_name" : "Edvard", "year_born" : 1855, "year_died" : 1930, "nationality" : "Norway" }
])
The following operation groups the documents into buckets according to the year_born field and filters based on the count of documents in the buckets:
db.artists.aggregate( [
  // First Stage
  {
    $bucket: {
      groupBy: "$year_born",                        // Field to group by
      boundaries: [ 1840, 1850, 1860, 1870, 1880 ], // Boundaries for the buckets
      default: "Other",                             // Bucket id for documents which do not fall into a bucket
      output: {                                     // Output for each bucket
        "count": { $sum: 1 },
        "artists" : {
          $push: {
            "name": { $concat: [ "$first_name", " ", "$last_name"] },
            "year_born": "$year_born"
          }
        }
      }
    }
  },
  // Second Stage
  {
    $match: { count: {$gt: 3} }
  }
] )
The $bucket stage groups the documents into buckets by the year_born field. The buckets have the following boundaries:
- [1840, 1850) with inclusive lower bound 1840 and exclusive upper bound 1850.
- [1850, 1860) with inclusive lower bound 1850 and exclusive upper bound 1860.
- [1860, 1870) with inclusive lower bound 1860 and exclusive upper bound 1870.
- [1870, 1880) with inclusive lower bound 1870 and exclusive upper bound 1880.
- If a document did not contain the year_born field or its year_born field was outside the ranges above, it would be placed in the default bucket with the _id value "Other".
The stage includes the output document to determine the fields to return:
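Field | Description
_id | Inclusive lower bound of the bucket.
count | The number of documents in the bucket.
artists | An array of documents, one per artist in the bucket, each containing the artist's name (first_name and last_name joined by a space) and year_born.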
This stage passes the following documents to the next stage:
{ "_id" : 1840, "count" : 1, "artists" : [ { "name" : "Odilon Redon", "year_born" : 1840 } ] }
{ "_id" : 1850, "count" : 2, "artists" : [ { "name" : "Vincent Van Gogh", "year_born" : 1853 },
                                           { "name" : "Edvard Diriks", "year_born" : 1855 } ] }
{ "_id" : 1860, "count" : 4, "artists" : [ { "name" : "Emil Bernard", "year_born" : 1868 },
                                           { "name" : "Joszef Rippl-Ronai", "year_born" : 1861 },
                                           { "name" : "Alfred Maurer", "year_born" : 1868 },
                                           { "name" : "Edvard Munch", "year_born" : 1863 } ] }
{ "_id" : 1870, "count" : 1, "artists" : [ { "name" : "Anna Ostroumova", "year_born" : 1871 } ] }
The $match stage filters the output from the previous stage to return only buckets that contain more than 3 documents.
The operation returns the following document:
{ "_id" : 1860, "count" : 4, "artists" :
  [
    { "name" : "Emil Bernard", "year_born" : 1868 },
    { "name" : "Joszef Rippl-Ronai", "year_born" : 1861 },
    { "name" : "Alfred Maurer", "year_born" : 1868 },
    { "name" : "Edvard Munch", "year_born" : 1863 }
  ]
}
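To see why only the 1860 bucket survives, the two stages can be imitated in plain JavaScript without a database. This is a rough model under stated assumptions: bucketByYear is a hypothetical helper, the input is reduced to the name and year_born of the sample artists, and the buckets come out in first-seen order rather than the order $bucket would produce.

```javascript
// Sample data reduced to the two fields the pipeline outputs.
const artists = [
  { name: "Emil Bernard", year_born: 1868 },
  { name: "Joszef Rippl-Ronai", year_born: 1861 },
  { name: "Anna Ostroumova", year_born: 1871 },
  { name: "Vincent Van Gogh", year_born: 1853 },
  { name: "Alfred Maurer", year_born: 1868 },
  { name: "Edvard Munch", year_born: 1863 },
  { name: "Odilon Redon", year_born: 1840 },
  { name: "Edvard Diriks", year_born: 1855 }
];
const boundaries = [1840, 1850, 1860, 1870, 1880];

// Hypothetical helper that models the $bucket stage for a numeric year_born.
function bucketByYear(docs, bounds, defaultId) {
  const buckets = new Map();
  for (const doc of docs) {
    // Find the inclusive lower bound of the range containing year_born.
    const lower = bounds
      .slice(0, -1)
      .find((b, i) => doc.year_born >= b && doc.year_born < bounds[i + 1]);
    const id = lower === undefined ? defaultId : lower;
    if (!buckets.has(id)) buckets.set(id, { _id: id, count: 0, artists: [] });
    const bucket = buckets.get(id);
    bucket.count += 1;        // models "count": { $sum: 1 }
    bucket.artists.push(doc); // models the $push accumulator
  }
  return [...buckets.values()];
}

const stageOne = bucketByYear(artists, boundaries, "Other");
// Models the second stage: { $match: { count: { $gt: 3 } } }
const result = stageOne.filter((bucket) => bucket.count > 3);
console.log(result.map((bucket) => bucket._id)); // [ 1860 ]
```

No "Other" bucket appears in stageOne because every sample artist falls inside one of the four ranges, and empty buckets are never emitted.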
Use $bucket with $facet to Bucket by Multiple Fields
You can use the $facet stage to perform multiple $bucket aggregations in a single stage.
In mongosh, create a sample collection named artwork with the following documents:
db.artwork.insertMany([
  { "_id" : 1, "title" : "The Pillars of Society", "artist" : "Grosz", "year" : 1926, "price" : NumberDecimal("199.99") },
  { "_id" : 2, "title" : "Melancholy III", "artist" : "Munch", "year" : 1902, "price" : NumberDecimal("280.00") },
  { "_id" : 3, "title" : "Dancer", "artist" : "Miro", "year" : 1925, "price" : NumberDecimal("76.04") },
  { "_id" : 4, "title" : "The Great Wave off Kanagawa", "artist" : "Hokusai", "price" : NumberDecimal("167.30") },
  { "_id" : 5, "title" : "The Persistence of Memory", "artist" : "Dali", "year" : 1931, "price" : NumberDecimal("483.00") },
  { "_id" : 6, "title" : "Composition VII", "artist" : "Kandinsky", "year" : 1913, "price" : NumberDecimal("385.00") },
  { "_id" : 7, "title" : "The Scream", "artist" : "Munch", "year" : 1893 /* No price */ },
  { "_id" : 8, "title" : "Blue Flower", "artist" : "O'Keefe", "year" : 1918, "price" : NumberDecimal("118.42") }
])
The following operation uses two $bucket stages within a $facet stage to create two groupings, one by price and the other by year:
db.artwork.aggregate( [
  {
    $facet: {                               // Top-level $facet stage
      "price": [                            // Output field 1
        {
          $bucket: {
            groupBy: "$price",              // Field to group by
            boundaries: [ 0, 200, 400 ],    // Boundaries for the buckets
            default: "Other",               // Bucket id for documents which do not fall into a bucket
            output: {                       // Output for each bucket
              "count": { $sum: 1 },
              "artwork" : { $push: { "title": "$title", "price": "$price" } },
              "averagePrice": { $avg: "$price" }
            }
          }
        }
      ],
      "year": [                                      // Output field 2
        {
          $bucket: {
            groupBy: "$year",                        // Field to group by
            boundaries: [ 1890, 1910, 1920, 1940 ],  // Boundaries for the buckets
            default: "Unknown",                      // Bucket id for documents which do not fall into a bucket
            output: {                                // Output for each bucket
              "count": { $sum: 1 },
              "artwork": { $push: { "title": "$title", "year": "$year" } }
            }
          }
        }
      ]
    }
  }
] )
The first facet groups the input documents by price. The buckets have the following boundaries:
- [0, 200) with inclusive lower bound 0 and exclusive upper bound 200.
- [200, 400) with inclusive lower bound 200 and exclusive upper bound 400.
- "Other", the default bucket containing documents without prices or prices outside the ranges above.
The $bucket stage includes the output document to determine the fields to return:
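Field | Description
_id | Inclusive lower bound of the bucket.
count | The number of documents in the bucket.
artwork | An array of documents containing the title and price of each artwork in the bucket.
averagePrice | Uses the $avg operator to display the average price of all artwork in the bucket.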
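The averagePrice values can be checked by hand from the sample data. The sketch below does so in plain JavaScript, with prices written as integer cents so that the arithmetic stays exact; avgIgnoringMissing is a hypothetical helper that mimics how $avg skips documents with no numeric price, such as "The Scream".

```javascript
// Hypothetical helper: average of the numeric entries only, or null if none.
function avgIgnoringMissing(values) {
  const nums = values.filter((v) => typeof v === "number");
  if (nums.length === 0) return null;
  return nums.reduce((sum, v) => sum + v, 0) / nums.length;
}

// Prices in cents for the [0, 200) bucket: 199.99, 76.04, 167.30, 118.42.
const bucket0Cents = [19999, 7604, 16730, 11842];
// The "Other" bucket: 483.00 plus one document with no price.
const otherCents = [48300, undefined];

const bucket0Avg = avgIgnoringMissing(bucket0Cents) / 100;
const otherAvg = avgIgnoringMissing(otherCents) / 100;

console.log(bucket0Avg); // 140.4375
console.log(otherAvg);   // 483
```

The document without a price still counts toward count in the "Other" bucket, but it does not pull the average down.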
The second facet groups the input documents by year. The buckets have the following boundaries:
- [1890, 1910) with inclusive lower bound 1890 and exclusive upper bound 1910.
- [1910, 1920) with inclusive lower bound 1910 and exclusive upper bound 1920.
- [1920, 1940) with inclusive lower bound 1920 and exclusive upper bound 1940.
- "Unknown", the default bucket containing documents without years or years outside the ranges above.
The $bucket stage includes the output document to determine the fields to return:
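Field | Description
count | The number of documents in the bucket.
artwork | An array of documents containing the title and year of each artwork in the bucket.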
The operation returns the following document:
{
  "price" : [ // Output of first facet
    {
      "_id" : 0,
      "count" : 4,
      "artwork" : [
        { "title" : "The Pillars of Society", "price" : NumberDecimal("199.99") },
        { "title" : "Dancer", "price" : NumberDecimal("76.04") },
        { "title" : "The Great Wave off Kanagawa", "price" : NumberDecimal("167.30") },
        { "title" : "Blue Flower", "price" : NumberDecimal("118.42") }
      ],
      "averagePrice" : NumberDecimal("140.4375")
    },
    {
      "_id" : 200,
      "count" : 2,
      "artwork" : [
        { "title" : "Melancholy III", "price" : NumberDecimal("280.00") },
        { "title" : "Composition VII", "price" : NumberDecimal("385.00") }
      ],
      "averagePrice" : NumberDecimal("332.50")
    },
    {
      // Includes documents without prices and prices greater than 400
      "_id" : "Other",
      "count" : 2,
      "artwork" : [
        { "title" : "The Persistence of Memory", "price" : NumberDecimal("483.00") },
        { "title" : "The Scream" }
      ],
      "averagePrice" : NumberDecimal("483.00")
    }
  ],
  "year" : [ // Output of second facet
    {
      "_id" : 1890,
      "count" : 2,
      "artwork" : [
        { "title" : "Melancholy III", "year" : 1902 },
        { "title" : "The Scream", "year" : 1893 }
      ]
    },
    {
      "_id" : 1910,
      "count" : 2,
      "artwork" : [
        { "title" : "Composition VII", "year" : 1913 },
        { "title" : "Blue Flower", "year" : 1918 }
      ]
    },
    {
      "_id" : 1920,
      "count" : 3,
      "artwork" : [
        { "title" : "The Pillars of Society", "year" : 1926 },
        { "title" : "Dancer", "year" : 1925 },
        { "title" : "The Persistence of Memory", "year" : 1931 }
      ]
    },
    {
      // Includes documents without a year
      "_id" : "Unknown",
      "count" : 1,
      "artwork" : [
        { "title" : "The Great Wave off Kanagawa" }
      ]
    }
  ]
}