On this page本页内容
$group
Groups input documents by the specified 按指定的_id
expression and for each distinct grouping, outputs a document. _id
表达式对输入文档进行分组,并为每个不同的分组输出一个文档。The 每个输出文档的_id
field of each output document contains the unique group by value. _id
字段包含唯一的group by值。The output documents can also contain computed fields that hold the values of some accumulator expression.输出文档还可以包含保存某些累加器表达式值的计算字段。
$group
does not order its output documents.不对其输出文档进行排序。
The $group
stage has the following prototype form:$group
阶段具有以下原型形式:
{ $group: { _id: <expression>, // Group By Expression <field1>: { <accumulator1> : <expression1> }, ... } }
_id | _id value of null, or any other constant value, the $group stage calculates accumulated values for all the input documents as a whole. _id 值指定为null 或任何其他常量值,$group 阶段将计算所有输入文档作为一个整体的累积值。 |
field |
The _id
and the accumulator operators can accept any valid expression
. _id
和累加器运算符可以接受任何有效表达式。For more information on expressions, see Expressions.有关表达式的详细信息,请参阅表达式。
The <accumulator>
operator must be one of the following accumulator operators:<accumulator>
运算符必须是以下累加器运算符之一:
Changed in version 5.0.在版本5.0中更改。
$accumulator | |
$addToSet |
|
$avg |
|
$bottom |
|
$bottomN |
|
$count |
|
$first |
|
$firstN |
|
$last |
|
$lastN |
|
$max |
|
$maxN |
|
$mergeObjects | |
$min |
|
$push |
|
$stdDevPop |
|
$stdDevSamp |
|
$sum |
|
$top |
|
$topN |
|
$group
The $group
stage has a limit of 100 megabytes of RAM. $group
阶段的RAM限制为100兆字节。By default, if the stage exceeds this limit, 默认情况下,如果阶段超过此限制,$group
returns an error. $group
将返回错误。To allow more space for stage processing, use the allowDiskUse option to enable aggregation pipeline stages to write data to temporary files.要为阶段处理提供更多空间,请使用allowDiskUse
选项启用聚合管道阶段将数据写入临时文件。
$group
This section describes optimizations to improve the performance of 本节介绍了提高$group
. $group
性能的优化。There are optimizations that you can make manually and optimizations MongoDB makes internally.您可以手动进行优化,也可以在MongoDB内部进行优化。
If a pipeline 如果管道按同一字段排序和分组,sorts
and groups
by the same field and the $group
stage only uses the $first
accumulator operator, consider adding an index on the grouped field which matches the sort order. $group
阶段仅使用$first
累加器运算符,请考虑在分组字段上添加与排序顺序匹配的索引。In some cases, the 在某些情况下,$group
stage can use the index to quickly find the first document of each group.$group
阶段可以使用索引快速查找每个组的第一个文档。
If a collection named 如果名为foo
contains an index { x: 1, y: 1 }
, the following pipeline can use that index to find the first document of each group:foo
的集合包含索引{ x: 1, y: 1 }
,则以下管道可以使用该索引查找每个组的第一个文档:
db.foo.aggregate([ { $sort:{ x : 1, y : 1 } }, { $group: { _id: { x : "$x" }, y: { $first : "$y" } } } ])
Starting in MongoDB 5.2, MongoDB uses the slot-based execution query engine to execute 从MongoDB 5.2开始,MongoDB使用基于时隙的执行查询引擎来执行$group
stages when $group
is either:$group
阶段,当$group
为:
$match
followed by $group
, the $match
and $group
stages are executed by the slot-based engine.$match
开头,后面跟着$group
,则$match
和$group
阶段由基于插槽的引擎执行。In most cases, the slot-based engine provides improved performance and lower CPU and memory costs compared to the classic query engine.在大多数情况下,与传统查询引擎相比,基于插槽的引擎提供了更好的性能和更低的CPU和内存成本。
See 请参见$group
Optimization.$group
优化。
In 在mongosh
, create a sample collection named sales
with the following documents:mongosh
中,使用以下文档创建名为sales
的样本集合:
db.sales.insertMany([ { "_id" : 1, "item" : "abc", "price" : NumberDecimal("10"), "quantity" : NumberInt("2"), "date" : ISODate("2014-03-01T08:00:00Z") }, { "_id" : 2, "item" : "jkl", "price" : NumberDecimal("20"), "quantity" : NumberInt("1"), "date" : ISODate("2014-03-01T09:00:00Z") }, { "_id" : 3, "item" : "xyz", "price" : NumberDecimal("5"), "quantity" : NumberInt( "10"), "date" : ISODate("2014-03-15T09:00:00Z") }, { "_id" : 4, "item" : "xyz", "price" : NumberDecimal("5"), "quantity" : NumberInt("20") , "date" : ISODate("2014-04-04T11:21:39.736Z") }, { "_id" : 5, "item" : "abc", "price" : NumberDecimal("10"), "quantity" : NumberInt("10") , "date" : ISODate("2014-04-04T21:23:13.331Z") }, { "_id" : 6, "item" : "def", "price" : NumberDecimal("7.5"), "quantity": NumberInt("5" ) , "date" : ISODate("2015-06-04T05:08:13Z") }, { "_id" : 7, "item" : "def", "price" : NumberDecimal("7.5"), "quantity": NumberInt("10") , "date" : ISODate("2015-09-10T08:43:00Z") }, { "_id" : 8, "item" : "abc", "price" : NumberDecimal("10"), "quantity" : NumberInt("5" ) , "date" : ISODate("2016-02-06T20:20:13Z") }, ])
The following aggregation operation uses the 以下聚合操作使用$group
stage to count the number of documents in the sales
collection:$group
阶段计算sales
集合中的文档数:
db.sales.aggregate( [ { $group: { _id: null, count: { $count: { } } } } ] )
The operation returns the following result:该操作返回以下结果:
{ "_id" : null, "count" : 8 }
This aggregation operation is equivalent to the following SQL statement:此聚合操作等效于以下SQL语句:
SELECT COUNT(*) AS count FROM sales
The following aggregation operation uses the 以下聚合操作使用$group
stage to retrieve the distinct item values from the sales
collection:$group
阶段从sales
集合中检索不同的项目值:
db.sales.aggregate( [ { $group : { _id : "$item" } } ] )
The operation returns the following result:该操作返回以下结果:
{ "_id" : "abc" } { "_id" : "jkl" } { "_id" : "def" } { "_id" : "xyz" }
The following aggregation operation groups documents by the 以下聚合操作按item
field, calculating the total sale amount per item and returning only the items with total sale amount greater than or equal to 100:item
字段对文档进行分组,计算每个项目的总销售额,并仅返回总销售额大于或等于100的项目:
db.sales.aggregate( [ // First Stage { $group : { _id : "$item", totalSaleAmount: { $sum: { $multiply: [ "$price", "$quantity" ] } } } }, // Second Stage { $match: { "totalSaleAmount": { $gte: 100 } } } ] )
$group
stage groups the documents by item
to retrieve the distinct item values. $group
阶段按项目对文档进行分组,以检索不同的项目值。totalSaleAmount
for each item.totalSaleAmount
(总销售额)。$match
stage filters the resulting documents to only return items with a totalSaleAmount
greater than or equal to 100.$match
阶段筛选结果文档,仅返回totalSaleAmount
大于或等于100的项目。The operation returns the following result:该操作返回以下结果:
{ "_id" : "abc", "totalSaleAmount" : NumberDecimal("170") } { "_id" : "xyz", "totalSaleAmount" : NumberDecimal("150") } { "_id" : "def", "totalSaleAmount" : NumberDecimal("112.5") }
This aggregation operation is equivalent to the following SQL statement:此聚合操作等效于以下SQL语句:
SELECT item, Sum(( price * quantity )) AS totalSaleAmount FROM sales GROUP BY item HAVING totalSaleAmount >= 100
In 在mongosh
, create a sample collection named sales
with the following documents:mongosh
中,使用以下文档创建名为sales
的样本集合:
db.sales.insertMany([ { "_id" : 1, "item" : "abc", "price" : NumberDecimal("10"), "quantity" : NumberInt("2"), "date" : ISODate("2014-03-01T08:00:00Z") }, { "_id" : 2, "item" : "jkl", "price" : NumberDecimal("20"), "quantity" : NumberInt("1"), "date" : ISODate("2014-03-01T09:00:00Z") }, { "_id" : 3, "item" : "xyz", "price" : NumberDecimal("5"), "quantity" : NumberInt( "10"), "date" : ISODate("2014-03-15T09:00:00Z") }, { "_id" : 4, "item" : "xyz", "price" : NumberDecimal("5"), "quantity" : NumberInt("20") , "date" : ISODate("2014-04-04T11:21:39.736Z") }, { "_id" : 5, "item" : "abc", "price" : NumberDecimal("10"), "quantity" : NumberInt("10") , "date" : ISODate("2014-04-04T21:23:13.331Z") }, { "_id" : 6, "item" : "def", "price" : NumberDecimal("7.5"), "quantity": NumberInt("5" ) , "date" : ISODate("2015-06-04T05:08:13Z") }, { "_id" : 7, "item" : "def", "price" : NumberDecimal("7.5"), "quantity": NumberInt("10") , "date" : ISODate("2015-09-10T08:43:00Z") }, { "_id" : 8, "item" : "abc", "price" : NumberDecimal("10"), "quantity" : NumberInt("5" ) , "date" : ISODate("2016-02-06T20:20:13Z") }, ])
The following pipeline calculates the total sales amount, average sales quantity, and sale count for each day in the year 2014:以下管道计算2014年每天的总销售额、平均销售额和销售额:
db.sales.aggregate([ // First Stage { $match : { "date": { $gte: new ISODate("2014-01-01"), $lt: new ISODate("2015-01-01") } } }, // Second Stage { $group : { _id : { $dateToString: { format: "%Y-%m-%d", date: "$date" } }, totalSaleAmount: { $sum: { $multiply: [ "$price", "$quantity" ] } }, averageQuantity: { $avg: "$quantity" }, count: { $sum: 1 } } }, // Third Stage { $sort : { totalSaleAmount: -1 } } ])
$match
stage filters the documents to only pass documents from the year 2014 to the next stage.$match
阶段筛选文档,仅将2014年的文档传递到下一阶段。$group
stage groups the documents by date and calculates the total sale amount, average quantity, and total count of the documents in each group.$group
阶段按日期对文档进行分组,并计算每组文档的总销售额、平均数量和总计数。$sort
stage sorts the results by the total sale amount for each group in descending order.$sort
阶段按每组的总销售额降序对结果进行排序。The operation returns the following results:该操作返回以下结果:
{ "_id" : "2014-04-04", "totalSaleAmount" : NumberDecimal("200"), "averageQuantity" : 15, "count" : 2 } { "_id" : "2014-03-15", "totalSaleAmount" : NumberDecimal("50"), "averageQuantity" : 10, "count" : 1 } { "_id" : "2014-03-01", "totalSaleAmount" : NumberDecimal("40"), "averageQuantity" : 1.5, "count" : 2 }
This aggregation operation is equivalent to the following SQL statement:此聚合操作等效于以下SQL语句:
SELECT date, Sum(( price * quantity )) AS totalSaleAmount, Avg(quantity) AS averageQuantity, Count(*) AS Count FROM sales GROUP BY Date(date) ORDER BY totalSaleAmount DESC
null
null
分组The following aggregation operation specifies a group 以下聚合操作指定group_id
of null
, calculating the total sale amount, average quantity, and count of all documents in the collection._id
为null
,用于计算集合中所有文档的总销售额、平均数量和计数。
db.sales.aggregate([ { $group : { _id : null, totalSaleAmount: { $sum: { $multiply: [ "$price", "$quantity" ] } }, averageQuantity: { $avg: "$quantity" }, count: { $sum: 1 } } } ])
The operation returns the following result:该操作返回以下结果:
{ "_id" : null, "totalSaleAmount" : NumberDecimal("452.5"), "averageQuantity" : 7.875, "count" : 8 }
This aggregation operation is equivalent to the following SQL statement:此聚合操作等效于以下SQL语句:
SELECT Sum(price * quantity) AS totalSaleAmount, Avg(quantity) AS averageQuantity, Count(*) AS Count FROM sales
$count
db.collection.countDocuments()
which wraps the $group
aggregation stage with a $sum
expression.db.collection.countDocuments()
,它用$sum
表达式包装$group
聚合阶段。In 在mongosh
, create a sample collection named books
with the following documents:mongosh
中,使用以下文档创建名为books
的示例集合:
db.books.insertMany([ { "_id" : 8751, "title" : "The Banquet", "author" : "Dante", "copies" : 2 }, { "_id" : 8752, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1 }, { "_id" : 8645, "title" : "Eclogues", "author" : "Dante", "copies" : 2 }, { "_id" : 7000, "title" : "The Odyssey", "author" : "Homer", "copies" : 10 }, { "_id" : 7020, "title" : "Iliad", "author" : "Homer", "copies" : 10 } ])
title
by author
author
分组title
The following aggregation operation pivots the data in the 以下聚合操作将books
collection to have titles grouped by authors.books
集合中的数据透视为按作者分组的标题。
db.books.aggregate([ { $group : { _id : "$author", books: { $push: "$title" } } } ])
The operation returns the following documents:操作将返回以下文档:
{ "_id" : "Homer", "books" : [ "The Odyssey", "Iliad" ] } { "_id" : "Dante", "books" : [ "The Banquet", "Divine Comedy", "Eclogues" ] }
author
author
分组文档The following aggregation operation groups documents by 以下聚合操作按author
:author
对文档进行分组:
db.books.aggregate([ // First Stage { $group : { _id : "$author", books: { $push: "$$ROOT" } } }, // Second Stage { $addFields: { totalCopies : { $sum: "$books.copies" } } } ])
$group
uses the 使用$$ROOT
system variable to group the entire documents by authors. $$ROOT
系统变量按作者对整个文档进行分组。This stage passes the following documents to the next stage:本阶段将以下文件传递给下一阶段:
{ "_id" : "Homer", "books" : [ { "_id" : 7000, "title" : "The Odyssey", "author" : "Homer", "copies" : 10 }, { "_id" : 7020, "title" : "Iliad", "author" : "Homer", "copies" : 10 } ] }, { "_id" : "Dante", "books" : [ { "_id" : 8751, "title" : "The Banquet", "author" : "Dante", "copies" : 2 }, { "_id" : 8752, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1 }, { "_id" : 8645, "title" : "Eclogues", "author" : "Dante", "copies" : 2 } ] }
$addFields
adds a field to the output containing the total copies of books for each author.在输出中添加一个字段,该字段包含每个作者的图书总数。
The resulting documents must not exceed the BSON Document Size limit of 16 megabytes.生成的文档不得超过16兆字节的BSON文档大小限制。
The operation returns the following documents:操作将返回以下文档:
{ "_id" : "Homer", "books" : [ { "_id" : 7000, "title" : "The Odyssey", "author" : "Homer", "copies" : 10 }, { "_id" : 7020, "title" : "Iliad", "author" : "Homer", "copies" : 10 } ], "totalCopies" : 20 } { "_id" : "Dante", "books" : [ { "_id" : 8751, "title" : "The Banquet", "author" : "Dante", "copies" : 2 }, { "_id" : 8752, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1 }, { "_id" : 8645, "title" : "Eclogues", "author" : "Dante", "copies" : 2 } ], "totalCopies" : 5 }
The Aggregation with the Zip Code Data Set tutorial provides an extensive example of the “使用邮政编码数据集进行聚合”教程提供了一个常见用例中的$group
operator in a common use case.$group
运算符的广泛示例。