Database Manual

Aggregation Operations

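Aggregation operations process multiple documents and return computed results. You can use aggregation operations to group values from multiple documents, perform operations on the grouped data to return a single result, and analyze data changes over time.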

By using the built-in aggregation operators in MongoDB, you can perform analytics on your cluster without having to move your data to another platform.

Get Started

To perform aggregation operations, you can use aggregation pipelines or single purpose aggregation methods. Both are described in the sections that follow.

You can run aggregation pipelines in the UI for deployments hosted in MongoDB Atlas.

Aggregation Pipelines

An aggregation pipeline consists of one or more stages that process documents. These documents can come from a collection, a view, or a specially designed stage.

Each stage performs an operation on the input documents. For example, a stage can filter documents, group documents, and calculate values. The documents that a stage outputs are then passed to the next stage in the pipeline.
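The hand-off between stages can be pictured with a small plain JavaScript sketch. It is only a mental model under stated assumptions: the `runPipeline` helper, the `orders` documents, and the two stage functions are all hypothetical, and nothing here reflects how MongoDB executes a pipeline internally.

```javascript
// Hypothetical in-memory model: each stage is a function that takes an array
// of documents and returns a new array of documents. This only illustrates
// the data flow between stages; it is not MongoDB's implementation.
function runPipeline(documents, stages) {
  // Feed the output of each stage into the next one, in order.
  return stages.reduce((docs, stage) => stage(docs), documents);
}

// Made-up input documents for illustration.
const orders = [
  { item: "apple", qty: 5 },
  { item: "pear", qty: 0 },
  { item: "apple", qty: 2 },
];

const stages = [
  // Stage 1: filter, keeping only documents with a positive quantity.
  (docs) => docs.filter((d) => d.qty > 0),
  // Stage 2: calculate a single value from the documents that remain.
  (docs) => [{ totalQty: docs.reduce((sum, d) => sum + d.qty, 0) }],
];

const result = runPipeline(orders, stages);
console.log(result); // [ { totalQty: 7 } ]
```

The second stage only ever sees what the first stage let through, which is why the order of stages matters.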

An aggregation pipeline can return results for groups of documents. You can also update documents with an aggregation pipeline using the stages shown in Updates with Aggregation Pipeline.

Note

Aggregation pipelines run with the db.collection.aggregate() method do not modify documents in a collection, unless the pipeline contains a $merge or $out stage.

Aggregation Pipeline Example

The following example pipeline uses documents from the sample data available in MongoDB Atlas, specifically the sample_training.routes collection. In this pipeline, we'll find the top three airlines that offer the most direct flights out of the airport in Portland, Oregon, USA (PDX).

First, add a $match stage to filter the documents to flights that have a src_airport value of PDX and zero stops:

{
   $match : {
      "src_airport" : "PDX",
      "stops" : 0
   }
}
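To make the filtering step concrete, here is a rough plain JavaScript analogue of this $match stage. The `sampleRoutes` documents and the `matchEquals` helper are hypothetical, and the sketch assumes only top-level equality checks; the real $match stage accepts the full query language.

```javascript
// Made-up documents that only mimic the fields used in this example.
const sampleRoutes = [
  { airline: { name: "Airline A" }, src_airport: "PDX", stops: 0 },
  { airline: { name: "Airline B" }, src_airport: "PDX", stops: 1 },
  { airline: { name: "Airline A" }, src_airport: "SEA", stops: 0 },
];

// Rough equivalent of { $match: { src_airport: "PDX", stops: 0 } }:
// a document passes only if every listed field equals the given value.
function matchEquals(docs, criteria) {
  return docs.filter((doc) =>
    Object.entries(criteria).every(([field, value]) => doc[field] === value)
  );
}

const matched = matchEquals(sampleRoutes, { src_airport: "PDX", stops: 0 });
console.log(matched.length); // 1
```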

The $match stage reduces the number of documents in our pipeline from 66,985 to 113. Next, use a $group stage to group the documents by airline name and count the number of flights:

{
   $group : {
      _id : {
         "airline name": "$airline.name"
      },
      count : {
         $sum : 1
      }
   }
}
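The grouping and counting can be sketched in plain JavaScript as follows. The `directFlights` documents and the `groupAndCount` helper are hypothetical; the output shape mirrors the `_id` and `count` fields defined in the $group stage above, and adding 1 per document plays the role of `$sum : 1`.

```javascript
// Made-up documents standing in for the output of the $match stage.
const directFlights = [
  { airline: { name: "Airline A" } },
  { airline: { name: "Airline B" } },
  { airline: { name: "Airline A" } },
];

// Rough equivalent of grouping by "$airline.name" with count: { $sum: 1 }.
function groupAndCount(docs) {
  const counts = new Map();
  for (const doc of docs) {
    const name = doc.airline.name;
    // Add 1 for every document that falls into this group.
    counts.set(name, (counts.get(name) || 0) + 1);
  }
  // Emit one output document per distinct airline name.
  return Array.from(counts, ([name, count]) => ({
    _id: { "airline name": name },
    count,
  }));
}

const grouped = groupAndCount(directFlights);
console.log(grouped.length); // 2
```

This sketch emits groups in first-seen order; a real $group stage does not promise any particular output order, which is why the pipeline sorts explicitly afterwards.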

The $group stage reduces the number of documents in the pipeline to 16, one per airline. To find the airlines with the most flights, use the $sort stage to sort the remaining documents in descending order:

{
   $sort : {
      count : -1
   }
}

After you sort your documents, use the $limit stage to return the top three airlines that offer the most direct flights out of PDX:

{
   $limit : 3
}
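Sorting in descending order and then keeping the first three documents behaves much like the following plain JavaScript sketch. The `groupedCounts` array is made-up data in the shape the $group stage produces, not real results from the sample collection.

```javascript
// Made-up grouped documents, shaped like the output of the $group stage.
const groupedCounts = [
  { _id: { "airline name": "Airline A" }, count: 2 },
  { _id: { "airline name": "Airline B" }, count: 5 },
  { _id: { "airline name": "Airline C" }, count: 1 },
  { _id: { "airline name": "Airline D" }, count: 3 },
];

// Rough equivalent of { $sort: { count: -1 } } followed by { $limit: 3 }.
// Copy the array first so the input is left unmodified.
const topThree = [...groupedCounts]
  .sort((a, b) => b.count - a.count) // -1 means descending by count
  .slice(0, 3);                      // keep only the first three documents

console.log(topThree.map((d) => d._id["airline name"]));
// [ 'Airline B', 'Airline D', 'Airline A' ]
```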

Running the documents in the sample_training.routes collection through this aggregation pipeline shows that the top three airlines offering non-stop flights from PDX are Alaska, American, and United Airlines, with 39, 17, and 13 flights, respectively.

The full pipeline resembles the following:

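// Assumption: the current database is sample_training.
db.routes.aggregate( [
   // Keep only non-stop flights departing from PDX.
   {
      $match : {
         "src_airport" : "PDX",
         "stops" : 0
      }
   },
   // Count the matching flights per airline name.
   {
      $group : {
         _id : {
            "airline name": "$airline.name"
         },
         count : {
            $sum : 1
         }
      }
   },
   // Order the airlines from most to fewest flights.
   {
      $sort : {
         count : -1
      }
   },
   // Return only the top three airlines.
   {
      $limit : 3
   }
] )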

For runnable examples containing sample input documents, see Complete Aggregation Pipeline Examples.

Learn More About Aggregation Pipelines

To learn more about aggregation pipelines, see Aggregation Pipeline.

Single Purpose Aggregation Methods

The single purpose aggregation methods aggregate documents from a single collection. The methods are simple but lack the capabilities of an aggregation pipeline.

Method | Description
db.collection.estimatedDocumentCount() | Returns an approximate count of the documents in a collection or a view.
db.collection.count() | Returns a count of the number of documents in a collection or a view.
db.collection.distinct() | Returns an array of the distinct values found for the specified field.
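To show how these methods relate to pipelines, the sketch below builds two pipelines that approximate a full count and a distinct query. It assumes the `routes` collection and `src_airport` field from the earlier example; the variable names and the `total` output field are hypothetical. The results are close but not identical: the pipelines return documents rather than a bare number or array, and edge cases such as array-valued or missing fields may be handled differently.

```javascript
// Pipelines that approximate two single purpose methods. The variable names
// and the "total" field name are illustrative choices, not fixed API names.

// Roughly what a full count returns: one document of the form { total: <n> }.
const countPipeline = [{ $count: "total" }];

// Roughly what distinct("src_airport") returns, except that each value
// arrives wrapped in its own document as { _id: <value> }.
const distinctPipeline = [{ $group: { _id: "$src_airport" } }];

// In mongosh these would be run as, for example:
//   db.routes.aggregate(countPipeline)
//   db.routes.aggregate(distinctPipeline)
console.log(JSON.stringify(countPipeline), JSON.stringify(distinctPipeline));
```

The single purpose methods are shorter to write for these two cases, while the pipeline form can be extended with further stages such as $match or $sort.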