Aggregation with the Zip Code Data Set使用邮政编码数据集进行聚合

~~On this page~~本页内容

~~Data Model~~数据模型
aggregate() ~~Method~~方法
~~Return States with Populations above 10 Million~~返回人口超过1000万的州
~~Return Average City Population by State~~返回各州的平均城市人口
~~Return Largest and Smallest Cities by State~~按州返回最大和最小城市

~~The examples in this document use the zipcodes collection.~~ 本文档中的示例使用zipcodes集合。~~This collection is available at: media.mongodb.org/zips.json.~~ 该系列可在以下网址获得：media.mongodb.org/zips.json。~~Use mongoimport to load this data set into your mongod instance.~~使用mongoimport将此数据集加载到您的mongod实例中。

Data Model数据模型

~~Each document in the zipcodes collection has the following form:~~zipcodes集合中的每个文档都具有以下形式：

{
  "_id": "10280",
  "city": "NEW YORK",
  "state": "NY",
  "pop": 5574,
  "loc": [
    -74.016323,
    40.710537
  ]
}

~~The _id field holds the zip code as a string.~~_id字段将邮政编码作为字符串保存。
~~The city field holds the city name. A city can have more than one zip code associated with it as different sections of the city can each have a different zip code.~~city字段包含城市名称。一个城市可以有多个与之相关的邮政编码，因为城市的不同部分都可以有不同的邮政编码。
~~The state field holds the two letter state abbreviation.~~state字段包含两个字母的州缩写。
~~The pop field holds the population.~~pop字段保存了人口。
~~The loc field holds the location as a longitude latitude pair.~~loc字段将位置保存为经纬度对。

`aggregate()` Method方法

~~All of the following examples use the aggregate() helper in mongosh.~~以下所有示例都使用mongosh中的aggregate()辅助对象。

~~The aggregate() method uses the aggregation pipeline to process documents into aggregated results.~~ aggregate()方法使用聚合管道将文档处理为聚合结果。~~An aggregation pipeline consists of stages with each stage processing the documents as they pass along the pipeline. Documents pass through the stages in sequence.~~聚合管道由多个阶段组成，每个阶段在文档通过管道时对其进行处理。文档按顺序通过各个阶段。

~~The aggregate() method in mongosh provides a wrapper around the aggregate database command.~~ mongosh中的aggregate()方法为aggregate数据库命令提供了一个包装器。~~See the documentation for your driver for a more idiomatic interface for data aggregation operations.~~有关数据聚合操作的更惯用的接口，请参阅驱动程序的文档。

Return States with Populations above 10 Million返回人口超过1000万的州

~~The following aggregation operation returns all states with total population greater than 10 million:~~以下聚合操作返回总人口超过1000万的所有州：

db.zipcodes.aggregate( [
   { $group: { _id: "$state", totalPop: { $sum: "$pop" } } },
   { $match: { totalPop: { $gte: 10*1000*1000 } } }
] )

~~In this example, the aggregation pipeline consists of the $group stage followed by the $match stage:~~在本例中，聚合管道由$group阶段和$match阶段组成：

~~The $group stage groups the documents of the zipcode collection by the state field, calculates the totalPop field for each state, and outputs a document for each unique state.~~$group阶段根据state字段对zipcode集合的文档进行分组，计算每个州的totalPop字段，并为每个唯一州输出一个文档。

~~The new per-state documents have two fields: the _id field and the totalPop field.~~ 新的每个州文档有两个字段：_id字段和totalPop字段。~~The _id field contains the value of the state; i.e. the group by field.~~ _id字段包含state的值；即按字段分组。~~The totalPop field is a calculated field that contains the total population of each state.~~ totalPop字段是一个计算字段，包含每个州的总人口。~~To calculate the value, $group uses the $sum operator to add the population field (pop) for each state.~~为了计算该值，$group使用$sum运算符为每个州添加填充字段（pop）。

~~After the $group stage, the documents in the pipeline resemble the following:~~在$group阶段之后，管道中的文档如下所示：
```
{
  "_id" : "AK",
  "totalPop" : 550043
}
```
~~The $match stage filters these grouped documents to output only those documents whose totalPop value is greater than or equal to 10 million.~~ $match阶段筛选这些分组文档，以仅输出totalPop值大于或等于1000万的文档。~~The $match stage does not alter the matching documents but outputs the matching documents unmodified.~~$match阶段不会更改匹配的文档，而是输出未修改的匹配文档。

~~The equivalent SQL for this aggregation operation is:~~此聚合操作的等效SQL为：

SELECT state, SUM(pop) AS totalPop
FROM zipcodes
GROUP BY state
HAVING totalPop >= (10*1000*1000)

Tip

Return Average City Population by State返回各州的平均城市人口

~~The following aggregation operation returns the average populations for cities in each state:~~以下聚合操作返回各州城市的平均人口：

db.zipcodes.aggregate( [
   { $group: { _id: { state: "$state", city: "$city" }, pop: { $sum: "$pop" } } },
   { $group: { _id: "$_id.state", avgCityPop: { $avg: "$pop" } } }
] )

~~In this example, the aggregation pipeline consists of the $group stage followed by another $group stage:~~在本例中，聚合管道由$group阶段和另一个$group阶段组成：

The first $group stage groups the documents by the combination of city and state, uses the $sum expression to calculate the population for each combination, and outputs a document for each city and state combination. 第一个$group阶段按city和state的组合对文档进行分组，使用$sum表达式计算每个组合的人口，并为每个city和state组合输出一个文档。[1]

~~After this stage in the pipeline, the documents resemble the following:~~在管道的这一阶段之后，文档如下所示：
```
{
  "_id" : {
    "state" : "CO",
    "city" : "EDGEWATER"
  },
  "pop" : 13154
}
```
A second $group stage groups the documents in the pipeline by the _id.state field (i.e. the state field inside the _id document), uses the $avg expression to calculate the average city population (avgCityPop) for each state, and outputs a document for each state.第二个$group阶段根据_id.state字段（即_id文档中的state字段）对管道中的文档进行分组，使用$avg表达式计算每个州的平均城市人口（avgCityPop），并为每个州输出一个文档。

~~The documents that result from this aggregation operation resembles the following:~~此聚合操作产生的文档类似于以下内容：

{
  "_id" : "MN",
  "avgCityPop" : 5335
}

Tip

Return Largest and Smallest Cities by State按州返回最大和最小城市

~~The following aggregation operation returns the smallest and largest cities by population for each state:~~以下聚合操作按人口返回各州的人口最少和人口最多的城市：

db.zipcodes.aggregate( [
   { $group:
      {
        _id: { state: "$state", city: "$city" },
        pop: { $sum: "$pop" }
      }
   },
   { $sort: { pop: 1 } },
   { $group:
      {
        _id : "$_id.state",
        biggestCity:  { $last: "$_id.city" },
        biggestPop:   { $last: "$pop" },
        smallestCity: { $first: "$_id.city" },
        smallestPop:  { $first: "$pop" }
      }
   },

  // the following $project is optional, and modifies the output format.下面的$project是可选的，并修改输出格式。

  { $project:
    { _id: 0,
      state: "$_id",
      biggestCity:  { name: "$biggestCity",  pop: "$biggestPop" },
      smallestCity: { name: "$smallestCity", pop: "$smallestPop" }
    }
  }
] )

~~In this example, the aggregation pipeline consists of a $group stage, a $sort stage, another $group stage, and a $project stage:~~在本例中，聚合管道由$group阶段、$sort阶段、另一个$group阶段和$project阶段组成：

The first $group stage groups the documents by the combination of the city and state, calculates the sum of the pop values for each combination, and outputs a document for each city and state combination.第一个$group阶段根据city和state的组合对文档进行分组，计算每个组合的pop值之sum，并为每个city和state组合输出一个文档。

~~At this stage in the pipeline, the documents resemble the following:~~在此阶段，文件如下所示：
```
{
  "_id" : {
    "state" : "CO",
    "city" : "EDGEWATER"
  },
  "pop" : 13154
}
```
~~The $sort stage orders the documents in the pipeline by the pop field value, from smallest to largest; i.e. by increasing order.~~ $sort阶段按照pop字段值对管道中的文档进行排序，从最小到最大；即通过增加订单。~~This operation does not alter the documents.~~此操作不会更改文档。
~~The next $group stage groups the now-sorted documents by the _id.state field (i.e. the state field inside the _id document) and outputs a document for each state.~~下一个$group阶段根据_id.state字段（即_id文档中的state字段）对现在排序的文档进行分组，并为每个州输出一个文档。

~~The stage also calculates the following four fields for each state.~~ 该阶段还为每个状态计算以下四个字段。~~Using the $last expression, the $group operator creates the biggestCity and biggestPop fields that store the city with the largest population and that population.~~ $group运算符使用$last表达式创建biggestCity和biggestPop字段，用于存储人口最多的城市和人口。~~Using the $first expression, the $group operator creates the smallestCity and smallestPop fields that store the city with the smallest population and that population.~~$group运算符使用$first表达式创建smallestCity和smallestPop字段，用于存储人口最少的城市和人口。

~~The documents, at this stage in the pipeline, resemble the following:~~现阶段的文件如下所示：
```
{
  "_id" : "WA",
  "biggestCity" : "SEATTLE",
  "biggestPop" : 520096,
  "smallestCity" : "BENGE",
  "smallestPop" : 2
}
```
T~~he final $project stage renames the _id field to state and moves the biggestCity, biggestPop, smallestCity, and smallestPop into biggestCity and smallestCity embedded documents.~~最后的$project阶段将_id字段重命名为state，并将biggestCity、biggestPop、smallestCity和smallestPop移动到biggestCity和smallstCity嵌入文档中。

~~The output documents of this aggregation operation resemble the following:~~此聚合操作的输出文档如下所示：

{
  "state" : "RI",
  "biggestCity" : {
    "name" : "CRANSTON",
    "pop" : 176404
  },
  "smallestCity" : {
    "name" : "CLAYVILLE",
    "pop" : 45
  }
}

[1]	~~A city can have more than one zip code associated with it as different sections of the city can each have a different zip code.~~一个城市可以有多个与之相关的邮政编码，因为城市的不同部分都可以有不同的邮政编码。

← Aggregation Pipeline and Sharded Collections Aggregation with User Preference Data →

Aggregation with the Zip Code Data Set使用邮政编码数据集进行聚合

Data Model数据模型

`aggregate()` Method方法

Return States with Populations above 10 Million返回人口超过1000万的州

See also: 另请参阅：

Return Average City Population by State返回各州的平均城市人口

See also: 另请参阅：

Return Largest and Smallest Cities by State按州返回最大和最小城市

Aggregation with the Zip Code Data Set使用邮政编码数据集进行聚合

Data Model数据模型

aggregate() Method方法

Return States with Populations above 10 Million返回人口超过1000万的州

See also: 另请参阅：

Return Average City Population by State返回各州的平均城市人口

See also: 另请参阅：

Return Largest and Smallest Cities by State按州返回最大和最小城市

`aggregate()` Method方法