The aggregation pipeline supports operations on sharded collections. This section describes behaviors specific to the aggregation pipeline and sharded collections.
Behavior
If the pipeline starts with an exact $match on a shard key, and the pipeline does not contain $out or $lookup stages, the entire pipeline runs on the matching shard only.
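As an illustration of a pipeline that can be targeted at a single shard, the sketch below begins with an equality $match on the shard key and contains neither $out nor $lookup. The collection name orders, its shard key { customerId: 1 }, and the field names are assumptions made for this example only.

```javascript
// Hypothetical collection "orders", assumed to be sharded on { customerId: 1 }.
const singleShardPipeline = [
  // Exact (equality) match on the assumed shard key, placed first.
  { $match: { customerId: 12345 } },
  // Later stages contain neither $out nor $lookup.
  { $group: { _id: "$status", total: { $sum: "$amount" } } },
  { $sort: { total: -1 } }
];

// The global `db` exists only inside the mongo shell (mongosh),
// so the query is skipped when this file runs anywhere else.
if (typeof db !== "undefined") {
  printjson(db.orders.aggregate(singleShardPipeline).toArray());
}
```

A range predicate on the shard key, or a $match that does not come first, would not count as the exact leading match described above.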
When aggregation operations run on multiple shards, the results are routed to the mongos to be merged, except in the following cases:
If the pipeline includes the $out stage, the merge runs on the shard where the output collection lives.
If the pipeline includes a $lookup stage that references an unsharded collection, the merge runs on the shard where the unsharded collection lives.
If the pipeline includes a sorting or grouping stage, and the allowDiskUse setting is enabled, the merge runs on a randomly selected shard.
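The merge rules above can be summarized as a small decision function. This is only a reading aid, not MongoDB's implementation: the function name mergeLocation, the unsharded option, the treatment of only $sort and $group as sorting or grouping stages, and the order in which the cases are checked are all assumptions, since the text does not say which rule wins when several apply.

```javascript
// Illustrative model of the documented merge rules; not server code.
// `unsharded` is a hypothetical list of collection names known to be unsharded.
function mergeLocation(pipeline, { allowDiskUse = false, unsharded = [] } = {}) {
  // Each stage is an object with a single key such as "$match" or "$group".
  const names = pipeline.map((stage) => Object.keys(stage)[0]);

  if (names.includes("$out")) {
    return "shard holding the output collection";
  }

  const lookupsUnsharded = pipeline.some(
    (stage) => stage.$lookup !== undefined && unsharded.includes(stage.$lookup.from)
  );
  if (lookupsUnsharded) {
    return "shard holding the unsharded collection";
  }

  // Assumption: only $sort and $group are treated as sorting/grouping stages.
  const sortsOrGroups = names.some((n) => n === "$sort" || n === "$group");
  if (sortsOrGroups && allowDiskUse) {
    return "randomly selected shard";
  }

  // Default case: results are routed to the mongos for merging.
  return "mongos";
}

console.log(mergeLocation([{ $group: { _id: "$status" } }]));
console.log(mergeLocation([{ $group: { _id: "$status" } }], { allowDiskUse: true }));
```

The first call models the default case (merge on the mongos); the second models the allowDiskUse exception.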
Optimization
When the aggregation pipeline is split into two parts, the split point is chosen so that the shards perform as many stages as possible, with consideration for optimization.
To see how the pipeline was split, include the explain option in the db.collection.aggregate() method.
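A minimal sketch of requesting the explain output from the shell follows. The collection name orders and the stages are assumptions for illustration, and the layout of the returned document varies by release, so the example prints it whole rather than reading specific fields.

```javascript
// Hypothetical pipeline on an assumed sharded collection "orders".
const explainPipeline = [
  { $match: { status: "shipped" } },
  { $group: { _id: "$customerId", total: { $sum: "$amount" } } }
];

// Passing explain: true asks for the plan instead of the results.
const explainOptions = { explain: true };

// The global `db` exists only inside the mongo shell (mongosh),
// so the call is skipped when this file runs anywhere else.
if (typeof db !== "undefined") {
  printjson(db.orders.aggregate(explainPipeline, explainOptions));
}
```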
Optimizations are subject to change between releases.