Map-Reduce

Note
Aggregation Pipeline as Alternative

Starting in MongoDB 5.0, map-reduce is deprecated:

  • Instead of map-reduce, you should use an aggregation pipeline. Aggregation pipelines provide better performance and usability than map-reduce.
  • You can rewrite map-reduce operations using aggregation pipeline stages, such as $group, $merge, and others.
  • For map-reduce operations that require custom functionality, you can use the $accumulator and $function aggregation operators, available starting in version 4.4. You can use those operators to define custom aggregation expressions in JavaScript.
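
As a rough illustration of such a rewrite, the sketch below expresses a simple "sum per key" map-reduce as pipeline stages. The `orders` collection, its `cust_id`, `amount`, and `status` fields, and the `order_totals` output collection are assumptions made for this example; they are not defined on this page.

```javascript
// Hypothetical map-reduce being replaced (shown for comparison only):
//   map:    function () { emit(this.cust_id, this.amount); }
//   reduce: function (key, values) { return Array.sum(values); }
//   query:  { status: "A" },  out: "order_totals"

// A possible aggregation pipeline with the same intent.
const pipeline = [
  // Plays the role of the map-reduce query filter.
  { $match: { status: "A" } },
  // Plays the role of map (choose key and value) plus reduce (sum the values).
  { $group: { _id: "$cust_id", value: { $sum: "$amount" } } },
  // Plays the role of the out option: write the results to a collection.
  { $out: "order_totals" },
];

// In mongosh this would be run as: db.orders.aggregate(pipeline)
console.log(JSON.stringify(pipeline));
```

The block only builds the pipeline as data, so it can be inspected without a running server; running it against a real deployment requires a collection with the assumed fields.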

For examples of aggregation pipeline alternatives to map-reduce, see:

Map-reduce is a data processing paradigm for condensing large volumes of data into useful aggregated results. To perform map-reduce operations, MongoDB provides the mapReduce database command.

Consider the following map-reduce operation:

Diagram of the annotated map-reduce operation.

In this map-reduce operation, MongoDB applies the map phase to each input document (i.e. the documents in the collection that match the query condition). The map function emits key-value pairs. For those keys that have multiple values, MongoDB applies the reduce phase, which collects and condenses the aggregated data. MongoDB then stores the results in a collection. Optionally, the output of the reduce function may pass through a finalize function to further condense or process the results of the aggregation.
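
The flow described above can be modeled with a small in-memory sketch. This is a simplified stand-in written in plain JavaScript, not MongoDB's implementation: `runMapReduce`, the sample `orders` array, and its field names are hypothetical, and `emit` is passed to the map function as an argument here, whereas on the server it is available to the map function directly.

```javascript
// Simplified in-memory model of query -> map -> reduce -> finalize.
function runMapReduce(docs, { query = () => true, map, reduce, finalize }) {
  const groups = new Map();
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };

  // Map phase: only documents that match the query are mapped.
  for (const doc of docs.filter(query)) {
    map.call(doc, emit); // "this" is the current document, as in MongoDB
  }

  const results = [];
  for (const [key, values] of groups) {
    // Reduce phase: applied only to keys that have multiple values.
    let value = values.length > 1 ? reduce(key, values) : values[0];
    // Optional finalize step on the reduced value.
    if (finalize) value = finalize(key, value);
    results.push({ _id: key, value });
  }
  return results;
}

const orders = [
  { cust_id: "A123", amount: 500, status: "A" },
  { cust_id: "A123", amount: 250, status: "A" },
  { cust_id: "B212", amount: 200, status: "A" },
  { cust_id: "A123", amount: 300, status: "D" },
];

const totals = runMapReduce(orders, {
  query: (doc) => doc.status === "A",
  map: function (emit) { emit(this.cust_id, this.amount); },
  reduce: (key, values) => values.reduce((a, b) => a + b, 0),
});

console.log(totals);
```

In this sample data, key "B212" has a single emitted value, so the reduce function is never called for it; only "A123" goes through the reduce phase.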

All map-reduce functions in MongoDB are JavaScript and run within the mongod process. Map-reduce operations take the documents of a single collection as the input and can perform any arbitrary sorting and limiting before beginning the map stage. mapReduce can return the results of a map-reduce operation as a document, or may write the results to collections.

Map-Reduce JavaScript Functions

In MongoDB, map-reduce operations use custom JavaScript functions to map, or associate, values to a key. If a key has multiple values mapped to it, the operation reduces the values for the key to a single object.

The use of custom JavaScript functions provides flexibility to map-reduce operations. For instance, when processing a document, the map function can create more than one key and value mapping or no mapping. Map-reduce operations can also use a custom JavaScript function to make final modifications to the results at the end of the map and reduce operation, such as performing additional calculations.
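
To make this concrete, here is a minimal sketch of a map function that emits several mappings for one document and none for another, followed by a finalize function that adds a derived average. The tiny `runMapReduce` helper, the sample documents, and the `sku` and `qty` fields are all hypothetical, and `emit` is passed in as an argument only so the example runs outside the server.

```javascript
// Tiny in-memory stand-in for the server; illustrative only.
function runMapReduce(docs, map, reduce, finalize) {
  const groups = new Map();
  const emit = (key, value) => {
    groups.set(key, [...(groups.get(key) ?? []), value]);
  };
  docs.forEach((doc) => map.call(doc, emit));
  return [...groups].map(([key, values]) => {
    const reduced = values.length > 1 ? reduce(key, values) : values[0];
    return { _id: key, value: finalize ? finalize(key, reduced) : reduced };
  });
}

const orders = [
  { _id: 1, items: [{ sku: "apple", qty: 5 }, { sku: "pear", qty: 2 }] },
  { _id: 2, items: [{ sku: "apple", qty: 3 }] },
  { _id: 3 }, // no items: the map function emits nothing for this document
];

// Emits one key-value pair per item, so a single document can yield many.
function mapItems(emit) {
  for (const item of this.items ?? []) {
    emit(item.sku, { count: 1, qty: item.qty });
  }
}

// Returns a value with the same shape as the emitted values.
function reduceItems(key, values) {
  return values.reduce(
    (acc, v) => ({ count: acc.count + v.count, qty: acc.qty + v.qty }),
    { count: 0, qty: 0 }
  );
}

// Final modification: add the average quantity per order line.
function finalizeItems(key, reduced) {
  return { ...reduced, avg: reduced.qty / reduced.count };
}

const perSku = runMapReduce(orders, mapItems, reduceItems, finalizeItems);
console.log(perSku);
```

The reduce function returns the same shape it receives, which keeps the result consistent whether a key has one emitted value or many.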

Note

Starting in MongoDB 4.4, mapReduce no longer supports the deprecated BSON type JavaScript code with scope (BSON type 15) for its functions. The map, reduce, and finalize functions must be either BSON type String (BSON type 2) or BSON type JavaScript (BSON type 13). To pass constant values which will be accessible in the map, reduce, and finalize functions, use the scope parameter.

The use of JavaScript code with scope for the mapReduce functions has been deprecated since version 4.2.1.
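
As a hedged sketch of the scope parameter, the command document below passes a constant to the map function through `scope` rather than through code with scope. The `orders` collection, its fields, and the `taxRate` constant are assumptions for illustration; the block only builds the command as data and does not contact a server.

```javascript
// Hypothetical mapReduce command document using the scope parameter.
const command = {
  mapReduce: "orders", // assumed collection name
  // taxRate is not defined here; it is expected to come from scope below.
  map: function () { emit(this.cust_id, this.amount * taxRate); },
  // Array.sum is a helper provided in MongoDB's JavaScript environment.
  reduce: function (key, values) { return Array.sum(values); },
  // Constants made accessible inside map, reduce, and finalize.
  scope: { taxRate: 1.08 },
  out: { inline: 1 },
};

// In mongosh this would typically be sent with: db.runCommand(command)
console.log(Object.keys(command));
```

Keeping constants in `scope` leaves the functions themselves as plain JavaScript or strings, which matches the BSON types the note above requires.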

Map-Reduce Results

In MongoDB, the map-reduce operation can write results to a collection or return the results inline. If you write map-reduce output to a collection, you can perform subsequent map-reduce operations on the same input collection that replace, merge, or reduce new results with previous results. See mapReduce and Perform Incremental Map-Reduce for details and examples.
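
The difference between replacing, merging, and reducing new results into previous results can be sketched in memory as follows. `applyOutput` is a hypothetical helper that models an output collection as a `Map` from key to value; it is a simplification for illustration, not how MongoDB stores or updates results.

```javascript
// In-memory sketch of the three output actions; not MongoDB's implementation.
function applyOutput(existing, fresh, action, reduce) {
  if (!["replace", "merge", "reduce"].includes(action)) {
    throw new Error(`unknown action: ${action}`);
  }
  // replace: previous results are discarded entirely.
  if (action === "replace") return new Map(fresh);

  const out = new Map(existing);
  for (const [key, value] of fresh) {
    if (action === "reduce" && out.has(key)) {
      // reduce: combine the previous and new values with the reduce function.
      out.set(key, reduce(key, [out.get(key), value]));
    } else {
      // merge (or a key not seen before): the new value is stored as-is.
      out.set(key, value);
    }
  }
  return out;
}

const sum = (key, values) => values.reduce((a, b) => a + b, 0);
const previous = new Map([["A123", 10], ["B212", 5]]);
const latest = new Map([["B212", 7], ["C300", 1]]);

const replaced = applyOutput(previous, latest, "replace", sum);
const merged = applyOutput(previous, latest, "merge", sum);
const reduced = applyOutput(previous, latest, "reduce", sum);

console.log([...replaced], [...merged], [...reduced]);
```

With this sample data, only the reduce action combines the old and new values for the shared key "B212"; merge overwrites it and replace drops "A123" altogether.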

When returning the results of a map-reduce operation inline, the result documents must be within the BSON Document Size limit, which is currently 16 megabytes. For additional information on limits and restrictions on map-reduce operations, see the mapReduce reference page.

Sharded Collections

MongoDB supports map-reduce operations on sharded collections.

However, starting in version 4.2, MongoDB deprecates the map-reduce option to create a new sharded collection and the use of the sharded option for map-reduce. To output to a sharded collection, create the sharded collection first. MongoDB 4.2 also deprecates the replacement of an existing sharded collection.

See Map-Reduce and Sharded Collections.

Views

Views do not support map-reduce operations.