Note
Aggregation Pipeline as Alternative to Map-Reduce
Starting in MongoDB 5.0, map-reduce is deprecated:
- Instead of map-reduce, you should use an aggregation pipeline. Aggregation pipelines provide better performance and usability than map-reduce.
- You can rewrite map-reduce operations using aggregation pipeline stages, such as $group, $merge, and others.
- For map-reduce operations that require custom functionality, you can use the $accumulator and $function aggregation operators. You can use those operators to define custom aggregation expressions in JavaScript.
For examples of aggregation pipeline alternatives to map-reduce, see:
An aggregation pipeline is also easier to troubleshoot than a map-reduce operation.
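As a rough illustration of the rewrite described in this note, the following pipeline is a minimal sketch of a per-key total computed with $group and written out with $merge. The collection name orders, the fields cust_id and price, and the output collection agg_alternative_1 are hypothetical names used only for illustration.

```javascript
// Hypothetical pipeline: sums `price` for each `cust_id` and writes the
// totals to an output collection. All collection and field names here are
// assumptions, not names taken from this tutorial.
var pipeline = [
   { $group: { _id: "$cust_id", value: { $sum: "$price" } } },
   { $merge: { into: "agg_alternative_1", on: "_id", whenMatched: "replace", whenNotMatched: "insert" } }
];

// In mongosh, the pipeline would be run against the assumed collection as:
// db.orders.aggregate(pipeline);
```

The $group stage plays the role of the map and reduce functions together, and $merge plays the role of the map-reduce output option.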
The reduce function is a JavaScript function that "reduces" to a single object all the values associated with a particular key during a map-reduce operation. The reduce function must meet various requirements. This tutorial helps verify that the reduce function meets the following criteria:
- The reduce function must return an object whose type must be identical to the type of the value emitted by the map function.
- The order of the elements in the valuesArray should not affect the output of the reduce function.
- The reduce function must be idempotent.
For a list of all the requirements for the reduce function, see mapReduce, or mongosh helper method db.collection.mapReduce().
Confirm Output Type
You can test that the reduce function returns a value that is the same type as the value emitted from the map function.
Define a reduceFunction1 function that takes the arguments keyCustId and valuesPrices. valuesPrices is an array of integers:

var reduceFunction1 = function(keyCustId, valuesPrices) {
   return Array.sum(valuesPrices);
};

Define a sample array of integers:

var myTestValues = [ 5, 5, 10 ];

Invoke reduceFunction1 with myTestValues:

reduceFunction1('myKey', myTestValues);

Verify that reduceFunction1 returned an integer:

20

Define a reduceFunction2 function that takes the arguments keySKU and valuesCountObjects. valuesCountObjects is an array of documents that contain two fields, count and qty:

var reduceFunction2 = function(keySKU, valuesCountObjects) {
   // Declare the accumulator locally so it does not leak into the global scope.
   var reducedValue = { count: 0, qty: 0 };
   for (var idx = 0; idx < valuesCountObjects.length; idx++) {
      reducedValue.count += valuesCountObjects[idx].count;
      reducedValue.qty += valuesCountObjects[idx].qty;
   }
   return reducedValue;
};

Define a sample array of documents:

var myTestObjects = [
   { count: 1, qty: 5 },
   { count: 2, qty: 10 },
   { count: 3, qty: 15 }
];

Invoke reduceFunction2 with myTestObjects:

reduceFunction2('myKey', myTestObjects);

Verify that reduceFunction2 returned a document with exactly the count and the qty fields:

{ "count" : 6, "qty" : 30 }
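The visual check above can also be done programmatically. The following is a minimal sketch, not part of the original tutorial: the helper names describeType and hasSameType are hypothetical, and the sum reducer is written with standard JavaScript instead of Array.sum (which is not part of standard JavaScript) so that the sketch runs outside the MongoDB shell.

```javascript
// Standard-JavaScript stand-in for reduceFunction1.
var sumPrices = function(keyCustId, valuesPrices) {
   return valuesPrices.reduce(function(total, price) { return total + price; }, 0);
};

// Same logic as reduceFunction2.
var sumCounts = function(keySKU, valuesCountObjects) {
   var reducedValue = { count: 0, qty: 0 };
   for (var idx = 0; idx < valuesCountObjects.length; idx++) {
      reducedValue.count += valuesCountObjects[idx].count;
      reducedValue.qty += valuesCountObjects[idx].qty;
   }
   return reducedValue;
};

// Hypothetical helper: describes a value's type. For documents, the
// description lists every field name with its type, sorted by field name.
var describeType = function(value) {
   if (value === null) { return 'null'; }
   if (Array.isArray(value)) { return 'array'; }
   if (typeof value !== 'object') { return typeof value; }
   return '{' + Object.keys(value).sort().map(function(field) {
      return field + ':' + describeType(value[field]);
   }).join(',') + '}';
};

// Hypothetical helper: true when the reduce output has the same type
// description as the first sample value emitted by the map function.
var hasSameType = function(reduceFn, sampleValues) {
   var result = reduceFn('myKey', sampleValues);
   return describeType(result) === describeType(sampleValues[0]);
};

var numbersMatch = hasSameType(sumPrices, [ 5, 5, 10 ]);
var documentsMatch = hasSameType(sumCounts, [ { count: 1, qty: 5 }, { count: 2, qty: 10 } ]);
```

Both numbersMatch and documentsMatch evaluate to true for these reducers. A reducer that returned, for example, a string for numeric input would fail the check.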
Ensure Insensitivity to the Order of Mapped Values
The reduce function takes a key and a values array as its arguments. You can test that the result of the reduce function does not depend on the order of the elements in the values array.
Define a sample values1 array and a sample values2 array that differ only in the order of the array elements:

var values1 = [
   { count: 1, qty: 5 },
   { count: 2, qty: 10 },
   { count: 3, qty: 15 }
];
var values2 = [
   { count: 3, qty: 15 },
   { count: 1, qty: 5 },
   { count: 2, qty: 10 }
];

Define a reduceFunction2 function that takes the arguments keySKU and valuesCountObjects. valuesCountObjects is an array of documents that contain two fields, count and qty:

var reduceFunction2 = function(keySKU, valuesCountObjects) {
   // Declare the accumulator locally so it does not leak into the global scope.
   var reducedValue = { count: 0, qty: 0 };
   for (var idx = 0; idx < valuesCountObjects.length; idx++) {
      reducedValue.count += valuesCountObjects[idx].count;
      reducedValue.qty += valuesCountObjects[idx].qty;
   }
   return reducedValue;
};

Invoke reduceFunction2 first with values1 and then with values2:

reduceFunction2('myKey', values1);
reduceFunction2('myKey', values2);

Verify that reduceFunction2 returned the same result for both calls:

{ "count" : 6, "qty" : 30 }
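Comparing two orderings by hand covers only two cases. As a sketch under stated assumptions, the check can be extended to every ordering of a small sample array. The helper names permutations and isOrderInsensitive are hypothetical, and results are compared through JSON.stringify, which is adequate here because the reducer always builds its result fields in the same order.

```javascript
// Same logic as reduceFunction2.
var reduceCounts = function(keySKU, valuesCountObjects) {
   var reducedValue = { count: 0, qty: 0 };
   for (var idx = 0; idx < valuesCountObjects.length; idx++) {
      reducedValue.count += valuesCountObjects[idx].count;
      reducedValue.qty += valuesCountObjects[idx].qty;
   }
   return reducedValue;
};

// Hypothetical helper: returns every ordering of the input array.
// The number of orderings grows factorially, so keep the sample small.
var permutations = function(items) {
   if (items.length <= 1) { return [ items.slice() ]; }
   var result = [];
   items.forEach(function(item, idx) {
      var rest = items.slice(0, idx).concat(items.slice(idx + 1));
      permutations(rest).forEach(function(tail) {
         result.push([ item ].concat(tail));
      });
   });
   return result;
};

// Hypothetical helper: true when every ordering reduces to the same result.
var isOrderInsensitive = function(reduceFn, key, values) {
   var expected = JSON.stringify(reduceFn(key, values));
   return permutations(values).every(function(ordering) {
      return JSON.stringify(reduceFn(key, ordering)) === expected;
   });
};

var orderCheck = isOrderInsensitive(reduceCounts, 'myKey', [
   { count: 1, qty: 5 },
   { count: 2, qty: 10 },
   { count: 3, qty: 15 }
]);
```

For this reducer, orderCheck is true. A reducer that depends on position, such as one returning the first element of the array, would fail the check.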
Ensure Reduce Function Idempotence
Because the map-reduce operation may call the reduce function multiple times for the same key, and does not call it for keys that have only a single value in the working set, the reduce function must return a value of the same type as the value emitted from the map function. You can test that the reduce function processes already "reduced" values without affecting the final value.
Define a reduceFunction2 function that takes the arguments keySKU and valuesCountObjects. valuesCountObjects is an array of documents that contain two fields, count and qty:

var reduceFunction2 = function(keySKU, valuesCountObjects) {
   // Declare the accumulator locally so it does not leak into the global scope.
   var reducedValue = { count: 0, qty: 0 };
   for (var idx = 0; idx < valuesCountObjects.length; idx++) {
      reducedValue.count += valuesCountObjects[idx].count;
      reducedValue.qty += valuesCountObjects[idx].qty;
   }
   return reducedValue;
};

Define a sample key:

var myKey = 'myKey';

Define a sample valuesIdempotent array that contains an element that is a call to the reduceFunction2 function:

var valuesIdempotent = [
   { count: 1, qty: 5 },
   { count: 2, qty: 10 },
   reduceFunction2(myKey, [ { count: 3, qty: 15 } ])
];

Define a sample values1 array that combines the values passed to reduceFunction2:

var values1 = [
   { count: 1, qty: 5 },
   { count: 2, qty: 10 },
   { count: 3, qty: 15 }
];

Invoke reduceFunction2 first with myKey and valuesIdempotent and then with myKey and values1:

reduceFunction2(myKey, valuesIdempotent);
reduceFunction2(myKey, values1);

Verify that reduceFunction2 returned the same result for both calls:

{ "count" : 6, "qty" : 30 }
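The same idea can be sketched as a reusable check that re-reduces a partial result at every split point of the sample array. This is an illustrative sketch, not part of the original tutorial: the helper name isIdempotent is hypothetical, and results are compared through JSON.stringify on the assumption that the reducer builds its result fields in a fixed order.

```javascript
// Same logic as reduceFunction2.
var reduceCountsAndQty = function(keySKU, valuesCountObjects) {
   var reducedValue = { count: 0, qty: 0 };
   for (var idx = 0; idx < valuesCountObjects.length; idx++) {
      reducedValue.count += valuesCountObjects[idx].count;
      reducedValue.qty += valuesCountObjects[idx].qty;
   }
   return reducedValue;
};

// Hypothetical helper: for each split point, reduces the first part of the
// array, feeds that partial result back in with the remaining values, and
// compares the outcome with reducing all values in a single call.
var isIdempotent = function(reduceFn, key, values) {
   var expected = JSON.stringify(reduceFn(key, values));
   for (var split = 1; split < values.length; split++) {
      var partial = reduceFn(key, values.slice(0, split));
      var rereduced = reduceFn(key, [ partial ].concat(values.slice(split)));
      if (JSON.stringify(rereduced) !== expected) {
         return false;
      }
   }
   return true;
};

var idempotenceCheck = isIdempotent(reduceCountsAndQty, 'myKey', [
   { count: 1, qty: 5 },
   { count: 2, qty: 10 },
   { count: 3, qty: 15 }
]);
```

For this reducer, idempotenceCheck is true. A reducer that counts the elements of the values array instead of summing their count fields would fail, because a partial result stands in for several original values but is counted only once.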