$topN (aggregation accumulator)


Definition

$topN

New in version 5.2.

Returns an aggregation of the top n elements within a group, according to the specified sort order. If the group contains fewer than n elements, $topN returns all elements in the group.

Syntax

{
   $topN:
      {
         n: <expression>,
         sortBy: { <field1>: <sort order>, <field2>: <sort order> ... },
         output: <expression>
      }
}
  • n limits the number of results per group. It must be a positive integral expression that is either a constant or depends on the _id value for $group.
  • sortBy specifies the order of results, with syntax similar to $sort.
  • output represents the output for each element in the group and can be any expression.
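
The three parameters can be pictured with a small plain JavaScript sketch. This is only an illustration of the semantics described above, not the server's implementation: topNByGroup, keyOf, compare, and output are hypothetical names, and the sample documents are made up for this sketch.

```javascript
// Plain JavaScript illustration only; not how the server implements $topN.
// topNByGroup, keyOf, compare, and output are hypothetical names.
function topNByGroup(docs, { keyOf, n, compare, output }) {
  // Bucket the documents by group key, keeping first-seen key order.
  const groups = new Map();
  for (const doc of docs) {
    const key = keyOf(doc);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(doc);
  }
  const results = [];
  for (const [key, members] of groups) {
    // Sort a copy (like sortBy), keep at most n, then apply the output expression.
    const top = [...members].sort(compare).slice(0, n).map(output);
    results.push({ _id: key, top });
  }
  return results;
}

// Made-up sample data for this sketch.
const sample = [
  { team: "red", points: 5 },
  { team: "red", points: 9 },
  { team: "red", points: 7 },
  { team: "blue", points: 4 }
];

const result = topNByGroup(sample, {
  keyOf: (d) => d.team,
  n: 2,
  compare: (a, b) => b.points - a.points, // plays the role of sortBy: { points: -1 }
  output: (d) => d.points                 // plays the role of output: "$points"
});

console.log(JSON.stringify(result));
```

The "blue" group has only one document, so the sketch returns that single element rather than two, which mirrors the rule that a group with fewer than n elements returns all of its elements.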

Behavior

Null and Missing Values

  • $topN does not filter out null values.
  • $topN converts missing values to null, which are preserved in the output.
db.aggregate( [
   {
      $documents: [
         { playerId: "PlayerA", gameId: "G1", score: 1 },
         { playerId: "PlayerB", gameId: "G1", score: 2 },
         { playerId: "PlayerC", gameId: "G1", score: 3 },
         { playerId: "PlayerD", gameId: "G1"},
         { playerId: "PlayerE", gameId: "G1", score: null }
      ]
   },
   {
      $group:
      {
         _id: "$gameId",
         playerId:
            {
               $topN:
                  {
                     output: [ "$playerId", "$score" ],
                     sortBy: { "score": 1 },
                     n: 3
                  }
            }
      }
   }
] )

In this example:

  • $documents creates the literal documents that contain player scores.
  • $group groups the documents by gameId. This example has only one gameId, G1.
  • PlayerD has a missing score and PlayerE has a null score. Both values are treated as null.
  • The playerId and score fields are specified as output: [ "$playerId", "$score" ] and returned as array values.
  • Because of the sortBy: { "score": 1 }, the null values are sorted to the front of the returned playerId array.
[
   {
      _id: 'G1',
      playerId: [ [ 'PlayerD', null ], [ 'PlayerE', null ], [ 'PlayerA', 1 ] ]
   }
]
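
The treatment of missing and null scores can be mimicked in plain JavaScript. This is a hedged sketch rather than server code: scoreOrNull and compareAscending are hypothetical helpers, the comparator only handles null and numbers, and the relative order of the two null entries comes from JavaScript's stable sort, which is an assumption of this sketch and not a documented guarantee of $topN.

```javascript
// Same documents as the example above, without the gameId field.
const docs = [
  { playerId: "PlayerA", score: 1 },
  { playerId: "PlayerB", score: 2 },
  { playerId: "PlayerC", score: 3 },
  { playerId: "PlayerD" },              // missing score
  { playerId: "PlayerE", score: null }  // null score
];

// Hypothetical helper: treat a missing field as null.
const scoreOrNull = (d) => (d.score === undefined ? null : d.score);

// Ascending comparator for this data only: null sorts before any number.
function compareAscending(a, b) {
  const x = scoreOrNull(a);
  const y = scoreOrNull(b);
  if (x === null && y === null) return 0;
  if (x === null) return -1;
  if (y === null) return 1;
  return x - y;
}

// Sort ascending, keep the first three, and emit [playerId, score] pairs.
const top3 = [...docs]
  .sort(compareAscending)
  .slice(0, 3)
  .map((d) => [d.playerId, scoreOrNull(d)]);

console.log(JSON.stringify(top3));
```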

BSON Data Type Sort Ordering

When sorting different types, the order of BSON data types is used to determine ordering. As an example, consider a collection whose values consist of strings and numbers.

  • In an ascending sort, string values are sorted after numeric values.
  • In a descending sort, string values are sorted before numeric values.
db.aggregate( [
   {
      $documents: [
         { playerId: "PlayerA", gameId: "G1", score: 1 },
         { playerId: "PlayerB", gameId: "G1", score: "2" },
         { playerId: "PlayerC", gameId: "G1", score: "" }
      ]
   },
   {
      $group:
         {
            _id: "$gameId",
            playerId: {
               $topN:
               {
                  output: ["$playerId","$score"],
                  sortBy: {"score": -1},
                  n: 3
               }
            }
         }
   }
] )

In this example:

  • PlayerA has an integer score.
  • PlayerB has a string score, "2".
  • PlayerC has an empty string score.

Because the sort is descending ({ "score": -1 }), the string values are sorted before PlayerA's numeric score:

[
   {
      _id: "G1",
      playerId: [ [ "PlayerB", "2" ], [ "PlayerC", "" ], [ "PlayerA", 1 ] ]
   }
]
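
A rough way to picture this ordering is a comparator that first compares a type rank and only then the values. The sketch below is plain JavaScript under simplifying assumptions: typeRank is a hypothetical helper that knows only numbers and strings, and strings are compared with JavaScript's default string comparison, which may differ from the server's comparison when a collation is in effect.

```javascript
// Same documents as the example above, without the gameId field.
const docs = [
  { playerId: "PlayerA", score: 1 },
  { playerId: "PlayerB", score: "2" },
  { playerId: "PlayerC", score: "" }
];

// Hypothetical, simplified type rank: numbers order before strings.
const typeRank = (v) => (typeof v === "number" ? 1 : 2);

// Ascending comparison: type rank first, then the values themselves.
function compareAscending(a, b) {
  const ra = typeRank(a);
  const rb = typeRank(b);
  if (ra !== rb) return ra - rb;
  if (a < b) return -1;
  if (a > b) return 1;
  return 0;
}

// Descending sort is the ascending comparison with its arguments swapped.
const top = [...docs]
  .sort((x, y) => compareAscending(y.score, x.score))
  .slice(0, 3)
  .map((d) => [d.playerId, d.score]);

console.log(JSON.stringify(top));
```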

Restrictions

Window Function and Aggregation Expression Support

$topN is not supported as an aggregation expression.

$topN is supported as a window operator.

Memory Limit Considerations

Groups within the $topN aggregation pipeline are subject to the 100 MB pipeline limit. If this limit is exceeded for an individual group, the aggregation fails with an error.

Examples

Consider a gamescores collection with the following documents:

db.gamescores.insertMany([
   { playerId: "PlayerA", gameId: "G1", score: 31 },
   { playerId: "PlayerB", gameId: "G1", score: 33 },
   { playerId: "PlayerC", gameId: "G1", score: 99 },
   { playerId: "PlayerD", gameId: "G1", score: 1 },
   { playerId: "PlayerA", gameId: "G2", score: 10 },
   { playerId: "PlayerB", gameId: "G2", score: 14 },
   { playerId: "PlayerC", gameId: "G2", score: 66 },
   { playerId: "PlayerD", gameId: "G2", score: 80 }
])

Find the Three Highest Scores

You can use the $topN accumulator to find the highest scoring players in a single game.

db.gamescores.aggregate( [
   {
      $match : { gameId : "G1" }
   },
   {
      $group:
         {
            _id: "$gameId",
            playerId:
               {
                  $topN:
                  {
                     output: ["$playerId", "$score"],
                     sortBy: { "score": -1 },
                     n:3
                  }
               }
         }
   }
] )

The example pipeline:

  • Uses $match to filter the results to a single gameId, in this case G1.
  • Uses $group to group the results by gameId, in this case G1.
  • Uses sortBy: { "score": -1 } to sort the results in descending order.
  • Specifies the fields that are output from $topN with output: ["$playerId", "$score"].
  • Uses $topN with n: 3 to return the three highest-scoring documents for the G1 game.

The operation returns the following results:

[
   {
      _id: 'G1',
      playerId: [ [ 'PlayerC', 99 ], [ 'PlayerB', 33 ], [ 'PlayerA', 31 ] ]
   }
]

The SQL equivalent to this query is:

SELECT T3.GAMEID,T3.PLAYERID,T3.SCORE
FROM GAMESCORES AS GS
JOIN (SELECT TOP 3
         GAMEID,PLAYERID,SCORE
         FROM GAMESCORES
         WHERE GAMEID = 'G1'
         ORDER BY SCORE DESC) AS T3
            ON GS.GAMEID = T3.GAMEID
GROUP BY T3.GAMEID,T3.PLAYERID,T3.SCORE
   ORDER BY T3.SCORE DESC

Finding the Three Highest Score Documents Across Multiple Games

You can use the $topN accumulator to find the highest scoring players in each game.

db.gamescores.aggregate( [
   {
      $group:
         {
            _id: "$gameId",
            playerId:
               {
                  $topN:
                  {
                     output: [ "$playerId", "$score" ],
                     sortBy: { "score": -1 },
                     n: 3
                  }
               }
         }
   }
] )

The example pipeline:

  • Uses $group to group the results by gameId.
  • Specifies the fields that are output from $topN with output: ["$playerId", "$score"].
  • Uses sortBy: { "score": -1 } to sort the results in descending order.
  • Uses $topN with n: 3 to return the three highest-scoring documents for each game.

The operation returns the following results:

[
   {
      _id: 'G1',
      playerId: [ [ 'PlayerC', 99 ], [ 'PlayerB', 33 ], [ 'PlayerA', 31 ] ]
   },
   {
      _id: 'G2',
      playerId: [ [ 'PlayerD', 80 ], [ 'PlayerC', 66 ], [ 'PlayerB', 14 ] ]
   }
]

The SQL equivalent to this query is:

SELECT PLAYERID,GAMEID,SCORE
FROM(
   SELECT ROW_NUMBER() OVER (PARTITION BY GAMEID ORDER BY SCORE DESC) AS GAMERANK,
   GAMEID,PLAYERID,SCORE
   FROM GAMESCORES
) AS T
WHERE GAMERANK <= 3
ORDER BY GAMEID

Computing n Based on the Group Key for $group

You can also assign the value of n dynamically. In this example, the $cond expression is used on the gameId field.

db.gamescores.aggregate([
   {
      $group:
      {
         _id: {"gameId": "$gameId"},
         gamescores:
            {
               $topN:
                  {
                     output: "$score",
                     n: { $cond: { if: {$eq: ["$gameId","G2"] }, then: 1, else: 3 } },
                     sortBy: { "score": -1 }
                  }
            }
      }
   }
] )

The example pipeline:

  • Uses $group to group the results by gameId.
  • Specifies the field that is output from $topN with output: "$score".
  • Uses $cond to set n to 1 if the gameId is G2, and to 3 otherwise.
  • Uses sortBy: { "score": -1 } to sort the results in descending order.

The operation returns the following results:

[
   { _id: { gameId: 'G1' }, gamescores: [ 99, 33, 31 ] },
   { _id: { gameId: 'G2' }, gamescores: [ 80 ] }
]
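
The effect of a per-group n can also be traced in plain JavaScript. This is an illustrative sketch only, not server code: nFor is a hypothetical stand-in for the $cond expression, and the array literal repeats the gamescores documents from the Examples section.

```javascript
// The gamescores documents from the Examples section.
const gamescores = [
  { playerId: "PlayerA", gameId: "G1", score: 31 },
  { playerId: "PlayerB", gameId: "G1", score: 33 },
  { playerId: "PlayerC", gameId: "G1", score: 99 },
  { playerId: "PlayerD", gameId: "G1", score: 1 },
  { playerId: "PlayerA", gameId: "G2", score: 10 },
  { playerId: "PlayerB", gameId: "G2", score: 14 },
  { playerId: "PlayerC", gameId: "G2", score: 66 },
  { playerId: "PlayerD", gameId: "G2", score: 80 }
];

// Hypothetical stand-in for the $cond expression: n depends only on the group key.
const nFor = (gameId) => (gameId === "G2" ? 1 : 3);

// Collect the scores for each gameId, keeping first-seen key order.
const groups = new Map();
for (const { gameId, score } of gamescores) {
  if (!groups.has(gameId)) groups.set(gameId, []);
  groups.get(gameId).push(score);
}

// Sort each group's scores in descending order and keep the first nFor(gameId).
const result = [...groups].map(([gameId, scores]) => ({
  _id: { gameId },
  gamescores: [...scores].sort((a, b) => b - a).slice(0, nFor(gameId))
}));

console.log(JSON.stringify(result));
```

Because nFor only looks at the group key, every document in a group agrees on the value of n, which is the constraint the syntax section places on the n expression.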