$split (aggregation)

Definition

$split

Divides a string into an array of substrings based on a delimiter, which can be a string or a regex pattern. $split removes the delimiter and returns the resulting substrings as elements of an array. If the delimiter is not found in the string, $split returns the original string as the only element of an array.

$split has the following operator expression syntax:

{ $split: [ <string expression>, <delimiter> ] }

Field               Type     Description
string expression   string   The string to be split. string expression can be any valid expression as long as it resolves to a string. For more information on expressions, see Expressions.
delimiter           string   The delimiter to use when splitting the string expression. delimiter can be any valid expression as long as it resolves to a string or regex pattern.

Behavior

The $split operator returns an array. The <string expression> input must be a string and the <delimiter> input must be a string or a regex pattern. Otherwise, the operation fails with an error.

Example                                             Results
{ $split: [ "June-15-2013", "-" ] }                 [ "June", "15", "2013" ]
{ $split: [ "banana split", "a" ] }                 [ "b", "n", "n", " split" ]
{ $split: [ "Hello World", " " ] }                  [ "Hello", "World" ]
{ $split: [ "astronomical", "astro" ] }             [ "", "nomical" ]
{ $split: [ "pea green boat", "owl" ] }             [ "pea green boat" ]
{ $split: [ "headphone jack", /jack/ ] }            [ "headphone ", "" ]
{ $split: [ "f74a--43b6-bd68--55e3", /(-+)/ ] }     [ "f74a", "--", "43b6", "-", "bd68", "--", "55e3" ]
{ $split: [ "headphone jack", 7 ] }                 Errors with message: "$split requires an expression that evaluates to a string as a second argument, found: double"

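The rows in the table above can be reproduced outside the database with plain JavaScript, which may help when reasoning about edge cases such as a leading delimiter or a regex with a capture group. The sketch below is only a local approximation built on String.prototype.split. The helper name splitLike and its error messages are hypothetical and are not part of the database or any driver, and inputs not listed in the table (for example an empty delimiter) are not modelled.

```javascript
// splitLike is a hypothetical local helper, NOT a MongoDB API.
// It approximates the table rows using String.prototype.split.
function splitLike(input, delimiter) {
  // Mirror the documented rule: the first argument must be a string.
  if (typeof input !== "string") {
    throw new TypeError("first argument must be a string");
  }
  // Mirror the documented rule: the delimiter must be a string or a regex pattern.
  if (typeof delimiter !== "string" && !(delimiter instanceof RegExp)) {
    throw new TypeError("second argument must be a string or a regex pattern");
  }
  return input.split(delimiter);
}

// Delimiter at the start of the string yields a leading empty string.
console.log(splitLike("astronomical", "astro"));

// Delimiter not found: the original string is the only element.
console.log(splitLike("pea green boat", "owl"));

// A capture group in the regex keeps the matched delimiters in the result.
console.log(splitLike("f74a--43b6-bd68--55e3", /(-+)/));

// A numeric delimiter is rejected, as in the last row of the table.
try {
  splitLike("headphone jack", 7);
} catch (err) {
  console.log(err.message);
}
```

The type checks are there only to echo the documented error case; the wording of the real server error is the one quoted in the table, not the message thrown here.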
Example

A collection named deliveries contains the following documents:

db.deliveries.insertMany( [
   { _id: 1, city: "Berkeley, CA", qty: 648 },
   { _id: 2, city: "Bend, OR", qty: 491 },
   { _id: 3, city: "Kensington, CA", qty: 233 },
   { _id: 4, city: "Eugene, OR", qty: 842 },
   { _id: 5, city: "Reno, NV", qty: 655 },
   { _id: 6, city: "Portland, OR", qty: 408 },
   { _id: 7, city: "Sacramento, CA", qty: 574 }
] )

The goal of the following aggregation operation is to find the total quantity of deliveries for each state and sort the list in descending order. It has five pipeline stages:

  • The $project stage produces documents with two fields, qty (integer) and city_state (array). The $split operator creates an array of strings by splitting the city field, using a comma followed by a space (", ") as the delimiter.
  • The $unwind stage creates a separate record for each element in the city_state field.
  • The $match stage uses a regular expression to filter out the records that hold a city name, leaving only those that hold a state.
  • The $group stage groups the records by state and sums the qty field.
  • The $sort stage sorts the results by total_qty in descending order.

db.deliveries.aggregate( [
   { $project: { city_state: { $split: ["$city", ", "] }, qty: 1 } },
   { $unwind: "$city_state" },
   { $match: { city_state: /[A-Z]{2}/ } },
   { $group: { _id: { state: "$city_state" }, total_qty: { $sum: "$qty" } } },
   { $sort: { total_qty: -1 } }
] )

The operation returns the following results:

[
   { _id: { state: "OR" }, total_qty: 1741 },
   { _id: { state: "CA" }, total_qty: 1455 },
   { _id: { state: "NV" }, total_qty: 655 }
]
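
To see what each stage contributes, the five stages can be traced on the same sample documents in plain JavaScript, without a database connection. This is a rough local emulation under simplifying assumptions, not how the server executes the pipeline: array methods stand in for the stages, and the intermediate variable names (projected, unwound, matched, totals) are made up for illustration.

```javascript
// Sample documents copied from the deliveries collection above.
const deliveries = [
  { _id: 1, city: "Berkeley, CA", qty: 648 },
  { _id: 2, city: "Bend, OR", qty: 491 },
  { _id: 3, city: "Kensington, CA", qty: 233 },
  { _id: 4, city: "Eugene, OR", qty: 842 },
  { _id: 5, city: "Reno, NV", qty: 655 },
  { _id: 6, city: "Portland, OR", qty: 408 },
  { _id: 7, city: "Sacramento, CA", qty: 574 }
];

// $project with $split: keep qty and split city on ", " into an array.
const projected = deliveries.map((d) => ({
  _id: d._id,
  qty: d.qty,
  city_state: d.city.split(", ")
}));

// $unwind: emit one record per element of city_state.
const unwound = projected.flatMap((d) =>
  d.city_state.map((part) => ({ _id: d._id, qty: d.qty, city_state: part }))
);

// $match: keep records whose city_state contains two consecutive capital letters.
const matched = unwound.filter((d) => /[A-Z]{2}/.test(d.city_state));

// $group: sum qty per state.
const totals = new Map();
for (const d of matched) {
  totals.set(d.city_state, (totals.get(d.city_state) || 0) + d.qty);
}

// $sort: order by total_qty, descending.
const result = Array.from(totals, ([state, total_qty]) => ({
  _id: { state },
  total_qty
})).sort((a, b) => b.total_qty - a.total_qty);

console.log(result);
```

Note that the regular expression is unanchored, so it matches any value containing two consecutive capital letters. That is enough to separate states from cities in this sample data, but other data may need a stricter pattern.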