$regexMatch (aggregation)

Definition

$regexMatch

New in version 4.2.

Performs a regular expression (regex) pattern match and returns:

  • true if a match exists.
  • false if a match doesn't exist.

MongoDB uses Perl compatible regular expressions (i.e. "PCRE") version 8.41 with UTF-8 support.

Prior to MongoDB 4.2, an aggregation pipeline could use regular expressions only through the query operator $regex in the $match stage. For more information on using regex in a query, see $regex.

Syntax

The $regexMatch operator has the following syntax:

{ $regexMatch: { input: <expression> , regex: <expression>, options: <expression> } }
Field    Description

input

The string on which you wish to apply the regex pattern. Can be a string or any valid expression that resolves to a string.

regex

The regex pattern to apply. Can be any valid expression that resolves to either a string or a regex pattern /<pattern>/. When using the regex /<pattern>/, you can also specify the regex options i and m (but not the s or x options):

  • "pattern"
  • /<pattern>/
  • /<pattern>/<options>

Alternatively, you can specify the regex options with the options field. To specify the s or x options, you must use the options field.

You cannot specify options in both the regex and the options field.
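
The rules above can be checked on the client before a pipeline is sent to the server. The following is a minimal sketch in plain JavaScript. buildRegexMatch is a hypothetical helper (it is not part of MongoDB or of any driver), and it only mirrors the constraints stated in this table.

```javascript
// Hypothetical helper (not a MongoDB API): builds a $regexMatch expression
// and enforces the option rules described in the table above.
function buildRegexMatch(input, regex, options) {
  // Flags attached to a /<pattern>/ literal; empty for string patterns.
  const inlineFlags = regex instanceof RegExp ? regex.flags : "";

  // Options may be given in the regex or in the options field, not both.
  if (inlineFlags && options) {
    throw new Error("Specify options in the regex or in the options field, not both");
  }
  // Only i and m may be part of the regex itself.
  if (/[^im]/.test(inlineFlags)) {
    throw new Error("Only i and m can be part of the regex; use the options field for s and x");
  }
  // The options field accepts i, m, x, and s.
  if (options !== undefined && !/^[imxs]*$/.test(options)) {
    throw new Error("Unknown option in: " + options);
  }

  const spec = { input: input, regex: regex };
  if (options !== undefined) spec.options = options;
  return { $regexMatch: spec };
}

// Option given as part of the regex.
const inline = buildRegexMatch("$description", /line/i);
// Options given in the options field (required for s and x).
const viaField = buildRegexMatch("$description", "m.*line", "si");
```

Either returned object can be used wherever an aggregation expression is accepted, for example inside $addFields.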

options

Optional. The following <options> are available for use with regular expressions.

Note

You cannot specify options in both the regex and the options field.

Option    Description

i

Case insensitivity to match both upper and lower cases. You can specify the option in the options field or as part of the regex field.

m

For patterns that include anchors (i.e. ^ for the start, $ for the end), match at the beginning or end of each line for strings with multiline values. Without this option, these anchors match at the beginning or end of the string.

If the pattern contains no anchors or if the string value has no newline characters (e.g. \n), the m option has no effect.

x

"Extended" capability to ignore all white space characters in the pattern unless escaped or included in a character class.

Additionally, it ignores the characters between, and including, an unescaped hash/pound (#) character and the next new line, so that you may include comments in complicated patterns. This only applies to data characters; white space characters may never appear within special character sequences in a pattern.

The x option does not affect the handling of the VT character (i.e. code 11).

You can specify the option only in the options field.

s

Allows the dot character (i.e. .) to match all characters including newline characters.

You can specify the option only in the options field.

Returns

The operator returns a boolean:

  • true if a match exists.
  • false if a match doesn't exist.

Behavior

$regexMatch and Collation

$regexMatch ignores the collation specified for the collection, db.collection.aggregate(), and the index, if used.

For example, create a sample collection with collation strength 1 (i.e. compare base characters only and ignore other differences such as case and diacritics):

db.createCollection( "myColl", { collation: { locale: "fr", strength: 1 } } )

Insert the following documents:

db.myColl.insertMany([
   { _id: 1, category: "café" },
   { _id: 2, category: "cafe" },
   { _id: 3, category: "cafE" }
])

Using the collection's collation, the following operation performs a case-insensitive and diacritic-insensitive match:

db.myColl.aggregate( [ { $match: { category: "cafe" } } ] )

The operation returns the following 3 documents:

{ "_id" : 1, "category" : "café" }
{ "_id" : 2, "category" : "cafe" }
{ "_id" : 3, "category" : "cafE" }

However, the aggregation expression $regexMatch ignores collation; that is, the following regular expression pattern matching examples are case-sensitive and diacritic-sensitive:

db.myColl.aggregate( [ { $addFields: { results: { $regexMatch: { input: "$category", regex: /cafe/ }  } } } ] )

db.myColl.aggregate(
   [ { $addFields: { results: { $regexMatch: { input: "$category", regex: /cafe/ }  } } } ],
   { collation: { locale: "fr", strength: 1 } }  // Ignored in the $regexMatch
)

Both operations return the following:

{ "_id" : 1, "category" : "café", "results" : false }
{ "_id" : 2, "category" : "cafe", "results" : true }
{ "_id" : 3, "category" : "cafE", "results" : false }

To perform case-insensitive regex pattern matching, use the i option instead. See i Option for an example.
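
Note that the i option restores case insensitivity only; the match stays diacritic-sensitive. The sketch below illustrates this locally with JavaScript's built-in RegExp, which is an assumption made for illustration: the server uses PCRE, but for a literal pattern like this one the two are expected to agree.

```javascript
// Local illustration only: JavaScript's RegExp stands in for the server's PCRE engine.
// "caf\u00e9" is "café" written with the precomposed character U+00E9.
const categories = ["caf\u00e9", "cafe", "cafE"];

// Mirrors regex: /cafe/ (case-sensitive and diacritic-sensitive).
const caseSensitive = categories.map((c) => /cafe/.test(c));

// Mirrors regex: /cafe/i (case-insensitive, still diacritic-sensitive).
const caseInsensitive = categories.map((c) => /cafe/i.test(c));

console.log(caseSensitive);   // [ false, true, false ]
console.log(caseInsensitive); // [ false, true, true ]
```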

Examples

$regexMatch and Its Options

To illustrate the behavior of the $regexMatch operator as discussed in this example, create a sample collection products with the following documents:

db.products.insertMany([
   { _id: 1, description: "Single LINE description." },
   { _id: 2, description: "First lines\nsecond line" },
   { _id: 3, description: "Many spaces before     line" },
   { _id: 4, description: "Multiple\nline descriptions" },
   { _id: 5, description: "anchors, links and hyperlinks" },
   { _id: 6, description: "métier work vocation" }
])

By default, $regexMatch performs a case-sensitive match. For example, the following aggregation performs a case-sensitive $regexMatch on the description field. The regex pattern /line/ does not specify any grouping:

db.products.aggregate([
   { $addFields: { result: { $regexMatch: { input: "$description", regex: /line/ } } } }
])

The operation returns the following:

{ "_id" : 1, "description" : "Single LINE description.", "result" : false }
{ "_id" : 2, "description" : "First lines\nsecond line", "result" : true }
{ "_id" : 3, "description" : "Many spaces before     line", "result" : true }
{ "_id" : 4, "description" : "Multiple\nline descriptions", "result" : true }
{ "_id" : 5, "description" : "anchors, links and hyperlinks", "result" : false }
{ "_id" : 6, "description" : "métier work vocation", "result" : false }

The following regex pattern /lin(e|k)/ specifies a grouping (e|k) in the pattern:

db.products.aggregate([
   { $addFields: { result: { $regexMatch: { input: "$description", regex: /lin(e|k)/ } } } }
])

The operation returns the following:

{ "_id" : 1, "description" : "Single LINE description.", "result" : false }
{ "_id" : 2, "description" : "First lines\nsecond line", "result" : true }
{ "_id" : 3, "description" : "Many spaces before     line", "result" : true }
{ "_id" : 4, "description" : "Multiple\nline descriptions", "result" : true }
{ "_id" : 5, "description" : "anchors, links and hyperlinks", "result" : true }
{ "_id" : 6, "description" : "métier work vocation", "result" : false }

i Option

Note

You cannot specify options in both the regex and the options field.

To perform case-insensitive pattern matching, include the i option as part of the regex field or in the options field:

// Specify i as part of the regex field
{ $regexMatch: { input: "$description", regex: /line/i } }
// Specify i in the options field
{ $regexMatch: { input: "$description", regex: /line/, options: "i" } }
{ $regexMatch: { input: "$description", regex: "line", options: "i" } }

For example, the following aggregation performs a case-insensitive $regexMatch on the description field. The regex pattern /line/ does not specify any grouping:

db.products.aggregate([
   { $addFields: { result: { $regexMatch: { input: "$description", regex: /line/i } } } }
])

The operation returns the following documents:

{ "_id" : 1, "description" : "Single LINE description.", "result" : true }
{ "_id" : 2, "description" : "First lines\nsecond line", "result" : true }
{ "_id" : 3, "description" : "Many spaces before     line", "result" : true }
{ "_id" : 4, "description" : "Multiple\nline descriptions", "result" : true }
{ "_id" : 5, "description" : "anchors, links and hyperlinks", "result" : false }
{ "_id" : 6, "description" : "métier work vocation", "result" : false }

m Option

Note

You cannot specify options in both the regex and the options field.

To match the specified anchors (e.g. ^, $) for each line of a multiline string, include the m option as part of the regex field or in the options field:

// Specify m as part of the regex field
{ $regexMatch: { input: "$description", regex: /line/m } }
// Specify m in the options field
{ $regexMatch: { input: "$description", regex: /line/, options: "m" } }
{ $regexMatch: { input: "$description", regex: "line", options: "m" } }

The following example includes both the i and the m options to match lines starting with either the letter s or S for multiline strings:

db.products.aggregate([
   { $addFields: { result: { $regexMatch: { input: "$description", regex: /^s/im } } } }
])

The operation returns the following:

{ "_id" : 1, "description" : "Single LINE description.", "result" : true }
{ "_id" : 2, "description" : "First lines\nsecond line", "result" : true }
{ "_id" : 3, "description" : "Many spaces before     line", "result" : false }
{ "_id" : 4, "description" : "Multiple\nline descriptions", "result" : false }
{ "_id" : 5, "description" : "anchors, links and hyperlinks", "result" : false }
{ "_id" : 6, "description" : "métier work vocation", "result" : false }
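
The effect of the m option on the ^ anchor can be reproduced locally. This is a sketch under two assumptions: JavaScript's built-in RegExp stands in for the server's PCRE engine, and the description of document 3 is taken to be a single line with several spaces before "line".

```javascript
// Local illustration only: JavaScript's RegExp stands in for the server's PCRE engine.
const descriptions = [
  "Single LINE description.",
  "First lines\nsecond line",
  "Many spaces before     line", // assumed to be a single line
  "Multiple\nline descriptions",
  "anchors, links and hyperlinks",
  "métier work vocation",
];

// Without m, ^ matches only at the start of the whole string.
const withoutM = descriptions.map((d) => /^s/i.test(d));

// With m, ^ also matches at the start of each line.
const withM = descriptions.map((d) => /^s/im.test(d));

console.log(withoutM); // [ true, false, false, false, false, false ]
console.log(withM);    // [ true, true, false, false, false, false ]
```

Only document 2 changes: its second line starts with s, which ^ can reach only when m is set.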

x Option

Note

You cannot specify options in both the regex and the options field.

To ignore all unescaped white space characters and comments (denoted by an unescaped hash # character and running to the next newline character) in the pattern, include the x option in the options field:

// Specify x in the options field
{ $regexMatch: { input: "$description", regex: /line/, options: "x" } }
{ $regexMatch: { input: "$description", regex: "line", options: "x" } }

The following example includes the x option to skip unescaped white spaces and comments:

db.products.aggregate([
   { $addFields: { returns: { $regexMatch: { input: "$description", regex: /lin(e|k) # matches line or link/, options:"x" } } } }
])

The operation returns the following:

{ "_id" : 1, "description" : "Single LINE description.", "returns" : false }
{ "_id" : 2, "description" : "First lines\nsecond line", "returns" : true }
{ "_id" : 3, "description" : "Many spaces before     line", "returns" : true }
{ "_id" : 4, "description" : "Multiple\nline descriptions", "returns" : true }
{ "_id" : 5, "description" : "anchors, links and hyperlinks", "returns" : true }
{ "_id" : 6, "description" : "métier work vocation", "returns" : false }

s Option

Note

You cannot specify options in both the regex and the options field.

To allow the dot character (i.e. .) in the pattern to match all characters including the new line character, include the s option in the options field:

// Specify s in the options field
{ $regexMatch: { input: "$description", regex: /m.*line/, options: "s" } }
{ $regexMatch: { input: "$description", regex: "m.*line", options: "s" } }

The following example includes the s option to allow the dot character (i.e. .) to match all characters including new line as well as the i option to perform a case-insensitive match:

db.products.aggregate([
   { $addFields: { returns: { $regexMatch: { input: "$description", regex:/m.*line/, options: "si"  } } } }
])

The operation returns the following:

{ "_id" : 1, "description" : "Single LINE description.", "returns" : false }
{ "_id" : 2, "description" : "First lines\nsecond line", "returns" : false }
{ "_id" : 3, "description" : "Many spaces before     line", "returns" : true }
{ "_id" : 4, "description" : "Multiple\nline descriptions", "returns" : true }
{ "_id" : 5, "description" : "anchors, links and hyperlinks", "returns" : false }
{ "_id" : 6, "description" : "métier work vocation", "returns" : false }
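
To see what the s option contributes here, the same pattern can be compared with and without it. As before, this is only a local sketch: JavaScript's RegExp stands in for the server's PCRE engine, and the description of document 3 is assumed to be a single line.

```javascript
// Local illustration only: JavaScript's RegExp stands in for the server's PCRE engine.
const descriptions = [
  "Single LINE description.",
  "First lines\nsecond line",
  "Many spaces before     line", // assumed to be a single line
  "Multiple\nline descriptions",
  "anchors, links and hyperlinks",
  "métier work vocation",
];

// Without s, the dot does not match a newline, so "m...line" must sit on one line.
const withoutS = descriptions.map((d) => /m.*line/i.test(d));

// With s, the dot also matches newlines, so the match can span lines.
const withS = descriptions.map((d) => /m.*line/si.test(d));

console.log(withoutS); // [ false, false, true, false, false, false ]
console.log(withS);    // [ false, false, true, true, false, false ]
```

Document 4 matches only with s, because its "M" and "line" are separated by a newline.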

Use $regexMatch to Check Email Address

Create a sample collection feedback with the following documents:

db.feedback.insertMany([
   { "_id" : 1, comment: "Hi, I'm just reading about MongoDB -- aunt.arc.tica@example.com"  },
   { "_id" : 2, comment: "I wanted to concatenate a string" },
   { "_id" : 3, comment: "How do I convert a date to string? Contact me at either cam@mongodb.com or c.dia@mongodb.com" },
   { "_id" : 4, comment: "It's just me. I'm testing.  fred@MongoDB.com" }
])

The following aggregation uses $regexMatch to check whether the comment field contains an email address with @mongodb.com, and categorizes the feedback as Employee or External.

db.feedback.aggregate( [
    { $addFields: {
       "category": { $cond: { if:  { $regexMatch: { input: "$comment", regex: /[a-z0-9_.+-]+@mongodb.com/i } },
                              then: "Employee",
                              else: "External" } }
    } }
] )

The operation returns the following documents:

{ "_id" : 1, "comment" : "Hi, I'm just reading about MongoDB -- aunt.arc.tica@example.com", "category" : "External" }
{ "_id" : 2, "comment" : "I wanted to concatenate a string", "category" : "External" }
{ "_id" : 3, "comment" : "How do I convert a date to string? Contact me at either cam@mongodb.com or c.dia@mongodb.com", "category" : "Employee" }
{ "_id" : 4, "comment" : "It's just me. I'm testing.  fred@MongoDB.com", "category" : "Employee" }
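
One detail of the pattern used in this example: the dot in @mongodb.com is unescaped, so it matches any character rather than only a literal period. The sketch below compares it with a variant that escapes the dot. It runs locally with JavaScript's RegExp as a stand-in for the server's PCRE engine, and the lookalike address is a made-up value for illustration.

```javascript
// Local illustration only: JavaScript's RegExp stands in for the server's PCRE engine.
const loose = /[a-z0-9_.+-]+@mongodb.com/i;   // pattern from the example
const strict = /[a-z0-9_.+-]+@mongodb\.com/i; // same pattern with the dot escaped

// Hypothetical comment text, not part of the sample collection.
const lookalike = "reach me at sam@mongodbxcom.example";

console.log(loose.test(lookalike));  // true: the dot matches "x"
console.log(strict.test(lookalike)); // false: a literal period is required

// The escaped pattern still matches the sample addresses.
console.log(strict.test("It's just me. I'm testing.  fred@MongoDB.com")); // true
```

For the four sample documents both patterns give the same categories; the escaped form only matters for input that resembles the domain without containing it.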