$regex
This page describes regular expression search capabilities for self-managed (non-Atlas) deployments. For data hosted on MongoDB Atlas, MongoDB offers an improved full-text search solution, Atlas Search, which has its own $regex operator. To learn more, see $regex in the Atlas Search documentation.
Definition
$regex
Provides regular expression capabilities for pattern matching strings in queries.
To use $regex, use one of the following syntaxes:
{ <field>: { $regex: /pattern/, $options: '<options>' } }
{ <field>: { $regex: 'pattern', $options: '<options>' } }
{ <field>: { $regex: /pattern/<options> } }
In MongoDB, you can also use regular expression objects (i.e. /pattern/) to specify regular expressions:
{ <field>: /pattern/<options> }
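As a rough illustration of how the syntaxes above relate, the following sketch writes the same condition in each form and normalizes it to a JavaScript RegExp. The helper toRegExp and the sample pattern are hypothetical and not part of MongoDB; the code only mimics locally what each form expresses.

```javascript
// Hypothetical helper: turn any of the documented forms into a JavaScript RegExp.
function toRegExp(condition) {
  if (condition instanceof RegExp) {
    return condition; // { <field>: /pattern/<options> }
  }
  const { $regex, $options = "" } = condition;
  if ($regex instanceof RegExp) {
    // Combine the flags on the literal with $options, dropping duplicates.
    const flags = [...new Set($regex.flags + $options)].join("");
    return new RegExp($regex.source, flags);
  }
  return new RegExp($regex, $options); // string pattern plus $options
}

const forms = [
  { $regex: /^acme/, $options: "i" },
  { $regex: "^acme", $options: "i" },
  { $regex: /^acme/i },
  /^acme/i,
];

// Every form normalizes to the same expression.
const normalized = forms.map(toRegExp).map((re) => re.toString());
console.log(normalized);
```

The sketch only handles options that JavaScript also accepts as flags, so it is a way to reason about the forms rather than a model of the server's behavior.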
For restrictions on particular syntax use, see $regex vs. /pattern/ Syntax.
$options
The following <options> are available for use with regular expressions.
Option: i
Description: Case insensitivity to match upper and lower cases. For an example, see Perform Case-Insensitive Regular Expression Match.
Option: m
Description: For patterns that include anchors (i.e. ^ for the start, $ for the end), match at the beginning or end of each line for strings with multiline values. Without this option, these anchors match at the beginning or end of the string. For an example, see Multiline Match for Lines Starting with Specified Pattern.
If the pattern contains no anchors or if the string value has no newline characters (e.g. \n), the m option has no effect.
Option: x
Description: "Extended" capability to ignore all white space characters in the $regex pattern unless escaped or included in a character class.
Additionally, it ignores characters in-between and including an un-escaped hash/pound (#) character and the next new line, so that you may include comments in complicated patterns. This only applies to data characters; white space characters may never appear within special character sequences in a pattern.
The x option does not affect the handling of the VT character (i.e. code 11).
Syntax Restrictions: Requires $regex with $options syntax
Option: s
Description: Allows the dot character (i.e. .) to match all characters including newline characters. For an example, see Use the . Dot Character to Match New Line.
Syntax Restrictions: Requires $regex with $options syntax
Note
The $regex operator does not support the global search modifier g.
Behavior
$regex vs. /pattern/ Syntax
$in Expressions
To include a regular expression in an $in query expression, you can only use JavaScript regular expression objects (i.e. /pattern/). For example:
{ name: { $in: [ /^acme/i, /^ack/ ] } }
You cannot use $regex operator expressions inside an $in.
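To make the shape of such an $in list concrete, here is a minimal local sketch, assuming a hypothetical matchesIn helper and made-up sample names. It is not how the server evaluates $in; it only mirrors the idea that each list element is either a plain value or a regular expression object.

```javascript
// Hypothetical local check that mirrors the shape of an $in list:
// plain values are compared for equality, RegExp objects are tested.
function matchesIn(value, candidates) {
  return candidates.some((candidate) =>
    candidate instanceof RegExp ? candidate.test(value) : candidate === value
  );
}

const filter = { name: { $in: [/^acme/i, /^ack/] } };
const names = ["Acme Corp", "acknowledge", "Ack", "zebra"]; // sample data, assumed

// "Ack" is rejected because /^ack/ is case-sensitive and /^acme/i needs "acme".
const matched = names.filter((n) => matchesIn(n, filter.name.$in));
console.log(matched);
```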
Implicit AND Conditions for the Field
To include a regular expression in a comma-separated list of query conditions for the field, use the $regex operator. For example:
{ name: { $regex: /acme.*corp/i, $nin: [ 'acmeblahcorp' ] } }
{ name: { $regex: /acme.*corp/, $options: 'i', $nin: [ 'acmeblahcorp' ] } }
{ name: { $regex: 'acme.*corp', $options: 'i', $nin: [ 'acmeblahcorp' ] } }
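The combined effect of the two conditions can be pictured with a small local sketch. The matchesName helper and the sample names are assumptions for illustration; the point is only that a value must satisfy the regular expression and also be absent from the $nin list.

```javascript
const condition = { $regex: "acme.*corp", $options: "i", $nin: ["acmeblahcorp"] };

// Both conditions must hold for the field value to match (implicit AND).
function matchesName(value, cond) {
  const re = new RegExp(cond.$regex, cond.$options);
  return re.test(value) && !cond.$nin.includes(value);
}

const names = ["acmeblahcorp", "Acme Big Corp", "acme-corp", "corp acme"]; // sample data, assumed
const kept = names.filter((n) => matchesName(n, condition));
console.log(kept);
```

Here "acmeblahcorp" satisfies the pattern but is excluded by $nin, while "corp acme" fails the pattern itself.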
x and s Options
To use either the x option or the s option, you must use the $regex operator expression with the $options operator. For example, to specify the i and the s options, you must use $options for both:
{ name: { $regex: /acme.*corp/, $options: "si" } }
{ name: { $regex: 'acme.*corp', $options: "si" } }
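One way to avoid mixing syntaxes is to always build the condition in the string form with $options. The sketch below assumes a hypothetical regexCondition helper; the set of accepted letters is taken from the options listed on this page.

```javascript
// Hypothetical helper: always emit the string form with $options,
// which is the form required for the x and s options.
function regexCondition(pattern, options = "") {
  const allowed = new Set(["i", "m", "x", "s"]);
  for (const flag of options) {
    if (!allowed.has(flag)) {
      throw new Error(`Unsupported $regex option: ${flag}`);
    }
  }
  const condition = { $regex: pattern };
  if (options.length > 0) {
    condition.$options = options;
  }
  return condition;
}

const filter = { name: regexCondition("acme.*corp", "si") };
console.log(JSON.stringify(filter));
```

Rejecting unknown letters up front also catches the g modifier, which $regex does not support.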
PCRE Versus JavaScript
To use PCRE-supported features in a regular expression that aren't supported in JavaScript, you must use the $regex operator and specify the regular expression as a string.
To match case-insensitive strings:
"(?i)" begins a case-insensitive match.
"(?-i)" ends a case-insensitive match.
For example, the regular expression "(?i)a(?-i)cme" matches strings that:
Begin with "a" or "A". This is a case-insensitive match.
End with "cme". This is a case-sensitive match.
These strings match the example regular expression:
"acme"
"Acme"
The following example uses the $regex operator to find name field strings that match the regular expression "(?i)a(?-i)cme":
{ name: { $regex: "(?i)a(?-i)cme" } }
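For readers checking the intended matches locally, the PCRE pattern can be approximated in plain JavaScript with a character class. This is an analogy under the assumption that only the first letter is case-insensitive; it is not the pattern the server runs, and the sample strings are made up.

```javascript
// JavaScript approximation of the PCRE pattern "(?i)a(?-i)cme":
// the first letter is case-insensitive, the rest is case-sensitive.
const approx = /[aA]cme/;

const samples = ["acme", "Acme", "ACME", "aCme"]; // sample data, assumed
const results = samples.map((s) => [s, approx.test(s)]);
console.log(results);
```

Only "acme" and "Acme" pass, because the trailing "cme" must appear in lowercase.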
Starting in version 6.1, MongoDB uses the PCRE2 (Perl Compatible Regular Expressions) library to implement regular expression pattern matching. To learn more about PCRE2, see the PCRE Documentation.
$regex and $not
The $not operator can perform a logical NOT operation on both:
Regular expression objects (i.e. /pattern/)
For example:
db.inventory.find( { item: { $not: /^p.*/ } } )
$regex operator expressions
For example:
db.inventory.find( { item: { $not: { $regex: "^p.*" } } } )
db.inventory.find( { item: { $not: { $regex: /^p.*/ } } } )
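The effect of negating a regular expression can be sketched locally as follows. The inventory documents and the notRegex helper are hypothetical; the sketch also reflects that $not selects documents in which the field is missing.

```javascript
const inventory = [
  { _id: 1, item: "pencil" },
  { _id: 2, item: "eraser" },
  { _id: 3 }, // no item field
];

// $not selects documents that fail the inner condition,
// including documents where the field is missing.
function notRegex(doc, field, re) {
  const value = doc[field];
  return !(typeof value === "string" && re.test(value));
}

const ids = inventory
  .filter((doc) => notRegex(doc, "item", /^p.*/))
  .map((doc) => doc._id);
console.log(ids);
```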
Index Use
For case sensitive regular expression queries, if an index exists for the field, then MongoDB matches the regular expression against the values in the index, which can be faster than a collection scan.
Further optimization can occur if the regular expression is a "prefix expression", which means that all potential matches start with the same string. This allows MongoDB to construct a "range" from that prefix and only match against those values from the index that fall within that range.
A regular expression is a "prefix expression" if it starts with a caret (^) or a left anchor (\A), followed by a string of simple symbols. For example, the regex /^abc.*/ will be optimized by matching only against the values from the index that start with abc.
Additionally, while /^a/, /^a.*/, and /^a.*$/ match equivalent strings, they have different performance characteristics. All of these expressions use an index if an appropriate index exists; however, /^a.*/ and /^a.*$/ are slower. /^a/ can stop scanning after matching the prefix.
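The "range" built from a prefix can be pictured with a small sketch. The prefixRange and inRange helpers are hypothetical, and the comparison uses JavaScript's default string ordering, which is an assumption; the server's actual bounds computation may differ in detail.

```javascript
// Hypothetical helper: half-open range [lower, upper) that covers every
// string starting with the given non-empty prefix.
function prefixRange(prefix) {
  const last = prefix.charCodeAt(prefix.length - 1);
  // Bump the final character to get the first string past the prefix.
  const upper = prefix.slice(0, -1) + String.fromCharCode(last + 1);
  return { lower: prefix, upper };
}

function inRange(value, { lower, upper }) {
  return value >= lower && value < upper;
}

const range = prefixRange("abc"); // lower "abc", upper "abd"
const skus = ["abc123", "abc789", "abd000", "Abc789", "xyz456"]; // sample data, assumed

// Only values inside the range would still need the regular expression applied.
const candidates = skus.filter((s) => inRange(s, range));
console.log(range, candidates);
```

Values outside the range, such as "Abc789", are never examined, which is why a case-sensitive prefix expression can be cheap on an indexed field.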
Case insensitive regular expression queries generally cannot use indexes effectively. The $regex implementation is not collation-aware and is unable to utilize case-insensitive indexes.
Examples
The examples in this section use the following products collection:
db.products.insertMany( [
{ _id: 100, sku: "abc123", description: "Single line description." },
{ _id: 101, sku: "abc789", description: "First line\nSecond line" },
{ _id: 102, sku: "xyz456", description: "Many spaces before line" },
{ _id: 103, sku: "xyz789", description: "Multiple\nline description" },
{ _id: 104, sku: "Abc789", description: "SKU starts with A" }
] )
Perform a LIKE Match
The following example matches all documents where the sku field is like "%789":
db.products.find( { sku: { $regex: /789$/ } } )
The example is analogous to the following SQL LIKE statement:
SELECT * FROM products
WHERE sku like "%789";
Example output:
[
{ _id: 101, sku: 'abc789', description: 'First line\nSecond line' },
{ _id: 103, sku: 'xyz789', description: 'Multiple\nline description' },
{ _id: 104, sku: 'Abc789', description: 'SKU starts with A' }
]
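The correspondence between LIKE patterns and regular expressions can be sketched with a small converter. The likeToRegExp helper is hypothetical and handles only the % and _ wildcards, not a LIKE escape clause.

```javascript
// Hypothetical helper: translate a SQL LIKE pattern into an anchored RegExp.
// % matches any run of characters and _ matches a single character.
function likeToRegExp(like) {
  const body = like
    .split("")
    .map((ch) => {
      if (ch === "%") return ".*";
      if (ch === "_") return ".";
      return ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"); // escape regex metacharacters
    })
    .join("");
  return new RegExp("^" + body + "$", "s");
}

const skus = ["abc123", "abc789", "xyz456", "xyz789", "Abc789"];
const re = likeToRegExp("%789"); // source is ^.*789$
const matched = skus.filter((sku) => re.test(sku));
console.log(re.source, matched);
```

The generated expression selects the same strings as the shorter /789$/ used in the example, since a leading ^.* places no further constraint on the value.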
Perform Case-Insensitive Regular Expression Match
The following example uses the i option to perform a case-insensitive match for documents with a sku value that starts with ABC.
db.products.find( { sku: { $regex: /^ABC/i } } )
Example output:
[
{ _id: 100, sku: 'abc123', description: 'Single line description.' },
{ _id: 101, sku: 'abc789', description: 'First line\nSecond line' },
{ _id: 104, sku: 'Abc789', description: 'SKU starts with A' }
]
Multiline Match for Lines Starting with Specified Pattern
The following example uses the m option to match lines starting with the letter S for multiline strings:
db.products.find( { description: { $regex: /^S/, $options: 'm' } } )
Example output:
[
{ _id: 100, sku: 'abc123', description: 'Single line description.' },
{ _id: 101, sku: 'abc789', description: 'First line\nSecond line' },
{ _id: 104, sku: 'Abc789', description: 'SKU starts with A' }
]
Without the m option, the example output is:
[
{ _id: 100, sku: 'abc123', description: 'Single line description.' },
{ _id: 104, sku: 'Abc789', description: 'SKU starts with A' }
]
If the $regex pattern does not contain an anchor, the pattern matches against the string as a whole, as in the following example:
db.products.find( { description: { $regex: /S/ } } )
Example output:
[
{ _id: 100, sku: 'abc123', description: 'Single line description.' },
{ _id: 101, sku: 'abc789', description: 'First line\nSecond line' },
{ _id: 104, sku: 'Abc789', description: 'SKU starts with A' }
]
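The three result sets above can be reproduced locally with JavaScript's own m flag, which treats ^ the same way for this pattern. The idsMatching helper is an assumption for illustration; the documents are copied from the products collection.

```javascript
const products = [
  { _id: 100, description: "Single line description." },
  { _id: 101, description: "First line\nSecond line" },
  { _id: 102, description: "Many spaces before line" },
  { _id: 103, description: "Multiple\nline description" },
  { _id: 104, description: "SKU starts with A" },
];

// Hypothetical helper: ids of documents whose field matches the expression.
function idsMatching(field, re) {
  return products.filter((doc) => re.test(doc[field])).map((doc) => doc._id);
}

const withM = idsMatching("description", /^S/m);   // ^ matches at each line start
const withoutM = idsMatching("description", /^S/); // ^ matches only at string start
const unanchored = idsMatching("description", /S/); // S anywhere in the string
console.log(withM, withoutM, unanchored);
```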
Use the . Dot Character to Match New Line
The following example uses the s option to allow the dot character (i.e. .) to match all characters including new line as well as the i option to perform a case-insensitive match:
db.products.find( { description: { $regex: /m.*line/, $options: 'si' } } )
Example output:
[
{ _id: 102, sku: 'xyz456', description: 'Many spaces before line' },
{ _id: 103, sku: 'xyz789', description: 'Multiple\nline description' }
]
Without the s option, the example output is:
[
{ _id: 102, sku: 'xyz456', description: 'Many spaces before line' }
]
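The difference the s option makes can be checked locally, since JavaScript's s flag also lets the dot match a newline. The idsMatching helper is hypothetical; the descriptions are copied from the products collection.

```javascript
const products = [
  { _id: 100, description: "Single line description." },
  { _id: 101, description: "First line\nSecond line" },
  { _id: 102, description: "Many spaces before line" },
  { _id: 103, description: "Multiple\nline description" },
  { _id: 104, description: "SKU starts with A" },
];

// Hypothetical helper: ids of documents whose description matches.
function idsMatching(re) {
  return products.filter((doc) => re.test(doc.description)).map((doc) => doc._id);
}

const withS = idsMatching(/m.*line/si);   // dot may cross the newline in _id 103
const withoutS = idsMatching(/m.*line/i); // dot stops at the newline
console.log(withS, withoutS);
```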
Ignore White Spaces in Pattern
The following example uses the x option to ignore white spaces and comments, which start with # and end with \n, in the matching pattern:
var pattern = "abc #category code\n123 #item number"
db.products.find( { sku: { $regex: pattern, $options: "x" } } )
Example output:
[
{ _id: 100, sku: 'abc123', description: 'Single line description.' }
]
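To see which compact pattern the extended one stands for, the following sketch strips the parts that the x option would ignore. The stripExtended helper is hypothetical and simplified: it treats space, tab, newline, carriage return, and form feed as ignorable, leaves escaped characters and character classes untouched, and does not handle every edge case of the real syntax.

```javascript
// Hypothetical helper: remove what the x option would ignore so the
// pattern can be compared with its compact form.
function stripExtended(pattern) {
  const ignorable = new Set([" ", "\t", "\n", "\r", "\f"]);
  let out = "";
  let inClass = false;
  for (let i = 0; i < pattern.length; i++) {
    const ch = pattern[i];
    if (ch === "\\" && i + 1 < pattern.length) {
      out += ch + pattern[++i]; // keep escaped characters as-is
    } else if (inClass) {
      out += ch; // everything inside [...] is kept
      if (ch === "]") inClass = false;
    } else if (ch === "[") {
      inClass = true;
      out += ch;
    } else if (ch === "#") {
      // Skip the comment up to, but not including, the next newline.
      while (i + 1 < pattern.length && pattern[i + 1] !== "\n") i++;
    } else if (!ignorable.has(ch)) {
      out += ch;
    }
  }
  return out;
}

const pattern = "abc #category code\n123 #item number";
const compact = stripExtended(pattern);
console.log(compact);
```

For the pattern in the example, the result is abc123, which is why only the document with sku abc123 is returned.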
Use a Regular Expression to Match Case in Strings
The following example uses the regular expression "(?i)a(?-i)bc" to match sku field strings that contain:
"abc"
"Abc"
db.products.find( { sku: { $regex: "(?i)a(?-i)bc" } } )
Example output:
[
{ _id: 100, sku: 'abc123', description: 'Single line description.' },
{ _id: 101, sku: 'abc789', description: 'First line\nSecond line' },
{ _id: 104, sku: 'Abc789', description: 'SKU starts with A' }
]
Extend Regex Options to Match Characters Outside of ASCII
New in version 6.1.
By default, certain regex options (such as \b and \w) only recognize ASCII characters. This can cause unexpected results when performing regex matches against UTF-8 characters.
Starting in MongoDB 6.1, you can specify the *UCP regex option to match UTF-8 characters.
Performance of UCP Option
The *UCP option results in slower queries than those without the option specified because *UCP requires a multistage table lookup to perform the match.
For example, consider the following documents in a songs collection:
db.songs.insertMany( [
{ _id: 0, "artist" : "Blue Öyster Cult", "title": "The Reaper" },
{ _id: 1, "artist": "Blue Öyster Cult", "title": "Godzilla" },
{ _id: 2, "artist" : "Blue Oyster Cult", "title": "Take Me Away" }
] )
The following regex query uses the \b option in a regex match. The \b option matches a word boundary.
db.songs.find( { artist: { $regex: /\byster/ } } )
Example output:
[
{ _id: 0, artist: 'Blue Öyster Cult', title: 'The Reaper' },
{ _id: 1, artist: 'Blue Öyster Cult', title: 'Godzilla' }
]
The previous results are unexpected because none of the words in the returned artist fields begin with the matched string (yster). The Ö character in documents _id: 0 and _id: 1 is ignored when performing the match because it is a non-ASCII UTF-8 character, so the position before yster is treated as a word boundary.
The expected result is that the query does not return any documents.
To allow the query to recognize UTF-8 characters, specify the *UCP option before the pattern:
db.songs.find( { artist: { $regex: "(*UCP)\\byster" } } )
The previous query does not return any documents, which is the expected result.
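The same effect can be observed in plain JavaScript, whose \b is also ASCII-only. JavaScript has no *UCP option, so the sketch below swaps in a different technique, a Unicode property lookbehind, as an analogy for a Unicode-aware word boundary; it is not what the server executes.

```javascript
// "\u00D6" is the precomposed Ö character.
const artists = ["Blue \u00D6yster Cult", "Blue Oyster Cult"];

// ASCII-only boundary: Ö is not a word character, so \b matches before "yster".
const asciiBoundary = /\byster/;

// Unicode-aware stand-in: "yster" must not follow a letter, digit, or underscore.
const unicodeBoundary = /(?<![\p{L}\p{N}_])yster/u;

const asciiHits = artists.filter((a) => asciiBoundary.test(a));
const unicodeHits = artists.filter((a) => unicodeBoundary.test(a));
console.log(asciiHits.length, unicodeHits.length);
```

The ASCII-only expression matches the artist spelled with Ö, while the Unicode-aware one matches neither spelling.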
Escape Characters for Regex Patterns
When specifying *UCP or any other regular expression option, ensure that you use the correct escape characters for your shell or driver.
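In JavaScript-based shells, one common pitfall is that a backslash inside a string literal is consumed by the string before the regular expression engine sees it. The following sketch, using made-up variable names, shows the difference between a single and a doubled backslash.

```javascript
const single = "\byster";   // "\b" is a backspace character inside a string literal
const doubled = "\\byster"; // the regex engine receives \b followed by yster

console.log(single.length, single.charCodeAt(0)); // the first character is code 8
console.log(doubled.length, doubled[0]);          // the first character is a backslash

// Only the doubled form behaves as a word-boundary pattern.
const asBoundary = new RegExp(doubled).test("an yster");
const asBackspace = new RegExp(single).test("an yster");

// A string pattern with the *UCP prefix, built from the doubled form.
const filter = { artist: { $regex: "(*UCP)" + doubled } };
console.log(asBoundary, asBackspace, filter.artist.$regex);
```

Other drivers have their own string escaping rules, so the number of backslashes needed can differ from what is shown here.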