MongoDB offers a full-text search solution, MongoDB Atlas Search, for data hosted on MongoDB Atlas. A legacy text search capability is available for users self-managing MongoDB deployments.
This page describes the $text operator for self-managed deployments.
$text
$text performs a text search on the content of the fields indexed with a text index. A $text expression has the following syntax:
{ $text: { $search: <string>, $language: <string>, $caseSensitive: <boolean>, $diacriticSensitive: <boolean> } }
The $text operator accepts a text query document with the following fields:
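Field | Type | Description |
$search | string | A string of terms that the $text operator parses and uses to query the text index. The operator performs a logical OR search of the terms unless specified as a phrase. |
$language | string | Optional. The language that determines the list of stop words and the rules for the stemmer and tokenizer for the search string. |
$caseSensitive | boolean | Optional. Enables or disables case sensitive search. By default, the search defers to the case insensitivity of the text index. |
$diacriticSensitive | boolean | Optional. Enables or disables diacritic sensitive search against version 3 text indexes. By default, the search defers to the diacritic insensitivity of the text index. |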
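The field list above can be illustrated with a small sketch in plain JavaScript. The helper name buildTextQuery and its options object are hypothetical and not part of MongoDB; the sketch only shows how the optional fields combine into one $text query document.

```javascript
// Hypothetical helper (not a MongoDB API): builds a $text query document,
// adding only the optional fields that the caller actually sets.
function buildTextQuery(search, options = {}) {
  if (typeof search !== "string") {
    throw new TypeError("$search must be a string");
  }
  const text = { $search: search };
  if (options.language !== undefined) {
    text.$language = options.language;
  }
  if (options.caseSensitive !== undefined) {
    text.$caseSensitive = Boolean(options.caseSensitive);
  }
  if (options.diacriticSensitive !== undefined) {
    text.$diacriticSensitive = Boolean(options.diacriticSensitive);
  }
  return { $text: text };
}

const query = buildTextQuery("leche", { language: "es" });
console.log(JSON.stringify(query));
// {"$text":{"$search":"leche","$language":"es"}}
```

The resulting object can be passed as the filter argument of find() in mongosh or a driver.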
By default, the $text operator does not return results sorted by the results' scores. For more information on sorting by the text search scores, see the Text Score documentation.
A query can specify, at most, one $text expression.
The $text query cannot appear in $nor expressions.
The $text query cannot appear in $elemMatch query expressions or $elemMatch projection expressions.
To use a $text query in an $or expression, all clauses in the $or array must be indexed.
You cannot use hint() if the query includes a $text query expression.
You cannot specify $natural sort order if the query includes a $text expression.
You cannot combine the $text expression, which requires a special text index, with a query operator that requires a different type of special index. For example, you cannot combine the $text expression with the $near operator.
If using the $text operator in aggregation, the following restrictions also apply.
The $match stage that includes a $text must be the first stage in the pipeline.
A $text operator can only occur once in the stage.
The $text operator expression cannot appear in $or or $not expressions.
To sort the results by descending text score, use the $meta aggregation expression in the $sort stage.
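A minimal sketch of a pipeline that follows the aggregation restrictions above. The search string and the checking function textMatchIsFirst are illustrative assumptions; the check is not a MongoDB API and only inspects the shape of the pipeline array.

```javascript
// Sketch of an aggregation pipeline that respects the restrictions above.
const pipeline = [
  // The $match stage that contains $text must be the first stage.
  { $match: { $text: { $search: "cake coffee" } } },
  // Sort by descending relevance with the $meta aggregation expression.
  { $sort: { score: { $meta: "textScore" } } },
  // Optionally expose the score in the output documents.
  { $project: { subject: 1, score: { $meta: "textScore" } } },
];

// Hypothetical check (not a MongoDB API): true when $text appears in the
// first $match stage and in no later stage.
function textMatchIsFirst(stages) {
  return stages.every((stage, i) => {
    const hasText =
      stage.$match !== undefined && stage.$match.$text !== undefined;
    return i === 0 ? hasText : !hasText;
  });
}

console.log(textMatchIsFirst(pipeline)); // true
```

In mongosh, such an array would be passed to db.articles.aggregate(pipeline), assuming the articles collection has a text index.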
$search
In the $search field, specify a string of words that the $text operator parses and uses to query the text index.
The $text operator treats most punctuation in the string as delimiters, except a hyphen-minus (-) that negates a term or escaped double quotes (\") that specify a phrase.
The $search field for the $text expression is different from the $search aggregation stage provided by Atlas Search. The $search aggregation stage performs full-text search on specified fields and is only available on MongoDB Atlas.
To match on a phrase, as opposed to individual terms, enclose the phrase in escaped double quotes (\"), as in:
"\"ssl certificate\""
If the $search string includes a phrase and individual terms, text search will only match the documents that include the phrase.
For example, passed a $search string:
"\"ssl certificate\" authority key"
The $text operator searches for the phrase "ssl certificate".
Prefixing a word with a hyphen-minus (-) negates a word:
The negated word excludes documents that contain the negated word from the result set.
A hyphenated word, such as pre-market, is not a negation. If used in a hyphenated word, the $text operator treats the hyphen-minus (-) as a delimiter. To negate the word market in this instance, include a space between pre and -market, i.e., pre -market.
The $text operator adds all negations to the query with the logical AND operator.
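The phrase and negation rules above can be combined when assembling a $search string. The following is a minimal sketch in plain JavaScript; buildSearchString is a hypothetical helper, not a MongoDB function, and the negated term expired is an invented example value.

```javascript
// Hypothetical helper (not a MongoDB API): composes a $search string from
// bare terms, phrases, and negated terms.
function buildSearchString({ terms = [], phrases = [], negated = [] }) {
  const parts = [
    ...phrases.map((p) => `"${p}"`), // phrases are wrapped in double quotes
    ...terms,                        // bare terms are separated by spaces
    ...negated.map((t) => `-${t}`),  // a leading hyphen-minus negates a term
  ];
  return parts.join(" ");
}

const search = buildSearchString({
  phrases: ["ssl certificate"],
  terms: ["authority", "key"],
  negated: ["expired"],
});
console.log(search); // "ssl certificate" authority key -expired
```

The double quotes only need a backslash when the string is written as a literal inside another double-quoted string, which is why the examples on this page show \".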
The $text operator ignores language-specific stop words, such as the and and in English.
For case insensitive and diacritic insensitive text searches, the $text operator matches on the complete stemmed word. So if a document field contains the word blueberry, a search on the term blue will not match. However, blueberry or blueberries will match.
For case sensitive search (i.e. $caseSensitive: true), if the suffix stem contains uppercase letters, the $text operator matches on the exact word.
For diacritic sensitive search (i.e. $diacriticSensitive: true), if the suffix stem contains the diacritic mark or marks, the $text operator matches on the exact word.
Changed in version 3.2.
The $text operator defaults to the case insensitivity of the text index:
The version 3 text index is case insensitive for Latin characters with or without diacritics and for the Cyrillic alphabet.
Earlier versions of the text index are case insensitive for Latin characters without diacritic marks; i.e. for [A-z].
$caseSensitive
To support case sensitive search where the text index is case insensitive, specify $caseSensitive: true.
When performing a case sensitive search ($caseSensitive: true) where the text index is case insensitive, the $text operator:
First searches the text index for case insensitive and diacritic matches.
Then, the $text query operation includes an additional stage to filter out the documents that do not match the specified case.
For case sensitive search (i.e. $caseSensitive: true), if the suffix stem contains uppercase letters, the $text operator matches on the exact word.
Specifying $caseSensitive: true may impact performance.
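The two-step process above can be pictured with a toy model in plain JavaScript. This is not MongoDB's implementation: the names tokenize and caseSensitiveSearch are hypothetical, the sample documents are a subset of the collection used in the examples on this page, and stemming and stop words are ignored.

```javascript
// Toy model only: a real text index also applies stemming and stop words.
const docs = [
  { _id: 1, subject: "coffee" },
  { _id: 2, subject: "Coffee Shopping" },
  { _id: 7, subject: "coffee and cream" },
];

// Simplified tokenizer: split on anything that is not a letter or digit.
const tokenize = (s) => s.split(/[^\p{L}\p{N}]+/u).filter(Boolean);

function caseSensitiveSearch(collection, term) {
  // Stage 1: case insensitive candidate lookup (stands in for the index scan).
  const candidates = collection.filter((d) =>
    tokenize(d.subject).some((t) => t.toLowerCase() === term.toLowerCase())
  );
  // Stage 2: filter out candidates that do not match the specified case.
  return candidates.filter((d) => tokenize(d.subject).includes(term));
}

console.log(caseSensitiveSearch(docs, "Coffee").map((d) => d._id)); // [ 2 ]
```

The second pass over the candidates is the extra work that explains why a case sensitive search may be slower.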
Changed in version 3.2.
The $text operator defaults to the diacritic insensitivity of the text index:
The version 3 text index is diacritic insensitive; it does not distinguish between characters such as é, ê, and e.
Earlier versions of the text index are diacritic sensitive.
$diacriticSensitive
To support diacritic sensitive text search against the version 3 text index, specify $diacriticSensitive: true.
Text searches against earlier versions of the text index are inherently diacritic sensitive and cannot be diacritic insensitive. As such, the $diacriticSensitive option for the $text operator has no effect with earlier versions of the text index.
To perform a diacritic sensitive text search ($diacriticSensitive: true) against a version 3 text index, the $text operator:
First searches the text index, which is diacritic insensitive.
Then, the $text query operation includes an additional stage to filter out the documents that do not match.
Specifying $diacriticSensitive: true may impact performance.
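The diacritic sensitive process can be pictured the same way, as a toy model in plain JavaScript. The function names are hypothetical, the Unicode folding via String.prototype.normalize is an assumption chosen for illustration, and MongoDB's actual folding rules, stemming, and stop words are not modeled.

```javascript
// Toy model only: ignores stemming and stop words, unlike a real text index.
const docs = [
  { _id: 5, subject: "Café Con Leche" },
  { _id: 8, subject: "Cafe con Leche" },
];

// Keep letters, digits, and combining marks together as tokens.
const tokenize = (s) => s.split(/[^\p{L}\p{N}\p{M}]+/u).filter(Boolean);
// Fold case and strip diacritics (decompose, then drop combining marks).
const fold = (s) => s.normalize("NFD").replace(/\p{M}/gu, "").toLowerCase();
// Fold case only, keeping diacritics.
const foldCase = (s) => s.normalize("NFC").toLowerCase();

function diacriticSensitiveSearch(collection, term) {
  // Stage 1: diacritic insensitive candidate lookup (stands in for the index scan).
  const candidates = collection.filter((d) =>
    tokenize(d.subject).some((t) => fold(t) === fold(term))
  );
  // Stage 2: keep only candidates whose diacritics match the search term.
  return candidates.filter((d) =>
    tokenize(d.subject).some((t) => foldCase(t) === foldCase(term))
  );
}

console.log(diacriticSensitiveSearch(docs, "CAFÉ").map((d) => d._id)); // [ 5 ]
```

Both documents pass the first stage because the folded forms are equal; only the second stage separates Café from Cafe.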
To perform a diacritic sensitive search against an earlier version of the text index, the $text operator searches the text index, which is diacritic sensitive.
For diacritic sensitive search, if the suffix stem contains the diacritic mark or marks, the $text operator matches on the exact word.
The $text operator assigns a score to each document that contains the search term in the indexed fields. The score represents the relevance of a document to a given text search query. The score can be part of a sort() method specification as well as part of the projection expression. The { $meta: "textScore" } expression provides information on the processing of the $text operation. See the $meta projection operator for details on accessing the score for projection or sort.
The following examples assume a collection articles that has a version 3 text index on the field subject:
db.articles.createIndex( { subject: "text" } )
Populate the collection with the following documents:
db.articles.insertMany( [ { _id: 1, subject: "coffee", author: "xyz", views: 50 }, { _id: 2, subject: "Coffee Shopping", author: "efg", views: 5 }, { _id: 3, subject: "Baking a cake", author: "abc", views: 90 }, { _id: 4, subject: "baking", author: "xyz", views: 100 }, { _id: 5, subject: "Café Con Leche", author: "abc", views: 200 }, { _id: 6, subject: "Сырники", author: "jkl", views: 80 }, { _id: 7, subject: "coffee and cream", author: "efg", views: 10 }, { _id: 8, subject: "Cafe con Leche", author: "xyz", views: 10 } ] )
The following query specifies a $search string of coffee:
db.articles.find( { $text: { $search: "coffee" } } )
This query returns the documents that contain the term coffee in the indexed subject field, or more precisely, the stemmed version of the word:
{ _id: 1, subject: 'coffee', author: 'xyz', views: 50 }, { _id: 7, subject: 'coffee and cream', author: 'efg', views: 10 }, { _id: 2, subject: 'Coffee Shopping', author: 'efg', views: 5 }
If the search string is a space-delimited string, the $text operator performs a logical OR search on each term and returns documents that contain any of the terms.
The following query specifies a $search string of three terms delimited by space, "bake coffee cake":
db.articles.find( { $text: { $search: "bake coffee cake" } } )
This query returns documents that contain either bake or coffee or cake in the indexed subject field, or more precisely, the stemmed version of these words:
{ "_id" : 2, "subject" : "Coffee Shopping", "author" : "efg", "views" : 5 } { "_id" : 7, "subject" : "coffee and cream", "author" : "efg", "views" : 10 } { "_id" : 1, "subject" : "coffee", "author" : "xyz", "views" : 50 } { "_id" : 3, "subject" : "Baking a cake", "author" : "abc", "views" : 90 } { "_id" : 4, "subject" : "baking", "author" : "xyz", "views" : 100 }
To match the exact phrase as a single term, escape the quotes.
The following query searches for the phrase "coffee shop":
db.articles.find( { $text: { $search: "\"coffee shop\"" } } )
This query returns documents that contain the phrase coffee shop:
{ "_id" : 2, "subject" : "Coffee Shopping", "author" : "efg", "views" : 5 }
A negated term is a term that is prefixed by a minus sign (-). If you negate a term, the $text operator will exclude the documents that contain those terms from the results.
The following example searches for documents that contain the word coffee but do not contain the term shop, or more precisely the stemmed version of the words:
db.articles.find( { $text: { $search: "coffee -shop" } } )
The query returns the following documents:
{ "_id" : 7, "subject" : "coffee and cream", "author" : "efg", "views" : 10 } { "_id" : 1, "subject" : "coffee", "author" : "xyz", "views" : 50 }
Use the optional $language field in the $text expression to specify a language that determines the list of stop words and the rules for the stemmer and tokenizer for the search string.
If you specify a language value of "none", then the text search uses simple tokenization with no list of stop words and no stemming.
The following query specifies es, i.e. Spanish, as the language that determines the tokenization, stemming, and stop words:
db.articles.find( { $text: { $search: "leche", $language: "es" } } )
The query returns the following documents:
{ "_id" : 5, "subject" : "Café Con Leche", "author" : "abc", "views" : 200 } { "_id" : 8, "subject" : "Cafe con Leche", "author" : "xyz", "views" : 10 }
The $text expression can also accept the language by name, spanish. See Text Search Languages for the supported languages.
Changed in version 3.2.
The $text operator defers to the case and diacritic insensitivity of the text index. The version 3 text index is diacritic insensitive and expands its case insensitivity to include the Cyrillic alphabet as well as characters with diacritics. For details, see text Index Case Insensitivity and text Index Diacritic Insensitivity.
The following query performs a case and diacritic insensitive text search for the terms сы́рники or CAFÉS:
db.articles.find( { $text: { $search: "сы́рники CAFÉS" } } )
Using the version 3 text index, the query matches the following documents:
{ "_id" : 6, "subject" : "Сырники", "author" : "jkl", "views" : 80 } { "_id" : 5, "subject" : "Café Con Leche", "author" : "abc", "views" : 200 } { "_id" : 8, "subject" : "Cafe con Leche", "author" : "xyz", "views" : 10 }
With the previous versions of the text index, the query would not match any document.
Changed in version 3.2.
To enable case sensitive search, specify $caseSensitive: true. Specifying $caseSensitive: true may impact performance.
The following query performs a case sensitive search for the term Coffee:
db.articles.find( { $text: { $search: "Coffee", $caseSensitive: true } } )
The search matches just the document:
{ "_id" : 2, "subject" : "Coffee Shopping", "author" : "efg", "views" : 5 }
The following query performs a case sensitive search for the phrase Café Con Leche:
db.articles.find( { $text: { $search: "\"Café Con Leche\"", $caseSensitive: true } } )
The search matches just the document:
{ "_id" : 5, "subject" : "Café Con Leche", "author" : "abc", "views" : 200 }
A negated term is a term that is prefixed by a minus sign (-). If you negate a term, the $text operator will exclude the documents that contain those terms from the results. You can also specify case sensitivity for negated terms.
The following example performs a case sensitive search for documents that contain the word Coffee but do not contain the lower-case term shop, or more precisely the stemmed version of the words:
db.articles.find( { $text: { $search: "Coffee -shop", $caseSensitive: true } } )
The query matches the following document:
{ "_id" : 2, "subject" : "Coffee Shopping", "author" : "efg", "views" : 5 }
Changed in version 3.2.
To enable diacritic sensitive search against a version 3 text index, specify $diacriticSensitive: true. Specifying $diacriticSensitive: true may impact performance.
The following query performs a diacritic sensitive text search on the term CAFÉ, or more precisely the stemmed version of the word:
db.articles.find( { $text: { $search: "CAFÉ", $diacriticSensitive: true } } )
The query only matches the following document:
{ "_id" : 5, "subject" : "Café Con Leche", "author" : "abc", "views" : 200 }
The $diacriticSensitive option also applies to negated terms. A negated term is a term that is prefixed by a minus sign (-). If you negate a term, the $text operator will exclude the documents that contain those terms from the results.
The following query performs a diacritic sensitive text search for documents that contain the term leches but not the term cafés, or more precisely the stemmed version of the words:
db.articles.find( { $text: { $search: "leches -cafés", $diacriticSensitive: true } } )
The query matches the following document:
{ "_id" : 8, "subject" : "Cafe con Leche", "author" : "xyz", "views" : 10 }
The following query performs a text search for the term cake and uses the $meta operator in the projection document to append the relevance score to each matching document:
db.articles.find( { $text: { $search: "cake" } }, { score: { $meta: "textScore" } } )
The returned document includes an additional field score that contains the document's relevance score:
{ "_id" : 3, "subject" : "Baking a cake", "author" : "abc", "views" : 90, "score" : 0.75 }
Starting in MongoDB 4.4, you can specify the { $meta: "textScore" } expression in the sort() without also specifying the expression in the projection. For example,
db.articles.find( { $text: { $search: "cake" } } ).sort( { score: { $meta: "textScore" } } )
As a result, you can sort the resulting documents by their search relevance without projecting the textScore.
In MongoDB 4.2 and earlier, to include the { $meta: "textScore" } expression in the sort(), you must also include the same expression in the projection.
Starting in MongoDB 4.4, if you include the { $meta: "textScore" } expression in both the projection and sort(), the projection and sort documents can have different field names for the expression.
db.articles.find( { $text: { $search: "cake" } } , { score: { $meta: "textScore" } } ).sort( { ignoredName: { $meta: "textScore" } } )
In previous versions of MongoDB, if { $meta: "textScore" } is included in both the projection and sort, you must specify the same field name for the expression.
In MongoDB 4.2 and earlier, to sort by the text score, include the same $meta expression in both the projection document and the sort expression. The following query searches for the term coffee and sorts the results by the descending score:
db.articles.find( { $text: { $search: "coffee" } }, { score: { $meta: "textScore" } } ).sort( { score: { $meta: "textScore" } } )
The query returns the matching documents sorted by descending score.
Use the limit() method in conjunction with a sort() to return the top n matching documents.
The following query searches for the term coffee and sorts the results by the descending score, limiting the results to the top two matching documents:
db.articles.find( { $text: { $search: "coffee" } }, { score: { $meta: "textScore" } } ).sort( { score: { $meta: "textScore" } } ).limit(2)
The following query searches for documents where the author equals "xyz" and the indexed field subject contains the terms coffee or bake. The operation also specifies a sort order of ascending date, then descending text search score:
db.articles.find( { author: "xyz", $text: { $search: "coffee bake" } }, { score: { $meta: "textScore" } } ).sort( { date: 1, score: { $meta: "textScore" } } )