Text Search on Self-Managed Deployments

Note

This page describes text query capabilities for self-managed (non-Atlas) deployments. For data hosted on MongoDB, MongoDB also offers an improved full-text query solution, MongoDB Search, and a vector search solution, Vector Search.

To run text search queries on self-managed deployments, you must have a text index on your collection. MongoDB provides text indexes to support text search queries on string content. Text indexes can include any field whose value is a string or an array of string elements. A collection can have only one text index, but that index can cover multiple fields.

See the Text Indexes on Self-Managed Deployments section for a full reference on text indexes, including behavior, tokenization, and properties.

Examples

This example demonstrates how to build a text index and use it to find coffee shops, given only text fields.

Create a Collection

Create a collection named stores with the following documents:

db.stores.insertMany(
   [
      { _id: 1, name: "Java Hut", description: "Coffee and cakes" },
      { _id: 2, name: "Burger Buns", description: "Gourmet hamburgers" },
      { _id: 3, name: "Coffee Shop", description: "Just coffee" },
      { _id: 4, name: "Clothes Clothes Clothes", description: "Discount clothing" },
      { _id: 5, name: "Java Shopping", description: "Indonesian goods" },
      { _id: 6, name: "NYC_Coffee Shop", description: "local NYC coffee" }
   ]
)

Create a Text Index

Run the following in mongosh to allow text search over the name and description fields:

db.stores.createIndex( { name: "text", description: "text" } )

Search for an Exact String

You can search for exact multi-word strings by wrapping them in double quotes. Text search matches only documents that include the whole string.

For example, the following query finds all documents that contain the string "coffee shop":

db.stores.find( { $text: { $search: "\"coffee shop\"" } } )

This query returns the following documents:

[
{ _id: 3, name: 'Coffee Shop', description: 'Just coffee' },
{ _id: 6, name: 'NYC_Coffee Shop', description: 'local NYC coffee' }
]

Unless you specify otherwise, exact string search is neither case sensitive nor diacritic sensitive. For example, the following query returns the same results as the previous query:

db.stores.find( { $text: { $search: "\"COFFEé SHOP\"" } } )

Exact string search does not handle stemming or stop words.
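
The "unless you specify otherwise" behavior above can be changed with the $caseSensitive and $diacriticSensitive fields of the $text operator. The following is a minimal sketch of a filter builder, not part of the original example: phraseQuery and its options argument are hypothetical names introduced here, and the sketch assumes the phrase itself contains no double quotes.

```javascript
// Hypothetical helper for this sketch: builds a $text filter for an exact phrase.
// $search, $caseSensitive, and $diacriticSensitive are fields of the $text operator.
function phraseQuery(phrase, options = {}) {
  // Wrap the phrase in double quotes so $text treats it as one exact string.
  const text = { $search: `"${phrase}"` };
  // Add the sensitivity flags only when requested; both default to false on the server.
  if (options.caseSensitive) text.$caseSensitive = true;
  if (options.diacriticSensitive) text.$diacriticSensitive = true;
  return { $text: text };
}

// Default behavior: case insensitive and diacritic insensitive.
const relaxed = phraseQuery("coffee shop");

// Stricter variant: letter case and diacritics must match.
const strict = phraseQuery("Coffee Shop", {
  caseSensitive: true,
  diacriticSensitive: true
});

console.log(JSON.stringify(relaxed));
console.log(JSON.stringify(strict));
```

In mongosh, either filter could then be passed to find, for example db.stores.find(strict).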

Exclude a Term

To exclude a word, prepend a "-" character to it. For example, to find all stores containing "java" or "shop" but not "coffee", use the following:

db.stores.find( { $text: { $search: "java shop -coffee" } } )
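
When the included and excluded words come from user input, the search string can be assembled programmatically. The sketch below shows one way to do that; buildSearch is a hypothetical helper name, and it assumes each term is a single word without spaces or a leading "-".

```javascript
// Hypothetical helper for this sketch: joins included terms and negated terms
// into one $search string, prefixing each excluded term with "-".
function buildSearch(include, exclude = []) {
  const negated = exclude.map((term) => `-${term}`);
  return [...include, ...negated].join(" ");
}

const search = buildSearch(["java", "shop"], ["coffee"]);
const filter = { $text: { $search: search } };

console.log(search); // java shop -coffee
```

The resulting filter is equivalent to the one in the query above and could be passed to db.stores.find(filter) in mongosh.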

Sort the Results

MongoDB returns its results in unsorted order by default. However, $text queries compute a relevance score for each document that specifies how well a document matches the query.

To sort the results in order of relevance score, you must explicitly project the $meta textScore field and sort on it:

db.stores.find(
   { $text: { $search: "java coffee shop" } },
   { score: { $meta: "textScore" } }
).sort( { score: { $meta: "textScore" } } )

$text is also available in the aggregation pipeline.
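
As an illustration of the aggregation form, the following sketch builds a pipeline that matches on $text, sorts by relevance score, and keeps the top results. The helper name textSearchPipeline and its limit parameter are assumptions made for this example. Note that a $match stage that uses $text must be the first stage of the pipeline.

```javascript
// Hypothetical helper for this sketch: returns an aggregation pipeline that
// runs a text search and keeps the highest-scoring documents.
function textSearchPipeline(search, limit) {
  return [
    // The $match stage that contains $text must come first in the pipeline.
    { $match: { $text: { $search: search } } },
    // Sort by the relevance score that $text computes for each document.
    { $sort: { score: { $meta: "textScore" } } },
    // Keep only the best matches.
    { $limit: limit },
    // Return the indexed fields together with the score.
    { $project: { name: 1, description: 1, score: { $meta: "textScore" } } }
  ];
}

const pipeline = textSearchPipeline("java coffee shop", 3);
console.log(JSON.stringify(pipeline, null, 2));
```

In mongosh, the pipeline could be run against the example collection with db.stores.aggregate(pipeline).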