Compound Text Search Criteria

Minimum MongoDB Version: 4.2

Scenario

You want to search a collection of e-commerce products to find specific movie DVDs. Based on each DVD's full-text plot description, you want movies with a post-apocalyptic theme, especially those related to a nuclear disaster where some people survive. However, you aren't interested in seeing movies involving zombies.

To execute this example, you need to be using an Atlas Cluster rather than a self-managed MongoDB deployment. The simplest way to achieve this is to provision a Free Tier Atlas Cluster.

Sample Data Population

Remove any existing documents from the products collection (if it exists) and then populate it with some DVD and book records:

db = db.getSiblingDB("book-compound-text-search");
db.products.deleteMany({});

// Insert 7 records into the products collection
db.products.insertMany([
  {
    "name": "The Road",
    "category": "DVD",
    "description": "In a dangerous post-apocalyptic world, a dying father protects his surviving son as they try to reach the coast",
  },
  {
    "name": "The Day Of The Triffids",
    "category": "BOOK",
    "description": "Post-apocalyptic disaster where most people are blinded by a meteor shower and then die at the hands of a new type of plant",
  },
  {
    "name": "The Road",
    "category": "BOOK",
    "description": "In a dangerous post-apocalyptic world, a dying father protects his surviving son as they try to reach the coast",
  },  
  {
    "name": "The Day the Earth Caught Fire",
    "category": "DVD",
    "description": "A series of nuclear explosions cause fires and earthquakes to ravage cities, with some of those that survive trying to rescue the post-apocalyptic world",
  },
  {
    "name": "28 Days Later",
    "category": "DVD",
    "description": "A caged chimp infected with a virus is freed from a lab, and the infection spreads to people who become zombie-like with just a few surviving in a post-apocalyptic country",
  },  
  {
    "name": "Don't Look Up",
    "category": "DVD",
    "description": "Pre-apocalyptic situation where some astronomers warn humankind of an approaching comet that will destroy planet Earth",
  },
  {
    "name": "Thirteen Days",
    "category": "DVD",
    "description": "Based on the true story of the Cuban nuclear missile threat, crisis is averted at the last minute and the world survives",
  },
]); 

 

Now, using the simple procedure described in the Create Atlas Search Index appendix, define a Search Index. Select the new database collection book-compound-text-search.products and enter the following JSON search index definition:

{
  "searchAnalyzer": "lucene.english",
  "mappings": {
    "dynamic": true
  }
}

This definition indicates that the index should use the lucene.english analyzer and make all document fields searchable using their inferred data types.
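If you prefer to stay in the shell, the same definition can probably be submitted from there instead of the Atlas console. The following is a minimal sketch, assuming your cluster and shell versions support the createSearchIndex() helper (this is an assumption; older versions, including the minimum version listed above, may not). The variable name indexDefinition is purely illustrative.

```javascript
// The same index definition as above, expressed as a JavaScript object
var indexDefinition = {
  searchAnalyzer: "lucene.english",
  mappings: {dynamic: true},
};

// Only attempt to create the index when running inside the MongoDB Shell,
// where the global `db` is defined.
// Assumption: the deployment and shell versions support createSearchIndex().
if (typeof db !== "undefined") {
  db.getSiblingDB("book-compound-text-search")
    .products.createSearchIndex("default", indexDefinition);
}
```

The index is named "default" so that it matches the index name referenced by the pipeline's $search stage.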

Aggregation Pipeline

Define a pipeline ready to perform the aggregation:

var pipeline = [
  // Search for DVDs where the description must contain "apocalyptic" but not "zombie"
  {"$search": {
    "index": "default",    
    "compound": {
      "must": [
        {"text": {
          "path": "description",
          "query": "apocalyptic",
        }},
      ],
      "should": [
        {"text": {
          "path": "description",
          "query": "nuclear survives",
        }},
      ],
      "mustNot": [
        {"text": {
          "path": "description",
          "query": "zombie",
        }},
      ],
      "filter": [
        {"text": {
          "path": "category",
          "query": "DVD",
        }},      
      ],
    }
  }},

  // Capture the search relevancy score in the output and omit the _id field
  {"$set": {
    "score": {"$meta": "searchScore"},
    "_id": "$$REMOVE",
  }},
];
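To see how the four compound clauses interact, it can help to model them in plain JavaScript. The sketch below is a toy approximation only: it is not how Atlas Search or Lucene works internally, the prefix comparison is a crude stand-in for the analyzer's stemming, and counting "should" hits is a stand-in for real relevancy scoring. The helper names (tokens, matches, compoundSearch) are hypothetical.

```javascript
// A copy of the sample data (name, category, description only)
var sampleProducts = [
  {name: "The Road", category: "DVD", description: "In a dangerous post-apocalyptic world, a dying father protects his surviving son as they try to reach the coast"},
  {name: "The Day Of The Triffids", category: "BOOK", description: "Post-apocalyptic disaster where most people are blinded by a meteor shower and then die at the hands of a new type of plant"},
  {name: "The Road", category: "BOOK", description: "In a dangerous post-apocalyptic world, a dying father protects his surviving son as they try to reach the coast"},
  {name: "The Day the Earth Caught Fire", category: "DVD", description: "A series of nuclear explosions cause fires and earthquakes to ravage cities, with some of those that survive trying to rescue the post-apocalyptic world"},
  {name: "28 Days Later", category: "DVD", description: "A caged chimp infected with a virus is freed from a lab, and the infection spreads to people who become zombie-like with just a few surviving in a post-apocalyptic country"},
  {name: "Don't Look Up", category: "DVD", description: "Pre-apocalyptic situation where some astronomers warn humankind of an approaching comet that will destroy planet Earth"},
  {name: "Thirteen Days", category: "DVD", description: "Based on the true story of the Cuban nuclear missile threat, crisis is averted at the last minute and the world survives"},
];

// Toy stand-in for an analyzer: lowercase, then split on non-alphanumeric characters
function tokens(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Crude substitute for stemming: a term matches a token when their first 6 characters agree
function matches(text, term) {
  var stem = term.toLowerCase().slice(0, 6);
  return tokens(text).some((t) => t.slice(0, 6) === stem);
}

function compoundSearch(products) {
  return products
    .filter((p) => p.category === "DVD")                  // "filter": required, no effect on ranking
    .filter((p) => matches(p.description, "apocalyptic")) // "must": required
    .filter((p) => !matches(p.description, "zombie"))     // "mustNot": excludes matches
    .map((p) => ({
      name: p.name,
      // "should": optional, but each matching term raises the ranking
      shouldHits: ["nuclear", "survives"].filter((t) => matches(p.description, t)).length,
    }))
    .sort((a, b) => b.shouldHits - a.shouldHits);
}

console.log(compoundSearch(sampleProducts));
```

In this toy model, the documents that pass the required clauses are ordered purely by how many of the optional "should" terms they contain, which mirrors the idea that "should" influences ranking rather than inclusion.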

Execution

Execute the aggregation using the defined pipeline and also view its explain plan:

db.products.aggregate(pipeline);
db.products.explain("executionStats").aggregate(pipeline);

Expected Results

Three documents should be returned, showing products which are post-apocalyptic themed DVDs, as shown below:

[
  {
    name: 'The Day the Earth Caught Fire',
    category: 'DVD',
    description: 'A series of nuclear explosions cause fires and earthquakes to ravage cities, with some of those that survive trying to rescue the post-apocalyptic world',
    score: 0.8468831181526184
  },
  {
    name: 'The Road',
    category: 'DVD',
    description: 'In a dangerous post-apocalyptic world, a dying father protects his surviving son as they try to reach the coast',
    score: 0.3709350824356079
  },
  {
    name: "Don't Look Up",
    category: 'DVD',
    description: 'Pre-apocalyptic situation where some astronomers warn humankind of an approaching comet that will destroy planet Earth',
    score: 0.09836573898792267
  }
]

If you don't see any results, double-check that the system has finished generating your new index.
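One way to perform that check from the shell, rather than from the Atlas console, is sketched below. It assumes your cluster and shell versions provide the getSearchIndexes() helper and that each returned entry carries name and queryable fields (both are assumptions to verify against your environment). The function name isSearchIndexReady is hypothetical.

```javascript
// Returns true when the named index is reported as queryable.
// `indexes` is an array shaped like the output of getSearchIndexes()
// (assumed fields: name, queryable).
function isSearchIndexReady(indexes, indexName) {
  return indexes.some((idx) => idx.name === indexName && idx.queryable === true);
}

// Only query the deployment when running inside the MongoDB Shell,
// where the global `db` is defined.
if (typeof db !== "undefined") {
  var found = db.getSiblingDB("book-compound-text-search").products.getSearchIndexes();
  print(isSearchIndexReady(found, "default") ? "Index is ready" : "Index is still building");
}
```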

Observations