Model One-to-Many Relationships with Embedded Documents

Overview

This page describes a data model that uses embedded documents to describe a one-to-many relationship between connected data. Embedding connected data in a single document can reduce the number of read operations required to obtain data. In general, you should structure your schema so your application receives all of its required information in a single read operation.

Embedded Document Pattern

Consider the following example that maps patron and multiple address relationships. The example illustrates the advantage of embedding over referencing if you need to view many data entities in the context of another. In this one-to-many relationship between patron and address data, the patron has multiple address entities.

In the normalized data model, the address documents contain a reference to the patron document.

// patron document
{
   _id: "joe",
   name: "Joe Bookreader"
}
// address documents
{
   patron_id: "joe", // reference to patron document
   street: "123 Fake Street",
   city: "Faketon",
   state: "MA",
   zip: "12345"
}
{
   patron_id: "joe",
   street: "1 Some Other Street",
   city: "Boston",
   state: "MA",
   zip: "12345"
}

If your application frequently retrieves the address data with the name information, then your application needs to issue multiple queries to resolve the references. A better schema would be to embed the address data entities in the patron data, as in the following document:

{
   "_id": "joe",
   "name": "Joe Bookreader",
   "addresses": [
      {
         "street": "123 Fake Street",
         "city": "Faketon",
         "state": "MA",
         "zip": "12345"
      },
      {
         "street": "1 Some Other Street",
         "city": "Boston",
         "state": "MA",
         "zip": "12345"
      }
   ]
}

With the embedded data model, your application can retrieve the complete patron information with one query.
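
To make the difference in read counts concrete, the following sketch models both schemas with plain in-memory JavaScript arrays instead of a live database. The array names, the `reads` counter, and the `findOne` and `findMany` helpers are illustrative assumptions, not driver APIs; each helper call stands in for one query.

```javascript
// Hypothetical in-memory stand-ins for collections.
// Each helper call is counted as one read operation.
let reads = 0;
const findOne = (coll, pred) => { reads += 1; return coll.find(pred) || null; };
const findMany = (coll, pred) => { reads += 1; return coll.filter(pred); };

// Normalized model: address documents reference the patron by patron_id.
const patrons = [{ _id: "joe", name: "Joe Bookreader" }];
const addresses = [
  { patron_id: "joe", street: "123 Fake Street", city: "Faketon", state: "MA", zip: "12345" },
  { patron_id: "joe", street: "1 Some Other Street", city: "Boston", state: "MA", zip: "12345" },
];

// Embedded model: addresses live inside the patron document.
const patronsEmbedded = [{
  _id: "joe",
  name: "Joe Bookreader",
  addresses: [
    { street: "123 Fake Street", city: "Faketon", state: "MA", zip: "12345" },
    { street: "1 Some Other Street", city: "Boston", state: "MA", zip: "12345" },
  ],
}];

// Two reads: one for the patron, one to resolve the address references.
function loadNormalized(id) {
  const patron = findOne(patrons, (p) => p._id === id);
  const patronAddresses = findMany(addresses, (a) => a.patron_id === id);
  return { ...patron, addresses: patronAddresses };
}

// One read: the addresses arrive with the patron document.
function loadEmbedded(id) {
  return findOne(patronsEmbedded, (p) => p._id === id);
}

reads = 0;
const normalized = loadNormalized("joe");
const normalizedReads = reads;

reads = 0;
const embedded = loadEmbedded("joe");
const embeddedReads = reads;

console.log(normalizedReads, embeddedReads); // 2 1
```

Both functions return the same name and address data; only the number of reads differs.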

Subset Pattern

A potential problem with the embedded document pattern is that it can lead to large documents, especially if the embedded field is unbounded. In this case, you can use the subset pattern to access only the data that the application requires, instead of the entire set of embedded data.

Consider an e-commerce site that has a list of reviews for a product:

{
  "_id": 1,
  "name": "Super Widget",
  "description": "This is the most useful item in your toolbox.",
  "price": { "value": NumberDecimal("119.99"), "currency": "USD" },
  "reviews": [
    {
      "review_id": 786,
      "review_author": "Kristina",
      "review_text": "This is indeed an amazing widget.",
      "published_date": ISODate("2019-02-18")
    },
    {
      "review_id": 785,
      "review_author": "Trina",
      "review_text": "Nice product. Slow shipping.",
      "published_date": ISODate("2019-02-17")
    },
    ...
    {
      "review_id": 1,
      "review_author": "Hans",
      "review_text": "Meh, it's okay.",
      "published_date": ISODate("2017-12-06")
    }
  ]
}

The reviews are sorted in reverse chronological order. When a user visits a product page, the application loads the ten most recent reviews.

Instead of storing all of the reviews with the product, you can split the collection into two collections:

  • The product collection stores information on each product, including the product's ten most recent reviews:

    {
      "_id": 1,
      "name": "Super Widget",
      "description": "This is the most useful item in your toolbox.",
      "price": { "value": NumberDecimal("119.99"), "currency": "USD" },
      "reviews": [
        {
          "review_id": 786,
          "review_author": "Kristina",
          "review_text": "This is indeed an amazing widget.",
          "published_date": ISODate("2019-02-18")
        },
        ...
        {
          "review_id": 777,
          "review_author": "Pablo",
          "review_text": "Amazing!",
          "published_date": ISODate("2019-02-16")
        }
      ]
    }
  • The review collection stores all reviews. Each review contains a reference to the product for which it was written.

    {
      "review_id": 786,
      "product_id": 1,
      "review_author": "Kristina",
      "review_text": "This is indeed an amazing widget.",
      "published_date": ISODate("2019-02-18")
    }
    {
      "review_id": 785,
      "product_id": 1,
      "review_author": "Trina",
      "review_text": "Nice product. Slow shipping.",
      "published_date": ISODate("2019-02-17")
    }
    ...
    {
      "review_id": 1,
      "product_id": 1,
      "review_author": "Hans",
      "review_text": "Meh, it's okay.",
      "published_date": ISODate("2017-12-06")
    }

By storing the ten most recent reviews in the product collection, only the required subset of the overall data is returned in the call to the product collection. If a user wants to see additional reviews, the application makes a call to the review collection.
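
The read path described above can be sketched with in-memory arrays standing in for the two collections. The function names `getProductPage` and `getMoreReviews`, the generated sample reviews, and the paging parameters are assumptions made for illustration; a real application would issue the equivalent queries through its database driver.

```javascript
// Hypothetical in-memory stand-in for the review collection: 25 generated reviews.
const reviewCollection = [];
for (let id = 1; id <= 25; id += 1) {
  reviewCollection.push({
    review_id: id,
    product_id: 1,
    review_author: `author${id}`,
    review_text: `review ${id}`,
    published_date: new Date(Date.UTC(2019, 0, id)), // higher id = more recent
  });
}

// Comparator for reverse chronological order.
const byNewest = (a, b) => b.published_date - a.published_date;

// The product document embeds only the ten most recent reviews.
const productCollection = [{
  _id: 1,
  name: "Super Widget",
  reviews: reviewCollection
    .slice()
    .sort(byNewest)
    .slice(0, 10)
    .map(({ product_id, ...embeddedFields }) => embeddedFields),
}];

// Default page load: a single read against the product collection.
function getProductPage(productId) {
  return productCollection.find((p) => p._id === productId);
}

// Only when the user asks for more: read from the review collection.
function getMoreReviews(productId, skip, limit) {
  return reviewCollection
    .filter((r) => r.product_id === productId)
    .sort(byNewest)
    .slice(skip, skip + limit);
}

const page = getProductPage(1);
const older = getMoreReviews(1, page.reviews.length, 10);

console.log(page.reviews.map((r) => r.review_id)); // ids 25 down to 16
console.log(older.map((r) => r.review_id));        // ids 15 down to 6
```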

Tip

When considering where to split your data, the most frequently-accessed portion of the data should go in the collection that the application loads first. In this example, the schema is split at ten reviews because that is the number of reviews visible in the application by default.

See also:

To learn how to use the subset pattern to model one-to-one relationships between collections, see Model One-to-One Relationships with Embedded Documents.

Trade-Offs of the Subset Pattern

Using smaller documents containing more frequently-accessed data reduces the overall size of the working set. These smaller documents result in improved read performance for the data that the application accesses most frequently.

However, the subset pattern results in data duplication. In the example, reviews are maintained in both the product collection and the review collection. Extra steps must be taken to ensure that the reviews are consistent across both collections. For example, when a customer edits their review, the application may need to make two write operations: one to update the product collection and one to update the review collection.
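
As a rough illustration of the duplicated write, the sketch below edits a review in in-memory stand-ins for both collections and counts the writes. The `editReview` function and the sample edit texts are hypothetical; a real application would also need to decide how to keep the two writes consistent if one of them fails.

```javascript
// Hypothetical in-memory stand-ins for the two collections.
const reviewCollection = [
  { review_id: 786, product_id: 1, review_author: "Kristina", review_text: "This is indeed an amazing widget." },
  { review_id: 1, product_id: 1, review_author: "Hans", review_text: "Meh, it's okay." },
];
const productCollection = [{
  _id: 1,
  name: "Super Widget",
  // Only the recent review is embedded; the old one lives solely in reviewCollection.
  reviews: [
    { review_id: 786, review_author: "Kristina", review_text: "This is indeed an amazing widget." },
  ],
}];

// Applies an edit everywhere the review is stored and returns the number of writes.
function editReview(reviewId, newText) {
  let writes = 0;
  const review = reviewCollection.find((r) => r.review_id === reviewId);
  if (!review) return writes;

  // Write 1: the authoritative copy in the review collection.
  review.review_text = newText;
  writes += 1;

  // Write 2: the duplicated copy, only if the review is currently embedded.
  const product = productCollection.find((p) => p._id === review.product_id);
  const embedded = product && product.reviews.find((r) => r.review_id === reviewId);
  if (embedded) {
    embedded.review_text = newText;
    writes += 1;
  }
  return writes;
}

const recentWrites = editReview(786, "Still amazing after a year.");
const olderWrites = editReview(1, "Better than I first thought.");
console.log(recentWrites, olderWrites); // 2 1
```

Editing a review that is embedded in the product document costs two writes, while editing an older review that exists only in the review collection costs one.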

You must also implement logic in your application to ensure that the reviews in the product collection are always the ten most recent reviews for that product.
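
One possible shape for that logic, sketched against in-memory data: every new review is stored in the full list, then the embedded array is re-sorted and trimmed to ten entries. The `addReview` function, the `MAX_EMBEDDED_REVIEWS` constant, and the generated sample reviews are assumptions for illustration only.

```javascript
const MAX_EMBEDDED_REVIEWS = 10; // matches the number of reviews shown by default

// Hypothetical in-memory stand-ins for the review collection and one product document.
const reviewCollection = [];
const product = { _id: 1, name: "Super Widget", reviews: [] };

function addReview(review) {
  // Every review is stored in the review collection, with a reference to its product.
  reviewCollection.push({ ...review, product_id: product._id });

  // The embedded array keeps only the most recent reviews, newest first.
  product.reviews = [...product.reviews, review]
    .sort((a, b) => b.published_date - a.published_date)
    .slice(0, MAX_EMBEDDED_REVIEWS);
}

// Insert twelve reviews; the two oldest should fall out of the embedded array.
for (let id = 1; id <= 12; id += 1) {
  addReview({
    review_id: id,
    review_author: `author${id}`,
    review_text: `review ${id}`,
    published_date: new Date(Date.UTC(2019, 0, id)), // higher id = more recent
  });
}

console.log(reviewCollection.length);                 // 12
console.log(product.reviews.map((r) => r.review_id)); // ids 12 down to 3
```

In MongoDB itself, a similar sort-and-trim can be expressed in a single update with the `$push` operator and its `$each`, `$sort`, and `$slice` modifiers, though whether that suits a given application depends on how it handles the accompanying write to the review collection.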

Other Sample Use Cases

In addition to product reviews, the subset pattern can also be a good fit for storing:

  • Comments on a blog post, when you only want to show the most recent or highest-rated comments by default.
  • Cast members in a movie, when you only want to show cast members with the largest roles by default.