Model One-to-Many Relationships with Embedded Documents

Overview

This page describes a data model that uses embedded documents to describe a one-to-many relationship between connected data. Embedding connected data in a single document can reduce the number of read operations required to obtain data. In general, you should structure your schema so that your application receives all of the information it requires in a single read operation.

Embedded Document Pattern

Consider the following example that maps patron and multiple address relationships. The example illustrates the advantage of embedding over referencing if you need to view many data entities in the context of another. In this one-to-many relationship between patron and address data, the patron has multiple address entities.

In the normalized data model, the address documents contain a reference to the patron document.

// patron document
{
  _id: "joe",
  name: "Joe Bookreader"
}

// address documents
{
  patron_id: "joe", // reference to patron document
  street: "123 Fake Street",
  city: "Faketon",
  state: "MA",
  zip: "12345"
}

{
  patron_id: "joe", // reference to patron document
  street: "1 Some Other Street",
  city: "Boston",
  state: "MA",
  zip: "12345"
}

If your application frequently retrieves the address data with the name information, then your application needs to issue multiple queries to resolve the references. A better schema embeds the address data entities in the patron data, as in the following document:

{
  "_id": "joe",
  "name": "Joe Bookreader",
  "addresses": [
    {
      "street": "123 Fake Street",
      "city": "Faketon",
      "state": "MA",
      "zip": "12345"
    },
    {
      "street": "1 Some Other Street",
      "city": "Boston",
      "state": "MA",
      "zip": "12345"
    }
  ]
}

With the embedded data model, your application can retrieve the complete patron information with one query.
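The difference in read operations can be illustrated with a minimal sketch in plain JavaScript. It does not talk to a MongoDB server: the arrays `patrons`, `addresses`, and `patronsEmbedded` are in-memory stand-ins for collections, and `find` is a hypothetical helper that counts each simulated read. All of these names are assumptions made for illustration.

```javascript
// In-memory stand-ins for the two collections of the normalized model.
const patrons = [{ _id: "joe", name: "Joe Bookreader" }];
const addresses = [
  { patron_id: "joe", street: "123 Fake Street", city: "Faketon", state: "MA", zip: "12345" },
  { patron_id: "joe", street: "1 Some Other Street", city: "Boston", state: "MA", zip: "12345" },
];

// In-memory stand-in for the single collection of the embedded model.
const patronsEmbedded = [
  {
    _id: "joe",
    name: "Joe Bookreader",
    addresses: addresses.map(({ patron_id, ...address }) => address),
  },
];

let reads = 0; // counts simulated read operations

// Hypothetical helper: one call models one query against a collection.
function find(collection, predicate) {
  reads += 1;
  return collection.filter(predicate);
}

// Normalized model: one read for the patron, a second to resolve the references.
function loadPatronNormalized(id) {
  const [patron] = find(patrons, (p) => p._id === id);
  const patronAddresses = find(addresses, (a) => a.patron_id === id);
  return {
    ...patron,
    addresses: patronAddresses.map(({ patron_id, ...address }) => address),
  };
}

// Embedded model: a single read returns the patron together with all addresses.
function loadPatronEmbedded(id) {
  const [patron] = find(patronsEmbedded, (p) => p._id === id);
  return patron;
}

reads = 0;
const normalized = loadPatronNormalized("joe");
const normalizedReads = reads;

reads = 0;
const embedded = loadPatronEmbedded("joe");
const embeddedReads = reads;

console.log(normalizedReads, embeddedReads); // 2 1
```

Both functions return the same patron data; only the number of simulated reads differs.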

Subset Pattern

A potential problem with the embedded document pattern is that it can lead to large documents, especially if the embedded field is unbounded. In this case, you can use the subset pattern to access only the data that the application requires, instead of the entire set of embedded data.

Consider an e-commerce site that has a list of reviews for a product:

{
  "_id": 1,
  "name": "Super Widget",
  "description": "This is the most useful item in your toolbox.",
  "price": { "value": NumberDecimal("119.99"), "currency": "USD" },
  "reviews": [
    {
      "review_id": 786,
      "review_author": "Kristina",
      "review_text": "This is indeed an amazing widget.",
      "published_date": ISODate("2019-02-18")
    },
    {
      "review_id": 785,
      "review_author": "Trina",
      "review_text": "Nice product. Slow shipping.",
      "published_date": ISODate("2019-02-17")
    },
    ...
    {
      "review_id": 1,
      "review_author": "Hans",
      "review_text": "Meh, it's okay.",
      "published_date": ISODate("2017-12-06")
    }
  ]
}

The reviews are sorted in reverse chronological order. When a user visits a product page, the application loads the ten most recent reviews.

Instead of storing all of the reviews with the product, you can split the collection into two collections:

  • The product collection stores information on each product, including the product's ten most recent reviews:

    {
      "_id": 1,
      "name": "Super Widget",
      "description": "This is the most useful item in your toolbox.",
      "price": { "value": NumberDecimal("119.99"), "currency": "USD" },
      "reviews": [
        {
          "review_id": 786,
          "review_author": "Kristina",
          "review_text": "This is indeed an amazing widget.",
          "published_date": ISODate("2019-02-18")
        },
        ...
        {
          "review_id": 777,
          "review_author": "Pablo",
          "review_text": "Amazing!",
          "published_date": ISODate("2019-02-16")
        }
      ]
    }
  • The review collection stores all reviews. Each review contains a reference to the product for which it was written.

    {
      "review_id": 786,
      "product_id": 1,
      "review_author": "Kristina",
      "review_text": "This is indeed an amazing widget.",
      "published_date": ISODate("2019-02-18")
    }
    {
      "review_id": 785,
      "product_id": 1,
      "review_author": "Trina",
      "review_text": "Nice product. Slow shipping.",
      "published_date": ISODate("2019-02-17")
    }
    ...
    {
      "review_id": 1,
      "product_id": 1,
      "review_author": "Hans",
      "review_text": "Meh, it's okay.",
      "published_date": ISODate("2017-12-06")
    }

Because the product collection stores only the ten most recent reviews, a call to the product collection returns only the required subset of the overall data. If a user wants to see additional reviews, the application makes a call to the review collection.
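The split described above can be sketched in plain JavaScript. This is an illustration under stated assumptions, not a migration procedure: `applySubsetPattern` is a hypothetical helper, the twelve sample reviews are generated placeholder data, and plain `Date` objects stand in for the `ISODate` values shown in the documents.

```javascript
// Hypothetical helper: split one fully embedded product document into
//   - a product document holding only the newest `limit` reviews, and
//   - one review document per review, each referencing the product.
function applySubsetPattern(product, limit = 10) {
  // Sort a copy newest-first so the input document is not mutated.
  const newestFirst = [...product.reviews].sort(
    (a, b) => b.published_date - a.published_date
  );
  const productDoc = { ...product, reviews: newestFirst.slice(0, limit) };
  const reviewDocs = newestFirst.map((review) => ({
    ...review,
    product_id: product._id, // reference back to the product
  }));
  return { productDoc, reviewDocs };
}

// Placeholder data: twelve reviews published on consecutive days.
const product = {
  _id: 1,
  name: "Super Widget",
  reviews: Array.from({ length: 12 }, (_, i) => ({
    review_id: i + 1,
    review_author: `author-${i + 1}`,
    review_text: `Sample review ${i + 1}`,
    published_date: new Date(Date.UTC(2019, 0, i + 1)),
  })),
};

const { productDoc, reviewDocs } = applySubsetPattern(product);

console.log(productDoc.reviews.length, reviewDocs.length); // 10 12
console.log(productDoc.reviews[0].review_id); // 12
```

The product document keeps only what the product page shows by default, while the review documents remain the complete record.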

Tip

When considering where to split your data, the most frequently accessed portion of the data should go in the collection that the application loads first. In this example, the schema is split at ten reviews because that is the number of reviews visible in the application by default.

See also:

To learn how to use the subset pattern to model one-to-one relationships between collections, see Model One-to-One Relationships with Embedded Documents.

Trade-Offs of the Subset Pattern

Using smaller documents that contain the more frequently accessed data reduces the overall size of the working set. These smaller documents result in improved read performance for the data that the application accesses most frequently.

However, the subset pattern results in data duplication. In the example, reviews are maintained in both the product collection and the review collection. Extra steps must be taken to ensure that the reviews are consistent across both collections. For example, when a customer edits their review, the application may need to make two write operations: one to update the product collection and one to update the review collection.

You must also implement logic in your application to ensure that the reviews in the product collection are always the ten most recent reviews for that product.
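A minimal sketch of that application logic, again in plain JavaScript with in-memory stand-ins, might look as follows. `addReview`, `editReview`, and `LIMIT` are hypothetical names chosen for illustration. In an actual deployment, one option for trimming the embedded array is the `$push` update operator with its `$each`, `$sort`, and `$slice` modifiers, which this sketch does not use.

```javascript
const LIMIT = 10; // number of reviews kept in the product document

// Store a new review in both places and keep the embedded subset
// at the LIMIT most recent reviews, newest first.
function addReview(product, reviewCollection, review) {
  // Write 1: the review collection keeps every review.
  reviewCollection.push({ ...review, product_id: product._id });
  // Write 2: the product keeps only the newest LIMIT reviews.
  product.reviews = [{ ...review }, ...product.reviews]
    .sort((a, b) => b.published_date - a.published_date)
    .slice(0, LIMIT);
}

// Edit a review's text wherever a copy exists.
// Returns the number of simulated write operations performed.
function editReview(product, reviewCollection, reviewId, newText) {
  let writes = 0;
  const stored = reviewCollection.find(
    (r) => r.product_id === product._id && r.review_id === reviewId
  );
  if (!stored) return writes; // unknown review: nothing to update
  stored.review_text = newText;
  writes += 1;
  // The duplicate exists only if the review is among the newest LIMIT.
  const embedded = product.reviews.find((r) => r.review_id === reviewId);
  if (embedded) {
    embedded.review_text = newText;
    writes += 1;
  }
  return writes;
}

// Placeholder data: twelve reviews published on consecutive days.
const product = { _id: 1, name: "Super Widget", reviews: [] };
const reviewCollection = [];
for (let i = 1; i <= 12; i += 1) {
  addReview(product, reviewCollection, {
    review_id: i,
    review_author: `author-${i}`,
    review_text: `Sample review ${i}`,
    published_date: new Date(Date.UTC(2019, 0, i)),
  });
}

const writesRecent = editReview(product, reviewCollection, 12, "Edited");
const writesOld = editReview(product, reviewCollection, 1, "Edited");

console.log(product.reviews.length, reviewCollection.length); // 10 12
console.log(writesRecent, writesOld); // 2 1
```

Editing a review that is still in the embedded subset costs two writes, while editing an older review touches only the review collection.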

Other Sample Use Cases

In addition to product reviews, the subset pattern can also be a good fit for storing:

  • Comments on a blog post, when you only want to show the most recent or highest-rated comments by default.
  • Cast members in a movie, when you only want to show cast members with the largest roles by default.