
Group Data with the Attribute Pattern

The attribute pattern is a schema design pattern that helps organize documents with many similar fields, especially when the fields share common characteristics. If you need to sort or query on these subsets of similar fields, the attribute pattern can optimize your schema. It simplifies document indexing by consolidating multiple similar fields per document into a key-value sub-document. Rather than creating multiple indexes on several similar fields, the attribute pattern lets you create fewer indexes, making your queries faster and simpler to write.

Use the attribute pattern if any of the following conditions apply to your collection:

About this Task

Consider a collection of movies. Typical documents in the collection may look like this:

db.movies.insertOne(
   {
      "_id": 1,
      "title": "Star Wars",
      "runtime": 121,
      "directors": ["George Lucas"],
      release_US: ISODate("1977-05-20T01:00:00+01:00"),
      release_France: ISODate("1977-10-19T01:00:00+01:00"),
      release_Italy: ISODate("1977-10-20T01:00:00+01:00"),
      release_UK: ISODate("1977-12-27T01:00:00+01:00")
   }
)

Note the multiple release date fields for different countries in the above document. If you want to search for a release date, you must look across many fields at once. Without the attribute pattern, you would need to create several indexes on the movies collection to quickly perform searches on release dates:

db.movies.createIndex({ release_US: 1 });
db.movies.createIndex({ release_France: 1 });
db.movies.createIndex({ release_Italy: 1 });
db.movies.createIndex({ release_UK: 1 });
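
The query side of the flat layout is just as repetitive: a search for any release on or after a given date has to name each country field. The sketch below is a minimal illustration in plain JavaScript; the helper name flatReleaseFilter and the RELEASE_FIELDS list are assumptions made for this example, and new Date is used in place of ISODate so the snippet also runs outside mongosh.

```javascript
// Assumed list of the flat release fields from the example document.
const RELEASE_FIELDS = ["release_US", "release_France", "release_Italy", "release_UK"];

// Hypothetical helper: build a filter that matches movies released on or
// after `since` in at least one of the listed countries.
function flatReleaseFilter(since) {
  // One $or branch per country field.
  return { $or: RELEASE_FIELDS.map((field) => ({ [field]: { $gte: since } })) };
}

const flatFilter = flatReleaseFilter(new Date("1977-10-01T00:00:00Z"));
console.log(JSON.stringify(flatFilter, null, 2));
```

In mongosh, the returned object could be passed to db.movies.find(). Each $or branch relies on its own single-field index, so every new country adds both a branch and an index.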

However, indexes are expensive and can slow down performance, particularly for write operations. The following procedure demonstrates how to apply the attribute pattern to the movies collection by moving the subset of release date information into an array, reducing indexing needs.

Steps

1. Group subsets of data into one array.

Reorganize the schema to turn the various release date fields into an array of key-value pairs:

db.movies.insertOne(
   {
      "_id": 1,
      "title": "Star Wars",
      "runtime": 121,
      "directors": ["George Lucas"],
      releases: [
         {
            location: "USA",
            date: ISODate("1977-05-20T01:00:00+01:00")
         },
         {
            location: "France",
            date: ISODate("1977-10-19T01:00:00+01:00")
         },
         {
            location: "Italy",
            date: ISODate("1977-10-20T01:00:00+01:00")
         },
         {
            location: "UK",
            date: ISODate("1977-12-27T01:00:00+01:00")
         }
      ]
   }
)
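
For documents that already exist in the flat layout, the same reorganization can be done in application code. The following is a minimal sketch, not a prescribed migration: toAttributePattern and LOCATION_LABELS are hypothetical names, and the label mapping (for example US to "USA") is an assumption based on the two example documents.

```javascript
// Assumed mapping from flat field suffixes to the location labels used above.
const LOCATION_LABELS = { US: "USA", France: "France", Italy: "Italy", UK: "UK" };

// Hypothetical helper: copy a flat movie document into the attribute-pattern shape.
function toAttributePattern(movie) {
  const result = {};
  const releases = [];
  for (const [field, value] of Object.entries(movie)) {
    if (field.startsWith("release_")) {
      // Turn release_<suffix> into one { location, date } array element.
      const suffix = field.slice("release_".length);
      releases.push({ location: LOCATION_LABELS[suffix] ?? suffix, date: value });
    } else {
      // All other fields are copied unchanged.
      result[field] = value;
    }
  }
  result.releases = releases;
  return result;
}

const flatMovie = {
  _id: 1,
  title: "Star Wars",
  release_US: new Date("1977-05-20T01:00:00+01:00"),
  release_France: new Date("1977-10-19T01:00:00+01:00"),
};
console.log(toAttributePattern(flatMovie));
```

The helper returns a new object and leaves its input untouched, so the converted document can be checked before it replaces the original.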

2. Create an index on the releases array.

You can then create one index on the releases array. This index optimizes queries on any of the release date fields for various countries.

db.movies.createIndex({ releases: 1 });
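
An index on the whole releases field indexes each array element as a complete embedded document, so it mainly serves equality matches on a full { location, date } pair. Queries that filter on one embedded field, such as a date range for a single country, are commonly served by an index on the dotted field paths instead. The sketch below shows that variant under stated assumptions: releaseIndexSpec and releaseFilter are hypothetical names, and the compound index is an alternative to, not a restatement of, the index above.

```javascript
// Assumed alternative: a compound multikey index on the embedded fields.
const releaseIndexSpec = { "releases.location": 1, "releases.date": 1 };

// Hypothetical helper: match movies released in `location` within [from, to).
function releaseFilter(location, from, to) {
  // $elemMatch requires a single array element to satisfy every condition,
  // so the location and the date range must belong to the same release.
  return {
    releases: { $elemMatch: { location, date: { $gte: from, $lt: to } } },
  };
}

const frenchReleasesIn1977 = releaseFilter(
  "France",
  new Date("1977-01-01T00:00:00Z"),
  new Date("1978-01-01T00:00:00Z")
);
console.log(JSON.stringify(frenchReleasesIn1977, null, 2));
```

In mongosh, these objects could be used as db.movies.createIndex(releaseIndexSpec) and db.movies.find(frenchReleasesIn1977). Either way, one index covers every country, which is the point of the pattern.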

Results

If a document has multiple fields that track the same or similar characteristics, the attribute pattern avoids the need to create indexes on each similar field. By consolidating similar fields into an array and creating an index on that array, you reduce the total number of required indexes and improve query performance.

Other Use Cases

The attribute pattern can be helpful when your documents describe the characteristics of items. Some products, such as clothing, may have sizes expressed as small, medium, or large. Other products in the same collection may have sizes expressed in volume, while others may be expressed in physical dimensions or weight.

For example, consider a collection of water bottles. A document that does not use the attribute pattern may look like this:

db.bottles.insertOne(
   {
      "_id": 1,
      "volume_ml": 500,
      "volume_ounces": 12
   }
)

The following code applies the attribute pattern to the bottles collection:

db.bottles.insertOne(
   {
      "_id": 1,
      specs: [
         { k: "volume", v: "500", u: "ml" },
         { k: "volume", v: "12", u: "ounces" }
      ]
   }
)

Since the volume_ml and volume_ounces fields in the first document contain similar information, the schema above consolidates them into one field, specs. The specs field groups together the measurement specifications of a given water bottle, where the k field specifies what is being measured, v specifies the value, and u specifies the unit of measurement.
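
Reading a value back from this shape means finding the array element whose k and u match, rather than accessing a named field. A minimal sketch of that lookup in application code follows; findSpec is a hypothetical helper name, and the sample document mirrors the one above.

```javascript
// Hypothetical helper: return the value of the spec with the given key and
// unit, or undefined when the document has no such entry.
function findSpec(doc, k, u) {
  const match = (doc.specs ?? []).find((spec) => spec.k === k && spec.u === u);
  return match ? match.v : undefined;
}

const bottle = {
  _id: 1,
  specs: [
    { k: "volume", v: "500", u: "ml" },
    { k: "volume", v: "12", u: "ounces" },
  ],
};

console.log(findSpec(bottle, "volume", "ml")); // prints 500
console.log(findSpec(bottle, "height", "inches")); // prints undefined
```

Because every measurement has the same three-part shape, one lookup routine covers all of them, including measurements added later.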

The attribute pattern also allows you to group together similar fields with different names. By describing each attribute with key-value pairs, such as the k field that names what is being measured, you can store a wider variety of similar fields in one array, minimizing the number of indexes you need to query your data efficiently.

For example, consider this document in the bottles collection that does not use the attribute pattern. This document stores specifications on water bottle volume and height:

db.bottles.insertOne(
   {
      "_id": 1,
      "volume_ml": 500,
      "volume_ounces": 12,
      "height_inches": 8
   }
)

The following code applies the attribute pattern to the document. It groups the volume_ml, volume_ounces, and height_inches fields all into the specs array:

db.bottles.insertOne(
   {
      "_id": 1,
      specs: [
         { k: "volume", v: "500", u: "ml" },
         { k: "volume", v: "12", u: "ounces" },
         { k: "height", v: "8", u: "inches" }
      ]
   }
)

Using key-value pairs, like k, v, and u, allows for more flexibility in what fields you can add to the array. The more fields you can consolidate into the array, the fewer indexes you need to create, which improves query performance.
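
The flat bottle document can be converted to the k, v, u shape mechanically when field names follow a <measurement>_<unit> convention. The sketch below assumes that convention holds for every field except _id; toSpecs is a hypothetical helper name, and values are written as strings only to match the documents shown above.

```javascript
// Hypothetical helper: split fields named <measurement>_<unit> into
// { k, v, u } entries and collect them in a specs array.
function toSpecs(doc) {
  const result = {};
  const specs = [];
  for (const [field, value] of Object.entries(doc)) {
    const split = field.lastIndexOf("_");
    if (split <= 0) {
      // No underscore, or a leading one as in _id: copy the field unchanged.
      result[field] = value;
    } else {
      specs.push({
        k: field.slice(0, split), // what is being measured
        v: String(value), // stored as a string, as in the documents above
        u: field.slice(split + 1), // unit of measurement
      });
    }
  }
  result.specs = specs;
  return result;
}

const flatBottle = { _id: 1, volume_ml: 500, volume_ounces: 12, height_inches: 8 };
console.log(JSON.stringify(toSpecs(flatBottle), null, 2));
```

A new measurement such as a hypothetical weight_grams field would flow into the same array with no schema or index change, which is the flexibility described above.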