In sharded clusters, you can create zones of sharded data based on the shard key. You can associate each zone with one or more shards in the cluster. A shard can associate with any number of zones. In a balanced cluster, MongoDB migrates chunks covered by a zone only to those shards associated with the zone.
Changed in version 4.0.3.
This tutorial uses zones to route documents based on creation date either to shards zoned for supporting recent documents, or to those zoned for supporting archived documents.
The following are some example use cases for segmenting data based on Service Level Agreement (SLA) or Service Level Objective (SLO):
The following diagram illustrates a sharded cluster that uses hardware-based zones to satisfy data access SLAs or SLOs.
A photo sharing application requires fast access to photos uploaded within the last 6 months. The application stores the location of each photo along with its metadata in the photoshare database under the data collection.
The following documents represent photos uploaded by a single user:
{ "_id" : 10003010, "creation_date" : ISODate("2012-12-19T06:01:17.171Z"), "userid" : 123, "photo_location" : "example.net/storage/usr/photo_1.jpg" }
{ "_id" : 10003011, "creation_date" : ISODate("2013-12-19T06:01:17.171Z"), "userid" : 123, "photo_location" : "example.net/storage/usr/photo_2.jpg" }
{ "_id" : 10003012, "creation_date" : ISODate("2016-01-19T06:01:17.171Z"), "userid" : 123, "photo_location" : "example.net/storage/usr/photo_3.jpg" }
Note that only the document with _id : 10003012 was uploaded within the past year (as of June 2016).
The photo collection uses the { creation_date : 1 } index as the shard key.
The creation_date field in each document allows for creating zones on the creation date.
The sharded cluster deployment currently consists of three shards.
The application requires adding each shard to a zone based on its hardware tier. Each hardware tier represents a specific hardware configuration designed to satisfy a given SLA or SLO.
The machines in the recent tier are the fastest performing, with large amounts of RAM, fast SSD disks, and powerful CPUs.
The zone requires a range with:
a lower bound of { creation_date : ISODate(YYYY-mm-dd) }, where the year, month, and date specified by YYYY-mm-dd are within the last 6 months.
an upper bound of { creation_date : MaxKey }.
The machines in the archive tier use less RAM, slower disks, and more basic CPUs. However, they have a greater amount of storage per server.
The zone requires a range with:
a lower bound of { creation_date : MinKey }.
an upper bound of { creation_date : ISODate(YYYY-mm-dd) }, where the year, month, and date match the values used for the recent tier's lower bound.
As performance needs increase, adding shards and associating them with the appropriate zone based on their hardware tier allows the cluster to scale horizontally.
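Both zone ranges hinge on a single boundary date: it is the lower bound of the recent zone and the upper bound of the archive zone. The following is a minimal sketch of how such a boundary could be computed. The helper name zoneBoundary and the choice to align the boundary to the first day of a month (UTC) are assumptions made for illustration and do not come from this tutorial.

```javascript
// Hypothetical helper: returns the first day (UTC) of the month that lies
// `monthsBack` months before the month containing `now`.
function zoneBoundary(now, monthsBack) {
  // Date.UTC normalizes negative month indexes into the previous year,
  // so no manual year arithmetic is needed.
  return new Date(Date.UTC(now.getUTCFullYear(), now.getUTCMonth() - monthsBack, 1));
}

// For a run in mid-June 2016, going back 5 months gives the 2016-01-01
// boundary that the procedure in this tutorial uses.
const boundary = zoneBoundary(new Date("2016-06-15T00:00:00Z"), 5);
console.log(boundary.toISOString()); // 2016-01-01T00:00:00.000Z
```

The resulting date would be passed as the ISODate value in both range definitions so that the two ranges meet without a gap or an overlap.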
When defining zone ranges based on time spans, weigh the benefits of infrequent updates to the zone ranges against the amount of data that must be migrated on an update. For example, setting a limit of 1 year for data to be considered 'recent' likely covers more data than setting a limit of 1 month. While rotating on a 1 month scale requires more migrations, the number of documents that must be migrated per rotation is lower than when rotating on a 1 year scale.
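As a rough worked example of this trade-off, assume photos arrive at a constant rate and that each rotation moves the boundary forward by exactly one rotation interval. Under those simplifying assumptions, the documents that change zones in one rotation are the ones created during one interval. The upload rate below is a made-up figure, and docsPerRotation is a hypothetical helper.

```javascript
// Documents that cross the recent/archive boundary in a single rotation,
// assuming a constant upload rate (an illustrative simplification).
function docsPerRotation(uploadsPerDay, rotationIntervalDays) {
  return uploadsPerDay * rotationIntervalDays;
}

const uploadsPerDay = 1000; // hypothetical workload

console.log(docsPerRotation(uploadsPerDay, 30));  // 30000 per monthly rotation
console.log(docsPerRotation(uploadsPerDay, 365)); // 365000 per yearly rotation
```

Over a full year the totals are similar, but the monthly schedule spreads the migration work over about twelve smaller updates instead of one large one.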
With zones, if an inserted or updated document matches a configured zone, it can only be written to a shard inside that zone.
MongoDB can write documents that do not match a configured zone to any shard in the cluster.
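The matching rule can be illustrated with a small sketch that looks up the zone for a given creation_date. It uses the 2016-01-01 boundary that the procedure in this tutorial configures. The function zoneFor is hypothetical, and null is used as a stand-in for MinKey and MaxKey because plain JavaScript has no such values. This only models the range check, not how MongoDB implements routing.

```javascript
// Zone ranges from this tutorial; null stands in for MinKey / MaxKey
// (an assumption of this sketch).
const zoneRanges = [
  { zone: "archive", min: null, max: new Date("2016-01-01T00:00:00Z") },
  { zone: "recent", min: new Date("2016-01-01T00:00:00Z"), max: null },
];

// Returns the name of the zone whose range covers the shard key value,
// or null when no configured zone matches.
// Lower bounds are inclusive and upper bounds are exclusive.
function zoneFor(creationDate, ranges) {
  for (const r of ranges) {
    const atOrAboveMin = r.min === null || creationDate >= r.min;
    const belowMax = r.max === null || creationDate < r.max;
    if (atOrAboveMin && belowMax) return r.zone;
  }
  return null;
}

console.log(zoneFor(new Date("2012-12-19T06:01:17.171Z"), zoneRanges)); // archive
console.log(zoneFor(new Date("2016-01-19T06:01:17.171Z"), zoneRanges)); // recent
```

A null result corresponds to the second case above: the document matches no configured zone and may be written to any shard.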
MongoDB can route queries to a specific shard if the query includes the shard key.
For example, MongoDB can attempt a targeted read operation on the following query because it includes creation_date in the query document:
photoDB = db.getSiblingDB("photoshare")
photoDB.data.find( { "creation_date" : ISODate("2015-01-01") } )
If the requested document falls within the recent zone range, MongoDB would route this query to the shards inside that zone, ensuring a faster read compared to a cluster-wide broadcast read operation.
The balancer migrates chunks to the appropriate shard respecting any configured zones. Until the migration, shards may contain chunks that violate configured zones. Once balancing completes, shards should only contain chunks whose ranges do not violate their assigned zones.
Adding or removing zones or zone ranges can result in chunk migrations. Depending on the size of your data set and the number of chunks a zone or zone range affects, these migrations may impact cluster performance. Consider running your balancer during specific scheduled windows. See Schedule the Balancing Window for a tutorial on how to set a scheduling window.
For sharded clusters running with Role-Based Access Control, authenticate as a user with at least the clusterManager role on the admin database.
You must be connected to a mongos to create zones or zone ranges. You cannot create zones or zone ranges by connecting directly to a shard.
The balancer must be disabled on the collection to ensure no migrations take place while configuring the new zones.
Use sh.disableBalancing(), specifying the namespace of the collection, to stop the balancer.
sh.disableBalancing("photoshare.data")
Use sh.isBalancerRunning() to check if the balancer process is currently running. Wait until any current balancing rounds have completed before proceeding.
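The waiting step can be sketched as a simple polling loop. The function waitForBalancerIdle and its parameters are hypothetical. The check and the pause are passed in as functions so that the sketch runs outside the shell and makes no assumption about the exact return value of sh.isBalancerRunning().

```javascript
// Polls `isRunning` until it reports false or `maxAttempts` checks are used up.
// `isRunning` and `pause` are injected; both are assumptions of this sketch.
function waitForBalancerIdle(isRunning, pause, maxAttempts) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (!isRunning()) return true; // no balancing round in progress
    pause();                       // wait before checking again
  }
  return false; // still running; the caller decides how to proceed
}

// Stand-in check that reports "running" twice and then "idle".
const states = [true, true, false];
console.log(waitForBalancerIdle(() => states.shift(), () => {}, 10)); // true
```

In the shell, isRunning could wrap sh.isBalancerRunning() and pause could wrap a sleep call. Depending on the shell version, sh.isBalancerRunning() may return a document rather than a boolean, in which case the wrapper needs to read the relevant field.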
Add shard0000 to the recent zone.
sh.addShardTag("shard0000", "recent")
Add shard0001 to the recent zone.
sh.addShardTag("shard0001", "recent")
Add shard0002 to the archive zone.
sh.addShardTag("shard0002", "archive")
You can review the zone assigned to any given shard by running sh.status().
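When the shard-to-zone assignment is kept as data, the three calls above can be driven from a single mapping. This is only a sketch: assignShardsToZones is a hypothetical helper, and the sh object is passed in as a parameter so the function can be tried with a stand-in before it is run against a cluster.

```javascript
// Applies a shard-to-zone mapping with the sh.addShardTag() helper used above.
// `sh` is injected so the sketch can be exercised without a cluster.
function assignShardsToZones(sh, shardZones) {
  for (const [shard, zone] of Object.entries(shardZones)) {
    sh.addShardTag(shard, zone);
  }
}

// Mapping used in this tutorial.
const shardZones = { shard0000: "recent", shard0001: "recent", shard0002: "archive" };

// Stand-in for the shell's sh object that only prints each call.
const printingSh = { addShardTag: (shard, zone) => console.log(shard, "->", zone) };
assignShardsToZones(printingSh, shardZones);
```

Keeping the mapping in one place makes it easier to review which hardware tier each shard belongs to when new shards are added.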
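Define the range for recent photos and associate it with the recent zone using the sh.addTagRange() method. This method requires the full namespace of the target collection, the inclusive lower bound of the range, the exclusive upper bound of the range, and the name of the zone: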
sh.addTagRange( "photoshare.data", { "creation_date" : ISODate("2016-01-01") }, { "creation_date" : MaxKey }, "recent" )
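Define the range for older photos and associate it with the archive zone using the sh.addTagRange() method. This method requires the full namespace of the target collection, the inclusive lower bound of the range, the exclusive upper bound of the range, and the name of the zone: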
sh.addTagRange( "photoshare.data", { "creation_date" : MinKey }, { "creation_date" : ISODate("2016-01-01") }, "archive" )
MinKey and MaxKey are reserved special values for comparisons.
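Zone ranges on a collection must not overlap, which is why the archive range ends exactly where the recent range begins. A quick way to sanity-check a set of planned ranges before applying them is sketched below. The helper names are hypothetical, and null again stands in for MinKey and MaxKey.

```javascript
// True when `min` sorts strictly before `max`. A null `min` is treated as
// MinKey and a null `max` as MaxKey (stand-ins used only in this sketch).
function sortsBefore(min, max) {
  return min === null || max === null || min < max;
}

// Two half-open ranges [min, max) overlap when each one starts before the
// other one ends.
function rangesOverlap(a, b) {
  return sortsBefore(a.min, b.max) && sortsBefore(b.min, a.max);
}

const boundaryDate = new Date("2016-01-01T00:00:00Z");
const archiveRange = { min: null, max: boundaryDate };
const recentRange = { min: boundaryDate, max: null };

console.log(rangesOverlap(archiveRange, recentRange)); // false
```

Because the upper bound is exclusive and the lower bound is inclusive, sharing the same boundary value does not count as an overlap.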
Re-enable the balancer to rebalance the cluster.
Use sh.enableBalancing(), specifying the namespace of the collection, to start the balancer.
sh.enableBalancing("photoshare.data")
Use sh.isBalancerRunning() to check if the balancer process is currently running.
The next time the balancer runs, it splits and migrates chunks across the shards respecting configured zones.
Once balancing finishes, the shards in the recent zone should only contain documents with creation_date greater than or equal to ISODate("2016-01-01"), while shards in the archive zone should only contain documents with creation_date less than ISODate("2016-01-01").
You can confirm the chunk distribution by running sh.status().
To update the zone ranges, perform the following operations as part of a cron job or other scheduled procedure:
The balancer must be disabled on the collection to ensure no migrations take place while configuring the new zones.
Use sh.disableBalancing(), specifying the namespace of the collection, to stop the balancer.
sh.disableBalancing("photoshare.data")
Use sh.isBalancerRunning() to check if the balancer process is currently running. Wait until any current balancing rounds have completed before proceeding.
Remove the old recent zone range using the sh.removeTagRange() method. This method requires the full namespace of the target collection, the inclusive lower bound of the range, the exclusive upper bound of the range, and the name of the zone:
sh.removeTagRange( "photoshare.data", { "creation_date" : ISODate("2016-01-01") }, { "creation_date" : MaxKey }, "recent" )
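Remove the old archive zone range using the sh.removeTagRange() method. This method requires the full namespace of the target collection, the inclusive lower bound of the range, the exclusive upper bound of the range, and the name of the zone: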
sh.removeTagRange( "photoshare.data", { "creation_date" : MinKey }, { "creation_date" : ISODate("2016-01-01") }, "archive" )
MinKey and MaxKey are reserved special values for comparisons.
Define the range for recent photos and associate it with the recent zone using the sh.addTagRange() method. This method requires the full namespace of the target collection, the inclusive lower bound of the range, the exclusive upper bound of the range, and the name of the zone:
sh.addTagRange( "photoshare.data", { "creation_date" : ISODate("2016-06-01") }, { "creation_date" : MaxKey }, "recent" )
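Define the range for older photos and associate it with the archive zone using the sh.addTagRange() method. This method requires the full namespace of the target collection, the inclusive lower bound of the range, the exclusive upper bound of the range, and the name of the zone: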
sh.addTagRange( "photoshare.data", { "creation_date" : MinKey }, { "creation_date" : ISODate("2016-06-01") }, "archive" )
MinKey and MaxKey are reserved special values for comparisons.
Re-enable the balancer to rebalance the cluster.
Use sh.enableBalancing(), specifying the namespace of the collection, to start the balancer.
sh.enableBalancing("photoshare.data")
Use sh.isBalancerRunning() to check if the balancer process is currently running.
The next time the balancer runs, it splits chunks where necessary and migrates chunks across the shards respecting the configured zones.
Before balancing, the shards in the recent zone only contained documents with creation_date greater than or equal to ISODate("2016-01-01"), while shards in the archive zone only contained documents with creation_date less than ISODate("2016-01-01").
Once balancing finishes, the shards in the recent zone should only contain documents with creation_date greater than or equal to ISODate("2016-06-01"), while shards in the archive zone should only contain documents with creation_date less than ISODate("2016-06-01").
You can confirm the chunk distribution by running sh.status().
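For a scheduled job, the update steps above could be collected into one function, as in the following sketch. The function rotateZoneRanges and its parameter names are hypothetical. The shell's sh object and the MinKey and MaxKey values are passed in as arguments so the sequence of calls can be checked with stand-ins before it is run against a real cluster. Waiting for in-progress balancing rounds is left out for brevity.

```javascript
// Sketch of the scheduled zone range update. It calls only the sh helpers
// that appear in this tutorial. `sh` and `keys` ({ MinKey, MaxKey }) are
// injected, which is an assumption of this sketch.
function rotateZoneRanges(sh, keys, ns, oldBoundary, newBoundary) {
  sh.disableBalancing(ns);

  // Remove the ranges that use the old boundary.
  sh.removeTagRange(ns, { creation_date: oldBoundary }, { creation_date: keys.MaxKey }, "recent");
  sh.removeTagRange(ns, { creation_date: keys.MinKey }, { creation_date: oldBoundary }, "archive");

  // Add the ranges that use the new boundary.
  sh.addTagRange(ns, { creation_date: newBoundary }, { creation_date: keys.MaxKey }, "recent");
  sh.addTagRange(ns, { creation_date: keys.MinKey }, { creation_date: newBoundary }, "archive");

  sh.enableBalancing(ns);
}

// Dry run with a stand-in that records the name of each helper it receives.
const callLog = [];
const recordingSh = {};
for (const name of ["disableBalancing", "removeTagRange", "addTagRange", "enableBalancing"]) {
  recordingSh[name] = (...args) => callLog.push(name);
}
rotateZoneRanges(recordingSh, { MinKey: "MinKey", MaxKey: "MaxKey" }, "photoshare.data",
  new Date("2016-01-01T00:00:00Z"), new Date("2016-06-01T00:00:00Z"));
console.log(callLog.join(" > "));
```

In the shell, the real sh object, the shell's MinKey and MaxKey values, and ISODate boundaries would be passed instead of the stand-ins. Both old ranges are removed before either new range is added so that the new ranges never overlap the old ones.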