Tiered Hardware for Varying SLA or SLO
In sharded clusters, you can create zones of sharded data based on the shard key. You can associate each zone with one or more shards in the cluster. A shard can associate with any number of zones. In a balanced cluster, MongoDB migrates chunks covered by a zone only to those shards associated with the zone.
If you define the zones and zone ranges before sharding an empty or non-existing collection, the shard collection operation creates chunks for the defined zone ranges, plus any additional chunks needed to cover the entire range of shard key values, and performs an initial chunk distribution based on the zone ranges. This initial creation and distribution of chunks allows for faster setup of zoned sharding. After the initial distribution, the balancer manages the chunk distribution going forward.
See Pre-Define Zones and Zone Ranges for an Empty or Non-Existing Collection for an example.
This tutorial uses zones to route documents by creation date, either to shards zoned for recent documents or to shards zoned for archived documents.
The following are some example use cases for segmenting data based on Service Level Agreement (SLA) or Service Level Objective (SLO):
- An application requires providing low-latency access to recently inserted or updated documents.
- An application requires prioritizing low-latency access to a range or subset of documents.
- An application benefits from ensuring specific ranges or subsets of data are stored on servers with hardware that suits the SLAs for accessing that data.
The following diagram illustrates a sharded cluster that uses hardware-based zones to satisfy data access SLAs or SLOs.
Scenario
A photo sharing application requires fast access to photos uploaded within the last 6 months. The application stores the location of each photo along with its metadata in the photoshare database under the data collection.
The following documents represent photos uploaded by a single user:
{
"_id" : 10003010,
"creation_date" : ISODate("2012-12-19T06:01:17.171Z"),
"userid" : 123,
"photo_location" : "example.net/storage/usr/photo_1.jpg"
}
{
"_id" : 10003011,
"creation_date" : ISODate("2013-12-19T06:01:17.171Z"),
"userid" : 123,
"photo_location" : "example.net/storage/usr/photo_2.jpg"
}
{
"_id" : 10003012,
"creation_date" : ISODate("2016-01-19T06:01:17.171Z"),
"userid" : 123,
"photo_location" : "example.net/storage/usr/photo_3.jpg"
}
Note that only the document with _id : 10003012 was uploaded within the past year (as of June 2016).
Shard Key
The photo collection uses the { creation_date : 1 } index as the shard key.
The creation_date field in each document allows for creating zones on the creation date.
Architecture
The sharded cluster deployment currently consists of three shards.
Zones
The application requires adding each shard to a zone based on its hardware tier. Each hardware tier represents a specific hardware configuration designed to satisfy a given SLA or SLO.
- Fast Tier ("recent")
  These are the fastest performing machines, with large amounts of RAM, fast SSD disks, and powerful CPUs.
  The zone requires a range with:
  - a lower bound of { creation_date : ISODate(YYYY-mm-dd) }, where the year, month, and day specified by YYYY-mm-dd are within the last 6 months.
  - an upper bound of { creation_date : MaxKey }.
- Archival Tier ("archive")
  These machines use less RAM, slower disks, and more basic CPUs. However, they have a greater amount of storage per server.
  The zone requires a range with:
  - a lower bound of { creation_date : MinKey }.
  - an upper bound of { creation_date : ISODate(YYYY-mm-dd) }, where the year, month, and day match the values used for the recent tier's lower bound.
As performance needs increase, adding shards and associating them with the appropriate zone based on their hardware tier allows the cluster to scale horizontally.
When defining zone ranges based on time spans, weigh the benefits of infrequent updates to the zone ranges against the amount of data that must be migrated on each update. For example, setting a limit of 1 year for data to be considered 'recent' likely covers more data than setting a limit of 1 month. Rotating on a 1 month scale requires more frequent migrations, but each rotation migrates fewer documents than rotating on a 1 year scale.
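As a rough sketch of how a scheduled job might derive the boundary date shared by the recent zone's lower bound and the archive zone's upper bound, the following JavaScript computes the first day of the month a given number of months before a reference date. The helper name zoneBoundary and the choice to snap the boundary to the first of the month in UTC are assumptions made for illustration; they are not part of this tutorial.

```javascript
// Hypothetical helper: returns midnight UTC on the first day of the month
// that lies `monthsBack` months before `now`.
function zoneBoundary(now, monthsBack) {
  // Date.UTC normalizes out-of-range month indexes, so a negative
  // month rolls back into the previous year.
  return new Date(Date.UTC(now.getUTCFullYear(), now.getUTCMonth() - monthsBack, 1));
}

const boundary = zoneBoundary(new Date("2016-06-15T00:00:00Z"), 6);
console.log(boundary.toISOString()); // 2015-12-01T00:00:00.000Z
```

Snapping to a month boundary keeps the rotation schedule predictable. A job that needs an exact rolling window would compute the boundary differently.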
Write Operations
With zones, if an inserted or updated document matches a configured zone, it can only be written to a shard inside that zone.
MongoDB can write documents that do not match a configured zone to any shard in the cluster.
The behavior described above requires the cluster to be in a steady state with no chunks violating a configured zone. See the following section on the balancer for more information.
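To make the matching rule concrete, the following plain JavaScript sketch models how a creation_date value maps to a zone when the lower bound of each range is inclusive and the upper bound is exclusive. It is an illustrative model only: MongoDB performs this routing internally, the function zoneFor and the ranges array are hypothetical, and MinKey and MaxKey are approximated here with -Infinity and Infinity.

```javascript
// Illustrative model only; not a MongoDB API.
// MinKey and MaxKey are approximated with -Infinity and +Infinity.
const ranges = [
  { zone: "archive", min: -Infinity, max: Date.parse("2016-01-01T00:00:00Z") },
  { zone: "recent", min: Date.parse("2016-01-01T00:00:00Z"), max: Infinity },
];

// Returns the zone whose range contains the shard key value, or null when
// no configured range covers it. Lower bounds are inclusive and upper
// bounds are exclusive.
function zoneFor(creationDate, zoneRanges) {
  const t = creationDate.getTime();
  const hit = zoneRanges.find((r) => t >= r.min && t < r.max);
  return hit ? hit.zone : null;
}

console.log(zoneFor(new Date("2012-12-19T06:01:17.171Z"), ranges)); // archive
console.log(zoneFor(new Date("2016-01-19T06:01:17.171Z"), ranges)); // recent
```

A null result corresponds to a document that matches no configured zone, which MongoDB can write to any shard in the cluster.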
Read Operations
MongoDB can route queries to a specific shard if the query includes the shard key.
For example, MongoDB can attempt a targeted read operation on the following query because it includes creation_date in the query document:
photoDB = db.getSiblingDB("photoshare")
photoDB.data.find( { "creation_date" : ISODate("2015-01-01") } )
If the requested document falls within the recent zone range, MongoDB routes this query to the shards inside that zone, which is faster than a cluster-wide broadcast read operation.
Balancer
The balancer migrates chunks to the appropriate shard, respecting any configured zones. Until the migration completes, shards may contain chunks that violate configured zones. Once balancing completes, each shard should contain only chunks whose ranges do not violate its assigned zones.
Adding or removing zones or zone ranges can result in chunk migrations. Depending on the size of your data set and the number of chunks a zone or zone range affects, these migrations may impact cluster performance. Consider running your balancer during specific scheduled windows. See Schedule the Balancing Window for a tutorial on how to set a scheduling window.
Security
For sharded clusters running with Role-Based Access Control, authenticate as a user with at least the clusterManager role on the admin database.
Procedure
You must be connected to a mongos to create zones or zone ranges. You cannot create zones or zone ranges by connecting directly to a shard.
Disable the Balancer
The balancer must be disabled on the collection to ensure no migrations take place while configuring the new zones.
Use sh.disableBalancing(), specifying the namespace of the collection, to stop the balancer:
sh.disableBalancing("photoshare.data")
Use sh.isBalancerRunning() to check if the balancer process is currently running. Wait until any current balancing rounds have completed before proceeding.
Add each shard to the appropriate zone
Add shard0000 to the recent zone.
sh.addShardTag("shard0000", "recent")
Add shard0001 to the recent zone.
sh.addShardTag("shard0001", "recent")
Add shard0002 to the archive zone.
sh.addShardTag("shard0002", "archive")
You can review the zone assigned to any given shard by running sh.status().
Define ranges for each zone
Define the range for recent photos and associate it with the recent zone using the sh.addTagRange() method. This method requires:
- the full namespace of the target collection.
- the inclusive lower bound of the range.
- the exclusive upper bound of the range.
- the zone.
sh.addTagRange(
"photoshare.data",
{ "creation_date" : ISODate("2016-01-01") },
{ "creation_date" : MaxKey },
"recent"
)
Define the range for older photos and associate it with the archive zone using the sh.addTagRange() method. This method requires:
- the full namespace of the target collection.
- the inclusive lower bound of the range.
- the exclusive upper bound of the range.
- the zone.
sh.addTagRange(
"photoshare.data",
{ "creation_date" : MinKey },
{ "creation_date" : ISODate("2016-01-01") },
"archive"
)
MinKey and MaxKey are reserved special values for comparisons.
Enable the Balancer
Re-enable the balancer to rebalance the cluster.
Use sh.enableBalancing(), specifying the namespace of the collection, to start the balancer:
sh.enableBalancing("photoshare.data")
Use sh.isBalancerRunning() to check if the balancer process is currently running.
Review the changes
The next time the balancer runs, it splits and migrates chunks across the shards respecting configured zones.
Once balancing finishes, the shards in the recent zone should only contain documents with creation_date greater than or equal to ISODate("2016-01-01"), while shards in the archive zone should only contain documents with creation_date less than ISODate("2016-01-01").
You can confirm the chunk distribution by running sh.status().
Updating Zone Ranges
To update the zone ranges, perform the following operations as part of a cron job or other scheduled procedure:
Disable the Balancer
The balancer must be disabled on the collection to ensure no migrations take place while configuring the new zones.
Use sh.disableBalancing(), specifying the namespace of the collection, to stop the balancer:
sh.disableBalancing("photoshare.data")
Use sh.isBalancerRunning() to check if the balancer process is currently running. Wait until any current balancing rounds have completed before proceeding.
Remove the old shard zone ranges
Remove the old recent zone range using the sh.removeTagRange() method. This method requires:
- the full namespace of the target collection.
- the inclusive lower bound of the range.
- the exclusive upper bound of the range.
- the zone.
sh.removeTagRange(
"photoshare.data",
{ "creation_date" : ISODate("2016-01-01") },
{ "creation_date" : MaxKey },
"recent"
)
Remove the old archive zone range using the sh.removeTagRange() method. This method requires:
- the full namespace of the target collection.
- the inclusive lower bound of the range.
- the exclusive upper bound of the range.
- the zone.
sh.removeTagRange(
"photoshare.data",
{ "creation_date" : MinKey },
{ "creation_date" : ISODate("2016-01-01") },
"archive"
)
MinKey and MaxKey are reserved special values for comparisons.
Add the new zone range for each zone
Define the new range for recent photos and associate it with the recent zone using the sh.addTagRange() method. This method requires:
- the full namespace of the target collection.
- the inclusive lower bound of the range.
- the exclusive upper bound of the range.
- the zone.
sh.addTagRange(
"photoshare.data",
{ "creation_date" : ISODate("2016-06-01") },
{ "creation_date" : MaxKey },
"recent"
)
Define the new range for older photos and associate it with the archive zone using the sh.addTagRange() method. This method requires:
- the full namespace of the target collection.
- the inclusive lower bound of the range.
- the exclusive upper bound of the range.
- the zone.
sh.addTagRange(
"photoshare.data",
{ "creation_date" : MinKey },
{ "creation_date" : ISODate("2016-06-01") },
"archive"
)
MinKey and MaxKey are reserved special values for comparisons.
Enable the Balancer
Re-enable the balancer to rebalance the cluster.
Use sh.enableBalancing(), specifying the namespace of the collection, to start the balancer:
sh.enableBalancing("photoshare.data")
Use sh.isBalancerRunning() to check if the balancer process is currently running.
Review the changes
The next time the balancer runs, it migrates data across the shards respecting the configured zones.
Before balancing, the shards in the recent zone only contained documents with creation_date greater than or equal to ISODate("2016-01-01"), while shards in the archive zone only contained documents with creation_date less than ISODate("2016-01-01").
Once balancing finishes, the shards in the recent zone should only contain documents with creation_date greater than or equal to ISODate("2016-06-01"), while shards in the archive zone should only contain documents with creation_date less than ISODate("2016-06-01").
You can confirm the chunk distribution by running sh.status().
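The update steps above could be collected into a single function for a scheduled job, roughly as sketched below. This is a minimal sketch under several assumptions: the function name rotateZoneRanges is hypothetical, the shell helper object is passed in as a parameter (here called shell) so the call sequence can be exercised with a recording stub instead of a live cluster, and the wait for in-progress balancing rounds and all error handling are omitted.

```javascript
// Sketch of the rotation steps. `shell` is expected to expose the same
// helpers as the shell's `sh` object; passing it in is an assumption
// made so the function can be exercised without a cluster.
function rotateZoneRanges(shell, ns, field, oldBoundary, newBoundary, minKey, maxKey) {
  // Builds a shard key bound such as { creation_date: <value> }.
  const key = (value) => ({ [field]: value });

  shell.disableBalancing(ns);
  // Waiting for in-progress balancing rounds to finish is omitted here.

  // Remove the old ranges, then add the new ones.
  shell.removeTagRange(ns, key(oldBoundary), key(maxKey), "recent");
  shell.removeTagRange(ns, key(minKey), key(oldBoundary), "archive");
  shell.addTagRange(ns, key(newBoundary), key(maxKey), "recent");
  shell.addTagRange(ns, key(minKey), key(newBoundary), "archive");

  shell.enableBalancing(ns);
}

// Recording stub: stores each helper call instead of contacting a cluster.
// Plain strings stand in for ISODate, MinKey, and MaxKey values.
const calls = [];
const stub = new Proxy({}, {
  get: (_, name) => (...args) => { calls.push([name, ...args]); },
});

rotateZoneRanges(stub, "photoshare.data", "creation_date",
  "2016-01-01", "2016-06-01", "MinKey", "MaxKey");
console.log(calls.map((c) => c[0]).join(", "));
```

When connected to a mongos, the same function would presumably be called with the real sh object, ISODate values for the two boundaries, and the shell's MinKey and MaxKey values in place of the strings used by the stub.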