MongoDB Tag Aware Sharding allows administrators to control data distribution in a sharded cluster by defining ranges of the shard key and tagging them to one or more shards.
This tutorial uses Zones along with a multi-datacenter sharded cluster deployment and application-side logic to support distributed local writes, as well as high write availability in the event of a replica set election or datacenter failure.
Changed in version 4.0.3.
The concepts discussed in this tutorial require a specific deployment architecture, as well as application-level logic.
These concepts require familiarity with MongoDB sharded clusters, replica sets, and the general behavior of zones.
This tutorial assumes an insert-only or insert-intensive workload. The concepts and strategies discussed in this tutorial are not well suited for use cases that require fast reads or updates.
Consider an insert-intensive application, where reads are infrequent and low priority compared to writes. The application writes documents to a sharded collection, and requires near-constant uptime from the database to support its SLAs or SLOs.
The following represents a partial view of the format of documents the application writes to the database:
{ "_id" : ObjectId("56f08c447fe58b2e96f595fa"), "message_id" : 329620, "datacenter" : "alfa", "userid" : 123, ... }
{ "_id" : ObjectId("56f08c447fe58b2e96f595fb"), "message_id" : 578494, "datacenter" : "bravo", "userid" : 456, ... }
{ "_id" : ObjectId("56f08c447fe58b2e96f595fc"), "message_id" : 689979, "datacenter" : "bravo", "userid" : 789, ... }
The collection uses the { datacenter : 1, userid : 1 } compound index as the shard key.
The datacenter field in each document allows for creating a tag range on each distinct datacenter value. Without the datacenter field, it would not be possible to associate a document with a specific datacenter.
The userid field provides a high cardinality and low frequency component to the shard key relative to datacenter.
See Choosing a Shard Key for more general instructions on selecting a shard key.
The deployment consists of two datacenters, alfa and bravo. There are two shards, shard0000 and shard0001. Each shard is a replica set with three members. shard0000 has two members on alfa and one priority 0 member on bravo. shard0001 has two members on bravo and one priority 0 member on alfa.
This application requires one tag per datacenter. Each shard has one tag assigned to it based on the datacenter containing the majority of its replica set members. There are two tag ranges, one for each datacenter.
alfa
Tag shards with a majority of members on this datacenter as alfa.
Create a tag range with:
- a lower bound of { "datacenter" : "alfa", "userid" : MinKey },
- an upper bound of { "datacenter" : "alfa", "userid" : MaxKey }, and
- the tag alfa
bravo
Tag shards with a majority of members on this datacenter as bravo.
Create a tag range with:
- a lower bound of { "datacenter" : "bravo", "userid" : MinKey },
- an upper bound of { "datacenter" : "bravo", "userid" : MaxKey }, and
- the tag bravo
Based on the configured tags and tag ranges, mongos routes documents with datacenter : alfa to the alfa datacenter, and documents with datacenter : bravo to the bravo datacenter.
If an inserted or updated document matches a configured tag range, it can only be written to a shard with the related tag.
MongoDB can write documents that do not match a configured tag range to any shard in the cluster.
The balancer migrates the tagged chunks to the appropriate shard. Until the migration completes, shards may contain chunks that violate configured tag ranges and tags. Once balancing completes, shards should only contain chunks whose ranges do not violate their assigned tags and tag ranges.
Adding or removing tags or tag ranges can result in chunk migrations. Depending on the size of your data set and the number of chunks a tag range affects, these migrations may impact cluster performance. Consider running your balancer during specific scheduled windows. See Schedule the Balancing Window for a tutorial on how to set a scheduling window.
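For reference, a balancing window is stored in the config database's settings collection. The following mongosh sketch follows the pattern from the linked tutorial; the start and stop times are placeholders you would adapt to your own off-peak hours:

```javascript
// Run against a mongos. Restricts balancer activity to an off-peak window.
db.getSiblingDB("config").settings.updateOne(
   { _id: "balancer" },
   { $set: { activeWindow: { start: "23:00", stop: "06:00" } } },
   { upsert: true }
)
```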
By default, the application writes to the nearest datacenter. If the local datacenter is down, or if writes to that datacenter are not acknowledged within a set time period, the application switches to the other available datacenter by changing the value of the datacenter field before attempting to write the document to the database.
The application supports write timeouts. The application uses Write Concern to set a timeout for each write operation.
If the application encounters a write or timeout error, it modifies the datacenter field in each document and performs the write. This routes the document to the other datacenter. If both datacenters are down, then writes cannot succeed. See Resolve Write Failure.
The application periodically checks connectivity to any datacenters marked as "down". If connectivity is restored, the application can continue performing normal write operations.
Given the switching logic, as well as any load balancers or similar mechanisms in place to handle client traffic between datacenters, the application cannot predict which of the two datacenters a given document was written to. To ensure that no documents are missed as a part of read operations, the application must perform broadcast queries by not including the datacenter field as a part of any query.
The application performs reads using a read preference of nearest to reduce latency.
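In the shell, such a read could look like the following; cursor.readPref() sets the read preference on a per-query basis, and the query shape here is illustrative:

```javascript
// Route the broadcast query to the lowest-latency available member.
db.collection.find( { "userid" : 123 } ).readPref("nearest")
```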
It is possible for a write operation to succeed despite a reported timeout error. The application responds to the error by attempting to re-write the document to the other datacenter; this can result in a document being duplicated across both datacenters. The application resolves duplicates as a part of the read logic.
The application has logic to switch datacenters if one or more writes fail, or if writes are not acknowledged within a set time period. The application modifies the datacenter field based on the target datacenter's tag to direct the document towards that datacenter.
For example, an application attempting to write to the alfa datacenter might follow this general procedure:
1. Attempt to write the document with datacenter : alfa.
2. On a write failure or timeout, log alfa as momentarily down.
3. Attempt to write the document with datacenter : bravo.
4. On a write failure or timeout, log bravo as momentarily down.
5. If both alfa and bravo are down, log and report errors.

You must be connected to a mongos associated with the target sharded cluster in order to proceed. You cannot create tags by connecting directly to a shard replica set member.
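The failover procedure above can be sketched as follows. This is application-side pseudologic, not a driver API: tryWrite is a hypothetical stand-in for an insert with a write concern timeout that returns true on an acknowledged write and false on an error or timeout.

```javascript
// Sketch only: datacenter failover for an insert-only workload.
// tryWrite is a stand-in for a driver insert with a write concern timeout.
function writeWithFailover(doc, datacenters, tryWrite) {
  const down = [];
  for (const dc of datacenters) {
    // Rewriting the shard-key field routes the document to the other zone.
    const candidate = { ...doc, datacenter: dc };
    if (tryWrite(candidate)) {
      return { ok: true, doc: candidate, down };
    }
    down.push(dc); // log this datacenter as momentarily down
  }
  return { ok: false, doc: null, down }; // both datacenters unavailable
}

// Simulate alfa being down: only writes routed to bravo succeed.
const result = writeWithFailover(
  { message_id: 329620, userid: 123 },
  ["alfa", "bravo"],
  (d) => d.datacenter === "bravo"
);
```

When both attempts fail, the caller is left with the list of downed datacenters to log and report, matching step 5 above.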
Tag each shard in the alfa datacenter with the alfa tag.
sh.addShardTag("shard0000", "alfa")
Tag each shard in the bravo datacenter with the bravo tag.
sh.addShardTag("shard0001", "bravo")
You can review the tags assigned to any given shard by running sh.status().
Define the range for the alfa datacenter and associate it to the alfa tag using the sh.addTagRange() method. This method requires:
- the full namespace of the target collection,
- the inclusive lower bound of the range,
- the exclusive upper bound of the range, and
- the name of the tag.
sh.addTagRange(
  "<database>.<collection>",
  { "datacenter" : "alfa", "userid" : MinKey },
  { "datacenter" : "alfa", "userid" : MaxKey },
  "alfa"
)
Define the range for the bravo datacenter and associate it to the bravo tag using the sh.addTagRange() method. This method requires:
- the full namespace of the target collection,
- the inclusive lower bound of the range,
- the exclusive upper bound of the range, and
- the name of the tag.
sh.addTagRange(
  "<database>.<collection>",
  { "datacenter" : "bravo", "userid" : MinKey },
  { "datacenter" : "bravo", "userid" : MaxKey },
  "bravo"
)
The MinKey and MaxKey values are reserved special values for comparisons. MinKey always compares as less than every other possible value, while MaxKey always compares as greater than every other possible value. The configured ranges capture every userid for each datacenter.
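As an illustration only (this is a toy comparator, not the BSON implementation), the following sketch shows why bounds built from these sentinels capture every possible userid for a datacenter:

```javascript
// Toy sentinels standing in for BSON MinKey/MaxKey.
const MIN_KEY = Symbol("MinKey");
const MAX_KEY = Symbol("MaxKey");

// Compare two userid values: MIN_KEY sorts before everything,
// MAX_KEY sorts after everything.
function compareUserid(a, b) {
  if (a === b) return 0;
  if (a === MIN_KEY || b === MAX_KEY) return -1;
  if (a === MAX_KEY || b === MIN_KEY) return 1;
  return a < b ? -1 : 1;
}

// Any concrete userid falls inside the half-open range [MIN_KEY, MAX_KEY).
function inRange(userid) {
  return compareUserid(MIN_KEY, userid) <= 0 && compareUserid(userid, MAX_KEY) < 0;
}
```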
The next time the balancer runs, it splits and migrates chunks across the shards respecting the tag ranges and tags.
Once balancing finishes, the shards tagged as alfa should only contain documents with datacenter : alfa, while shards tagged as bravo should only contain documents with datacenter : bravo.
You can review the chunk distribution by running sh.status().
When the application's default datacenter is down or inaccessible, the application changes the datacenter field to the other datacenter.
For example, the application attempts to write the following document to the alfa datacenter by default:
{ "_id" : ObjectId("56f08c447fe58b2e96f595fa"), "message_id" : 329620, "datacenter" : "alfa", "userid" : 123, ... }
If the application receives an error on the attempted write, or if the write acknowledgement takes too long, the application logs the datacenter as unavailable and alters the datacenter field to point to the bravo datacenter.
{ "_id" : ObjectId("56f08c457fe58b2e96f595fb"), "message_id" : 329620, "datacenter" : "bravo", "userid" : 123, ... }
The application periodically checks the alfa datacenter for connectivity. If the datacenter is reachable again, the application can resume normal writes.
It is possible that the original write to datacenter : alfa succeeded, especially if the error was related to a timeout. If so, the document with message_id : 329620 may now be duplicated across both datacenters. Applications must resolve duplicates as a part of read operations.
The application's switching logic allows for potential document duplication. When performing reads, the application resolves any duplicate documents on the application layer.
The following query searches for documents where the userid is 123. Note that while userid is part of the shard key, the query does not include the datacenter field, and therefore does not perform a targeted read operation.
db.collection.find( { "userid" : 123 } )
The results show that the document with a message_id of 329620 has been inserted into MongoDB twice, probably as a result of a delayed write acknowledgement.
{ "_id" : ObjectId("56f08c447fe58b2e96f595fa"), "message_id" : 329620, "datacenter" : "alfa", "userid" : 123, data : {...} }
{ "_id" : ObjectId("56f08c457fe58b2e96f595fb"), "message_id" : 329620, "datacenter" : "bravo", "userid" : 123, ... }
The application can either ignore the duplicates, taking one of the two documents, or it can attempt to trim the duplicates until only a single document remains.
One method for trimming duplicates is to use the ObjectId.getTimestamp() method to extract the timestamp from the _id field. The application can then keep either the first document inserted, or the last document inserted. This assumes the _id field uses the MongoDB ObjectId().
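A minimal sketch of this trimming pass, assuming each _id is a standard 24-character hex ObjectId string (the first 8 hex digits encode seconds since the Unix epoch); the helper names are illustrative, not a driver API:

```javascript
// Extract the creation time from an ObjectId hex string.
function objectIdTimestamp(idHex) {
  return new Date(parseInt(idHex.slice(0, 8), 16) * 1000);
}

// Keep only the earliest-inserted document per message_id.
function trimDuplicates(docs) {
  const earliest = new Map();
  for (const doc of docs) {
    const seen = earliest.get(doc.message_id);
    if (!seen || objectIdTimestamp(doc._id) < objectIdTimestamp(seen._id)) {
      earliest.set(doc.message_id, doc);
    }
  }
  return [...earliest.values()];
}

// The duplicated documents from the query above.
const docs = [
  { _id: "56f08c447fe58b2e96f595fa", message_id: 329620, datacenter: "alfa", userid: 123 },
  { _id: "56f08c457fe58b2e96f595fb", message_id: 329620, datacenter: "bravo", userid: 123 },
];
const unique = trimDuplicates(docs);
```

Keeping the last insert instead is a one-character change: flip the comparison in trimDuplicates.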
For example, using getTimestamp() on the document with ObjectId("56f08c447fe58b2e96f595fa") returns:
ISODate("2016-03-22T00:05:24Z")
Using getTimestamp() on the document with ObjectId("56f08c457fe58b2e96f595fb") returns:
ISODate("2016-03-22T00:05:25Z")
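Outside the shell, the same timestamps can be recovered directly from the first four bytes of each ObjectId, which encode seconds since the Unix epoch; a quick check:

```javascript
// The leading 8 hex digits of an ObjectId are seconds since the Unix epoch.
function objectIdTimestamp(idHex) {
  return new Date(parseInt(idHex.slice(0, 8), 16) * 1000);
}

const first = objectIdTimestamp("56f08c447fe58b2e96f595fa");
const second = objectIdTimestamp("56f08c457fe58b2e96f595fb");
// first  → 2016-03-22T00:05:24Z
// second → 2016-03-22T00:05:25Z (one second later)
```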