Sharded Cluster Balancer

The MongoDB balancer is a background process that monitors the number of chunks on each shard. When the number of chunks on a given shard reaches specific migration thresholds, the balancer attempts to automatically migrate chunks between shards and reach an equal number of chunks per shard.

The balancing procedure for sharded clusters is entirely transparent to the user and application layer, though there may be some performance impact while the procedure takes place.

Diagram of a collection distributed across three shards. For this collection, the difference in the number of chunks between the shards reaches the *migration thresholds* (in this case, 2) and triggers migration.

The balancer runs on the primary of the config server replica set (CSRS).

Cluster Balancer

The balancer process is responsible for redistributing the chunks of every sharded collection evenly among the shards. By default, the balancer process is always enabled.

To address uneven chunk distribution for a sharded collection, the balancer migrates chunks from shards with more chunks to shards with fewer chunks. The balancer migrates the chunks until there is an even distribution of chunks for the collection across the shards. For details about chunk migration, see Chunk Migration Procedure.

Chunk migrations can have an impact on disk space, as the source shard automatically archives the migrated documents by default. For details, see moveChunk directory.

Chunk migrations carry some overhead in terms of bandwidth and workload, both of which can impact database performance. [1] The balancer attempts to minimize the impact by:

  • Restricting a shard to at most one migration at any given time; that is, a shard cannot participate in multiple chunk migrations at the same time. To migrate multiple chunks from a shard, the balancer migrates the chunks one at a time.

    Changed in version 3.4.

    Starting in MongoDB 3.4, MongoDB can perform parallel chunk migrations. Observing the restriction that a shard can participate in at most one migration at a time, for a sharded cluster with n shards, MongoDB can perform at most n/2 (rounded down) simultaneous chunk migrations.

    See also Asynchronous Chunk Migration Cleanup.

  • Starting a balancing round only when the difference in the number of chunks between the shard with the greatest number of chunks for a sharded collection and the shard with the lowest number of chunks for that collection reaches the migration threshold.
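
The n/2 bound on parallel migrations follows from each migration occupying two shards, a source and a destination. A minimal Python sketch of that arithmetic; the function name is hypothetical and this is not MongoDB's own code:

```python
def max_parallel_migrations(num_shards: int) -> int:
    """Upper bound on simultaneous chunk migrations.

    Each migration ties up two shards (source and destination), and a
    shard may take part in only one migration at a time, so at most
    num_shards // 2 migrations can run at once.
    """
    if num_shards < 0:
        raise ValueError("num_shards must be non-negative")
    return num_shards // 2


if __name__ == "__main__":
    for n in (1, 2, 5, 8):
        print(f"{n} shards: at most {max_parallel_migrations(n)} parallel migrations")
```

A five-shard cluster can therefore run at most two migrations at once, with one shard left idle.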

You may disable the balancer temporarily for maintenance. See Disable the Balancer for details.

You can also limit the window during which the balancer runs to prevent it from impacting production traffic. See Schedule the Balancing Window for details.

Note

The specification of the balancing window is relative to the local time zone of the primary of the config server replica set.
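
To illustrate how a daily balancing window can be reasoned about, here is a rough Python sketch that checks whether a local time falls inside a window. The function name is hypothetical, and treating the start as inclusive and the stop as exclusive is an assumption, not documented MongoDB behavior. The times must be expressed in the local time zone of the config server primary, as the note above states.

```python
from datetime import time


def in_balancing_window(now: time, start: time, stop: time) -> bool:
    """Return True if `now` lies in the window [start, stop).

    A window whose stop is earlier than its start is treated as
    wrapping past midnight (for example 23:00 to 06:00).
    Boundary handling here is an assumption made for illustration.
    """
    if start <= stop:
        return start <= now < stop
    # Window wraps around midnight.
    return now >= start or now < stop


if __name__ == "__main__":
    window = (time(23, 0), time(6, 0))
    print(in_balancing_window(time(23, 30), *window))
    print(in_balancing_window(time(12, 0), *window))
```

The wrap-around branch matters in practice, since maintenance windows are often scheduled overnight.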

[1] Starting in MongoDB 4.0.3, the shard collection operation can perform an initial chunk creation and distribution for empty or non-existing collections if zones and zone ranges have been defined for the collection. Initial creation and distribution of chunks allows for faster setup of zoned sharding. After the initial distribution, the balancer manages the chunk distribution going forward as usual. Starting in version 4.4, MongoDB supports sharding collections on compound hashed indexes. When sharding an empty or non-existing collection using a compound hashed shard key, additional requirements apply in order for MongoDB to perform initial chunk creation and distribution. See Pre-Define Zones and Zone Ranges for an Empty or Non-Existing Collection for an example.

Adding and Removing Shards from the Cluster

Adding a shard to a cluster creates an imbalance, since the new shard has no chunks. While MongoDB begins migrating data to the new shard immediately, it can take some time before the cluster balances. See the Add Shards to a Cluster tutorial for instructions on adding a shard to a cluster.

Removing a shard from a cluster creates a similar imbalance, since chunks residing on that shard must be redistributed throughout the cluster. While MongoDB begins draining a removed shard immediately, it can take some time before the cluster balances. Do not shut down the servers associated with the removed shard during this process.

When you remove a shard in a cluster with an uneven chunk distribution, the balancer first removes the chunks from the draining shard and then balances the remaining uneven chunk distribution.

See the Remove Shards from an Existing Sharded Cluster tutorial for instructions on safely removing a shard from a cluster.

Chunk Migration Procedure

All chunk migrations use the following procedure:

  1. The balancer process sends the moveChunk command to the source shard.
  2. The source starts the move with an internal moveChunk command. During the migration process, operations to the chunk route to the source shard. The source shard is responsible for incoming write operations for the chunk.
  3. The destination shard builds any indexes required by the source that do not exist on the destination.
  4. The destination shard begins requesting documents in the chunk and starts receiving copies of the data. See also Chunk Migration and Replication.
  5. After receiving the final document in the chunk, the destination shard starts a synchronization process to ensure that it has the changes to the migrated documents that occurred during the migration.
  6. When fully synchronized, the source shard connects to the config database and updates the cluster metadata with the new location for the chunk.
  7. After the source shard completes the update of the metadata, and once there are no open cursors on the chunk, the source shard deletes its copy of the documents.

    Note

    If the balancer needs to perform additional chunk migrations from the source shard, the balancer can start the next chunk migration without waiting for the current migration process to finish this deletion step. See Asynchronous Chunk Migration Cleanup.

The migration process ensures consistency and maximizes the availability of chunks during balancing.
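
The ordering of steps 4 through 7 can be walked through with a toy in-memory model. This is only an illustrative Python sketch on plain dictionaries: the function, its parameters, and the shard names are hypothetical, steps 1 to 3 (command dispatch and index builds) are omitted, and none of this reflects MongoDB's actual implementation.

```python
def migrate_chunk(source, destination, metadata, chunk_id,
                  destination_name, writes_during_copy):
    """Toy model of a chunk migration between two dict-based 'shards'."""
    # Step 4: the destination receives copies of the chunk's documents.
    destination[chunk_id] = dict(source[chunk_id])

    # While the copy runs, writes for the chunk still go to the source.
    source[chunk_id].update(writes_during_copy)

    # Step 5: the destination synchronizes changes made during the copy.
    for doc_id, doc in source[chunk_id].items():
        if destination[chunk_id].get(doc_id) != doc:
            destination[chunk_id][doc_id] = doc

    # Step 6: the metadata is updated with the chunk's new location.
    metadata[chunk_id] = destination_name

    # Step 7: only after the metadata update does the source delete its copy.
    del source[chunk_id]


if __name__ == "__main__":
    shard_a = {"chunk1": {1: "a", 2: "b"}}
    shard_b = {}
    metadata = {"chunk1": "shardA"}
    migrate_chunk(shard_a, shard_b, metadata, "chunk1", "shardB",
                  writes_during_copy={2: "b2", 3: "c"})
    print(shard_b, metadata, shard_a)
```

The point of the ordering is that the source keeps a complete, authoritative copy until the metadata names the destination, which is why the chunk stays available throughout.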

Migration Thresholds

To minimize the impact of balancing on the cluster, the balancer only begins balancing after the distribution of chunks for a sharded collection has reached certain thresholds. The thresholds apply to the difference in number of chunks between the shard with the most chunks for the collection and the shard with the fewest chunks for that collection. The balancer has the following thresholds:

Number of Chunks    Migration Threshold
Fewer than 20       2
20-79               4
80 and greater      8

The balancer stops running on the target collection when the difference between the number of chunks on any two shards for that collection is less than two, or a chunk migration fails.
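
The threshold rule can be expressed as a short Python sketch. The function names are hypothetical, and two points are assumptions rather than statements from this page: that "Number of Chunks" means the total number of chunks in the collection, and that a difference equal to the threshold counts as reaching it.

```python
def migration_threshold(total_chunks: int) -> int:
    """Threshold from the table above, keyed on the collection's chunk count."""
    if total_chunks < 20:
        return 2
    if total_chunks < 80:
        return 4
    return 8


def needs_balancing(chunks_per_shard: dict) -> bool:
    """True when the gap between the fullest and emptiest shard
    reaches the migration threshold for the collection."""
    if not chunks_per_shard:
        return False
    counts = list(chunks_per_shard.values())
    gap = max(counts) - min(counts)
    return gap >= migration_threshold(sum(counts))


if __name__ == "__main__":
    print(needs_balancing({"shardA": 4, "shardB": 2, "shardC": 2}))
    print(needs_balancing({"shardA": 30, "shardB": 27}))
```

In the first call the collection has 8 chunks, so the threshold is 2 and a gap of 2 triggers balancing; in the second the collection has 57 chunks, so a gap of 3 stays below the threshold of 4.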

Asynchronous Chunk Migration Cleanup

To migrate multiple chunks from a shard, the balancer migrates the chunks one at a time. However, the balancer does not wait for the current migration's delete phase to complete before starting the next chunk migration. See Chunk Migration Procedure for the chunk migration process and the delete phase.

This queuing behavior allows shards to unload chunks more quickly in a heavily imbalanced cluster, such as when performing initial data loads without pre-splitting and when adding new shards.

This behavior also affects the moveChunk command, and migration scripts that use the moveChunk command may proceed more quickly.

In some cases, the delete phases may persist longer. Starting in MongoDB 4.4, chunk migrations are enhanced to be more resilient in the event of a failover during the delete phase. Orphaned documents are cleaned up even if a replica set's primary crashes or restarts during this phase.

The _waitForDelete option, available as a setting for the balancer as well as for the moveChunk command, can alter this behavior so that the delete phase of the current migration blocks the start of the next chunk migration. The _waitForDelete option is generally for internal testing purposes. For more information, see Wait for Delete.

Chunk Migration and Replication

Changed in version 3.4.

During chunk migration, the _secondaryThrottle value determines when the migration proceeds with the next document in the chunk.

In the config.settings collection:

  • If the _secondaryThrottle setting for the balancer is set to a write concern, each document move during chunk migration must receive the requested acknowledgement before proceeding with the next document.
  • If the _secondaryThrottle setting for the balancer is set to true, each document move during chunk migration must receive acknowledgement from at least one secondary before the migration proceeds with the next document in the chunk. This is equivalent to a write concern of { w: 2 }.
  • If the _secondaryThrottle setting is unset, the migration process does not wait for replication to a secondary and instead continues with the next document.

To update the _secondaryThrottle parameter for the balancer, see Secondary Throttle for an example.
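
The three cases above amount to a mapping from the setting's value to the acknowledgement each document move waits for. A small Python sketch of that mapping follows; the function name is hypothetical, and treating an explicit false the same as an unset value is an assumption, since this page only describes the write concern, true, and unset cases.

```python
from typing import Optional, Union


def effective_write_concern(
    secondary_throttle: Union[bool, dict, None]
) -> Optional[dict]:
    """Map a _secondaryThrottle value to the write concern each
    document move waits for, or None if the migration does not wait."""
    if isinstance(secondary_throttle, dict):
        # An explicit write concern document is used as given.
        return secondary_throttle
    if secondary_throttle is True:
        # true is equivalent to a write concern of { w: 2 }.
        return {"w": 2}
    # Unset (None). False is handled the same way here by assumption.
    return None


if __name__ == "__main__":
    print(effective_write_concern({"w": "majority"}))
    print(effective_write_concern(True))
    print(effective_write_concern(None))
```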

Independent of any _secondaryThrottle setting, certain phases of the chunk migration have the following replication policy:

  • MongoDB briefly pauses all application reads and writes to the collection being migrated, on the source shard, before updating the config servers with the new location for the chunk, and resumes the application reads and writes after the update. The chunk move requires all writes to be acknowledged by a majority of the members of the replica set both before and after committing the chunk move to config servers.
  • When an outgoing chunk migration finishes and cleanup occurs, all writes must be replicated to a majority of servers before further cleanup (from other outgoing migrations) or new incoming migrations can proceed.

To update the _secondaryThrottle setting in the config.settings collection, see Secondary Throttle for an example.

Maximum Number of Documents Per Chunk to Migrate

By default, MongoDB cannot move a chunk if the number of documents in the chunk is greater than 1.3 times the result of dividing the configured chunk size by the average document size. db.collection.stats() includes the avgObjSize field, which represents the average document size in the collection.
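
As a worked example of this limit, the following Python sketch computes the document-count ceiling from a chunk size and an average document size. The function names are hypothetical, and the 64 MiB chunk size and 1 KiB average document size used in the demonstration are example inputs, not values taken from this page.

```python
def max_docs_for_migration(chunk_size_bytes: int,
                           avg_obj_size_bytes: float) -> float:
    """1.3 times (configured chunk size / average document size)."""
    if avg_obj_size_bytes <= 0:
        raise ValueError("avg_obj_size_bytes must be positive")
    return 1.3 * chunk_size_bytes / avg_obj_size_bytes


def can_move_chunk(doc_count: int, chunk_size_bytes: int,
                   avg_obj_size_bytes: float) -> bool:
    """A chunk is movable unless its document count exceeds the limit."""
    return doc_count <= max_docs_for_migration(chunk_size_bytes,
                                               avg_obj_size_bytes)


if __name__ == "__main__":
    chunk_size = 64 * 1024 * 1024   # example chunk size (assumed input)
    avg_obj_size = 1024             # example avgObjSize in bytes
    print(max_docs_for_migration(chunk_size, avg_obj_size))
    print(can_move_chunk(80_000, chunk_size, avg_obj_size))
    print(can_move_chunk(90_000, chunk_size, avg_obj_size))
```

With these inputs the limit is roughly 85,000 documents, so a chunk holding 80,000 documents can move and one holding 90,000 cannot.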

For chunks that are too large to migrate, starting in MongoDB 4.4:

  • A new balancer setting, attemptToBalanceJumboChunks, allows the balancer to migrate chunks too large to move as long as the chunks are not labeled jumbo. See Balance Chunks that Exceed Size Limit for details.
  • The moveChunk command can specify a new option, forceJumbo, to allow for the migration of chunks that are too large to move. The chunks may or may not be labeled jumbo.

Range Deletion Performance Tuning

You can tune the performance impact of range deletions with the rangeDeleterBatchSize and rangeDeleterBatchDelayMS parameters. For example:

  • To limit the number of documents deleted per batch, you can set rangeDeleterBatchSize to a small value such as 32.
  • To add an additional delay between batch deletions, you can set rangeDeleterBatchDelayMS above the current default of 20 milliseconds.

Note

If there are ongoing read operations or open cursors on the collection targeted for deletes, range deletion processes may not proceed.
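
To get a feel for the trade-off between the two parameters, this rough Python sketch estimates how many batches a range deletion needs and how much pause time the delay adds. The function names are hypothetical, and the model assumes the delay is applied once between consecutive batches and ignores the time spent deleting; real behavior may differ.

```python
import math


def deletion_batches(num_docs: int, batch_size: int) -> int:
    """Number of batches needed to delete num_docs documents."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return math.ceil(num_docs / batch_size)


def total_pause_ms(num_docs: int, batch_size: int, batch_delay_ms: int) -> int:
    """Pause time added by the delay, assuming one pause between
    each pair of consecutive batches (a modeling assumption)."""
    batches = deletion_batches(num_docs, batch_size)
    return max(batches - 1, 0) * batch_delay_ms


if __name__ == "__main__":
    # 10,000 orphaned documents, batch size 32, delay 20 ms.
    print(deletion_batches(10_000, 32))
    print(total_pause_ms(10_000, 32, 20))
```

Smaller batches and longer delays spread the deletion load over more time, at the cost of orphaned documents lingering longer on the source shard.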

Shard Size

By default, MongoDB attempts to fill all available disk space with data on every shard as the data set grows. To ensure that the cluster always has the capacity to handle data growth, monitor disk usage as well as other performance metrics.

See the Change the Maximum Storage Size for a Given Shard tutorial for instructions on setting the maximum size for a shard.
