MongoDB uses the shard key associated with the collection to partition the data into chunks. A chunk consists of a subset of sharded data. Each chunk has an inclusive lower bound and an exclusive upper bound based on the shard key.
MongoDB splits chunks when they grow beyond the configured chunk size. Both inserts and updates can trigger a chunk split.
The smallest range a chunk can represent is a single unique shard key value. A chunk that only contains documents with a single shard key value cannot be split.
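As an illustration of the inclusive-lower, exclusive-upper rule, the following is a minimal sketch of how a shard key value maps to a chunk. The chunk list, the shard names, and the `findChunk` helper are hypothetical and are not MongoDB's internal implementation; `-Infinity` and `Infinity` stand in for the MinKey and MaxKey sentinels.

```javascript
// Hypothetical chunk metadata: each chunk covers [min, max) on a numeric shard key.
// -Infinity / Infinity stand in for MongoDB's MinKey / MaxKey sentinels.
const chunks = [
  { min: -Infinity, max: 100, shard: "shardA" },
  { min: 100, max: 200, shard: "shardB" },
  { min: 200, max: Infinity, shard: "shardA" },
];

// Return the chunk whose range contains the key:
// the lower bound is inclusive, the upper bound is exclusive.
function findChunk(chunkList, key) {
  return chunkList.find((c) => key >= c.min && key < c.max);
}

console.log(findChunk(chunks, 99).shard);  // shardA
console.log(findChunk(chunks, 100).shard); // shardB
```

The value 100 falls in the second chunk because it is that chunk's inclusive lower bound and the first chunk's exclusive upper bound.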
If you have zones and zone ranges defined for an empty or non-existing collection (available starting in MongoDB 4.0.3):
If you do not have zones and zone ranges defined for an empty or non-existing collection:
For hashed sharding:
Use the numInitialChunks option to specify a different number of initial chunks.
For ranged sharding:
The default chunk size in MongoDB is 128 megabytes. You can increase or reduce the chunk size. Consider the implications of changing the default chunk size:
Frequent migrations create overhead at the query routing (mongos) layer. For many deployments, it makes sense to avoid frequent and potentially spurious migrations at the expense of a slightly less evenly distributed data set.
Changing the chunk size affects when chunks split, but there are some limitations to its effects.
Splitting is a process that keeps chunks from growing too large. When a chunk grows beyond the specified chunk size, or if the number of documents in the chunk exceeds the Maximum Number of Documents Per Chunk to Migrate, MongoDB splits the chunk based on the shard key values the chunk represents. A chunk may be split into multiple chunks where necessary. Inserts and updates may trigger splits. Splits are an efficient metadata change. To create splits, MongoDB does not migrate any data or affect the shards.
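The idea that a split only rewrites range metadata can be sketched as follows. This is a simplified model under stated assumptions: the `splitChunk` helper, the numeric shard key, and the choice of the median distinct value as the split point are all hypothetical, and MongoDB's actual split-point selection may differ.

```javascript
// Split a chunk's range at the median distinct shard key value it holds.
// `keys` is a hypothetical list of the shard key values of the chunk's documents.
// Only range metadata is produced; no documents are moved and the shard is unchanged.
function splitChunk(chunk, keys) {
  const distinct = [...new Set(keys)].sort((a, b) => a - b);
  // A chunk holding a single shard key value cannot be split.
  if (distinct.length < 2) return null;
  const splitPoint = distinct[Math.floor(distinct.length / 2)];
  return [
    { min: chunk.min, max: splitPoint, shard: chunk.shard },
    { min: splitPoint, max: chunk.max, shard: chunk.shard },
  ];
}

const original = { min: 0, max: 100, shard: "shardA" };
console.log(splitChunk(original, [5, 5, 20, 60, 60, 90]));
// [ { min: 0, max: 60, shard: 'shardA' }, { min: 60, max: 100, shard: 'shardA' } ]
console.log(splitChunk(original, [7, 7, 7])); // null
```

Both halves stay on the same shard, which is why a split on its own can leave the chunk distribution uneven until the balancer acts.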
Splits may lead to an uneven distribution of the chunks for a collection across the shards. In such cases, the balancer redistributes chunks across shards. See Cluster Balancer for more details on balancing chunks across shards.
MongoDB migrates chunks in a sharded cluster to distribute the chunks of a sharded collection evenly among shards. Migrations may be either:
For more information on the sharded cluster balancer, see Sharded Cluster Balancer.
The balancer is a background process that manages chunk migrations. If the difference in the number of chunks between the largest and smallest shard exceeds the migration threshold, the balancer begins migrating chunks across the cluster to ensure an even distribution of data.
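The threshold check described above can be sketched roughly as follows. The `pickMigration` helper, the shard names, and the threshold value of 2 are illustrative assumptions; the real balancer applies additional rules (zones, balancing windows, and its own threshold values) that this sketch ignores.

```javascript
// chunkCounts: hypothetical map of shard name -> number of chunks for one collection.
// threshold: migration threshold; the value passed below is illustrative only.
// Returns the suggested source and destination shard, or null when balanced enough.
function pickMigration(chunkCounts, threshold) {
  const entries = Object.entries(chunkCounts).sort((a, b) => b[1] - a[1]);
  const [fromShard, most] = entries[0];
  const [toShard, fewest] = entries[entries.length - 1];
  // Migrate only when the gap between the largest and smallest shard exceeds the threshold.
  return most - fewest > threshold ? { from: fromShard, to: toShard } : null;
}

console.log(pickMigration({ shardA: 12, shardB: 4, shardC: 5 }, 2));
// { from: 'shardA', to: 'shardB' }
console.log(pickMigration({ shardA: 5, shardB: 4 }, 2)); // null
```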
You can manage certain aspects of the balancer. The balancer also respects any zones created as a part of configuring zones in a sharded cluster.
See Sharded Cluster Balancer for more information on the balancer.
In some cases, chunks can grow beyond the specified chunk size but cannot undergo a split. The most common scenario is when a chunk represents a single shard key value. Since the chunk cannot split, it continues to grow beyond the chunk size, becoming a jumbo chunk. These jumbo chunks can become a performance bottleneck as they continue to grow, especially if the shard key value occurs with high frequency.
Starting in MongoDB 5.0, you can reshard a collection by changing the collection's shard key.
Starting in MongoDB 4.4, MongoDB provides the refineCollectionShardKey command. Refining a collection's shard key allows for a more fine-grained data distribution and can address situations where the existing key's insufficient cardinality leads to jumbo chunks.
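To see why a refined key helps, the following sketch compares the number of distinct shard key values before and after adding a suffix field to the key. The `keyCardinality` helper, the sample documents, and the field names customerId and orderId are hypothetical; the sketch only counts distinct values and does not call refineCollectionShardKey.

```javascript
// Count the distinct shard key values produced by the given key fields
// across a set of sample documents.
function keyCardinality(docs, fields) {
  const seen = new Set(docs.map((d) => JSON.stringify(fields.map((f) => d[f]))));
  return seen.size;
}

// Hypothetical sample data: one customer accounts for most of the documents.
const sampleDocs = [
  { customerId: 1, orderId: 10 },
  { customerId: 1, orderId: 11 },
  { customerId: 1, orderId: 12 },
  { customerId: 2, orderId: 13 },
];

console.log(keyCardinality(sampleDocs, ["customerId"]));            // 2
console.log(keyCardinality(sampleDocs, ["customerId", "orderId"])); // 4
```

With only customerId as the key, every document for customer 1 shares one shard key value and must live in a single unsplittable chunk. With orderId added as a suffix, the same documents have distinct key values, so the chunk can be split.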
For more information, see:
moveChunk
In MongoDB 2.6 and MongoDB 3.0, sharding.archiveMovedChunks is enabled by default. All other MongoDB versions have this disabled by default. With sharding.archiveMovedChunks enabled, the source shard archives the documents in the migrated chunks in a directory named after the collection namespace under the moveChunk directory in the storage.dbPath.
If an error occurs during a migration, these files may be helpful in recovering documents affected during the migration.
Once the migration has completed successfully and there is no need to recover documents from these files, you may safely delete these files. Or, if you have an existing backup of the database that you can use for recovery, you may also delete these files after migration.
To determine if all migrations are complete, run sh.isBalancerRunning() while connected to a mongos instance.