Create Chunks in a Sharded Cluster

In most situations, a sharded cluster creates, splits, and distributes chunks automatically without user intervention. However, in a limited number of cases, MongoDB cannot create enough chunks or distribute data fast enough to support the required throughput.

For example, you may want to ingest a large volume of data into a cluster that is unbalanced, or where the ingestion will itself lead to data imbalance, such as with monotonically increasing or decreasing shard keys. Pre-splitting the chunks of an empty sharded collection can help with throughput in these cases.

Alternatively, starting in MongoDB 4.0.3, you can define zones and zone ranges before sharding an empty or non-existent collection. The shard collection operation then creates chunks for the defined zone ranges, plus any additional chunks needed to cover the entire range of shard key values, and performs an initial chunk distribution based on the zone ranges. For more information, see Empty Collection.
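As a rough illustration of the zone-based approach, the sketch below shows the order of operations using the mongosh helpers sh.addShardToZone(), sh.updateZoneKeyRange(), and sh.shardCollection(). The shard names, zone names, and key ranges are hypothetical, and the sketch assumes sharding is already enabled for the myapp database. The steps are wrapped in a function (defineZonesThenShard, a name invented here) only to make the ordering explicit.

```javascript
// Sketch only: shard names, zone names, and ranges are hypothetical.
// `sh` is the mongosh sharding helper object.
function defineZonesThenShard(sh) {
    // 1. Associate each shard with a zone.
    sh.addShardToZone("shardA", "zoneAtoM");
    sh.addShardToZone("shardB", "zoneNtoZ");

    // 2. Define the zone ranges before the collection is sharded.
    //    The lower bound is inclusive, the upper bound exclusive.
    sh.updateZoneKeyRange("myapp.users", { email: "a" }, { email: "n" }, "zoneAtoM");
    sh.updateZoneKeyRange("myapp.users", { email: "n" }, { email: "z" }, "zoneNtoZ");

    // 3. Shard the empty (or not yet existing) collection last, so the
    //    initial chunks can be created from the zone ranges.
    return sh.shardCollection("myapp.users", { email: 1 });
}

// In mongosh, connected to a mongos:
// defineZonesThenShard(sh);
```

The zones must be defined before the collection is sharded; defining them afterwards leaves chunk placement to the balancer instead of the initial distribution.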

Warning

Only pre-split chunks for an empty collection. Manually splitting chunks for a populated collection can lead to unpredictable chunk ranges and sizes, as well as inefficient or ineffective balancing behavior.

To split empty chunks manually, you can run the split command:

Example

To create chunks for documents in the myapp.users collection using the email field as the shard key, use the following operation in mongosh:

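// Split on two-letter email prefixes: every first letter a-z,
// paired with every sixth second letter (a, g, m, s, y).
for ( var x=97; x<97+26; x++ ){            // char codes 97-122 are 'a'-'z'
    for ( var y=97; y<97+26; y+=6 ) {
        var prefix = String.fromCharCode(x) + String.fromCharCode(y);
        db.adminCommand( { split: "myapp.users", middle: { email : prefix } } );
    }
}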

This assumes a collection size of 100 million documents.
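Before running the loop above against a cluster, it can help to preview the split points it would produce. The following is a minimal sketch that builds the same prefixes as a plain array without contacting the cluster; buildSplitPoints and its step parameter are hypothetical names introduced here for illustration.

```javascript
// Hypothetical helper: returns the two-letter prefixes that the loop
// above would use as split points, without running any command.
function buildSplitPoints(step) {
    const points = [];
    for (let x = 97; x < 97 + 26; x++) {           // first letter: 'a'..'z'
        for (let y = 97; y < 97 + 26; y += step) { // second letter: every `step`-th
            points.push(String.fromCharCode(x) + String.fromCharCode(y));
        }
    }
    return points;
}

// With step 6, the second letter takes the values a, g, m, s, y,
// giving 26 * 5 = 130 split points.
const splitPoints = buildSplitPoints(6);

// In mongosh, each point could then be passed to the split command:
// splitPoints.forEach(function (p) {
//     db.adminCommand({ split: "myapp.users", middle: { email: p } });
// });
```

A smaller step yields more, finer-grained chunks; whether that is appropriate depends on the expected data volume and on how evenly email prefixes are distributed in practice.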

Tip
See also:

sh.balancerCollectionStatus()
