Remove Shards from an Existing Sharded Cluster

To remove a shard, you must ensure the shard's data is migrated to the remaining shards in the cluster. This procedure describes how to safely migrate data and how to remove a shard.

When you remove a shard in a cluster with an uneven chunk distribution, the balancer first removes the chunks from the draining shard and then balances the remaining uneven chunk distribution.

This procedure describes how to remove a shard from a cluster. Do not use this procedure to migrate an entire cluster to new hardware. To migrate, see Migrate a Sharded Cluster to Different Hardware instead.

To remove a shard, first connect to one of the cluster's mongos instances using mongosh. Then use the sequence of tasks in this document to remove a shard from the cluster.

Considerations

  • A shard removal may cause an open change stream cursor to close, and the closed change stream cursor may not be fully resumable.
  • You can safely restart a cluster during a shard removal process. If you restart a cluster during an ongoing draining process, draining continues automatically after the cluster components restart. MongoDB records the shard draining status in the config.shards collection.
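Because the draining status is stored in config.shards, it can be inspected directly after a restart. The following is a minimal sketch, assuming that a draining shard's document in config.shards carries a draining: true field; the function name listDrainingShards is hypothetical.

```javascript
// Sketch: list the shards that config.shards currently marks as draining.
// `anyDb` is assumed to be a mongosh database handle (for example `db`).
function listDrainingShards(anyDb) {
  const configDb = anyDb.getSiblingDB("config");
  // Assumption: draining shards are flagged with `draining: true`.
  return configDb
    .getCollection("shards")
    .find({ draining: true })
    .toArray()
    .map((shard) => shard._id);
}
```

In mongosh, connected to a mongos, this could be called as listDrainingShards(db).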

Ensure the Balancer Process is Enabled

To successfully migrate data from a shard, the balancer process must be enabled. Check the balancer state using the sh.getBalancerState() helper in mongosh. For more information, see the section on balancer operations.
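The balancer check can be wrapped in a small guard that stops the procedure early when the balancer is off. This is a sketch only; the function name assertBalancerEnabled is hypothetical, and shHelper is assumed to be the mongosh sh object.

```javascript
// Sketch: fail fast if the balancer is disabled.
// `shHelper` is assumed to be the mongosh `sh` helper object.
function assertBalancerEnabled(shHelper) {
  const enabled = shHelper.getBalancerState();
  if (!enabled) {
    throw new Error(
      "Balancer is disabled; enable it (for example with sh.startBalancer()) " +
        "before removing a shard."
    );
  }
  return enabled;
}
```

In mongosh this could be called as assertBalancerEnabled(sh) before running removeShard.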

Determine the Name of the Shard to Remove

To determine the name of the shard, connect to a mongos instance with mongosh and either:

  • Run the listShards command from the admin database.
  • Run the sh.status() helper.

The shards._id field lists the name of each shard.
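One way to read the shard names programmatically is the listShards command, whose response contains a shards array. The sketch below collects the _id of each entry; the function name getShardNames is hypothetical, and adminDb is assumed to be a mongosh database handle connected through a mongos.

```javascript
// Sketch: return the name (_id) of every shard reported by listShards.
function getShardNames(adminDb) {
  const res = adminDb.adminCommand({ listShards: 1 });
  if (!res.ok) {
    throw new Error("listShards failed: " + JSON.stringify(res));
  }
  return res.shards.map((shard) => shard._id);
}
```

In mongosh this could be called as getShardNames(db).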

Remove Chunks from the Shard

From the admin database, run the removeShard command. This begins "draining" chunks from the shard you are removing to other shards in the cluster. For example, for a shard named mongodb0, run:

db.adminCommand( { removeShard: "mongodb0" } )

mongos converts the write concern of the removeShard command to "majority".

This operation returns the following response:

{
   "msg" : "draining started successfully",
   "state" : "started",
   "shard" : "mongodb0",
   "note" : "you need to drop or movePrimary these databases",
   "dbsToMove" : [
      "fizz",
      "buzz"
   ],
   "ok" : 1,
   "operationTime" : Timestamp(1575398919, 2),
   "$clusterTime" : {
      "clusterTime" : Timestamp(1575398919, 2),
      "signature" : {
         "hash" : BinData(0,"Oi68poWCFCA7b9kyhIcg+TzaGiA="),
         "keyId" : NumberLong("6766255701040824328")
      }
   }
}

The balancer begins migrating chunks from the shard named mongodb0 to other shards in the cluster. These migrations happen slowly to avoid placing undue load on the overall cluster. Depending on your network capacity and the amount of data, this operation can take from a few minutes to several days to complete.

Note

The output includes the field dbsToMove, which lists the databases, if any, for which the shard is the primary shard. After all chunks have been drained from the shard, you must either run movePrimary for each of these databases or drop the databases (which deletes the associated data files).
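Since dbsToMove is needed again after draining finishes, it can help to capture it when draining starts. The following sketch issues removeShard and keeps only the state and the database list; the function name startDraining is hypothetical, and adminDb is assumed to be a mongosh database handle.

```javascript
// Sketch: start (or re-check) draining and keep the fields needed later.
function startDraining(adminDb, shardName) {
  const res = adminDb.adminCommand({ removeShard: shardName });
  return {
    state: res.state,
    // dbsToMove is treated as an empty list when the field is not present.
    dbsToMove: res.dbsToMove || [],
  };
}
```

In mongosh this could be called as startDraining(db, "mongodb0").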

Check the Status of the Migration

To check the progress of the migration at any stage in the process, run removeShard from the admin database again. For example, for a shard named mongodb0, run:

db.adminCommand( { removeShard: "mongodb0" } )

mongos converts the write concern of the removeShard command to "majority".

The command returns output similar to the following:

{
   "msg" : "draining ongoing",
   "state" : "ongoing",
   "remaining" : {
      "chunks" : NumberLong(2),
      "dbs" : NumberLong(2),
      "jumboChunks" : NumberLong(0)
         // Available starting in 4.2.2 (and 4.0.14)
   },
   "note" : "you need to drop or movePrimary these databases",
   "dbsToMove" : [
      "fizz",
      "buzz"
   ],
   "ok" : 1,
   "operationTime" : Timestamp(1575399086, 1655),
   "$clusterTime" : {
      "clusterTime" : Timestamp(1575399086, 1655),
      "signature" : {
         "hash" : BinData(0,"XBrTmjMMe82fUtVLRm13GBVtRE8="),
         "keyId" : NumberLong("6766255701040824328")
      }
   }
}

In the output, the remaining field includes the following fields:

Field         Description
chunks        Total number of chunks currently remaining on the shard.
dbs           Total number of databases whose primary shard is the shard. These databases are specified in the dbsToMove output field.
jumboChunks   Of the total number of chunks, the number that are jumbo. Available starting in 4.2.2 (and 4.0.14).

If jumboChunks is greater than 0, wait until only the jumbo chunks remain on the shard. Once only the jumbo chunks remain, you must manually clear the jumbo flag before the draining can complete. See Clear jumbo Flag.

After the jumbo flag clears, the balancer can migrate these chunks. However, if the queue of writes that modify any documents being migrated surpasses 500MB of memory, the migration will fail. For details on the migration procedure, see Chunk Migration Procedure.

Continue checking the status of the removeShard command until the number of chunks remaining is 0.

db.adminCommand( { removeShard: "mongodb0" } )
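The repeated status check can be sketched as a polling loop. The function below re-runs removeShard until no chunks remain, and stops early when only jumbo chunks are left so that the jumbo flag can be cleared manually. The names waitForDrain and asNumber, the sleepFn parameter, and the 60-second default interval are assumptions made for illustration. Note that if no databases remain to be moved, the same call may also complete the removal, so the loop treats the "completed" state as done.

```javascript
// Convert a value that may be a plain number or a 64-bit integer wrapper.
function asNumber(value) {
  if (value && typeof value.toNumber === "function") return value.toNumber();
  return Number(value);
}

// Sketch: poll removeShard until the shard has no chunks left.
// `sleepFn(ms)` is assumed to block for `ms` milliseconds.
function waitForDrain(adminDb, shardName, sleepFn, intervalMs = 60000) {
  for (;;) {
    const res = adminDb.adminCommand({ removeShard: shardName });
    if (res.state === "completed") {
      return { state: res.state, chunks: 0, onlyJumbo: false };
    }
    if (!res.remaining) {
      // A "started" response has no `remaining` field yet; check again later.
      sleepFn(intervalMs);
      continue;
    }
    const chunks = asNumber(res.remaining.chunks || 0);
    const jumbo = asNumber(res.remaining.jumboChunks || 0);
    if (chunks === 0) {
      return { state: res.state, chunks, onlyJumbo: false };
    }
    if (jumbo > 0 && jumbo === chunks) {
      // Only jumbo chunks remain: the jumbo flag must be cleared manually.
      return { state: res.state, chunks, onlyJumbo: true };
    }
    sleepFn(intervalMs);
  }
}
```

In mongosh this could be called as waitForDrain(db, "mongodb0", sleep), using the shell's built-in sleep function.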

Move Databases to Another Primary Shard

If the shard is the primary shard for one or more databases in the cluster, you must make each of those databases use a different shard as its primary shard. removeShard lists any databases that you need to move in the dbsToMove field in the command output. If the shard is not the primary shard for any databases, skip to the next task, Finalize the Migration.

To move a database to another shard, use the movePrimary command.

Important

To ensure a smooth migration, refer to the considerations in the movePrimary command documentation before running movePrimary.

To migrate the fizz database from mongodb0 to mongodb1, issue the following command:

db.adminCommand( { movePrimary: "fizz", to: "mongodb1" })

mongos uses "majority" write concern for movePrimary.

This command does not return until MongoDB completes moving all data. The response from this command will resemble the following:

{
   "ok" : 1,
   "operationTime" : Timestamp(1575400369, 9),
   "$clusterTime" : {
      "clusterTime" : Timestamp(1575400369, 9),
      "signature" : {
         "hash" : BinData(0,"2Nz8QCcVXB0LJLm1hsXfpTCaM0M="),
         "keyId" : NumberLong("6766255701040824328")
      }
   }
}
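When dbsToMove lists several databases, the movePrimary step repeats once per database. The sketch below runs the command for each name in turn and stops at the first failure; the function name moveDatabases is hypothetical, and adminDb is assumed to be a mongosh database handle. Review the movePrimary considerations before using anything like it.

```javascript
// Sketch: move each listed database to a new primary shard, one at a time.
// Each movePrimary call returns only after that database's data has moved.
function moveDatabases(adminDb, dbNames, targetShard) {
  const moved = [];
  for (const name of dbNames) {
    const res = adminDb.adminCommand({ movePrimary: name, to: targetShard });
    if (!res.ok) {
      throw new Error("movePrimary failed for " + name + ": " + JSON.stringify(res));
    }
    moved.push(name);
  }
  return moved;
}
```

In mongosh this could be called as moveDatabases(db, ["fizz", "buzz"], "mongodb1").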

Using movePrimary To Move Unsharded Collections

For MongoDB 4.2 and earlier, if you use the movePrimary command on a database that contains an unsharded collection, you must perform the following additional steps.

Note

MongoDB 4.4 does not require these additional steps when moving databases that contain unsharded collections.

  • For MongoDB 4.2, you must either:

    • Restart all mongos instances and all mongod shard members (including the secondary members); or
    • Use the flushRouterConfig command on all mongos instances and all mongod shard members (including the secondary members) before reading or writing any data to any unsharded collections that were moved.
  • For MongoDB 4.0 and earlier, you must either:

    • Restart all mongos instances; or
    • Use the flushRouterConfig command on all mongos instances before reading or writing any data to any unsharded collections that were moved.

These steps ensure that all cluster nodes refresh their metadata cache, which includes the location of the primary shard. Otherwise, you may miss data on reads, and may not write data to the correct shard. To recover, you must manually intervene.
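The flushRouterConfig option can be sketched as a loop over one admin database handle per node. How the handles are obtained is left to the caller and is an assumption here, as is the function name flushRouterConfigOn.

```javascript
// Sketch: run flushRouterConfig on every node that needs a metadata refresh.
// `adminDbs` is assumed to hold one admin database handle per node.
function flushRouterConfigOn(adminDbs) {
  let flushed = 0;
  for (const adminDb of adminDbs) {
    const res = adminDb.adminCommand({ flushRouterConfig: 1 });
    if (!res.ok) {
      throw new Error("flushRouterConfig failed: " + JSON.stringify(res));
    }
    flushed += 1;
  }
  return flushed;
}
```

In mongosh, a handle for another node could be created with new Mongo("<host>:<port>").getDB("admin"), where the host and port are placeholders for your own deployment.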

Finalize the Migration

To clean up all metadata information and finalize the removal, run removeShard again. For example, for a shard named mongodb0, run:

db.adminCommand( { removeShard: "mongodb0" } )

mongos converts the write concern of the removeShard command to "majority".

A success message appears at completion:

{
    "msg" : "removeshard completed successfully",
    "state" : "completed",
    "shard" : "mongodb0",
    "ok" : 1,
    "operationTime" : Timestamp(1575400370, 2),
    "$clusterTime" : {
       "clusterTime" : Timestamp(1575400370, 2),
       "signature" : {
          "hash" : BinData(0,"JjSRciHECXDBXo0e5nJv9mdRG8M="),
          "keyId" : NumberLong("6766255701040824328")
       }
    }
}

Once the value of the state field is "completed", you may safely stop the instances comprising the mongodb0 shard.
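The final check can be made explicit before any instances are stopped. The sketch below runs removeShard once more and refuses to proceed unless the state is "completed"; the function name finalizeRemoval is hypothetical, and adminDb is assumed to be a mongosh database handle.

```javascript
// Sketch: finalize the removal and confirm it before shutting anything down.
function finalizeRemoval(adminDb, shardName) {
  const res = adminDb.adminCommand({ removeShard: shardName });
  if (res.state !== "completed") {
    throw new Error(
      "Shard " + shardName + " is not removed yet (state: " + res.state + ")"
    );
  }
  return res;
}
```

In mongosh this could be called as finalizeRemoval(db, "mongodb0"); stop the shard's instances only after it returns without an error.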