Overview
This tutorial involves creating a new sharded cluster that consists of a mongos, the config server replica set, and two shard replica sets.
Considerations
Connectivity
Each member of a sharded cluster must be able to connect to all other members in the cluster. This includes all shards and config servers. Ensure that network and security systems, including all interfaces and firewalls, allow these connections.
Hostnames and Configuration
Important
To avoid configuration updates due to IP address changes, use DNS hostnames instead of IP addresses. It is particularly important to use a DNS hostname instead of an IP address when configuring replica set members or sharded cluster members.
Use hostnames instead of IP addresses to configure clusters across a split network horizon. Starting in MongoDB 5.0, nodes that are only configured with an IP address fail startup validation and do not start.
Localhost Deployments
If you use either localhost or its IP address as the hostname portion of any host identifier, you must use that identifier as the host setting for any other MongoDB component in the cluster.
For example, the sh.addShard() method takes a host parameter for the hostname of the target shard. If you set host to localhost, you must then use localhost as the host for all other shards in the cluster.
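The localhost rule above can be checked mechanically before you deploy. The following is a minimal sketch, not part of MongoDB: usesLocalhostConsistently and hostPart are hypothetical helper names, the sketch assumes identifiers in host:port form without IPv6 literals, and it treats localhost and 127.0.0.1 as interchangeable, which is looser than the rule as stated.

```javascript
// Hypothetical helpers (not a MongoDB API): check that a list of
// "host:port" identifiers is either all local or all non-local.
const LOCAL_NAMES = new Set(["localhost", "127.0.0.1"]);

function hostPart(identifier) {
  // Assumes "host" or "host:port"; IPv6 literals are not handled.
  return identifier.split(":")[0].toLowerCase();
}

function usesLocalhostConsistently(identifiers) {
  const localCount = identifiers.filter((id) =>
    LOCAL_NAMES.has(hostPart(id))
  ).length;
  // Valid when no identifier is local, or when every identifier is.
  return localCount === 0 || localCount === identifiers.length;
}
```

For example, passing a list that mixes localhost:27018 with s1-mongo1.example.net:27018 returns false, which signals that the shard hosts need to be made consistent.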
Security
This tutorial does not include the required steps for configuring Self-Managed Internal/Membership Authentication or Role-Based Access Control in Self-Managed Deployments.
In production environments, sharded clusters should employ at minimum x.509 security for internal authentication and client access.
Before You Begin
Starting in MongoDB 8.0, you can use the directShardOperations role to perform maintenance operations that require you to execute commands directly against a shard.
Warning
Running commands using the directShardOperations role can cause your cluster to stop working correctly and may cause data corruption. Only use the directShardOperations role for maintenance purposes or under the guidance of MongoDB support. Once you are done performing maintenance operations, stop using the directShardOperations role.
Procedure
Create the Config Server Replica Set
The following steps deploy a config server replica set.
For a production deployment, deploy a config server replica set with at least three members. For testing purposes, you can create a single-member replica set.
Note
The config server replica set must not use the same name as any of the shard replica sets.
For this tutorial, the config server replica set members are associated with the following hosts:
| Member 0 | cfg1.example.net |
| Member 1 | cfg2.example.net |
| Member 2 | cfg3.example.net |
Start each member of the config server replica set.
When starting each mongod, specify the mongod settings either via a configuration file or the command line.
Configuration File
If using a configuration file, set:
sharding:
  clusterRole: configsvr
replication:
  replSetName: <replica set name>
net:
  bindIp: localhost,<hostname(s)|ip address(es)>
sharding.clusterRole to configsvr,
replication.replSetName to the desired name of the config server replica set,
net.bindIp to the hostname/IP address, or comma-delimited list of hostnames or IP addresses, that remote clients (including the other members of the config server replica set as well as other members of the sharded cluster) can use to connect to the instance.
Warning
Before you bind your instance to a publicly-accessible IP address, you must secure your cluster from unauthorized access. For a complete list of security recommendations, see Security Checklist for Self-Managed Deployments. At minimum, consider enabling authentication and hardening network infrastructure.
Additional settings as appropriate to your deployment, such as storage.dbPath and net.port. For more information on the configuration file, see configuration options.
Start the mongod with the --config option set to the configuration file path.
mongod --config <path-to-config-file>
Command Line
If using the command line options, start the mongod with the --configsvr, --replSet, --bind_ip, and other options as appropriate to your deployment. For example:
Warning
Before you bind your instance to a publicly-accessible IP address, you must secure your cluster from unauthorized access. For a complete list of security recommendations, see Security Checklist for Self-Managed Deployments. At minimum, consider enabling authentication and hardening network infrastructure.
mongod --configsvr --replSet <replica set name> --dbpath <path> --bind_ip localhost,<hostname(s)|ip address(es)>
For more information on startup parameters, see the mongod reference page.
Connect to one of the config servers.
Connect mongosh to one of the config server members.
mongosh --host <hostname> --port <port>
Initiate the replica set.
From mongosh, run the rs.initiate() method.
rs.initiate() can take an optional replica set configuration document. In the replica set configuration document, include:
The _id set to the replica set name specified in either the replication.replSetName or the --replSet option.
The configsvr field set to true for the config server replica set.
The members array with a document for each member of the replica set.
Important
Run rs.initiate() on only one mongod instance for the replica set.
rs.initiate(
  {
    _id: "myReplSet",
    configsvr: true,
    members: [
      { _id : 0, host : "cfg1.example.net:27019" },
      { _id : 1, host : "cfg2.example.net:27019" },
      { _id : 2, host : "cfg3.example.net:27019" }
    ]
  }
)
See Self-Managed Replica Set Configuration for more information on replica set configuration documents.
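When the member list is long or generated by tooling, the configuration document can be built from a list of hosts instead of being written by hand. The sketch below assumes a hypothetical helper named buildReplSetConfig; it only constructs the plain object described above (_id, configsvr, members) and does not contact any server.

```javascript
// Hypothetical helper (not a MongoDB API): builds the document that
// rs.initiate() accepts from a replica set name and a list of hosts.
function buildReplSetConfig(name, hosts, isConfigServer = false) {
  const config = {
    _id: name, // must match replication.replSetName / --replSet
    members: hosts.map((host, index) => ({ _id: index, host })),
  };
  if (isConfigServer) {
    config.configsvr = true; // only set for the config server replica set
  }
  return config;
}

const csrsConfig = buildReplSetConfig(
  "myReplSet",
  ["cfg1.example.net:27019", "cfg2.example.net:27019", "cfg3.example.net:27019"],
  true
);
// In mongosh, connected to exactly one member: rs.initiate(csrsConfig)
```

Calling the helper without the third argument yields a document with no configsvr field, which is the shape used for shard replica sets.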
Once the config server replica set (CSRS) is initiated and up, proceed to creating the shard replica sets.
Create the Shard Replica Sets
For a production deployment, use a replica set with at least three members. For testing purposes, you can create a single-member replica set.
Note
Shard replica sets must not use the same name as the config server replica set.
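A name clash is easy to introduce when several replica sets are planned at once, so it can help to check the names before starting any mongod. The following sketch uses a hypothetical helper, findReplSetNameConflicts. As an extra assumption beyond the note above, it also reports duplicate names among the shards themselves.

```javascript
// Hypothetical helper (not a MongoDB API): returns every shard replica
// set name that collides with the config server replica set name or
// with an earlier shard name.
function findReplSetNameConflicts(configName, shardNames) {
  const seen = new Set([configName]);
  const conflicts = [];
  for (const name of shardNames) {
    if (seen.has(name)) {
      conflicts.push(name);
    }
    seen.add(name);
  }
  return conflicts;
}
```

An empty result means the planned names are distinct; any returned name should be changed before initiating the corresponding replica set.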
For each shard, use the following steps to create the shard replica set:
Start each member of the shard replica set.
When starting each mongod, specify the mongod settings either via a configuration file or the command line.
Configuration File
If using a configuration file, set:
sharding:
  clusterRole: shardsvr
replication:
  replSetName: <replSetName>
net:
  bindIp: localhost,<ip address>
replication.replSetName to the desired name of the replica set,
sharding.clusterRole to shardsvr,
net.bindIp to the IP address, or a comma-delimited list of IP addresses, that remote clients (including the other members of the config server replica set as well as other members of the sharded cluster) can use to connect to the instance.
Warning
Before you bind your instance to a publicly-accessible IP address, you must secure your cluster from unauthorized access. For a complete list of security recommendations, see Security Checklist for Self-Managed Deployments. At minimum, consider enabling authentication and hardening network infrastructure.
Additional settings as appropriate to your deployment, such as storage.dbPath and net.port. For more information on the configuration file, see configuration options.
Start the mongod with the --config option set to the configuration file path.
mongod --config <path-to-config-file>
Command Line
If using the command line options, start the mongod with the --replSet, --shardsvr, --bind_ip, and other options as appropriate to your deployment. For example:
mongod --shardsvr --replSet <replSetName> --dbpath <path> --bind_ip localhost,<hostname(s)|ip address(es)>
For more information on startup parameters, see the mongod reference page.
Connect to one member of the shard replica set.
Connect mongosh to one of the replica set members.
mongosh --host <hostname> --port <port>
Initiate the replica set.
From mongosh, run the rs.initiate() method.
rs.initiate() can take an optional replica set configuration document. In the replica set configuration document, include:
The _id field set to the replica set name specified in either the replication.replSetName or the --replSet option.
The members array with a document for each member of the replica set.
The following example initiates a three-member replica set.
Important
Run rs.initiate() on only one mongod instance for the replica set.
rs.initiate(
  {
    _id : "myReplSet",  // shard replica set name; must differ from the config server replica set name
    members: [
      { _id : 0, host : "s1-mongo1.example.net:27018" },
      { _id : 1, host : "s1-mongo2.example.net:27018" },
      { _id : 2, host : "s1-mongo3.example.net:27018" }
    ]
  }
)
Start a mongos for the Sharded Cluster
Start a mongos using either a configuration file or a command line parameter to specify the config servers.
Configuration File
If using a configuration file, set sharding.configDB to the config server replica set name and at least one member of the replica set in <replSetName>/<host:port> format.
Warning
Before you bind your instance to a publicly-accessible IP address, you must secure your cluster from unauthorized access. For a complete list of security recommendations, see Security Checklist for Self-Managed Deployments. At minimum, consider enabling authentication and hardening network infrastructure.
sharding:
  configDB: <configReplSetName>/cfg1.example.net:27019,cfg2.example.net:27019
net:
  bindIp: localhost,<hostname(s)|ip address(es)>
Start the mongos, specifying the --config option and the path to the configuration file.
mongos --config <path-to-config>
For more information on the configuration file, see configuration options.
Command Line
If using command line parameters, start the mongos and specify the --configdb, --bind_ip, and other options as appropriate to your deployment. For example:
Warning
Before you bind your instance to a publicly-accessible IP address, you must secure your cluster from unauthorized access. For a complete list of security recommendations, see Security Checklist for Self-Managed Deployments. At minimum, consider enabling authentication and hardening network infrastructure.
mongos --configdb <configReplSetName>/cfg1.example.net:27019,cfg2.example.net:27019,cfg3.example.net:27019 --bind_ip localhost,<hostname(s)|ip address(es)>
Include any other options as appropriate for your deployment.
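Both sharding.configDB and --configdb take the same <replSetName>/<host:port> string, with additional members separated by commas. The following sketch shows one way to assemble that string from the pieces; buildConfigDb is a hypothetical helper name, and the replica set name cfgRS is a placeholder chosen for illustration.

```javascript
// Hypothetical helper (not a MongoDB API): formats the value expected
// by sharding.configDB / --configdb.
function buildConfigDb(replSetName, members) {
  if (members.length === 0) {
    // The setting requires at least one config server member.
    throw new Error("configDB needs at least one config server member");
  }
  return `${replSetName}/${members.join(",")}`;
}

const configDb = buildConfigDb("cfgRS", [
  "cfg1.example.net:27019",
  "cfg2.example.net:27019",
  "cfg3.example.net:27019",
]);
// configDb can then be passed as: mongos --configdb <configDb> ...
```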
At this point, your sharded cluster consists of the mongos and the config servers. You can now connect to the sharded cluster using mongosh.
Connect to the Sharded Cluster
Connect mongosh to the mongos. Specify the host and port on which the mongos is running:
mongosh --host <hostname> --port <port>
Once you have connected mongosh to the mongos, continue to the next procedure to add shards to the cluster.
Add Shards to the Cluster
In a mongosh session that is connected to the mongos, use the sh.addShard() method to add each shard to the cluster.
The following operation adds a single shard replica set to the cluster:
sh.addShard("<replSetName>/s1-mongo1.example.net:27018,s1-mongo2.example.net:27018,s1-mongo3.example.net:27018")
Repeat these steps until the cluster includes all desired shards.
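With more than one shard, the repetition can be scripted. The sketch below assumes a hypothetical helper, addShards, that takes a list of shard descriptions and a function to call for each one; in mongosh that function would wrap sh.addShard(). The shape of the shard descriptions (name and hosts fields) is an assumption made for this example.

```javascript
// Hypothetical helper (not a MongoDB API): calls addShard once per
// shard with a "<replSetName>/<host:port>,..." connection string.
function addShards(shards, addShard) {
  return shards.map(({ name, hosts }) =>
    addShard(`${name}/${hosts.join(",")}`)
  );
}

// In a mongosh session connected to the mongos:
//   addShards(
//     [{ name: "<replSetName>", hosts: ["s1-mongo1.example.net:27018"] }],
//     (connectionString) => sh.addShard(connectionString)
//   );
```

Passing the function in, rather than calling sh.addShard() directly, keeps the helper usable outside mongosh, for example to print the connection strings for review before running them.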
Shard a Collection
To shard a collection, connect mongosh to the mongos and use the sh.shardCollection() method.
Note
Sharding and Indexes
If the collection already contains data, you must create an index that supports the shard key before sharding the collection. If the collection is empty, MongoDB creates the index as part of sh.shardCollection().
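The index requirement in the note above can be folded into a small wrapper. This is a sketch under stated assumptions: shardWithSupportingIndex is a hypothetical name, and the db and sh arguments stand for the objects of the same names that mongosh provides when connected to the mongos. It creates an index on exactly the shard key document, which is one index that supports the key; other supporting indexes are possible.

```javascript
// Hypothetical wrapper (not a MongoDB API). In mongosh, pass the
// built-in `db` and `sh` objects as the first two arguments.
function shardWithSupportingIndex(db, sh, dbName, collName, shardKey) {
  const coll = db.getSiblingDB(dbName).getCollection(collName);
  // Only a collection that already holds data needs the index up front;
  // limit: 1 stops counting as soon as one document is found.
  if (coll.countDocuments({}, { limit: 1 }) > 0) {
    coll.createIndex(shardKey);
  }
  return sh.shardCollection(`${dbName}.${collName}`, shardKey);
}

// In mongosh:
//   shardWithSupportingIndex(db, sh, "<database>", "<collection>", { <shard key field>: 1 })
```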
MongoDB provides two strategies to shard collections:
Hashed sharding uses a hashed index of a single field as the shard key to partition data across your sharded cluster.
sh.shardCollection("<database>.<collection>", { <shard key field> : "hashed" } )
Range-based sharding can use multiple fields as the shard key and divides data into contiguous ranges determined by the shard key values.
sh.shardCollection("<database>.<collection>", { <shard key field> : 1, ... } )
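The two strategies differ only in the shard key document passed as the second argument. As an illustration, the sketch below builds each form from field names; hashedShardKey and rangedShardKey are hypothetical helper names, and the field names in the usage comment are placeholders.

```javascript
// Hypothetical helpers (not a MongoDB API): build the shard key
// document for each strategy.
function hashedShardKey(field) {
  // Hashed sharding: a single field mapped to the string "hashed".
  return { [field]: "hashed" };
}

function rangedShardKey(...fields) {
  // Range-based sharding: one or more fields, each mapped to 1,
  // in the order given.
  return Object.fromEntries(fields.map((field) => [field, 1]));
}

// In mongosh, connected to the mongos:
//   sh.shardCollection("<database>.<collection>", hashedShardKey("userId"))
//   sh.shardCollection("<database>.<collection>", rangedShardKey("region", "userId"))
```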
Shard Key Considerations
Your selection of shard key affects the efficiency of sharding, as well as your ability to take advantage of certain sharding features such as zones. To learn how to choose an effective shard key, see Choose a Shard Key.
mongosh provides the method convertShardKeyToHashed(). This method uses the same hashing function as the hashed index and can be used to see what the hashed value would be for a key.
Tip
For hashed sharding shard keys, see Hashed Sharding Shard Key.
For ranged sharding shard keys, see Shard Key Selection.