A replica set in MongoDB is a group of mongod processes that maintain the same data set. Replica sets provide redundancy and high availability, and are the basis for all production deployments. This section introduces replication in MongoDB as well as the components and architecture of replica sets. The section also provides tutorials for common tasks related to replica sets.
Replication provides redundancy and increases data availability. With multiple copies of data on different database servers, replication provides a level of fault tolerance against the loss of a single database server.
In some cases, replication can provide increased read capacity as clients can send read operations to different servers. Maintaining copies of data in different data centers can increase data locality and availability for distributed applications. You can also maintain additional copies for dedicated purposes, such as disaster recovery, reporting, or backup.
A replica set is a group of mongod instances that maintain the same data set. A replica set contains several data-bearing nodes and optionally one arbiter node. Of the data-bearing nodes, one and only one member is deemed the primary node, while the other nodes are deemed secondary nodes.
The primary node receives all write operations. A replica set can have only one primary capable of confirming writes with { w: "majority" } write concern, although in some circumstances another mongod instance may transiently believe itself to also be primary. [1] The primary records all changes to its data sets in its operation log (the oplog). For more information on primary node operation, see Replica Set Primary.
The secondaries replicate the primary's oplog and apply the operations to their data sets so that the secondaries' data sets reflect the primary's data set. If the primary is unavailable, an eligible secondary will hold an election to elect itself the new primary. For more information on secondary members, see Replica Set Secondary Members.
In some circumstances (for example, you have a primary and a secondary but cost constraints prohibit adding another secondary), you may choose to add a mongod instance to a replica set as an arbiter. An arbiter participates in elections but does not hold data (that is, it does not provide data redundancy). For more information on arbiters, see Replica Set Arbiter.
An arbiter will always be an arbiter, whereas a primary may step down and become a secondary, and a secondary may become the primary during an election.
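The member roles described above can be illustrated with a replica set configuration document. The following is a minimal sketch, not a recommended topology: the set name rs0, the hostnames, and the helper summarizeMembers are hypothetical placeholders, while arbiterOnly is the configuration field that marks a member as an arbiter.

```javascript
// Hypothetical three-member replica set: two data-bearing members and one arbiter.
// The set name "rs0" and the hostnames are placeholders, not values from this page.
const config = {
  _id: "rs0",
  members: [
    { _id: 0, host: "db0.example.net:27017" },
    { _id: 1, host: "db1.example.net:27017" },
    { _id: 2, host: "arb0.example.net:27017", arbiterOnly: true }
  ]
};

// Split the members by role: arbiters vote in elections but hold no data.
function summarizeMembers(cfg) {
  const arbiters = cfg.members.filter((m) => m.arbiterOnly === true);
  const dataBearing = cfg.members.filter((m) => m.arbiterOnly !== true);
  return {
    dataBearing: dataBearing.map((m) => m.host),
    arbiters: arbiters.map((m) => m.host)
  };
}

const summary = summarizeMembers(config);
console.log(summary);

// In the shell, connected to one of the data-bearing members, a document of
// this shape could be passed to rs.initiate(config) to create the set.
```

Only the two data-bearing members provide redundancy here; the arbiter exists solely to give the set a third vote.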
Secondaries replicate the primary's oplog and apply the operations to their data sets asynchronously. By having the secondaries' data sets reflect the primary's data set, the replica set can continue to function despite the failure of one or more members.
For more information on replication mechanics, see Replica Set Oplog and Replica Set Data Synchronization.
Starting in version 4.2 (also available starting in 4.0.6), secondary members of a replica set log oplog entries that take longer than the slow operation threshold to apply. These slow oplog messages are logged for the secondaries in the diagnostic log, under the REPL component, with the text applied op: <oplog entry> took <num>ms. They may be affected by slowOpSampleRate, depending on your MongoDB version.
The profiler does not capture slow oplog entries.
Replication lag refers to the amount of time that it takes to copy (i.e. replicate) a write operation on the primary to a secondary. Some small delay period may be acceptable, but significant problems emerge as replication lag grows, including building cache pressure on the primary.
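One way to estimate replication lag is to compare the time of the last operation applied on each secondary with that of the primary. The sketch below assumes a status document shaped like the output of rs.status(), using its name, stateStr, and optimeDate member fields; the function name replicationLagSeconds and the sample data are hypothetical, and the result is only an approximation of the lag at the moment the status was taken.

```javascript
// Compute per-secondary lag (in seconds) from an rs.status()-style document.
function replicationLagSeconds(status) {
  const primary = status.members.find((m) => m.stateStr === "PRIMARY");
  if (!primary) {
    throw new Error("no primary in status document");
  }
  const lag = {};
  for (const m of status.members) {
    if (m.stateStr === "SECONDARY") {
      // Difference between the last applied operation times, in seconds.
      lag[m.name] = (primary.optimeDate.getTime() - m.optimeDate.getTime()) / 1000;
    }
  }
  return lag;
}

// Hand-written sample data standing in for real rs.status() output.
const sampleStatus = {
  members: [
    { name: "db0.example.net:27017", stateStr: "PRIMARY", optimeDate: new Date("2024-01-01T00:00:10Z") },
    { name: "db1.example.net:27017", stateStr: "SECONDARY", optimeDate: new Date("2024-01-01T00:00:08Z") },
    { name: "db2.example.net:27017", stateStr: "SECONDARY", optimeDate: new Date("2024-01-01T00:00:10Z") }
  ]
};

const lag = replicationLagSeconds(sampleStatus);
console.log(lag);
```

In this sample, db1 trails the primary by two seconds and db2 is caught up.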
Starting in MongoDB 4.2, administrators can limit the rate at which the primary applies its writes with the goal of keeping the majority committed lag under a configurable maximum value, flowControlTargetLagSeconds.
By default, flow control is enabled.
For flow control to engage, the replica set or sharded cluster must have a featureCompatibilityVersion (FCV) of 4.2 and read concern majority enabled. That is, enabled flow control has no effect if FCV is not 4.2 or if read concern majority is disabled.
With flow control enabled, as the lag grows close to the flowControlTargetLagSeconds, writes on the primary must obtain tickets before taking locks to apply writes. By limiting the number of tickets issued per second, the flow control mechanism attempts to keep the lag under the target.
For more information, see Check the Replication Lag and Flow Control.
When a primary does not communicate with the other members of the set for more than the configured electionTimeoutMillis period (10 seconds by default), an eligible secondary calls for an election to nominate itself as the new primary. The cluster attempts to complete the election of a new primary and resume normal operations.
The replica set cannot process write operations until the election completes successfully. The replica set can continue to serve read queries if such queries are configured to run on secondaries while the primary is offline.
The median time before a cluster elects a new primary should not typically exceed 12 seconds, assuming default replica configuration settings. This includes time required to mark the primary as unavailable and call and complete an election. You can tune this time period by modifying the settings.electionTimeoutMillis replication configuration option. Factors such as network latency may extend the time required for replica set elections to complete, which in turn affects the amount of time your cluster may operate without a primary. These factors are dependent on your particular cluster architecture.
Lowering the electionTimeoutMillis replication configuration option from the default 10000 (10 seconds) can result in faster detection of primary failure. However, the cluster may call elections more frequently due to factors such as temporary network latency even if the primary is otherwise healthy. This can result in increased rollbacks for w: 1 write operations.
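Tuning the election timeout amounts to changing one field in the replica set configuration. The sketch below shows that change on a plain object; the helper withElectionTimeout, the 5000 ms value, and the sample configuration are hypothetical and chosen only for illustration. In the shell, the same edit would typically be made to the document returned by rs.conf() and then applied with rs.reconfig().

```javascript
// Return a copy of a replica set configuration with a new election timeout.
function withElectionTimeout(cfg, timeoutMillis) {
  if (!Number.isInteger(timeoutMillis) || timeoutMillis <= 0) {
    throw new RangeError("timeoutMillis must be a positive integer");
  }
  return {
    ...cfg,
    // Preserve any other settings and override only electionTimeoutMillis.
    settings: { ...(cfg.settings || {}), electionTimeoutMillis: timeoutMillis }
  };
}

// Hand-written stand-in for the document that rs.conf() returns.
const currentConfig = {
  _id: "rs0",
  version: 1,
  members: [],
  settings: { electionTimeoutMillis: 10000 }
};

const tunedConfig = withElectionTimeout(currentConfig, 5000);
console.log(tunedConfig.settings.electionTimeoutMillis); // 5000
```

The helper returns a copy rather than mutating its input, so the original configuration stays available for comparison or rollback.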
Your application connection logic should include tolerance for automatic failovers and the subsequent elections. Starting in MongoDB 3.6, MongoDB drivers can detect the loss of the primary and automatically retry certain write operations a single time, providing additional built-in handling of automatic failovers and elections. Where retryable writes are not enabled by default, enable them explicitly by including retryWrites=true in the connection string.
Starting in version 4.4, MongoDB provides mirrored reads to pre-warm the caches of electable secondary members with the most recently accessed data. Pre-warming the cache of a secondary can help restore performance more quickly after an election.
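The retryWrites=true connection string option mentioned above can be sketched as follows. The helper buildReplicaSetUri, the hostnames, and the set name rs0 are hypothetical; replicaSet and retryWrites are standard connection string options.

```javascript
// Build a replica set connection string from a host list and URI options.
function buildReplicaSetUri(hosts, replicaSet, options = {}) {
  // URLSearchParams handles the key=value&key=value encoding of the options.
  const params = new URLSearchParams({ replicaSet, ...options });
  return `mongodb://${hosts.join(",")}/?${params.toString()}`;
}

const uri = buildReplicaSetUri(
  ["db0.example.net:27017", "db1.example.net:27017"], // placeholder hosts
  "rs0",                                              // placeholder set name
  { retryWrites: "true" }
);
console.log(uri);
```

Listing more than one host in the string lets the driver discover the set even if one of the listed members is down at connection time.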
By default, clients read from the primary [1]; however, clients can specify a read preference to send read operations to secondaries.
Asynchronous replication to secondaries means that reads from secondaries may return data that does not reflect the state of the data on the primary.
Multi-document transactions that contain read operations must use read preference primary. All operations in a given transaction must route to the same member.
For information on reading from replica sets, see Read Preference.
Depending on the read concern, clients can see the results of writes before the writes are durable:
Other clients using "local" or "available" read concern can see the result of a write operation before the write operation is acknowledged to the issuing client.
Clients using "local" or "available" read concern can read data which may be subsequently rolled back during replica set failovers.
For operations in a multi-document transaction, when a transaction commits, all data changes made in the transaction are saved and visible outside the transaction. That is, a transaction will not commit some of its changes while rolling back others.
Until a transaction commits, the data changes made in the transaction are not visible outside the transaction.
However, when a transaction writes to multiple shards, not all outside read operations need to wait for the result of the committed transaction to be visible across the shards. For example, if a transaction is committed and write 1 is visible on shard A but write 2 is not yet visible on shard B, an outside read at read concern "local" can read the results of write 1 without seeing write 2.
For more information on read isolations, consistency and recency for MongoDB, see Read Isolation, Consistency, and Recency.
Mirrored reads reduce the impact of primary elections following an outage or planned maintenance. After a failover in a replica set, the secondary that takes over as the new primary updates its cache as new queries come in. While the cache is warming up, performance can be impacted.
Starting in version 4.4, mirrored reads pre-warm the caches of electable secondary replica set members. To pre-warm the caches of electable secondaries, the primary mirrors a sample of the supported operations it receives to electable secondaries.
The size of the subset of electable secondary replica set members that receive mirrored reads can be configured with the mirrorReads parameter. See Enable/Disable Support for Mirrored Reads for further details.
Mirrored reads do not affect the primary's response to the client. The reads that the primary mirrors to secondaries are "fire-and-forget" operations. The primary doesn't await responses.
Mirrored reads support the following operations:
count
distinct
find
findAndModify
update
Starting in MongoDB 4.4, mirrored reads are enabled by default and use a default sampling rate of 0.01. To disable mirrored reads, set the mirrorReads parameter to { samplingRate: 0.0 }:
db.adminCommand( { setParameter: 1, mirrorReads: { samplingRate: 0.0 } } )
With a sampling rate greater than 0.0, the primary mirrors supported reads to a subset of electable secondaries. With a sampling rate of 0.01, the primary mirrors one percent of the supported reads it receives to each electable secondary.
Consider a replica set that consists of one primary and two electable secondaries. If the primary receives 1000 operations that can be mirrored and the sampling rate is 0.01, the primary sends about 10 reads to electable secondaries. Each electable secondary receives only a fraction of the 10 reads. Each mirrored read is sent to a randomly chosen non-empty selection of electable secondaries.
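The arithmetic in this example can be checked with a short calculation. This is only a sketch of the expected value (mirrorable operations multiplied by the sampling rate); the function name expectedMirroredReads is hypothetical, and the actual number of mirrored reads varies because the server samples randomly.

```javascript
// Expected number of mirrored reads for a given sampling rate.
function expectedMirroredReads(mirrorableOps, samplingRate) {
  if (samplingRate < 0 || samplingRate > 1) {
    throw new RangeError("samplingRate must be between 0.0 and 1.0");
  }
  return mirrorableOps * samplingRate;
}

// The example from the text: 1000 mirrorable operations at a rate of 0.01.
const expected = expectedMirroredReads(1000, 0.01);
console.log(Math.round(expected)); // 10
```

With a rate of 0.0 the expected count is zero, which matches the statement that a sampling rate of 0.0 disables mirrored reads.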
To change the sampling rate for mirrored reads, set the mirrorReads parameter to a number between 0.0 and 1.0:
A sampling rate of 0.0 disables mirrored reads.
A sampling rate between 0.0 and 1.0 results in the primary forwarding a random sample of the supported reads at the specified sample rate to electable secondaries.
A sampling rate of 1.0 results in the primary forwarding all supported reads to electable secondaries.
For details, see mirrorReads.
Starting in MongoDB 4.4, the serverStatus command and the db.serverStatus() shell method return mirroredReads metrics if you specify the field in the operation:
db.serverStatus( { mirroredReads: 1 } )
Starting in MongoDB 4.0, multi-document transactions are available for replica sets.
Multi-document transactions that contain read operations must use read preference primary. All operations in a given transaction must route to the same member.
Until a transaction commits, the data changes made in the transaction are not visible outside the transaction.
However, when a transaction writes to multiple shards, not all outside read operations need to wait for the result of the committed transaction to be visible across the shards. For example, if a transaction is committed and write 1 is visible on shard A but write 2 is not yet visible on shard B, an outside read at read concern "local" can read the results of write 1 without seeing write 2.
Starting in MongoDB 3.6, change streams are available for replica sets and sharded clusters. Change streams allow applications to access real-time data changes without the complexity and risk of tailing the oplog. Applications can use change streams to subscribe to all data changes on a collection or collections.
Replica sets provide a number of options to support application needs. For example, you may deploy a replica set with members in multiple data centers, or control the outcome of elections by adjusting the members[n].priority of some members. Replica sets also support dedicated members for reporting, disaster recovery, or backup functions.
See Priority 0 Replica Set Members, Hidden Replica Set Members and Delayed Replica Set Members for more information.
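How members[n].priority shapes elections can be sketched with a small members array. The hostnames and the helper electableHosts are hypothetical; priority, hidden, and arbiterOnly are member configuration fields. The sketch relies on two rules: a member with priority 0 cannot become primary, and a member that omits priority uses the default of 1.

```javascript
// Hypothetical members array from a replica set configuration.
const members = [
  { _id: 0, host: "db0.example.net:27017", priority: 2 },  // preferred primary
  { _id: 1, host: "db1.example.net:27017" },               // priority defaults to 1
  { _id: 2, host: "reports.example.net:27017", priority: 0, hidden: true } // dedicated reporting member
];

// List hosts that can be elected primary: data-bearing members with priority > 0.
function electableHosts(ms) {
  return ms
    .filter((m) => m.arbiterOnly !== true)
    .filter((m) => (m.priority === undefined ? 1 : m.priority) > 0)
    .map((m) => m.host);
}

const electable = electableHosts(members);
console.log(electable);
```

The hidden reporting member still replicates data but is excluded from the electable list, which is what makes it suitable for a dedicated workload.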
[1] (1, 2) In some circumstances, two nodes in a replica set may transiently believe that they are the primary, but at most one of them can complete writes with { w: "majority" } write concern. The node that can complete { w: "majority" } writes is the current primary, and the other node is a former primary that has not yet recognized its demotion, typically due to a network partition. When this occurs, clients that connect to the former primary may observe stale data despite having requested read preference primary, and new writes to the former primary will eventually roll back.