Replica Set Elections

Replica sets use elections to determine which set member will become primary. Replica sets can trigger an election in response to a variety of events, such as the primary becoming unavailable for longer than the configured timeout.

In the following diagram, the primary node was unavailable for longer than the configured timeout, which triggered the automatic failover process. One of the remaining secondaries called for an election to select a new primary and automatically resume normal operations.

Diagram of an election of a new primary. In a three-member replica set with two secondaries, the primary becomes unreachable. The loss of a primary triggers an election where one of the secondaries becomes the new primary.

The replica set cannot process write operations until the election completes successfully. The replica set can continue to serve read queries if such queries are configured to run on secondaries.

The median time before a cluster elects a new primary should not typically exceed 12 seconds, assuming default replica configuration settings. This includes the time required to mark the primary as unavailable and to call and complete an election. You can tune this time period by modifying the settings.electionTimeoutMillis replication configuration option. Factors such as network latency may extend the time required for replica set elections to complete, which in turn affects the amount of time your cluster may operate without a primary. These factors depend on your particular cluster architecture.
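As a minimal sketch of tuning this option, the helper below returns a copy of a replica set configuration document with settings.electionTimeoutMillis changed. The function name withElectionTimeout and the sample configuration are illustrative assumptions; in mongosh the input would typically come from rs.conf() and the result would be applied with rs.reconfig().

```javascript
// Hypothetical helper: returns a copy of a replica set config document
// with settings.electionTimeoutMillis set to the given value.
function withElectionTimeout(config, timeoutMillis) {
  if (!Number.isInteger(timeoutMillis) || timeoutMillis <= 0) {
    throw new RangeError("timeoutMillis must be a positive integer");
  }
  return {
    ...config,
    settings: { ...(config.settings || {}), electionTimeoutMillis: timeoutMillis },
  };
}

// Illustrative config; a real one would come from rs.conf() in mongosh.
const sampleConfig = {
  _id: "rs0",
  version: 1,
  members: [],
  settings: { electionTimeoutMillis: 10000 },
};

const tunedConfig = withElectionTimeout(sampleConfig, 5000);
console.log(tunedConfig.settings.electionTimeoutMillis); // 5000
```

A lower value shortens failure detection but can cause more elections during transient network problems, so the right setting depends on the deployment.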

Your application connection logic should include tolerance for automatic failovers and the subsequent elections. Starting in MongoDB 3.6, MongoDB drivers can detect the loss of the primary and automatically retry certain write operations a single time, providing additional built-in handling of automatic failovers and elections.
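Retryable writes are requested through the retryWrites connection string option. The sketch below only assembles such a connection string; the helper name buildReplicaSetUri, the host names, and the set name rs0 are placeholders, not values from this page.

```javascript
// Builds a replica set connection string with retryable writes requested.
// Host names and the replica set name are placeholders for illustration.
function buildReplicaSetUri(hosts, replicaSetName) {
  const params = new URLSearchParams({
    replicaSet: replicaSetName,
    retryWrites: "true",
  });
  return `mongodb://${hosts.join(",")}/?${params.toString()}`;
}

const exampleUri = buildReplicaSetUri(
  ["db1.example.net:27017", "db2.example.net:27017", "db3.example.net:27017"],
  "rs0"
);
console.log(exampleUri);
```

Listing several members in the string lets the driver discover the new primary after an election even if the member it first contacted is down.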

Factors and Conditions that Affect Elections

Replication Election Protocol

Changed in version 4.0.

MongoDB 4.0 removes the deprecated replication protocol version 0.

Replication protocolVersion: 1 reduces replica set failover time and accelerates the detection of multiple simultaneous primaries.

With protocolVersion 1, you can use catchUpTimeoutMillis to prioritize between faster failovers and preservation of w:1 writes.

For more information on pv1, see Replica Set Protocol Version.

Heartbeats

Replica set members send heartbeats (pings) to each other every two seconds. If a heartbeat does not return within 10 seconds, the other members mark the delinquent member as inaccessible.
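The timeout rule can be illustrated with a simplified model. This is not the server's implementation; the function findUnreachable, the timestamp map, and the host names are assumptions made for the example.

```javascript
const HEARTBEAT_TIMEOUT_MS = 10000; // 10 seconds, as described above

// Simplified model: lastReplyAtMs maps each host to the time (in ms)
// of its most recent heartbeat reply. Hosts silent for longer than the
// timeout are reported as unreachable.
function findUnreachable(lastReplyAtMs, nowMs, timeoutMs = HEARTBEAT_TIMEOUT_MS) {
  return Object.keys(lastReplyAtMs)
    .filter((host) => nowMs - lastReplyAtMs[host] > timeoutMs)
    .sort();
}

const heartbeatLog = {
  "a.example.net:27017": 59000, // replied 1 second ago
  "b.example.net:27017": 48000, // replied 12 seconds ago
  "c.example.net:27017": 50000, // replied exactly 10 seconds ago
};
const unreachable = findUnreachable(heartbeatLog, 60000);
console.log(unreachable); // [ 'b.example.net:27017' ]
```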

Member Priority

After a replica set has a stable primary, the election algorithm will make a "best-effort" attempt to have the secondary with the highest priority available call an election. Member priority affects both the timing and the outcome of elections: secondaries with higher priority call elections relatively sooner than secondaries with lower priority, and are also more likely to win. However, a lower priority instance can be elected as primary for brief periods, even if a higher priority secondary is available. Replica set members continue to call elections until the highest priority member available becomes primary.

Members with a priority value of 0 cannot become primary and do not seek election. For details, see Priority 0 Replica Set Members.
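The effect of priority on candidacy can be sketched as follows. The sketch models only the preference order described above, not the full election algorithm; the function electionCandidates and the sample members are hypothetical, and a missing priority is assumed to default to 1.

```javascript
// Returns the hosts that may seek election, highest priority first.
// Members with priority 0 are excluded because they cannot become primary.
// Assumption: an unspecified priority is treated as 1.
function electionCandidates(members) {
  return members
    .filter((m) => (m.priority ?? 1) > 0)
    .sort((a, b) => (b.priority ?? 1) - (a.priority ?? 1))
    .map((m) => m.host);
}

const priorityExample = [
  { host: "a.example.net:27017", priority: 1 },
  { host: "b.example.net:27017", priority: 2 },
  { host: "c.example.net:27017", priority: 0 },
];
const candidates = electionCandidates(priorityExample);
console.log(candidates); // [ 'b.example.net:27017', 'a.example.net:27017' ]
```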

Mirrored Reads

Starting in version 4.4, MongoDB provides mirrored reads to pre-warm the caches of electable secondary members with the most recently accessed data. With mirrored reads, the primary can mirror a subset of operations that it receives and send them to a subset of electable secondaries. Pre-warming the cache of a secondary can help restore performance more quickly after an election.

For details, see Mirrored Reads.

Loss of a Data Center

With a distributed replica set, the loss of a data center may affect the ability of the remaining members in the other data center or data centers to elect a primary.

If possible, distribute the replica set members across data centers to maximize the likelihood that even with a loss of a data center, one of the remaining replica set members can become the new primary.
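One way to reason about a distribution is to check, for each data center, whether the members that remain after its loss still form a majority of voting members. The function name and the data center labels below are illustrative assumptions.

```javascript
// Given the number of voting members per data center, returns the data
// centers whose loss would leave the remaining members without a majority.
function dataCentersWhoseLossBlocksElection(votesByDataCenter) {
  const total = Object.values(votesByDataCenter).reduce((sum, n) => sum + n, 0);
  const majority = Math.floor(total / 2) + 1;
  return Object.keys(votesByDataCenter)
    .filter((dc) => total - votesByDataCenter[dc] < majority)
    .sort();
}

// Two data centers: losing the larger one leaves 2 of 5 votes.
console.log(dataCentersWhoseLossBlocksElection({ east: 3, west: 2 })); // [ 'east' ]

// Three data centers: any single loss still leaves at least 3 of 5 votes.
console.log(dataCentersWhoseLossBlocksElection({ east: 2, west: 2, central: 1 })); // []
```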

Network Partition

A network partition may segregate a primary into a partition with a minority of nodes. When the primary detects that it can only see a minority of nodes in the replica set, the primary steps down as primary and becomes a secondary. Independently, a member in the partition that can communicate with a majority of the nodes (including itself) holds an election to become the new primary.
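The majority test behind this behavior is simple arithmetic. The following sketch assumes a hypothetical five-member set split into partitions of two and three members; seesMajority is an illustrative name.

```javascript
// True when a member can communicate with a majority of the set.
// visibleMembers counts the member itself, as in the text above.
function seesMajority(totalMembers, visibleMembers) {
  return visibleMembers >= Math.floor(totalMembers / 2) + 1;
}

// Five-member set partitioned 2 / 3.
console.log(seesMajority(5, 2)); // false: a primary on this side steps down
console.log(seesMajority(5, 3)); // true: a member on this side can be elected
```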

Voting Members

The replica set member configuration setting members[n].votes and the member state determine whether a member votes in an election.

  • All replica set members that have their members[n].votes setting equal to 1 vote in elections. To exclude a member from voting in an election, change the value of the member's members[n].votes configuration to 0.

  • Only voting members in the following states are eligible to vote:

Non-Voting Members

Although non-voting members do not vote in elections, these members hold copies of the replica set's data and can accept read operations from client applications.

Because a replica set can have up to 50 members, but only 7 voting members, non-voting members allow a replica set to have more than seven members.

Non-voting members (that is, members whose votes is 0) must have a priority of 0.

For instance, the following nine-member replica set has seven voting members and two non-voting members.

Diagram of a 9 member replica set with the maximum of 7 voting members.

A non-voting member has both votes and priority equal to 0:

{
   "_id" : <num>,
   "host" : <hostname:port>,
   "arbiterOnly" : false,
   "buildIndexes" : true,
   "hidden" : false,
   "priority" : 0,
   "tags" : {
   },
   "secondaryDelaySecs" : NumberLong(0),
   "votes" : 0
}
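The limits described in this section can be checked before applying a configuration. The validator below is a sketch under stated assumptions: the name validateVotes and the sample members are hypothetical, and unspecified votes and priority values are assumed to default to 1.

```javascript
const MAX_MEMBERS = 50;
const MAX_VOTING_MEMBERS = 7;

// Returns a list of problems found in a members array; an empty list
// means the voting rules described above are satisfied.
function validateVotes(members) {
  const problems = [];
  if (members.length > MAX_MEMBERS) {
    problems.push(`too many members: ${members.length} > ${MAX_MEMBERS}`);
  }
  const votingCount = members.filter((m) => (m.votes ?? 1) > 0).length;
  if (votingCount > MAX_VOTING_MEMBERS) {
    problems.push(`too many voting members: ${votingCount} > ${MAX_VOTING_MEMBERS}`);
  }
  for (const m of members) {
    if ((m.votes ?? 1) === 0 && (m.priority ?? 1) !== 0) {
      problems.push(`member ${m._id}: non-voting members must have priority 0`);
    }
  }
  return problems;
}

// Nine members: seven voting, two non-voting with priority 0.
const nineMembers = Array.from({ length: 9 }, (_, i) => ({
  _id: i,
  host: `m${i}.example.net:27017`,
  votes: i < 7 ? 1 : 0,
  priority: i < 7 ? 1 : 0,
}));
console.log(validateVotes(nineMembers)); // []
```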
Important

Do not alter the number of votes to control which members will become primary. Instead, modify the members[n].priority option. Only alter the number of votes in exceptional cases, for example, to permit more than seven members.

To configure a non-voting member, see Configure Non-Voting Replica Set Member.
