Replica Set Data Synchronization

To maintain up-to-date copies of the shared data set, secondary members of a replica set sync or replicate data from other members. MongoDB uses two forms of data synchronization: initial sync to populate new members with the full data set, and replication to apply ongoing changes to the entire data set.

Initial Sync

Initial sync copies all the data from one member of the replica set to another member. See Initial Sync Source Selection for more information on initial sync source selection criteria.

Starting in MongoDB 4.4, you can specify the preferred initial sync source using the initialSyncSourceReadPreference parameter. This parameter can only be specified when starting the mongod.
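
Because the parameter is set at startup, it is passed on the mongod command line with --setParameter. The following is a minimal sketch that assembles such a command line; the replica set name rs0 and the path /data/db are placeholder assumptions, and the helper function is hypothetical, not part of MongoDB.

```python
import shlex

# Read preference modes accepted for the initial sync source.
READ_PREFERENCES = {
    "primary",
    "primaryPreferred",
    "secondary",
    "secondaryPreferred",
    "nearest",
}


def mongod_command(read_preference, repl_set="rs0", dbpath="/data/db"):
    """Build a mongod argument list (hypothetical helper for illustration).

    repl_set and dbpath are placeholder values, not recommendations.
    """
    if read_preference not in READ_PREFERENCES:
        raise ValueError(f"unsupported read preference: {read_preference}")
    return [
        "mongod",
        "--replSet", repl_set,
        "--dbpath", dbpath,
        "--setParameter",
        f"initialSyncSourceReadPreference={read_preference}",
    ]


command = mongod_command("secondaryPreferred")
print(shlex.join(command))
```

The argument list can be handed to a process launcher or printed for use in a shell script.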

Starting in MongoDB 5.2, initial syncs can be logical or file copy based.

Logical Initial Sync Process

When you perform a logical initial sync, MongoDB:

  1. Clones all databases except the local database. To clone, the mongod scans every collection in each source database and inserts all data into its own copies of these collections.
  2. Builds all collection indexes as the documents are copied for each collection.
  3. Pulls newly added oplog records during the data copy. Ensure that the target member has enough disk space in the local database to temporarily store these oplog records for the duration of this data copy stage.
  4. Applies all changes to the data set. Using the oplog from the source, the mongod updates its data set to reflect the current state of the replica set.
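
The four steps can be illustrated with a toy in-memory model. This is only a sketch of the ordering of the steps, not how mongod is implemented: the nested-dict data layout, the oplog entry fields (op, db, coll, _id, doc), and the function name are all assumptions made for the example.

```python
def logical_initial_sync(source_dbs, source_indexes, oplog_during_copy):
    """Toy model of a logical initial sync.

    source_dbs:        {db_name: {collection_name: {doc_id: document}}}
    source_indexes:    {(db_name, collection_name): [index_name, ...]}
    oplog_during_copy: oplog entries written on the source while copying
    """
    target = {}
    target_indexes = {}

    # Steps 1 and 2: clone every database except "local", building each
    # collection's indexes as its documents are copied.
    for db_name, collections in source_dbs.items():
        if db_name == "local":
            continue
        target[db_name] = {}
        for coll_name, docs in collections.items():
            target[db_name][coll_name] = {
                doc_id: dict(doc) for doc_id, doc in docs.items()
            }
            key = (db_name, coll_name)
            target_indexes[key] = list(source_indexes.get(key, []))

    # Step 3: oplog records added during the copy are stored temporarily.
    buffered = list(oplog_during_copy)

    # Step 4: apply the buffered changes, in order, to catch up.
    for entry in buffered:
        coll = target.setdefault(entry["db"], {}).setdefault(entry["coll"], {})
        if entry["op"] == "delete":
            coll.pop(entry["_id"], None)
        else:  # "insert" or "update": store the new document version
            coll[entry["_id"]] = dict(entry["doc"])

    return target, target_indexes


source = {
    "local": {"oplog.rs": {1: {"note": "never cloned"}}},
    "shop": {"orders": {1: {"qty": 1}, 2: {"qty": 5}}},
}
indexes = {("shop", "orders"): ["qty_1"]}
oplog = [
    {"op": "update", "db": "shop", "coll": "orders", "_id": 1, "doc": {"qty": 2}},
    {"op": "delete", "db": "shop", "coll": "orders", "_id": 2},
    {"op": "insert", "db": "shop", "coll": "orders", "_id": 3, "doc": {"qty": 7}},
]
data, built = logical_initial_sync(source, indexes, oplog)
print(data, built)
```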

When the initial sync finishes, the member transitions from STARTUP2 to SECONDARY.

To perform an initial sync, see Resync a Member of a Replica Set.

File Copy Based Initial Sync

File copy based initial sync runs the initial sync process by copying and moving files on the file system.

File copy based initial sync can be faster than logical initial sync.

It is available only on Enterprise Server.

Limitations

  • File copy based initial sync replaces the local database on the node being synced to with the local database from the node being synced from.
  • You can only run an initial sync from one given node at a time.
  • You cannot run a backup on the node that is being synced to during the initial sync.
  • You cannot run a backup on the node that is being synced from during the initial sync.
  • You cannot use file copy based initial sync with the encrypted storage engine to re-key the data.

Warning

You cannot write to the local database during a file copy based sync.

Fault Tolerance

If a secondary performing initial sync encounters a non-transient (i.e. persistent) network error during the sync process, the secondary restarts the initial sync process from the beginning.

Starting in MongoDB 4.4, a secondary performing initial sync can attempt to resume the sync process if interrupted by a transient (i.e. temporary) network error, collection drop, or collection rename. The sync source must also run MongoDB 4.4 to support resumable initial sync. If the sync source runs MongoDB 4.2 or earlier, the secondary must restart the initial sync process as if it encountered a non-transient network error.

By default, the secondary tries to resume initial sync for 24 hours. MongoDB 4.4 adds the initialSyncTransientErrorRetryPeriodSeconds server parameter for controlling the amount of time the secondary attempts to resume initial sync. If the secondary cannot successfully resume the initial sync process during the configured time period, it selects a new healthy source from the replica set and restarts the initial synchronization process from the beginning.

The secondary attempts to restart the initial sync up to 10 times before returning a fatal error.
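
The resume and restart rules above can be sketched as a small simulation. The event tuples, the function name, and the returned labels are assumptions made for illustration; only the 24-hour default retry period and the limit of 10 restarts come from the text.

```python
DEFAULT_RETRY_PERIOD_SECONDS = 24 * 60 * 60  # default resume window: 24 hours
MAX_RESTARTS = 10


def simulate_initial_sync(events,
                          retry_period_seconds=DEFAULT_RETRY_PERIOD_SECONDS,
                          max_restarts=MAX_RESTARTS):
    """Walk a list of (kind, outage_seconds) events (hypothetical format).

    kind is "ok", "transient", or "persistent". Returns a tuple of
    (final_state, restarts_performed).
    """
    restarts = 0
    for kind, outage_seconds in events:
        if kind == "ok":
            return "SECONDARY", restarts
        if kind == "transient" and outage_seconds <= retry_period_seconds:
            # The outage ended inside the retry period: resume, no restart.
            continue
        # Persistent error, or a transient one that outlasted the retry
        # period: the sync must start over from the beginning.
        if restarts == max_restarts:
            return "FATAL", restarts
        restarts += 1
    return "STARTUP2", restarts  # still syncing when the events ran out


print(simulate_initial_sync([("transient", 60), ("ok", 0)]))
print(simulate_initial_sync([("transient", 90_000), ("ok", 0)]))
print(simulate_initial_sync([("persistent", 0)] * 11))
```

The second call models an outage longer than the retry period, so it counts one restart before finishing; the third exhausts all 10 restarts.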

Initial Sync Source Selection

Initial sync source selection depends on the value of the mongod startup parameter initialSyncSourceReadPreference (new in 4.4):

  • For initialSyncSourceReadPreference set to primary (default if chaining is disabled), select the primary as the sync source. If the primary is unavailable or unreachable, log an error and periodically check for primary availability.
  • For initialSyncSourceReadPreference set to primaryPreferred (default for voting replica set members), attempt to select the primary as the sync source. If the primary is unavailable or unreachable, perform sync source selection from the remaining replica set members.
  • For all other supported read modes, perform sync source selection from the replica set members.
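
The branching on initialSyncSourceReadPreference can be sketched as a function that returns the pool of members to consider. The member dictionaries (name, state, reachable) and the function name are assumptions for the example; an empty result for primary stands for the "log an error and check again later" case.

```python
def initial_sync_candidates(read_preference, members):
    """Return the members to consider as initial sync source.

    members is a list of dicts with the hypothetical keys
    "name", "state", and "reachable".
    """
    primary = next(
        (m for m in members if m["state"] == "PRIMARY" and m["reachable"]),
        None,
    )
    if read_preference == "primary":
        # No fallback: the caller logs an error and retries periodically.
        return [primary] if primary else []
    if read_preference == "primaryPreferred":
        if primary:
            return [primary]
        # Primary unavailable: select among the remaining members.
        return [m for m in members if m["state"] != "PRIMARY"]
    # Any other read mode: run selection over the replica set members.
    return list(members)


members = [
    {"name": "a", "state": "PRIMARY", "reachable": False},
    {"name": "b", "state": "SECONDARY", "reachable": True},
]
print(initial_sync_candidates("primary", members))
print(initial_sync_candidates("primaryPreferred", members))
```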

Members performing initial sync source selection make two passes through the list of all replica set members.

The member applies the following criteria to each replica set member when making the first pass for selecting an initial sync source:

  • The sync source must be in the PRIMARY or SECONDARY replication state.

  • The sync source must be online and reachable.

  • If initialSyncSourceReadPreference is secondary or secondaryPreferred, the sync source must be a secondary.

  • The sync source must be visible.

  • The sync source must be within 30 seconds of the newest oplog entry on the primary.

  • If the member builds indexes, the sync source must build indexes.

  • If the member votes in replica set elections, the sync source must also vote.

  • If the member is not a delayed member, the sync source must not be delayed.

  • If the member is a delayed member, the sync source must have a shorter configured delay.

  • The sync source must be faster (i.e. lower latency) than the current best sync source.

If no candidate sync sources remain after the first pass, the member performs a second pass with relaxed criteria. See Sync Source Selection (Second Pass).

The member applies the following criteria to each replica set member when making the second pass for selecting an initial sync source:

If the member cannot select an initial sync source after two passes, it logs an error and waits 1 second before restarting the selection process. The secondary mongod can restart the initial sync source selection process up to 10 times before exiting with an error.

Replication

Secondary members replicate data continuously after the initial sync. Secondary members copy the oplog from their sync source and apply these operations in an asynchronous process. [1]

Secondaries may automatically change their sync source as needed based on changes in the ping time and the replication state of other members. See Replication Sync Source Selection for more information on sync source selection criteria.

[1] Starting in version 4.2 (also available starting in 4.0.6), secondary members of a replica set log oplog entries that take longer than the slow operation threshold to apply. These slow oplog messages:
  • Are logged for the secondaries in the diagnostic log.
  • Are logged under the REPL component with the text applied op: <oplog entry> took <num>ms.
  • Do not depend on the log levels (either at the system or component level).
  • Do not depend on the profiling level.
  • May be affected by slowOpSampleRate, depending on your MongoDB version:
    • In MongoDB 4.2 and earlier, these slow oplog entries are not affected by the slowOpSampleRate. MongoDB logs all slow oplog entries regardless of the sample rate.
    • In MongoDB 4.4 and later, these slow oplog entries are affected by the slowOpSampleRate.
The profiler does not capture slow oplog entries.
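
To pull these messages out of a diagnostic log, a regular expression over the quoted text is usually enough. This is a sketch that assumes the plain-text form applied op: <oplog entry> took <num>ms; the sample line, including its prefix, is invented for illustration, and real log lines may be formatted differently depending on the MongoDB version.

```python
import re

# Matches the message text described in the footnote above.
SLOW_OPLOG_PATTERN = re.compile(
    r"applied op: (?P<entry>.*) took (?P<millis>\d+)ms"
)


def parse_slow_oplog_line(line):
    """Return (oplog_entry_text, duration_ms), or None if no match."""
    match = SLOW_OPLOG_PATTERN.search(line)
    if match is None:
        return None
    return match.group("entry"), int(match.group("millis"))


# Hypothetical log line; only the "applied op ... took ...ms" part is
# taken from the documentation text.
sample = 'I REPL [repl-writer-worker-0] applied op: { op: "u", ns: "shop.orders" } took 153ms'
print(parse_slow_oplog_line(sample))
```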

Streaming Replication

Starting in MongoDB 4.4, sync sources send a continuous stream of oplog entries to their syncing secondaries. Streaming replication mitigates replication lag in high-load and high-latency networks. It also:

  • Reduces staleness for reads from secondaries.
  • Reduces risk of losing write operations with w: 1 due to primary failover.
  • Reduces latency on write operations with w: "majority" and w: >1 (that is, any write concern that requires waiting for replication).

Prior to MongoDB 4.4, secondaries fetched batches of oplog entries by issuing a request to their sync source and waiting for a response. This required a network roundtrip for each batch of oplog entries. MongoDB 4.4 adds the oplogFetcherUsesExhaust startup parameter for disabling streaming replication and using the older replication behavior. Set the oplogFetcherUsesExhaust parameter to false only if there are resource constraints on the sync source or if you wish to limit MongoDB's usage of network bandwidth for replication.
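
The difference between the two fetch styles can be shown with a deliberately simple count of requests. This is a toy model under the assumption that a stream needs a single opening request while the older behavior needs one request per batch; it ignores server processing time, batch size, and bandwidth, and the function names are hypothetical.

```python
def fetch_requests(num_batches, streaming):
    """Number of requests a secondary issues to receive num_batches."""
    if num_batches == 0:
        return 0
    # Streaming: one request opens the stream, batches then keep arriving.
    # Older behavior: every batch is requested and awaited separately.
    return 1 if streaming else num_batches


def time_spent_requesting_ms(num_batches, round_trip_ms, streaming):
    """Round-trip time spent on requests alone in this toy model."""
    return fetch_requests(num_batches, streaming) * round_trip_ms


# 1000 batches over a link with an 80 ms round trip (made-up numbers).
print(time_spent_requesting_ms(1000, 80, streaming=False))
print(time_spent_requesting_ms(1000, 80, streaming=True))
```

The gap grows with the round-trip time, which is consistent with the benefit being most visible on high-latency networks.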

Multithreaded Replication

MongoDB applies write operations in batches using multiple threads to improve concurrency. MongoDB groups the operations in each batch by document id (WiredTiger) and simultaneously applies each group of operations using a different thread. MongoDB always applies write operations to a given document in their original write order.
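
The ordering guarantee follows from the grouping: if every operation on a document lands in the same group, and each group is applied sequentially by one thread, then per-document order is preserved even though groups run in parallel. The sketch below illustrates that idea with a hash on the document id; the operation format and the hashing scheme are assumptions, not MongoDB's internals.

```python
from concurrent.futures import ThreadPoolExecutor


def apply_batch(batch, data, num_threads=4):
    """Apply a batch of {"_id": ..., "value": ...} operations to data.

    Operations are bucketed by document id so that all writes to one
    document stay in one bucket, in their original order.
    """
    groups = [[] for _ in range(num_threads)]
    for op in batch:
        groups[hash(op["_id"]) % num_threads].append(op)

    def apply_group(ops):
        # One thread applies its group sequentially, preserving order.
        for op in ops:
            data[op["_id"]] = op["value"]

    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # list() forces completion and surfaces any worker exception.
        list(pool.map(apply_group, groups))
    return data


batch = [
    {"_id": 1, "value": "a"},
    {"_id": 2, "value": "x"},
    {"_id": 1, "value": "b"},
    {"_id": 2, "value": "y"},
    {"_id": 1, "value": "c"},
]
result = apply_batch(batch, {})
print(result[1], result[2])
```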

Changed in version 4.0.

Starting in MongoDB 4.0, read operations that target secondaries and are configured with a read concern level of "local" or "majority" read from a WiredTiger snapshot of the data if the read takes place on a secondary where replication batches are being applied. Reading from a snapshot guarantees a consistent view of the data, and allows the read to occur simultaneously with the ongoing replication without the need for a lock. As a result, secondary reads requiring these read concern levels no longer need to wait for replication batches to be applied, and can be handled as they are received.

Flow Control

Starting in MongoDB 4.2, administrators can limit the rate at which the primary applies its writes with the goal of keeping the majority committed lag under a configurable maximum value, flowControlTargetLagSeconds.

By default, flow control is enabled.

Note

For flow control to engage, the replica set/sharded cluster must have: featureCompatibilityVersion (FCV) of 4.2 and read concern majority enabled. That is, enabled flow control has no effect if FCV is not 4.2 or if read concern majority is disabled.
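
The condition in the note reduces to a conjunction of three checks. The sketch below mirrors the note as written; the function name and argument names are hypothetical, and it does not model how the primary actually throttles writes.

```python
def flow_control_engages(enabled, fcv, majority_read_concern_enabled):
    """Return True only when all conditions from the note hold.

    fcv is the featureCompatibilityVersion as a string, e.g. "4.2".
    """
    return bool(enabled and fcv == "4.2" and majority_read_concern_enabled)


print(flow_control_engages(True, "4.2", True))
print(flow_control_engages(True, "4.0", True))
print(flow_control_engages(True, "4.2", False))
```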

For more information, see Flow Control.

Replication Sync Source Selection

Replication sync source selection depends on the replica set chaining setting:

  • With chaining enabled (default), perform sync source selection from the replica set members.
  • With chaining disabled, select the primary as the sync source. If the primary is unavailable or unreachable, log an error and periodically check for primary availability.

Members performing replication sync source selection make two passes through the list of all replica set members.

The member applies the following criteria to each replica set member when making the first pass for selecting a replication sync source:

  • The sync source must be in the PRIMARY or SECONDARY replication state.

  • The sync source must be online and reachable.

  • The sync source must have newer oplog entries than the member (i.e. the sync source is ahead of the member).

  • The sync source must be visible.

  • The sync source must be within 30 seconds of the newest oplog entry on the primary.

  • If the member builds indexes, the sync source must build indexes.

  • If the member votes in replica set elections, the sync source must also vote.

  • If the member is not a delayed member, the sync source must not be delayed.

  • If the member is a delayed member, the sync source must have a shorter configured delay.

  • The sync source must be faster (i.e. lower latency) than the current best sync source.

If no candidate sync sources remain after the first pass, the member performs a second pass with relaxed criteria. See Sync Source Selection (Second Pass).

The member applies the following criteria to each replica set member when making the second pass for selecting a replication sync source:

  • The sync source must be in the PRIMARY or SECONDARY replication state.

  • The sync source must be online and reachable.

  • If the member builds indexes, the sync source must build indexes.

  • The sync source must be faster (i.e. lower latency) than the current best sync source.

If the member cannot select a sync source after two passes, it logs an error and waits 1 second before restarting the selection process.
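
The two passes can be sketched as a strict filter followed by a relaxed one, each picking the lowest-latency survivor. The Member fields below (optime in seconds, hidden, ping_ms, and so on) are assumptions chosen to express the listed criteria; this is an illustration of the rules, not MongoDB's selection code.

```python
from dataclasses import dataclass

MAX_LAG_SECONDS = 30  # "within 30 seconds of the newest oplog entry"


@dataclass
class Member:
    name: str
    state: str
    optime: float = 0.0        # time of the newest oplog entry, in seconds
    reachable: bool = True
    hidden: bool = False       # a hidden member is not visible
    builds_indexes: bool = True
    votes: bool = True
    delay_seconds: float = 0.0
    ping_ms: float = 0.0


def passes_second(me, cand):
    """Relaxed criteria used on the second pass."""
    return (cand.state in ("PRIMARY", "SECONDARY")
            and cand.reachable
            and (cand.builds_indexes or not me.builds_indexes))


def passes_first(me, cand, primary_optime):
    """Full criteria used on the first pass."""
    if not passes_second(me, cand):
        return False
    if cand.optime <= me.optime or cand.hidden:
        return False
    if primary_optime - cand.optime > MAX_LAG_SECONDS:
        return False
    if me.votes and not cand.votes:
        return False
    if me.delay_seconds == 0:
        return cand.delay_seconds == 0
    return cand.delay_seconds < me.delay_seconds


def select_sync_source(me, members, primary_optime):
    """Return the chosen member, or None (caller waits 1 second, retries)."""
    others = [m for m in members if m.name != me.name]
    checks = (lambda c: passes_first(me, c, primary_optime),
              lambda c: passes_second(me, c))
    for check in checks:
        candidates = [c for c in others if check(c)]
        if candidates:
            # "Faster than the current best" reduces to the lowest ping.
            return min(candidates, key=lambda c: c.ping_ms)
    return None


me = Member("c", "SECONDARY", optime=100)
a = Member("a", "PRIMARY", optime=200, ping_ms=50)
b = Member("b", "SECONDARY", optime=190, ping_ms=5)
print(select_sync_source(me, [a, b, me], primary_optime=200).name)
```

In the example, both a and b satisfy the first pass, and b is chosen because it has the lower ping time.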

The number of times a source can be changed per hour is configurable by setting the maxNumSyncSourceChangesPerHour parameter.

Note

Starting in MongoDB 4.4, the startup parameter initialSyncSourceReadPreference takes precedence over the replica set's settings.chainingAllowed setting when selecting an initial sync source. After a replica set member successfully performs initial sync, it defers to the value of chainingAllowed when selecting a replication sync source.

See Initial Sync Source Selection for more information on initial sync source selection.
