Replica Set Data Synchronization

To maintain up-to-date copies of the shared data set, secondary members of a replica set sync or replicate data from other members. MongoDB uses two forms of data synchronization: initial sync to populate new members with the full data set, and replication to apply ongoing changes to the entire data set.

Initial Sync

Initial sync copies all the data from one member of the replica set to another member. For the selection criteria, see Initial Sync Source Selection.

Starting in MongoDB 4.4, you can specify the preferred initial sync source using the initialSyncSourceReadPreference parameter. This parameter can only be specified when starting the mongod.

Starting in MongoDB 5.2, initial syncs can be logical or file copy based.

Logical Initial Sync Process

When you perform a logical initial sync, MongoDB:

  1. Clones all databases except the local database. To clone, the mongod scans every collection in each source database and inserts all data into its own copies of these collections.
  2. Builds all collection indexes as the documents are copied for each collection.
  3. Pulls newly added oplog records during the data copy. Ensure that the target member has enough disk space in the local database to temporarily store these oplog records for the duration of this data copy stage.
  4. Applies all changes to the data set. Using the oplog from the source, the mongod updates its data set to reflect the current state of the replica set.

When the initial sync finishes, the member transitions from STARTUP2 to SECONDARY.
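The clone-then-apply sequence above can be illustrated with a toy in-memory model. This is only a sketch: the function name, the tuple layout of the oplog records, and the dictionary-based "databases" are assumptions made for illustration and do not reflect MongoDB's internal data structures. Index builds (step 2) are left out.

```python
from copy import deepcopy


def logical_initial_sync(source_dbs, oplog_during_copy):
    """Toy model of a logical initial sync (hypothetical helper).

    source_dbs: {db_name: {collection_name: {doc_id: document}}}
    oplog_during_copy: list of (op, db, collection, doc_id, document)
        tuples recorded on the source while the copy was running.
    """
    # Step 1: clone every database except "local".
    target = {
        db: deepcopy(colls) for db, colls in source_dbs.items() if db != "local"
    }

    # Step 3: oplog records pulled during the copy are buffered on the target.
    buffered = list(oplog_during_copy)

    # Step 4: apply the buffered changes in their original order.
    for op, db, coll, doc_id, doc in buffered:
        docs = target.setdefault(db, {}).setdefault(coll, {})
        if op in ("insert", "update"):
            docs[doc_id] = doc
        elif op == "delete":
            docs.pop(doc_id, None)
    return target
```

The buffered records are what bring the copied data up to the source's current state, which is why the target needs enough disk space to hold them for the duration of the copy.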

To perform an initial sync, see Resync a Member of a Replica Set.

File Copy Based Initial Sync

Available in MongoDB Enterprise only.

File copy based initial sync runs the initial sync process by copying and moving files on the file system. This sync method can be faster than logical initial sync.

Important

File copy based initial sync may cause inaccurate counts

After file copy based initial sync completes, if you run the count() method without a query predicate, the count of documents returned may be inaccurate.

A count method without a query predicate looks like this: db.<collection>.count().

To learn more, see Inaccurate Counts Without Query Predicate.

Enable File Copy Based Initial Sync

To enable file copy based initial sync, set the initialSyncMethod parameter to fileCopyBased on the destination member for the initial sync. This parameter can only be set at startup.
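Because the parameter is startup-only, it has to be part of the mongod command line (or configuration file) on the destination member. The sketch below assembles such a command line using mongod's --setParameter option. The helper name mongod_args, the replica set name rs0, and the data directory are hypothetical values chosen for illustration.

```python
def mongod_args(repl_set, dbpath, **set_parameters):
    """Build an argv list for mongod with --setParameter options.

    Hypothetical helper: it only formats arguments, it does not start mongod.
    """
    args = ["mongod", "--replSet", repl_set, "--dbpath", dbpath]
    for name, value in set_parameters.items():
        if isinstance(value, bool):
            # Render Python booleans the way the server expects them.
            value = "true" if value else "false"
        args += ["--setParameter", f"{name}={value}"]
    return args


args = mongod_args(
    "rs0",               # assumed replica set name
    "/var/lib/mongodb",  # assumed data directory
    initialSyncMethod="fileCopyBased",
)
print(" ".join(args))
# mongod --replSet rs0 --dbpath /var/lib/mongodb --setParameter initialSyncMethod=fileCopyBased
```

The same helper could carry other startup parameters named on this page, such as initialSyncSourceReadPreference.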

Behavior

File copy based initial sync replaces the local database on the destination member (the member being synced to) with the local database from the source member (the member being synced from).

Limitations

  • During a file copy based initial sync:

    • You cannot run a backup on the member that is being synced to or the member that is being synced from.
    • You cannot write to the local database on the member that is being synced to.
  • You can only run an initial sync from one given member at a time.
  • When using the encrypted storage engine, MongoDB uses the source key to encrypt the destination.

Fault Tolerance

If a secondary performing initial sync encounters a non-transient (i.e. persistent) network error during the sync process, the secondary restarts the initial sync process from the beginning.

Starting in MongoDB 4.4, a secondary performing initial sync can attempt to resume the sync process if interrupted by a transient (i.e. temporary) network error, collection drop, or collection rename. The sync source must also run MongoDB 4.4 to support resumable initial sync. If the sync source runs MongoDB 4.2 or earlier, the secondary must restart the initial sync process as if it encountered a non-transient network error.

By default, the secondary tries to resume initial sync for 24 hours. MongoDB 4.4 adds the initialSyncTransientErrorRetryPeriodSeconds server parameter for controlling the amount of time the secondary attempts to resume initial sync. If the secondary cannot successfully resume the initial sync process during the configured time period, it selects a new healthy source from the replica set and restarts the initial synchronization process from the beginning.

The secondary attempts to restart the initial sync up to 10 times before returning a fatal error.
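The resume, restart, and fatal-error rules above can be summarized as a small decision function. This is a minimal sketch of the documented behavior, not the server's code: the function name, argument names, and string return values are assumptions for illustration.

```python
DEFAULT_RETRY_PERIOD_SECONDS = 24 * 60 * 60  # default resume window stated above
MAX_INITIAL_SYNC_RESTARTS = 10               # restart limit stated above


def next_action(
    interruption_is_resumable,   # transient network error, collection drop or rename
    source_supports_resume,      # sync source runs MongoDB 4.4 or later
    seconds_since_interruption,
    restarts_so_far,
    retry_period_seconds=DEFAULT_RETRY_PERIOD_SECONDS,
):
    """Return "resume", "restart", or "fatal" for an interrupted initial sync."""
    can_resume = (
        interruption_is_resumable
        and source_supports_resume
        and seconds_since_interruption < retry_period_seconds
    )
    if can_resume:
        return "resume"
    if restarts_so_far < MAX_INITIAL_SYNC_RESTARTS:
        return "restart"  # start over from the beginning, possibly on a new source
    return "fatal"


print(next_action(True, True, 60, 0))    # resume
print(next_action(True, False, 60, 0))   # restart
print(next_action(False, True, 0, 10))   # fatal
```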

Initial Sync Source Selection

Initial sync source selection depends on the value of the mongod startup parameter initialSyncSourceReadPreference (new in 4.4):

  • For initialSyncSourceReadPreference set to primary (default if chaining is disabled), select the primary as the sync source. If the primary is unavailable or unreachable, log an error and periodically check for primary availability.
  • For initialSyncSourceReadPreference set to primaryPreferred (default for voting replica set members), attempt to select the primary as the sync source. If the primary is unavailable or unreachable, perform sync source selection from the remaining replica set members.
  • For all other supported read modes, perform sync source selection from the replica set members.

Members performing initial sync source selection make two passes through the list of all replica set members.

The member applies the following criteria to each replica set member when making the first pass for selecting an initial sync source:

  • The sync source must be in the PRIMARY or SECONDARY replication state.

  • The sync source must be online and reachable.

  • If initialSyncSourceReadPreference is secondary or secondaryPreferred, the sync source must be a secondary.

  • The sync source must be visible.

  • The sync source must be within 30 seconds of the newest oplog entry on the primary.

  • If the member builds indexes, the sync source must build indexes.

  • If the member votes in replica set elections, the sync source must also vote.

  • If the member is not a delayed member, the sync source must not be delayed.

  • If the member is a delayed member, the sync source must have a shorter configured delay.

  • The sync source must be faster (i.e. lower latency) than the current best sync source.

If no candidate sync sources remain after the first pass, the member performs a second pass with relaxed criteria. See Sync Source Selection (Second Pass).

The member applies the following criteria to each replica set member when making the second pass for selecting an initial sync source:

  • The sync source must be in the PRIMARY or SECONDARY replication state.

  • The sync source must be online and reachable.

  • If initialSyncSourceReadPreference is secondary, the sync source must be a secondary.

  • If the member builds indexes, the sync source must build indexes.

  • The sync source must be faster (i.e. lower latency) than the current best sync source.

If the member cannot select an initial sync source after two passes, it logs an error and waits 1 second before restarting the selection process. The secondary mongod can restart the initial sync source selection process up to 10 times before exiting with an error.
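The read-preference rules and the two passes described in this section can be sketched as a filter followed by a lowest-latency pick. The Member fields and function names below are hypothetical; they model only the criteria listed above, not the server's actual implementation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Member:
    # Hypothetical view of a replica set member, for illustration only.
    name: str
    state: str = "SECONDARY"
    reachable: bool = True
    hidden: bool = False
    lag_seconds: float = 0.0   # distance behind the primary's newest oplog entry
    builds_indexes: bool = True
    votes: bool = True
    delay_seconds: int = 0
    ping_ms: float = 1.0


def _eligible(me: Member, cand: Member, pref: str, first_pass: bool) -> bool:
    if cand.state not in ("PRIMARY", "SECONDARY") or not cand.reachable:
        return False
    # The second pass relaxes the secondary-only rule to pref == "secondary".
    secondary_only = ("secondary", "secondaryPreferred") if first_pass else ("secondary",)
    if pref in secondary_only and cand.state != "SECONDARY":
        return False
    if me.builds_indexes and not cand.builds_indexes:
        return False
    if not first_pass:
        return True
    # Criteria that apply only on the first pass.
    if cand.hidden or cand.lag_seconds > 30:
        return False
    if me.votes and not cand.votes:
        return False
    if me.delay_seconds == 0:
        return cand.delay_seconds == 0
    return cand.delay_seconds < me.delay_seconds


def select_initial_sync_source(
    me: Member, others: List[Member], pref: str = "primaryPreferred"
) -> Optional[Member]:
    primary = next((m for m in others if m.state == "PRIMARY" and m.reachable), None)
    if pref == "primary":
        return primary  # None here means: log an error and check again later
    if pref == "primaryPreferred" and primary is not None:
        return primary
    for first_pass in (True, False):
        pool = [m for m in others if _eligible(me, m, pref, first_pass)]
        if pool:
            # "Faster than the current best" reduces to the lowest latency.
            return min(pool, key=lambda m: m.ping_ms)
    return None  # caller waits 1 second and retries, up to 10 times
```

A caller would wrap select_initial_sync_source in the retry loop described above, sleeping for 1 second between attempts.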

Replication

Secondary members replicate data continuously after the initial sync. Secondary members copy the oplog from their sync from source and apply these operations in an asynchronous process. [1]

Secondaries may automatically change their sync from source as needed based on changes in the ping time and state of other members' replication. See Replication Sync Source Selection for more information on sync source selection criteria.

[1] Starting in version 4.2, secondary members of a replica set log oplog entries that take longer than the slow operation threshold to apply. These slow oplog messages:
  • Are logged for the secondaries in the diagnostic log.
  • Are logged under the REPL component with the text applied op: <oplog entry> took <num>ms.
  • Do not depend on the log levels (either at the system or component level).
  • Do not depend on the profiling level.
  • May be affected by slowOpSampleRate, depending on your MongoDB version:
    • In MongoDB 4.2, these slow oplog entries are not affected by the slowOpSampleRate. MongoDB logs all slow oplog entries regardless of the sample rate.
    • In MongoDB 4.4 and later, these slow oplog entries are affected by the slowOpSampleRate.
The profiler does not capture slow oplog entries.
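To pull these messages out of a diagnostic log, a simple pattern match on the "applied op" text may be enough. The sketch below assumes plain-text log lines that contain the REPL component name and the message text quoted above; the sample line is made up, and the exact layout of real log lines varies by MongoDB version (newer versions emit structured logs), so treat the pattern as a starting point.

```python
import re
from typing import Optional, Tuple

# Matches the message text described above; an optional comma before
# "took" is tolerated as an assumption about formatting variations.
SLOW_OPLOG = re.compile(r"applied op: (.*?),? took (\d+)ms\s*$")


def parse_slow_oplog_line(line: str) -> Optional[Tuple[str, int]]:
    """Return (oplog entry text, duration in ms), or None if not a match."""
    if "REPL" not in line:
        return None
    match = SLOW_OPLOG.search(line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


# Hypothetical sample line, for illustration only.
sample = 'I  REPL  [repl-writer-worker-0] applied op: { op: "u", ns: "app.users" } took 250ms'
print(parse_slow_oplog_line(sample))
```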

Streaming Replication

Starting in MongoDB 4.4, sync from sources send a continuous stream of oplog entries to their syncing secondaries. Streaming replication mitigates replication lag in high-load and high-latency networks. It also:

  • Reduces staleness for reads from secondaries.
  • Reduces risk of losing write operations with w: 1 due to primary failover.
  • Reduces latency on write operations with w: "majority" and w: >1 (that is, any write concern that requires waiting for replication).

Prior to MongoDB 4.4, secondaries fetched batches of oplog entries by issuing a request to their sync from source and waiting for a response. This required a network roundtrip for each batch of oplog entries. MongoDB 4.4 adds the oplogFetcherUsesExhaust startup parameter for disabling streaming replication and using the older replication behavior. Set the oplogFetcherUsesExhaust parameter to false only if there are any resource constraints on the sync from source or if you wish to limit MongoDB's usage of network bandwidth for replication.
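A back-of-the-envelope model shows why removing the per-batch round trip matters on high-latency links. Both functions below are simplifying assumptions for illustration, not measurements or MongoDB internals: the request/response model charges one full round trip per batch, while the streaming model charges a single round trip to open the stream and assumes later batches are pipelined.

```python
def fetch_time_request_response(num_batches, rtt_ms, server_ms_per_batch=0.0):
    """Older behavior: every batch costs one network round trip."""
    return num_batches * (rtt_ms + server_ms_per_batch)


def fetch_time_streaming(num_batches, rtt_ms, server_ms_per_batch=0.0):
    """Streaming (assumed model): one round trip, then batches flow one-way."""
    return rtt_ms + num_batches * server_ms_per_batch


# 100 batches over a link with a 20 ms round-trip time.
print(fetch_time_request_response(100, 20))  # 2000.0
print(fetch_time_streaming(100, 20))         # 20.0
```

Under these assumptions the network cost of the older behavior grows linearly with the number of batches, while the streaming cost stays roughly constant.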

Multithreaded Replication

MongoDB applies write operations in batches using multiple threads to improve concurrency. MongoDB groups batches by document ID (WiredTiger) and simultaneously applies each group of operations using a different thread. MongoDB always applies write operations to a given document in their original write order.
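The grouping idea can be sketched with a thread pool: writes are bucketed by document ID, each bucket is applied by one worker, and order is preserved inside a bucket. This is an illustrative simplification with hypothetical names; a plain dictionary stands in for the storage engine, and the real server's assignment of operations to writer threads is more involved.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def apply_batch(store, batch, workers=4):
    """Apply (doc_id, new_value) writes, preserving per-document order."""
    groups = defaultdict(list)
    for doc_id, value in batch:
        # Appending keeps the original write order within each document.
        groups[doc_id].append(value)

    def apply_group(item):
        doc_id, values = item
        for value in values:
            store[doc_id] = value  # each document is touched by one worker only

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() drains the iterator so worker exceptions are raised here.
        list(pool.map(apply_group, groups.items()))
    return store


batch = [("a", 1), ("b", 1), ("a", 2), ("c", 5), ("b", 3)]
print(apply_batch({}, batch))
```

Because no two workers ever write the same document, the final value of each document is the last write in the batch for that document, regardless of how the threads interleave.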

Read operations that target secondaries and are configured with a read concern level of "local" or "majority" read from a WiredTiger snapshot of the data if the read takes place on a secondary where replication batches are being applied.

Reading from a snapshot guarantees a consistent view of the data, and allows the read to occur simultaneously with the ongoing replication without the need for a lock. As a result, secondary reads requiring these read concern levels no longer need to wait for replication batches to be applied, and can be handled as they are received.

Flow Control

Starting in MongoDB 4.2, administrators can limit the rate at which the primary applies its writes with the goal of keeping the majority committed lag under a configurable maximum, flowControlTargetLagSeconds.

By default, flow control is enabled.

Note

For flow control to engage, the replica set/sharded cluster must have: featureCompatibilityVersion (fCV) of 4.2 and read concern majority enabled. That is, enabled flow control has no effect if fCV is not 4.2 or if read concern majority is disabled.

For more information, see Flow Control.
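The engagement condition in the note can be written as a single predicate. The sketch below also adds a deliberately simplified threshold rule for when writes would be limited; the function names are hypothetical, the fCV comparison mirrors the wording of the note, and the real throttling logic is more gradual than a single comparison.

```python
def flow_control_engages(enabled, fcv, majority_read_concern_enabled):
    """True only when all prerequisites from the note above hold."""
    return enabled and fcv == "4.2" and majority_read_concern_enabled


def should_limit_writes(majority_committed_lag_s, target_lag_s, *, engaged):
    """Simplified rule (assumption): limit writes once lag exceeds the target."""
    return engaged and majority_committed_lag_s > target_lag_s


engaged = flow_control_engages(True, "4.2", True)
print(should_limit_writes(12, 10, engaged=engaged))  # True
print(should_limit_writes(3, 10, engaged=engaged))   # False
```

Here target_lag_s stands for the configured flowControlTargetLagSeconds value.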

Replication Sync Source Selection

Replication sync source selection depends on the replica set chaining setting:

  • With chaining enabled (default), perform sync source selection from the replica set members.
  • With chaining disabled, select the primary as the sync source. If the primary is unavailable or unreachable, log an error and periodically check for primary availability.

Members performing replication sync source selection make two passes through the list of all replica set members.

The member applies the following criteria to each replica set member when making the first pass for selecting a replication sync source:

  • The sync source must be in the PRIMARY or SECONDARY replication state.

  • The sync source must be online and reachable.

  • The sync source must have newer oplog entries than the member (i.e. the sync source is ahead of the member).

  • The sync source must be visible.

  • The sync source must be within 30 seconds of the newest oplog entry on the primary.

  • If the member builds indexes, the sync source must build indexes.

  • If the member votes in replica set elections, the sync source must also vote.

  • If the member is not a delayed member, the sync source must not be delayed.

  • If the member is a delayed member, the sync source must have a shorter configured delay.

  • The sync source must be faster (i.e. lower latency) than the current best sync source.

If no candidate sync sources remain after the first pass, the member performs a second pass with relaxed criteria. See Sync Source Selection (Second Pass).

The member applies the following criteria to each replica set member when making the second pass for selecting a replication sync source:

  • The sync source must be in the PRIMARY or SECONDARY replication state.

  • The sync source must be online and reachable.

  • If the member builds indexes, the sync source must build indexes.

  • The sync source must be faster (i.e. lower latency) than the current best sync source.

If the member cannot select a sync source after two passes, it logs an error and waits 1 second before restarting the selection process.
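For replication, the two passes can be expressed as two lists of rules, where the second pass is a strict subset of the first. The sketch below is one way to model this under stated assumptions: members are plain dictionaries with hypothetical keys (optime, lag_seconds, ping_ms, and so on), and the node helper exists only to build sample data.

```python
from typing import Any, Callable, Dict, List, Optional

Node = Dict[str, Any]
Rule = Callable[[Node, Node], bool]  # (member, candidate source) -> passes?

SECOND_PASS: List[Rule] = [
    lambda me, s: s["state"] in ("PRIMARY", "SECONDARY"),
    lambda me, s: s["reachable"],
    lambda me, s: s["builds_indexes"] or not me["builds_indexes"],
]

FIRST_PASS: List[Rule] = SECOND_PASS + [
    lambda me, s: s["optime"] > me["optime"],   # source is ahead of the member
    lambda me, s: not s["hidden"],              # source must be visible
    lambda me, s: s["lag_seconds"] <= 30,       # close to the primary's newest entry
    lambda me, s: s["votes"] or not me["votes"],
    lambda me, s: s["delay"] == 0 if me["delay"] == 0 else s["delay"] < me["delay"],
]


def pick_sync_source(me: Node, others: List[Node], chaining_allowed: bool = True) -> Optional[Node]:
    if not chaining_allowed:
        # With chaining disabled only the primary qualifies.
        return next((s for s in others if s["state"] == "PRIMARY" and s["reachable"]), None)
    for rules in (FIRST_PASS, SECOND_PASS):
        pool = [s for s in others if all(rule(me, s) for rule in rules)]
        if pool:
            return min(pool, key=lambda s: s["ping_ms"])  # lowest latency wins
    return None  # caller logs an error, waits 1 second, and retries


def node(name: str, **overrides: Any) -> Node:
    """Hypothetical helper that builds a member record with defaults."""
    base: Node = dict(name=name, state="SECONDARY", reachable=True, hidden=False,
                      optime=100, lag_seconds=0, builds_indexes=True,
                      votes=True, delay=0, ping_ms=1.0)
    base.update(overrides)
    return base
```

Keeping the rules in lists makes the relaxation between passes explicit: the first pass is the second-pass list plus the stricter checks.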

The maxNumSyncSourceChangesPerHour parameter sets the number of times a member can change its sync source per hour.

Note

Starting in MongoDB 4.4, the startup parameter initialSyncSourceReadPreference takes precedence over the replica set's settings.chainingAllowed setting when selecting an initial sync source. After a replica set member successfully performs initial sync, it defers to the value of chainingAllowed when selecting a replication sync source.

See Initial Sync Source Selection for more information on initial sync source selection.