Replica Set Oplog

The oplog (operations log) is a special capped collection that keeps a rolling record of all operations that modify the data stored in your databases.

Changed in version 4.0.

Starting in MongoDB 4.0, unlike other capped collections, the oplog can grow past its configured size limit to avoid deleting the majority commit point.

New in version 4.4.

MongoDB 4.4 supports specifying a minimum oplog retention period in hours, where MongoDB only removes an oplog entry if:

  • The oplog has reached the maximum configured size, and
  • The oplog entry is older than the configured number of hours.

MongoDB applies database operations on the primary and then records the operations on the primary's oplog. The secondary members then copy and apply these operations in an asynchronous process. All replica set members contain a copy of the oplog, in the local.oplog.rs collection, which allows them to maintain the current state of the database.

To facilitate replication, all replica set members send heartbeats (pings) to all other members. Any secondary member can import oplog entries from any other member.

Each operation in the oplog is idempotent. That is, oplog operations produce the same results whether applied once or multiple times to the target dataset.
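As an illustration of why idempotency matters for replay, the sketch below contrasts a relative increment with an absolute assignment. It is plain Python operating on a dict, not MongoDB code, and the helper names apply_inc and apply_set are hypothetical. Only the absolute form yields the same document when applied a second time.

```python
def apply_inc(doc, field, amount):
    """Relative update: the result depends on how many times it is applied."""
    updated = dict(doc)
    updated[field] = updated.get(field, 0) + amount
    return updated


def apply_set(doc, field, value):
    """Absolute update: the result is the same however often it is applied."""
    updated = dict(doc)
    updated[field] = value
    return updated


doc = {"_id": 1, "views": 10}

once_inc = apply_inc(doc, "views", 1)
twice_inc = apply_inc(once_inc, "views", 1)

once_set = apply_set(doc, "views", 11)
twice_set = apply_set(once_set, "views", 11)

print(once_inc == twice_inc)  # False: replaying the increment changes the result
print(once_set == twice_set)  # True: replaying the assignment does not
```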

Oplog Size

When you start a replica set member for the first time, MongoDB creates an oplog of a default size if you do not specify the oplog size. [1]

For Unix and Windows systems

The default oplog size depends on the storage engine:

Storage Engine               Default Oplog Size       Lower Bound   Upper Bound
In-Memory Storage Engine     5% of physical memory    50 MB         50 GB
WiredTiger Storage Engine    5% of free disk space    990 MB        50 GB

For 64-bit macOS systems

The default oplog size is 192 MB of either physical memory or free disk space depending on the storage engine:

Storage Engine               Default Oplog Size
In-Memory Storage Engine     192 MB of physical memory
WiredTiger Storage Engine    192 MB of free disk space
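The Unix and Windows defaults can be read as "5% of capacity, clamped between a lower and an upper bound". The following is a minimal sketch of that arithmetic, not the server's actual implementation. The function name is hypothetical, and treating 1 MB as 1024 * 1024 bytes is an assumption.

```python
MB = 1024 * 1024  # assumption: binary megabytes
GB = 1024 * MB


def default_oplog_size_bytes(capacity_bytes, lower=990 * MB, upper=50 * GB):
    """Return 5% of capacity, clamped to [lower, upper].

    capacity_bytes is free disk space for WiredTiger or physical memory
    for the in-memory engine (pass lower=50 * MB in that case).
    """
    five_percent = capacity_bytes // 20
    return max(lower, min(upper, five_percent))


print(default_oplog_size_bytes(100 * GB) // MB)   # 5120 (5% of 100 GB)
print(default_oplog_size_bytes(1 * GB) // MB)     # 990 (lower bound applies)
print(default_oplog_size_bytes(2000 * GB) // GB)  # 50 (upper bound applies)
```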

In most cases, the default oplog size is sufficient. For example, if an oplog is 5% of free disk space and fills up in 24 hours of operations, then secondaries can stop copying entries from the oplog for up to 24 hours without becoming too stale to continue replicating. However, most replica sets have much lower operation volumes, and their oplogs can hold much higher numbers of operations.
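The 24-hour figure in the example is just the oplog size divided by the rate at which operations fill it. A rough sketch of that estimate follows. The function name and the sample numbers are hypothetical, and real write rates are rarely constant.

```python
def oplog_window_hours(oplog_size_mb, oplog_mb_written_per_hour):
    """Estimate how many hours of operations the oplog can hold."""
    if oplog_mb_written_per_hour <= 0:
        raise ValueError("write rate must be positive")
    return oplog_size_mb / oplog_mb_written_per_hour


# A 24000 MB oplog that receives 1000 MB of entries per hour fills in a day.
print(oplog_window_hours(24000, 1000))  # 24.0
# At a tenth of that write volume, the same oplog covers ten days.
print(oplog_window_hours(24000, 100))   # 240.0
```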

Before mongod creates an oplog, you can specify its size with the oplogSizeMB option. Once you have started a replica set member for the first time, use the replSetResizeOplog administrative command to change the oplog size. replSetResizeOplog enables you to resize the oplog dynamically without restarting the mongod process.
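As a sketch of what a resize request looks like, the helper below builds the command document for replSetResizeOplog. The helper name is hypothetical, and rejecting sizes under 990 MB is an assumption that mirrors the WiredTiger lower bound in the table above. The pymongo call in the comment is likewise an assumption about the client in use and is not executed here.

```python
def resize_oplog_command(size_mb):
    """Build a replSetResizeOplog command document; size is in megabytes."""
    if size_mb < 990:
        raise ValueError("oplog size is assumed to be at least 990 MB")
    # The command name must be the first key; size is sent as a double.
    return {"replSetResizeOplog": 1, "size": float(size_mb)}


cmd = resize_oplog_command(16000)
print(cmd)

# With pymongo (assumption, not run here), send it to the admin database
# of the member whose oplog should be resized:
#   MongoClient(uri).admin.command(cmd)
```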

New in version 4.4.

Starting in MongoDB 4.4, you can specify the minimum number of hours to preserve an oplog entry. The mongod only truncates an oplog entry if:

  • The oplog has reached the maximum configured size, and
  • The oplog entry is older than the configured number of hours based on the host system clock.

By default, MongoDB does not set a minimum oplog retention period and automatically truncates the oplog starting with the oldest entries to maintain the configured maximum oplog size.

See Minimum Oplog Retention Period for more information.

[1] Starting in MongoDB 4.0, the oplog can grow past its configured size limit to avoid deleting the majority commit point.

Minimum Oplog Retention Period

New in version 4.4.

Starting in MongoDB 4.4, you can specify the minimum number of hours to preserve an oplog entry. The mongod only removes an oplog entry if:

  • The oplog has reached the maximum configured size, and
  • The oplog entry is older than the configured number of hours based on the host system clock.

By default, MongoDB does not set a minimum oplog retention period and automatically truncates the oplog starting with the oldest entries to maintain the configured maximum oplog size.
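The two conditions and the default behavior can be summarized as a single predicate. This is a simplified model for illustration only. The function and parameter names are hypothetical, and the `now` argument stands in for the host system clock.

```python
from datetime import datetime, timedelta


def may_remove_entry(oplog_bytes, max_oplog_bytes, entry_time, now,
                     min_retention_hours=None):
    """Return True if an oplog entry is eligible for removal."""
    at_max_size = oplog_bytes >= max_oplog_bytes
    if min_retention_hours is None:
        # Default: no retention period, so size alone decides.
        return at_max_size
    old_enough = now - entry_time > timedelta(hours=min_retention_hours)
    return at_max_size and old_enough


now = datetime(2024, 1, 2, 12, 0)
entry_time = now - timedelta(hours=30)  # a 30-hour-old entry

print(may_remove_entry(100, 100, entry_time, now))      # True
print(may_remove_entry(100, 100, entry_time, now, 48))  # False: too recent
print(may_remove_entry(100, 100, entry_time, now, 24))  # True
print(may_remove_entry(50, 100, entry_time, now, 24))   # False: oplog not full
```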

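To configure the minimum oplog retention period when starting the mongod, either:

  • Add the storage.oplogMinRetentionHours setting to the mongod configuration file, or
  • Add the --oplogMinRetentionHours command line option.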

To configure the minimum oplog retention period on a running mongod, use replSetResizeOplog. Setting the minimum oplog retention period while the mongod is running overrides any values set on startup. You must update the value of the corresponding configuration file setting or command line option to persist those changes through a server restart.
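A runtime change might be expressed as the command document below. This is a hedged sketch: the helper name is hypothetical, it always sends the size field alongside minRetentionHours, and converting a timedelta to fractional hours is a convenience chosen for this example.

```python
from datetime import timedelta


def retention_command(size_mb, retention):
    """Build a replSetResizeOplog document that also sets minRetentionHours."""
    hours = retention / timedelta(hours=1)  # fractional hours, e.g. 1.5
    if hours < 0:
        raise ValueError("retention period cannot be negative")
    return {
        "replSetResizeOplog": 1,
        "size": float(size_mb),
        "minRetentionHours": hours,
    }


cmd = retention_command(16000, timedelta(minutes=90))
print(cmd["minRetentionHours"])  # 1.5
```

Because the running value does not survive a restart on its own, the same number of hours would also need to go into the configuration file setting or command line option.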

Workloads that Might Require a Larger Oplog Size

If you can predict your replica set's workload to resemble one of the following patterns, then you might want to create an oplog that is larger than the default. Conversely, if your application predominantly performs reads with a minimal amount of write operations, a smaller oplog may be sufficient.

The following workloads might require a larger oplog size.

Updates to Multiple Documents at Once

The oplog must translate multi-updates into individual operations in order to maintain idempotency. This can use a great deal of oplog space without a corresponding increase in data size or disk use.
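To make the space cost concrete, the sketch below expands one multi-update into one entry per matched document. The entry shape is a simplified, hypothetical stand-in for real oplog entries, and the function name is invented for this example.

```python
def expand_multi_update(matched_ids, set_fields):
    """Return one simplified update entry per matched document."""
    return [
        {"op": "u", "o2": {"_id": doc_id}, "o": {"$set": dict(set_fields)}}
        for doc_id in matched_ids
    ]


# One statement from the application that matches three documents...
entries = expand_multi_update([1, 2, 3], {"status": "archived"})

# ...becomes three separate entries in this model.
print(len(entries))  # 3
```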

Deletions Equal the Same Amount of Data as Inserts

If you delete roughly the same amount of data as you insert, the database will not grow significantly in disk use, but the size of the operation log can be quite large.

Significant Number of In-Place Updates

If a significant portion of the workload is updates that do not increase the size of the documents, the database records a large number of operations but does not change the quantity of data on disk.

Oplog Status

To view oplog status, including the size and the time range of operations, issue the rs.printReplicationInfo() method. For more information on oplog status, see Check the Size of the Oplog.
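The time range of operations is the span between the first and last events in the oplog. A minimal sketch of that calculation follows, using the timeDiff and timeDiffHours field names that db.getReplicationInfo() reports. The function name and the sample timestamps are hypothetical.

```python
from datetime import datetime


def oplog_time_range(first_event, last_event):
    """Return the oplog span in seconds and in hours."""
    seconds = (last_event - first_event).total_seconds()
    return {"timeDiff": seconds, "timeDiffHours": round(seconds / 3600, 2)}


info = oplog_time_range(
    datetime(2018, 11, 16, 0, 0),   # hypothetical first event time
    datetime(2018, 11, 17, 6, 30),  # hypothetical last event time
)
print(info)  # {'timeDiff': 109800.0, 'timeDiffHours': 30.5}
```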

Replication Lag and Flow Control

Under various exceptional situations, updates to a secondary's oplog might lag behind the desired performance time. Use db.getReplicationInfo() from a secondary member and the replication status output to assess the current state of replication and determine if there is any unintended replication delay.
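One way to read the replication status output is to compare each secondary's last applied operation time with the primary's. The sketch below assumes member documents with name, stateStr, and optimeDate fields, as in replSetGetStatus output. The hostnames and timestamps are made up, and the function name is hypothetical.

```python
from datetime import datetime


def replication_lag_seconds(members):
    """Return seconds each secondary trails the primary, keyed by member name."""
    primary = next(m for m in members if m["stateStr"] == "PRIMARY")
    return {
        m["name"]: (primary["optimeDate"] - m["optimeDate"]).total_seconds()
        for m in members
        if m["stateStr"] == "SECONDARY"
    }


members = [  # hypothetical sample data
    {"name": "db0.example.net:27017", "stateStr": "PRIMARY",
     "optimeDate": datetime(2024, 1, 2, 12, 0, 30)},
    {"name": "db1.example.net:27017", "stateStr": "SECONDARY",
     "optimeDate": datetime(2024, 1, 2, 12, 0, 28)},
    {"name": "db2.example.net:27017", "stateStr": "SECONDARY",
     "optimeDate": datetime(2024, 1, 2, 11, 59, 30)},
]

lag = replication_lag_seconds(members)
print(lag["db1.example.net:27017"])  # 2.0
print(lag["db2.example.net:27017"])  # 60.0
```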

Starting in MongoDB 4.2, administrators can limit the rate at which the primary applies its writes with the goal of keeping the majority committed lag under a configurable maximum value, flowControlTargetLagSeconds.

By default, flow control is enabled.

Note

For flow control to engage, the replica set/sharded cluster must have: featureCompatibilityVersion (FCV) of 4.2 and read concern majority enabled. That is, enabled flow control has no effect if FCV is not 4.2 or if read concern majority is disabled.

See Replication Lag for more information.
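The note amounts to a three-way condition. A small sketch, with a hypothetical function name and parameters, spells it out:

```python
def flow_control_engages(enabled, fcv, majority_read_concern_enabled):
    """Return True only if flow control is enabled and both prerequisites hold."""
    return enabled and fcv == "4.2" and majority_read_concern_enabled


print(flow_control_engages(True, "4.2", True))   # True
print(flow_control_engages(True, "4.0", True))   # False: FCV is not 4.2
print(flow_control_engages(True, "4.2", False))  # False: majority read concern disabled
```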

Slow Oplog Application

Starting in version 4.2 (also available starting in version 4.0.6), secondary members of a replica set log oplog entries that take longer than the slow operation threshold to apply. These messages are logged for the secondaries under the REPL component with the text applied op: <oplog entry> took <num>ms.

2018-11-16T12:31:35.886-05:00 I REPL   [repl writer worker 13] applied op: command { ... }, took 112ms
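For monitoring, a message like the one above can be picked apart with a regular expression. The pattern below is a sketch that follows the sample line (including the comma before "took"). It is not an official log format specification, and the function name is hypothetical.

```python
import re

# Matches "applied op: <oplog entry>, took <num>ms" at the end of a line.
SLOW_OP = re.compile(r"applied op: (?P<entry>.*), took (?P<ms>\d+)ms$")


def parse_slow_oplog_line(line):
    """Return the oplog entry text and duration in ms, or None if no match."""
    match = SLOW_OP.search(line)
    if match is None:
        return None
    return {"entry": match.group("entry"), "ms": int(match.group("ms"))}


line = ("2018-11-16T12:31:35.886-05:00 I REPL   [repl writer worker 13] "
        "applied op: command { ... }, took 112ms")
parsed = parse_slow_oplog_line(line)
print(parsed)  # {'entry': 'command { ... }', 'ms': 112}
```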

The slow oplog application logging on secondaries is:

For more information on setting the slow operation threshold, see

Oplog Collection Behavior

You cannot drop the local.oplog.rs collection from any replica set member if your MongoDB deployment uses the WiredTiger Storage Engine. Starting in v4.2, you cannot drop the local.oplog.rs collection from a standalone MongoDB instance, and we recommend that you do not drop the collection from a standalone MongoDB v4.0 instance. The mongod requires the oplog for both Replication and recovery of a node if the node goes down.

Starting in MongoDB 5.0, it is no longer possible to perform manual write operations to the oplog on a cluster running as a replica set. Performing write operations to the oplog when running as a standalone instance should only be done with guidance from MongoDB Support.
