Production Considerations

This page lists production considerations for running transactions. They apply whether you run transactions on replica sets or sharded clusters. For transactions on sharded clusters, see also Production Considerations (Sharded Clusters) for considerations that are specific to sharded clusters.

Availability

  • In version 4.0, MongoDB supports multi-document transactions on replica sets.
  • In version 4.2, MongoDB introduces distributed transactions, which adds support for multi-document transactions on sharded clusters and incorporates the existing support for multi-document transactions on replica sets.

    To use transactions on MongoDB 4.2 deployments (replica sets and sharded clusters), clients must use MongoDB drivers updated for MongoDB 4.2.

Note
Distributed Transactions and Multi-Document Transactions

Starting in MongoDB 4.2, the two terms are synonymous. Distributed transactions refer to multi-document transactions on sharded clusters and replica sets. Multi-document transactions (whether on sharded clusters or replica sets) are also known as distributed transactions starting in MongoDB 4.2.

Feature Compatibility

To use transactions, the featureCompatibilityVersion for all members of the deployment must be at least:

Deployment        Minimum featureCompatibilityVersion
Replica Set       4.0
Sharded Cluster   4.2

To check the fCV for a member, connect to the member and run the following command:

db.adminCommand( { getParameter: 1, featureCompatibilityVersion: 1 } )

For more information, see the setFeatureCompatibilityVersion reference page.
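The check against the table above can be sketched as a small helper. This is only an illustration: the names MIN_FCV, compareVersions, and fcvSupportsTransactions are hypothetical, and the sketch assumes the getParameter response reports the version under featureCompatibilityVersion.version.

```javascript
// Minimum fCV per deployment type, taken from the table above.
const MIN_FCV = { replicaSet: "4.0", shardedCluster: "4.2" };

// Compare two "major.minor" version strings numerically.
// Returns a negative number, zero, or a positive number.
function compareVersions(a, b) {
  const [aMajor, aMinor] = a.split(".").map(Number);
  const [bMajor, bMinor] = b.split(".").map(Number);
  return aMajor !== bMajor ? aMajor - bMajor : aMinor - bMinor;
}

// `result` is the document returned by the getParameter command shown above.
// Assumption: the version string is at result.featureCompatibilityVersion.version.
function fcvSupportsTransactions(result, deploymentType) {
  const minimum = MIN_FCV[deploymentType];
  if (minimum === undefined) {
    throw new Error("unknown deployment type: " + deploymentType);
  }
  const version = result.featureCompatibilityVersion.version;
  return compareVersions(version, minimum) >= 0;
}
```

In the shell, the helper could be fed the command result directly, for example fcvSupportsTransactions(db.adminCommand( { getParameter: 1, featureCompatibilityVersion: 1 } ), "shardedCluster"). The check must hold for every member of the deployment.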

Runtime Limit

By default, a transaction must have a runtime of less than one minute. You can modify this limit using transactionLifetimeLimitSeconds for the mongod instances. For sharded clusters, the parameter must be modified for all shard replica set members. Transactions that exceed this limit are considered expired and will be aborted by a periodic cleanup process.

For sharded clusters, you can also specify a maxTimeMS limit on commitTransaction. For more information, see Sharded Clusters Transactions Time Limit.
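As a minimal sketch of changing the runtime limit at run time, the helper below builds the setParameter command document. The function name buildLifetimeLimitCommand is hypothetical, and the lower bound of 1 second enforced here is an assumption of this sketch.

```javascript
// Build the setParameter command that changes the transaction runtime limit.
// Assumption: the value is a whole number of seconds, at least 1.
function buildLifetimeLimitCommand(seconds) {
  if (!Number.isInteger(seconds) || seconds < 1) {
    throw new RangeError("transactionLifetimeLimitSeconds must be an integer >= 1");
  }
  return { setParameter: 1, transactionLifetimeLimitSeconds: seconds };
}

// Usage in the shell, connected to a mongod instance:
//   db.adminCommand( buildLifetimeLimitCommand(30) )
```

On a sharded cluster, the same command would have to be run against every shard replica set member, as stated above.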

Oplog Size Limit

Starting in version 4.2, MongoDB creates as many oplog entries as necessary to encapsulate all write operations in a transaction, instead of a single entry for all write operations in the transaction. This removes the 16MB total size limit for a transaction imposed by the single oplog entry for all its write operations. Although the total size limit is removed, each oplog entry still must be within the BSON document size limit of 16MB.

In version 4.0, MongoDB creates a single oplog (operations log) entry at the time of commit if the transaction contains any write operations. That is, the individual operations in the transaction do not have a corresponding oplog entry. Instead, a single oplog entry contains all of the write operations within a transaction. The oplog entry for the transaction must be within the BSON document size limit of 16MB.

WiredTiger Cache

To prevent storage cache pressure from negatively impacting performance:

  • When you abandon a transaction, abort the transaction.
  • When you encounter an error during an individual operation in the transaction, abort and retry the transaction.

The transactionLifetimeLimitSeconds parameter also ensures that expired transactions are aborted periodically to relieve storage cache pressure.
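The two recommendations in the list above (abort on error, then retry) can be sketched as a small wrapper. This is an illustration under stated assumptions, not a definitive implementation: runTransactionWithRetry is a hypothetical name, the session object is assumed to expose synchronous startTransaction(), commitTransaction(), and abortTransaction() methods as shell sessions do, and the sketch retries on every error. Production code would typically retry only errors that carry the TransientTransactionError label.

```javascript
// Run work(session) inside a transaction. On any error, abort the
// transaction explicitly (so it does not linger and hold cache) and retry,
// up to maxAttempts attempts in total.
function runTransactionWithRetry(session, work, maxAttempts) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    session.startTransaction();
    try {
      const result = work(session);
      session.commitTransaction();
      return result;
    } catch (error) {
      lastError = error;
      try {
        session.abortTransaction();
      } catch (abortError) {
        // The transaction may already be aborted; nothing more to clean up.
      }
    }
  }
  // Every attempt failed: surface the most recent error to the caller.
  throw lastError;
}
```

A caller would pass a function that performs the transaction's operations through the session, for example one that obtains collections with session.getDatabase("hr") and writes to them.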

Note

If you have an uncommitted transaction that exceeds 5% of the WiredTiger cache size, the transaction will abort and return a write conflict error.

Transactions and Security

Shard Configuration Restriction

You cannot run transactions on a sharded cluster that has a shard with writeConcernMajorityJournalDefault set to false (such as a shard with a voting member that uses the in-memory storage engine).

Sharded Clusters and Arbiters

Transactions whose write operations span multiple shards will error and abort if any transaction operation reads from or writes to a shard that contains an arbiter.

Acquiring Locks

By default, transactions wait up to 5 milliseconds to acquire locks required by the operations in the transaction. If the transaction cannot acquire its required locks within the 5 milliseconds, the transaction aborts.

Transactions release all locks upon abort or commit.

Tip

When creating or dropping a collection immediately before starting a transaction, if the collection is accessed within the transaction, issue the create or drop operation with write concern "majority" to ensure that the transaction can acquire the required locks.
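A minimal sketch of the tip above, using the create command with an explicit write concern. The helper name createCollectionWithMajority is hypothetical; the database argument is assumed to be a shell database object (anything with a runCommand method).

```javascript
// Create a collection and wait for majority acknowledgment, so that a
// transaction started right afterwards can acquire its locks on it.
function createCollectionWithMajority(database, collectionName) {
  return database.runCommand({
    create: collectionName,
    writeConcern: { w: "majority" }
  });
}

// Usage in the shell:
//   createCollectionWithMajority(db.getSiblingDB("hr"), "employees")
```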

Lock Request Timeout

You can use the maxTransactionLockRequestTimeoutMillis parameter to adjust how long transactions wait to acquire locks. Increasing maxTransactionLockRequestTimeoutMillis allows operations in the transactions to wait the specified time to acquire the required locks. This can help obviate transaction aborts on momentary concurrent lock acquisitions, like fast-running metadata operations. However, this could possibly delay the abort of deadlocked transaction operations.

You can also use operation-specific timeouts by setting maxTransactionLockRequestTimeoutMillis to -1.

Pending DDL Operations and Transactions

If a multi-document transaction is in progress, new DDL operations that affect the same database(s) or collection(s) wait behind the transaction. While these pending DDL operations exist, new transactions that access the same database(s) or collection(s) as the pending DDL operations cannot obtain the required locks and will abort after waiting maxTransactionLockRequestTimeoutMillis. In addition, new non-transaction operations that access the same database(s) or collection(s) will block until they reach their maxTimeMS limit.

Consider the following scenarios:

DDL Operation That Requires a Collection Lock

While an in-progress transaction is performing various CRUD operations on the employees collection in the hr database, an administrator issues the db.collection.createIndex() DDL operation against the employees collection. createIndex() requires an exclusive collection lock on the collection.

Until the in-progress transaction completes, the createIndex() operation must wait to obtain the lock. Any new transaction that affects the employees collection and starts while the createIndex() is pending must wait until after createIndex() completes.

The pending createIndex() DDL operation does not affect transactions on other collections in the hr database. For example, a new transaction on the contractors collection in the hr database can start and complete as normal.

DDL Operation That Requires a Database Lock

While an in-progress transaction is performing various CRUD operations on the employees collection in the hr database, an administrator issues the collMod DDL operation against the contractors collection in the same database. collMod requires a database lock on the parent hr database.

Until the in-progress transaction completes, the collMod operation must wait to obtain the lock. Any new transaction that affects the hr database or any of its collections and starts while the collMod is pending must wait until after collMod completes.

In either scenario, if the DDL operation remains pending for more than maxTransactionLockRequestTimeoutMillis, pending transactions waiting behind that operation abort. That is, the value of maxTransactionLockRequestTimeoutMillis must at least cover the time required for the in-progress transaction and the pending DDL operation to complete.
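The sizing rule in the last sentence can be sketched as simple arithmetic followed by a setParameter command. This is only an illustration: both function names are hypothetical, and the two duration estimates are inputs you would have to measure for your own workload.

```javascript
// Smallest lock request timeout (in milliseconds) that covers both the
// in-progress transaction and the pending DDL operation behind it.
function minLockRequestTimeoutMillis(transactionMillis, ddlMillis) {
  return Math.ceil(transactionMillis + ddlMillis);
}

// Build the setParameter command that applies the timeout.
function buildLockTimeoutCommand(timeoutMillis) {
  return { setParameter: 1, maxTransactionLockRequestTimeoutMillis: timeoutMillis };
}

// Usage in the shell, assuming a 40 ms transaction and a 150 ms DDL operation:
//   db.adminCommand( buildLockTimeoutCommand( minLockRequestTimeoutMillis(40, 150) ) )
```

A larger value trades fewer aborts behind short DDL operations for a longer wait before deadlocked transaction operations abort.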

In-progress Transactions and Write Conflicts

If a transaction is in progress and a write outside the transaction modifies a document that an operation in the transaction later tries to modify, the transaction aborts because of a write conflict.

If a transaction is in progress and has taken a lock to modify a document, when a write outside the transaction tries to modify the same document, the write waits until the transaction ends.

In-progress Transactions and Stale Reads

Read operations inside a transaction can return stale data. That is, read operations inside a transaction are not guaranteed to see writes performed by other committed transactions or non-transactional writes. For example, consider the following sequence: 1) a transaction is in progress; 2) a write outside the transaction deletes a document; 3) a read operation inside the transaction is able to read the now-deleted document, since the operation is using a snapshot from before the write.

To avoid stale reads inside transactions for a single document, you can use the db.collection.findOneAndUpdate() method. For example:

session.startTransaction( { readConcern: { level: "snapshot" }, writeConcern: { w: "majority" } } );
employeesCollection = session.getDatabase("hr").employees;
employeeDoc = employeesCollection.findOneAndUpdate(
   { _id: 1, employee: 1, status: "Active" },
   { $set: { employee: 1 } },
   { returnNewDocument: true }
);
  • If the employee document has changed outside the transaction, then the transaction aborts.
  • If the employee document has not changed, the transaction returns the document and locks the document.

In-progress Transactions and Chunk Migration

Chunk migration acquires exclusive collection locks during certain stages.

If an ongoing transaction has a lock on a collection and a chunk migration that involves that collection starts, these migration stages must wait for the transaction to release the locks on the collection, thereby impacting the performance of chunk migrations.

If a chunk migration interleaves with a transaction (for instance, if a transaction starts while a chunk migration is already in progress and the migration completes before the transaction takes a lock on the collection), the transaction errors during the commit and aborts.

Depending on how the two operations interleave, some sample errors include (the error messages have been abbreviated):

  • an error from cluster data placement change ... migration commit in progress for <namespace>
  • Cannot find shardId the chunk belonged to at cluster time ...

Outside Reads During Commit

During the commit for a transaction, outside read operations may try to read the same documents that will be modified by the transaction. If the transaction writes to multiple shards, then during the commit attempt across the shards:

  • Outside reads that use read concern "snapshot" or "linearizable", or are part of causally consistent sessions (i.e. include afterClusterTime), wait for all writes of a transaction to be visible.
  • Outside reads using other read concerns do not wait for all writes of a transaction to be visible but instead read the before-transaction version of the documents available.
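The rule in the two bullets above can be written as a small predicate, which may help when reasoning about which application reads can stall during a multi-shard commit. The function name outsideReadWaitsForCommit is hypothetical, and the sketch only restates the rule as given here.

```javascript
// Returns true when an outside read waits for all writes of a multi-shard
// transaction to become visible during its commit, false when it reads the
// before-transaction version of the documents instead.
function outsideReadWaitsForCommit(readConcernLevel, isCausallyConsistent) {
  return (
    readConcernLevel === "snapshot" ||
    readConcernLevel === "linearizable" ||
    isCausallyConsistent === true
  );
}
```

For example, a read with read concern "local" in a session without causal consistency does not wait, while the same read in a causally consistent session does.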

Errors

Use of MongoDB 4.0 Drivers

To use transactions on MongoDB 4.2 deployments (replica sets and sharded clusters), clients must use MongoDB drivers updated for MongoDB 4.2.

On sharded clusters with multiple mongos instances, performing transactions with drivers updated for MongoDB 4.0 (instead of MongoDB 4.2) will fail and can result in errors, including:

Note

Your driver may return a different error. Refer to your driver's documentation for details.

Error Code   Error Message
251          cannot continue txnId -1 for session ... with txnId 1
50940        cannot commit with no participants
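As a rough diagnostic aid, the codes in the table above can be checked in an error handler. This is a sketch under assumptions: outdatedDriverHint is a hypothetical name, the error object is assumed to expose a numeric code property, and these codes can also arise for other reasons, so a match is only a hint to check the driver version.

```javascript
// Error codes from the table above, with their abbreviated messages.
const OUTDATED_DRIVER_ERRORS = new Map([
  [251, "cannot continue txnId -1 for session ... with txnId 1"],
  [50940, "cannot commit with no participants"]
]);

// Returns a hint string when the error code matches one of the codes
// listed above, or null otherwise.
function outdatedDriverHint(error) {
  if (error == null || !OUTDATED_DRIVER_ERRORS.has(error.code)) {
    return null;
  }
  return "Error " + error.code + " may indicate a driver that is not updated for MongoDB 4.2";
}
```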

Additional Information