
Bulk Write Operations

Overview

MongoDB provides clients the ability to perform write operations in bulk. Starting in MongoDB 8.0, you can perform bulk write operations across multiple databases and collections. If you are using a version earlier than MongoDB 8.0, you can perform bulk write operations on a single collection.

To perform bulk write operations across multiple databases and collections in MongoDB 8.0, use the bulkWrite database command or the Mongo.bulkWrite() mongosh method.

To perform bulk write operations on a single collection, use the db.collection.bulkWrite() mongosh method. If you are running MongoDB 8.0 or later, you can also use bulkWrite or Mongo.bulkWrite() to write to a single collection.

Ordered vs Unordered Operations

You can set your bulk write operations to be either ordered or unordered.

With an ordered list of operations, MongoDB executes the operations serially. If an error occurs during the processing of one of the write operations, MongoDB returns without processing any remaining write operations in the list.

With an unordered list of operations, MongoDB can execute the operations in parallel, but this behavior is not guaranteed. If an error occurs during the processing of one of the write operations, MongoDB will continue to process remaining write operations in the list.

Executing an ordered list of operations on a sharded collection will generally be slower than executing an unordered list since with an ordered list, each operation must wait for the previous operation to finish.

By default, all bulk write commands and methods perform ordered operations. To specify unordered operations, set the ordered option to false when you call your preferred command or method. To learn more about the syntax of each command or method, see their pages linked above.
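The difference in error handling can be sketched with a toy in-memory runner. This is plain JavaScript, not MongoDB itself: each "operation" is just a function that may throw, standing in for a write that may fail (for example, on a duplicate key). An ordered run stops at the first failing operation; an unordered run records the error and continues.

```javascript
// Toy sketch of ordered vs unordered execution (not MongoDB itself).
function runBulk(ops, { ordered = true } = {}) {
  const applied = [];
  const errors = [];
  for (const op of ops) {
    try {
      op(); // each op is a function that may throw
      applied.push(op.name);
    } catch (e) {
      errors.push(e.message);
      if (ordered) break; // ordered: stop at the first error
      // unordered: keep going with the remaining operations
    }
  }
  return { applied, errors };
}

const ops = [
  function insertA() {},
  function insertDup() { throw new Error("duplicate key"); },
  function insertB() {},
];

const orderedResult = runBulk(ops, { ordered: true });
// orderedResult.applied → [ 'insertA' ]  (insertB never runs)

const unorderedResult = runBulk(ops, { ordered: false });
// unorderedResult.applied → [ 'insertA', 'insertB' ]
```

In a real bulk write the same principle applies: with ordered: false, MongoDB reports the failed operations in the result's write errors while the remaining operations still execute.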

Bulk Write Methods

All bulk write methods and commands support the following write operations:

  • Insert One
  • Update One
  • Update Many
  • Replace One
  • Delete One
  • Delete Many

When you call your preferred command or method, you pass each write operation as a document in an array. To learn more about the syntax of each command or method, see their pages linked above.
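As a sketch, an operations array for db.collection.bulkWrite() might look like the following. The collection fields and filter values here are hypothetical; the point is the shape: each array element is a document whose single key names one of the six supported operations, and whose value holds that operation's arguments.

```javascript
// Hypothetical operations array: one single-key document per operation.
const operations = [
  { insertOne:  { document: { _id: 1, status: "new" } } },
  { updateOne:  { filter: { _id: 1 }, update: { $set: { status: "done" } } } },
  { updateMany: { filter: { status: "stale" }, update: { $set: { status: "archived" } } } },
  { replaceOne: { filter: { _id: 2 }, replacement: { _id: 2, status: "new" } } },
  { deleteOne:  { filter: { _id: 3 } } },
  { deleteMany: { filter: { status: "archived" } } },
];

// Each element names exactly one operation.
const names = operations.map(op => Object.keys(op)[0]);
```

In mongosh, this array would be passed directly: db.collection.bulkWrite(operations).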

Example

db.collection.bulkWrite()

The following db.collection.bulkWrite() example runs these operations on the pizzas collection:

  • Adds two documents using insertOne.
  • Updates a document using updateOne.
  • Deletes a document using deleteOne.
  • Replaces a document using replaceOne.
try {
   db.pizzas.bulkWrite( [
      { insertOne: { document: { _id: 3, type: "beef", size: "medium", price: 6 } } },
      { insertOne: { document: { _id: 4, type: "sausage", size: "large", price: 10 } } },
      { updateOne: {
         filter: { type: "cheese" },
         update: { $set: { price: 8 } }
      } },
      { deleteOne: { filter: { type: "pepperoni" } } },
      { replaceOne: {
         filter: { type: "vegan" },
         replacement: { type: "tofu", size: "small", price: 4 }
      } }
   ] )
} catch( error ) {
   print( error )
}

Example output, which includes a summary of the completed operations:

{
   acknowledged: true,
   insertedCount: 2,
   insertedIds: { '0': 3, '1': 4 },
   matchedCount: 2,
   modifiedCount: 2,
   deletedCount: 1,
   upsertedCount: 0,
   upsertedIds: {}
}

For more examples, see db.collection.bulkWrite() Examples.

Mongo.bulkWrite()

This example uses Mongo.bulkWrite() to perform the following operations in order:

  • inserts a document into the db.authors collection
  • inserts a document into the db.books collection
  • updates the previous document
db.getMongo().bulkWrite(
   [
      {
         namespace: 'db.authors',
         name: 'insertOne',
         document: { name: 'Stephen King' }
      },
      {
         namespace: 'db.books',
         name: 'insertOne',
         document: { name: 'It' }
      },
      {
         namespace: 'db.books',
         name: 'updateOne',
         filter: { name: 'It' },
         update: { $set: { year: 1986 } }
      }
   ],
   {
      ordered: true,
      bypassDocumentValidation: true
   }
)

mongosh performs the bulk write in order and returns the following document:

{
   acknowledged: true,
   insertedCount: 2,
   matchedCount: 1,
   modifiedCount: 1,
   deletedCount: 0,
   upsertedCount: 0,
   insertResults: {
      '1': { insertedId: ObjectId('67ed8ce8efd926c84cab7945') },
      '2': { insertedId: ObjectId('67ed8ce8efd926c84cab7946') }
   },
   updateResults: { '1': { matchedCount: 1, modifiedCount: 1, didUpsert: false } }
}

Strategies for Bulk Inserts to a Sharded Collection

Large bulk insert operations, including initial data inserts or routine data import, can affect sharded cluster performance. For bulk inserts, consider the following strategies:

Pre-Split the Collection

If your sharded collection is empty and you are not using hashed sharding for the first key of your shard key, then your collection has only one initial chunk, which resides on a single shard. MongoDB must then take time to receive data and distribute chunks to the available shards. To avoid this performance cost, pre-split the collection by creating ranges in a sharded cluster.
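As a sketch, one way to pre-split is to compute evenly spaced split points over the shard key's expected range and then split at each point. The numeric key range, chunk count, namespace, and key name below are all hypothetical; only the sh.splitAt() call at the end is mongosh-specific, so it is shown as a comment.

```javascript
// Sketch: evenly spaced split points for a numeric shard key (hypothetical
// bounds 0..1,000,000 and a target of 4 chunks).
function splitPoints(min, max, chunks) {
  const points = [];
  const step = (max - min) / chunks;
  for (let i = 1; i < chunks; i++) {
    points.push(Math.round(min + i * step));
  }
  return points;
}

const points = splitPoints(0, 1000000, 4);
// points → [ 250000, 500000, 750000 ]

// In mongosh, each point would then be passed to sh.splitAt(), e.g.:
// points.forEach(p => sh.splitAt("mydb.orders", { customerId: p }));
```

With the chunks created up front, the balancer can distribute them across shards before the bulk load begins.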

Unordered Writes to mongos

To improve write performance to sharded clusters, perform an unordered bulk write by setting ordered to false when you call your preferred method or command. mongos can attempt to send the writes to multiple shards simultaneously. For empty collections, first pre-split the collection as described in Split Chunks in a Sharded Cluster.

Avoid Monotonic Throttling

If your shard key increases monotonically during an insert, then all inserted data goes to the last chunk in the collection, which will always end up on a single shard. Therefore, the insert capacity of the cluster will never exceed the insert capacity of that single shard.

If your insert volume is larger than what a single shard can process, and if you cannot avoid a monotonically increasing shard key, then consider the following modifications to your application:

  • Reverse the binary bits of the shard key. This preserves the information and avoids correlating insertion order with increasing sequence of values.
  • Swap the first and last 16-bit words to "shuffle" the inserts.

Example

The following example, in C++, swaps the leading and trailing 16-bit words of generated BSON ObjectIds so they are no longer monotonically increasing.

using namespace mongo;

OID make_an_id() {
    OID x = OID::gen();
    // getData() returns a const pointer; cast away const to swap the bytes in place
    unsigned char *p = const_cast<unsigned char *>( x.getData() );
    std::swap( (unsigned short &) p[0], (unsigned short &) p[10] );
    return x;
}

void foo() {
    // create an object
    BSONObj o = BSON( "_id" << make_an_id() << "x" << 3 << "name" << "jane" );
    // now we may insert o into a sharded collection
}
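For mongosh users, the same word-swap idea can be sketched in JavaScript over an ObjectId's 24-character hex representation. The helper name is hypothetical; a 16-bit word corresponds to 4 hex characters, so the swap exchanges the first and last 4 characters.

```javascript
// Sketch: swap the first and last 16-bit words of a 12-byte ObjectId,
// given as a 24-character hex string. The leading timestamp word and the
// trailing counter word trade places, breaking monotonic ordering.
function swapWords(hexId) {
  const head = hexId.slice(0, 4);   // first 16-bit word
  const tail = hexId.slice(20, 24); // last 16-bit word
  return tail + hexId.slice(4, 20) + head;
}

// Applying the swap twice restores the original id, so the
// transformation is reversible and loses no information.
const shuffled = swapWords("65f1ab0000000000000000ff");
```

In mongosh, the result could be wrapped back into an id with ObjectId(shuffled) before insertion.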

Tip

See Shard Keys for information on choosing a shard key. Also see Shard Key Internals (in particular, Choose a Shard Key).